JP2021197584A - Multiple signal conversion device and program thereof, and receiver

Info

Publication number: JP2021197584A
Application number: JP2020101091A
Authority: JP
Inventors: 侑輝河村; Yuki Kawamura; 知也楠; Kazuya Kusunoki; 裕靖永田; Hiroyasu Nagata; 悠喜山上; Yuki Yamagami; 浩一郎今村; Koichiro Imamura
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2020-06-10
Filing date: 2020-06-10
Publication date: 2021-12-27

Abstract

To provide an MMT conversion device that allows video and audio to be played back normally regardless of whether CMAF is applied.
SOLUTION: An MMT conversion device 1 includes a packet filter 10 that separates MPU metadata, movie fragment metadata, a control message, and a media fragment unit from CMAF-applied MMT; a message buffer 11; a descriptor conversion unit 12 that converts the DTS-PTS difference information of the movie fragment metadata into an extended MPU timestamp descriptor; a descriptor addition unit 13 that adds the extended MPU timestamp descriptor to the control message; a parameter set extraction unit 14 that extracts a parameter set from the MPU metadata; a parameter set addition unit 15 that adds the parameter set to the media fragment unit; an MPU buffer 16 that outputs the media fragment unit in MPU units; and a packet mixing unit 17 that mixes the control message and the media fragment unit.
SELECTED DRAWING: Figure 2

Description

The present invention relates to a multiplex signal conversion device, a program therefor, and a receiver.

MMT (MPEG Media Transport) has been established as an international standard for a new IP (Internet Protocol)-based media transport method that replaces MPEG-2 TS (Transport Stream), which is used in conventional digital broadcasting (Non-Patent Document 1). In addition, the use of MMT in digital broadcasting services in Japan has been standardized (Non-Patent Document 2), and the new 4K8K satellite broadcasting service, which adopts MMT, started in December 2018.

The application-layer packet format (the data structure of the packet header) defined by MMT is called MMTP (MMT Protocol). As shown in FIG. 8(a), an MMTP packet is transmitted in one direction over a broadcast or communication transmission path as the payload of a UDP (User Datagram Protocol)/IP packet.

In MMT, the processing unit of the video and audio codecs is called an MPU (Media Processing Unit). The first data of an MPU must be a random access point that can be processed without depending on previously transmitted data. Non-Patent Document 2, which specifies the use of MMT for broadcasting, treats a GOP (Group of Pictures) that begins with an intra frame (a frame compressed using intra-frame coding) as an MPU. Although the term GOP is not used in the HEVC standard, which is used here as an example of a video coding method, a set of frames beginning with an intra frame is sometimes called a GOP for convenience, following conventional methods such as MPEG-2 Video.

As shown in FIG. 8(b), in a broadcasting service, GOPs are configured with a period of about 0.5 seconds to ensure random accessibility when the reception channel is changed. As a specific example, in HEVC, 32 frames may form one GOP. In AAC (Advanced Audio Coding), which is used as an example of an audio coding method, audio samples obtained by sampling sound pressure at a sampling frequency of 48 kHz, for example, are encoded independently in blocks of 1024 samples, and each resulting data block is treated as an AU (Access Unit). In general, the head of each AU is a random access point, but because an MPU may contain a plurality of random access points, a set of one or more audio AUs can be treated as an MPU. An MPU transmitted by MMTP can be uniquely identified by its MPU sequence number. The MPU sequence number is written in the MMTP payload header of each MMTP packet that carries the MPU as its payload.
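The figures quoted above can be checked with a short calculation. The following Python sketch is illustrative only: the GOP length, AU size, and sampling frequency come from the text, while the 60 Hz frame rate is an assumption made here (the text states only that the GOP period is about 0.5 seconds).

```python
from fractions import Fraction

GOP_FRAMES = 32            # from the text: 32 frames per GOP
AAC_SAMPLES_PER_AU = 1024  # from the text: 1024 samples per AU
SAMPLING_RATE_HZ = 48_000  # from the text: 48 kHz
FRAME_RATE_HZ = 60         # ASSUMPTION: frame rate is not stated in the text


def duration_s(units: int, rate_hz: int) -> Fraction:
    """Exact duration in seconds of `units` frames or samples at `rate_hz`."""
    return Fraction(units, rate_hz)


gop_duration = duration_s(GOP_FRAMES, FRAME_RATE_HZ)
au_duration = duration_s(AAC_SAMPLES_PER_AU, SAMPLING_RATE_HZ)

print(f"GOP duration: {float(gop_duration):.3f} s")
print(f"AAC AU duration: {float(au_duration) * 1000:.2f} ms")
```

Under the assumed frame rate, the GOP lasts slightly more than half a second, which is consistent with the "about 0.5 seconds" period described above.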

According to Non-Patent Document 1, the MPU is originally defined on the basis of ISOBMFF (ISO Base Media File Format). Non-Patent Document 1 also defines a method of transmitting the metadata portion of ISOBMFF over MMTP as MPU metadata and movie fragment metadata. However, in the broadcasting application specified in Non-Patent Document 2, the generation and transmission of movie fragment metadata are omitted to reduce processing delay, and the NAL (Network Abstraction Layer) units generated by the HEVC encoder are multiplexed into MMTP/UDP/IP packets and transmitted as media fragment units without modification.

The basic data structure of ISOBMFF is defined in Non-Patent Document 3. Because ISOBMFF is part of the MPEG-4 standard, it is often called MP4 (.mp4). The Box format, which is the metadata description method defined by ISOBMFF, is extensible: new metadata structures can be added and more detailed operating rules can be specified to meet application requirements. For example, Non-Patent Document 4 specifies detailed ISOBMFF metadata operating rules and data structures for MPEG-DASH (Dynamic Adaptive Streaming over HTTP), a video streaming method that uses HTTP (Hypertext Transfer Protocol)/TCP (Transmission Control Protocol) communication. In MPEG-DASH, segments, each a file containing several seconds to several tens of seconds of video, are published on a web server, and a playback terminal plays the video by continuously downloading the segments according to a manifest file.

HLS (HTTP Live Streaming), specified in Non-Patent Document 5, is known as a video streaming method that uses HTTP/TCP in the same way as MPEG-DASH. HLS is currently in wide use alongside MPEG-DASH. HLS initially adopted segments in MPEG-2 TS format, but it was later revised so that the same ISOBMFF-format segments as MPEG-DASH can be used. As a result, although MPEG-DASH and HLS use different manifest files for playback, they can share the same segments, which makes video distribution over a CDN (Content Delivery Network) or the like more efficient.

The ISOBMFF-based segment format that can be used in common by MPEG-DASH and HLS is defined as CMAF (Common Media Application Format) in Non-Patent Document 6. CMAF is a newer standard than those of Non-Patent Document 1 and Non-Patent Document 2.

In addition to unifying the segment format, CMAF defines a chunk structure that further subdivides a segment as a technique for reducing the delay of video streaming. Whereas a segment is generally several seconds to several tens of seconds long, a chunk is video data of a shorter duration, such as a few frames. In ordinary file-by-file transfer over HTTP, a file is transferred only after an entire segment has been completed, so a video delay equal to the segment duration (several seconds to several tens of seconds) is unavoidable in principle. In practice, the receiver also buffers several segments to stabilize playback, so the total video delay may reach tens of seconds to several minutes. In contrast, when the CMAF chunk structure (a few frames) is transmitted to the receiver using Chunked Transfer, an extension of HTTP, data is sent in chunk units of a few frames without waiting for the entire segment to be completed, and video streaming with a delay of a few seconds is considered feasible. Generating metadata in units of a few frames was not considered when ISOBMFF was first standardized, but its functions have been extended in response to the requirements of new applications that arose as communication technology developed.

In video transmission by MMT as well, if metadata is generated after the entire MPU (which corresponds to a segment) has been encoded and is then transmitted as MPU metadata and movie fragment metadata, a delay similar to that of packaging a segment into a file in MPEG-DASH or the like is unavoidable in principle. Therefore, to reduce the delay of video transmission in broadcasting, Non-Patent Document 2 omits both the generation of ISOBMFF-format metadata and its transmission as MPU metadata and movie fragment metadata. However, omitting the transmission of movie fragment metadata means that the DTS-PTS difference information (dts_pts_offset) cannot be delivered to the receiver. The DTS-PTS difference information indicates the difference between the DTS (Decoding Timestamp), which specifies the decoding timing of a video frame, and the PTS (Presentation Timestamp), which specifies the presentation timing of that frame; the two differ because of the inter-frame reference structure of video coding.
Non-Patent Document 2 therefore defines an "extended MPU timestamp descriptor" that describes this DTS-PTS difference information separately, and specifies that it be transmitted as a descriptor in the MP table (MMT Package Table), which is a control message. Here, "descriptor" is the general name for a data structure that extends a control message so that various kinds of auxiliary information can be multiplexed into it. In MMT, for example, the structure of the MP table, which is a control message, is defined in Non-Patent Document 1, whereas the various descriptors that extend it are defined in Non-Patent Document 2; in this way, national standardization bodies and service providers can define additional descriptors of their own. The "extended MPU timestamp descriptor" is a structure that lists the DTS-PTS difference information of all frames constituting an MPU, and it is transmitted separately from the "MPU timestamp descriptor", which indicates the PTS of the first frame of the MPU. The "MPU timestamp descriptor" is separately defined in Non-Patent Document 1.
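To make the DTS-PTS difference concrete, the following Python sketch computes the per-frame difference for a small group of frames whose decoding order differs from their presentation order. It is only an illustration under stated assumptions: the `Frame` class, the frame names, and the tick values are hypothetical, and the resulting list mirrors only the idea of "one difference value per frame of the MPU", not the actual descriptor syntax of Non-Patent Document 2.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    """Hypothetical frame record; timestamps are in arbitrary clock ticks."""
    name: str
    dts: int  # decoding timestamp
    pts: int  # presentation timestamp


def dts_pts_offsets(frames: List[Frame]) -> List[int]:
    """Return PTS - DTS for every frame, in decoding order."""
    offsets = []
    for frame in frames:
        if frame.pts < frame.dts:
            raise ValueError(f"{frame.name}: presented before it is decoded")
        offsets.append(frame.pts - frame.dts)
    return offsets


TICKS = 1500  # ASSUMPTION: duration of one frame in clock ticks

# Decoding order I0, P3, B1, B2; presentation order I0, B1, B2, P3.
frames = [
    Frame("I0", dts=0 * TICKS, pts=1 * TICKS),
    Frame("P3", dts=1 * TICKS, pts=4 * TICKS),
    Frame("B1", dts=2 * TICKS, pts=2 * TICKS),
    Frame("B2", dts=3 * TICKS, pts=3 * TICKS),
]

print(dts_pts_offsets(frames))  # [1500, 4500, 0, 0]
```

The list can only be completed once the timestamps of every frame in the group are known, which is why a descriptor covering a whole MPU cannot be produced until the whole MPU has been encoded.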

Non-Patent Document 6 specifies more detailed MMT operating rules for an actual service (advanced wideband satellite digital broadcasting). Non-Patent Document 6 requires that the "extended MPU timestamp descriptor" be transmitted earlier than the corresponding MPU. As shown in FIG. 8(b), the PTS written in the "MPU timestamp descriptor" is determined at the head of the MPU (GOP) (reference sign α). The "extended MPU timestamp descriptor", on the other hand, can be generated only after the video coding of the entire MPU has finished (reference sign β). As a result, transmission of the coded video data of the MPU must be delayed by at least the duration of the MPU.

To avoid this video delay, MMT could also use the chunk-structured ISOBMFF metadata newly defined by CMAF, as shown in FIG. 9(a). Because a CMAF chunk is based on ISOBMFF, it can be multiplexed over MMTP in accordance with Non-Patent Document 1. The ISOBMFF metadata of a CMAF chunk includes Box-format metadata that indicates the difference between the presentation timing and the decoding timing of each frame in the chunk, so the metadata can be generated with a delay of only one chunk length. When Box-format metadata is applied to MMT, the metadata can be generated and the MPU transmitted with a delay of one chunk length, as shown in FIG. 9(b). This avoids the MPU-length delay caused by metadata generation in video transmission using MMT. In FIG. 9(b), as an example, a 32-frame GOP is divided into four, and each set of 8 frames forms a chunk. Furthermore, by describing the DTS-PTS difference information in the Box-format metadata structure defined by ISOBMFF and CMAF and transmitting it as MMTP movie fragment metadata, the necessary metadata can be delivered to the decoder without using the "extended MPU timestamp descriptor".
Specifically, in CMAF, the DTS-PTS difference information of each frame in a chunk can be described by "sample_composition_time_offset" of "TrackFragmentRunBox ('trun')".
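The effect of chunking on the metadata delay can be sketched as follows. The 32-frame GOP and the 8-frame chunk come from the example of FIG. 9(b); the 60 Hz frame rate and the function names are assumptions introduced here for illustration.

```python
from typing import List


def split_into_chunks(frame_count: int, chunk_frames: int) -> List[range]:
    """Split frame indices 0..frame_count-1 into consecutive chunks."""
    if chunk_frames <= 0:
        raise ValueError("chunk_frames must be positive")
    return [
        range(start, min(start + chunk_frames, frame_count))
        for start in range(0, frame_count, chunk_frames)
    ]


def metadata_wait_s(frames_covered: int, frame_rate_hz: float) -> float:
    """Time needed to encode all frames that one metadata unit describes."""
    return frames_covered / frame_rate_hz


FRAME_RATE_HZ = 60.0  # ASSUMPTION: not stated in the text

chunks = split_into_chunks(frame_count=32, chunk_frames=8)
mpu_wait = metadata_wait_s(32, FRAME_RATE_HZ)               # metadata per MPU
chunk_wait = metadata_wait_s(len(chunks[0]), FRAME_RATE_HZ)  # metadata per chunk

print(f"{len(chunks)} chunks, MPU wait {mpu_wait:.3f} s, chunk wait {chunk_wait:.3f} s")
```

With these numbers, waiting for one chunk instead of the whole MPU reduces the metadata-related wait to a quarter.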

The parameter sets used in video coding are described next. ISOBMFF identifies codecs with a FourCC (Four Character Code), which is defined as four alphabetic characters; for HEVC, for example, two types are defined: "hev1" and "hvc1". Here, "hev1" indicates a format in which the video coding parameter sets defined by HEVC, namely the VPS (Video Parameter Set), SPS (Sequence Parameter Set), and PPS (Picture Parameter Set), are included in the media fragment unit. "hvc1" indicates a format in which the parameter sets are included in the MPU metadata.

In other words, as shown in FIG. 9(b), an MMTP packet in which coded video data and the like are multiplexed with CMAF applied carries the parameter sets either in the MPU metadata (in the case of "hvc1") or in the media fragment unit (in the case of "hev1"). Hereinafter, MMTP packets in which coded video data and the like are multiplexed with CMAF applied may be abbreviated as "CMAF-applied MMT". CMAF-applied MMT can thus be of the "hvc1" type or the "hev1" type. Although it is a special case, it is also technically possible to transmit each parameter set in both the media fragment unit and the MPU metadata.

In contrast, MMT in which coded video data and the like are multiplexed without applying CMAF does not transmit MPU metadata and carries the parameter sets in the media fragment unit, as shown in FIG. 8(b); it therefore supports only "hev1". Hereinafter, MMTP packets in which coded video data and the like are multiplexed without applying CMAF may be abbreviated as "CMAF-non-applied MMT".
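The relationship between the FourCC and the location of the parameter sets, as described above, can be summarized in a small lookup. This Python sketch is an illustration only; the enum and function names are hypothetical, and the mapping reflects the description in this document rather than the full rules of the underlying standards.

```python
from enum import Enum


class ParameterSetLocation(Enum):
    MPU_METADATA = "MPU metadata"
    MEDIA_FRAGMENT_UNIT = "media fragment unit"


# Mapping as described in this document for HEVC.
_LOCATION_BY_FOURCC = {
    "hvc1": ParameterSetLocation.MPU_METADATA,
    "hev1": ParameterSetLocation.MEDIA_FRAGMENT_UNIT,
}


def parameter_set_location(fourcc: str) -> ParameterSetLocation:
    """Return where VPS/SPS/PPS are carried for the given HEVC FourCC."""
    try:
        return _LOCATION_BY_FOURCC[fourcc]
    except KeyError:
        raise ValueError(f"unsupported FourCC: {fourcc!r}") from None


def must_move_parameter_sets(fourcc: str) -> bool:
    """True if the parameter sets are not already in the media fragment unit,
    which is the only placement CMAF-non-applied MMT supports."""
    return parameter_set_location(fourcc) is ParameterSetLocation.MPU_METADATA


print(must_move_parameter_sets("hvc1"), must_move_parameter_sets("hev1"))
```

Since CMAF-non-applied MMT carries parameter sets only in the media fragment unit, an "hvc1" input is the case in which a conversion toward CMAF-non-applied MMT has to relocate them.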

Because the MPU metadata includes the "MovieBox ('moov')" defined by ISOBMFF, it is generally called movie metadata. Because the movie fragment metadata includes the "MovieFragmentBox ('moof')" defined by ISOBMFF, it is generally also called movie fragment metadata. Because the media fragment unit includes the "MediaDataBox ('mdat')" defined by ISOBMFF, it is generally called media data. In CMAF, a processing unit that begins with a random access point, like an MPU, is called a fragment.
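The three box types named above share the generic ISOBMFF box header: a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size field and size 0 meaning "to the end of the data". The following Python sketch walks top-level boxes in a byte string. It is a minimal illustration, not a full parser; `make_box` and the sample payloads are hypothetical helpers used only to build test data.

```python
import struct
from typing import Iterator, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (type, payload) for each top-level ISOBMFF box in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # 64-bit "largesize" follows the type field
            if offset + 16 > len(data):
                raise ValueError("truncated largesize field")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError("truncated or malformed box")
        yield box_type.decode("latin-1"), data[offset + header:offset + size]
        offset += size


def make_box(box_type: str, payload: bytes) -> bytes:
    """Hypothetical helper: build a box with a 32-bit size field."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload


sample = make_box("moov", b"") + make_box("moof", b"\x00" * 4) + make_box("mdat", b"NAL")
print([box_type for box_type, _ in iter_boxes(sample)])  # ['moov', 'moof', 'mdat']
```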

"High efficiency coding and media delivery in heterogeneous environments: MPEG media transport", ISO/IEC 23008-1
"Media transport method by MMT in digital broadcasting", ARIB STD-B60
"ISO/IEC base media file format", ISO/IEC 14496-12
"Dynamic adaptive streaming over HTTP (DASH) - Part 1: Media presentation description and segment formats", ISO/IEC 23009-1
"Common media application format (CMAF) for segmented media", ISO/IEC 23000-19
"Advanced Wideband Satellite Digital Broadcasting Operation Regulations (Volume 3)", ARIB TR-B39

When the CMAF-applied MMT described above is input to a receiver that does not support CMAF, the receiver may fail to play back video and audio normally or may terminate abnormally because of a processing error. Likewise, when CMAF-non-applied MMT is input to a receiver that supports CMAF, the receiver may fail to play back video and audio normally or may terminate abnormally because of a processing error.

Accordingly, an object of the present invention is to provide a multiplex signal conversion device, a program therefor, and a receiver that allow the receiver to play back video and audio normally regardless of whether CMAF is applied.

To solve the above problem, a multiplex signal conversion device according to the present invention converts a CMAF-applied multiplex signal, which is a multiplex signal to which CMAF is applied, into a CMAF-non-applied multiplex signal, which is a multiplex signal to which CMAF is not applied, and includes a separation unit, a descriptor conversion unit, a descriptor addition unit, an output unit, and a mixing unit.

With this configuration, the separation unit separates movie metadata, movie fragment metadata, a control message, and media data from the CMAF-applied multiplex signal.
The descriptor conversion unit converts the DTS-PTS difference information of the movie fragment metadata into a descriptor.
The descriptor addition unit adds that descriptor to the control message.
The output unit outputs the media data in fragment units in accordance with the output timing of the control message.
The mixing unit mixes the control message from the descriptor addition unit with the media data from the output unit and outputs the result as the CMAF-non-applied multiplex signal.
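The data flow through these units can be sketched in a few lines of Python. This is a simplified illustration under several assumptions: the `ControlMessage` class, the dictionary-based descriptor, and the tuple-based output stream are hypothetical stand-ins, and real MMTP packetization, buffering, and timing control are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ControlMessage:
    """Hypothetical stand-in for a control message; descriptors are dicts."""
    descriptors: List[dict] = field(default_factory=list)


def convert_descriptor(chunk_offsets: List[List[int]]) -> dict:
    """Descriptor conversion: flatten the per-chunk DTS-PTS differences of the
    movie fragment metadata into one descriptor covering the whole fragment."""
    return {
        "name": "extended_mpu_timestamp",  # hypothetical label
        "dts_pts_offsets": [o for chunk in chunk_offsets for o in chunk],
    }


def add_descriptor(message: ControlMessage, descriptor: dict) -> ControlMessage:
    """Descriptor addition: return a copy of the message with the descriptor."""
    return ControlMessage(descriptors=message.descriptors + [descriptor])


def mix(message: ControlMessage, media: List[bytes]) -> List[Tuple[str, object]]:
    """Output and mixing: the control message first, then the media data of
    the whole fragment."""
    return [("control", message)] + [("media", m) for m in media]


control = ControlMessage()
offsets_per_chunk = [[1, 4, 0, 0], [1, 4, 0, 0]]  # illustrative values
media_chunks = [b"chunk0", b"chunk1"]

stream = mix(add_descriptor(control, convert_descriptor(offsets_per_chunk)), media_chunks)
print([kind for kind, _ in stream])
```

Emitting the control message before the media data reflects the requirement described earlier that the descriptor be transmitted ahead of the MPU it describes.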

In this way, the multiplex signal conversion device can convert a CMAF-applied multiplex signal into a CMAF-non-applied multiplex signal, so the receiver can play back video and audio normally regardless of whether CMAF is applied.

Also to solve the above problem, a multiplex signal conversion device according to the present invention converts a CMAF-non-applied multiplex signal, which is a multiplex signal to which CMAF is not applied, into a CMAF-applied multiplex signal, which is a multiplex signal to which CMAF is applied, and includes a separation unit, a descriptor extraction/deletion unit, a conversion unit, an output unit, and a mixing unit.

With this configuration, the separation unit separates a control message and media data from the CMAF-non-applied multiplex signal.
The descriptor extraction/deletion unit extracts the descriptor containing the DTS-PTS difference information from the control message and deletes that descriptor from the control message.
The conversion unit converts the DTS-PTS difference information of the descriptor into movie fragment metadata.
The output unit outputs the media data in chunk units in accordance with the output timing of the movie fragment metadata.
The mixing unit mixes the control message from the descriptor extraction/deletion unit, the movie fragment metadata from the conversion unit, and the media data from the output unit, and outputs the result as the CMAF-applied multiplex signal.
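The reverse direction can be sketched in the same simplified style. Everything here is an illustrative assumption: descriptors are plain dictionaries, the descriptor label and the chunk size are hypothetical, and the placement of each movie fragment metadata item immediately before its media chunk is one possible reading of "in accordance with the output timing of the movie fragment metadata".

```python
from typing import List, Tuple

DESCRIPTOR_NAME = "extended_mpu_timestamp"  # hypothetical label


def extract_and_delete(descriptors: List[dict]) -> Tuple[List[int], List[dict]]:
    """Descriptor extraction/deletion: pull out the DTS-PTS differences and
    return the remaining descriptors without the extracted one."""
    offsets: List[int] = []
    remaining: List[dict] = []
    for descriptor in descriptors:
        if descriptor.get("name") == DESCRIPTOR_NAME:
            offsets.extend(descriptor["dts_pts_offsets"])
        else:
            remaining.append(descriptor)
    return offsets, remaining


def to_movie_fragments(offsets: List[int], chunk_frames: int) -> List[List[int]]:
    """Conversion: regroup the per-frame differences into per-chunk metadata."""
    return [offsets[i:i + chunk_frames] for i in range(0, len(offsets), chunk_frames)]


def mix(remaining: List[dict], fragments: List[List[int]],
        frames: List[bytes], chunk_frames: int) -> List[Tuple[str, object]]:
    """Output and mixing: control message, then metadata and media per chunk."""
    out: List[Tuple[str, object]] = [("control", remaining)]
    for index, fragment in enumerate(fragments):
        start = index * chunk_frames
        out.append(("movie_fragment_metadata", fragment))
        out.append(("media", frames[start:start + chunk_frames]))
    return out


CHUNK_FRAMES = 2  # ASSUMPTION: chunk size chosen only for this example
descriptors = [{"name": "other"}, {"name": DESCRIPTOR_NAME, "dts_pts_offsets": [1, 4, 0, 0]}]
frames = [b"f0", b"f1", b"f2", b"f3"]

offsets, remaining = extract_and_delete(descriptors)
fragments = to_movie_fragments(offsets, CHUNK_FRAMES)
stream = mix(remaining, fragments, frames, CHUNK_FRAMES)
print([kind for kind, _ in stream])
```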

In this way, the multiplex signal conversion device can convert a CMAF-non-applied multiplex signal into a CMAF-applied multiplex signal, so the receiver can play back video and audio normally regardless of whether CMAF is applied.

The present invention can also be realized as a program that causes a computer to function as the multiplex signal conversion device described above.
The present invention can also be realized as a receiver that includes the multiplex signal conversion device described above.

According to the present invention, the receiver can play back video and audio normally regardless of whether CMAF is applied.

FIG. 1 is a schematic configuration diagram of the broadcasting system according to each embodiment.
FIG. 2 is a block diagram showing the configuration of the MMT conversion device according to the first embodiment.
FIG. 3 is a flowchart showing the operation of the MMT conversion device according to the first embodiment.
FIG. 4 is a block diagram showing the configuration of the MMT conversion device according to Modification 1.
FIG. 5 is a block diagram showing the configuration of the MMT conversion device according to the second embodiment.
FIG. 6 is a flowchart showing the operation of the MMT conversion device according to the second embodiment.
FIG. 7 is a block diagram showing the configuration of the MMT conversion device according to Modification 2.
FIG. 8 shows the prior art: (a) is an explanatory diagram of the multiplexing of CMAF-non-applied MMT, and (b) is an explanatory diagram of the data structure of CMAF-non-applied MMT.
FIG. 9 shows the prior art: (a) is an explanatory diagram of the multiplexing of CMAF-applied MMT, and (b) is an explanatory diagram of the data structure of CMAF-applied MMT.

Embodiments of the present invention are described below with reference to the drawings. The embodiments described below are intended to embody the technical idea of the present invention and, unless specifically stated otherwise, do not limit the present invention to what is described. In the embodiments, identical means are given identical reference signs, and their description may be omitted.

(First Embodiment)
[Outline of broadcasting system]
An outline of the broadcasting system 100 according to the first embodiment is described with reference to FIG. 1.
As shown in FIG. 1, the broadcasting system 100 performs digital broadcasting and includes a coding device 2, a transmission device 3, and a receiver 4. The receiver 4 incorporates an MMT conversion device (multiplex signal conversion device) 1, which is described later.

The coding device 2 encodes the video of a broadcast program using a predetermined video coding method and outputs the coded video to the transmission device 3. In this embodiment, the video coding method is HEVC.

The transmission device 3 multiplexes the video and audio of the broadcast program using a predetermined multiplexing method and transmits them to the receiver 4. In this embodiment, the multiplexing method is MMT. That is, the transmission device 3 multiplexes the video and audio input from the coding device 2 and transmits them to the receiver 4 as an MMTP packet sequence.

The receiver 4 receives and demultiplexes the MMTP packet sequence transmitted by the transmission device 3, and decodes and plays back the video and audio of the broadcast program. Examples of the receiver 4 include an ordinary television, a smartphone, and a tablet. Although only one receiver 4 is shown in FIG. 1 to keep the drawing easy to read, there are normally a plurality of receivers 4.

Here, a transmission device 3 that supports CMAF may transmit CMAF-applied MMT (a CMAF-applied multiplex signal) to a receiver 4 that does not support CMAF. The receiver 4 therefore uses its built-in MMT conversion device 1 to convert the CMAF-applied MMT into CMAF-non-applied MMT (a CMAF-non-applied multiplex signal).
In this embodiment, the CMAF-applied MMT that carries the HEVC video signal as an asset is of the "hvc1" type, and the parameter sets are included in the MPU metadata.

[Configuration of MMT conversion device]
The configuration of the MMT conversion device 1 is described with reference to FIG. 2.
The MMT conversion device 1 converts the CMAF-applied MMT (hvc1) of FIG. 9(b) into the CMAF-non-applied MMT of FIG. 8(b). As shown in FIG. 2, the MMT conversion device 1 includes a packet filter (separation unit) 10, a message buffer 11, a descriptor conversion unit 12, a descriptor addition unit 13, a parameter set extraction unit 14, a parameter set addition unit 15, an MPU buffer (output unit) 16, and a packet mixing unit (mixing unit) 17.

The packet filter 10 separates movie fragment metadata, MPU metadata (movie metadata), a control message (MP table), media fragment units (media data), and other packets from the CMAF-applied MMT.

As shown in FIG. 9(b), the packet filter 10 performs this separation by referring to the PID and the fragment type (fragment_type) of each MMTP packet of the CMAF-applied MMT. Specifically, the packet filter 10 separates MMTP packets with PID = 0 as control messages (MP table), packets with PID = X and fragment type = 0 as MPU metadata, packets with PID = X and fragment type = 1 as movie fragment metadata, and packets with PID = X and fragment type = 2 as media fragment units. The packet filter 10 separates all remaining data (for example, control messages other than the MP table) as other packets.
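The classification rule above can be sketched as a small lookup. This is a minimal illustration, not the patented implementation: the names `PacketKind` and `classify_packet` are hypothetical, the packet is reduced to its PID and fragment type, and every packet on the control PID is treated as the MP table message (the text notes that other control messages would actually go to the "other packets" path).

```python
from enum import Enum
from typing import Optional


class PacketKind(Enum):
    CONTROL_MESSAGE = "control message (MP table)"
    MPU_METADATA = "MPU metadata"
    MOVIE_FRAGMENT_METADATA = "movie fragment metadata"
    MEDIA_FRAGMENT_UNIT = "media fragment unit"
    OTHER = "other packet"


# fragment_type values as listed in the description above.
FRAGMENT_TYPE_TO_KIND = {
    0: PacketKind.MPU_METADATA,
    1: PacketKind.MOVIE_FRAGMENT_METADATA,
    2: PacketKind.MEDIA_FRAGMENT_UNIT,
}


def classify_packet(pid: int, fragment_type: Optional[int], asset_pid: int,
                    control_pid: int = 0) -> PacketKind:
    """Classify one MMTP packet by PID and fragment type (simplified)."""
    if pid == control_pid:
        # Simplification: a real filter would also check that the message
        # actually carries the MP table.
        return PacketKind.CONTROL_MESSAGE
    if pid == asset_pid and fragment_type in FRAGMENT_TYPE_TO_KIND:
        return FRAGMENT_TYPE_TO_KIND[fragment_type]
    return PacketKind.OTHER
```

Here `asset_pid` plays the role of X and would be learned from the asset location information in the MP table.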

Here, the packet filter 10 can identify the PID (= X) that carries the asset to be converted by referring to the asset location information in the MP table. The control message with PID = 0, which is the entry point, contains a package list table, and the MP table may be carried on a different PID referenced from that table. In that case, the packet filter 10 refers to the package list table to identify the PID carrying the control message (MP table) and separates the control message accordingly.

The packet filter 10 then outputs the control messages (MP table) to the message buffer 11, the movie fragment metadata to the descriptor conversion unit 12, and the MPU metadata to the parameter set extraction unit 14. It also outputs the media fragment units to the parameter set addition unit 15 and the other packets to the packet mixing unit 17.

The message buffer 11 is a buffer that stores the control messages (MP table) received from the packet filter 10. In response to an output instruction from the descriptor conversion unit 12, the message buffer 11 outputs a control message (MP table) to the descriptor addition unit 13.

The descriptor conversion unit 12 converts the DTS-PTS difference information in the movie fragment metadata received from the packet filter 10 into a descriptor that is multiplexed into the control message for transmission. In this embodiment, that descriptor is the extended MPU timestamp descriptor. Specifically, the descriptor conversion unit 12 parses the movie fragment metadata and converts the DTS-PTS difference information into the format of the extended MPU timestamp descriptor. When conversion of the DTS-PTS difference information for one MPU is complete, the descriptor conversion unit 12 issues an output instruction to the message buffer 11 and outputs the extended MPU timestamp descriptor to the descriptor addition unit 13. The output instruction to the message buffer 11 specifies, for example, an MPU sequence number, and causes the message buffer 11 to output the control message (MP table) that contains the descriptor for that MPU (for example, the MPU timestamp descriptor).
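As a rough sketch of the conversion just described, the per-sample DTS-PTS differences of all movie fragments belonging to one MPU can be gathered into a single per-MPU record. Everything here is an assumption made for illustration: `SampleTiming`, `ExtendedTimestampInfo`, and `collect_dts_pts_offsets` are hypothetical names, the input is assumed to be already parsed from the movie fragment metadata, and the record is not the standardized bit syntax of the extended MPU timestamp descriptor.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SampleTiming:
    """Timing of one sample as parsed from movie fragment metadata (assumed)."""
    duration: int            # sample duration in timescale ticks
    composition_offset: int  # PTS minus DTS in timescale ticks


@dataclass
class ExtendedTimestampInfo:
    """Illustrative per-MPU record, not the standardized descriptor layout."""
    mpu_sequence_number: int
    dts_pts_offsets: List[int]
    sample_durations: List[int]


def collect_dts_pts_offsets(mpu_sequence_number: int,
                            fragments: Iterable[List[SampleTiming]]
                            ) -> ExtendedTimestampInfo:
    """Merge the sample timings of every fragment (chunk) of one MPU."""
    offsets: List[int] = []
    durations: List[int] = []
    for samples in fragments:
        for sample in samples:
            offsets.append(sample.composition_offset)
            durations.append(sample.duration)
    return ExtendedTimestampInfo(mpu_sequence_number, offsets, durations)
```

Collecting per MPU rather than per chunk mirrors the text: the output instruction is issued only once the information for a whole MPU has been converted.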

The descriptor addition unit 13 adds the extended MPU timestamp descriptor received from the descriptor conversion unit 12 to the control message (MP table) received from the message buffer 11. The resulting control message therefore carries an MP table to which the extended MPU timestamp descriptor has been added. The descriptor addition unit 13 outputs this control message (MP table) to the packet mixing unit 17. After doing so, it instructs the MPU buffer 16 to output the MPU that corresponds to the extended MPU timestamp descriptor.

The parameter set extraction unit 14 extracts the video coding parameter sets from the MPU metadata received from the packet filter 10. Specifically, it extracts the HEVC parameter sets from the Box-format metadata of the MPU metadata and outputs them to the parameter set addition unit 15.

The parameter set addition unit 15 adds the parameter sets received from the parameter set extraction unit 14 to the media fragment units received from the packet filter 10. Specifically, it inserts the HEVC parameter sets at the head of the media fragment unit of the first frame of each MPU. The parameter set addition unit 15 then outputs the media fragment units to the MPU buffer 16.
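A minimal sketch of this prepend step follows. It assumes that each NAL unit inside a media fragment unit is stored with a 4-byte big-endian length prefix; that framing is an assumption of this example, and `prepend_parameter_sets` is a hypothetical helper name.

```python
import struct
from typing import List


def prepend_parameter_sets(mfu_payload: bytes,
                           parameter_sets: List[bytes]) -> bytes:
    """Return the payload with length-prefixed parameter sets placed first.

    Assumes 4-byte big-endian length prefixes before each NAL unit.
    Intended for the media fragment unit of the first frame of an MPU.
    """
    prefix = b"".join(struct.pack(">I", len(nal)) + nal
                      for nal in parameter_sets)
    return prefix + mfu_payload
```

Only the first frame of each MPU would be passed through this function; all other media fragment units are forwarded unchanged.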

The MPU buffer 16 is a buffer that stores the media fragment units received from the parameter set addition unit 15. In step with the output timing of the control message (MP table), the MPU buffer 16 outputs the media fragment units, with the parameter sets added, in units of MPUs (fragments). That is, the MPU buffer 16 outputs to the packet mixing unit 17 the media fragment units of the MPU specified by the output instruction from the descriptor addition unit 13. This output instruction specifies the MPU to output by, for example, its MPU sequence number.
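The buffering behaviour described above can be illustrated with a small class keyed by MPU sequence number. This is a sketch under stated assumptions: `MpuBuffer`, `push`, and `release` are hypothetical names, and media fragment units are represented as opaque byte strings.

```python
from collections import defaultdict
from typing import Dict, List


class MpuBuffer:
    """Holds media fragment units until their MPU is released (sketch)."""

    def __init__(self) -> None:
        self._pending: Dict[int, List[bytes]] = defaultdict(list)

    def push(self, mpu_sequence_number: int, mfu: bytes) -> None:
        """Store one media fragment unit under its MPU sequence number."""
        self._pending[mpu_sequence_number].append(mfu)

    def release(self, mpu_sequence_number: int) -> List[bytes]:
        """Return and forget all units of the given MPU, in arrival order."""
        return self._pending.pop(mpu_sequence_number, [])
```

The `release` call corresponds to the output instruction from the descriptor addition unit 13; units of other MPUs stay buffered.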

The packet mixing unit 17 mixes the control messages (MP table) received from the descriptor addition unit 13, the media fragment units received from the MPU buffer 16, and the other packets received from the packet filter 10, and outputs the result as a non-CMAF-applied MMT.

Here, the message buffer 11, descriptor conversion unit 12, descriptor addition unit 13, parameter set extraction unit 14, parameter set addition unit 15, and MPU buffer 16 may each process data while keeping the MMTP packet format, or may process the target data after extracting it from the MMTP packet payload. In the former case, the packet mixing unit 17 takes multiple MMTP packet streams as input and outputs them mixed into a single MMTP packet stream. In the latter case, the packet mixing unit 17 generates MMTP packets whose payloads contain the control messages (MP table) and media fragment units, mixes them into the MMTP packet stream received as other packets, and outputs a single MMTP packet stream. Where necessary, the packet mixing unit 17 may also rewrite packet headers, for example to restore the continuity of packet sequence numbers in the output MMTP packet stream.
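One way to restore sequence-number continuity after mixing is to renumber the output stream per PID. The sketch below assumes a 32-bit counter kept separately for each PID that wraps to zero; packets are reduced to `(pid, payload)` tuples, and `renumber_packets` is a hypothetical name rather than part of the described device.

```python
from typing import Dict, Iterable, List, Tuple

SEQUENCE_MODULUS = 1 << 32  # assumed 32-bit packet sequence number


def renumber_packets(packets: Iterable[Tuple[int, bytes]],
                     start: int = 0) -> List[Tuple[int, int, bytes]]:
    """Assign consecutive sequence numbers to packets, counted per PID."""
    next_number: Dict[int, int] = {}
    renumbered = []
    for pid, payload in packets:
        number = next_number.get(pid, start)
        renumbered.append((pid, number, payload))
        next_number[pid] = (number + 1) % SEQUENCE_MODULUS
    return renumbered
```

Renumbering matters because the conversion removes some packets (MPU metadata, movie fragment metadata) and may repacketize others, which would otherwise look like packet loss to the downstream receiver.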

[Operation of the MMT conversion device]
The operation of the MMT conversion device 1 is described with reference to FIG. 3.
As shown in FIG. 3, in step S1, the packet filter 10 separates the CMAF-applied MMT into MPU metadata, movie fragment metadata, control messages (MP table), media fragment units, and other packets. The message buffer 11 stores the control messages (MP table) separated by the packet filter 10.

In step S2, the parameter set extraction unit 14 extracts the parameter sets from the MPU metadata.
In step S3, the parameter set addition unit 15 adds the parameter sets to the media fragment unit. The media fragment units are held in the MPU buffer 16 until an output instruction arrives from the descriptor addition unit 13.

In step S4, the descriptor conversion unit 12 converts the DTS-PTS difference information in the movie fragment metadata into an extended MPU timestamp descriptor and, when conversion for one MPU is complete, issues an output instruction to the message buffer 11. In response, the message buffer 11 outputs the control message (MP table) to the descriptor addition unit 13.

In step S5, the descriptor addition unit 13 adds the extended MPU timestamp descriptor to the control message (MP table) and outputs the control message. It then instructs the MPU buffer 16 to output the MPU that corresponds to the extended MPU timestamp descriptor.

In step S6, the MPU buffer 16 outputs to the packet mixing unit 17 the media fragment units of the MPU specified by the output instruction from the descriptor addition unit 13.
In step S7, the packet mixing unit 17 mixes the control messages (MP table), the media fragment units, and the other packets, and outputs the result as a non-CMAF-applied MMT.

Steps S1 to S7 need not be executed sequentially in the order shown in FIG. 3. Depending on the type and order of the incoming packets, the steps may be reordered or executed in parallel.
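The output ordering implied by steps S4 to S7, in which the control message carrying the new descriptor is emitted before the media fragment units of the same MPU, can be sketched as follows. This is an illustration only: `emit_mpu` is a hypothetical function, both buffers are modelled as plain dictionaries keyed by MPU sequence number, and a control message is modelled as a dictionary with a `descriptors` list.

```python
from typing import Any, Dict, List, Tuple


def emit_mpu(mpu_sequence_number: int,
             message_buffer: Dict[int, Dict[str, Any]],
             mpu_buffer: Dict[int, List[bytes]],
             descriptor: Any) -> List[Tuple[str, Any]]:
    """Emit one MPU: control message first, then its media fragment units."""
    # S4: fetch the buffered control message for this MPU.
    control = dict(message_buffer.pop(mpu_sequence_number))
    # S5: add the extended descriptor alongside the existing descriptors.
    control["descriptors"] = list(control.get("descriptors", [])) + [descriptor]
    output: List[Tuple[str, Any]] = [("control_message", control)]
    # S6: release the media fragment units of the same MPU.
    for mfu in mpu_buffer.pop(mpu_sequence_number, []):
        output.append(("media_fragment_unit", mfu))
    # S7 (mixing with other packets) is left to the caller.
    return output
```

Emitting the control message first gives a downstream receiver the timing information before the media data it applies to.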

[Operation and effects]
As described above, the MMT conversion device 1 converts the CMAF-applied MMT into a non-CMAF-applied MMT, so the receiver 4 can play back video and audio normally regardless of whether CMAF is applied. In other words, even when the receiver 4 does not support the chunk-structured ISOBMFF metadata defined by CMAF, it can play back video and audio normally by means of the MMT conversion device 1.

(Modification 1)
With reference to FIG. 4, the differences between the MMT conversion device 1B according to Modification 1 and the first embodiment are described.
In Modification 1, the CMAF-applied MMT uses "hev1", with the parameter sets contained in the media fragment units. Because the parameter sets are already present in the input media fragment units and can be output as they are, the MMT conversion device 1B does not need to extract them from the MPU metadata and add them to the media fragment units.

The MMT conversion device 1B converts the CMAF-applied MMT (hev1) of FIG. 9(b) into the non-CMAF-applied MMT of FIG. 8(b). As shown in FIG. 4, the MMT conversion device 1B includes a packet filter (separation unit) 10B, a message buffer 11, a descriptor conversion unit 12, a descriptor addition unit 13, an MPU buffer (output unit) 16B, and a packet mixing unit (mixing unit) 17.

The packet filter 10B outputs the movie fragment metadata separated from the CMAF-applied MMT to the descriptor conversion unit 12, and outputs the separated media fragment units to the MPU buffer 16B. If MPU metadata is received, the packet filter 10B discards it without outputting it. In all other respects the packet filter 10B is the same as in the first embodiment, so further description is omitted.

The MPU buffer 16B outputs the media fragment units received from the packet filter 10B in units of MPUs, in step with the output timing of the control message (MP table). In all other respects the MPU buffer 16B is the same as in the first embodiment, so further description is omitted.

[Operation and effects]
As described above, even when the CMAF-applied MMT uses "hev1", the MMT conversion device 1B converts the CMAF-applied MMT into a non-CMAF-applied MMT in the same way as the first embodiment, so the receiver 4 can play back video and audio normally regardless of whether CMAF is applied.

(Second Embodiment)
[Outline of the broadcasting system]
The outline of the broadcasting system 100C according to the second embodiment is described with reference to FIG. 1.
As shown in FIG. 1, the broadcasting system 100C performs digital broadcasting and includes an encoding device 2, a transmission device 3C, and a receiver 4C.

In this embodiment, the transmission device 3C, which does not support CMAF, sends a non-CMAF-applied MMT to the receiver 4C, which supports CMAF. The receiver 4C therefore uses its built-in MMT conversion device (multiplexed signal conversion device) 5 to convert the non-CMAF-applied MMT into a CMAF-applied MMT.

[Configuration of the MMT conversion device]
The configuration of the MMT conversion device 5 is described with reference to FIG. 5.
The MMT conversion device 5 converts the non-CMAF-applied MMT of FIG. 8(b) into the CMAF-applied MMT (hvc1) of FIG. 9(b). As shown in FIG. 5, the MMT conversion device 5 includes a packet filter (separation unit) 50, a descriptor extraction/deletion unit 51, a parameter set extraction/deletion unit 52, a metadata conversion unit (conversion unit) 53, an MPU buffer (output unit) 54, and a packet mixing unit (mixing unit) 55.

The packet filter 50 separates the non-CMAF-applied MMT into control messages (MP table), media fragment units (media data), and other packets.

As shown in FIG. 8(b), the packet filter 50 performs this separation by referring to the PID of each MMTP packet of the non-CMAF-applied MMT. Specifically, the packet filter 50 separates MMTP packets with PID = 0 as control messages (MP table) and MMTP packets with PID = X as media fragment units. The packet filter 50 separates all data other than the control messages (MP table) and media fragment units as other packets.

Here, the packet filter 50 can identify the PID (= X) that carries the asset to be converted by referring to the asset location information in the MP table. The control message with PID = 0, which is the entry point, contains a package list table, and the MP table may be carried on a different PID referenced from that table. In that case, the packet filter 50 refers to the package list table to identify the PID carrying the control message (MP table) and separates the control message accordingly.

The packet filter 50 outputs the control messages (MP table) to the descriptor extraction/deletion unit 51, the media fragment units to the parameter set extraction/deletion unit 52, and the other packets to the packet mixing unit 55.

The descriptor extraction/deletion unit 51 extracts the extended MPU timestamp descriptor from the control message (MP table) received from the packet filter 50 and deletes that descriptor from the control message. It outputs the extracted extended MPU timestamp descriptor to the metadata conversion unit 53, and outputs the control message (MP table) from which the descriptor has been deleted to the packet mixing unit 55.

The parameter set extraction/deletion unit 52 extracts the video coding parameter sets from the media fragment units received from the packet filter 50 and deletes them from those media fragment units. Specifically, it extracts the HEVC parameter sets from the media fragment unit of the first frame of each MPU and outputs them to the metadata conversion unit 53. It then outputs the media fragment units from which the parameter sets have been deleted to the MPU buffer 54.
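The extract-and-delete step can be sketched by walking the NAL units of a media fragment unit and separating out the HEVC parameter sets. HEVC identifies VPS, SPS, and PPS by nal_unit_type values 32, 33, and 34, carried in bits 1 to 6 of the first NAL header byte. The 4-byte big-endian length prefix is an assumption of this example, and `split_parameter_sets` is a hypothetical name.

```python
import struct
from typing import List, Tuple

# HEVC nal_unit_type values for VPS, SPS and PPS.
PARAMETER_SET_TYPES = {32, 33, 34}


def split_parameter_sets(mfu_payload: bytes) -> Tuple[List[bytes], bytes]:
    """Return (parameter set NAL units, payload with them removed).

    Assumes each NAL unit is preceded by a 4-byte big-endian length.
    """
    parameter_sets: List[bytes] = []
    remaining: List[bytes] = []
    pos = 0
    while pos + 4 <= len(mfu_payload):
        (length,) = struct.unpack_from(">I", mfu_payload, pos)
        nal = mfu_payload[pos + 4:pos + 4 + length]
        if length == 0 or len(nal) != length:
            raise ValueError("malformed length-prefixed NAL unit")
        nal_unit_type = (nal[0] >> 1) & 0x3F
        if nal_unit_type in PARAMETER_SET_TYPES:
            parameter_sets.append(nal)
        else:
            remaining.append(mfu_payload[pos:pos + 4 + length])
        pos += 4 + length
    if pos != len(mfu_payload):
        raise ValueError("trailing bytes after last NAL unit")
    return parameter_sets, b"".join(remaining)
```

The first return value would go to the metadata conversion unit 53 and the second to the MPU buffer 54.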

The metadata conversion unit 53 converts the DTS-PTS difference information in the extended MPU timestamp descriptor received from the descriptor extraction/deletion unit 51 into movie fragment metadata. Specifically, it parses the extended MPU timestamp descriptor and converts the DTS-PTS difference information into Box-format metadata as defined by ISOBMFF and CMAF.
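To give a feel for the Box-format side of this conversion, the sketch below serializes a list of DTS-PTS differences as an ISOBMFF track run box (`trun`) with only the sample-composition-time-offsets-present flag (0x000800) set. It is a deliberately reduced illustration: a complete movie fragment also needs the enclosing `moof` and `traf` boxes and further fields, the patent does not state which boxes the device writes, and `build_trun` is a hypothetical name.

```python
import struct
from typing import List

TRUN_COMPOSITION_OFFSETS_PRESENT = 0x000800


def build_trun(composition_offsets: List[int]) -> bytes:
    """Serialize a version 1 'trun' box carrying only composition offsets."""
    version = 1  # version 1 stores composition offsets as signed integers
    flags = TRUN_COMPOSITION_OFFSETS_PRESENT
    body = struct.pack(">I", (version << 24) | flags)
    body += struct.pack(">I", len(composition_offsets))  # sample_count
    for offset in composition_offsets:
        body += struct.pack(">i", offset)
    # Box header: 32-bit size (header included) followed by the type.
    return struct.pack(">I4s", 8 + len(body), b"trun") + body
```

For three samples the resulting box is 28 bytes long: an 8-byte header, 4 bytes of version and flags, a 4-byte sample count, and three 4-byte offsets.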

The metadata conversion unit 53 also generates MPU metadata (movie metadata) containing the parameter sets received from the parameter set extraction/deletion unit 52. Specifically, it generates Box-format metadata, as defined by ISOBMFF and CMAF, that contains the HEVC parameter sets, and outputs it to the packet mixing unit 55 as MPU metadata. The MPU metadata is output only once, at the head of each MPU.

When conversion of the DTS-PTS difference information for one chunk is complete, the metadata conversion unit 53 outputs the movie fragment metadata to the packet mixing unit 55 and issues an output instruction to the MPU buffer 54. This output instruction specifies the chunk that the MPU buffer 54 should output, synchronized with the output timing of the movie fragment metadata.

The MPU buffer 54 is a buffer that stores the media fragment units received from the parameter set extraction/deletion unit 52. In step with the output timing of the movie fragment metadata, the MPU buffer 54 outputs media fragment units in units of chunks. That is, it outputs to the packet mixing unit 55 the media fragment units that correspond to the chunk specified by the output instruction from the metadata conversion unit 53. This output instruction specifies the chunk by, for example, the MPU sequence number and the position of the chunk within that MPU.

The packet mixing unit 55 mixes the control messages (MP table) received from the descriptor extraction/deletion unit 51, the MPU metadata and movie fragment metadata received from the metadata conversion unit 53, the media fragment units received from the MPU buffer 54, and the other packets received from the packet filter 50, and outputs the result as a CMAF-applied MMT.

Here, the descriptor extraction/deletion unit 51, parameter set extraction/deletion unit 52, and MPU buffer 54 may process data while keeping the MMTP packet format, with the metadata conversion unit 53 generating MMTP packets, or they may process the target data after extracting it from the MMTP packet payload. In the former case, the packet mixing unit 55 takes multiple MMTP packet streams as input and outputs them mixed into a single MMTP packet stream. In the latter case, the packet mixing unit 55 generates MMTP packets whose payloads contain the control messages (MP table), MPU metadata, movie fragment metadata, and media fragment units, mixes them into the MMTP packet stream received as other packets, and outputs a single MMTP packet stream. Where necessary, the packet mixing unit 55 may also rewrite packet headers, for example to restore the continuity of packet sequence numbers in the output MMTP packet stream.

[Operation of the MMT conversion device]
The operation of the MMT conversion device 5 is described with reference to FIG. 6.
As shown in FIG. 6, in step S10, the packet filter 50 separates the non-CMAF-applied MMT into control messages (MP table), media fragment units, and other packets.

In step S11, the parameter set extraction/deletion unit 52 extracts the video coding parameter sets from the media fragment unit and deletes them from that media fragment unit.

In step S12, the descriptor extraction/deletion unit 51 extracts the extended MPU timestamp descriptor from the control message (MP table) and deletes it from that control message.

In step S13, the metadata conversion unit 53 generates MPU metadata containing the extracted parameter sets.
In step S14, the metadata conversion unit 53 generates movie fragment metadata by converting the DTS-PTS difference information of the extended MPU timestamp descriptor.

In step S15, the MPU buffer 54 outputs media fragment units in units of chunks, in step with the output timing of the movie fragment metadata.
In step S16, the packet mixing unit 55 mixes the control messages, the MPU metadata, the movie fragment metadata, the media fragment units, and the other packets, and outputs the result as a CMAF-applied MMT.

Steps S10 to S16 need not be executed sequentially in the order shown in FIG. 6. Depending on the type and order of the incoming packets, the steps may be reordered or executed in parallel.

[Operation and effects]
As described above, the MMT conversion device 5 converts the non-CMAF-applied MMT into a CMAF-applied MMT, so the receiver 4C can play back video and audio normally regardless of whether CMAF is applied. In other words, even when the receiver 4C supports only the chunk-structured ISOBMFF metadata defined by CMAF, it can play back video and audio normally by means of the MMT conversion device 5.

(Modification 2)
With reference to FIG. 7, the differences between the MMT conversion device 5B according to Modification 2 and the second embodiment are described.
In Modification 2, the CMAF-applied MMT uses "hev1", with the parameter sets contained in the media fragment units. Because the parameter sets are already present in the input media fragment units and can be output as they are, the MMT conversion device 5B does not need to extract them from the media fragment units and delete them.

The MMT conversion device 5B converts the non-CMAF-applied MMT of FIG. 8(b) into the CMAF-applied MMT (hev1) of FIG. 9(b). As shown in FIG. 7, the MMT conversion device 5B includes a packet filter (separation unit) 50B, a descriptor extraction/deletion unit 51, a metadata conversion unit (conversion unit) 53B, an MPU buffer (output unit) 54B, and a packet mixing unit (mixing unit) 55.

The packet filter 50B outputs the control messages (MP table) separated from the non-CMAF-applied MMT to the descriptor extraction/deletion unit 51. In all other respects the packet filter 50B is the same as in the second embodiment, so further description is omitted.

The metadata conversion unit 53B is the same as in the second embodiment except that it does not generate MPU metadata containing parameter sets, so further description is omitted. The metadata conversion unit 53B may instead generate and output MPU metadata that contains no parameter sets (not shown).

The MPU buffer 54B outputs the media fragment units received from the packet filter 50B in units of chunks, in step with the output timing of the control message (MP table). In all other respects the MPU buffer 54B is the same as in the second embodiment, so further description is omitted.

[Operation and effects]
As described above, even when the CMAF-applied MMT uses "hev1", the MMT conversion device 5B converts the non-CMAF-applied MMT into a CMAF-applied MMT in the same way as the second embodiment, so the receiver 4C can play back video and audio normally regardless of whether CMAF is applied.

以上、本発明の各実施形態を詳述してきたが、本発明はこれらに限られるものではなく、本発明の要旨を逸脱しない範囲の設計変更等も含まれる。 Although each embodiment of the present invention has been described in detail above, the present invention is not limited to these embodiments, and includes design changes and the like within a range that does not deviate from the gist of the present invention.
前記した各実施形態では、多重方式がＭＭＴであることとして説明したが、これに限定されない。例えば、ＭＭＴ変換装置時への入力及びＭＭＴ変換装置からの出力の少なくとも一方において、多重方式がＤＡＳＨ／ＲＯＵＴＥであってもよい。 In each of the above-described embodiments, the multiplexing method has been described as MMT, but the present invention is not limited thereto. For example, the multiplexing method may be DASH/ROUTE for at least one of the input to the MMT conversion device and the output from the MMT conversion device.

前記した各実施形態では、映像符号化方式がＨＥＶＣであることとして説明したが、これに限定されない。例えば、映像符号化方式は、ＡＶＣ（Advanced Video Coding）、ＶＶＣ（Versatile Video Coding）であってもよい。また、本発明は、符号化方式が映像符号化方式に限られず、音声符号化方式であるＡＡＣや３ＤＡ（3D Audio）にも適用できる。 In each of the above-described embodiments, the video coding method has been described as HEVC, but the present invention is not limited thereto. For example, the video coding method may be AVC (Advanced Video Coding) or VVC (Versatile Video Coding). Further, the present invention is not limited to the video coding method as the coding method, and can be applied to AAC and 3DA (3D Audio) which are audio coding methods.

前記した各実施形態では、ＭＭＴ変換装置が受信機に内蔵されていることとして説明したが、これに限定されない。例えば、ＭＭＴ変換装置は、独立したハードウェアとして実装してもよい。また、放送局側の符号化装置又は送出装置がＭＭＴ変換装置を内蔵してもよい。 In each of the above-described embodiments, the MMT converter is described as being built in the receiver, but the present invention is not limited thereto. For example, the MMT converter may be implemented as independent hardware. Further, the coding device or the transmitting device on the broadcasting station side may have a built-in MMT conversion device.

また、コンピュータが備えるＣＰＵ、メモリ、ハードディスク等のハードウェア資源を、前記したＭＭＴ変換装置として動作させるプログラムで実現することもできる。これらのプログラムは、通信回線を介して配布してもよく、ＣＤ−ＲＯＭやフラッシュメモリ等の記録媒体に書き込んで配布してもよい。 The present invention can also be realized by a program that causes hardware resources of a computer, such as a CPU, memory, and hard disk, to operate as the MMT conversion device described above. Such a program may be distributed via a communication line, or may be written to a recording medium such as a CD-ROM or flash memory and distributed.
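As a rough illustration of what such a program might look like for the direction from CMAF-applied MMT to CMAF-non-applied MMT (the MMT conversion device 1), the following sketch mirrors the units 10 to 17. It is a simplified model and not the program of the embodiments: the function name `convert_to_non_cmaf`, the dictionary keys, and the representation of a media fragment unit as a list of NAL unit labels are hypothetical, the inputs are assumed to have been separated already by the packet filter 10, and prepending the parameter set to the first media fragment unit is an assumption made only for this example.

```python
def convert_to_non_cmaf(mpu_metadata, movie_fragment_metadata,
                        control_message, mfus):
    """Sketch of device 1: CMAF-applied MMT -> CMAF-non-applied MMT.

    The four arguments stand for the outputs of packet filter 10.
    """
    # Descriptor conversion unit 12: turn the DTS-PTS difference information
    # of the movie fragment metadata into an extended MPU timestamp descriptor.
    descriptor = {
        "tag": "extended_mpu_timestamp",
        "dts_pts_offsets": list(movie_fragment_metadata["dts_pts_offsets"]),
    }

    # Descriptor addition unit 13: add the descriptor to a copy of the
    # control message (held meanwhile in message buffer 11).
    new_control = dict(control_message)
    new_control["descriptors"] = (
        list(control_message.get("descriptors", [])) + [descriptor]
    )

    # Parameter set extraction unit 14: read the parameter set from the
    # MPU metadata.
    parameter_sets = list(mpu_metadata.get("parameter_sets", []))

    # Parameter set addition unit 15: add the parameter set to the media
    # fragment units (assumption: in front of the first unit of the MPU).
    new_mfus = [list(m) for m in mfus]
    if new_mfus and parameter_sets:
        new_mfus[0] = parameter_sets + new_mfus[0]

    # MPU buffer 16 and packet mixing unit 17: output the control message
    # followed by the media fragment units of one MPU.
    return [("control", new_control), ("mpu", new_mfus)]
```

For example, calling the function with parameter sets `["VPS", "SPS", "PPS"]` and two media fragment units `[["IDR"], ["P"]]` yields a control message that carries the new descriptor, followed by one MPU whose first unit begins with the three parameter sets.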

１，１Ｂ ＭＭＴ変換装置（多重信号変換装置）
１０，１０Ｂ パケットフィルタ（分離部）
１１ メッセージバッファ
１２ 記述子変換部
１３ 記述子追加部
１４ パラメータセット抽出部
１５ パラメータセット追加部
１６，１６Ｂ ＭＰＵバッファ（出力部）
１７ パケット混合部（混合部）
２ 符号化装置
３，３Ｃ 送出装置
４，４Ｃ 受信機
５，５Ｂ ＭＭＴ変換装置（多重信号変換装置）
５０，５０Ｂ パケットフィルタ（分離部）
５１ 記述子抽出・削除部
５２ パラメータセット抽出・削除部
５３，５３Ｂ メタデータ変換部（変換部）
５４，５４Ｂ ＭＰＵバッファ（出力部）
５５ パケット混合部（混合部）
１００，１００Ｃ 放送システム
1, 1B MMT conversion device (multiplex signal conversion device)
10, 10B Packet filter (separation unit)
11 Message buffer
12 Descriptor conversion unit
13 Descriptor addition unit
14 Parameter set extraction unit
15 Parameter set addition unit
16, 16B MPU buffer (output unit)
17 Packet mixing unit (mixing unit)
2 Coding device
3, 3C Transmission device
4, 4C Receiver
5, 5B MMT conversion device (multiplex signal conversion device)
50, 50B Packet filter (separation unit)
51 Descriptor extraction/deletion unit
52 Parameter set extraction/deletion unit
53, 53B Metadata conversion unit (conversion unit)
54, 54B MPU buffer (output unit)
55 Packet mixing unit (mixing unit)
100, 100C Broadcasting system

Claims

ＣＭＡＦを適用した多重信号であるＣＭＡＦ適用多重信号を、ＣＭＡＦを適用していない多重信号であるＣＭＡＦ非適用多重信号に変換する多重信号変換装置であって、
前記ＣＭＡＦ適用多重信号からムービーメタデータとムービーフラグメントメタデータと制御メッセージとメディアデータとを分離する分離部と、
前記ムービーフラグメントメタデータのＤＴＳ−ＰＴＳ差分情報を記述子に変換する記述子変換部と、
前記記述子を前記制御メッセージに追加する記述子追加部と、
前記制御メッセージの出力タイミングに従って、フラグメント単位で前記メディアデータを出力する出力部と、
前記記述子追加部からの制御メッセージと前記出力部からのメディアデータとを混合し、前記ＣＭＡＦ非適用多重信号として出力する混合部と、
を備えることを特徴とする多重信号変換装置。
A multiplex signal conversion device that converts a CMAF-applied multiplex signal, which is a multiplex signal to which CMAF is applied, into a CMAF-non-applied multiplex signal, which is a multiplex signal to which CMAF is not applied, the device comprising:
a separation unit that separates movie metadata, movie fragment metadata, a control message, and media data from the CMAF-applied multiplex signal;
a descriptor conversion unit that converts DTS-PTS difference information of the movie fragment metadata into a descriptor;
a descriptor addition unit that adds the descriptor to the control message;
an output unit that outputs the media data in fragment units according to the output timing of the control message; and
a mixing unit that mixes the control message from the descriptor addition unit and the media data from the output unit, and outputs the result as the CMAF-non-applied multiplex signal.

前記ムービーメタデータから符号化のパラメータセットを抽出するパラメータセット抽出部と、
前記パラメータセットを前記メディアデータに追加するパラメータセット追加部と、をさらに備え、
前記出力部は、前記制御メッセージの出力タイミングに従って、前記パラメータセットが追加されたメディアデータを前記フラグメント単位で出力することを特徴とする請求項１に記載の多重信号変換装置。
The multiplex signal conversion device according to claim 1, further comprising:
a parameter set extraction unit that extracts a coding parameter set from the movie metadata; and
a parameter set addition unit that adds the parameter set to the media data,
wherein the output unit outputs the media data to which the parameter set has been added in the fragment units according to the output timing of the control message.

ＣＭＡＦを適用していない多重信号であるＣＭＡＦ非適用多重信号を、ＣＭＡＦを適用した多重信号であるＣＭＡＦ適用多重信号に変換する多重信号変換装置であって、
前記ＣＭＡＦ非適用多重信号から制御メッセージとメディアデータとを分離する分離部と、
前記制御メッセージからＤＴＳ−ＰＴＳ差分情報を含む記述子を抽出すると共に、前記制御メッセージの前記記述子を削除する記述子抽出・削除部と、
前記記述子のＤＴＳ−ＰＴＳ差分情報をムービーフラグメントメタデータに変換する変換部と、
前記ムービーフラグメントメタデータの出力タイミングに従って、チャンク単位で前記メディアデータを出力する出力部と、
前記記述子抽出・削除部からの制御メッセージと前記変換部からのムービーフラグメントメタデータと前記出力部からのメディアデータとを混合し、前記ＣＭＡＦ適用多重信号として出力する混合部と、
を備えることを特徴とする多重信号変換装置。
A multiplex signal conversion device that converts a CMAF-non-applied multiplex signal, which is a multiplex signal to which CMAF is not applied, into a CMAF-applied multiplex signal, which is a multiplex signal to which CMAF is applied, the device comprising:
a separation unit that separates a control message and media data from the CMAF-non-applied multiplex signal;
a descriptor extraction/deletion unit that extracts a descriptor including DTS-PTS difference information from the control message and deletes the descriptor from the control message;
a conversion unit that converts the DTS-PTS difference information of the descriptor into movie fragment metadata;
an output unit that outputs the media data in chunk units according to the output timing of the movie fragment metadata; and
a mixing unit that mixes the control message from the descriptor extraction/deletion unit, the movie fragment metadata from the conversion unit, and the media data from the output unit, and outputs the result as the CMAF-applied multiplex signal.

前記メディアデータから符号化のパラメータセットを抽出すると共に、前記メディアデータの前記パラメータセットを削除するパラメータセット抽出・削除部、をさらに備え、
前記変換部は、前記パラメータセットを含むムービーメタデータを生成することを特徴とする請求項３に記載の多重信号変換装置。
The multiplex signal conversion device according to claim 3, further comprising a parameter set extraction/deletion unit that extracts a coding parameter set from the media data and deletes the parameter set from the media data,
wherein the conversion unit generates movie metadata including the parameter set.

前記多重信号がＭＭＴ、前記ムービーメタデータがＭＰＵメタデータ、前記メディアデータがメディアフラグメントユニット、及び、前記記述子が拡張ＭＰＵタイムスタンプ記述子であることを特徴とする請求項１、請求項２又は請求項４に記載の多重信号変換装置。 The multiplex signal conversion device according to claim 1, claim 2, or claim 4, wherein the multiplex signal is an MMT, the movie metadata is MPU metadata, the media data is a media fragment unit, and the descriptor is an extended MPU time stamp descriptor.

前記多重信号がＭＭＴ、前記メディアデータがメディアフラグメントユニット、及び、前記記述子が拡張ＭＰＵタイムスタンプ記述子であることを特徴とする請求項３に記載の多重信号変換装置。 The multiplex signal conversion device according to claim 3, wherein the multiplex signal is an MMT, the media data is a media fragment unit, and the descriptor is an extended MPU time stamp descriptor.

前記フラグメントがＭＰＵであることを特徴とする請求項１又は請求項２に記載の多重信号変換装置。 The multiplex signal conversion device according to claim 1 or 2, wherein the fragment is an MPU.

コンピュータを、請求項１から請求項７の何れか一項に記載の多重信号変換装置として機能させるためのプログラム。 A program for causing a computer to function as the multiplex signal conversion device according to any one of claims 1 to 7.

請求項１から請求項７の何れか一項に記載の多重信号変換装置を備える受信機。 A receiver including the multiplex signal conversion device according to any one of claims 1 to 7.