JP5523357B2

JP5523357B2 - Video quality estimation apparatus, method and program

Info

Publication number: JP5523357B2
Application number: JP2011000622A
Authority: JP
Inventors: 太一河野; 敬志郎渡辺; 淳岡本
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2011-01-05
Filing date: 2011-01-05
Publication date: 2014-06-18
Anticipated expiration: 2031-01-05
Also published as: JP2012142848A

Description

The present invention relates to a video quality estimation apparatus, method, and program, and more particularly to a video quality estimation apparatus, method, and program for estimating the video quality of video communication performed via an IP network.

With the growth of IP network bandwidth, the improved performance of terminal equipment, and advances in video coding technology, video communication services that use video media (for example, IPTV services and videophone services) have become widespread.

In communication using video media, the amount of information in the video media is compressed (video coding) to improve network utilization efficiency. Reducing video information through video coding causes phenomena such as mosaic-like distortion (block noise), blurring, smearing, and jerkiness in the video signal, and lowers the quality perceived by the user (quality of experience, QoE).

To provide these services with good quality, quality design before service provision and quality management during service provision are important. This requires a simple and efficient video quality assessment technique that appropriately quantifies the quality experienced by the user.

Conventionally, there are video quality assessment methods that estimate video quality by comparing the pixel information of a video signal without degradation (original video signal) with that of a video signal in which coding degradation or packet loss degradation has occurred (degraded video signal) (see, for example, Patent Document 1 and Non-Patent Documents 1 and 2).

There is also a video quality assessment method that calculates quality parameters affecting video quality (bit rate, packet loss rate, and so on) from packet header information and estimates video quality from those quality parameters (see, for example, Patent Document 2).

Furthermore, there is a video quality assessment method that estimates video quality by using quality parameters together with feature amounts of the video content calculated from pixel information (see, for example, Non-Patent Document 3).

Patent Document 1: Japanese Patent No. 4257333
Patent Document 2: Japanese Patent No. 4490483

Non-Patent Document 1: ITU-T Recommendation J.144
Non-Patent Document 2: ITU-T Recommendation J.247
Non-Patent Document 3: K. Yamagishi, T. Kawano, and T. Hayashi, "Hybrid video-quality-estimation model for IPTV services," IEEE GLOBECOM 2009, CQPRM13-1, Nov. 2009.

However, the techniques of Patent Document 1 and Non-Patent Documents 1 and 2 require the original video signal, and therefore cannot be applied in an environment where the original video signal is unavailable.

The technique of Patent Document 2 does not require the original video signal, but it cannot take into account the variation in quality caused by differences in the characteristics of the video content, such as video with a lot of motion (content dependency). FIG. 1 plots the relationship between bit rate and video quality based on the results of subjective quality evaluation experiments. As FIG. 1 shows, video quality differs with the video content even at the same bit rate. To estimate video quality accurately, content dependency must be taken into account.

The technique of Non-Patent Document 3 addresses content dependency by using, in addition to the quality parameters, feature amounts that quantify the characteristics of the content calculated from pixel information (content feature amounts). However, it is difficult to express the characteristics of widely varying content with pixel-based content feature amounts alone, and content feature amounts by themselves cannot be said to capture content dependency fully.

The present invention has been made in view of the above. Its object is to provide an apparatus, method, and program that estimate video quality from the degraded video signal alone, without using the original video signal, while considering content dependency from multiple aspects.

To solve the above problems, the present invention is a video quality estimation apparatus that estimates the video quality experienced by a user of a service that distributes video over a network, the apparatus comprising:
feature amount deriving means for calculating feature amounts from the header information of packets used in video distribution and from a video signal obtained by decoding the video bitstream contained in the packets; and
video quality deriving means for calculating video quality from the feature amounts,
wherein the feature amount deriving means comprises:
bit rate calculating means for calculating, from the header information of the packets used in the video distribution, the number of bits per unit time of the packets or of the video bitstream (bit rate);
motion amount calculating means for calculating a motion amount from the video signal obtained by decoding the video bitstream; and
block noise amount calculating means for calculating a block noise amount from the video signal obtained by decoding the video bitstream,
and wherein the video quality deriving means comprises video quality estimating means for estimating video quality from the bit rate, the block noise amount, and the motion amount.

As described above, according to the present invention, feature amounts that quantify the characteristics of the content and the coding degradation are extracted from packet header information and the video signal, and the video quality value of a video communication service can be estimated from them. It is therefore possible to judge more accurately whether a certain level of quality is being maintained for the users of the service, which in turn makes it possible to grasp and manage the actual quality of the service being provided.

FIG. 1 is a diagram showing the content dependency of video quality.
FIG. 2 is a diagram showing the quality characteristics of the motion amount.
FIG. 3 is a configuration diagram of the video quality estimation apparatus according to an embodiment of the present invention.
FIG. 4 is a diagram showing the coded block boundary region used by the block noise amount calculation unit according to an embodiment of the present invention.
FIG. 5 is a diagram showing the relationship between bit rate and average video quality according to an embodiment of the present invention.

Embodiments of the present invention are described below with reference to the drawings.

In the present invention, the items used to evaluate content dependency comprise the quality parameter obtained from packet header information, the content feature amount obtained from pixel information, and, in addition, a feature amount obtained from pixel information that quantifies coding degradation. Considering not only the characteristics of the content but also the degradation that differs from content to content captures content dependency from multiple aspects and improves the accuracy of video quality estimation.

Specifically, video quality is estimated with content dependency taken into account by using the "bit rate" as the quality parameter obtained from packet header information, the "motion amount" as the content feature amount obtained from pixel information, and the "block noise amount" as the feature amount obtained from pixel information that quantifies coding degradation.

(1) Bit rate (packet header information)

The bit rate is the number of bits per unit time of the packets, or of the video bitstream carried in the packets. It is obtained by computing the number of bits per unit time from packet header information (for example, the packet length in the IP packet header and the timestamp in the RTP packet header).

Video coding reduces video information in order to compress the amount of information. A high or low bit rate therefore means that little or much video information has been removed. For the same video content, the higher the bit rate, the higher the quality; conversely, the lower the bit rate, the more video information is removed and the lower the quality. For this reason, the bit rate, which quantifies coding degradation, is used as one of the feature amounts that affect video quality.

(2) Motion amount (content feature amount)

Some video information is redundant, and some is not perceived by the human visual system; removing such information does not lower quality much. The nature of the video information thus determines how strongly a given reduction affects quality. Among the characteristics of the content carried by the video information, "motion" has a large effect on quality, so a feature amount that quantifies motion is used as an evaluation item for content dependency.

The motion amount, which quantifies motion, is calculated as follows. For every pair of two adjacent frames, the difference between the luminance values at the same pixel position in the two frames is obtained for all pixels, and the standard deviation of those differences is calculated. The motion amount is the average of these standard deviations over all pairs of adjacent frames.

Video coding compresses the amount of information by reducing the temporal redundancy of pixel values. In video whose pixels change greatly over time, that is, video with a lot of motion, the temporal redundancy of pixel values is low and it is difficult to compress the information while maintaining quality. Conversely, in video with little motion the temporal redundancy of pixel values is high and the information is easy to compress while maintaining quality. FIG. 2 shows the relationship between bit rate and video quality; darker plot points indicate a smaller motion amount and lighter plot points a larger motion amount. At the same bit rate, quality tends to be lower as the motion amount becomes larger.

(3) Block noise amount (degradation feature amount)

Among the kinds of degradation that occur in the degraded video signal, block noise is the principal form of coding degradation. The "block noise amount", which quantifies block noise, is used as an evaluation item for content dependency.

Video coding divides a frame or field into rectangular blocks (coded blocks) and reduces information block by block. As a result, unnatural steps in pixel values (luminance and color difference values) that do not exist in the original video signal appear near the boundaries of the coded blocks. This degradation is called block noise. The block noise amount, which quantifies block noise, is calculated as follows.

An edge extraction filter is applied to the target video signal. This extracts the edges produced by block noise near the boundaries of the coded blocks (the coded block boundary region).

Next, the edge amount of the coded block boundary region is calculated for each frame. Because the coded block boundary region also contains edges of the original video signal, the edge amount of the coded block boundary region is normalized by dividing it by the edge amount of the region outside the coded block boundary region. The block noise amount is the average of the normalized values over all frames.

A larger block noise amount indicates that more block noise has occurred near the coded block boundaries, and video quality falls accordingly.

FIG. 3 is a configuration diagram of the video quality estimation apparatus according to an embodiment of the present invention.

The video quality estimation apparatus shown in FIG. 3 comprises a feature amount deriving unit 10 and a video quality deriving unit 20.

The feature amount deriving unit 10 comprises a bit rate calculation unit 100, a block noise amount calculation unit 101, and a motion amount calculation unit 102.

The feature amount deriving unit 10 takes as input the IP packets used for video communication and the video signal obtained by decoding the video bitstream contained in those IP packets. It passes the IP packets to the bit rate calculation unit 100 and the video signal to the block noise amount calculation unit 101 and the motion amount calculation unit 102.

The bit rate calculation unit 100 calculates, from the input IP packets, the number of bits per unit time of the IP packets or of the video bitstream (bit rate Br). For example, Br can be calculated from the packet length contained in the IP packet header and the timestamp contained in the RTP (Real-time Transport Protocol) header. The bit rate Br is passed to the video quality deriving unit 20.
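The calculation in the bit rate calculation unit 100 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `bit_rate_bps`, the input layout (a list of packet length and RTP timestamp pairs), and the 90 kHz RTP clock rate are assumptions introduced here, and the source does not specify the measurement window.

```python
def bit_rate_bps(packets, clock_rate_hz=90_000):
    """Estimate the bit rate Br in bits per second.

    packets: list of (ip_length_bytes, rtp_timestamp) tuples in stream order.
    clock_rate_hz: RTP clock rate; 90 kHz is assumed here (common for video).
    """
    if len(packets) < 2:
        raise ValueError("at least two packets are needed to measure a duration")
    # Total number of bits carried by the observed packets.
    total_bits = 8 * sum(length for length, _ in packets)
    # RTP timestamps are 32-bit counters, so take the difference modulo 2**32.
    elapsed_ticks = (packets[-1][1] - packets[0][1]) % (1 << 32)
    if elapsed_ticks == 0:
        raise ValueError("packets span zero time; cannot compute a rate")
    return total_bits * clock_rate_hz / elapsed_ticks


# Three 1250-byte packets spanning one second of RTP time.
print(bit_rate_bps([(1250, 0), (1250, 45_000), (1250, 90_000)]))
```

The modulo handles timestamp wrap-around; a production measurement would also need to decide how to treat reordered or lost packets, which the source does not discuss.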

The block noise amount calculation unit 101 calculates the block noise amount B from the input video signal (in the present embodiment it is applied to the luminance signal of the video signal, but a color difference signal may also be used). The block noise amount B is the one described in Reference 1 (Kawano, Yamagishi, Okamoto, "Proposal of NR type media layer model for IPTV service," IEICE Technical Report, CQ 109(373), pp. 137-142, Jan. 2010).

First, an edge extraction filter is applied to the input video signal. The filtered signal is called the edge video signal. In the present embodiment, the Canny Edge Detector (Reference 2: J. F. Canny, "A Computational Approach to Edge Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, 8, pp. 769-798, 1986) is used as the edge extraction filter. A Sobel filter or a Laplacian filter can also be used as the edge extraction filter, but the Canny Edge Detector extracts block noise more accurately.

Next, the coded block boundary region and the non-coded block boundary region are defined. The coded block boundary region is the set of pixels near the boundaries of the coded blocks, which are the processing units of the discrete cosine transform and quantization. In the present embodiment, as shown in FIG. 4, the coded block boundary region consists of the pixels adjacent to a coded block boundary line (the black lines in the figure), together with the pixels that contain a block boundary line and the pixels adjacent to them. The position of the coded block boundary lines depends on the type of codec. In addition, image processing such as resolution conversion may move the boundaries away from the block boundary positions defined by the codec. For example, when coded video with a resolution of 1440×1080 is resized to 1920×1080, an 8×8 block becomes a 10.66×8 block. The positions of the coded block boundary lines must either be set in advance according to the system or be estimated; the present embodiment describes the case where the block boundary positions are set in advance. The non-coded block boundary region is the set of all pixels outside the coded block boundary region.
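One way to build the coded block boundary region for preset boundary positions is sketched below. The function names `boundary_indices` and `boundary_mask` are hypothetical, and the rule used is only one reading of the description: a boundary line that falls between two pixels marks the pixels on both sides, while a line that falls inside a pixel (as after resizing) marks that pixel and its two neighbours. Exact fractions are used so that a scaled block width such as 8 × 1920 / 1440 is not affected by floating-point rounding.

```python
from fractions import Fraction
import math


def boundary_indices(size, block_size):
    """Return the pixel indices along one axis that belong to the boundary region."""
    block_size = Fraction(block_size)
    indices = set()
    k = 1
    while k * block_size < size:
        x = k * block_size  # position of the k-th boundary line
        if x.denominator == 1:
            # The line lies between pixels x-1 and x: mark both.
            indices.update((int(x) - 1, int(x)))
        else:
            # The line passes through pixel floor(x): mark it and its neighbours.
            c = math.floor(x)
            indices.update((c - 1, c, c + 1))
        k += 1
    return {i for i in indices if 0 <= i < size}


def boundary_mask(width, height, block_w, block_h):
    """Return mask[h][w], True where the pixel is in the coded block boundary region."""
    cols = boundary_indices(width, block_w)
    rows = boundary_indices(height, block_h)
    return [[(w in cols) or (h in rows) for w in range(width)] for h in range(height)]


# 8-pixel blocks stretched horizontally from 1440 to 1920 pixels wide.
scaled_width = Fraction(8 * 1920, 1440)
print(sorted(boundary_indices(24, scaled_width)))
```

Pixels where the mask is False form the non-coded block boundary region.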

Next, for each frame, a value B_1(f) is calculated by dividing the sum of the pixel values of the edge video signal located inside the coded block boundary region by the total number of pixels in that region. Here f denotes the frame number. Because the edges caused by block noise arise in the coded block boundary region, B_1(f) grows as block noise becomes stronger. However, edges of the original video signal also fall inside the coded block boundary region. Therefore, for each frame, a value B_2(f) is calculated by dividing the sum of the pixel values of the edge video signal located inside the non-coded block boundary region by the total number of pixels in that region, and B_1(f) is divided by B_2(f) to remove the influence of the edges of the original video signal. As in the following equation, the block noise amount B is the value of B_1(f)/B_2(f) averaged over all frames.

$$B = \frac{1}{N} \sum_{f=1}^{N} \frac{B_1(f)}{B_2(f)}$$

Here, N denotes the total number of frames.
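The averaging of B_1(f)/B_2(f) described above can be sketched as follows. The function name `block_noise_amount` and the input layout (edge frames and a boolean boundary mask as nested lists) are assumptions for illustration. The source does not say how a frame with B_2(f) = 0 is handled; this sketch simply lets the division raise an error.

```python
def block_noise_amount(edge_frames, mask):
    """Average over frames of B1(f) / B2(f).

    edge_frames: list of frames, each a 2-D list of edge-signal pixel values.
    mask: 2-D list of booleans, True inside the coded block boundary region.
    """
    ratios = []
    for frame in edge_frames:
        in_sum = in_count = out_sum = out_count = 0
        for row, mask_row in zip(frame, mask):
            for value, is_boundary in zip(row, mask_row):
                if is_boundary:
                    in_sum += value
                    in_count += 1
                else:
                    out_sum += value
                    out_count += 1
        b1 = in_sum / in_count    # mean edge value in the boundary region
        b2 = out_sum / out_count  # mean edge value outside the boundary region
        ratios.append(b1 / b2)    # raises ZeroDivisionError if b2 == 0
    return sum(ratios) / len(ratios)


mask = [[True, False], [True, False]]
frames = [[[4, 1], [4, 1]], [[2, 1], [2, 1]]]
print(block_noise_amount(frames, mask))
```

In the toy input the per-frame ratios are 4 and 2, so the average is 3.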

The motion amount calculation unit 102 calculates the motion amount M from the input video signal. The motion amount M is based on TI (temporal perceptual information) described in Reference 3 (ITU-T Recommendation P.910).

First, the difference m(f, w, h) between the luminance values at the same pixel position in adjacent frames is calculated as follows.

$$m(f, w, h) = y(f, w, h) - y(f-1, w, h)$$

Here, y(f, w, h) denotes the luminance value at pixel position (w, h) in the f-th frame of the video signal.

Next, as in the following equation, the standard deviation of m(f, w, h) is calculated for each pair of adjacent frames, and the value averaged over all pairs of adjacent frames is taken as the motion amount M.

$$M = \frac{1}{N-1} \sum_{f=2}^{N} \mathrm{std}_{\mathrm{space}}\bigl(m(f, w, h)\bigr)$$

In the above, std_space(m(f, w, h)) is a function that outputs the standard deviation of m(f, w, h) for each frame f, and N denotes the total number of frames.
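The motion amount calculation can be sketched as follows, assuming luminance frames stored as nested lists. The function name `motion_amount` is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say which form of standard deviation is intended.

```python
from statistics import pstdev


def motion_amount(frames):
    """Mean, over adjacent frame pairs, of the spatial std of luminance differences.

    frames: list of luminance frames, each a 2-D list indexed as frame[h][w].
    """
    if len(frames) < 2:
        raise ValueError("at least two frames are needed")
    stds = []
    for prev, cur in zip(frames, frames[1:]):
        # m(f, w, h) = y(f, w, h) - y(f-1, w, h) for every pixel position.
        diffs = [c - p
                 for prev_row, cur_row in zip(prev, cur)
                 for p, c in zip(prev_row, cur_row)]
        stds.append(pstdev(diffs))
    # Average over all N-1 adjacent pairs.
    return sum(stds) / len(stds)


frames = [
    [[0, 0], [0, 0]],
    [[0, 2], [0, 2]],
    [[0, 2], [0, 2]],
]
print(motion_amount(frames))
```

In this toy input the first pair has differences 0, 2, 0, 2 (standard deviation 1) and the second pair is static (standard deviation 0), so the motion amount is 0.5.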

The video quality deriving unit 20 comprises an average video quality calculation unit 200 and a video quality calculation unit 201. The video quality deriving unit 20 passes the input bit rate to the average video quality calculation unit 200, and passes the input block noise amount and motion amount to the video quality calculation unit 201.

The average video quality calculation unit 200 calculates the average video quality from the input bit rate Br. The average video quality of Patent Document 2 is used. FIG. 5 shows the relationship between the bit rate Br and the average video quality of a set of video contents at each bit rate (average video quality Vq_ave). As the bit rate Br increases, the average video quality Vq_ave also increases and converges to the upper limit of quality. As the bit rate Br decreases, the average video quality Vq_ave also decreases and converges to the lower limit of quality. This quality characteristic is expressed as a formula, and the average video quality is calculated by substituting the bit rate Br into it. In the present embodiment, the following formula is used.

Here, v_1, v_2, and v_3 are coefficients determined by the codec type, service type, video resolution, and so on. As shown in Table 1, they must be held in advance as a table of model coefficients for each combination of codec type (H.264, MPEG-2, etc.), service type (IPTV, videophone, etc.), and video resolution (1080i, 720p, etc.). These tables are created from the results of subjective quality evaluation experiments.
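The coefficient lookup and the saturating behaviour described for Vq_ave can be illustrated as follows. The formula of the embodiment is not reproduced in this text, so the logistic-type expression below is an assumption chosen only because it rises with Br and converges to a lower limit (1) and an upper limit (1 + v_1); it should not be read as the formula of Patent Document 2. The table contents, key layout, and function name are likewise hypothetical.

```python
# Hypothetical coefficient table keyed by (codec, service, resolution).
# The numeric values are placeholders, not results of any subjective experiment.
AVE_COEFFS = {
    ("H.264", "IPTV", "1080i"): (3.5, 4.0, 1.5),
}


def average_video_quality(br_mbps, codec, service, resolution):
    """Assumed saturating model: Vq_ave = 1 + v1 - v1 / (1 + (Br / v2) ** v3)."""
    v1, v2, v3 = AVE_COEFFS[(codec, service, resolution)]
    return 1.0 + v1 - v1 / (1.0 + (br_mbps / v2) ** v3)


for br in (0.0, 2.0, 4.0, 8.0, 64.0):
    print(br, average_video_quality(br, "H.264", "IPTV", "1080i"))
```

Keying the table on the full (codec, service, resolution) combination mirrors the statement that coefficients are held per combination; an unknown combination raises `KeyError` rather than silently falling back to other coefficients.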

The video quality calculation unit 201 calculates the video quality Vq from the input average video quality Vq_ave, block noise amount B, and motion amount M. As in the following equation, the video quality Vq is calculated as a linear sum of the average video quality Vq_ave, the block noise amount B, the motion amount M, and their pairwise products.

$$\mathit{Vq} = v_4\,\mathit{Vq}_{ave} + v_5 B + v_6 M + v_7\,\mathit{Vq}_{ave} B + v_8\,\mathit{Vq}_{ave} M + v_9 B M + v_{10}$$

Here, v_4, v_5, v_6, v_7, v_8, v_9, and v_10 are coefficients determined by the codec type, service type, video resolution, and so on. As shown in Table 1, they must be held in advance as a table of model coefficients for each combination of codec type (H.264, MPEG-2, etc.), service type (IPTV, videophone, etc.), and video resolution (1080i, 720p, etc.). These tables are created from the results of subjective quality evaluation experiments.
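The linear sum for Vq can be written directly in code. The sketch below assumes the seven coefficients are supplied as a tuple in the order v_4 to v_10; the function name and the coefficient values in the example are illustrative only.

```python
def video_quality(vq_ave, b, m, coeffs):
    """Vq = v4*Vq_ave + v5*B + v6*M + v7*Vq_ave*B + v8*Vq_ave*M + v9*B*M + v10."""
    v4, v5, v6, v7, v8, v9, v10 = coeffs
    return (v4 * vq_ave
            + v5 * b
            + v6 * m
            + v7 * vq_ave * b
            + v8 * vq_ave * m
            + v9 * b * m
            + v10)


# Illustrative check with all coefficients set to 1.
print(video_quality(2, 3, 4, (1, 1, 1, 1, 1, 1, 1)))
```

With coefficients (1, 0, 0, 0, 0, 0, 0) the function returns Vq_ave unchanged, which corresponds to a model without any content-dependent correction.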

The operations of the components of the video quality estimation apparatus shown in FIG. 3 can be implemented as a program, which can be installed and executed on a computer used as the video quality estimation apparatus or distributed via a network.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible within the scope of the claims.

DESCRIPTION OF SYMBOLS

10 Feature amount deriving unit
20 Video quality deriving unit
100 Bit rate calculation unit
101 Block noise amount calculation unit
102 Motion amount calculation unit
200 Average video quality calculation unit
201 Video quality calculation unit

Claims

A video quality estimation apparatus that estimates the video quality experienced by a user of a service that distributes video over a network, comprising:
feature amount deriving means for calculating feature amounts from the header information of packets used in video distribution and from a video signal obtained by decoding the video bitstream contained in the packets; and
video quality deriving means for calculating video quality from the feature amounts,
wherein the feature amount deriving means comprises:
bit rate calculating means for calculating, from the header information of the packets used in the video distribution, the number of bits per unit time of the packets or of the video bitstream (bit rate);
motion amount calculating means for calculating a motion amount from the video signal obtained by decoding the video bitstream; and
block noise amount calculating means for calculating a block noise amount from the video signal obtained by decoding the video bitstream,
and wherein the video quality deriving means comprises video quality estimating means for estimating video quality from the bit rate, the block noise amount, and the motion amount.

The video quality estimation apparatus according to claim 1, wherein the motion amount calculating means includes:
means for calculating, based on the video signal, a motion amount that quantifies the temporal change in the pixel values of the video content; and
means for calculating, based on the video signal, the standard deviation of the differences between pixel values at the same pixel position in two adjacent frames for each pair of two adjacent frames, and calculating as the motion amount the value obtained by averaging the standard deviation over all pairs of two adjacent frames.

The video quality estimation apparatus according to claim 1, wherein the block noise amount calculating means includes:
means for applying an edge extraction filter to the video signal to generate an edge video signal;
means for classifying the edge video signal, frame by frame, into a coded block boundary region, which is the set of pixels in the neighborhood of the boundary lines of the rectangular blocks that are the units to which the discrete cosine transform is applied in the video coding scheme, and a non-coded block boundary region, which is the set of pixels outside the coded block boundary region;
means for calculating, from the edge video signal, the edge amount of the coded block boundary region for each frame (coded block boundary region edge amount);
means for calculating, from the edge video signal, the edge amount of the non-coded block boundary region for each frame (non-coded block boundary region edge amount); and
means for dividing, for each frame, the coded block boundary region edge amount by the non-coded block boundary region edge amount, and calculating as the block noise amount the value obtained by averaging the quotients over all frames.

The video quality estimation apparatus according to claim 1, wherein the video quality estimating means includes:
average video quality calculating means for calculating an average video quality (the average of the video quality of a plurality of contents at each bit rate) from the bit rate; and
video quality calculating means for calculating video quality based on the average video quality calculated by the average video quality calculating means, the block noise amount calculated by the block noise amount calculating means, and the motion amount calculated by the motion amount calculating means.

The video quality estimation apparatus according to claim 4, wherein the average video quality calculating means comprises:
a model formula for calculating the average video quality from the bit rate; and
a table, prepared in advance, of the coefficients of the model formula corresponding to quality characteristics such as codec type, service type, and video resolution.

The video quality estimation apparatus according to claim 4, wherein the video quality calculating means comprises:
a model formula for calculating video quality from the average video quality, the motion amount, and the block noise amount; and
a table, prepared in advance, of the coefficients of the model formula corresponding to quality characteristics such as codec type, service type, and video resolution.

A video quality estimation method for estimating the video quality experienced by a user of a service that distributes video over a network, the method comprising:
a feature amount deriving step in which feature amount deriving means calculates feature amounts from the header information of packets used in video distribution and from a video signal obtained by decoding the video bitstream contained in the packets; and
a video quality deriving step in which video quality deriving means calculates video quality from the feature amounts,
wherein, in the feature amount deriving step, the feature amounts are calculated by:
calculating, from the header information of the packets used in the video distribution, the number of bits per unit time of the packets or of the video bitstream (bit rate);
calculating a motion amount from the video signal obtained by decoding the video bitstream; and
calculating a block noise amount from the video signal obtained by decoding the video bitstream,
and wherein, in the video quality deriving step, video quality is estimated from the bit rate, the block noise amount, and the motion amount.

The video quality estimation method according to claim 7, wherein, in the video quality deriving step:
an average video quality (the average of the video quality of a plurality of contents at each bit rate) is calculated from the bit rate; and
video quality is calculated based on the average video quality, the block noise amount, and the motion amount.

A program for causing a computer to function as each of the means constituting the video quality estimation apparatus according to any one of claims 1 to 6.