CN110930398A

CN110930398A - Log-Gabor similarity-based full-reference video quality evaluation method

Info

Publication number: CN110930398A
Application number: CN201911250288.7A
Authority: CN
Inventors: 汪斌; 陈淑聪; 姜飞龙; 朱海滨; 毛凌航; 徐翘楚; 张奥; 李兴隆
Original assignee: Jiaxing University
Current assignee: Jiaxing University
Priority date: 2019-12-09
Filing date: 2019-12-09
Publication date: 2020-03-27
Anticipated expiration: 2039-12-09
Also published as: CN110930398B

Abstract

The invention discloses a full-reference video quality evaluation method based on Log-Gabor similarity, which comprises the steps of firstly carrying out Log-Gabor transformation on U and V components of a video frame to obtain the amplitude and phase of a transformation coefficient, calculating the amplitude similarity and the phase similarity, and comprehensively obtaining the Log-Gabor similarity; then constructing stereo LBP characteristics for Y video components of three adjacent frames, extracting a stereo LBP characteristic histogram, and calculating stereo LBP characteristic similarity of a distorted video and a reference video; and finally, combining the log-Gabor similarity and the stereo LBP characteristic similarity to obtain the total similarity which is used as an objective video quality evaluation result. The invention fully considers the Y, U of the video and the transformation domain and space domain characteristics of the V component, and adopts the stereo LBP characteristics to extract the time domain characteristics, thereby improving the video quality evaluation precision.

Description

Log-Gabor similarity-based full-reference video quality evaluation method

Technical Field

The invention belongs to the field of video processing, and particularly relates to a full-reference video quality evaluation method based on Log-Gabor similarity.

Background

Video quality evaluation is a key problem in the field of video processing, and video quality evaluation methods can be divided into subjective video quality evaluation methods and objective video quality evaluation methods according to whether people participate in the video quality evaluation methods. The subjective video quality evaluation method is characterized in that people score images, the evaluation result is accurate, but the evaluation process is complex, the time consumption is long, and real-time application is difficult to achieve. The objective video quality evaluation method does not need human participation, automatically predicts the image quality through a specific computer algorithm, and can be divided into a full-reference video quality evaluation method, a half-reference video quality evaluation method and a no-reference video quality evaluation method according to whether an original distortion-free video is used as a reference. The full-reference video quality evaluation method predicts the video quality by using all information of the reference video, the semi-reference video quality evaluation method predicts the video quality by using partial information of the reference video, and the no-reference video quality evaluation method does not evaluate the video quality by using any information of the reference video. The traditional full-reference video quality evaluation method adopts MSE (mean square error) or PSNR (peak signal to noise ratio) to evaluate the video quality, has definite physical significance and simple algorithm, has the defects of mismatching with subjective visual characteristics of human eyes and the like, and cannot be widely applied in practical occasions. For Video Quality evaluation, many scholars propose improved methods, Zhang [ y.zhang, x. -b.gao, l.he, w.lu, r.he.objective Video Quality Assessment association Combining Transfer Learning with cnn.ieee Transactions on Neural Networks and Learning Systems (IEEE TNNLS),2019] adopt convolutional Neural network merging migration Learning to perform Video image Quality evaluation; lu [ w.lu, r.he, j.yang, c.jia, x. -b.gao.a spatiot Model of Video Quality Assessment via 3DGradient difference information Science, vol.478, pp.141-151,2019 ] proposes a Video Quality Assessment method based on three-dimensional gradient difference, and although these methods improve the accuracy of non-reference image Quality Assessment, the results thereof are still different from the subjective image Quality Assessment results of human eyes.

Disclosure of Invention

The invention aims to provide a full-reference video quality evaluation method based on Log-Gabor similarity aiming at the defects of the prior art.

The purpose of the invention is realized by the following technical scheme: a full reference video quality evaluation method based on Log-Gabor similarity comprises the following steps:

(1) respectively inputting a reference video and a distortion video of a training video set and a video set to be tested;

(2) respectively extracting a Y component, a U component and a V component of a frame of video in a reference video and a distorted video of a video set to be detected, and filtering by using Log-Gabor filters with different scales and different directions, wherein the method comprises the following substeps:

(2.1) constructing a frequency domain Log-Gabor filter G (omega) with S scales and O directions, wherein the frequency domain expression is as follows:

where ω is an angular frequency variable, ω₀Is the center frequency, σ, of the filter_ωIs the filter variance;

(2.2) filtering the extracted Y component, U component and V component by using the filter G (omega) constructed in the step (2.1) to obtain filter coefficients of the Y component, the U component and the V component, which are recorded as GY (s, o, m, n), GU (s, o, m, n) and GV (s, o, m, n); wherein s is a scale index of the filter coefficient, o is a direction index of the filter coefficient, m is a row index of the Y-component filter coefficient, and n is a column index of the Y-component filter coefficient; the filter coefficients GY (s, o, m, n), GU (s, o, m, n) and GV (s, o, m, n) are complex numbers;

(2.3) respectively calculating the amplitude and the phase of the filter coefficients of the Y component, the U component and the V component obtained in the step (2.2), wherein the calculation formula is as follows:

wherein, MY (s, o, m, n) is the amplitude of the Y component filter coefficient, AY (s, o, m, n) is the phase of the Y component filter coefficient, and R _ GY (s, o, m, n) and I _ GY (s, o, m, n) are the real part and imaginary part of the Y component filter coefficient GY (s, o, m, n); similarly, the amplitude MU (s, o, m, n), the phase AU (s, o, m, n) of the U component filter coefficient, the amplitude MV (s, o, m, n) of the V component filter coefficient, and the phase AV (s, o, m, n) can be obtained;

(3) respectively calculating the amplitude similarity and the phase similarity of the filter coefficients of the Y component, the U component and the V component in different scales and directions, wherein the calculation formula is as follows:

wherein S _ MY (S, o, m, n) is the amplitude similarity of the Y component, and S _ AY (S, o, m, n) is the phase similarity of the Y component; c₁And C₂A constant set to avoid the denominator being zero; MY _ D (S, o, m, n) and AY _ D (S, o, m, n) are amplitude and phase of Y component filter coefficients of the distorted video, and MY _ S (S, o, m, n) and AY _ S (S, o, m, n) are amplitude and phase of Y component filter coefficients of the reference video; similarly, the amplitude similarity S _ MU (S, o, m, n), the angle similarity S _ AU (S, o, m, n) of the U component, the amplitude similarity S _ MV (S, o, m, n), the angle similarity S _ AV (S, o, m, n), C of the V component can be obtained₁And C₂A constant set to avoid the denominator being 0;

(4) and calculating the total amplitude similarity and the total phase similarity of the filter coefficients of the Y component, the U component and the V component in different scales and directions, wherein the calculation formula is as follows:

wherein, S _ MY is the total amplitude similarity of the Y component filter coefficients, S _ AY is the total angle similarity of the Y component filter coefficients, and M, N is the total row number and the total column number of the certain scale and direction filter coefficients of the Y component; similarly, the total amplitude similarity S _ MU, the total angle similarity S _ AU and the total amplitude similarity S _ MV and the total angle similarity S _ AV of the U component filter coefficient and the V component filter coefficient can be obtained;

(5) calculating filter coefficient similarity S₁The calculation formula is as follows:

S_Y＝ω₁×S_MY+ω₂×S_AY

S_U＝ω₁×S_MU+ω₂×S_AU

S_V＝ω₁×S_MV+ω₂×S_AV

S₁＝ω₃×S_Y+ω₄×S_U+ω₅×S_V

wherein S is_YIs the total similarity of the Y components, S_UIs the total similarity of the U components, S_VIs the total similarity of the V components, ω₁、ω₂、ω₃、ω₄And ω₅Are all weighting coefficients;

(6) respectively applying cube local binary pattern operators to the training video set and the video set to be tested to obtain local binary pattern feature vectors:

(6.1) applying a stereo local binary pattern operator to the Y components of the reference video and the distorted video in the training video set to obtain a vector L₁And opposite vector L₁Obtaining E clustering centers by using a K nearest neighbor clustering method;

(6.2) applying a stereo local binary pattern operator to the Y component of the concentrated distorted video of the video to be detected to obtain a vector L₂And using K nearest neighbor clustering method to cluster vector L₂E clustering centers obtained in the step (6.1) are classified; according to the vectors L divided in the E clustering centers₂Constructing a local binary pattern histogram according to the number of the feature vectors P, and obtaining corresponding feature vectors P; similarly, applying a three-dimensional local binary pattern operator to the Y component of the reference video in the video set to be detected and constructing a local binary pattern histogram to obtain a feature vector Q;

(7) calculating stereo local binary pattern feature similarity S of reference video and distorted video in to-be-detected video set₂The calculation formula is as follows:

(8) according to the filter coefficient similarity S obtained in the step (5)₁And the stereo local binary pattern feature similarity S obtained in the step (7)₂Calculating the visual similarity theta of the reference video in the video set to be detected and the current frame of the distorted video according to the following formula;

θ＝ω₆S₁+ω₇S₂

wherein, ω is₆、ω₇Is a weighting coefficient;

(9) obtaining the visual similarity of other frames of the video set to be detected through the steps (2) to (8), and recording the visual similarity of the t-th frame as theta_t(ii) a And calculating a video image quality evaluation score Z, wherein the calculation formula is as follows:

and T is the total frame number of the reference video or the distorted video in the video set to be detected.

Further, the amplitude MY _ D (S, o, m, n), the phase AY _ D (S, o, m, n) of the Y component filter coefficient of the distorted video in the step (3), and the amplitude MY _ S (S, o, m, n) and the phase AY _ S (S, o, m, n) of the Y component filter coefficient of the reference video are calculated by the formula in the step (2.3).

Further, the applying a stereo local binary pattern operator in step (6) includes the following sub-steps:

(61) acquiring a Y component of the YUV video, and extracting a pixel point Y of the Y component at an (i, j) position in a t-th frame_t(i, j) and 8 pixel points of neighborhood thereof are respectively Y_t(i-1,j-1)、Y_t(i-1,j)、Y_t(i-1,j+1)、Y_t(i,j-1)、Y_t(i,j+1)、Y_t(i+1,j-1)、Y_t(i +1, j) and Y_t(i+1,j+1)；

(62) Extracting pixel point Y of (i, j) position of Y component in t-1 frame_t-1(i, j) and 8 pixel points of neighborhood thereof are respectively Y_t-1(i-1,j-1)、Y_t-1(i-1,j)、Y_t-1(i-1,j+1)、Y_t-1(i,j-1)、Y_t-1(i,j+1)、Y_t-1(i+1,j-1)、Y_t-1(i +1, j) and Y_t-1(i+1,j+1)；

(63) Extracting pixel point Y of (i, j) position of Y component in t +1 th frame_t+1(i, j) and 8 pixel points of neighborhood thereof are respectively Y_t+1(i-1,j-1)、Y_t+1(i-1,j)、Y_t+1(i-1,j+1)、Y_t+1(i,j-1)、Y_t+1(i,j+1)、Y_t+1(i+1,j-1)、Y_t+1(i +1, j) and Y_t+1(i+1,j+1)；

(64) With Y in the t-th frame_t(i, j) as the center pixel point, and Y is compared by the following formula_t(i, j) and the size of the 3 x 3 neighborhood pixels in the space domain and the time domain extracted in the steps (61) to (63) to obtain a comparison result s_k(a,b)：

Wherein, Y_k(a, b) is Y_t(i, j) the neighborhood pixel value, k belongs to { t-1, t, t +1}, a belongs to { i-1, i, i +1}, and b belongs to { j-1, j, j +1 };

(65) comparing the result s obtained in step (64)_k(a, b) according to s_t-1(i-1,j-1)、s_t-1(i-1,j)、s_t-1(i-1,j+1)、s_t-1(i,j-1)、s_t-1(i,j)、s_t-1(i,j+1)、s_t-1(i+1,j-1)、s_t-1(i+1,j)、s_t-1(i +1, j +1) and s_t(i-1,j-1)、s_t(i-1,j)、s_t(i-1,j+1)、s_t(i,j-1)、s_t(i,j+1)、s_t(i+1,j-1)、s_t(i+1,j)、s_t(i +1, j +1) and s_t+1(i-1,j-1)、s_t+1(i-1,j)、s_t+1(i-1,j+1)、s_t+1(i,j-1)、s_t+1(i,j)、s_t+1(i,j+1)、s_t+1(i+1,j-1)、s_t+1(i+1,j)、s_t+1The sequential arrangement of (i +1, j +1) constitutes a vector L.

Further, the vector L ∈ R in the step (65)^26×1。

Further, the feature vector P, Q ∈ R in the step (6.2)^E×1。

Further, the total frame number of the reference video and the distortion video is the same; the reference video and the distortion video are YUV videos.

The invention has the beneficial effects that: firstly, carrying out Log-Gabor transformation on U and V components of a video frame to obtain the amplitude and phase of a transformation coefficient, calculating the amplitude similarity and the phase similarity, and synthesizing to obtain Log-Gabor similarity; then, constructing stereo LBP (Local Binary Patterns) characteristics for Y video components of three adjacent frames, extracting a stereo LBP characteristic histogram, and calculating stereo LBP similarity of a distorted video and a reference video; and finally, obtaining the total similarity by combining the log-Gabor similarity and the stereo LBP similarity, and taking the total similarity as an objective video quality evaluation result. The method fully considers the Y, U of the video and the transformation domain and spatial domain characteristics of the V component, and adopts the stereo LBP characteristics to extract the time domain characteristics, thereby improving the video quality evaluation precision.

Drawings

Fig. 1 is a flow chart of a full-reference video quality evaluation method based on Log-Gabor similarity.

Detailed Description

The invention is described in detail below with reference to the accompanying drawings and examples. In specific implementation, a LIVE video database is used as an experiment database; the database contains 160 videos, the 160 videos are divided into 10 groups, each group contains 1 reference video and 15 distorted videos, and the 15 distorted videos in each group contain four types of wireless distortion, IP distortion, h.264 compression distortion and MPEG-2 compression distortion.

The specific steps adopted by the invention are shown in fig. 1, and comprise:

step (1): inputting a reference YUV video and a distorted YUV video, randomly selecting 20% of the reference YUV video and the distorted YUV video as a training video set, and selecting 80% of the reference YUV video and the distorted YUV video as a test video set;

step (2): respectively extracting a Y component, a U component and a V component of a frame of video in a YUV video sequence from a reference YUV video and a distorted YUV video of the test video set; filtering the extracted Y component, U component and V component by using Log-Gabor filters with different scales and different directions respectively, and comprising the following substeps:

step (2.1): constructing a frequency domain Log-Gabor filter with S scales and O directions, wherein the calculation formula is as follows:

where ω is an angular frequency variable, ω₀Is the center frequency, σ, of the filter_ωFor the variance of the filter, exp (-) is an exponential operation, ln (-) is a logarithm operation, G (omega) is a frequency domain expression of the Log-Gabor filter, S is 3, and O is 4;

step (2.2): filtering the Y component, the U component and the V component extracted in the step (1) by using the filter constructed in the step (2.1) to respectively obtain filter coefficients of the Y component, the U component and the V component, and recording the filter coefficients as GY (s, o, m, n), GU (s, o, m, n) and GV (s, o, m, n); wherein GY (s, o, m, n), GU (s, o, m, n) and GV (s, o, m, n) are complex numbers, s is a scale index of the filter coefficients, o is a direction index of the filter coefficients, m is a row index of the Y component filter coefficients, and n is a column index of the Y component filter coefficients;

step (2.3): calculating the amplitudes MY (s, o, m, n), MU (s, o, m, n), MV (s, o, m, n) and phases AY (s, o, m, n), AU (s, o, m, n), AV (s, o, m, n) of the filter coefficients GY (s, o, m, n), GU (s, o, m, n) and GV (s, o, m, n) of the Y component, U component and V component in the step (2.2), wherein the calculation formula is as follows:

wherein, R _ GY (s, o, m, n) and I _ GY (s, o, m, n) are the real part and the imaginary part of the Y component filter coefficient GY (s, o, m, n); r _ GU (s, o, m, n) and I _ GU (s, o, m, n) are the real part and the imaginary part of the U component filter coefficient GU (s, o, m, n); r _ GV (s, o, m, n), I _ GV (s, o, m, n) are the real part and imaginary part of the V component filter coefficient GV (s, o, m, n);

and (3): respectively calculating the amplitude similarity and the phase similarity of the filter coefficients of the Y component, the U component and the V component in different scales and directions, wherein the calculation formula is as follows:

wherein S _ MY (S, o, m, n) is the amplitude similarity of the Y component, and S _ AY (S, o, m, n) is the phase similarity of the Y component; c₁And C₂A constant set to avoid the denominator being zero; MY _ D (S, o, m, n) is the amplitude of the Y component filter coefficient of the distorted video, MY _ S (S, o, m, n) is the amplitude of the Y component filter coefficient of the reference video, AY _ D (S, o, m, n) is the phase of the Y component filter coefficient of the distorted video, and AY _ S (S, o, m, n) is the amplitude of the Y component filter coefficient of the reference video; similarly, the amplitude similarity S _ MU (S, o, m, n), the angle similarity S _ AU (S, o, m, n) of the U component, the amplitude similarity S _ MV (S, o, m, n), the angle similarity S _ AV (S, o, m, n), C of the V component can be obtained₁And C₂Constants set to avoid denominator being 0, C₁And C₂The value is 0.01;

and (4): and calculating the total amplitude similarity and the total phase similarity of the filter coefficients of the Y component, the U component and the V component in different scales and directions, wherein the calculation formula is as follows:

wherein S _ MY is the total amplitude similarity of the Y component filter coefficients, S _ AY is the total angle similarity of the Y component filter coefficients, M is a certain scale and the total row number of the direction filter coefficients of the Y component, N is a certain scale and the total column number of the direction filter coefficients of the Y component, S is the total scale number of the filter coefficients, and O is the total direction number of the filter coefficients; similarly, the total amplitude similarity S _ MU, the total angle similarity S _ AU and the total amplitude similarity S _ MV and the total angle similarity S _ AV of the U component filter coefficient and the V component filter coefficient can be obtained;

and (5): and calculating to obtain the total filter coefficient similarity S by combining the total amplitude similarity and the total phase similarity of the filter coefficients of the Y component, the U component and the V component in different scales and directions₁The calculation formula is as follows:

S_Y＝ω₁×S_MY+ω₂×S_AY

S_U＝ω₁×S_MU+ω₂×S_AU

S_V＝ω₁×S_MV+ω₂×S_AV

S₁＝ω₃×S_Y+ω₄×S_U+ω₅×S_V

wherein S is_YIs the total similarity of the Y components, S_UIs the total similarity of the U components, S_VIs the total similarity of the V components, ω₁、ω₂、ω₃、ω₄And ω₅For weighting coefficients, set by human beings, omega₁、ω₂Value of 0.5, omega₃Value of 0.75, omega₄And ω₅A value of 0.125;

and (6): applying stereo Local Binary Pattern (LBP) operators to YUV reference video and YUV distorted video components in the test video set and the training video set, respectively, comprising the following sub-steps:

step (6.1): extracting a pixel point Y at the (i, j) position of the Y video in the t frame_t(i, j) and obtaining 8 pixel points in the neighborhood of the pixel position of (i, j), wherein the pixel positions are Y_t(i-1,j-1)、Y_t(i-1,j)、Y_t(i-1,j+1)、Y_t(i,j-1)、Y_t(i,j+1)、Y_t(i+1,j-1)、Y_t(i +1, j) and Y_t(i +1, j +1), wherein t is the frame number of the current frame, t-1 is the frame number of the previous frame, and t +1 is the frame number of the next frame;

step (6.2): extracting pixel points Y of the Y video at the t-1 th frame and the (i, j) position_t-1(i, j) and obtaining 8 pixel points of t-1 th frame in the neighborhood of the (i, j) position, wherein the pixel points are Y respectively_t-1(i-1,j-1)、Y_t-1(i-1,j)、Y_t-1(i-1,j+1)、Y_t-1(i,j-1)、Y_t-1(i,j+1)、Y_t-1(i+1,j-1)、Y_t-1(i +1, j) and Y_t-1(i+1,j+1)；

Step (6.3): extracting pixel points Y of the Y video at the t +1 th frame and the (i, j) position_t+1(i, j) and obtaining 8 pixel points of t +1 th frame in the neighborhood of the (i, j) position, wherein the pixel points are Y_t+1(i-1,j-1)、Y_t+1(i-1,j)、Y_t+1(i-1,j+1)、Y_t+1(i,j-1)、Y_t+1(i,j+1)、Y_t+1(i+1,j-1)、Y_t+1(i +1, j) and Y_t+1(i+1,j+1)；

Step (6.4): with Y in the t-th frame_t(i, j) as the center pixel point, and comparing Y_t(i, j) the size of the 3 × 3 neighborhood pixels in the space domain and the time domain, if the neighborhood pixel value is greater than or equal to the central pixel value, the comparison result is marked as 1, otherwise, the comparison result is 0; the comparative formula is as follows:

wherein, Y_k(m, n) is Y_t(i, j) neighborhood pixel value, k belongs to { t-1, t, t +1}, m belongs to { i-1, i, i +1}, n belongs to { j-1, j, j +1}, s_k(m, n) is the comparison result of the (m, n) pixel positions;

step (6.5): will s_k(m, n) as s_t-1(i-1,j-1)、s_t-1(i-1,j)、s_t-1(i-1,j+1)、s_t-1(i,j-1)、s_t-1(i,j)、s_t-1(i,j+1)、s_t-1(i+1,j-1)、s_t-1(i+1,j)、s_t-1(i +1, j +1) and s_t(i-1,j-1)、s_t(i-1,j)、s_t(i-1,j+1)、s_t(i,j-1)、s_t(i,j+1)、s_t(i+1,j-1)、s_t(i+1,j)、s_t(i +1, j +1) and s_t+1(i-1,j-1)、s_t+1(i-1,j)、s_t+1(i-1,j+1)、s_t+1(i,j-1)、s_t+1(i,j)、s_t+1(i,j+1)、s_t+1(i+1,j-1)、s_t+1(i+1,j)、s_t+1The sequential arrangement of (i +1, j +1) constitutes a vector L, where L ∈ R^26×1；

Step (6.6): extracting a vector L from the reference video and the distorted video in the training video set according to the steps (6.1) - (6.5)₁Obtaining E clustering centers by adopting a K-nearest neighbor (KNN) clustering method, wherein the value of E is 32;

step (6.7): extracting a vector L from the distorted video in the test video set to each pixel point according to the steps (6.1) - (6.5)₂And applying KNN methodClassifying it into E clusters;

step (6.8): vector L classified into E cluster centers according to distorted videos in test video set₂Constructing a local binary pattern histogram by the number to obtain a corresponding local binary pattern feature vector P, wherein P belongs to R^E×1；

Step (6.9): processing the Y video component of the reference video in the test video set in the same way through the steps (6.1) - (6.8), applying a cube local binary pattern operator to obtain a local binary pattern feature vector Q, wherein Q belongs to R^E×1；

And (7): calculating stereo local binary pattern feature similarity S of reference video and distorted video in test video set₂The calculation formula is as follows:

wherein, C₃Constants set to avoid denominator being 0, C₃The value is 0.01, the operator is the inner product of the vector, | | | · | | is the modular operation of solving the vector;

and (8): combining the Log-Gabor filter coefficient similarity S obtained in the step (5)₁And (7) calculating the similarity S of the characteristics of the three-dimensional local binary pattern₂To obtain the visual similarity theta of the current frame_t；

θ_t＝ω₆S₁+ω₇S₂

Wherein, ω is₆、ω₇For weighting coefficients, set by human beings, omega₆And ω₇The values are 0.8 and 0.2 respectively;

and (9): and calculating to obtain a final video image quality evaluation score Z, wherein the calculation formula is as follows:

where T is the total number of frames of the reference video or the distorted video.

The Log-Gabor filter is more consistent with the nonlinear logarithmic characteristic of human eye vision, is consistent with the visual perception process of human eyes, and obtains the similarity of the filter coefficients in different scales and different directions by extracting the Log-Gabor filter coefficients in multiple scales and multiple directions and comparing the Log-Gabor filter coefficients of the reference video and the distorted video, and synthesizes the obtained total Log-Gabor similarity, wherein the similarity is more consistent with the visual characteristic of human eyes; meanwhile, the stereo LBP feature fuses texture features of a space domain and a time domain of the video, and the obtained stereo LBP feature similarity fully considers the texture distortion degree of the distorted video in an adjacent space domain and the time domain, so that the video image quality evaluation score obtained by fusing the Log-Gabor similarity and the stereo LBP feature similarity can fully reflect the distortion degree of the video. Wherein, the larger the video image quality evaluation score Z is, the better the video quality is.

Claims

1. A full reference video quality evaluation method based on Log-Gabor similarity is characterized by comprising the following steps:

(1) respectively inputting a training video set and a video set to be tested, wherein the input training video set and the video set to be tested both comprise a reference video and a distortion video.

where ω is an angular frequency variable, ω₀Is the center frequency, σ, of the filter_ωIs the filter variance.

(2.2) filtering the extracted Y component, U component and V component by using the filter G (omega) constructed in the step (2.1) to obtain filter coefficients of the Y component, the U component and the V component, which are recorded as GY (s, o, m, n), GU (s, o, m, n) and GV (s, o, m, n); wherein s is a scale index of the filter coefficient, o is a direction index of the filter coefficient, m is a row index of the Y-component filter coefficient, and n is a column index of the Y-component filter coefficient; the filter coefficients GY (s, o, m, n), GU (s, o, m, n) and GV (s, o, m, n) are complex numbers.

wherein S _ MY (S, o, m, n) is the amplitude similarity of the Y component, and S _ AY (S, o, m, n) is the phase similarity of the Y component; c₁And C₂A constant set to avoid the denominator being zero; MY _ D (S, o, m, n) and AY _ D (S, o, m, n) are amplitude and phase of Y component filter coefficients of the distorted video, and MY _ S (S, o, m, n) and AY _ S (S, o, m, n) are amplitude and phase of Y component filter coefficients of the reference video; similarly, the amplitude similarity S _ MU (S, o, m, n), the angle similarity S _ AU (S, o, m, n) of the U component, and the amplitude of the V component can be obtainedDegree similarity S _ MV (S, o, m, n), angle similarity S _ AV (S, o, m, n), C₁And C₂A constant set to avoid the denominator being 0;

S_Y＝ω₁×S_MY+ω₂×S_AY

S_U＝ω₁×S_MU+ω₂×S_AU

S_V＝ω₁×S_MV+ω₂×S_AV

S₁＝ω₃×S_Y+ω₄×S_U+ω₅×S_V

(6.1) for reference video and distorted video in training video setThe Y component obtains a vector L by applying a stereo local binary pattern operator₁And opposite vector L₁Obtaining E clustering centers by using a K nearest neighbor clustering method;

θ＝ω₆S₁+ω₇S₂

wherein, ω is₆、ω₇Is a weighting coefficient;

2. The Log-Gabor similarity-based full-reference video quality evaluation method according to claim 1, wherein the amplitude MY _ D (S, o, m, n), the phase AY _ D (S, o, m, n) of the Y component filter coefficient of the distorted video and the amplitude MY _ S (S, o, m, n), the phase AY _ S (S, o, m, n) of the Y component filter coefficient of the reference video in step (3) are calculated by the formula in step (2.3).

3. The method for evaluating quality of full-reference video based on Log-Gabor similarity according to claim 1, wherein the step (6) of applying a stereo local binary pattern operator comprises the following sub-steps:

4. The method for full-reference video quality evaluation based on Log-Gabor similarity according to claim 3, wherein the vector L e R in the step (65)^26×1。

5. The method for full-reference video quality evaluation based on Log-Gabor similarity according to claim 1, wherein the feature vector P, Qe R in the step (6.2)^E×1。

6. The method for evaluating the quality of the full reference video based on the Log-Gabor similarity according to claim 1, wherein the total number of frames of the reference video and the distorted video is the same; the reference video and the distortion video are YUV videos.