CN106780441B - Three-dimensional image quality objective measurement method based on dictionary learning and human eye visual characteristics - Google Patents


Publication number
CN106780441B
CN106780441B (application CN201611078458.4A)
Authority
CN
China
Prior art keywords
image
dictionary
images
scs
distorted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611078458.4A
Other languages
Chinese (zh)
Other versions
CN106780441A (en)
Inventor
张桦
沈方瑶
阳宁凯
王彧
吴以凡
戴国骏
Current Assignee
Hangzhou Electronic Science and Technology University
Original Assignee
Hangzhou Electronic Science and Technology University
Priority date
Filing date
Publication date
Application filed by Hangzhou Electronic Science and Technology University
Priority to CN201611078458.4A
Publication of CN106780441A
Application granted
Publication of CN106780441B
Status: Active
Anticipated expiration

Classifications

    • G: PHYSICS; G06: COMPUTING, CALCULATING OR COUNTING; G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis → G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 2207/10 Image acquisition modality → G06T 2207/10004 Still image; photographic image → G06T 2207/10012 Stereo images
    • G06T 2207/10 Image acquisition modality → G06T 2207/10016 Video; image sequence → G06T 2207/10021 Stereoscopic video; stereoscopic image sequence
    • G06T 2207/20 Special algorithmic details → G06T 2207/20021 Dividing image into blocks, subimages or windows
    • G06T 2207/20 Special algorithmic details → G06T 2207/20081 Training; learning
    • G06T 2207/30 Subject of image; context of image processing → G06T 2207/30168 Image quality inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an objective stereoscopic image quality measurement method based on dictionary learning and human visual characteristics. The method is divided into two parts: planar image quality measurement and stereoscopic perception quality measurement. Planar image quality measurement: first, dictionary learning is performed on all left and right reference images to obtain a dictionary for each; then the sparse coefficient matrices of all reference and distorted images are solved on the corresponding dictionaries and the similarity of the sparse coefficients is computed; finally, the qualities of the left and right distorted images are fused using the stereoscopic masking effect to obtain the planar image quality. Stereoscopic perception quality measurement: dictionary learning is performed on the disparity maps of the reference images, and the sparse coefficient matrices of the disparity maps of the reference and distorted images are solved on the corresponding dictionaries to obtain the stereoscopic perception quality. Finally, the planar and stereoscopic perception qualities are fused to obtain the stereoscopic image quality. The method takes human visual characteristics into account and measures stereoscopic image quality accurately.

Description

Three-dimensional image quality objective measurement method based on dictionary learning and human eye visual characteristics
Technical Field
The invention relates to a method for measuring the quality of stereoscopic images, and in particular to an objective stereoscopic image quality measurement method based on dictionary learning and human visual characteristics.
Background
With the rise of multimedia, technologies such as glasses-free 3D display, virtual reality and 3D navigation are gradually entering ordinary households, and people place ever higher demands on the quality of stereoscopic images; the objective measurement of stereoscopic image quality has therefore become a focus of current research. Before display, a stereoscopic image goes through four stages (acquisition, encoding, transmission and decoding), each of which introduces some degree of distortion and thus degrades the image quality. Starting from human visual characteristics, the present invention achieves higher accuracy in measuring stereoscopic image quality.
Many factors influence stereoscopic image quality, such as distortion type, depth-perception characteristics and visual comfort. With current technology these can be measured through machine learning, but the results of such methods are easily affected by the training content and training strategy, so quality can be measured reliably only by extracting the essential features of the image.
Disclosure of Invention
The technical problem the invention aims to solve is to provide an objective stereoscopic image quality measurement method based on dictionary learning and human visual characteristics that takes into account both the essential features of images and human visual characteristics, effectively improving the correlation between objective and subjective measurement values.
The technical scheme adopted by the invention to solve this problem is as follows: an objective stereoscopic image quality measurement method based on dictionary learning and human visual characteristics, comprising two parts, planar image quality measurement and stereoscopic perception quality measurement. The specific steps are:
(1) planar image quality measurement
1.1 training reference image dictionary:
Assume there are k pairs of reference images, i.e., k left images and k right images; dictionary training is performed on each reference image separately. Taking a left image of M × N pixels as an example, the m 8 × 8 overlapping blocks with the largest variance are selected as training samples to form a matrix Y_l = [y_1, y_2, …, y_m] ∈ R^{n×m}, where y_i ∈ R^{n×1} (i ∈ [1, m]) is the vector obtained by ordering the n pixels of the ith block column by column, and n = 8 × 8 = 64. From the training samples Y_l an overcomplete dictionary D_l = [d_1, d_2, …, d_p] ∈ R^{n×p} can be trained, where p > n. The relationship between Y_l and D_l can be expressed by the following formula:

min_x ‖x‖_0  s.t.  ‖Y_l − D_l x‖_2 ≤ ε

where ‖·‖_0 is the ℓ0 norm, which counts the nonzero elements of a matrix, and ‖·‖_2 is the ℓ2 norm. x = [x_1, x_2, …, x_m] ∈ R^{p×m} is the sparse coefficient matrix; the optimization goal is to minimize the number of nonzeros in x, i.e., to make it as sparse as possible, subject to ‖Y_l − D_l x‖_2 being smaller than a given reconstruction error ε. Dictionary learning and sparse coding are performed with the K-SVD and OMP algorithms, finally yielding the dictionary D_l of the left image. In the same way, the dictionary D_r of the corresponding right image can be obtained.
Dictionary learning is performed on the left and right images of the k reference pairs, yielding all left-image dictionaries D_L = [D_l1, D_l2, …, D_lk], where D_li is the dictionary of the ith left image, and all right-image dictionaries D_R = [D_r1, D_r2, …, D_rk], where D_rj is the dictionary of the jth right image.
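The training step above can be sketched in Python. This is a minimal illustration, not the patent's implementation: scikit-learn's `MiniBatchDictionaryLearning` stands in for K-SVD (which scikit-learn does not provide), the image is random, and m and p are scaled down from the embodiment's values.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
# Stand-in for an M x N reference image (the patent trains on real images).
img = rng.random((64, 64))

def extract_patches(img, patch=8, step=4):
    """Overlapping 8x8 blocks, each flattened column by column to a 64-vector."""
    patches = []
    for r in range(0, img.shape[0] - patch + 1, step):
        for c in range(0, img.shape[1] - patch + 1, step):
            block = img[r:r + patch, c:c + patch]
            patches.append(block.flatten(order="F"))  # column-by-column ordering
    return np.array(patches)

Y = extract_patches(img)
# Keep the m highest-variance blocks as training samples (m is small here).
m = 100
Y = Y[np.argsort(Y.var(axis=1))[-m:]]

# The patent trains with K-SVD; MiniBatchDictionaryLearning is a stand-in trainer,
# with OMP used for the sparse coding step as in the text.
p = 128  # overcomplete: p > n = 64
dl = MiniBatchDictionaryLearning(n_components=p, transform_algorithm="omp",
                                 transform_n_nonzero_coefs=5, random_state=0)
dl.fit(Y)
D = dl.components_.T   # D in R^{64 x p}, columns are dictionary atoms
X = dl.transform(Y).T  # sparse coefficient matrix, p x m
print(D.shape, X.shape)
```

Each column of X then has at most 5 nonzero entries, the sparsity budget given to OMP.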
1.2 sparse coefficient similarity solution:
The sparse coefficients of all reference and distorted images are solved on the corresponding dictionaries, and the similarity between the sparse coefficients of each reference image and those of its distorted version is compared. Take a reference left image of M × N pixels and the corresponding jp2k-distorted left image as an example. First, the whole reference left image is divided into non-overlapping 8 × 8 pixel blocks, giving q small blocks. The pixels of each block are ordered column by column into a 64 × 1 column vector y_ref,i, where i denotes the ith block of the image. The q blocks form the matrix Y_ref = [y_ref,1, y_ref,2, …, y_ref,q]. From the relationship between the sample matrix and the dictionary, the sparse representation of the reference image on its dictionary can be obtained as

X_ref = D_i^+ Y_ref

where D_i^+ is the generalized inverse matrix of dictionary D_i and X_ref = [x_ref,1, x_ref,2, …, x_ref,q] ∈ R^{p×q}. In the same way, the sparse representation of the jp2k-distorted left image on the dictionary, X_dis = D_i^+ Y_dis, can be solved, where Y_dis is the matrix obtained by dividing the distorted left image into non-overlapping 8 × 8 blocks and X_dis = [x_dis,1, x_dis,2, …, x_dis,q] ∈ R^{p×q}. The sparse coefficient similarity is then solved between corresponding vectors of X_ref and X_dis.
The Sparse Coefficient Similarity (SCS) is computed from the angular difference and the amplitude difference of the two vectors. The angular difference is:

P_i = (⟨x_ref,i, x_dis,i⟩ + c) / (‖x_ref,i‖_2 ‖x_dis,i‖_2 + c)

where x_ref,i and x_dis,i are the sparse coefficient vectors of the reference image and the distorted image respectively, ⟨·,·⟩ denotes the inner product of two vectors, and c is a constant. The amplitude difference is computed as:

V_i = (2 ‖x_ref,i‖_2 ‖x_dis,i‖_2 + c) / (‖x_ref,i‖_2² + ‖x_dis,i‖_2² + c)

The sparse coefficient similarity between a given 8 × 8 block of the reference image and the corresponding block of the distorted image is therefore:

SCS_i = P_i · V_i

Finally, the sparse coefficient similarity between the reference left image and the corresponding jp2k-distorted left image is the SCS mean over the q blocks:

SCS = (1/q) Σ_{i=1}^{q} SCS_i
In the same way, the sparse coefficient similarity of every reference image and its corresponding distorted image can be obtained. The numbers of left and right distorted images are equal; denote it t. The sparse coefficient similarities of all left images are SCS_L = [SCS_l1, SCS_l2, …, SCS_lt], where SCS_li is the SCS mean of the ith distorted left image, and those of all right images are SCS_R = [SCS_r1, SCS_r2, …, SCS_rt], where SCS_rj is the SCS mean of the jth distorted right image.
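The SCS of one block pair can be sketched as follows. The exact forms of P and V are reconstructed here, since the patent's formula images are not reproduced in this text; the constant c prevents a zero denominator.

```python
import numpy as np

def scs(x_ref, x_dis, c=0.02):
    """Sparse Coefficient Similarity of two sparse coefficient vectors:
    angular term P times amplitude term V (reconstructed formulas)."""
    nr, nd = np.linalg.norm(x_ref), np.linalg.norm(x_dis)
    P = (np.dot(x_ref, x_dis) + c) / (nr * nd + c)  # angular difference
    V = (2 * nr * nd + c) / (nr**2 + nd**2 + c)     # amplitude difference
    return P * V

x = np.array([0.0, 1.2, 0.0, -0.4])
assert abs(scs(x, x) - 1.0) < 1e-9  # identical vectors give similarity 1
print(scs(x, 0.5 * x))              # same direction, different amplitude: below 1
```

The image-level score is then simply the mean of `scs` over all q block pairs.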
1.3 left and right image fusion to obtain planar image quality measurement
According to research in visual physiology and visual psychology, human eyes exhibit a stereoscopic masking effect: the better of the left and right views masks and suppresses the worse one. Moreover, for different distortion types the two eyes contribute differently to stereoscopic image quality. Taking both factors into account, the left/right fusion weights under different distortion types were studied, finally giving the planar image quality measurement of an image pair by the formula:

Q_l_r = w·SCS_l + (1 − w)·SCS_r

where w is the contribution of the left eye to the quality measurement and 1 − w is that of the right eye.
(2) Stereo perception quality measurement
2.1 training dictionary for reference image disparity map
Because of the interocular distance, the images received by the left and right eyes differ by a parallax, which is what produces the sense of depth; the stereoscopic perception quality is therefore measured using features of the absolute difference image of the left and right viewpoints. Horizontal parallax is computed for each of the k reference pairs, and then a dictionary is learned for each. Taking a pair of M × N reference images as an example, the disparity space image (DSI) is computed as:

DSI(x, y; d) = ‖I_L(x, y) − I_R(x − d, y)‖_2²

where d is the disparity between the left and right images, d ∈ [0, d_max), and d_max is the maximum disparity of the pair.

Using the dictionary learning method of 1.1, the m 8 × 8 overlapping blocks with the largest variance in the disparity map are selected to form the training matrix Y_ref = [y_ref,1, y_ref,2, …, y_ref,m] ∈ R^{n×m}, where n = 64. Training on these samples gives the overcomplete dictionary D_ref = [d_ref,1, d_ref,2, …, d_ref,p] ∈ R^{n×p}, where p > n.

In the same way, dictionaries are obtained for the absolute disparity maps of all reference images.
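A sketch of the DSI computation under the stated definition. The zero padding at the border and the synthetic horizontal shift are illustrative assumptions; the patent does not state its border handling.

```python
import numpy as np

def dsi_volume(IL, IR, d_max):
    """Disparity space image DSI(x, y; d) = ||IL(x,y) - IR(x-d,y)||^2
    for d in [0, d_max). Out-of-range samples are treated as zero."""
    H, W = IL.shape
    dsi = np.zeros((d_max, H, W))
    for d in range(d_max):
        shifted = np.zeros_like(IR)
        shifted[:, d:] = IR[:, :W - d] if d else IR  # IR(x - d, y)
        dsi[d] = (IL - shifted) ** 2
    return dsi

rng = np.random.default_rng(1)
IL = rng.random((32, 48))
IR = np.roll(IL, -3, axis=1)  # synthetic pair: IL(x) = IR(x - 3), disparity 3
disp = dsi_volume(IL, IR, d_max=8).argmin(axis=0)  # per-pixel best disparity
print(np.bincount(disp[:, 8:].ravel()).argmax())   # border columns excluded
```

Away from the border, the minimizing d recovers the 3-pixel shift exactly, since the cost there is zero.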
2.2 solving the sparse coefficient similarity of the reference image disparity map and the distorted image disparity map
Taking a pair of distorted left and right images as an example, first the absolute disparity map of the pair is computed. Using the method of 1.2, the difference map is divided into non-overlapping 8 × 8 small blocks, and encoding all blocks yields the sample matrix of the distorted disparity map, whose sparse representation on the dictionary trained from the corresponding reference disparity map is X_dis = [x_dis,1, x_dis,2, …, x_dis,m] ∈ R^{p×m}, where x_dis,i (1 ≤ i ≤ m) is a p × 1 vector, the sparse representation on the dictionary of one 8 × 8 block of the distorted image's absolute difference map.

In the same way, the sparse representation on the corresponding dictionary of the absolute difference map of the reference pair is obtained: X_ref = [x_ref,1, x_ref,2, …, x_ref,m] ∈ R^{p×m}, where x_ref,i (1 ≤ i ≤ m) is a p × 1 vector, the sparse representation of one 8 × 8 block of the reference image's absolute difference map.

Following the definition of sparse coefficient similarity in 1.2, the angular difference and amplitude difference of the two vectors yield the sparse coefficient similarity SCS between the distorted and reference absolute difference maps.

In the same way, the sparse coefficient similarity SCS_i (1 ≤ i ≤ t) between the absolute difference map of every distorted pair and that of its reference pair can be obtained. This value indicates the stereoscopic perception quality: the larger the value, the better the stereoscopic perception quality.
(3) Objective measurement of stereo image quality
The invention fuses the planar image quality and the stereoscopic perception quality to obtain an objective stereoscopic image quality measurement, combining the planar quality measurement and the depth-perception quality measurement multiplicatively to obtain the final objective value, by the formula:

Q = Q_l_r × (SCS)^λ

where Q_l_r is the planar image quality measurement, SCS is the stereoscopic perception quality measurement, and λ is a constant.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention trains a dictionary for every reference image and every reference disparity map, with a large training-sample capacity; the resulting dictionaries are more comprehensive, so the sparse representations of the images are more accurate.
(2) The method applies dictionary learning and sparse representation to both planar image quality measurement and disparity-map quality measurement, extracting the essential features of the images by a comparatively simple method and avoiding the tedious process of extracting and fusing many separate features; the final results show a strong correlation between the objective and subjective quality measurements of distorted stereoscopic images.
Drawings
Fig. 1 is a block diagram of the overall implementation of the method of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawing.
The invention provides an objective stereoscopic image quality measurement method based on dictionary learning and human visual characteristics, implemented according to the overall block diagram shown in Figure 1.
In the planar image quality measurement stage, each left and right reference image is first sampled with overlapping blocks and a dictionary is trained for it with the K-SVD algorithm; the reference and distorted images are then sampled without overlap, and their sparse coefficient matrices are solved on the corresponding dictionaries, which amounts to extracting the essential features of the images. Once the sparse coefficient matrices of a reference image and its distorted version are obtained, the sparse coefficient similarity of the two matrices is solved, which amounts to obtaining the degree of distortion of each distorted image. Finally, the sparse coefficient similarities of the left and right images are fused to obtain the planar image quality measurement.
In the stereoscopic perception quality measurement stage, the disparity maps of the left and right reference pairs are first solved, and a dictionary is trained for each disparity map from overlapping samples, as in 2.1. Each reference and distorted disparity map is then sampled without overlap, and the sparse coefficient matrices are solved on the corresponding dictionaries. Finally, the similarity of the two sparse coefficient matrices gives the objective value of the stereoscopic perception quality. The two measurement results are combined multiplicatively to obtain the final objective stereoscopic image quality value. The specific steps are as follows:
(1) planar image quality measurement
1.1 training reference image dictionary:
Assume there are k pairs of reference images, i.e., k left images and k right images; dictionary training is performed on each reference image separately. Taking a left image of M × N pixels as an example, the m 8 × 8 overlapping blocks with the largest variance are selected as training samples to form a matrix Y_l = [y_1, y_2, …, y_m] ∈ R^{n×m}, where y_i ∈ R^{n×1} (i ∈ [1, m]) is the vector obtained by ordering the n pixels of the ith block column by column, and n = 8 × 8 = 64. From the training samples Y_l an overcomplete dictionary D_l = [d_1, d_2, …, d_p] ∈ R^{n×p} can be trained, where p > n. The relationship between Y_l and D_l can be expressed by the following formula:

min_x ‖x‖_0  s.t.  ‖Y_l − D_l x‖_2 ≤ ε

where ‖·‖_0 is the ℓ0 norm, which counts the nonzero elements of a matrix, and ‖·‖_2 is the ℓ2 norm. x = [x_1, x_2, …, x_m] ∈ R^{p×m} is the sparse coefficient matrix; the optimization goal is to minimize the number of nonzeros in x, i.e., to make it as sparse as possible, subject to ‖Y_l − D_l x‖_2 being smaller than a given reconstruction error ε. Dictionary learning and sparse coding are performed with the K-SVD and OMP algorithms, finally yielding the dictionary D_l of the left image. In the same way, the dictionary D_r of the corresponding right image can be obtained.
Dictionary learning is performed on the left and right images of the k reference pairs, yielding all left-image dictionaries D_L = [D_l1, D_l2, …, D_lk], where D_li is the dictionary of the ith left image, and all right-image dictionaries D_R = [D_r1, D_r2, …, D_rk], where D_rj is the dictionary of the jth right image.
In this embodiment, images from the phase 1 database of the LIVE laboratory are used; it contains 20 pairs of left and right reference images, so k = 20. The images are 640 × 360 pixels, so M = 640 and N = 360. m is the sample capacity for dictionary training: the larger m, the more accurate the trained dictionary but the higher the computational complexity, so m = 10000 in this embodiment.
1.2 sparse coefficient similarity solution:
The sparse coefficients of all reference and distorted images are solved on the corresponding dictionaries, and the similarity between the sparse coefficients of each reference image and those of its distorted version is compared. Take a reference left image of M × N pixels and the corresponding jp2k-distorted left image as an example. First, the whole reference left image is divided into non-overlapping 8 × 8 pixel blocks, giving q small blocks. The pixels of each block are ordered column by column into a 64 × 1 column vector y_ref,i, where i denotes the ith block of the image. The q blocks form the matrix Y_ref = [y_ref,1, y_ref,2, …, y_ref,q]. From the relationship between the sample matrix and the dictionary, the sparse representation of the reference image on its dictionary can be obtained as

X_ref = D_i^+ Y_ref

where D_i^+ is the generalized inverse matrix of dictionary D_i and X_ref = [x_ref,1, x_ref,2, …, x_ref,q] ∈ R^{p×q}. In the same way, the sparse representation of the jp2k-distorted left image on the dictionary, X_dis = D_i^+ Y_dis, can be solved, where Y_dis is the matrix obtained by dividing the distorted left image into non-overlapping 8 × 8 blocks and X_dis = [x_dis,1, x_dis,2, …, x_dis,q] ∈ R^{p×q}. The sparse coefficient similarity is then solved between corresponding vectors of X_ref and X_dis.
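The closed-form step X = D^+ Y can be sketched with NumPy's Moore-Penrose pseudoinverse. All arrays here are random stand-ins; note that, taken literally, the generalized-inverse solution is the minimum-norm least-squares fit and is dense rather than strictly sparse, a property of this formulation itself.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 64, 128
D = rng.standard_normal((n, p))  # trained dictionary (stand-in), n x p

img = rng.random((24, 24))
# Non-overlapping 8x8 blocks, each flattened column by column.
blocks = [img[r:r + 8, c:c + 8].flatten(order="F")
          for r in range(0, 24, 8) for c in range(0, 24, 8)]
Y_ref = np.array(blocks).T       # 64 x q sample matrix (q = 9 here)

# X_ref = D^+ Y_ref, with D^+ the Moore-Penrose generalized inverse.
X_ref = np.linalg.pinv(D) @ Y_ref
print(X_ref.shape)               # (p, q)
# Since D has full row rank, D X_ref reproduces Y_ref exactly.
assert np.allclose(D @ X_ref, Y_ref, atol=1e-6)
```

The same call applied to the distorted image's block matrix Y_dis gives X_dis.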
The Sparse Coefficient Similarity (SCS) is computed from the angular difference and the amplitude difference of the two vectors. The angular difference is:

P_i = (⟨x_ref,i, x_dis,i⟩ + c) / (‖x_ref,i‖_2 ‖x_dis,i‖_2 + c)

where x_ref,i and x_dis,i are the sparse coefficient vectors of the reference image and the distorted image respectively, ⟨·,·⟩ denotes the inner product of two vectors, and c is a constant. The amplitude difference is computed as:

V_i = (2 ‖x_ref,i‖_2 ‖x_dis,i‖_2 + c) / (‖x_ref,i‖_2² + ‖x_dis,i‖_2² + c)

The sparse coefficient similarity between a given 8 × 8 block of the reference image and the corresponding block of the distorted image is therefore:

SCS_i = P_i · V_i

Finally, the sparse coefficient similarity between the reference left image and the corresponding jp2k-distorted left image is the SCS mean over the q blocks:

SCS = (1/q) Σ_{i=1}^{q} SCS_i
In the same way, the sparse coefficient similarity of every reference image and its corresponding distorted image can be obtained. The numbers of left and right distorted images are equal; denote it t. The sparse coefficient similarities of all left images are SCS_L = [SCS_l1, SCS_l2, …, SCS_lt], where SCS_li is the SCS mean of the ith distorted left image, and those of all right images are SCS_R = [SCS_r1, SCS_r2, …, SCS_rt], where SCS_rj is the SCS mean of the jth distorted right image.
In this embodiment there are five distortion types in total, jp2k, jpeg, wn, blur and ff, and the image size is 640 × 360. Taking a jp2k-distorted left image as an example, sampling the distorted image with non-overlapping 8 × 8 blocks gives q = 3200, so that the samples cover the whole image and a sparse coefficient matrix that fully reflects the image's characteristics can be obtained on the corresponding dictionary. When solving the sparse coefficient similarity, the constant c = 0.02 is chosen to guarantee that the denominator is nonzero.
1.3 left and right image fusion to obtain planar image quality measurement
According to research in visual physiology and visual psychology, human eyes exhibit a stereoscopic masking effect: the better of the left and right views masks and suppresses the worse one. Moreover, for different distortion types the two eyes contribute differently to stereoscopic image quality. Taking both factors into account, the left/right fusion weights under different distortion types were studied, finally giving the planar image quality measurement of an image pair by the formula:

Q_l_r = w·SCS_l + (1 − w)·SCS_r

where w is the contribution of the left eye to the quality measurement and 1 − w is that of the right eye.
According to experiments, the left-eye weights w under different distortion types in this embodiment are shown in Table 1:

| distortion type | jp2k | jpeg | wn   | blur | ff   |
|-----------------|------|------|------|------|------|
| w               | 0.50 | 0.75 | 0.55 | 0.90 | 0.65 |
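The fusion of 1.3 with the embodiment's Table 1 weights can be sketched directly; the SCS inputs below are illustrative values, not measured ones.

```python
# Left-eye weight w per distortion type, from Table 1 of the embodiment.
W = {"jp2k": 0.50, "jpeg": 0.75, "wn": 0.55, "blur": 0.90, "ff": 0.65}

def fuse(scs_left, scs_right, distortion):
    """Planar image quality Q_l_r = w*SCS_l + (1 - w)*SCS_r."""
    w = W[distortion]
    return w * scs_left + (1 - w) * scs_right

q = fuse(0.8, 0.6, "jpeg")  # jpeg weights the left view at 0.75
print(q)
```

For jp2k the weights are symmetric (w = 0.50), while for blur the left view dominates (w = 0.90), reflecting the distortion-dependent eye contributions the text describes.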
(2) Stereo perception quality measurement
2.1 training dictionary for reference image disparity map
Because of the interocular distance, the images received by the left and right eyes differ by a parallax, which is what produces the sense of depth; the stereoscopic perception quality is therefore measured using features of the absolute difference image of the left and right viewpoints. Horizontal parallax is computed for each of the k reference pairs, and then a dictionary is learned for each. Taking a pair of M × N reference images as an example, the disparity space image (DSI) is computed as:

DSI(x, y; d) = ‖I_L(x, y) − I_R(x − d, y)‖_2²

where d is the disparity between the left and right images, d ∈ [0, d_max), and d_max is the maximum disparity of the pair.

Using the dictionary learning method of 1.1, the m 8 × 8 overlapping blocks with the largest variance in the disparity map are selected to form the training matrix Y_ref = [y_ref,1, y_ref,2, …, y_ref,m] ∈ R^{n×m}, where n = 64. Training on these samples gives the overcomplete dictionary D_ref = [d_ref,1, d_ref,2, …, d_ref,p] ∈ R^{n×p}, where p > n.

In the same way, dictionaries are obtained for the absolute disparity maps of all reference images.
In this embodiment, k = 20 and m = 10000, the same parameter values as in 1.1.
2.2 solving the sparse coefficient similarity of the reference image disparity map and the distorted image disparity map
Taking a pair of distorted left and right images as an example, first the absolute disparity map of the pair is computed. Using the method of 1.2, the difference map is divided into non-overlapping 8 × 8 small blocks, and encoding all blocks yields the sample matrix of the distorted disparity map, whose sparse representation on the dictionary trained from the corresponding reference disparity map is X_dis = [x_dis,1, x_dis,2, …, x_dis,m] ∈ R^{p×m}, where x_dis,i (1 ≤ i ≤ m) is a p × 1 vector, the sparse representation on the dictionary of one 8 × 8 block of the distorted image's absolute difference map.

In the same way, the sparse representation on the corresponding dictionary of the absolute difference map of the reference pair is obtained: X_ref = [x_ref,1, x_ref,2, …, x_ref,m] ∈ R^{p×m}, where x_ref,i (1 ≤ i ≤ m) is a p × 1 vector, the sparse representation of one 8 × 8 block of the reference image's absolute difference map.

Following the definition of sparse coefficient similarity in 1.2, the angular difference and amplitude difference of the two vectors yield the sparse coefficient similarity SCS between the distorted and reference absolute difference maps.

In the same way, the sparse coefficient similarity SCS_i (1 ≤ i ≤ t) between the absolute difference map of every distorted pair and that of its reference pair can be obtained. This value indicates the stereoscopic perception quality: the larger the value, the better the stereoscopic perception quality.
(3) Objective measurement of stereo image quality
The invention fuses the planar image quality and the stereoscopic perception quality to obtain an objective stereoscopic image quality measurement, combining the planar quality measurement and the depth-perception quality measurement multiplicatively to obtain the final objective value, by the formula:

Q = Q_l_r × (SCS)^λ

where Q_l_r is the planar image quality measurement, SCS is the stereoscopic perception quality measurement, and λ is a constant.
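The multiplicative combination is a one-liner; the value lam = 1.0 below is illustrative only, since the patent treats λ as an empirically fixed constant whose value is not given here.

```python
def stereo_quality(q_lr, scs, lam=1.0):
    """Final objective score Q = Q_l_r * SCS^lambda.
    lam is a placeholder; the patent fixes lambda empirically."""
    return q_lr * scs ** lam

print(stereo_quality(0.9, 0.8))
```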
The invention adopts two commonly used objective quality indices to evaluate the performance of the model: the Pearson linear correlation coefficient (PLCC) and the Spearman rank-order correlation coefficient (SROCC), computed after nonlinear regression. Higher PLCC and SROCC values indicate better correlation between objective and subjective measurements. Table 2 below shows the two evaluation values obtained by the method of the invention:
| distortion type | jp2k  | jpeg  | wn    | blur  | ff    | mean  |
|-----------------|-------|-------|-------|-------|-------|-------|
| PLCC            | 0.912 | 0.923 | 0.965 | 0.968 | 0.834 | 0.920 |
| SROCC           | 0.891 | 0.850 | 0.923 | 0.912 | 0.752 | 0.866 |
As the table shows, both the PLCC and SROCC values of the method are relatively high, so the method is well suited to measuring the quality of stereoscopic images.
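PLCC and SROCC can be computed with scipy.stats. The scores below are hypothetical illustration data, not the patent's results, and the nonlinear-regression step that precedes PLCC in the evaluation protocol is omitted here.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Hypothetical objective scores vs. subjective scores (DMOS-like values).
objective  = np.array([0.91, 0.85, 0.78, 0.60, 0.42, 0.30])
subjective = np.array([88.0, 83.0, 75.0, 61.0, 45.0, 33.0])

plcc, _ = pearsonr(objective, subjective)    # linear correlation
srocc, _ = spearmanr(objective, subjective)  # rank-order correlation
print(round(plcc, 3), round(srocc, 3))
```

Because the illustration data are strictly monotone, SROCC comes out as exactly 1; PLCC is slightly below 1, reflecting the small deviations from linearity.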

Claims (1)

1. A three-dimensional image quality objective measurement method based on dictionary learning and human eye visual characteristics is characterized by comprising the following steps:
step 1, measuring the quality of a plane image, specifically:
1.1 training reference image dictionary:
Suppose there are k pairs of reference images, i.e. k left images and k right images; dictionary training is performed on the reference images separately. Taking a left image of M×N pixels as an example, m overlapping 8×8 blocks of the image are selected as training samples to form the matrix Y_l = [y_1, y_2, …, y_m] ∈ R^{n×m}, where y_i ∈ R^{n×1} is the vector of the n pixels of the i-th block ordered column by column, n = 8×8 = 64, and i ∈ [1, m]. Using the training samples Y_l, an overcomplete dictionary D_l = [d_1, d_2, …, d_p] ∈ R^{n×p} can be obtained by training, where p > n. The relationship between Y_l and D_l is expressed by the following formula:
min_x ||x||_0  subject to  ||Y_l − D_l·x||_2 ≤ ε
where ||·||_0 is the l_0 norm, which counts the number of nonzero elements in a matrix, and ||·||_2 is the l_2 norm; x = [x_1, x_2, …, x_m] ∈ R^{p×m} is the sparse coefficient matrix, and the optimization objective is to minimize the number of nonzero entries in x, i.e. to make it as sparse as possible, on the condition that ||Y_l − D_l·x||_2 is less than a given reconstruction error ε. Dictionary learning and sparse coding are performed with the K-SVD algorithm and the OMP algorithm, finally yielding the dictionary D_l of the left image; in the same way, the dictionary D_r of the corresponding right image can be obtained.
Dictionary learning is carried out separately on the left and right images of the k reference pairs to obtain all left-image dictionaries D_L = [D_l1, D_l2, …, D_lk], where D_li is the dictionary of the i-th left image, and all right-image dictionaries D_R = [D_r1, D_r2, …, D_rk], where D_rj is the dictionary of the j-th right image;
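A minimal sketch of this dictionary-training step in Python, under stated assumptions: K-SVD itself is not available in scikit-learn, so `MiniBatchDictionaryLearning` with OMP sparse coding stands in for the K-SVD + OMP pipeline; the image, the patch count m = 500, and p = 128 atoms are illustrative values, not the patent's:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.default_rng(0)
left_image = rng.random((64, 64))  # placeholder for an M×N left view

# m overlapping 8×8 blocks, each flattened to an n = 64 vector (step 1.1).
patches = extract_patches_2d(left_image, (8, 8), max_patches=500, random_state=0)
Y_l = patches.reshape(patches.shape[0], -1)  # shape (m, n)

# Overcomplete dictionary with p = 128 > n = 64 atoms; OMP sparse coding.
learner = MiniBatchDictionaryLearning(
    n_components=128,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=5,
    random_state=0,
)
X = learner.fit_transform(Y_l)  # sparse coefficients, shape (m, p)
D_l = learner.components_       # dictionary atoms as rows, shape (p, n)
print(D_l.shape, X.shape)
```

Note that scikit-learn stores atoms as rows (p×n), whereas the claim writes the dictionary as n×p with atoms as columns; the two are transposes of each other.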
1.2 sparse coefficient similarity solution:
Solve the sparse coefficients of all reference images and distorted images on the corresponding dictionaries, and compare the similarity of each reference image's sparse coefficients with those of its distorted image. Taking a reference left image of M×N pixels and the corresponding jp2k-distorted left image as an example: first, the whole reference left image is partitioned into non-overlapping 8×8 pixel blocks, yielding q small blocks; the pixels of each block are ordered column by column into a 64×1 column vector y_i^{ref}, where i denotes the i-th block of the image; the q blocks form the matrix Y_ref = [y_1^{ref}, y_2^{ref}, …, y_q^{ref}] ∈ R^{64×q}. According to the relationship between the sample matrix and the dictionary, the sparse representation of the reference image on the corresponding dictionary is obtained as
X_ref = D_i^+ · Y_ref
where D_i^+ is the generalized inverse matrix of dictionary D_i and X_ref ∈ R^{p×q}. In the same way, the sparse representation of the jp2k-distorted left image on the dictionary is solved as
X_dis = D_i^+ · Y_dis
where Y_dis is the matrix obtained by dividing the distorted left image into non-overlapping 8×8 blocks. The sparse coefficient similarity is then solved for the corresponding column vectors of X_ref and X_dis;
The sparse coefficient similarity is calculated from the angle difference and the amplitude difference of the two vectors. The angle difference is as follows:
P_i = (⟨x_i^{ref}, x_i^{dis}⟩ + c) / (||x_i^{ref}||_2 · ||x_i^{dis}||_2 + c)
where x_i^{ref} and x_i^{dis} are the sparse coefficient vectors of the reference image and the distorted image respectively, ⟨·,·⟩ denotes the inner product of the two vectors, and c is a constant.
The amplitude difference is calculated as:
V_i = (2·||x_i^{ref}||_2·||x_i^{dis}||_2 + c) / (||x_i^{ref}||_2² + ||x_i^{dis}||_2² + c)
Thus the sparse coefficient similarity between a given 8×8 block of the reference image and the corresponding block of the distorted image is obtained as:
SCS_i = P_i · V_i
Finally, the sparse coefficient similarity between the reference left image and the corresponding jp2k-distorted left image is the mean SCS over the q blocks:
SCS = (1/q) · Σ_{i=1}^{q} SCS_i
In the same way, the sparse coefficient similarity of every reference image and its corresponding distorted image can be obtained. The numbers of distorted left and right images are equal; if that number is t, the sparse coefficient similarities of all left images are SCS_L = [SCS_l1, SCS_l2, …, SCS_lt], where SCS_li is the SCS mean obtained for the i-th distorted left image, and those of all right images are SCS_R = [SCS_r1, SCS_r2, …, SCS_rt], where SCS_rj is the SCS mean obtained for the j-th distorted right image;
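The SCS computation of step 1.2 can be sketched as follows, as a hedged reading of the claim: coefficients come from the dictionary's generalized inverse, each block contributes an angle term P_i and an amplitude term V_i, and the value of the stabilizing constant c is an assumption:

```python
import numpy as np

def scs(D, Y_ref, Y_dis, c=1e-6):
    """Mean sparse coefficient similarity between two block matrices."""
    D_pinv = np.linalg.pinv(D)   # generalized inverse of the n×p dictionary
    X_ref = D_pinv @ Y_ref       # p×q sparse representations
    X_dis = D_pinv @ Y_dis
    num = np.sum(X_ref * X_dis, axis=0)           # per-block inner products
    nr = np.linalg.norm(X_ref, axis=0)
    nd = np.linalg.norm(X_dis, axis=0)
    P = (num + c) / (nr * nd + c)                 # angle (cosine) similarity
    V = (2 * nr * nd + c) / (nr**2 + nd**2 + c)   # amplitude similarity
    return float(np.mean(P * V))                  # SCS = mean over q blocks

# Identical inputs give SCS = 1 by construction.
D = np.random.default_rng(1).random((64, 128))
Y = np.random.default_rng(2).random((64, 10))
print(round(scs(D, Y, Y), 6))  # -> 1.0
```

For a distorted block matrix the similarity drops below 1, with P penalizing direction changes of the coefficient vector and V penalizing energy changes.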
1.3, fusing left and right images to obtain plane image quality measurement:
Q_{l_r} = w·SCS_L + (1 − w)·SCS_R
where w represents the contribution of the left eye to the quality measurement, and 1 − w the contribution of the right eye;
step 2, three-dimensional perception quality measurement, specifically:
2.1 training dictionary for reference image disparity map
The stereo perception quality is measured using features of the absolute difference map of the left and right viewpoints. Horizontal disparity is computed for each of the k reference pairs, and a dictionary is then learned for each. Taking a pair of M×N reference images as an example, the disparity space image DSI is calculated as:
DSI(x, y; d) = ||I_L(x, y) − I_R(x − d, y)||²
where d is the disparity between the left and right images, d ∈ [0, d_max), and d_max is the maximum disparity of the left and right images;
Using the dictionary learning method of 1.1, m overlapping 8×8 blocks with large disparity-map variance are selected to form the training matrix Y_ref = [y_ref1, y_ref2, …, y_refm] ∈ R^{n×m}, where n = 64; training with these samples yields the overcomplete dictionary D_ref = [d_ref1, d_ref2, …, d_refp] ∈ R^{n×p}, where p > n;
Obtaining a dictionary of absolute disparity maps of all reference images in the same way;
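The DSI formula of step 2.1 can be sketched directly in NumPy. How pixels with x − d < 0 are handled is not specified in the claim; zero-padding the border is an assumption made here:

```python
import numpy as np

def disparity_space_image(I_L, I_R, d_max):
    """DSI(x, y; d) = ||I_L(x, y) - I_R(x - d, y)||^2 for d in [0, d_max)."""
    H, W = I_L.shape
    dsi = np.zeros((H, W, d_max))
    for d in range(d_max):
        shifted = np.zeros_like(I_R)
        shifted[:, d:] = I_R[:, :W - d] if d > 0 else I_R  # I_R(x - d, y)
        dsi[:, :, d] = (I_L - shifted) ** 2
    return dsi

# For identical views the d = 0 slice is exactly zero.
img = np.arange(16.0).reshape(4, 4)
dsi = disparity_space_image(img, img, d_max=3)
print(dsi.shape, dsi[:, :, 0].max())  # (4, 4, 3) 0.0
```

The absolute difference map used for dictionary training then corresponds to comparing the views at the chosen disparity for each pixel.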
2.2 solving the sparse coefficient similarity of the reference image disparity map and the distorted image disparity map
Taking one pair of distorted left and right images as an example: first calculate the absolute disparity map of the pair of distorted images, divide the difference map into non-overlapping 8×8 small blocks using the method of 1.2, and code all blocks on the dictionary trained from the corresponding reference-image disparity map to obtain the distorted-image sparse coefficient matrix X_dis = [x_dis1, x_dis2, …, x_dism] ∈ R^{p×m}, where x_disi is a p×1 vector representing the sparse representation of one 8×8 block of the distorted-image absolute difference map on the dictionary;
In the same way, the sparse representation of the absolute difference map of the corresponding reference image on the dictionary is obtained: X_ref = [x_ref1, x_ref2, …, x_refm] ∈ R^{p×m}, where x_refi is a p×1 vector representing the sparse representation of one 8×8 block of the reference-image absolute difference map on the dictionary;
According to the definition of the sparse coefficient similarity in 1.2, the angle difference and the amplitude difference of the two vectors are calculated to obtain the sparse coefficient similarity SCS between the distorted-image absolute difference map and the reference-image absolute difference map;
In the same way, the sparse coefficient similarity SCS_i between the absolute difference map of every distorted pair and that of the corresponding reference pair can be obtained;
Step 3, objective measurement of stereo image quality
The planar image quality and the stereo perception quality are fused to obtain the objective stereo image quality measurement; the final value is obtained by combining the planar image quality measurement and the depth perception quality measurement multiplicatively, according to the formula:
Q = Q_{l_r} × (SCS)^λ
where Q_{l_r} represents the planar image quality measurement, SCS represents the sparse coefficient similarity value, and λ is a constant.
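The end-to-end fusion of steps 1.3 and 3 reduces to two lines of arithmetic. The weights below (w, lam) are illustrative assumptions, not the constants fitted in the patent:

```python
def fuse_quality(scs_left, scs_right, scs_depth, w=0.5, lam=1.0):
    """Q = Q_{l_r} × (SCS)^λ with Q_{l_r} = w·SCS_L + (1 − w)·SCS_R."""
    q_lr = w * scs_left + (1 - w) * scs_right  # step 1.3: binocular combination
    return q_lr * scs_depth ** lam             # step 3: multiplicative fusion

q = fuse_quality(0.90, 0.80, 0.95)
print(round(q, 4))  # 0.5·0.90 + 0.5·0.80 = 0.85, times 0.95 -> 0.8075
```

Since all three inputs lie in [0, 1] (for non-negative coefficients), Q stays in [0, 1] as well, with 1 meaning the distorted pair is indistinguishable from the reference.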
CN201611078458.4A 2016-11-30 2016-11-30 Three-dimensional image quality objective measurement method based on dictionary learning and human eye visual characteristics Active CN106780441B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611078458.4A CN106780441B (en) 2016-11-30 2016-11-30 Three-dimensional image quality objective measurement method based on dictionary learning and human eye visual characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611078458.4A CN106780441B (en) 2016-11-30 2016-11-30 Three-dimensional image quality objective measurement method based on dictionary learning and human eye visual characteristics

Publications (2)

Publication Number Publication Date
CN106780441A CN106780441A (en) 2017-05-31
CN106780441B true CN106780441B (en) 2020-01-10

Family

ID=58898882

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611078458.4A Active CN106780441B (en) 2016-11-30 2016-11-30 Three-dimensional image quality objective measurement method based on dictionary learning and human eye visual characteristics

Country Status (1)

Country Link
CN (1) CN106780441B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109257591A (en) * 2017-07-12 2019-01-22 天津大学 Based on rarefaction representation without reference stereoscopic video quality method for objectively evaluating
CN107481221B (en) * 2017-07-19 2020-11-20 天津大学 Full-reference mixed distortion image quality evaluation method based on texture and cartoon sparse representation
CN108596889B (en) * 2018-04-20 2020-12-25 中国科学技术大学 Quality evaluation method and device for stereo image
CN109191428B (en) * 2018-07-26 2021-08-06 西安理工大学 Masking texture feature-based full-reference image quality evaluation method
CN109523513B (en) * 2018-10-18 2023-08-25 天津大学 Stereoscopic image quality evaluation method based on sparse reconstruction color fusion image
CN109887023B (en) * 2019-01-11 2020-12-29 杭州电子科技大学 Binocular fusion stereo image quality evaluation method based on weighted gradient amplitude
CN112233089B (en) * 2020-10-14 2022-10-25 西安交通大学 No-reference stereo mixed distortion image quality evaluation method

Citations (2)

Publication number Priority date Publication date Assignee Title
CN104408716A (en) * 2014-11-24 2015-03-11 宁波大学 Three-dimensional image quality objective evaluation method based on visual fidelity
CN105488792A (en) * 2015-11-26 2016-04-13 浙江科技学院 No-reference stereo image quality evaluation method based on dictionary learning and machine learning


Non-Patent Citations (2)

Title
No-reference stereoscopic image quality assessment based on joint binocular features; Li Kemeng et al.; Journal of Optoelectronics·Laser; Nov. 30, 2015; Vol. 26, No. 11; pp. 42-47 *
Research on the relationship model between color saturation and stereoscopic image quality; Wang Jing et al.; Journal of Hangzhou Dianzi University (Natural Science Edition); Mar. 31, 2016; Vol. 36, No. 2; pp. 2224-2230 *

Also Published As

Publication number Publication date
CN106780441A (en) 2017-05-31

Similar Documents

Publication Publication Date Title
CN106780441B (en) Three-dimensional image quality objective measurement method based on dictionary learning and human eye visual characteristics
CN109308719B (en) Binocular parallax estimation method based on three-dimensional convolution
US8963998B2 (en) Full reference system for predicting subjective quality of three-dimensional video
CN104811691B (en) A kind of stereoscopic video quality method for objectively evaluating based on wavelet transformation
CN107578403A (en) The stereo image quality evaluation method of binocular view fusion is instructed based on gradient information
CN109523506B (en) Full-reference stereo image quality objective evaluation method based on visual salient image feature enhancement
CN109255358B (en) 3D image quality evaluation method based on visual saliency and depth map
CN109523513B (en) Stereoscopic image quality evaluation method based on sparse reconstruction color fusion image
Shen et al. No-reference stereoscopic image quality assessment based on global and local content characteristics
CN107635136B (en) View-based access control model perception and binocular competition are without reference stereo image quality evaluation method
CN103426173B (en) Objective evaluation method for stereo image quality
CN103780895B (en) A kind of three-dimensional video quality evaluation method
CN103152600A (en) Three-dimensional video quality evaluation method
Geng et al. A stereoscopic image quality assessment model based on independent component analysis and binocular fusion property
CN104408716A (en) Three-dimensional image quality objective evaluation method based on visual fidelity
CN104394403A (en) A compression-distortion-oriented stereoscopic video quality objective evaluating method
CN105654142A (en) Natural scene statistics-based non-reference stereo image quality evaluation method
CN107071423A (en) Application process of the vision multi-channel model in stereoscopic video quality objective evaluation
Han et al. Stereoscopic video quality assessment model based on spatial-temporal structural information
CN114648482A (en) Quality evaluation method and system for three-dimensional panoramic image
CN109345552A (en) Stereo image quality evaluation method based on region weight
CN105898279B (en) A kind of objective evaluation method for quality of stereo images
Shao et al. Multistage pooling for blind quality prediction of asymmetric multiply-distorted stereoscopic images
CN108848365A (en) A kind of reorientation stereo image quality evaluation method
CN104243974A (en) Stereoscopic video quality objective evaluation method based on three-dimensional discrete cosine transformation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant