CN103226818B - Single-frame image super-resolution reconstruction method based on manifold regularized sparse support regression - Google Patents


Info

Publication number
CN103226818B
CN103226818B (application number CN201310147510.7A)
Authority
CN
China
Prior art keywords
resolution image
low
image block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310147510.7A
Other languages
Chinese (zh)
Other versions
CN103226818A (en)
Inventor
胡瑞敏
江俊君
董小慧
韩镇
陈军
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201310147510.7A priority Critical patent/CN103226818B/en
Publication of CN103226818A publication Critical patent/CN103226818A/en
Application granted granted Critical
Publication of CN103226818B publication Critical patent/CN103226818B/en


Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)

Abstract

A single-frame image super-resolution reconstruction method based on manifold regularized sparse support regression. High- and low-resolution image block sets are built as the high- and low-resolution image block dictionaries. The input low-resolution image is divided into image blocks; each block is sparsely coded over the low-resolution dictionary to obtain its support set. The neighbor relationships within the high-resolution block support set are computed and preserved in the reconstructed high-resolution block space, and the mapping from the low-resolution block space to the high-resolution block space is learned. This mapping yields the high-resolution block corresponding to every input low-resolution block, and the blocks are fused into the high-resolution image. The invention proposes a manifold regularized sparse support regression model that adaptively selects the support set of the sparse representation and exploits the manifold structure of the high-resolution blocks in the support set to constrain the high-resolution block reconstruction, thereby producing a higher-quality high-resolution image.

Description

Single-frame image super-resolution reconstruction method based on manifold regularized sparse support regression
Technical Field
The invention relates to the field of image super-resolution, and in particular to a single-frame image super-resolution reconstruction method based on manifold regularized sparse support regression.
Background
With the development of computer networks and handheld photographic devices, images and video are used ever more widely in daily life. However, because of limited network bandwidth and server storage, most of the images we obtain are of low resolution and quality, far from meeting users' needs. Image super-resolution is a technique that uses image processing algorithms to recover detail in low-resolution images, providing high-resolution images with richer detail without requiring more demanding hardware.
According to the number of input low-resolution images, super-resolution techniques fall into two broad categories: reconstruction-based techniques using multiple frames, and learning-based techniques using a single frame. The present invention focuses on super-resolution based on single-frame image learning, which offers wider applicability and flexibility than multi-frame approaches. In document 1 (H. Chang, D.-Y. Yeung, and Y. Xiong, "Super-resolution through neighbor embedding," in Proc. IEEE CVPR '04, Washington, 2004, pp. 275-282), Chang et al. propose a locally linear embedding technique based on the manifold assumption: the manifold spaces formed by high- and low-resolution image blocks are assumed to share similar local geometric structures, so a high-resolution block can be obtained as a linear combination of the K nearest high-resolution blocks in the training set. More recently, Yang et al. applied sparse coding to image super-resolution in document 2 (J. Yang, J. Wright, T. Huang, and Y. Ma, "Image super-resolution as sparse representation of raw image patches," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 1-8, 2008) and document 3 (J. Yang, J. Wright, T. Huang, and Y. Ma, "Image super-resolution via sparse representation," IEEE Trans. Image Process., vol. 19, no. 11, pp. 2861-2873, 2010). Their approach forces corresponding high- and low-resolution blocks to share the same sparse representation: under a sparsity prior, each low-resolution block is coded over an over-complete dictionary to obtain sparse representation coefficients, which are then linearly combined with the corresponding high-resolution blocks to complete the super-resolution reconstruction.
However, the underlying assumption that high- and low-resolution image blocks share the same sparse representation is difficult to satisfy in practice. In document 4 (Y. Tang, P. Yan, Y. Yuan, and X. Li, "Single-image super-resolution via local learning," Int. J. Mach. Learn. & Cybern., vol. 2, pp. 15-23, 2011), Tang et al. propose local learning regression (LLR), which selects the K nearest sample points from the high/low-resolution training set to learn a mapping; it does not, however, consider the geometric structure of the image block manifold, even though that structure is crucial for representing and analyzing images.
Disclosure of Invention
The invention aims to provide a single-frame image super-resolution reconstruction method based on manifold regularized sparse support regression.
In order to achieve this aim, the technical scheme adopted by the invention is a single-frame image super-resolution reconstruction method based on manifold regularized sparse support regression, comprising the following steps:
step 1, constructing a high-resolution image block training set and a corresponding low-resolution image block training set, wherein the high-resolution image block training set is composed of a plurality of high-resolution image blocks, and the low-resolution image block training set is composed of a plurality of corresponding low-resolution image blocks; dividing an input low-resolution image into a plurality of low-resolution image blocks which are overlapped with each other, wherein the size of each low-resolution image block is the same as that of a low-resolution image block in a low-resolution image block training set;
step 2, for each low-resolution image block in the input low-resolution image, calculating a sparse coding coefficient and a support set which are subjected to sparse reconstruction by taking a low-resolution image block training set as a low-resolution image block dictionary to obtain a high-resolution image block support set and a low-resolution image block support set corresponding to the support set;
step 3, for each low-resolution image block in the input low-resolution image, constructing a neighborhood similarity matrix W within the high-resolution image block support set and obtaining a manifold constraint term;
step 4, for each low-resolution image block in the input low-resolution image, using the similarity matrix W obtained in step 3 to constrain the reconstruction of the mapping matrix P between the low-resolution image block support set and the corresponding high-resolution image block support set;
step 5, for each low-resolution image block in the input low-resolution image, reconstructing the corresponding high-resolution image block according to the mapping matrix P obtained in step 4; after the high-resolution image blocks corresponding to all the low-resolution image blocks have been obtained, integrating them into the high-resolution image and outputting it.
Furthermore, the training set of high-resolution image blocks obtained in step 1 is recorded as $Y = \{y_i\}_{i=1}^{N}$ and the corresponding low-resolution image block training set as $X = \{x_i\}_{i=1}^{N}$, where $y_i$ denotes the i-th high-resolution image block in the high-resolution training set and $x_i$ the i-th low-resolution image block in the low-resolution training set; both training sets contain N blocks;
in step 2, any low-resolution image block $x_t$ of the input low-resolution image is sparsely coded over the low-resolution training set X, the sparse coding coefficients being obtained from

$$\hat{\theta} = \arg\min_{\theta} \|x_t - X\theta\|_2 + \lambda_1 \|\theta\|_1$$

where $\lambda_1$ is a parameter balancing coding error against sparsity, $\theta$ is a coding coefficient vector of length N, and $\arg\min$ returns the value $\hat{\theta}$ of $\theta$ at which the objective attains its minimum.

The support set S is defined as

$$S = \mathrm{support}(\hat{\theta})$$

where $\hat{\theta}$ denotes the sparse coding coefficients and $\mathrm{support}(\hat{\theta})$ is the set of indices of their non-zero elements; the high- and low-resolution image block sets corresponding to S are recorded as the high-resolution image block support set $Y_S = \{y_i \mid i \in S\}$ and the low-resolution image block support set $X_S = \{x_i \mid i \in S\}$;
in step 3, each high-resolution image block $y_i$ in the support set $Y_S$ is treated as a vertex of the adjacency graph G; the edge connecting any two vertices $y_i$ and $y_j$ carries weight $w_{ij}$, with $i, j = 1, 2, \ldots, K$, $i \neq j$, where K is the number of image blocks in the high-resolution image block support set.

The similarity matrix W of the high-resolution image block support set is established column by column according to

$$\hat{W}_{\cdot i} = \arg\min_{W_{\cdot i}} \|y_i - Y_S W_{\cdot i}\|_2 + \lambda_2 \|W_{\cdot i}\|_1, \quad \text{s.t.}\ W_{ii} = 0$$

where $W_{\cdot i}$ is the i-th column of the similarity matrix W, $W_{ii}$ are its diagonal elements, $\arg\min$ returns the minimizing value $\hat{W}_{\cdot i}$, and $\lambda_2$ balances the coding error of $y_i$ against the sparsity of $W_{\cdot i}$.

The manifold constraint term is constructed as

$$\sum_{i \in S} \|P x_i - P X_S W_{\cdot i}\|_2^2 = \|P X_S - P X_S W\|_F^2 = \|P X_S (I - W)\|_F^2$$

where I is the identity matrix;
in step 4, the mapping matrix P is obtained by minimizing

$$O_{MSSR} = \|P X_S - Y_S\|_F^2 + \alpha \|P\|_F^2 + \beta \|P X_S (I - W)\|_F^2$$

where $\alpha$ and $\beta$ are regularization coefficients; differentiating the objective $O_{MSSR}$ with respect to P and applying standard matrix identities gives the mapping matrix

$$P = Y_S X_S^T \left( X_S X_S^T + \alpha I + \beta X_S G X_S^T \right)^{-1}$$

with $G = (I - W)(I - W)^T$;
in step 5, for any low-resolution image block $x_t$, the corresponding high-resolution image block is computed from

$$y_t = P x_t$$

where $y_t$ denotes the high-resolution image block corresponding to the low-resolution image block $x_t$.
On the basis of the sparse representation of the low-resolution image block, the method suitably relaxes the shared-sparse-representation constraint, so that local sample information can be used more flexibly and the support set of the representation coefficients is found adaptively; this avoids the over- or under-fitting caused by fixing the support set size in comparable algorithms. At the same time, the geometric manifold structure among the image blocks in the support set is preserved and the high-resolution blocks are reconstructed by regression, so the geometric structure of the sample images is retained, the coding result is more accurate and of higher quality, and the method represents different types of images with greater power and flexibility.
Drawings
Fig. 1 is a flowchart of a single-frame image super-resolution reconstruction method based on manifold regularized sparse support regression according to an embodiment of the present invention.
Detailed Description
The technical scheme of the invention can be implemented in software as an automated pipeline. The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments. Referring to fig. 1, the specific process of the embodiment comprises the following steps in order:
step 1, constructing a high-resolution image block training set and a corresponding low-resolution image block training set, wherein the high-resolution image block training set is composed of a plurality of high-resolution image blocks, and the low-resolution image block training set is composed of a plurality of corresponding low-resolution image blocks. In specific implementation, a set of pre-generated corresponding high-resolution and low-resolution image block sets may be given as a high-resolution image block training set and a low-resolution image block training set, respectively.
The embodiment records the input high-resolution image block training set as $Y = \{y_i\}_{i=1}^{N}$ and the low-resolution image block training set as $X = \{x_i\}_{i=1}^{N}$, where $y_i$ denotes the i-th high-resolution image block and $x_i$ the i-th low-resolution image block; both sets contain N blocks. Each high- or low-resolution image block is stored as a column vector, of dimension D and d respectively, where D is the number of pixels in a high-resolution block and d the number of pixels in a low-resolution block. The training sets Y and X can be regarded as the high- and low-resolution image block dictionaries respectively. The input low-resolution image $X_t$ to be reconstructed is divided into low-resolution blocks of the same size as those in the low-resolution training set. Let $X_t$ be divided into M mutually overlapping low-resolution blocks, any one of which is denoted $x_t$; the corresponding target high-resolution image $Y_t$ (i.e. the output high-resolution image) is likewise divided into overlapping high-resolution blocks, with the block corresponding to $x_t$ denoted $y_t$, of the same size as the blocks in the high-resolution training set. Overlapping division is well within ordinary skill in the art, and the overlap in pixels can be chosen by the skilled person in a specific implementation.
In this embodiment, all high-resolution image blocks are 9 × 9 pixels and all low-resolution image blocks are 3 × 3 pixels; each low-resolution sample image is obtained by smoothing the corresponding high-resolution sample image and down-sampling it by a factor of three. A total of 5000 pairs of high- and low-resolution image blocks are used, i.e. N = 5000.
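As an illustrative sketch (not part of the patent text), the construction of paired 9 × 9 / 3 × 3 training blocks by smoothing and 3× down-sampling might look as follows; the box-average smoothing and the random patch sampling are assumptions, since the patent does not fix the smoothing filter or the sampling scheme:

```python
import numpy as np

def make_patch_pairs(hr_img, scale=3, hr_psize=9, n_pairs=5000, seed=None):
    """Build corresponding HR/LR patch dictionaries from one HR image.
    The LR image is a smoothed, scale-x down-sampled version of the HR image
    (box averaging over scale x scale cells, an assumed choice of filter)."""
    rng = np.random.default_rng(seed)
    lr_psize = hr_psize // scale                         # 9 -> 3
    h = (hr_img.shape[0] // scale) * scale
    w = (hr_img.shape[1] // scale) * scale
    hr = hr_img[:h, :w].astype(float)
    lr = hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    Y, X = [], []                                        # HR dict, LR dict
    for _ in range(n_pairs):
        i = int(rng.integers(0, lr.shape[0] - lr_psize + 1))
        j = int(rng.integers(0, lr.shape[1] - lr_psize + 1))
        X.append(lr[i:i + lr_psize, j:j + lr_psize].ravel())
        Y.append(hr[i * scale:i * scale + hr_psize,
                    j * scale:j * scale + hr_psize].ravel())
    return np.array(Y).T, np.array(X).T                  # columns: D=81, d=9

Y_toy, X_toy = make_patch_pairs(np.random.rand(120, 120), n_pairs=200, seed=0)
```

Each column of `Y_toy`/`X_toy` is one vectorized block, matching the column-vector convention of the embodiment.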
Step 2: for each low-resolution image block $x_t$ of the input low-resolution image, calculate the coding coefficients of its sparse reconstruction over the low-resolution image block dictionary, together with their support set.

Sparse coding is performed over the low-resolution training set X; the resulting coefficients, recorded as the sparse coding coefficients, are obtained from

$$\hat{\theta} = \arg\min_{\theta} \|x_t - X\theta\|_2 + \lambda_1 \|\theta\|_1$$

where $\|\cdot\|_1$ denotes the L1 norm, $\|\cdot\|_2$ the L2 norm, $\lambda_1$ is a parameter balancing coding error against sparsity, $\theta$ is a coding coefficient vector of length N, and $\arg\min$ returns the minimizing value $\hat{\theta}$. In this embodiment the parameter $\lambda_1$ is set to 0.1.
The support set S is defined as

$$S = \mathrm{support}(\hat{\theta})$$

where $\hat{\theta}$ denotes the sparse coding coefficients and $\mathrm{support}(\hat{\theta})$ is the set of indices of their non-zero elements. The high-resolution image block set (a subset of the high-resolution training set Y) and the low-resolution image block set (a subset of the low-resolution training set X) corresponding to the support set are recorded as the high-resolution image block support set $Y_S = \{y_i \mid i \in S\}$ and the low-resolution image block support set $X_S = \{x_i \mid i \in S\}$.
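A minimal numerical sketch of the sparse coding and support-set extraction above (the patent only states the l1-regularized objective; ISTA is one standard solver for its squared-error form, and any l1 solver would do):

```python
import numpy as np

def sparse_code(x_t, X, lam1=0.1, n_iter=500):
    """ISTA for  min_theta ||x_t - X theta||_2^2 + lam1 * ||theta||_1
    (squared-error form of the patent's coding objective)."""
    L = 2 * np.linalg.norm(X, 2) ** 2 + 1e-12    # Lipschitz constant of the gradient
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = theta - 2 * X.T @ (X @ theta - x_t) / L
        theta = np.sign(z) * np.maximum(np.abs(z) - lam1 / L, 0.0)  # soft threshold
    return theta

def support(theta, tol=1e-8):
    """S = support(theta_hat): indices of the non-zero coding coefficients."""
    return np.flatnonzero(np.abs(theta) > tol)

# toy check with an orthonormal dictionary: only the first atom survives
theta_hat = sparse_code(np.array([1.0, 0.02, 0.0, 0.0]), np.eye(4))
S = sorted(support(theta_hat))      # then Y_S = Y[:, S], X_S = X[:, S]
```

For the orthonormal toy dictionary the solution is the soft-thresholded input, so the weak second coefficient (0.02) is driven to zero and the support set contains only index 0.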
Step 3: for each low-resolution image block of the input low-resolution image, construct a neighborhood similarity matrix W within the high-resolution image block support set and obtain the manifold constraint term; this can be carried out through the following substeps:
step 3.1, supporting set Y of high-resolution image blocksSAny one of the high resolution image blocks yiViewed as one vertex constituting the adjacency matrix graph G; connecting any two vertices yiAnd yjThe weight of the edge of (1) is wijThe value of i is 1,2,.. and the value of K, j is 1,2,. and K, i is not equal to j, wherein K is the number of the image blocks in the high-resolution image block supporting set;
establishing a similarity matrix W of the high-resolution image block support set according to the following formula:
<math> <mrow> <mover> <msub> <mi>W</mi> <mrow> <mo>&CenterDot;</mo> <mi>i</mi> </mrow> </msub> <mo>^</mo> </mover> <mo>=</mo> <munder> <mrow> <mi>arg</mi> <mi>min</mi> </mrow> <msub> <mi>W</mi> <mrow> <mo>&CenterDot;</mo> <mi>i</mi> </mrow> </msub> </munder> <msub> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>y</mi> <mi>i</mi> </msub> <mo>-</mo> <msub> <mi>Y</mi> <mi>S</mi> </msub> <msub> <mi>W</mi> <mrow> <mo>&CenterDot;</mo> <mi>i</mi> </mrow> </msub> <mo>|</mo> <mo>|</mo> </mrow> <mn>2</mn> </msub> <mo>+</mo> <msub> <mi>&lambda;</mi> <mn>2</mn> </msub> <msub> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>W</mi> <mrow> <mo>&CenterDot;</mo> <mi>i</mi> </mrow> </msub> <mo>|</mo> <mo>|</mo> </mrow> <mn>1</mn> </msub> <mo>,</mo> <mi>s</mi> <mo>.</mo> <mi>t</mi> <mo>.</mo> <msub> <mi>W</mi> <mi>ii</mi> </msub> <mo>=</mo> <mn>0</mn> </mrow> </math>
W·iis the ith column vector of W, WiiIs the element on the diagonal of W,returning to the variable W·iWhen the function of (A) reaches a minimum value W·iValue ofYSFor high resolution image block support set, λ2Is yiCoding error sum W·iThe sparsity balance parameter, in this embodiment, is suggested to be 0.15. Maintaining the geometry of the high-resolution image patch support set by the similarity matrix W for manifold reconstruction of the high-resolution image patches;
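The column-wise construction of W can be sketched as follows (again with an assumed ISTA solver; the constraint W_ii = 0 is enforced by removing the i-th column of the support set before solving):

```python
import numpy as np

def similarity_matrix(Y_S, lam2=0.15, n_iter=500):
    """Sparse reconstruction weights over the HR support set: column W[:, i]
    reconstructs y_i from the other support patches (W_ii = 0 by construction)."""
    D, K = Y_S.shape
    W = np.zeros((K, K))
    for i in range(K):
        idx = [j for j in range(K) if j != i]
        B = Y_S[:, idx]
        # ISTA for  min_w ||y_i - B w||^2 + lam2 * ||w||_1
        L = 2 * np.linalg.norm(B, 2) ** 2 + 1e-12
        w = np.zeros(K - 1)
        for _ in range(n_iter):
            z = w - 2 * B.T @ (B @ w - Y_S[:, i]) / L
            w = np.sign(z) * np.maximum(np.abs(z) - lam2 / L, 0.0)
        W[idx, i] = w
    return W

# toy support set: columns 0 and 1 are identical, column 2 is orthogonal,
# so y_0 should be explained almost entirely by y_1 (and vice versa)
Y_S_toy = np.array([[1.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
W_toy = similarity_matrix(Y_S_toy)
```

The l1 penalty slightly shrinks the weight below 1 (soft thresholding), which is expected for a lasso-type reconstruction.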
step 3.2, for input low resolution image block xtThe present invention seeks to learn a mapping function f (x, P) ═ Px from the low-resolution image block dictionary to the high-resolution image block dictionary such that the following cost function is minimized:
<math> <mrow> <mi>&epsiv;</mi> <mrow> <mo>(</mo> <mi>P</mi> <mo>)</mo> </mrow> <mo>=</mo> <munder> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>&Element;</mo> <mi>S</mi> </mrow> </munder> <msup> <mrow> <mo>(</mo> <msub> <mi>Px</mi> <mi>i</mi> </msub> <mo>-</mo> <msub> <mi>y</mi> <mi>i</mi> </msub> <mo>)</mo> </mrow> <mn>2</mn> </msup> <mo>+</mo> <mi>&alpha;</mi> <msubsup> <mrow> <mo>|</mo> <mo>|</mo> <mi>P</mi> <mo>|</mo> <mo>|</mo> </mrow> <mi>H</mi> <mn>2</mn> </msubsup> </mrow> </math>
in the above formula, α is a regularization coefficient, and in this embodiment, a value of 0.3 is suggested; p is a mapping matrix of dimension D x D to be learned,is the induced norm of P in Hilbert (Hilbert) space. Within the support set, the above formula can be converted to:
<math> <mrow> <mi>&epsiv;</mi> <mrow> <mo>(</mo> <mi>P</mi> <mo>)</mo> </mrow> <mo>=</mo> <msubsup> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>PX</mi> <mi>S</mi> </msub> <mo>-</mo> <msub> <mi>Y</mi> <mi>S</mi> </msub> <mo>|</mo> <mo>|</mo> </mrow> <mi>F</mi> <mn>2</mn> </msubsup> <mo>+</mo> <mi>&alpha;</mi> <msubsup> <mrow> <mo>|</mo> <mo>|</mo> <mi>P</mi> <mo>|</mo> <mo>|</mo> </mrow> <mi>F</mi> <mn>2</mn> </msubsup> </mrow> </math>
wherein,representing the F-norm.
During the low-resolution to high-resolution reconstruction, a manifold constraint term is minimized so as to preserve the geometry of the high-resolution image block support set:

$$\sum_{i \in S} \|P x_i - P X_S W_{\cdot i}\|_2^2 = \|P X_S - P X_S W\|_F^2 = \|P X_S (I - W)\|_F^2$$

where I is the identity matrix; $\|P X_S (I - W)\|_F^2$ is the manifold constraint term sought.
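A quick numerical check of the chain of equalities above, on toy random matrices (not patent data):

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.standard_normal((4, 3))     # toy mapping matrix (D=4, d=3)
X_S = rng.standard_normal((3, 5))   # toy LR support set (K=5 blocks as columns)
W = rng.standard_normal((5, 5))
np.fill_diagonal(W, 0.0)            # W_ii = 0, as in the construction of W

# left side: per-column residuals; right side: one Frobenius norm
lhs = sum(np.sum((P @ X_S[:, i] - P @ X_S @ W[:, i]) ** 2) for i in range(5))
rhs = np.linalg.norm(P @ X_S @ (np.eye(5) - W), 'fro') ** 2
```

The two quantities agree because column i of $PX_S - PX_SW$ is exactly $Px_i - PX_SW_{\cdot i}$, and the squared Frobenius norm sums the squared column norms.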
In step 4, for each low-resolution image block of the input low-resolution image, the mapping matrix P between the low-resolution image block support set $X_S$ and the corresponding high-resolution image block support set $Y_S$ is reconstructed under the constraint of the similarity matrix W obtained in step 3, by minimizing

$$O_{MSSR} = \|P X_S - Y_S\|_F^2 + \alpha \|P\|_F^2 + \beta \|P X_S (I - W)\|_F^2$$

where $\alpha$ and $\beta$ are regularization parameters, suggested in this embodiment as 0.3 and 10 respectively; they balance the contributions of the three terms to the objective $O_{MSSR}$. The first term of $O_{MSSR}$ is the data term, the second a smoothness term on the mapping function, and the third the manifold constraint term. Setting the partial derivative of the objective with respect to P to zero and applying standard matrix-derivative identities yields

$$P = Y_S X_S^T \left( X_S X_S^T + \alpha I + \beta X_S G X_S^T \right)^{-1}$$

where $G = (I - W)(I - W)^T$, and P is the reconstructed mapping matrix between the low-resolution image block support set and the corresponding high-resolution image block support set.
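The closed-form solution for P might be computed as follows (a sketch; the toy dimensions and variable names are illustrative, not from the patent):

```python
import numpy as np

def mapping_matrix(X_S, Y_S, W, alpha=0.3, beta=10.0):
    """P = Y_S X_S^T (X_S X_S^T + alpha I + beta X_S G X_S^T)^{-1},
    with G = (I - W)(I - W)^T, the minimizer of O_MSSR."""
    d, K = X_S.shape
    I_K = np.eye(K)
    G = (I_K - W) @ (I_K - W).T
    A = X_S @ X_S.T + alpha * np.eye(d) + beta * X_S @ G @ X_S.T
    return Y_S @ X_S.T @ np.linalg.inv(A)

# toy data: d=3-dim LR blocks mapped to D=5-dim HR blocks, K=4 support samples
rng = np.random.default_rng(1)
X_S = rng.standard_normal((3, 4))
Y_S = rng.standard_normal((5, 4))
W = rng.standard_normal((4, 4)) * 0.1
np.fill_diagonal(W, 0.0)
P = mapping_matrix(X_S, Y_S, W)

def objective(P, alpha=0.3, beta=10.0):
    """O_MSSR evaluated at P, for checking optimality of the closed form."""
    return (np.linalg.norm(P @ X_S - Y_S, 'fro') ** 2
            + alpha * np.linalg.norm(P, 'fro') ** 2
            + beta * np.linalg.norm(P @ X_S @ (np.eye(4) - W), 'fro') ** 2)
```

Because the alpha term makes the objective strictly convex in P, the closed-form P is the unique global minimizer, so any perturbation of P increases the objective.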
Step 5: reconstruct the high-resolution image corresponding to the input low-resolution image according to the mapping matrix P obtained in step 4.

In the embodiment, the input low-resolution image $X_t$ is divided into M mutually overlapping low-resolution image blocks; for any low-resolution image block $x_t$, the corresponding high-resolution image block is obtained from

$$y_t = P x_t$$

where $y_t$ denotes the high-resolution image block corresponding to $x_t$.

After all the high-resolution image blocks have been obtained, they are integrated (averaging over overlapping regions) to form the complete high-resolution image $Y_t$, which can be output as the prediction result, completing the reconstruction.
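The block integration with overlap averaging can be sketched as follows; `predict_patch` stands in for the per-block mapping $y_t = Px_t$ (in the patent a separate P is learned for every input block, while this sketch simply takes a caller-supplied function, and border handling when the step does not tile the image exactly is omitted):

```python
import numpy as np

def reconstruct(lr_img, predict_patch, lr_psize=3, scale=3, overlap=1):
    """Slide overlapping LR patches over the input image, map each to an HR
    patch via predict_patch, and average the HR patches where they overlap."""
    hr_psize = lr_psize * scale
    H, Wd = lr_img.shape[0] * scale, lr_img.shape[1] * scale
    out = np.zeros((H, Wd))
    cnt = np.zeros((H, Wd))
    step = lr_psize - overlap
    for i in range(0, lr_img.shape[0] - lr_psize + 1, step):
        for j in range(0, lr_img.shape[1] - lr_psize + 1, step):
            x_t = lr_img[i:i + lr_psize, j:j + lr_psize].ravel()
            y_t = predict_patch(x_t).reshape(hr_psize, hr_psize)
            out[i * scale:i * scale + hr_psize,
                j * scale:j * scale + hr_psize] += y_t
            cnt[i * scale:i * scale + hr_psize,
                j * scale:j * scale + hr_psize] += 1
    cnt[cnt == 0] = 1                # guard uncovered pixels
    return out / cnt                 # average where patches overlap

# toy run: a constant predictor must yield a constant HR image
out_toy = reconstruct(np.ones((5, 5)), lambda x: np.ones(81))
```

Averaging at the overlaps is exactly the integration step described above; it suppresses block seams between neighboring HR patches.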
Like the local learning regression method of document 4, the method of the invention learns a mapping from low-resolution to high-resolution image blocks; local learning regression, however, selects a fixed number K of nearest-neighbor sample points for the mapping learning. The present method needs no such preset: it selects the neighboring points adaptively and reveals the mapping between high- and low-resolution image blocks within the sparse support domain. Moreover, local learning regression ignores the geometric structure of the image block manifold, even though that structure plays an important role in selecting sample image blocks; the present method instead preserves the geometric structure of the original high-resolution block manifold to help reconstruct the target high-resolution block. Compared with other methods, it therefore better reveals the similar local geometric manifold structures of the high- and low-resolution image block spaces, improves learning capability, and achieves a better reconstruction effect.
To verify the effectiveness of the invention, five widely used test images were used: "barbara", "foreman", "house", "lena" and "zebra". Several state-of-the-art methods, namely bicubic interpolation, neighborhood embedding (document 1), sparse coding (document 3) and local learning regression (document 4), serve as comparison baselines. Peak signal-to-noise ratio (PSNR, in dB), root mean square error (RMSE) and structural similarity (SSIM) are used to evaluate the objective quality of the super-resolution results. Because the human eye is more sensitive to changes in luminance, super-resolution reconstruction is applied only to the luminance channel; the chrominance components are simply up-sampled with bicubic interpolation.
To capture the high-frequency information of the low-resolution image, gradients in four directions (two vertical and two horizontal) are extracted as its feature representation. In this embodiment the high-resolution image blocks are 9 × 9 pixels and the low-resolution image blocks 3 × 3 pixels. 5000 image block pairs are randomly selected from the training images of document 3 as the sample pairs of the embodiment; they also serve as the training set for neighborhood embedding and local learning regression. The neighbor count of the neighborhood embedding method is 10, and the coefficient parameter of the sparse coding method is 0.1. For fairness, the present method and the sparse coding method use the same training dictionary.
To demonstrate the superiority of the method, the PSNR (dB), RMSE and SSIM values obtained by the method on the 5 test images (averaged over the 5 images) are compared with those of the other methods. The average PSNR values of the bicubic method, the neighborhood embedding method, the sparse coding method, the local learning regression method and the method of the invention are 28.76, 29.02, 29.63, 29.61 and 29.95 respectively; the average RMSE values are 9.56, 9.24, 8.70, 8.69 and 8.39 respectively; and the average SSIM values are 0.8332, 0.8364, 0.8466, 0.8499 and 0.8595 respectively. In average PSNR, the method of the invention is 0.32 dB higher than the best competing method; in average RMSE, it is 0.30 lower than the best competing method, the local learning regression method (a smaller RMSE indicates a smaller reconstruction error); and in average SSIM, it is 0.0096 higher than the best competing method, the local learning regression method.
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (1)

1. A single-frame image super-resolution reconstruction method based on manifold regularization sparse support regression is characterized by comprising the following steps:
step 1, constructing a high-resolution image block training set and a corresponding low-resolution image block training set, wherein the high-resolution image block training set is composed of a plurality of high-resolution image blocks, and the low-resolution image block training set is composed of a plurality of corresponding low-resolution image blocks; dividing an input low-resolution image into a plurality of low-resolution image blocks which are overlapped with each other, wherein the size of each low-resolution image block is the same as that of a low-resolution image block in a low-resolution image block training set;
step 2, for each low-resolution image block in the input low-resolution image, calculating a sparse coding coefficient and a support set which are subjected to sparse reconstruction by taking a low-resolution image block training set as a low-resolution image block dictionary to obtain a high-resolution image block support set and a low-resolution image block support set corresponding to the support set;
step 3, constructing a similar matrix W of a neighborhood in a high-resolution image block support set for each low-resolution image block in the input low-resolution image, and obtaining a manifold constraint item;
step 4, for each low-resolution image block in the input low-resolution image, according to the similar matrix W obtained in the step 3, constraining and reconstructing a mapping matrix P between the low-resolution image block support set and the corresponding high-resolution image block support set;
step 5, for each low-resolution image block in the input low-resolution image, reconstructing the corresponding high-resolution image block according to the mapping matrix P obtained in step 4; after the high-resolution image blocks corresponding to all the low-resolution image blocks in the input low-resolution image are obtained, fusing them into a high-resolution image and outputting it;
the realization is as follows,
the high-resolution image block training set obtained in step 1 is denoted Y = {y_i}_{i=1}^N, and the corresponding low-resolution image block training set is denoted X = {x_i}_{i=1}^N, where y_i denotes the ith high-resolution image block in the high-resolution image block training set and x_i denotes the ith low-resolution image block in the low-resolution image block training set; the total number of high-resolution image blocks in the high-resolution image block training set and the total number of low-resolution image blocks in the low-resolution image block training set are both N;
in step 2, for any low-resolution image block x_t of the input low-resolution image, sparse coding is performed over the low-resolution image block training set X, and the sparse coding coefficient is obtained through the following formula,
\hat{\theta} = \arg\min_{\theta} \| x_t - X\theta \|_2 + \lambda_1 \| \theta \|_1
where λ_1 is a balance parameter between the coding error and the sparsity, θ is a coding coefficient vector of length N, and \arg\min_{\theta} returns the value \hat{\theta} of θ at which the function of the variable θ attains its minimum;
The support set S is defined as follows,
S = \mathrm{support}(\hat{\theta})
where \hat{\theta} denotes the sparse coding coefficient and \mathrm{support}(\hat{\theta}) is the index set of its non-zero elements; the high-resolution and low-resolution image block sets corresponding to the support set S are denoted the high-resolution image block support set Y_S = {y_i | i ∈ S} and the low-resolution image block support set X_S = {x_i | i ∈ S};
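The ℓ1-penalized coding of step 2 can be computed with any LASSO solver. The claim writes the fidelity term as ‖x_t − Xθ‖_2; the sketch below uses the squared form min_θ ‖x_t − Xθ‖²₂ + λ₁‖θ‖₁, which is the standard sparse-coding objective. The solver (iterative soft-thresholding, ISTA), step size, and iteration count are implementation choices, not part of the claim:

```python
import numpy as np

def sparse_code(x_t, X, lam=0.1, n_iter=500):
    """ISTA for  min_theta ||x_t - X @ theta||_2^2 + lam * ||theta||_1.

    X: (d, N) dictionary of low-resolution blocks as columns; x_t: (d,).
    Returns the sparse coefficient vector theta_hat of length N.
    """
    N = X.shape[1]
    theta = np.zeros(N)
    # Lipschitz constant of the gradient of the smooth part is 2*sigma_max(X)^2
    L = np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        grad = X.T @ (X @ theta - x_t)       # half of the true gradient
        z = theta - grad / L                 # gradient step (step size 1/(2L))
        # soft-thresholding (proximal operator of the l1 penalty)
        theta = np.sign(z) * np.maximum(np.abs(z) - lam / (2 * L), 0.0)
    return theta

def support(theta, tol=1e-8):
    """Index set of non-zero coefficients (the support set S)."""
    return np.flatnonzero(np.abs(theta) > tol)
```

With an orthonormal dictionary this reduces to element-wise soft-thresholding of the correlations, which makes the routine easy to sanity-check.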
In step 3, each high-resolution image block y_i in the high-resolution image block support set Y_S is viewed as one vertex of the adjacency graph G; the weight of the edge connecting any two vertices y_i and y_j is w_ij, where i = 1, 2, …, K, j = 1, 2, …, K, i ≠ j, and K is the number of image blocks in the high-resolution image block support set;
the similarity matrix W of the high-resolution image block support set is established according to the following formula,
\hat{W}_{\cdot i} = \arg\min_{W_{\cdot i}} \| y_i - Y_S W_{\cdot i} \|_2 + \lambda_2 \| W_{\cdot i} \|_1, \quad \text{s.t. } W_{ii} = 0
where W_{·i} is the ith column vector of the similarity matrix W, W_{ii} is the element on the diagonal of W, \arg\min_{W_{\cdot i}} returns the value \hat{W}_{\cdot i} of W_{·i} at which the function of W_{·i} attains its minimum, and λ_2 is the parameter balancing the coding error of y_i against the sparsity of W_{·i};
the manifold constraint term is constructed in the following way,
\sum_{i \in S} \| P x_i - P X_S W_{\cdot i} \|_2^2 = \| P X_S - P X_S W \|_F^2 = \| P X_S (I - W) \|_F^2
wherein I is an identity matrix;
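The chain of equalities in the manifold constraint term follows from stacking the per-column reconstructions, and the trace form ‖PX_S(I−W)‖_F² = tr(PX_S G X_S^T P^T) with G = (I−W)(I−W)^T is what the closed-form solution below relies on. A small numerical check, with illustrative sizes and random data:

```python
import numpy as np

rng = np.random.default_rng(0)
K, d_lo, d_hi = 6, 9, 81                 # illustrative sizes only
X_S = rng.standard_normal((d_lo, K))     # low-res support set, blocks as columns
P = rng.standard_normal((d_hi, d_lo))    # an arbitrary mapping matrix
W = rng.standard_normal((K, K))
np.fill_diagonal(W, 0.0)                 # W_ii = 0, as in the constraint

I = np.eye(K)
G = (I - W) @ (I - W).T

lhs = np.linalg.norm(P @ X_S - P @ X_S @ W, "fro") ** 2
mid = np.linalg.norm(P @ X_S @ (I - W), "fro") ** 2
trc = np.trace(P @ X_S @ G @ X_S.T @ P.T)
assert np.isclose(lhs, mid) and np.isclose(mid, trc)
```

The identity holds for any W; sparsity and the zero diagonal only affect which neighborhoods the constraint actually couples.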
in step 4, the mapping matrix P is obtained by minimizing the following formula,
O_{MSSR} = \| P X_S - Y_S \|_F^2 + \alpha \| P \|_F^2 + \beta \| P X_S (I - W) \|_F^2
where α and β are regularization coefficients; taking the derivative of the objective function O_{MSSR} with respect to P and using matrix properties yields the mapping matrix P as follows,
P = Y_S X_S^T \left( X_S X_S^T + \alpha I + \beta X_S G X_S^T \right)^{-1}
where G = (I - W)(I - W)^T;
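The closed form for P comes from setting ∂O_MSSR/∂P = 2(PX_S − Y_S)X_S^T + 2αP + 2βPX_S G X_S^T to zero. A sketch that builds P and numerically verifies the gradient vanishes at it; sizes and data are illustrative, and the function name is not from the patent:

```python
import numpy as np

def mapping_matrix(X_S, Y_S, W, alpha=0.1, beta=0.1):
    """Closed-form P = Y_S X_S^T (X_S X_S^T + alpha*I + beta*X_S G X_S^T)^-1."""
    K = W.shape[0]
    G = (np.eye(K) - W) @ (np.eye(K) - W).T
    A = X_S @ X_S.T + alpha * np.eye(X_S.shape[0]) + beta * X_S @ G @ X_S.T
    return Y_S @ X_S.T @ np.linalg.inv(A)

rng = np.random.default_rng(1)
K, d_lo, d_hi = 6, 9, 81
X_S = rng.standard_normal((d_lo, K))     # low-res support set
Y_S = rng.standard_normal((d_hi, K))     # high-res support set
W = rng.standard_normal((K, K))
np.fill_diagonal(W, 0.0)
alpha, beta = 0.1, 0.1

P = mapping_matrix(X_S, Y_S, W, alpha, beta)

# the gradient of O_MSSR with respect to P should vanish at the minimiser
G = (np.eye(K) - W) @ (np.eye(K) - W).T
grad = (2 * (P @ X_S - Y_S) @ X_S.T + 2 * alpha * P
        + 2 * beta * P @ X_S @ G @ X_S.T)
assert np.allclose(grad, 0.0, atol=1e-6)
```

The ridge term αI keeps the matrix being inverted well conditioned even when the support set is smaller than the feature dimension.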
in step 5, for any low-resolution image block x_t, the corresponding high-resolution image block is calculated by the following formula,
y_t = P x_t
where y_t denotes the high-resolution image block corresponding to the low-resolution image block x_t.
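Step 5 ends by integrating the overlapping high-resolution blocks into one image. The claim does not prescribe the fusion rule; the sketch below uses the averaging of overlapping pixels that is standard in patch-based super-resolution, with block size and positions as assumed parameters:

```python
import numpy as np

def fuse_blocks(blocks, positions, out_shape, block=9):
    """Average overlapping high-resolution blocks into one image.

    blocks: list of (block, block) arrays; positions: top-left (r, c) of each.
    Pixels covered by several blocks are averaged.
    """
    acc = np.zeros(out_shape)
    cnt = np.zeros(out_shape)
    for b, (r, c) in zip(blocks, positions):
        acc[r:r + block, c:c + block] += b
        cnt[r:r + block, c:c + block] += 1.0
    cnt[cnt == 0] = 1.0          # avoid division by zero in uncovered areas
    return acc / cnt
```

Averaging suppresses block-boundary artifacts at the cost of slight smoothing in the overlap regions.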
CN201310147510.7A 2013-04-25 2013-04-25 Based on the single-frame image super-resolution reconstruction method of stream shape canonical sparse support regression Expired - Fee Related CN103226818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310147510.7A CN103226818B (en) 2013-04-25 2013-04-25 Based on the single-frame image super-resolution reconstruction method of stream shape canonical sparse support regression


Publications (2)

Publication Number Publication Date
CN103226818A CN103226818A (en) 2013-07-31
CN103226818B true CN103226818B (en) 2015-09-02

Family

ID=48837253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310147510.7A Expired - Fee Related CN103226818B (en) 2013-04-25 2013-04-25 Based on the single-frame image super-resolution reconstruction method of stream shape canonical sparse support regression

Country Status (1)

Country Link
CN (1) CN103226818B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103646533B (en) * 2013-11-22 2016-05-25 江苏大学 Traffic accident modeling and control method based on sparse multi-output regression
CN103714526B (en) * 2013-12-24 2016-04-20 西安电子科技大学 Based on the super-resolution image reconstruction method that sparse multiple manifold embeds
CN106780342A (en) * 2016-12-28 2017-05-31 深圳市华星光电技术有限公司 Single-frame image super-resolution reconstruction method and device based on the reconstruct of sparse domain
CN110971786B (en) * 2019-11-21 2022-04-19 维沃移动通信有限公司 Shooting method and electronic equipment

Citations (2)

Publication number Priority date Publication date Assignee Title
CN102708556A (en) * 2012-05-10 2012-10-03 武汉大学 Single image super resolution method on basis of maintenance of reversed image
CN102902961A (en) * 2012-09-21 2013-01-30 武汉大学 Face super-resolution processing method based on K neighbor sparse coding average value constraint


Non-Patent Citations (1)

Title
Efficient Single Image Super-Resolution via Graph Embedding; Junjun Jiang et al.; International Conference on Multimedia and Expo; 2012-12-31; full text *

Also Published As

Publication number Publication date
CN103226818A (en) 2013-07-31


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150902

CF01 Termination of patent right due to non-payment of annual fee