CN107085731A - An image classification method based on RGB-D fusion features and sparse coding - Google Patents
- Publication number
- CN107085731A (application number CN201710328468.7A)
- Authority
- CN
- China
- Prior art keywords
- image
- feature
- fusion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The present invention discloses an image classification method based on RGB-D fusion features and sparse coding. The implementation steps are: (1) extract the dense SIFT features and PHOG features of the colour image and the depth image; (2) fuse the features extracted from the two kinds of images pairwise in the form of linear concatenation, finally obtaining four different fusion features; (3) cluster each fusion feature set with the K-means++ clustering method to obtain four different visual dictionaries; (4) perform locality-constrained linear coding on each visual dictionary to obtain different image representation sets; (5) classify the different image representation sets with a linear SVM, and apply a voting decision method to the multiple classification results obtained to determine the final classification. The classification accuracy of the present invention is high.
Description
Technical field
The present invention relates to the technical fields of computer vision and pattern recognition, and in particular to an image classification method based on RGB-D fusion features and sparse coding.
Background art
Today's society is in an era of information explosion: besides massive amounts of text, the multimedia information that humans encounter (pictures, video, etc.) is also growing explosively. To use, manage and retrieve images accurately and efficiently, computers must understand image content in a way comparable to human understanding. Image classification is an important route to solving the image understanding problem and strongly drives the development of multimedia retrieval technology. However, acquired images are affected by many factors such as viewpoint, illumination, occlusion and background, which has made image classification a persistently challenging problem in computer vision and artificial intelligence; consequently, many image feature descriptors and classification techniques have developed rapidly.
Among current image feature description and classification techniques, the mainstream algorithms are based on the Bag-of-Features (BOF) model. S. Lazebnik, in the article "Spatial pyramid matching for recognizing natural scene categories", proposed the Spatial Pyramid Matching (SPM) framework based on BOF; this algorithm recovers the spatial information lost by BOF and effectively improves the accuracy of image classification. However, BOF-based algorithms all encode features by vector quantization (Vector Quantization, VQ), and this hard-coding scheme ignores the correlation between visual words in the visual dictionary, causing a large error after image feature coding and thus degrading the performance of the whole image classification algorithm.
In recent years, as sparse coding (Sparse Coding, SC) theory has matured, it has become one of the hottest techniques in the image classification field. Yang, in the article "Linear spatial pyramid matching using sparse coding for image classification", proposed Sparse coding Spatial Pyramid Matching (ScSPM). The model replaces hard assignment with sparse coding and can optimize the weight coefficients of the visual dictionary so as to quantize image features better, greatly improving both the accuracy and the efficiency of image classification. But because the codebook is over-complete, features that were originally highly similar may end up represented completely differently, so the stability of the ScSPM model is poor. Wang et al. improved ScSPM by proposing Locality-constrained Linear Coding (LLC) in the article "Locality-constrained linear coding for image classification", pointing out that locality is more important than sparsity: a feature descriptor is represented by multiple bases in the visual dictionary, and similar feature descriptors obtain similar codes by sharing their local bases, which greatly improves the instability of ScSPM.
The above methods all target the classification of colour images and ignore the depth information of the object or scene, yet depth information is an important clue for image classification: it readily separates foreground from background by distance and directly reflects the three-dimensional information of an object or scene. With the rise of Kinect, depth images have become ever easier to acquire, and algorithms that incorporate depth information into image classification have begun to flourish. Liefeng Bo et al., in the article "Kernel descriptors for visual recognition", extract image features from the perspective of kernel methods for image classification, but the defect of this algorithm is that the object must first be modelled in three dimensions, which is very time-consuming, so the algorithm is not very real-time. N. Silberman, in the article "Indoor scene segmentation using a structured light sensor", first extracts features from the depth image and the colour (RGB) image separately with the Scale Invariant Feature Transform (SIFT) algorithm, then fuses the features, and afterwards performs image classification with SPM coding. A. Janoch, in the article "A Category-Level 3D Object Dataset: Putting the Kinect to Work", extracts Histogram of Oriented Gradients (HOG) features from the depth image and the colour image separately, and realizes the final image classification after feature fusion. Mirdanies M et al., in the article "Object recognition system in remote controlled weapon station using SIFT and SURF methods", fuse the SIFT features extracted from the RGB image with the SURF features of the depth image and use the fused features for target classification. These algorithms all fuse RGB features and depth features at the feature level and can effectively improve the precision of image classification. But this class of algorithms likewise has a defect: the features extracted from the RGB image and the depth image are each a single feature, and a single feature cannot capture enough of the information in an image, so the resulting fusion feature cannot express the image content sufficiently. The reason is that RGB images are vulnerable to illumination change, viewpoint change, geometric deformation, shadow and occlusion, while depth images are easily affected by the imaging device, which causes holes and noise in the image; no single image feature can stay robust to all of these factors, so information in the image is inevitably lost.
Therefore, it is necessary to design a more accurate image classification method.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the shortcomings of the prior art, to provide an image classification method that integrates RGB-D fusion features with sparse coding and has high accuracy and good stability.
To solve the above technical problem, the technical scheme provided by the present invention is:
An image classification method based on RGB-D fusion features and sparse coding, comprising a training stage and a test stage.
The training stage comprises the following steps:
Step A1: for each sample data, extract the dense SIFT (Scale-Invariant Feature Transform) and PHOG (Pyramid Histogram of Oriented Gradients) features of its RGB image and Depth image (colour image and depth image); the number of sample data is n.
Step A2: for each sample data, fuse the features extracted from its two kinds of images pairwise in the form of linear concatenation, obtaining four different fusion features; the fusion features of the same kind obtained from the n sample data form one set, giving four fusion feature sets.
Through the above feature extraction, the dense SIFT and PHOG features of the RGB image and the dense SIFT and PHOG features of the Depth image are obtained; the resulting features are then normalized so that all features have a similar scale.
To reduce the complexity of feature fusion, the present invention fuses features pairwise in the form of linear concatenation:
F = K1·α + K2·β (1)
where K1 and K2 are the weights of the corresponding features, K1 + K2 = 1, and in the present invention K1 = K2; α denotes a feature extracted from the RGB image and β a feature extracted from the Depth image. Four different fusion features are finally obtained: the RGBD-dense SIFT feature, the RGB-dense SIFT + D-PHOG feature, the RGB-PHOG + D-dense SIFT feature and the RGBD-PHOG feature. These represent, respectively, the fusion of the dense SIFT features of the RGB image and the Depth image, the fusion of the dense SIFT feature of the RGB image with the PHOG feature of the Depth image, the fusion of the PHOG feature of the RGB image with the dense SIFT feature of the Depth image, and the fusion of the PHOG features of the RGB image and the Depth image.
Step A3: cluster the fusion features in each of the four fusion feature sets separately, obtaining four different visual dictionaries.
Step A4: in each visual dictionary, perform feature coding on the fusion features with the locality-constrained linear coding model, obtaining four different image representation sets.
Step A5: construct a classifier from each of the four different fusion feature sets, image representation sets and the class labels of the corresponding sample data, obtaining four different classifiers.
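The pairwise fusion of step A2 can be sketched as follows. This is a minimal illustration under the assumption that "linear concatenation" in formula (1) means scaling each descriptor by its weight and concatenating the results; the feature dimensions and variable names are placeholders, not values prescribed by the method.

```python
import numpy as np

def fuse(alpha, beta, k1=0.5, k2=0.5):
    """Fuse an RGB-image feature alpha with a Depth-image feature beta
    by weighted linear concatenation (K1 + K2 = 1, K1 = K2 here)."""
    return np.concatenate([k1 * alpha, k2 * beta])

# Per-sample descriptors (dimensions are illustrative only).
rgb_sift, d_sift = np.random.rand(128), np.random.rand(128)
rgb_phog, d_phog = np.random.rand(189), np.random.rand(189)

# The four fusion features described in step A2.
fusions = {
    "RGBD-denseSIFT":       fuse(rgb_sift, d_sift),
    "RGB-denseSIFT+D-PHOG": fuse(rgb_sift, d_phog),
    "RGB-PHOG+D-denseSIFT": fuse(rgb_phog, d_sift),
    "RGBD-PHOG":            fuse(rgb_phog, d_phog),
}
```

Each fused vector would then be normalized, as described above, before clustering and coding.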
The test stage comprises the following steps:
Step B1: extract and fuse the features of the image to be classified according to the method in steps A1~A2, obtaining the four fusion features of the image to be classified.
Step B2: in the four visual dictionaries obtained in step A3, perform feature coding on the four fusion features obtained in step B1 with the locality-constrained linear coding model, obtaining four different image representations of the image to be classified.
Step B3: classify the four image representations obtained in step B2 with the four classifiers obtained in step A5, obtaining four class labels (the four class labels may contain identical labels, or may all be different).
Step B4: based on the four class labels obtained, determine the final class label of the image to be classified by the voting decision method, i.e. choose the class label with the most votes among the four class labels as the final class label.
Further, in step A3, the fusion features in a given fusion feature set are clustered with the K-means++ clustering method.
The traditional K-means algorithm for building a visual dictionary has the advantages of simplicity and efficiency, but it also has a limitation: the initial cluster centers are chosen at random, so the clustering result depends heavily on the initial centers, and a poor choice may trap the algorithm in a local optimum, which is fatal to correct image classification. To address this shortcoming, the present invention builds the visual dictionary with the K-means++ algorithm, replacing the random selection of initial cluster centers with a probability-based selection. For any fusion feature set, the concrete method of clustering it to obtain the corresponding visual dictionary is as follows:
3.1) The fusion features obtained from the n sample data form a set, i.e. the fusion feature set HI = {h1, h2, h3, …, hn}; set the number of clusters to m.
3.2) Randomly select one point in HI = {h1, h2, h3, …, hn} as the first initial cluster center S1; set the counter t = 1.
3.3) For each point hi in HI, hi ∈ HI, calculate the distance d(hi) between hi and St.
3.4) Select the next initial cluster center St+1: based on the formula
p(hi′) = d(hi′)² / Σ_{h∈HI} d(h)²
calculate the probability that the point hi′ is selected as the next initial cluster center, where hi′ ∈ HI; select the point with the maximum probability as the next initial cluster center St+1.
3.5) Let t = t + 1 and repeat steps 3.3) and 3.4) until t = m, i.e. until all m initial cluster centers have been selected.
3.6) Run the K-means algorithm with the selected initial cluster centers, finally generating m cluster centers.
3.7) Define each cluster center as one visual word in the visual dictionary; the number of clusters m is the size of the visual dictionary.
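The probability-based initialization of steps 3.1)–3.5) can be sketched as follows. Two assumptions are worth flagging: standard K-means++ measures each point's distance to the *nearest* already-chosen center (the description above speaks only of the most recently chosen St), and the description picks the most probable point deterministically rather than sampling, which the sketch follows.

```python
import numpy as np

def kmeanspp_init(H, m):
    """Select m initial cluster centers from the feature set H (n x d)
    by the probability-based rule of steps 3.1)-3.5)."""
    centers = [H[np.random.randint(len(H))]]  # 3.2) first center at random
    while len(centers) < m:
        # 3.3) distance from every point to its nearest chosen center
        d = np.min([np.linalg.norm(H - c, axis=1) for c in centers], axis=0)
        p = d ** 2 / np.sum(d ** 2)           # 3.4) selection probability
        centers.append(H[np.argmax(p)])       # pick the most probable point
    return np.stack(centers)                  # 3.6) feeds these into K-means

H = np.random.rand(200, 16)                   # 200 toy fusion features
S = kmeanspp_init(H, m=10)
```

The returned centers would then seed an ordinary K-means run, whose m final centers are the visual words.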
Further, in step A4, feature coding is performed on the fusion features with the locality-constrained linear coding model, whose expression is:
min_C Σ_{i=1}^{n} ||hi − B·ci||² + λ||di ⊙ ci||²,  s.t. 1ᵀci = 1, ∀i (2)
where hi is a fusion feature in the fusion feature set HI, i.e. the feature vector to be encoded, hi ∈ R^d, and d is the dimension of the fusion feature; B = [b1, b2, b3 … bm] is the visual dictionary built by the K-means++ algorithm, b1~bm are the m visual words in the visual dictionary, bj ∈ R^d; C = [c1, c2, c3 … cn] is the image representation set obtained by coding, where ci ∈ R^m is the sparse coding representation of one image after coding; λ is the LLC penalty factor; ⊙ denotes element-wise multiplication; the 1 in 1ᵀci denotes a vector whose elements are all 1, and the constraint 1ᵀci = 1 gives the LLC code translation invariance; di is defined as:
di = exp(dist(hi, B)/σ) (3)
where dist(hi, B) = [dist(hi, b1), dist(hi, b2), … dist(hi, bm)]ᵀ, dist(hi, bj) denotes the Euclidean distance between hi and bj, and σ is used to adjust the decay speed of the locality constraint weights.
The present invention adopts locality-constrained linear coding (Locality-constrained Linear Coding, LLC) because a locality constraint on the feature necessarily satisfies sparsity, whereas satisfying sparsity does not necessarily satisfy the locality constraint, so locality is more important than sparsity. LLC replaces the sparsity constraint with a locality constraint and achieves good performance.
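Model (2) has a closed-form per-feature solution; the sketch below follows the analytic solution given in Wang et al.'s LLC paper (solve a small linear system against the shifted dictionary, then normalize so the code sums to one). The values of λ and σ here are illustrative, not the patent's settings.

```python
import numpy as np

def llc_encode(h, B, lam=1e-4, sigma=1.0):
    """Encode one feature h (d,) over dictionary B (m, d) with the
    locality-constrained linear coding model; returns c with 1^T c = 1."""
    dist = np.linalg.norm(B - h, axis=1)     # Euclidean distance to each word
    d = np.exp(dist / sigma)                 # locality adaptor d_i, formula (3)
    Z = B - h                                # dictionary shifted by the feature
    C = Z @ Z.T                              # m x m local covariance
    C = C + lam * np.diag(d)                 # locality regularization
    c = np.linalg.solve(C, np.ones(len(B)))  # solve C c = 1
    return c / c.sum()                       # enforce 1^T c = 1

B = np.random.rand(20, 8)                    # 20 visual words, 8-dim features
h = np.random.rand(8)
c = llc_encode(h, B)
```

Words far from h get a large adaptor di and are pushed toward zero, which is what makes the resulting code local and therefore sparse.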
Further, in step A4, feature coding is performed on the fusion features with the approximate locality-constrained linear coding model. While solving for ci in formula (2), the feature vector hi to be encoded tends to select the visual words closest to it in the visual dictionary, forming a local coordinate system. Exploiting this rule, a simple approximate LLC feature coding scheme can be used to accelerate the coding process instead of solving formula (2): for any feature vector hi to be encoded, a k-nearest-neighbour search selects the k visual words in the visual dictionary B nearest to hi as a local visual word matrix Bi, and the code is obtained by solving a much smaller linear system:
min_C̃ Σ_{i=1}^{n} ||hi − Bi·c̃i||²,  s.t. 1ᵀc̃i = 1, ∀i (4)
where C̃ is the image representation set obtained by approximate coding and c̃i is the sparse coding representation of one image after approximate coding. Solving formula (4) analytically reduces the computational complexity of LLC feature coding from O(n²) to O(n + k²), where k << n, while the final performance differs little from that of LLC feature coding. Approximate LLC feature coding both retains locality and guarantees the sparsity requirement of the code, so the present invention uses the approximate LLC model for feature coding.
Further, k=50 is taken.
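The approximate coding of formula (4) can be sketched as follows: restrict the code to the k nearest visual words, solve the small constrained least-squares system, and leave the remaining entries zero. The toy dictionary here uses k = 5 rather than the k = 50 of the method, and the small ridge term `reg` is an assumption for numerical stability.

```python
import numpy as np

def llc_approx_encode(h, B, k=5, reg=1e-6):
    """Approximate LLC: code h (d,) over only the k visual words of
    B (m, d) nearest to h, solving min ||h - Bi^T c~||^2 s.t. 1^T c~ = 1."""
    idx = np.argsort(np.linalg.norm(B - h, axis=1))[:k]  # k-NN search in B
    Bi = B[idx]                                          # local word matrix
    Z = Bi - h
    C = Z @ Z.T + reg * np.eye(k)                        # small k x k system
    w = np.linalg.solve(C, np.ones(k))
    w = w / w.sum()                                      # sum-to-one constraint
    c = np.zeros(len(B))                                 # full sparse code
    c[idx] = w
    return c

B = np.random.rand(50, 8)
h = np.random.rand(8)
c = llc_approx_encode(h, B, k=5)
```

Only a k × k system is solved per feature, which is the source of the O(n + k²) complexity claimed above.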
Further, in step A1, the dense SIFT feature divides the image with a grid into feature blocks of equal size, with overlap between adjacent blocks; the center of each feature block serves as a feature point, and the SIFT feature descriptor of that feature point (the same descriptor as the traditional SIFT feature: a gradient histogram) is formed from all the pixels in the feature block; finally, these feature points with their SIFT descriptors make up the dense SIFT feature of the whole image.
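The dense sampling grid can be sketched as follows. With the block size of 16 and sampling interval of 8 pixels used later in the experiments, a 256×256 image yields 31 × 31 = 961 feature points (the function name and layout are illustrative).

```python
import numpy as np

def dense_grid_centers(h, w, block=16, stride=8):
    """Centers of overlapping block x block patches sampled every `stride`
    pixels; each center is one dense-SIFT feature point."""
    ys = np.arange(0, h - block + 1, stride) + block // 2
    xs = np.arange(0, w - block + 1, stride) + block // 2
    return [(int(y), int(x)) for y in ys for x in xs]

pts = dense_grid_centers(256, 256)
```

A SIFT gradient histogram would then be computed over the 16×16 patch around each center.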
The concrete steps of PHOG feature extraction are as follows:
1.1) Collect the edge information of the image: extract the edge contour of the image with the Canny edge detection operator and use this contour to describe the shape of the image.
1.2) Perform a pyramidal multi-level partition of the image; the number of blocks depends on the number of pyramid levels. The present invention divides the image into 3 levels: level 1 is the whole image; level 2 divides the image into 4 sub-regions of equal size; level 3 subdivides each of the 4 sub-regions of level 2 into 4 further sub-regions, finally yielding 4 × 4 sub-regions.
1.3) Extract the HOG (Histogram of Oriented Gradients) feature vector of each sub-region in each level.
1.4) Finally, cascade (concatenate) the HOG feature vectors of the sub-regions in each level of the image; after the cascaded HOG data are obtained, normalize the data, finally giving the PHOG feature of the whole image.
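The pyramid layout of steps 1.2)–1.4) can be sketched as follows. As a simplification, the per-region descriptor here is a plain 9-bin gradient orientation histogram standing in for HOG (no Canny edge masking, no cell/block normalization), so the final length is (1 + 4 + 16) × 9 = 189 under that assumption.

```python
import numpy as np

def orientation_hist(region, bins=9):
    """9-bin gradient orientation histogram of one region (stand-in for HOG)."""
    gy, gx = np.gradient(region.astype(float))
    ang = np.arctan2(gy, gx) % np.pi            # unsigned orientations in [0, pi)
    mag = np.hypot(gx, gy)                      # gradient magnitude as weight
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist

def phog(img, levels=3, bins=9):
    """Concatenate per-region histograms over a 3-level pyramid
    (1, 2x2, 4x4 regions), then L1-normalize (steps 1.2-1.4)."""
    feats = []
    h, w = img.shape
    for lv in range(levels):                    # level lv has 2^lv x 2^lv cells
        n = 2 ** lv
        for i in range(n):
            for j in range(n):
                cell = img[i*h//n:(i+1)*h//n, j*w//n:(j+1)*w//n]
                feats.append(orientation_hist(cell, bins))
    f = np.concatenate(feats)
    s = f.sum()
    return f / s if s > 0 else f

f = phog(np.random.rand(64, 64))
```

The concatenation order (coarse level first, then finer cells) mirrors the level-by-level cascade described above.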
Further, in step A5, the classifier is a linear SVM classifier.
Further, the voting decision method in step B4 may face the problem that different class labels receive the same, maximal number of votes; in this case a random selection is used, randomly choosing one of these equally voted class labels as the final class label.
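The voting decision of step B4, with the random tie-break just described, can be sketched as:

```python
import random
from collections import Counter

def vote(labels):
    """Majority vote over the four classifier outputs; ties between
    equally voted labels are broken by random choice (step B4)."""
    counts = Counter(labels)
    top = max(counts.values())
    winners = [lab for lab, c in counts.items() if c == top]
    return random.choice(winners)

# Example: three of the four classifiers agree.
final = vote(["kitchen", "desk", "kitchen", "kitchen"])
```

With four voters, a 2–2 split between two labels is the only genuine tie case, and it resolves to either label with equal probability.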
The beneficial effects of the invention are as follows:
The present invention selects multiple fusion features, which makes up for the insufficient information content of a single fusion feature and effectively improves the accuracy of image classification. The visual dictionary is built with the K-means++ algorithm, which replaces random selection of initial cluster centers with probability-based selection and effectively prevents the algorithm from falling into a local optimum. Finally, the voting decision method votes over the individual classification results, merging widely differing results; the final classification is determined by the voting decision, which guarantees the stability of the result.
Brief description of the drawings
Fig. 1 is the flow chart of the image classification method integrating RGB-D fusion features and sparse coding.
Fig. 2 shows the LLC feature coding model in step A5 of the training stage of the invention.
Fig. 3 shows the test image classification decision module in step B4 of the test stage of the invention.
Fig. 4 shows the recognition confusion matrix of the invention on the RGB-D Scenes dataset.
Embodiment
The present invention is described in more detail below with reference to a concrete example and the drawings. The described example is intended to aid understanding of the present invention and does not restrict it in any way.
Fig. 1 is the system flow chart of the image classification integrating RGB-D fusion features and sparse coding; the concrete implementation steps are as follows:
Step S1: extract the dense SIFT features and PHOG features of the RGB image and the Depth image.
Step S2: fuse the features extracted from the two kinds of images in the form of concatenation, finally obtaining four different fusion features.
Step S3: cluster the different fusion features with the K-means++ clustering method, obtaining four different visual dictionaries.
Step S4: perform locality-constrained linear coding on each visual dictionary, obtaining different image representation sets.
Step S5: construct classifiers for the different image representation sets with a linear SVM, and finally determine the final classification by a voting decision over the classification results of these four classifiers.
The image classification method based on integrated RGB-D fusion features and sparse coding is verified here with experimental data.
The experimental dataset used is the RGB-D Scenes dataset, a multi-view scene image dataset provided by the University of Washington. The dataset consists of 8 scene categories with 5972 pictures in total; all images were captured by a Kinect camera at a size of 640*480.
In the RGB-D Scenes dataset, all images are used for the experiment and resized to 256*256. For feature extraction, the sampling interval of the dense SIFT features is set to 8 pixels and the image block to 16 × 16; the PHOG feature extraction parameters are a block size of 16 × 16, a sampling interval of 8 pixels and 9 gradient orientations. When building the visual dictionary, the dictionary size is set to 200. The libsvm3.12 toolbox of the LIBSVM kit is used for SVM classification; 80% of the pictures in the dataset are used for training and 20% for testing.
In this experiment, the method of the invention is examined from two aspects: first, it is compared with the methods of several researchers with high current classification accuracy; second, the classification effects of different RGB-D fusion features and of the method of the invention are compared.
Table 1: Comparison of classification results on the RGB-D Scenes dataset
Classification method | Accuracy |
Linear SVM | 89.6% |
Gaussian kernel SVM | 90.0% |
Random forest | 90.1% |
HOG | 77.2% |
SIFT+SPM | 84.2% |
Method of the invention | 91.7% |
The comparison of classification accuracy with other methods is shown in Table 1. Liefeng Bo, in the article "Kernel descriptors for visual recognition", integrates three kinds of features and trains and classifies them with a linear SVM (Linear SVM), a Gaussian-kernel SVM (Kernel SVM) and a random forest (Random Forest) respectively, obtaining accuracies of 89.6%, 90.0% and 90.1% in this experiment. A. Janoch, in the article "A Category-Level 3D Object Dataset: Putting the Kinect to Work", extracts features from the depth image and the colour image with the HOG algorithm and realizes the final classification with an SVM classifier after feature fusion; in this experiment that method obtains an accuracy of 77.2%. N. Silberman, in the article "Indoor scene segmentation using a structured light sensor", first extracts the features of the depth image and the colour image with the SIFT algorithm, then fuses the features, performs feature coding with SPM, and finally classifies with an SVM; this algorithm obtains a classification accuracy of 84.2% in this experiment. The algorithm proposed by the present invention obtains an accuracy of 91.7%, an improvement of 1.6% over the best previous result, showing that the algorithm of the invention has good classification performance.
Table 2: Comparison of classification results of different fusion features on the RGB-D Scenes dataset
As can be seen from Table 2, when depth information is combined for image classification, the accuracy of classification based on a single fusion feature is lower than that based on multiple fusion features; image classification based on multi-feature fusion obtains better classification accuracy, but is still slightly below image classification based on decision fusion of multiple fusion features.
The specific embodiment of the present invention has been described above. It should be understood that the invention is not limited to the particular implementation above; any modification, equivalent substitution and improvement made within the spirit and principle of the present invention shall be included within the scope of protection of the invention.
Claims (8)
1. An image classification method based on RGB-D fusion features and sparse coding, characterised in that it comprises a training stage and a test stage:
the training stage comprises the following steps:
step A1: for each sample data, extract the dense SIFT and PHOG features of its RGB image and Depth image; the number of sample data is n;
step A2: for each sample data, fuse the features extracted from its two kinds of images pairwise in the form of linear concatenation, obtaining four different fusion features; the fusion features of the same type obtained from the n sample data form one set, giving four fusion feature sets;
step A3: cluster the fusion features in each of the four fusion feature sets separately, obtaining four different visual dictionaries;
step A4: in each visual dictionary, perform feature coding on the fusion features with the locality-constrained linear coding model, obtaining four different image representation sets;
step A5: construct a classifier from each of the four different fusion feature sets, image representation sets and the class labels of the corresponding sample data, obtaining four different classifiers;
the test stage comprises the following steps:
step B1: extract and fuse the features of the image to be classified according to the method in steps A1~A2, obtaining the four fusion features of the image to be classified;
step B2: in the four visual dictionaries obtained in step A3, perform feature coding on the four fusion features obtained in step B1 with the locality-constrained linear coding model, obtaining four different image representations of the image to be classified;
step B3: classify the four image representations obtained in step B2 with the four classifiers obtained in step A5, obtaining four class labels;
step B4: based on the four class labels obtained, determine the final class label of the image to be classified by the voting decision method, choosing the class label with the most votes among the four class labels as the final class label.
2. The image classification method based on RGB-D fusion features and sparse coding according to claim 1, characterised in that in step A3, the fusion features in a given fusion feature set are clustered with the K-means++ clustering method, and the method of building the corresponding visual dictionary is as follows:
3.1) denote this fusion feature set as HI = {h1, h2, h3, …, hn}, and set the number of clusters to m;
3.2) randomly select one fusion feature in HI as the first initial cluster center S1; set the counter t = 1;
3.3) for each fusion feature hi in HI, hi ∈ HI, calculate the distance d(hi) between hi and St;
3.4) select the next initial cluster center St+1: based on the formula
p(hi′) = d(hi′)² / Σ_{h∈HI} d(h)²
calculate the probability that the point hi′ is selected as the next initial cluster center, where hi′ ∈ HI; select the fusion feature with the maximum probability as the next initial cluster center St+1;
3.5) let t = t + 1 and repeat steps 3.3) and 3.4) until t = m, i.e. until all m initial cluster centers have been selected;
3.6) run the K-means algorithm with the selected initial cluster centers, finally generating m cluster centers;
3.7) define each cluster center as one visual word in the visual dictionary; the number of clusters m is the size of the visual dictionary.
3. the image classification method according to claim 2 based on RGB-D fusion features and sparse coding, its feature exists
In in the step A4, using local restriction uniform enconding model to fusion feature progress feature coding, model expression is such as
Under:
<mfenced open = "" close = "">
<mtable>
<mtr>
<mtd>
<mrow>
<munder>
<mrow>
<mi>m</mi>
<mi>i</mi>
<mi>n</mi>
</mrow>
<mi>C</mi>
</munder>
<munderover>
<mo>&Sigma;</mo>
<mrow>
<mi>i</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>n</mi>
</munderover>
<mo>|</mo>
<mo>|</mo>
<msub>
<mi>h</mi>
<mi>i</mi>
</msub>
<mo>-</mo>
<msub>
<mi>Bc</mi>
<mi>i</mi>
</msub>
<mo>|</mo>
<msup>
<mo>|</mo>
<mn>2</mn>
</msup>
<mo>+</mo>
<mi>&lambda;</mi>
<mo>|</mo>
<mo>|</mo>
<msub>
<mi>d</mi>
<mi>i</mi>
</msub>
<mo>&CircleTimes;</mo>
<msub>
<mi>c</mi>
<mi>i</mi>
</msub>
<mo>|</mo>
<msup>
<mo>|</mo>
<mn>2</mn>
</msup>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mtable>
<mtr>
<mtd>
<mrow>
<mi>s</mi>
<mo>.</mo>
<mi>t</mi>
<mo>.</mo>
</mrow>
</mtd>
<mtd>
<mrow>
<msup>
<mn>1</mn>
<mi>T</mi>
</msup>
<msub>
<mi>c</mi>
<mi>i</mi>
</msub>
<mo>=</mo>
<mn>1</mn>
<mo>,</mo>
<mo>&ForAll;</mo>
<mi>i</mi>
</mrow>
</mtd>
</mtr>
</mtable>
</mtd>
</mtr>
</mtable>
</mfenced>
In the formula: hi is a fusion feature in the fusion feature set HI, i.e., the feature vector to be encoded, hi ∈ Rd, where d is the dimension of the fusion feature; B = [b1, b2, b3, …, bm] is the visual dictionary built by the K-means++ algorithm, b1 to bm being the m visual words of the dictionary, bj ∈ Rd; C = [c1, c2, c3, …, cn] is the image representation set obtained by coding, where ci ∈ Rm is the coding coefficient of fusion feature hi after coding; λ is the LLC penalty factor; ⊗ denotes element-wise multiplication; the 1 in 1Tci is a vector whose elements are all 1, and the constraint 1Tci = 1 gives the locality-constrained linear coding model translation invariance; di is defined as:
$$d_i = \exp\!\left(\frac{\mathrm{dist}(h_i, B)}{\sigma}\right)$$
where dist(hi, B) = [dist(hi, b1), dist(hi, b2), …, dist(hi, bm)]T, dist(hi, bj) denotes the Euclidean distance between hi and bj, and σ adjusts the decay speed of the locality-constraint weights.
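For illustration, the LLC objective above admits a closed-form per-feature solution: solve a small regularised linear system, then normalise the coefficients to satisfy 1ᵀcᵢ = 1. The sketch below is an assumption-laden illustration, not the patent's implementation: `llc_encode` is a hypothetical name, and the normalisation of dᵢ by its maximum plus the default λ and σ values are assumed parameter choices.

```python
import numpy as np

def llc_encode(h, B, lam=1e-4, sigma=1.0):
    """Encode one feature h (shape (d,)) over dictionary B (shape (m, d))
    by minimising ||h - B^T c||^2 + lam ||d * c||^2  s.t.  1^T c = 1."""
    m = B.shape[0]
    # Locality adaptor d_i: grows with the distance from h to each word.
    dist = np.linalg.norm(B - h, axis=1)
    d = np.exp(dist / sigma)
    d = d / d.max()                      # scale normalisation (assumption)
    # Covariance of the dictionary shifted by h (each row: b_j - h).
    Bh = B - h
    C = Bh @ Bh.T
    C += np.eye(m) * lam * d**2          # locality-weighted regulariser diag(d^2)
    c = np.linalg.solve(C, np.ones(m))   # stationary point of the Lagrangian
    return c / c.sum()                   # enforce the sum-to-one constraint
```

Because the regulariser penalises coefficients on distant words, the solution concentrates on visual words near hi, which is what makes the code "locality-constrained".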
4. The image classification method based on RGB-D fusion features and sparse coding according to claim 1, characterized in that in step A4, an approximate locality-constrained linear coding model is used to encode the fusion features; the model expression is as follows:
$$\min_{\tilde{C}}\ \sum_{i=1}^{n}\left\|h_i - B_i\tilde{c}_i\right\|^2 \qquad \text{s.t.}\quad \mathbf{1}^{T}\tilde{c}_i = 1,\ \forall i$$
where Bi is the local visual-word matrix formed by the k visual words of the visual dictionary B nearest to the feature vector hi to be encoded, found by k-nearest-neighbor search; C̃ is the image representation set obtained by approximate coding, where c̃i is the coding coefficient of fusion feature hi after approximate coding.
5. The image classification method based on RGB-D fusion features and sparse coding according to claim 4, characterized in that k = 50.
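The approximate model of claims 4 and 5 restricts the least-squares problem to the k nearest visual words and sets every other coefficient to zero, which can be sketched as below. This is an illustrative sketch, not the patent's code: `approx_llc_encode` is a hypothetical name and the small ridge term added for numerical stability is an assumption.

```python
import numpy as np

def approx_llc_encode(h, B, k=50):
    """Approximate LLC: constrained least squares over the k nearest
    visual words only; coefficients of all other words are zero."""
    m = B.shape[0]
    k = min(k, m)
    dist = np.linalg.norm(B - h, axis=1)
    idx = np.argsort(dist)[:k]           # indices of the k nearest words
    Bi = B[idx] - h                      # local dictionary, shifted by h
    C = Bi @ Bi.T + 1e-8 * np.eye(k)     # tiny ridge for stability (assumption)
    w = np.linalg.solve(C, np.ones(k))
    w = w / w.sum()                      # enforce 1^T c = 1
    c = np.zeros(m)
    c[idx] = w                           # scatter back to full-dictionary length
    return c
```

With k = 50 as in claim 5, each feature is coded against a 50-word local dictionary instead of all m words, reducing the solve from an m x m system to a 50 x 50 one.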
6. The image classification method based on RGB-D fusion features and sparse coding according to any one of claims 1 to 5, characterized in that in step A1, the PHOG feature extraction comprises the following steps:
1.1) count the edge information of the image: extract the edge contour of the image with the Canny edge detection operator, and use this contour to describe the shape of the image;
1.2) perform pyramid multi-level partitioning of the image: the image is divided into 3 levels; level 1 is the whole image; level 2 divides the image into 4 sub-regions of equal size; level 3 subdivides each of the 4 level-2 sub-regions into 4 further sub-regions, finally giving 4 × 4 sub-regions;
1.3) extract the HOG feature vector of every sub-region at each level;
1.4) concatenate the HOG feature vectors of the sub-regions of all levels, then normalize the concatenated HOG data to obtain the PHOG feature of the whole image.
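The pyramid procedure of steps 1.1–1.4 can be sketched as below. This is a minimal self-contained illustration, not the patent's implementation: `phog` is a hypothetical name, the Canny edge map of step 1.1 is replaced here by gradient-magnitude weighting to avoid external dependencies, and the 9 orientation bins are an assumed parameter.

```python
import numpy as np

def phog(image, bins=9, levels=3):
    """Minimal PHOG sketch: orientation histograms over a 3-level spatial
    pyramid (1 + 4 + 16 regions), concatenated and globally normalised."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)                    # gradient magnitude (stands in for edges)
    ang = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned orientation in [0, pi)
    h, w = image.shape
    feats = []
    for level in range(levels):               # level l: a 2^l x 2^l grid
        n = 2 ** level
        for i in range(n):
            for j in range(n):
                ys = slice(i * h // n, (i + 1) * h // n)
                xs = slice(j * w // n, (j + 1) * w // n)
                hist, _ = np.histogram(
                    ang[ys, xs], bins=bins, range=(0, np.pi),
                    weights=mag[ys, xs])
                feats.append(hist)            # per-region HOG-style histogram
    v = np.concatenate(feats)                 # cascade all levels (step 1.4)
    return v / (np.linalg.norm(v) + 1e-12)    # global normalisation
```

With 9 bins the final descriptor has 9 × (1 + 4 + 16) = 189 dimensions; the per-region histograms are the step-1.3 HOG vectors and the final concatenation-plus-normalisation is step 1.4.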
7. The image classification method based on RGB-D fusion features and sparse coding according to any one of claims 1 to 5, characterized in that in step A5, the classifier is a linear SVM classifier.
8. The image classification method based on RGB-D fusion features and sparse coding according to any one of claims 1 to 5, characterized in that in step B4, if multiple class labels tie for the most votes, one of them is selected at random as the final class label.
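The tie-breaking rule of claim 8 amounts to a majority vote with a uniform random choice among tied winners, which can be sketched as below; `majority_vote` is a hypothetical name for illustration only.

```python
import random
from collections import Counter

def majority_vote(labels, rng=random):
    """Return the label with the most votes; break ties uniformly at random."""
    counts = Counter(labels)
    top = max(counts.values())
    winners = [lab for lab, c in counts.items() if c == top]
    return rng.choice(winners)   # random pick only matters when len(winners) > 1
```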
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710328468.7A CN107085731B (en) | 2017-05-11 | 2017-05-11 | Image classification method based on RGB-D fusion features and sparse coding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107085731A true CN107085731A (en) | 2017-08-22 |
CN107085731B CN107085731B (en) | 2020-03-10 |
Family
ID=59611626
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710328468.7A Active CN107085731B (en) | 2017-05-11 | 2017-05-11 | Image classification method based on RGB-D fusion features and sparse coding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107085731B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108090926A (en) * | 2017-12-31 | 2018-05-29 | 厦门大学 | A kind of depth estimation method based on double dictionary study |
CN108596256A (en) * | 2018-04-26 | 2018-09-28 | 北京航空航天大学青岛研究院 | One kind being based on RGB-D object identification grader building methods |
CN108805183A (en) * | 2018-05-28 | 2018-11-13 | 南京邮电大学 | A kind of image classification method of fusion partial polymerization descriptor and local uniform enconding |
CN108875080A (en) * | 2018-07-12 | 2018-11-23 | 百度在线网络技术(北京)有限公司 | A kind of image search method, device, server and storage medium |
CN109741484A (en) * | 2018-12-24 | 2019-05-10 | 南京理工大学 | Automobile data recorder and its working method with image detection and voice alarm function |
CN110443298A (en) * | 2019-07-31 | 2019-11-12 | 华中科技大学 | It is a kind of based on cloud-edge cooperated computing DDNN and its construction method and application |
CN111160387A (en) * | 2019-11-28 | 2020-05-15 | 广东工业大学 | Graph model based on multi-view dictionary learning |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105005786A (en) * | 2015-06-19 | 2015-10-28 | 南京航空航天大学 | Texture image classification method based on BoF and multi-feature fusion |
Non-Patent Citations (2)
Title |
---|
XIANG Chengyu: "Image classification based on RGB-D fusion features", Computer Engineering and Applications (《计算机工程与应用》) * |
SHEN Xiaoxia et al.: "Action recognition algorithm based on Kinect and pyramid features", Journal of Optoelectronics · Laser (《光电子•激光》) * |
Also Published As
Publication number | Publication date |
---|---|
CN107085731B (en) | 2020-03-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110443143B (en) | Multi-branch convolutional neural network fused remote sensing image scene classification method | |
Xia et al. | AID: A benchmark data set for performance evaluation of aerial scene classification | |
CN107085731A (en) | A kind of image classification method based on RGB D fusion features and sparse coding | |
Wang et al. | Aggregating rich hierarchical features for scene classification in remote sensing imagery | |
Cheng et al. | Effective and efficient midlevel visual elements-oriented land-use classification using VHR remote sensing images | |
Liu et al. | Scene classification using hierarchical Wasserstein CNN | |
CN106126581A (en) | Cartographical sketching image search method based on degree of depth study | |
CN102054178B (en) | A kind of image of Chinese Painting recognition methods based on local semantic concept | |
CN111222434A (en) | Method for obtaining evidence of synthesized face image based on local binary pattern and deep learning | |
CN109063649B (en) | Pedestrian re-identification method based on twin pedestrian alignment residual error network | |
CN106682233A (en) | Method for Hash image retrieval based on deep learning and local feature fusion | |
CN110109060A (en) | A kind of radar emitter signal method for separating and system based on deep learning network | |
CN106096557A (en) | A kind of semi-supervised learning facial expression recognizing method based on fuzzy training sample | |
CN108197538A (en) | A kind of bayonet vehicle searching system and method based on local feature and deep learning | |
CN108564094A (en) | A kind of Material Identification method based on convolutional neural networks and classifiers combination | |
CN104504362A (en) | Face detection method based on convolutional neural network | |
CN108009637B (en) | Station caption segmentation method of pixel-level station caption identification network based on cross-layer feature extraction | |
CN104966081B (en) | Spine image-recognizing method | |
CN105654122B (en) | Based on the matched spatial pyramid object identification method of kernel function | |
CN108921850B (en) | Image local feature extraction method based on image segmentation technology | |
CN114067444A (en) | Face spoofing detection method and system based on meta-pseudo label and illumination invariant feature | |
CN106529586A (en) | Image classification method based on supplemented text characteristic | |
CN105868706A (en) | Method for identifying 3D model based on sparse coding | |
CN111881716A (en) | Pedestrian re-identification method based on multi-view-angle generation countermeasure network | |
CN108416270A (en) | A kind of traffic sign recognition method based on more attribute union features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||