CN107085731A - An image classification method based on RGB-D fusion features and sparse coding - Google Patents

An image classification method based on RGB-D fusion features and sparse coding

Info

Publication number
CN107085731A
Authority
CN
China
Prior art keywords
image
feature
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710328468.7A
Other languages
Chinese (zh)
Other versions
CN107085731B (en)
Inventor
周彦 (Zhou Yan)
向程谕 (Xiang Chengyu)
王冬丽 (Wang Dongli)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiangtan University
Original Assignee
Xiangtan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiangtan University filed Critical Xiangtan University
Priority to CN201710328468.7A priority Critical patent/CN107085731B/en
Publication of CN107085731A publication Critical patent/CN107085731A/en
Application granted granted Critical
Publication of CN107085731B publication Critical patent/CN107085731B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches, based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses an image classification method based on RGB-D fusion features and sparse coding. The implementation steps are: (1) extract the dense SIFT features and PHOG features of the color image and the depth image; (2) fuse the features extracted from the two kinds of images by linear concatenation, finally giving four different fusion features; (3) cluster the different fusion features with the K-means++ clustering method, obtaining four different visual dictionaries; (4) perform locality-constrained linear coding over each visual dictionary, obtaining different image representation sets; (5) classify the different image representation sets with linear SVMs, and determine the final classification from the multiple classification results by a voting decision method. The classification accuracy of the present invention is high.

Description

An image classification method based on RGB-D fusion features and sparse coding
Technical field
The present invention relates to technical fields such as computer vision and pattern recognition, and in particular to an image classification method based on RGB-D fusion features and sparse coding.
Background technology
Today's society is in an era of information explosion: besides massive amounts of text, the multimedia information people encounter (pictures, video, etc.) is also growing explosively. To use, manage and retrieve images accurately and efficiently, computers must understand image content in a way comparable to human understanding. Image classification is an important route to image understanding and a strong driver of the development of multimedia search technology. Because acquired images may be affected by many factors such as viewpoint, illumination, occlusion and background, image classification has always been a challenging problem in computer vision and artificial intelligence, and many image feature descriptions and classification techniques have therefore developed rapidly.
Among current image feature description and classification techniques, the main approach is the Bag-of-Features (BOF) algorithm. S. Lazebnik, in the article "Spatial pyramid matching for recognizing natural scene categories", proposed the Spatial Pyramid Matching (SPM) framework based on BOF; this algorithm recovers the spatial information lost by BOF and effectively improves the accuracy of image classification. However, BOF-based algorithms all encode features with vector quantization (VQ), and this hard-assignment coding does not consider the correlations between visual words in the visual dictionary, so the error after feature coding is large, which in turn degrades the performance of the whole image classification algorithm.
In recent years, with the growing maturity of sparse coding (SC) theory, sparse coding has become one of the hottest techniques in the image classification field. Yang, in the article "Linear spatial pyramid matching using sparse coding for image classification", proposed sparse-coding spatial pyramid matching (ScSPM); the model replaces hard assignment with sparse coding and can optimize the weight coefficients over the visual dictionary, quantizing image features better and improving both the accuracy and the efficiency of image classification. However, because the codebook is over-complete, features that are originally highly similar may be encoded very differently, so the stability of the ScSPM model is poor. Wang et al. improved ScSPM in the article "Locality-constrained linear coding for image classification", proposing locality-constrained linear coding (LLC) and pointing out that locality is more important than sparsity: each feature descriptor is represented by several bases in the visual dictionary, and similar feature descriptors obtain similar codes by sharing their local bases, which greatly improves the instability of ScSPM.
The methods above all target the classification of color images and ignore the depth information of the object or scene, even though depth information is an important clue for image classification: it readily separates foreground from background by distance and directly reflects the three-dimensional information of an object or scene. With the rise of Kinect, depth images have become ever easier to acquire, and algorithms that combine depth information for image classification have begun to flourish. Liefeng Bo et al., in the article "Kernel descriptors for visual recognition", extract image features from the viewpoint of kernel methods and use them for classification, but this algorithm must first build a three-dimensional model of the object, which is very time-consuming, so it is not very practical in real time. N. Silberman, in the article "Indoor scene segmentation using a structured light sensor", first extracts the features of the depth image and the color (RGB) image separately with the scale-invariant feature transform (SIFT) algorithm, then fuses the features and classifies with SPM coding. A. Janoch, in the article "A Category-Level 3D Object Dataset: Putting the Kinect to Work", applies the histogram of oriented gradients (HOG) algorithm to the depth image and the color image separately and realizes the final image classification after feature fusion. Mirdanies M et al., in the article "Object recognition system in remote controlled weapon station using SIFT and SURF methods", fuse the SIFT features extracted from the RGB image with the SURF features of the depth image and use the fused features for target classification. All these algorithms fuse RGB features and depth features at the feature level and can effectively improve the precision of image classification. But this class of algorithms shares a defect: the feature extracted from each of the RGB image and the depth image is a single feature, and a single feature cannot capture all the information in an image, so the resulting fusion feature cannot fully describe the image content. The reason is that RGB images are vulnerable to illumination changes, viewpoint changes, geometric deformation, shadows and occlusion, while depth images are easily affected by the imaging device, producing holes and noise in the image; no single image feature can remain robust to all of these factors, so information in the image is inevitably lost.
Therefore, it is necessary to design an image classification method with higher classification accuracy.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the shortcomings of the prior art, to provide an image classification method that integrates RGB-D fusion features with sparse coding, with high accuracy and good stability.
In order to solve the above technical problem, the technical scheme provided by the present invention is:
An image classification method based on RGB-D fusion features and sparse coding, including a training stage and a test stage:
The training stage comprises the following steps:
Step A1: for each sample datum, extract the dense SIFT (based on the scale-invariant feature transform) and PHOG (Pyramid Histogram of Oriented Gradients) features of its RGB image and its Depth image (color image and depth image); the number of sample data is n;
Step A2: for each sample datum, fuse the features extracted from the two kinds of images pairwise by linear concatenation, obtaining four different fusion features; the fusion features of the same kind obtained from the n sample data form one set, giving four fusion feature sets;
Through the above feature extraction, the dense SIFT and PHOG features of the RGB image and the dense SIFT and PHOG features of the Depth image are obtained. The resulting features are then normalized so that all of them have a similar scale. To reduce the complexity of feature fusion, the present invention fuses the features pairwise by linear concatenation:

F = K_1·α + K_2·β        (1)

where K_1 and K_2 are the weights of the corresponding features, with K_1 + K_2 = 1 and, in the present invention, K_1 = K_2; α denotes a feature extracted from the RGB image and β a feature extracted from the Depth image. Four different fusion features are finally obtained: the RGBD-dense SIFT feature, the RGB-dense SIFT + D-PHOG feature, the RGB-PHOG + D-dense SIFT feature and the RGBD-PHOG feature. These denote, respectively, the fusion of the dense SIFT features of the RGB and Depth images, the fusion of the dense SIFT feature of the RGB image with the PHOG feature of the Depth image, the fusion of the PHOG feature of the RGB image with the dense SIFT feature of the Depth image, and the fusion of the PHOG features of the RGB and Depth images.
Step A3: cluster the fusion features in each of the four fusion feature sets, obtaining four different visual dictionaries;
Step A4: in each visual dictionary, encode the fusion features with the locality-constrained linear coding model, obtaining four different image representation sets;
Step A5: construct a classifier from each of the four fusion feature sets, its image representation set and the class labels of the corresponding sample data, obtaining four different classifiers.
The test stage comprises the following steps:
Step B1: extract and fuse the features of the image to be classified according to the method of steps A1~A2, obtaining its four fusion features;
Step B2: in the four visual dictionaries obtained in step A3, encode the four fusion features obtained in step B1 with the locality-constrained linear coding model, obtaining four different image representations of the image to be classified;
Step B3: classify the four image representations obtained in step B2 with the four classifiers obtained in step A5, obtaining four class labels (the four class labels may contain identical labels, or may all be different);
Step B4: from the four class labels obtained, determine the final class label of the image to be classified by the voting decision method, i.e., select the label that receives the most votes among the four as the final class label.
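The voting decision of step B4, including the random tie-break described later in the text, can be sketched as follows. This is a minimal sketch; the function and variable names are illustrative, not from the patent.

```python
import random
from collections import Counter

def vote(labels, rng=random):
    """Majority vote over the four classifier outputs (step B4).

    When several labels tie for the highest vote count, one of the tied
    labels is chosen at random, as the patent specifies.
    """
    counts = Counter(labels)
    top = max(counts.values())
    tied = [lab for lab, c in counts.items() if c == top]
    return tied[0] if len(tied) == 1 else rng.choice(tied)

print(vote(["kitchen", "kitchen", "desk", "kitchen"]))  # kitchen
```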
Further, in step A3, the fusion features in each fusion feature set are clustered with the K-means++ clustering method.
The K-means algorithm traditionally used to build visual dictionaries has the advantages of simplicity and efficiency. But K-means also has a limitation: the initial cluster centers are chosen at random, so the clustering result is strongly affected by the initial centers, and if that choice traps the algorithm in a local optimum, the consequences for correct image classification are fatal. To address this shortcoming, the present invention builds the visual dictionary with the K-means++ algorithm, replacing the random selection of initial cluster centers with a probability-based selection. For any fusion feature set, the concrete method of clustering and obtaining the corresponding visual dictionary is as follows:
3.1) The fusion features obtained from the n sample data form a set, the fusion feature set HI = {h_1, h_2, h_3, …, h_n}; set the number of clusters to m;
3.2) In HI = {h_1, h_2, h_3, …, h_n}, select one point at random as the first initial cluster center S_1; set the count value t = 1;
3.3) For each point h_i ∈ HI, compute the distance d(h_i) between it and the nearest of the selected centers S_1, …, S_t;
3.4) Select the next initial cluster center S_{t+1}: based on the formula

P(h_i′) = d(h_i′)² / Σ_{h∈HI} d(h)²

compute the probability that a point h_i′ ∈ HI is selected as the next initial cluster center, and select the point of maximum probability as S_{t+1};
3.5) Let t = t + 1 and repeat steps 3.3) and 3.4) until t = m, i.e., until the m initial cluster centers have been selected;
3.6) Run the K-means algorithm from the selected initial cluster centers, finally generating m cluster centers;
3.7) Define each cluster center as one visual word of the visual dictionary; the number of clusters m is the size of the visual dictionary.
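Steps 3.1)-3.5) above can be sketched as follows. Note one assumption made explicit here: the patent selects the point of *maximum* probability d(h)²/Σd(h)² (a deterministic farthest-point rule), rather than sampling from that distribution as standard K-means++ does; this sketch follows the patent's deterministic variant. The resulting centers would then seed an ordinary K-means run (steps 3.6)-3.7)).

```python
import numpy as np

def kmeanspp_init(H, m, seed=0):
    """K-means++ style seeding as described in steps 3.1)-3.5).

    H is an (n, d) array of fusion features, m the desired dictionary
    size.  Picking the argmax of the selection probability is the
    patent's deterministic variant of K-means++ seeding.
    """
    rng = np.random.default_rng(seed)
    H = np.asarray(H, dtype=float)
    centers = [H[rng.integers(len(H))]]        # 3.2) first center at random
    while len(centers) < m:                    # 3.5) until m centers chosen
        # 3.3) squared distance of every point to its nearest chosen center
        diff = H[:, None, :] - np.array(centers)[None]
        d2 = np.min((diff ** 2).sum(-1), axis=1)
        centers.append(H[np.argmax(d2)])       # 3.4) point of max probability
    return np.array(centers)
```

The seeds can be handed to any K-means implementation (e.g. scikit-learn's `KMeans(init=seeds, n_init=1)`) to produce the m visual words.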
Further, in step A4, the fusion features are encoded with the locality-constrained linear coding model, whose expression is:

min_C  Σ_{i=1..n} || h_i − B·c_i ||² + λ·|| d_i ⊙ c_i ||²,   s.t.  1ᵀ·c_i = 1, ∀i        (2)

where h_i is a fusion feature in the fusion feature set HI, i.e., the feature vector to be encoded, h_i ∈ R^d, and d is the dimension of the fusion feature; B = [b_1, b_2, b_3, …, b_m] is the visual dictionary built by the K-means++ algorithm, b_1~b_m are the m visual words of the dictionary, b_j ∈ R^d; C = [c_1, c_2, c_3, …, c_n] is the image representation set obtained by coding, where c_i ∈ R^m is the sparse coding representation of one image after coding; λ is the penalty factor of LLC; ⊙ denotes element-wise multiplication; the 1 in 1ᵀ·c_i denotes the all-ones vector, and the constraint 1ᵀ·c_i = 1 gives the LLC code shift invariance; d_i is defined as:

d_i = exp( dist(h_i, B) / σ )        (3)

where dist(h_i, B) = [dist(h_i, b_1), dist(h_i, b_2), …, dist(h_i, b_m)]ᵀ, dist(h_i, b_j) is the Euclidean distance between h_i and b_j, and σ adjusts the decay speed of the locality-constraint weights.
The present invention uses locality-constrained linear coding (LLC) because a locality constraint on the features necessarily yields sparsity, whereas sparsity does not necessarily yield locality, so locality is more important than sparsity. By replacing the sparsity constraint with a locality constraint, LLC achieves good performance.
Further, in step A4, the fusion features are encoded with an approximate locality-constrained linear coding model. While solving for c_i in formula (2), the feature vector h_i to be encoded tends to select the visual words nearest to it in the dictionary, forming a local coordinate system. Following this rule, a simple approximate LLC coding scheme can accelerate the coding process: instead of solving formula (2) directly, for any feature vector h_i to be encoded, a k-nearest-neighbour search selects the k visual words of dictionary B nearest to h_i as the local visual word matrix B_i, and the code is obtained by solving a much smaller linear system:

min_{C̃}  Σ_{i=1..n} || h_i − B_i·c̃_i ||²,   s.t.  1ᵀ·c̃_i = 1, ∀i        (4)

where C̃ is the image representation set obtained by approximate coding and c̃_i is the sparse coding representation of one image after approximate coding. Since formula (4) has an analytic solution, approximate LLC coding reduces the computational complexity from O(n²) to O(n + k²), where k << n, while its final performance differs little from exact LLC coding. Approximate LLC coding both preserves locality and guarantees the sparsity of the code, so the present invention performs feature coding with the approximate LLC model.
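The approximate LLC coding of formula (4) can be sketched as follows. This follows the standard analytic solution from Wang et al.'s LLC paper (shift to the local coordinate system, solve a small k×k system, normalize to satisfy the sum-to-one constraint); the regularization term is an implementation detail added here for numerical stability, not something the patent specifies.

```python
import numpy as np

def approx_llc(h, B, k=50, reg=1e-4):
    """Approximate LLC code of one descriptor h over dictionary B.

    B is the (m, d) visual dictionary; the k codewords nearest to h are
    selected and the small constrained system of formula (4) is solved
    in closed form.  Returns a length-m code with at most k non-zeros
    that sums to 1 (the shift-invariance constraint 1'c = 1).
    """
    h = np.asarray(h, dtype=float)
    B = np.asarray(B, dtype=float)
    k = min(k, len(B))
    # k-nearest-neighbour search in the dictionary
    idx = np.argsort(((B - h) ** 2).sum(axis=1))[:k]
    z = B[idx] - h                        # shift to local coordinate system
    C = z @ z.T                           # k x k local covariance
    C += reg * np.trace(C) * np.eye(k)    # regularize for stability
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                          # enforce the sum-to-one constraint
    code = np.zeros(len(B))
    code[idx] = w
    return code
```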
Further, k=50 is taken.
Further, in step A1, the dense SIFT feature divides the image with a grid into equal-sized feature blocks, with overlap between adjacent blocks; the center of each feature block serves as a feature point, and the SIFT descriptor of this point (a gradient histogram, as in traditional SIFT) is formed from all the pixels in the block; these feature points with their SIFT descriptors finally constitute the dense SIFT feature of the whole image;
The concrete steps of PHOG feature extraction are as follows:
1.1) Collect the edge information of the image: extract the edge contour of the image with the Canny edge detector, and use this contour to describe the shape of the image;
1.2) Perform a pyramid multi-level segmentation of the image; the number of blocks depends on the number of pyramid levels. The present invention divides the image into 3 levels: level 1 is the whole image; level 2 divides the image into 4 equal-sized sub-regions; level 3 further divides each of the 4 sub-regions of level 2 into 4 sub-regions, finally giving 4 × 4 sub-regions;
1.3) At each level, extract the HOG (histogram of oriented gradients) feature vector of every sub-region;
1.4) Finally concatenate the HOG feature vectors of the sub-regions of every level; after the concatenated HOG data are obtained, normalize them, finally giving the PHOG feature of the whole image.
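Steps 1.2)-1.4) can be sketched as follows. This is a compact illustration, not the patented extractor: the Canny edge-masking of step 1.1) is omitted for brevity, and the gradient orientation histograms are computed directly on the grayscale image over a 3-level pyramid (1, 2×2 and 4×4 sub-regions), concatenated and L2-normalized.

```python
import numpy as np

def phog(gray, levels=3, bins=9):
    """Sketch of a PHOG descriptor over a 3-level spatial pyramid.

    `gray` is a 2-D float array.  Each pyramid cell contributes one
    `bins`-bin histogram of unsigned gradient orientations weighted by
    gradient magnitude; the cell histograms are concatenated and the
    whole vector is L2-normalized (step 1.4).
    """
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)            # unsigned orientation
    bin_idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)

    h, w = gray.shape
    feats = []
    for lvl in range(levels):                           # 1, 2x2, 4x4 grids
        n = 2 ** lvl
        for i in range(n):
            for j in range(n):
                ys = slice(i * h // n, (i + 1) * h // n)
                xs = slice(j * w // n, (j + 1) * w // n)
                hist = np.bincount(bin_idx[ys, xs].ravel(),
                                   weights=mag[ys, xs].ravel(),
                                   minlength=bins)
                feats.append(hist)
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-12)
```

With 3 levels and 9 orientation bins the descriptor has (1 + 4 + 16) × 9 = 189 dimensions.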
Further, in step A5, the classifier is a linear SVM classifier.
Further, the voting decision method in step B4 may encounter the situation where different labels receive the same, highest number of votes; in this case a random selection is made among these equally-voted class labels, and the selected label is taken as the final class label.
The beneficial effects of the invention are as follows:
The present invention selects multiple fusion features, which compensates for the insufficient information carried by any single fusion feature of the image and effectively improves the accuracy of image classification. The visual dictionary is built with the K-means++ algorithm, whose probability-based choice of initial cluster centers replaces random selection and effectively prevents the algorithm from falling into a local optimum. Finally, the voting decision method votes over the individual classification results, merging results that differ considerably, so that the final classification performance is determined by the voting decision, which guarantees the stability of the result.
Brief description of the drawings
Fig. 1 is the flow chart of the image classification method integrating RGB-D fusion features and sparse coding.
Fig. 2 shows the LLC feature coding model in training stage step A5 of the invention.
Fig. 3 shows the test-image classification decision module in test stage step B4 of the invention.
Fig. 4 is the confusion matrix of the present invention on the RGB-D Scenes dataset.
Embodiment
The present invention is described in more detail below with specific examples and with reference to the drawings. The described examples are intended to aid understanding of the invention and do not limit it in any way.
Fig. 1 is the system flow chart of the image classification integrating RGB-D fusion features and sparse coding; the concrete implementation steps are as follows:
Step S1: extract the dense SIFT features and PHOG features of the RGB image and the Depth image;
Step S2: fuse the features extracted from the two kinds of images by concatenation, finally obtaining four different fusion features;
Step S3: cluster the different fusion features with the K-means++ clustering method, obtaining four different visual dictionaries;
Step S4: perform locality-constrained linear coding over each visual dictionary, obtaining different image representation sets;
Step S5: construct classifiers over the different image representation sets with linear SVMs, and finally determine the final classification by a voting decision over the results of these four classifiers.
The image classification method based on integrated RGB-D fusion features and sparse coding is verified below with experimental data.
The experimental dataset used by the present invention is the RGB-D Scenes dataset, a multi-view scene image dataset provided by the University of Washington. The dataset consists of 8 scene categories with 5972 pictures in total; all images were captured with a Kinect camera at a size of 640*480.
In the RGB-D Scenes dataset, all images are used for the experiment and resized to 256*256. For feature extraction, the sampling interval of dense SIFT extraction is set to 8 pixels and the image block to 16 × 16 in this experiment. The PHOG extraction parameters are: block size 16 × 16, sampling interval 8 pixels, and 9 gradient orientations. When building the visual dictionary, the dictionary size is set to 200. SVM classification uses the libsvm3.12 toolbox of the LIBSVM kit; 80% of the pictures of the dataset are used for training and 20% for testing.
In this experiment, the proposed method is examined from two aspects: first, it is compared against the methods of several researchers with currently high classification accuracy; second, the classification effects of different RGB-D fusion features are compared with the proposed method.
Table 1. Comparison of classification results on the RGB-D Scenes dataset

  Classification method      Accuracy
  Linear SVM                 89.6%
  Gaussian kernel SVM        90.0%
  Random forest              90.1%
  HOG                        77.2%
  SIFT + SPM                 84.2%
  The proposed method        91.7%
The comparison of classification accuracy with other methods is shown in Table 1. Liefeng Bo, in the article "Kernel descriptors for visual recognition", integrates three kinds of features and trains and classifies them with a linear SVM (Linear SVM), a Gaussian kernel SVM (Kernel SVM) and a random forest (Random Forest), which obtain accuracies of 89.6%, 90.0% and 90.1% respectively in this experiment. A. Janoch, in the article "A Category-Level 3D Object Dataset: Putting the Kinect to Work", applies the HOG algorithm to the depth image and the color image separately and realizes the final classification with an SVM classifier after feature fusion; this method obtains 77.2% accuracy in this experiment. N. Silberman, in the article "Indoor scene segmentation using a structured light sensor", first extracts the features of the depth image and the color image separately with the SIFT algorithm, then fuses the features, encodes with SPM and finally classifies with an SVM; this algorithm obtains 84.2% classification accuracy in this experiment. The algorithm proposed by the present invention obtains 91.7% accuracy, an improvement of 1.6% over the best previous result, showing that the proposed algorithm has good classification performance.
Table 2. Comparison of classification results of different fusion features on the RGB-D Scenes dataset
From Table 2 it can be seen that, when depth information is combined for image classification, the accuracy of classification based on a single fusion feature is lower than that based on multiple fusion features; the classification algorithm based on multi-feature fusion obtains better accuracy, but remains slightly below the algorithm based on decision fusion over multiple fusion features.
Specific embodiments of the present invention are described above. It should be understood that the invention is not limited to the above particular implementations; any modification, equivalent substitution and improvement made within the spirit and principle of the present invention shall be included within the scope of protection of the invention.

Claims (8)

1. An image classification method based on RGB-D fusion features and sparse coding, characterised by including a training stage and a test stage:
The training stage comprises the following steps:
Step A1: for each sample datum, extract the dense SIFT and PHOG features of its RGB image and its Depth image; the number of sample data is n;
Step A2: for each sample datum, fuse the features extracted from the two kinds of images pairwise by linear concatenation, obtaining four different fusion features; the fusion features of the same kind obtained from the n sample data form one set, giving four fusion feature sets;
Step A3: cluster the fusion features in each of the four fusion feature sets, obtaining four different visual dictionaries;
Step A4: in each visual dictionary, encode the fusion features with the locality-constrained linear coding model, obtaining four different image representation sets;
Step A5: construct a classifier from each of the four fusion feature sets, its image representation set and the class labels of the corresponding sample data, obtaining four different classifiers;
The test stage comprises the following steps:
Step B1: extract and fuse the features of the image to be classified according to the method of steps A1~A2, obtaining its four fusion features;
Step B2: in the four visual dictionaries obtained in step A3, encode the four fusion features obtained in step B1 with the locality-constrained linear coding model, obtaining four different image representations of the image to be classified;
Step B3: classify the four image representations obtained in step B2 with the four classifiers obtained in step A5, obtaining four class labels;
Step B4: from the four class labels obtained, determine the final class label of the image to be classified by the voting decision method, selecting the label that receives the most votes among the four as the final class label.
2. The image classification method based on RGB-D fusion features and sparse coding according to claim 1, characterised in that in step A3 the fusion features of a given fusion feature set are clustered with the K-means++ clustering method, and the corresponding visual dictionary is built as follows:
3.1) denote this fusion feature set as HI = {h_1, h_2, h_3, …, h_n} and set the number of clusters to m;
3.2) in HI, select one fusion feature at random as the first initial cluster center S_1; set the count value t = 1;
3.3) for each fusion feature h_i ∈ HI, compute the distance d(h_i) between it and the nearest of the selected centers S_1, …, S_t;
3.4) select the next initial cluster center S_{t+1}: based on the formula

P(h_i′) = d(h_i′)² / Σ_{h∈HI} d(h)²

compute the probability that a point h_i′ ∈ HI is selected as the next initial cluster center, and select the fusion feature of maximum probability as S_{t+1};
3.5) let t = t + 1 and repeat steps 3.3) and 3.4) until t = m, i.e., until the m initial cluster centers have been selected;
3.6) run the K-means algorithm from the selected initial cluster centers, finally generating m cluster centers;
3.7) define each cluster center as one visual word of the visual dictionary; the number of clusters m is the size of the visual dictionary.
3. The image classification method based on RGB-D fusion features and sparse coding according to claim 2, characterized in that in step A4, feature coding is performed on the fusion features using a locality-constrained linear coding (LLC) model, whose expression is as follows:

min_C Σ_{i=1}^{n} ||h_i − B c_i||² + λ ||d_i ⊙ c_i||²
s.t. 1^T c_i = 1, ∀i

In the formula: h_i is a fusion feature in the fusion feature set H_I, i.e., the feature vector to be encoded, h_i ∈ R^d, where d denotes the dimension of the fusion feature; B = [b_1, b_2, b_3, …, b_m] is the visual dictionary built by the K-means++ algorithm, b_1~b_m being the m visual words of the dictionary, b_j ∈ R^d; C = [c_1, c_2, c_3, …, c_n] is the image representation set obtained by coding, where c_i ∈ R^m is the coding coefficient of the fusion feature h_i once coding is complete; λ is the LLC penalty factor; ⊙ denotes element-wise multiplication; the 1 in 1^T c_i denotes a vector whose elements are all 1, so the constraint 1^T c_i = 1 makes the locality-constrained linear coding model shift-invariant; d_i is defined as:

d_i = exp( dist(h_i, B) / σ )

where dist(h_i, B) = [dist(h_i, b_1), dist(h_i, b_2), …, dist(h_i, b_m)]^T, dist(h_i, b_j) denotes the Euclidean distance between h_i and b_j, and σ controls the decay speed of the locality-constraint weights.
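The LLC objective above is an equality-constrained quadratic program and can be solved exactly per feature via its KKT system. A minimal NumPy sketch (function and parameter names are illustrative; this is a sketch of the claimed model, not the patentee's implementation):

```python
import numpy as np

def llc_encode(h, B, lam=1e-4, sigma=1.0):
    """Encode feature h (d,) against dictionary B (d, m):
    minimize ||h - Bc||^2 + lam*||d_i ⊙ c||^2  s.t.  1^T c = 1,
    solved exactly through the KKT linear system."""
    _, m = B.shape
    # locality adaptor: Euclidean distance of h to each visual word
    dist = np.linalg.norm(B - h[:, None], axis=0)
    d_i = np.exp(dist / sigma)
    # KKT system: [2(B^T B + lam*diag(d_i^2))  1; 1^T  0] [c; mu] = [2 B^T h; 1]
    A = 2 * (B.T @ B + lam * np.diag(d_i ** 2))
    K = np.zeros((m + 1, m + 1))
    K[:m, :m] = A
    K[:m, m] = 1.0
    K[m, :m] = 1.0
    rhs = np.concatenate([2 * B.T @ h, [1.0]])
    sol = np.linalg.solve(K, rhs)
    return sol[:m]          # code coefficients c_i, summing to 1
```

The returned code satisfies the shift-invariance constraint 1^T c_i = 1 by construction.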
4. The image classification method based on RGB-D fusion features and sparse coding according to claim 1, characterized in that in step A4, feature coding is performed on the fusion features using an approximate locality-constrained linear coding model, whose expression is as follows:

min_{C̃} Σ_{i=1}^{n} ||h_i − B_i c̃_i||²
s.t. 1^T c̃_i = 1, ∀i

where B_i is the local visual word matrix formed by the k visual words of the visual dictionary B nearest to the feature vector h_i to be encoded, found by k-nearest-neighbor search; C̃ is the image representation set obtained by approximate coding, where c̃_i is the coding coefficient of the fusion feature h_i once approximate coding is complete.
5. The image classification method based on RGB-D fusion features and sparse coding according to claim 4, characterized in that k = 50 is taken.
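Claims 4-5 restrict coding to the k nearest visual words and drop the locality penalty, which yields a small closed-form solve per feature. A minimal NumPy sketch (the small ridge term added for numerical stability is an assumption of this sketch, not part of the claim; claim 5 takes k=50 in practice):

```python
import numpy as np

def llc_encode_approx(h, B, k=5):
    """Approximate LLC: solve min ||h - B_i c~||^2 s.t. 1^T c~ = 1
    over the k visual words of B (d, m) nearest to h (d,)."""
    m = B.shape[1]
    # k-nearest-neighbor search over the dictionary columns
    idx = np.argsort(np.linalg.norm(B - h[:, None], axis=0))[:k]
    Bi = B[:, idx]                        # local visual word matrix (d, k)
    z = Bi - h[:, None]                   # shift words by the query feature
    C = z.T @ z                           # local covariance (k, k)
    C += np.eye(k) * 1e-4 * np.trace(C)   # ridge for stability (assumption)
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                          # enforce 1^T c~ = 1
    c = np.zeros(m)                       # embed back into a length-m code
    c[idx] = w
    return c
```

Only k of the m coefficients are nonzero, which is what makes the approximate model fast for large dictionaries.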
6. The image classification method based on RGB-D fusion features and sparse coding according to any one of claims 1 to 5, characterized in that in step A1, the PHOG feature extraction comprises the following steps:
1.1) collect the edge information of the image: extract the edge contour of the image with the Canny edge detection operator, and use this contour to describe the shape of the image;
1.2) perform pyramid multi-level partitioning of the image: divide the image into 3 levels; level 1 is the whole image; level 2 divides the image into 4 sub-regions of equal size; level 3 subdivides each of the 4 level-2 sub-regions into 4 further sub-regions, finally giving 4×4 sub-regions;
1.3) extract the HOG feature vector of every sub-region at every level;
1.4) finally concatenate the HOG feature vectors of the sub-regions of every level of the image; after the concatenated HOG data are obtained, normalize the data to finally obtain the PHOG feature of the whole image.
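Steps 1.1)-1.4) can be sketched roughly as follows. This simplified NumPy version histograms gradient orientations of all pixels rather than only Canny edge pixels, so it is an illustrative approximation of PHOG, not the claimed implementation:

```python
import numpy as np

def phog(image, bins=9):
    """Simplified PHOG: one orientation histogram per region over a
    3-level pyramid (1 + 4 + 16 = 21 regions), concatenated (step 1.4)
    and L2-normalized."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation

    def region_hist(r0, r1, c0, c1):
        h, _ = np.histogram(ang[r0:r1, c0:c1], bins=bins,
                            range=(0, np.pi), weights=mag[r0:r1, c0:c1])
        return h

    rows, cols = image.shape
    feats = []
    for level in range(3):                           # levels 1..3 (step 1.2)
        n = 2 ** level                               # 1, 2, 4 splits per axis
        rs = np.linspace(0, rows, n + 1, dtype=int)
        cs = np.linspace(0, cols, n + 1, dtype=int)
        for i in range(n):
            for j in range(n):                       # per-region HOG (step 1.3)
                feats.append(region_hist(rs[i], rs[i + 1], cs[j], cs[j + 1]))
    v = np.concatenate(feats)                        # cascade (step 1.4)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```

With 9 orientation bins the descriptor has 21 × 9 = 189 dimensions.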
7. The image classification method based on RGB-D fusion features and sparse coding according to any one of claims 1 to 5, characterized in that in step A5, the classifier is a linear SVM classifier.
8. The image classification method based on RGB-D fusion features and sparse coding according to any one of claims 1 to 5, characterized in that in step B4, if multiple class labels tie for the largest number of votes, one of these class labels is selected at random as the final class label.
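The tie-breaking rule of claim 8 amounts to a majority vote with a uniform random choice among tied labels; a minimal sketch:

```python
import random
from collections import Counter

def majority_vote(labels, seed=None):
    """Return the label with the most votes; if several labels tie for
    the most votes, pick one of them uniformly at random (step B4)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    top = max(counts.values())
    winners = [lab for lab, c in counts.items() if c == top]
    return rng.choice(winners)
```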
CN201710328468.7A 2017-05-11 2017-05-11 Image classification method based on RGB-D fusion features and sparse coding Active CN107085731B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710328468.7A CN107085731B (en) 2017-05-11 2017-05-11 Image classification method based on RGB-D fusion features and sparse coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710328468.7A CN107085731B (en) 2017-05-11 2017-05-11 Image classification method based on RGB-D fusion features and sparse coding

Publications (2)

Publication Number Publication Date
CN107085731A true CN107085731A (en) 2017-08-22
CN107085731B CN107085731B (en) 2020-03-10

Family

ID=59611626

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710328468.7A Active CN107085731B (en) 2017-05-11 2017-05-11 Image classification method based on RGB-D fusion features and sparse coding

Country Status (1)

Country Link
CN (1) CN107085731B (en)


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105005786A (en) * 2015-06-19 2015-10-28 南京航空航天大学 Texture image classification method based on BoF and multi-feature fusion


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XIANG Chengyu: "Image Classification Based on RGB-D Fusion Features", Computer Engineering and Applications (《计算机工程与应用》) *
SHEN Xiaoxia et al.: "Action Recognition Algorithm Based on Kinect and Pyramid Features", Journal of Optoelectronics · Laser (《光电子·激光》) *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090926A (en) * 2017-12-31 2018-05-29 厦门大学 A depth estimation method based on dual dictionary learning
CN108596256A (en) * 2018-04-26 2018-09-28 北京航空航天大学青岛研究院 An RGB-D-based object recognition classifier construction method
CN108805183A (en) * 2018-05-28 2018-11-13 南京邮电大学 An image classification method fusing locally aggregated descriptors and locality-constrained linear coding
CN108875080A (en) * 2018-07-12 2018-11-23 百度在线网络技术(北京)有限公司 An image retrieval method, apparatus, server and storage medium
CN109741484A (en) * 2018-12-24 2019-05-10 南京理工大学 A driving recorder with image detection and voice alarm functions and its working method
CN110443298A (en) * 2019-07-31 2019-11-12 华中科技大学 A DDNN based on cloud-edge collaborative computing and its construction method and application
CN110443298B (en) * 2019-07-31 2022-02-15 华中科技大学 Cloud-edge collaborative computing-based DDNN and construction method and application thereof
CN111160387A (en) * 2019-11-28 2020-05-15 广东工业大学 Graph model based on multi-view dictionary learning
CN111160387B (en) * 2019-11-28 2022-06-03 广东工业大学 Graph model based on multi-view dictionary learning

Also Published As

Publication number Publication date
CN107085731B (en) 2020-03-10

Similar Documents

Publication Publication Date Title
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
Xia et al. AID: A benchmark data set for performance evaluation of aerial scene classification
CN107085731A (en) A kind of image classification method based on RGB D fusion features and sparse coding
Wang et al. Aggregating rich hierarchical features for scene classification in remote sensing imagery
Cheng et al. Effective and efficient midlevel visual elements-oriented land-use classification using VHR remote sensing images
Liu et al. Scene classification using hierarchical Wasserstein CNN
CN106126581A A sketch-based image retrieval method based on deep learning
CN102054178B A Chinese painting image recognition method based on local semantic concepts
CN111222434A A forensics method for synthesized face images based on local binary patterns and deep learning
CN109063649B (en) Pedestrian re-identification method based on twin pedestrian alignment residual error network
CN106682233A A hash image retrieval method based on deep learning and local feature fusion
CN110109060A A radar emitter signal sorting method and system based on deep learning networks
CN106096557A A semi-supervised facial expression recognition method based on fuzzy training samples
CN108197538A A checkpoint vehicle retrieval system and method based on local features and deep learning
CN108564094A A material recognition method based on convolutional neural networks and classifier combination
CN104504362A A face detection method based on convolutional neural networks
CN108009637B (en) Station caption segmentation method of pixel-level station caption identification network based on cross-layer feature extraction
CN104966081B A spine image recognition method
CN105654122B A spatial pyramid object recognition method based on kernel function matching
CN108921850B An image local feature extraction method based on image segmentation
CN114067444A (en) Face spoofing detection method and system based on meta-pseudo label and illumination invariant feature
CN106529586A An image classification method based on supplementary text features
CN105868706A A 3D model recognition method based on sparse coding
CN111881716A (en) Pedestrian re-identification method based on multi-view-angle generation countermeasure network
CN108416270A A traffic sign recognition method based on multi-attribute joint features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant