CN107085731A - An image classification method based on RGB-D fusion features and sparse coding - Google Patents
- Publication number
- CN107085731A (application number CN201710328468.7A)
- Authority
- CN
- China
- Prior art keywords
- image
- feature
- fusion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The present invention discloses an image classification method based on RGB-D fusion features and sparse coding. The implementation steps are: (1) extract the dense SIFT features and PHOG features of the colour image and the depth image; (2) fuse the features extracted from the two kinds of images pairwise in the form of linear concatenation, finally obtaining four different fusion features; (3) cluster each fusion feature set with the K-means++ clustering method to obtain four different visual dictionaries; (4) perform locality-constrained linear coding on each visual dictionary to obtain different image representation sets; (5) classify the different image representation sets with a linear SVM, and apply a voting decision method to the multiple classification results obtained to determine the final classification. The classification accuracy of the present invention is high.
Description
Technical field
The present invention relates to the technical fields of computer vision and pattern recognition, and in particular to an image classification method based on RGB-D fusion features and sparse coding.
Background art
Today's society is in an era of information explosion: besides massive amounts of text, the multimedia information that humans encounter (pictures, video, etc.) is also growing explosively. To use, manage and retrieve images accurately and efficiently, computers must understand image content in a way comparable to human understanding. Image classification is an important route to solving the image understanding problem and strongly drives the development of multimedia retrieval technology. However, acquired images are affected by many factors such as viewpoint, illumination, occlusion and background, which has made image classification a persistently challenging problem in computer vision and artificial intelligence; consequently, many image feature descriptors and classification techniques have developed rapidly.
Among current image feature description and classification techniques, the mainstream algorithms are based on the Bag-of-Features (BOF) model. S. Lazebnik, in the article "Spatial pyramid matching for recognizing natural scene categories", proposed the Spatial Pyramid Matching (SPM) framework based on BOF; this algorithm recovers the spatial information lost by BOF and effectively improves the accuracy of image classification. However, BOF-based algorithms all encode features by vector quantization (Vector Quantization, VQ), and this hard-coding scheme ignores the correlation between visual words in the visual dictionary, causing a large error after image feature coding and thus degrading the performance of the whole image classification algorithm.
In recent years, as sparse coding (Sparse Coding, SC) theory has matured, it has become one of the hottest techniques in the image classification field. Yang, in the article "Linear spatial pyramid matching using sparse coding for image classification", proposed Sparse coding Spatial Pyramid Matching (ScSPM). The model replaces hard assignment with sparse coding and can optimize the weight coefficients of the visual dictionary so as to quantize image features better, greatly improving both the accuracy and the efficiency of image classification. But because the codebook is over-complete, features that were originally highly similar may end up represented completely differently, so the stability of the ScSPM model is poor. Wang et al. improved ScSPM by proposing Locality-constrained Linear Coding (LLC) in the article "Locality-constrained linear coding for image classification", pointing out that locality is more important than sparsity: a feature descriptor is represented by multiple bases in the visual dictionary, and similar feature descriptors obtain similar codes by sharing their local bases, which greatly improves the instability of ScSPM.
The above methods all target the classification of colour images and ignore the depth information of the object or scene, yet depth information is an important clue for image classification: it readily separates foreground from background by distance and directly reflects the three-dimensional information of an object or scene. With the rise of Kinect, depth images have become ever easier to acquire, and algorithms that incorporate depth information into image classification have begun to flourish. Liefeng Bo et al., in the article "Kernel descriptors for visual recognition", extract image features from the perspective of kernel methods for image classification, but the defect of this algorithm is that the object must first be modelled in three dimensions, which is very time-consuming, so the algorithm is not very real-time. N. Silberman, in the article "Indoor scene segmentation using a structured light sensor", first extracts features from the depth image and the colour (RGB) image separately with the Scale Invariant Feature Transform (SIFT) algorithm, then fuses the features, and afterwards performs image classification with SPM coding. A. Janoch, in the article "A Category-Level 3D Object Dataset: Putting the Kinect to Work", extracts Histogram of Oriented Gradients (HOG) features from the depth image and the colour image separately, and realizes the final image classification after feature fusion. Mirdanies M et al., in the article "Object recognition system in remote controlled weapon station using SIFT and SURF methods", fuse the SIFT features extracted from the RGB image with the SURF features of the depth image and use the fused features for target classification. These algorithms all fuse RGB features and depth features at the feature level and can effectively improve the precision of image classification. But this class of algorithms likewise has a defect: the features extracted from the RGB image and the depth image are each a single feature, and a single feature cannot capture enough of the information in an image, so the resulting fusion feature cannot express the image content sufficiently. The reason is that RGB images are vulnerable to illumination change, viewpoint change, geometric deformation, shadow and occlusion, while depth images are easily affected by the imaging device, which causes holes and noise in the image; no single image feature can stay robust to all of these factors, so information in the image is inevitably lost.
Therefore, it is necessary to design a more accurate image classification method.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the shortcomings of the prior art, to provide an image classification method that integrates RGB-D fusion features with sparse coding and has high accuracy and good stability.
To solve the above technical problem, the technical scheme provided by the present invention is:
An image classification method based on RGB-D fusion features and sparse coding, comprising a training stage and a test stage.
The training stage comprises the following steps:
Step A1: for each sample data, extract the dense SIFT (Scale-Invariant Feature Transform) and PHOG (Pyramid Histogram of Oriented Gradients) features of its RGB image and Depth image (colour image and depth image); the number of sample data is n.
Step A2: for each sample data, fuse the features extracted from its two kinds of images pairwise in the form of linear concatenation, obtaining four different fusion features; the fusion features of the same kind obtained from the n sample data form one set, giving four fusion feature sets.
Through the above feature extraction, the dense SIFT and PHOG features of the RGB image and the dense SIFT and PHOG features of the Depth image are obtained; the resulting features are then normalized so that all features have a similar scale.
To reduce the complexity of feature fusion, the present invention fuses features pairwise in the form of linear concatenation:
F = K1·α + K2·β (1)
where K1 and K2 are the weights of the corresponding features, K1 + K2 = 1, and in the present invention K1 = K2; α denotes a feature extracted from the RGB image and β a feature extracted from the Depth image. Four different fusion features are finally obtained: the RGBD-dense SIFT feature, the RGB-dense SIFT + D-PHOG feature, the RGB-PHOG + D-dense SIFT feature and the RGBD-PHOG feature. These represent, respectively, the fusion of the dense SIFT features of the RGB image and the Depth image, the fusion of the dense SIFT feature of the RGB image with the PHOG feature of the Depth image, the fusion of the PHOG feature of the RGB image with the dense SIFT feature of the Depth image, and the fusion of the PHOG features of the RGB image and the Depth image.
Step A3: cluster the fusion features in each of the four fusion feature sets separately, obtaining four different visual dictionaries.
Step A4: in each visual dictionary, perform feature coding on the fusion features with the locality-constrained linear coding model, obtaining four different image representation sets.
Step A5: construct a classifier from each of the four different fusion feature sets, image representation sets and the class labels of the corresponding sample data, obtaining four different classifiers.
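The pairwise fusion of step A2 can be sketched as follows. This is a minimal illustration under the assumption that "linear concatenation" in formula (1) means scaling each descriptor by its weight and concatenating the results; the feature dimensions and variable names are placeholders, not values prescribed by the method.

```python
import numpy as np

def fuse(alpha, beta, k1=0.5, k2=0.5):
    """Fuse an RGB-image feature alpha with a Depth-image feature beta
    by weighted linear concatenation (K1 + K2 = 1, K1 = K2 here)."""
    return np.concatenate([k1 * alpha, k2 * beta])

# Per-sample descriptors (dimensions are illustrative only).
rgb_sift, d_sift = np.random.rand(128), np.random.rand(128)
rgb_phog, d_phog = np.random.rand(189), np.random.rand(189)

# The four fusion features described in step A2.
fusions = {
    "RGBD-denseSIFT":       fuse(rgb_sift, d_sift),
    "RGB-denseSIFT+D-PHOG": fuse(rgb_sift, d_phog),
    "RGB-PHOG+D-denseSIFT": fuse(rgb_phog, d_sift),
    "RGBD-PHOG":            fuse(rgb_phog, d_phog),
}
```

Each fused vector would then be normalized, as described above, before clustering and coding.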
The test stage comprises the following steps:
Step B1: extract and fuse the features of the image to be classified according to the method in steps A1~A2, obtaining the four fusion features of the image to be classified.
Step B2: in the four visual dictionaries obtained in step A3, perform feature coding on the four fusion features obtained in step B1 with the locality-constrained linear coding model, obtaining four different image representations of the image to be classified.
Step B3: classify the four image representations obtained in step B2 with the four classifiers obtained in step A5, obtaining four class labels (the four class labels may contain identical labels, or may all be different).
Step B4: based on the four class labels obtained, determine the final class label of the image to be classified by the voting decision method, i.e. choose the class label with the most votes among the four class labels as the final class label.
Further, in step A3, the fusion features in a given fusion feature set are clustered with the K-means++ clustering method.
The traditional K-means algorithm for building a visual dictionary has the advantages of simplicity and efficiency, but it also has a limitation: the initial cluster centers are chosen at random, so the clustering result depends heavily on the initial centers, and a poor choice may trap the algorithm in a local optimum, which is fatal to correct image classification. To address this shortcoming, the present invention builds the visual dictionary with the K-means++ algorithm, replacing the random selection of initial cluster centers with a probability-based selection. For any fusion feature set, the concrete method of clustering it to obtain the corresponding visual dictionary is as follows:
3.1) The fusion features obtained from the n sample data form a set, i.e. the fusion feature set HI = {h1, h2, h3, …, hn}; set the number of clusters to m.
3.2) Randomly select one point in HI = {h1, h2, h3, …, hn} as the first initial cluster center S1; set the counter t = 1.
3.3) For each point hi in HI, hi ∈ HI, calculate the distance d(hi) between hi and St.
3.4) Select the next initial cluster center St+1: based on the formula
p(hi′) = d(hi′)² / Σ_{h∈HI} d(h)²
calculate the probability that the point hi′ is selected as the next initial cluster center, where hi′ ∈ HI; select the point with the maximum probability as the next initial cluster center St+1.
3.5) Let t = t + 1 and repeat steps 3.3) and 3.4) until t = m, i.e. until all m initial cluster centers have been selected.
3.6) Run the K-means algorithm with the selected initial cluster centers, finally generating m cluster centers.
3.7) Define each cluster center as one visual word in the visual dictionary; the number of clusters m is the size of the visual dictionary.
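The probability-based initialization of steps 3.1)–3.5) can be sketched as follows. Two assumptions are worth flagging: standard K-means++ measures each point's distance to the *nearest* already-chosen center (the description above speaks only of the most recently chosen St), and the description picks the most probable point deterministically rather than sampling, which the sketch follows.

```python
import numpy as np

def kmeanspp_init(H, m):
    """Select m initial cluster centers from the feature set H (n x d)
    by the probability-based rule of steps 3.1)-3.5)."""
    centers = [H[np.random.randint(len(H))]]  # 3.2) first center at random
    while len(centers) < m:
        # 3.3) distance from every point to its nearest chosen center
        d = np.min([np.linalg.norm(H - c, axis=1) for c in centers], axis=0)
        p = d ** 2 / np.sum(d ** 2)           # 3.4) selection probability
        centers.append(H[np.argmax(p)])       # pick the most probable point
    return np.stack(centers)                  # 3.6) feeds these into K-means

H = np.random.rand(200, 16)                   # 200 toy fusion features
S = kmeanspp_init(H, m=10)
```

The returned centers would then seed an ordinary K-means run, whose m final centers are the visual words.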
Further, in step A4, feature coding is performed on the fusion features with the locality-constrained linear coding model, whose expression is:
min_C Σ_{i=1}^{n} ||hi − B·ci||² + λ||di ⊙ ci||²,  s.t. 1ᵀci = 1, ∀i (2)
where hi is a fusion feature in the fusion feature set HI, i.e. the feature vector to be encoded, hi ∈ R^d, and d is the dimension of the fusion feature; B = [b1, b2, b3 … bm] is the visual dictionary built by the K-means++ algorithm, b1~bm are the m visual words in the visual dictionary, bj ∈ R^d; C = [c1, c2, c3 … cn] is the image representation set obtained by coding, where ci ∈ R^m is the sparse coding representation of one image after coding; λ is the LLC penalty factor; ⊙ denotes element-wise multiplication; the 1 in 1ᵀci denotes a vector whose elements are all 1, and the constraint 1ᵀci = 1 gives the LLC code translation invariance; di is defined as:
di = exp(dist(hi, B)/σ) (3)
where dist(hi, B) = [dist(hi, b1), dist(hi, b2), … dist(hi, bm)]ᵀ, dist(hi, bj) denotes the Euclidean distance between hi and bj, and σ is used to adjust the decay speed of the locality constraint weights.
The present invention adopts locality-constrained linear coding (Locality-constrained Linear Coding, LLC) because a locality constraint on the feature necessarily satisfies sparsity, whereas satisfying sparsity does not necessarily satisfy the locality constraint, so locality is more important than sparsity. LLC replaces the sparsity constraint with a locality constraint and achieves good performance.
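Model (2) has a closed-form per-feature solution; the sketch below follows the analytic solution given in Wang et al.'s LLC paper (solve a small linear system against the shifted dictionary, then normalize so the code sums to one). The values of λ and σ here are illustrative, not the patent's settings.

```python
import numpy as np

def llc_encode(h, B, lam=1e-4, sigma=1.0):
    """Encode one feature h (d,) over dictionary B (m, d) with the
    locality-constrained linear coding model; returns c with 1^T c = 1."""
    dist = np.linalg.norm(B - h, axis=1)     # Euclidean distance to each word
    d = np.exp(dist / sigma)                 # locality adaptor d_i, formula (3)
    Z = B - h                                # dictionary shifted by the feature
    C = Z @ Z.T                              # m x m local covariance
    C = C + lam * np.diag(d)                 # locality regularization
    c = np.linalg.solve(C, np.ones(len(B)))  # solve C c = 1
    return c / c.sum()                       # enforce 1^T c = 1

B = np.random.rand(20, 8)                    # 20 visual words, 8-dim features
h = np.random.rand(8)
c = llc_encode(h, B)
```

Words far from h get a large adaptor di and are pushed toward zero, which is what makes the resulting code local and therefore sparse.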
Further, in step A4, feature coding is performed on the fusion features with the approximate locality-constrained linear coding model. While solving for ci in formula (2), the feature vector hi to be encoded tends to select the visual words closest to it in the visual dictionary, forming a local coordinate system. Exploiting this rule, a simple approximate LLC feature coding scheme can be used to accelerate the coding process instead of solving formula (2): for any feature vector hi to be encoded, a k-nearest-neighbour search selects the k visual words in the visual dictionary B nearest to hi as a local visual word matrix Bi, and the code is obtained by solving a much smaller linear system:
min_C̃ Σ_{i=1}^{n} ||hi − Bi·c̃i||²,  s.t. 1ᵀc̃i = 1, ∀i (4)
where C̃ is the image representation set obtained by approximate coding and c̃i is the sparse coding representation of one image after approximate coding. Solving formula (4) analytically reduces the computational complexity of LLC feature coding from O(n²) to O(n + k²), where k << n, while the final performance differs little from that of LLC feature coding. Approximate LLC feature coding both retains locality and guarantees the sparsity requirement of the code, so the present invention uses the approximate LLC model for feature coding.
Further, k=50 is taken.
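The approximate coding of formula (4) can be sketched as follows: restrict the code to the k nearest visual words, solve the small constrained least-squares system, and leave the remaining entries zero. The toy dictionary here uses k = 5 rather than the k = 50 of the method, and the small ridge term `reg` is an assumption for numerical stability.

```python
import numpy as np

def llc_approx_encode(h, B, k=5, reg=1e-6):
    """Approximate LLC: code h (d,) over only the k visual words of
    B (m, d) nearest to h, solving min ||h - Bi^T c~||^2 s.t. 1^T c~ = 1."""
    idx = np.argsort(np.linalg.norm(B - h, axis=1))[:k]  # k-NN search in B
    Bi = B[idx]                                          # local word matrix
    Z = Bi - h
    C = Z @ Z.T + reg * np.eye(k)                        # small k x k system
    w = np.linalg.solve(C, np.ones(k))
    w = w / w.sum()                                      # sum-to-one constraint
    c = np.zeros(len(B))                                 # full sparse code
    c[idx] = w
    return c

B = np.random.rand(50, 8)
h = np.random.rand(8)
c = llc_approx_encode(h, B, k=5)
```

Only a k × k system is solved per feature, which is the source of the O(n + k²) complexity claimed above.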
Further, in step A1, the dense SIFT feature divides the image with a grid into feature blocks of equal size, with overlap between adjacent blocks; the center of each feature block serves as a feature point, and the SIFT feature descriptor of that feature point (the same descriptor as the traditional SIFT feature: a gradient histogram) is formed from all the pixels in the feature block; finally, these feature points with their SIFT descriptors make up the dense SIFT feature of the whole image.
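The dense sampling grid can be sketched as follows. With the block size of 16 and sampling interval of 8 pixels used later in the experiments, a 256×256 image yields 31 × 31 = 961 feature points (the function name and layout are illustrative).

```python
import numpy as np

def dense_grid_centers(h, w, block=16, stride=8):
    """Centers of overlapping block x block patches sampled every `stride`
    pixels; each center is one dense-SIFT feature point."""
    ys = np.arange(0, h - block + 1, stride) + block // 2
    xs = np.arange(0, w - block + 1, stride) + block // 2
    return [(int(y), int(x)) for y in ys for x in xs]

pts = dense_grid_centers(256, 256)
```

A SIFT gradient histogram would then be computed over the 16×16 patch around each center.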
The concrete steps of PHOG feature extraction are as follows:
1.1) Collect the edge information of the image: extract the edge contour of the image with the Canny edge detection operator and use this contour to describe the shape of the image.
1.2) Perform a pyramidal multi-level partition of the image; the number of blocks depends on the number of pyramid levels. The present invention divides the image into 3 levels: level 1 is the whole image; level 2 divides the image into 4 sub-regions of equal size; level 3 subdivides each of the 4 sub-regions of level 2 into 4 further sub-regions, finally yielding 4 × 4 sub-regions.
1.3) Extract the HOG (Histogram of Oriented Gradients) feature vector of each sub-region in each level.
1.4) Finally, cascade (concatenate) the HOG feature vectors of the sub-regions in each level of the image; after the cascaded HOG data are obtained, normalize the data, finally giving the PHOG feature of the whole image.
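The pyramid layout of steps 1.2)–1.4) can be sketched as follows. As a simplification, the per-region descriptor here is a plain 9-bin gradient orientation histogram standing in for HOG (no Canny edge masking, no cell/block normalization), so the final length is (1 + 4 + 16) × 9 = 189 under that assumption.

```python
import numpy as np

def orientation_hist(region, bins=9):
    """9-bin gradient orientation histogram of one region (stand-in for HOG)."""
    gy, gx = np.gradient(region.astype(float))
    ang = np.arctan2(gy, gx) % np.pi            # unsigned orientations in [0, pi)
    mag = np.hypot(gx, gy)                      # gradient magnitude as weight
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist

def phog(img, levels=3, bins=9):
    """Concatenate per-region histograms over a 3-level pyramid
    (1, 2x2, 4x4 regions), then L1-normalize (steps 1.2-1.4)."""
    feats = []
    h, w = img.shape
    for lv in range(levels):                    # level lv has 2^lv x 2^lv cells
        n = 2 ** lv
        for i in range(n):
            for j in range(n):
                cell = img[i*h//n:(i+1)*h//n, j*w//n:(j+1)*w//n]
                feats.append(orientation_hist(cell, bins))
    f = np.concatenate(feats)
    s = f.sum()
    return f / s if s > 0 else f

f = phog(np.random.rand(64, 64))
```

The concatenation order (coarse level first, then finer cells) mirrors the level-by-level cascade described above.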
Further, in step A5, the classifier is a linear SVM classifier.
Further, the voting decision method in step B4 may face the problem that different class labels receive the same, maximal number of votes; in this case a random selection is used, randomly choosing one of these equally voted class labels as the final class label.
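The voting decision of step B4, with the random tie-break just described, can be sketched as:

```python
import random
from collections import Counter

def vote(labels):
    """Majority vote over the four classifier outputs; ties between
    equally voted labels are broken by random choice (step B4)."""
    counts = Counter(labels)
    top = max(counts.values())
    winners = [lab for lab, c in counts.items() if c == top]
    return random.choice(winners)

# Example: three of the four classifiers agree.
final = vote(["kitchen", "desk", "kitchen", "kitchen"])
```

With four voters, a 2–2 split between two labels is the only genuine tie case, and it resolves to either label with equal probability.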
The beneficial effects of the invention are as follows:
The present invention selects multiple fusion features, which makes up for the insufficient information content of a single fusion feature and effectively improves the accuracy of image classification. The visual dictionary is built with the K-means++ algorithm, which replaces random selection of initial cluster centers with probability-based selection and effectively prevents the algorithm from falling into a local optimum. Finally, the voting decision method votes over the individual classification results, merging widely differing results; the final classification is determined by the voting decision, which guarantees the stability of the result.
Brief description of the drawings
Fig. 1 is the flow chart of the image classification method integrating RGB-D fusion features and sparse coding.
Fig. 2 shows the LLC feature coding model in step A5 of the training stage of the invention.
Fig. 3 shows the test image classification decision module in step B4 of the test stage of the invention.
Fig. 4 shows the recognition confusion matrix of the invention on the RGB-D Scenes dataset.
Embodiment
The present invention is described in more detail below with reference to a concrete example and the drawings. The described example is intended to aid understanding of the present invention and does not restrict it in any way.
Fig. 1 is the system flow chart of the image classification integrating RGB-D fusion features and sparse coding; the concrete implementation steps are as follows:
Step S1: extract the dense SIFT features and PHOG features of the RGB image and the Depth image.
Step S2: fuse the features extracted from the two kinds of images in the form of concatenation, finally obtaining four different fusion features.
Step S3: cluster the different fusion features with the K-means++ clustering method, obtaining four different visual dictionaries.
Step S4: perform locality-constrained linear coding on each visual dictionary, obtaining different image representation sets.
Step S5: construct classifiers for the different image representation sets with a linear SVM, and finally determine the final classification by a voting decision over the classification results of these four classifiers.
The image classification method based on integrated RGB-D fusion features and sparse coding is verified here with experimental data.
The experimental dataset used is the RGB-D Scenes dataset, a multi-view scene image dataset provided by the University of Washington. The dataset consists of 8 scene categories with 5972 pictures in total; all images were captured by a Kinect camera at a size of 640*480.
In the RGB-D Scenes dataset, all images are used for the experiment and resized to 256*256. For feature extraction, the sampling interval of the dense SIFT features is set to 8 pixels and the image block to 16 × 16; the PHOG feature extraction parameters are a block size of 16 × 16, a sampling interval of 8 pixels and 9 gradient orientations. When building the visual dictionary, the dictionary size is set to 200. The libsvm3.12 toolbox of the LIBSVM kit is used for SVM classification; 80% of the pictures in the dataset are used for training and 20% for testing.
In this experiment, the method of the invention is examined from two aspects: first, it is compared with the methods of several researchers with high current classification accuracy; second, the classification effects of different RGB-D fusion features and of the method of the invention are compared.
Table 1: Comparison of classification results on the RGB-D Scenes dataset
Classification method | Accuracy |
Linear SVM | 89.6% |
Gaussian kernel SVM | 90.0% |
Random forest | 90.1% |
HOG | 77.2% |
SIFT+SPM | 84.2% |
Method of the invention | 91.7% |
The comparison of classification accuracy with other methods is shown in Table 1. Liefeng Bo, in the article "Kernel descriptors for visual recognition", integrates three kinds of features and trains and classifies them with a linear SVM (Linear SVM), a Gaussian-kernel SVM (Kernel SVM) and a random forest (Random Forest) respectively, obtaining accuracies of 89.6%, 90.0% and 90.1% in this experiment. A. Janoch, in the article "A Category-Level 3D Object Dataset: Putting the Kinect to Work", extracts features from the depth image and the colour image with the HOG algorithm and realizes the final classification with an SVM classifier after feature fusion; in this experiment that method obtains an accuracy of 77.2%. N. Silberman, in the article "Indoor scene segmentation using a structured light sensor", first extracts the features of the depth image and the colour image with the SIFT algorithm, then fuses the features, performs feature coding with SPM, and finally classifies with an SVM; this algorithm obtains a classification accuracy of 84.2% in this experiment. The algorithm proposed by the present invention obtains an accuracy of 91.7%, an improvement of 1.6% over the best previous result, showing that the algorithm of the invention has good classification performance.
Table 2: Comparison of classification results of different fusion features on the RGB-D Scenes dataset
As can be seen from Table 2, when depth information is combined for image classification, the accuracy of classification based on a single fusion feature is lower than that based on multiple fusion features; image classification based on multi-feature fusion obtains better classification accuracy, but is still slightly below image classification based on decision fusion of multiple fusion features.
The specific embodiment of the present invention has been described above. It should be understood that the invention is not limited to the particular implementation above; any modification, equivalent substitution and improvement made within the spirit and principle of the present invention shall be included within the scope of protection of the invention.
Claims (8)
1. An image classification method based on RGB-D fusion features and sparse coding, characterised in that it comprises a training stage and a test stage:
the training stage comprises the following steps:
step A1: for each sample data, extract the dense SIFT and PHOG features of its RGB image and Depth image; the number of sample data is n;
step A2: for each sample data, fuse the features extracted from its two kinds of images pairwise in the form of linear concatenation, obtaining four different fusion features; the fusion features of the same type obtained from the n sample data form one set, giving four fusion feature sets;
step A3: cluster the fusion features in each of the four fusion feature sets separately, obtaining four different visual dictionaries;
step A4: in each visual dictionary, perform feature coding on the fusion features with the locality-constrained linear coding model, obtaining four different image representation sets;
step A5: construct a classifier from each of the four different fusion feature sets, image representation sets and the class labels of the corresponding sample data, obtaining four different classifiers;
the test stage comprises the following steps:
step B1: extract and fuse the features of the image to be classified according to the method in steps A1~A2, obtaining the four fusion features of the image to be classified;
step B2: in the four visual dictionaries obtained in step A3, perform feature coding on the four fusion features obtained in step B1 with the locality-constrained linear coding model, obtaining four different image representations of the image to be classified;
step B3: classify the four image representations obtained in step B2 with the four classifiers obtained in step A5, obtaining four class labels;
step B4: based on the four class labels obtained, determine the final class label of the image to be classified by the voting decision method, choosing the class label with the most votes among the four class labels as the final class label.
2. The image classification method based on RGB-D fusion features and sparse coding according to claim 1, characterised in that in step A3, the fusion features in a given fusion feature set are clustered with the K-means++ clustering method, and the method of building the corresponding visual dictionary is as follows:
3.1) denote this fusion feature set as HI = {h1, h2, h3, …, hn}, and set the number of clusters to m;
3.2) randomly select one fusion feature in HI as the first initial cluster center S1; set the counter t = 1;
3.3) for each fusion feature hi in HI, hi ∈ HI, calculate the distance d(hi) between hi and St;
3.4) select the next initial cluster center St+1: based on the formula
p(hi′) = d(hi′)² / Σ_{h∈HI} d(h)²
calculate the probability that the point hi′ is selected as the next initial cluster center, where hi′ ∈ HI; select the fusion feature with the maximum probability as the next initial cluster center St+1;
3.5) let t = t + 1 and repeat steps 3.3) and 3.4) until t = m, i.e. until all m initial cluster centers have been selected;
3.6) run the K-means algorithm with the selected initial cluster centers, finally generating m cluster centers;
3.7) define each cluster center as one visual word in the visual dictionary; the number of clusters m is the size of the visual dictionary.
3. the image classification method according to claim 2 based on RGB-D fusion features and sparse coding, its feature exists
In in the step A4, using local restriction uniform enconding model to fusion feature progress feature coding, model expression is such as
Under:
<mfenced open = "" close = "">
<mtable>
<mtr>
<mtd>
<mrow>
<munder>
<mrow>
<mi>m</mi>
<mi>i</mi>
<mi>n</mi>
</mrow>
<mi>C</mi>
</munder>
<munderover>
<mo>&Sigma;</mo>
<mrow>
<mi>i</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>n</mi>
</munderover>
<mo>|</mo>
<mo>|</mo>
<msub>
<mi>h</mi>
<mi>i</mi>
</msub>
<mo>-</mo>
<msub>
<mi>Bc</mi>
<mi>i</mi>
</msub>
<mo>|</mo>
<msup>
<mo>|</mo>
<mn>2</mn>
</msup>
<mo>+</mo>
<mi>&lambda;</mi>
<mo>|</mo>
<mo>|</mo>
<msub>
<mi>d</mi>
<mi>i</mi>
</msub>
<mo>&CircleTimes;</mo>
<msub>
<mi>c</mi>
<mi>i</mi>
</msub>
<mo>|</mo>
<msup>
<mo>|</mo>
<mn>2</mn>
</msup>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mtable>
<mtr>
<mtd>
<mrow>
<mi>s</mi>
<mo>.</mo>
<mi>t</mi>
<mo>.</mo>
</mrow>
</mtd>
<mtd>
<mrow>
<msup>
<mn>1</mn>
<mi>T</mi>
</msup>
<msub>
<mi>c</mi>
<mi>i</mi>
</msub>
<mo>=</mo>
<mn>1</mn>
<mo>,</mo>
<mo>&ForAll;</mo>
<mi>i</mi>
</mrow>
</mtd>
</mtr>
</mtable>
</mtd>
</mtr>
</mtable>
</mfenced>
In the formula: hi is a fusion feature in the fusion feature set HI, i.e., the feature vector to be encoded, hi ∈ Rd, where d is the dimension of the fusion feature; B = [b1, b2, b3, …, bm] is the visual dictionary built by the K-means++ algorithm, b1 to bm being the m visual words of the dictionary, bj ∈ Rd; C = [c1, c2, c3, …, cn] is the image representation set obtained by coding, where ci ∈ Rm is the coding coefficient of fusion feature hi after coding; λ is the LLC penalty factor; ⊗ denotes element-wise multiplication; the 1 in 1Tci is a vector whose elements are all 1, and the constraint 1Tci = 1 gives the locality-constrained linear coding model translation invariance; di is defined as:
$$d_i = \exp\!\left(\frac{\mathrm{dist}(h_i, B)}{\sigma}\right)$$
where dist(hi, B) = [dist(hi, b1), dist(hi, b2), …, dist(hi, bm)]T, dist(hi, bj) denotes the Euclidean distance between hi and bj, and σ adjusts the decay speed of the locality-constraint weights.
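For illustration, the LLC objective above admits a closed-form per-feature solution: solve a small regularised linear system, then normalise the coefficients to satisfy 1ᵀcᵢ = 1. The sketch below is an assumption-laden illustration, not the patent's implementation: `llc_encode` is a hypothetical name, and the normalisation of dᵢ by its maximum plus the default λ and σ values are assumed parameter choices.

```python
import numpy as np

def llc_encode(h, B, lam=1e-4, sigma=1.0):
    """Encode one feature h (shape (d,)) over dictionary B (shape (m, d))
    by minimising ||h - B^T c||^2 + lam ||d * c||^2  s.t.  1^T c = 1."""
    m = B.shape[0]
    # Locality adaptor d_i: grows with the distance from h to each word.
    dist = np.linalg.norm(B - h, axis=1)
    d = np.exp(dist / sigma)
    d = d / d.max()                      # scale normalisation (assumption)
    # Covariance of the dictionary shifted by h (each row: b_j - h).
    Bh = B - h
    C = Bh @ Bh.T
    C += np.eye(m) * lam * d**2          # locality-weighted regulariser diag(d^2)
    c = np.linalg.solve(C, np.ones(m))   # stationary point of the Lagrangian
    return c / c.sum()                   # enforce the sum-to-one constraint
```

Because the regulariser penalises coefficients on distant words, the solution concentrates on visual words near hi, which is what makes the code "locality-constrained".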
4. The image classification method based on RGB-D fusion features and sparse coding according to claim 1, characterized in that in step A4, an approximate locality-constrained linear coding model is used to encode the fusion features; the model expression is as follows:
$$\min_{\tilde{C}}\ \sum_{i=1}^{n}\left\|h_i - B_i\tilde{c}_i\right\|^2 \qquad \text{s.t.}\quad \mathbf{1}^{T}\tilde{c}_i = 1,\ \forall i$$
where Bi is the local visual-word matrix formed by the k visual words of the visual dictionary B nearest to the feature vector hi to be encoded, found by k-nearest-neighbor search; C̃ is the image representation set obtained by approximate coding, where c̃i is the coding coefficient of fusion feature hi after approximate coding.
5. The image classification method based on RGB-D fusion features and sparse coding according to claim 4, characterized in that k = 50.
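The approximate model of claims 4 and 5 restricts the least-squares problem to the k nearest visual words and sets every other coefficient to zero, which can be sketched as below. This is an illustrative sketch, not the patent's code: `approx_llc_encode` is a hypothetical name and the small ridge term added for numerical stability is an assumption.

```python
import numpy as np

def approx_llc_encode(h, B, k=50):
    """Approximate LLC: constrained least squares over the k nearest
    visual words only; coefficients of all other words are zero."""
    m = B.shape[0]
    k = min(k, m)
    dist = np.linalg.norm(B - h, axis=1)
    idx = np.argsort(dist)[:k]           # indices of the k nearest words
    Bi = B[idx] - h                      # local dictionary, shifted by h
    C = Bi @ Bi.T + 1e-8 * np.eye(k)     # tiny ridge for stability (assumption)
    w = np.linalg.solve(C, np.ones(k))
    w = w / w.sum()                      # enforce 1^T c = 1
    c = np.zeros(m)
    c[idx] = w                           # scatter back to full-dictionary length
    return c
```

With k = 50 as in claim 5, each feature is coded against a 50-word local dictionary instead of all m words, reducing the solve from an m x m system to a 50 x 50 one.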
6. The image classification method based on RGB-D fusion features and sparse coding according to any one of claims 1 to 5, characterized in that in step A1, the PHOG feature extraction comprises the following steps:
1.1) count the edge information of the image: extract the edge contour of the image with the Canny edge detection operator, and use this contour to describe the shape of the image;
1.2) perform pyramid multi-level partitioning of the image: the image is divided into 3 levels; level 1 is the whole image; level 2 divides the image into 4 sub-regions of equal size; level 3 subdivides each of the 4 level-2 sub-regions into 4 further sub-regions, finally giving 4 × 4 sub-regions;
1.3) extract the HOG feature vector of every sub-region at each level;
1.4) concatenate the HOG feature vectors of the sub-regions of all levels, then normalize the concatenated HOG data to obtain the PHOG feature of the whole image.
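The pyramid procedure of steps 1.1–1.4 can be sketched as below. This is a minimal self-contained illustration, not the patent's implementation: `phog` is a hypothetical name, the Canny edge map of step 1.1 is replaced here by gradient-magnitude weighting to avoid external dependencies, and the 9 orientation bins are an assumed parameter.

```python
import numpy as np

def phog(image, bins=9, levels=3):
    """Minimal PHOG sketch: orientation histograms over a 3-level spatial
    pyramid (1 + 4 + 16 regions), concatenated and globally normalised."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)                    # gradient magnitude (stands in for edges)
    ang = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned orientation in [0, pi)
    h, w = image.shape
    feats = []
    for level in range(levels):               # level l: a 2^l x 2^l grid
        n = 2 ** level
        for i in range(n):
            for j in range(n):
                ys = slice(i * h // n, (i + 1) * h // n)
                xs = slice(j * w // n, (j + 1) * w // n)
                hist, _ = np.histogram(
                    ang[ys, xs], bins=bins, range=(0, np.pi),
                    weights=mag[ys, xs])
                feats.append(hist)            # per-region HOG-style histogram
    v = np.concatenate(feats)                 # cascade all levels (step 1.4)
    return v / (np.linalg.norm(v) + 1e-12)    # global normalisation
```

With 9 bins the final descriptor has 9 × (1 + 4 + 16) = 189 dimensions; the per-region histograms are the step-1.3 HOG vectors and the final concatenation-plus-normalisation is step 1.4.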
7. The image classification method based on RGB-D fusion features and sparse coding according to any one of claims 1 to 5, characterized in that in step A5, the classifier is a linear SVM classifier.
8. The image classification method based on RGB-D fusion features and sparse coding according to any one of claims 1 to 5, characterized in that in step B4, if multiple class labels tie for the most votes, one of them is selected at random as the final class label.
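The tie-breaking rule of claim 8 amounts to a majority vote with a uniform random choice among tied winners, which can be sketched as below; `majority_vote` is a hypothetical name for illustration only.

```python
import random
from collections import Counter

def majority_vote(labels, rng=random):
    """Return the label with the most votes; break ties uniformly at random."""
    counts = Counter(labels)
    top = max(counts.values())
    winners = [lab for lab, c in counts.items() if c == top]
    return rng.choice(winners)   # random pick only matters when len(winners) > 1
```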
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710328468.7A CN107085731B (en) | 2017-05-11 | 2017-05-11 | Image classification method based on RGB-D fusion features and sparse coding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107085731A true CN107085731A (en) | 2017-08-22 |
CN107085731B CN107085731B (en) | 2020-03-10 |
Family
ID=59611626
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710328468.7A Active CN107085731B (en) | 2017-05-11 | 2017-05-11 | Image classification method based on RGB-D fusion features and sparse coding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107085731B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108090926A (en) * | 2017-12-31 | 2018-05-29 | 厦门大学 | A kind of depth estimation method based on double dictionary study |
CN108596256A (en) * | 2018-04-26 | 2018-09-28 | 北京航空航天大学青岛研究院 | One kind being based on RGB-D object identification grader building methods |
CN108805183A (en) * | 2018-05-28 | 2018-11-13 | 南京邮电大学 | A kind of image classification method of fusion partial polymerization descriptor and local uniform enconding |
CN108875080A (en) * | 2018-07-12 | 2018-11-23 | 百度在线网络技术(北京)有限公司 | A kind of image search method, device, server and storage medium |
CN109741484A (en) * | 2018-12-24 | 2019-05-10 | 南京理工大学 | Automobile data recorder and its working method with image detection and voice alarm function |
CN110443298A (en) * | 2019-07-31 | 2019-11-12 | 华中科技大学 | It is a kind of based on cloud-edge cooperated computing DDNN and its construction method and application |
CN111160387A (en) * | 2019-11-28 | 2020-05-15 | 广东工业大学 | Graph model based on multi-view dictionary learning |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105005786A (en) * | 2015-06-19 | 2015-10-28 | 南京航空航天大学 | Texture image classification method based on BoF and multi-feature fusion |
Non-Patent Citations (2)
Title |
---|
XIANG Chengyu: "Image classification based on RGB-D fusion features", Computer Engineering and Applications (《计算机工程与应用》) * |
SHEN Xiaoxia et al.: "Action recognition algorithm based on Kinect and pyramid features", Journal of Optoelectronics · Laser (《光电子•激光》) * |
Also Published As
Publication number | Publication date |
---|---|
CN107085731B (en) | 2020-03-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110443143B (en) | Multi-branch convolutional neural network fused remote sensing image scene classification method | |
Xia et al. | AID: A benchmark data set for performance evaluation of aerial scene classification | |
CN107085731A (en) | A kind of image classification method based on RGB D fusion features and sparse coding | |
Wang et al. | Aggregating rich hierarchical features for scene classification in remote sensing imagery | |
Cheng et al. | Effective and efficient midlevel visual elements-oriented land-use classification using VHR remote sensing images | |
Liu et al. | Scene classification using hierarchical Wasserstein CNN | |
CN106126581A (en) | Cartographical sketching image search method based on degree of depth study | |
CN102054178B (en) | A kind of image of Chinese Painting recognition methods based on local semantic concept | |
CN111222434A (en) | Method for obtaining evidence of synthesized face image based on local binary pattern and deep learning | |
CN109063649B (en) | Pedestrian re-identification method based on twin pedestrian alignment residual error network | |
CN106682233A (en) | Method for Hash image retrieval based on deep learning and local feature fusion | |
CN110109060A (en) | A kind of radar emitter signal method for separating and system based on deep learning network | |
CN106096557A (en) | A kind of semi-supervised learning facial expression recognizing method based on fuzzy training sample | |
CN108197538A (en) | A kind of bayonet vehicle searching system and method based on local feature and deep learning | |
CN108564094A (en) | A kind of Material Identification method based on convolutional neural networks and classifiers combination | |
CN104504362A (en) | Face detection method based on convolutional neural network | |
CN108009637B (en) | Station caption segmentation method of pixel-level station caption identification network based on cross-layer feature extraction | |
CN104966081B (en) | Spine image-recognizing method | |
CN105654122B (en) | Based on the matched spatial pyramid object identification method of kernel function | |
CN108921850B (en) | Image local feature extraction method based on image segmentation technology | |
CN114067444A (en) | Face spoofing detection method and system based on meta-pseudo label and illumination invariant feature | |
CN106529586A (en) | Image classification method based on supplemented text characteristic | |
CN105868706A (en) | Method for identifying 3D model based on sparse coding | |
CN111881716A (en) | Pedestrian re-identification method based on multi-view-angle generation countermeasure network | |
CN108416270A (en) | A kind of traffic sign recognition method based on more attribute union features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||