CN104866855A

CN104866855A - Image feature extraction method and apparatus

Info

Publication number: CN104866855A
Application number: CN201510229858.XA
Authority: CN
Inventors: 张世周; 龚怡宏; 柴振华
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2015-05-07
Filing date: 2015-05-07
Publication date: 2015-08-26

Abstract

The embodiment of the invention discloses an image feature extraction method and apparatus which are used for carrying out efficient image feature extraction in an unsupervised image training process. The method of the invention includes the following steps that: sparse coding is performed on regional pixels in an image region where pixels to be extracted of an input image are located, so that an M-dimensionality coding coefficient of the regional pixels can be obtained; M sparse coding patterns corresponding to the image region are obtained according to the coding coefficient of the regional pixels; pooling processing is performed on the M sparse coding patterns respectively, so that M-dimensionality pooling features of the pixels to be extracted can be obtained; and dimensionality reduction is performed on the pooling features, so that reduction features of the pixels to be extracted, which are used for representing the pooling features, can be obtained, and the dimensionality of the reduction features is smaller than that of the pooling features.

Description

A kind of image characteristic extracting method and device

Technical field

The present invention relates to the communication technology, particularly relate to a kind of image characteristic extracting method and device.

Background technology

Feature extraction is the committed step of many computer vision analysis tasks (as scene Recognition, object identification, target detection).Therefore, how to extract the feature that identification is strong, robustness is high is a study hotspot always.In traditional sense, the energy of researchist is mainly placed on the characteristics of image how hand-designed has difference etc. unchangeability in yardstick, viewpoint, illumination and class.But hand-designed characteristics of image, need researchist to have very professional domain knowledge, challenge is very strong.Therefore, picture scale invariability Feature Conversion (SIFT, Scale-invariant Feature Transformation), histograms of oriented gradients (HOG, Histogram of Oriented Gradients), local binary pattern (LBP, Local Binary Pattern) etc. the good characteristics of image of robustness very rare, usually within every 5 to 10 years, just can design a good manual feature.In addition, the feature of hand-designed does well usually in some visual task, but performance is general in other task, and such as LBP has won reputation widely in field of face identification, and HOG is mainly famous because of pedestrian detection.

Study hotspot is in recent years paid close attention to always and how directly to be learnt to obtain character representation from data.Extensively that concerned is convolutional neural networks model (CNN, Convolutional Neural Network), and in extensive visual identity challenge match, the champion of Images Classification match is the model based on CNN.

Figure 1 shows that the structural representation of a convolutional neural networks, CNN, by carrying out a series of convolution to local receptor field, the operations such as Chi Hua, obtains the character representation of image.After characteristics of image carries out pond, need the characteristics of image of Chi Huahou and the image pattern of mark to compare, carry out error-duration model according to comparison result, gradually adjusting and optimizing model parameter, finally desirable characteristics of image.CNN needs the data sample having mark in a large number in the training process, and such as, the training picture in the extensive visual identity challenge match of ImageNet has more than 1,000,000, and whole ImageNet data set has the mark image data more than 10,000,000.However, for the CNN model of a training super large, labeled data amount still seems not enough.More much less the manpower, the financial resources that expend required for the data set that mark one is ultra-large.Therefore, research obtains the stronger feature of identification how from magnanimity without the study (unsupervisedlearning) without supervised in the middle of the data of markup information, and seeming more meaningful, is also exactly always the focus that academia is studied.

In actual applications, train a large-scale CNN model its training result has good performance is simultaneously not an easy thing, because the large data training sample that the training heavy dependence of CNN has marked, although ImageNet has had numerous mark samples, be also nowhere near for the data in real life.On internet, all can there be the picture of magnanimity every day, video data uploads, and is difficult to the picture sample newly-increased to magnanimity and marks.

Summary of the invention

Embodiments provide a kind of image characteristic extracting method and device, for carrying out efficient image characteristics extraction in without supervision image training process.

The image characteristic extracting method that embodiment of the present invention first aspect provides, comprising:

Sparse coding is carried out to the area pixel in the image-region at the pixel place to be extracted of input picture, thus obtain described area pixel M dimension code coefficient, M be greater than 0 integer;

According to the code coefficient of described area pixel, obtain M corresponding to a described image-region sparse coding figure, wherein, in a kth sparse coding figure in described M sparse coding figure, the value of arbitrary coordinate point is the value of code coefficient under a kth dimension of the corresponding pixel in described image-region of described arbitrary coordinate point, and k is the positive integer being less than or equal to M;

Respectively pondization process is carried out to described M sparse coding figure, thus obtain the pond feature of the M dimension of described pixel to be extracted, M dimension one_to_one corresponding of described M sparse coding figure and described pond feature;

Carry out dimension to described pond feature about to subtract, thus obtain described pixel to be extracted about subtract feature for what represent described pond feature, the described dimension about subtracting feature is less than the dimension of described pond feature.

In conjunction with first aspect, in the implementation that the first is possible, the area pixel in the image-region at the described pixel place to be extracted to input picture carries out sparse coding, thus the code coefficient obtaining the M dimension of described area pixel comprises:

The pixel samples corresponding to described area pixel carries out sparse coding, thus obtains the code coefficient of the M dimension of described area pixel;

Wherein, the code coefficient of described area pixel is asked for according to following formula;

c_{ij} = \arg \min_{c} \frac{1}{2} {| | x_{ij} - {Zc}_{ij} | |}^{2}, s . t . {| | c_{ij} | |}_{0} \leq K;

Described x _ijbe the column vector changed listed by the value of the coordinate points in pixel samples corresponding to described area pixel, (i, j) is the coordinate of described area pixel, and described K is sparse function degree of rarefication, described c _iifor the code coefficient of described area pixel, described Z is described x _ijdictionary.

In conjunction with the first implementation of first aspect or first aspect, in the implementation that the second is possible, describedly respectively pondization process is carried out to described M sparse coding figure, thus obtains the pond feature that described M ties up, comprising:

The pond feature of described M dimension is asked for according to following formula;

Ω_{ij}^{c} = {[\max (c_{pq}^{1}), . . ., \max (c_{pq}^{M})]}_{p, q &Element; N (i, j)};

Described represent described pond feature, N (i, j) represents the coordinates regional of described sparse coding figure relative to described input picture place, described p, q ∈ N (i, j) denotation coordination (p, q) within described coordinates regional, described in volume coordinate for correspondence is the value of code coefficient under the 1st dimension of (p, q), described in volume coordinate for correspondence is the value of code coefficient under M dimension of (p, q).

In conjunction with first aspect, or any one possible implementation of first aspect the first to two, in the implementation that the third is possible, dimension is carried out to described pond feature and about subtracts, thus obtain about subtracting feature for what represent described pond feature, comprising:

The function f that the scrambler obtaining about subtracting for dimension according to following formula is corresponding, utilizes function f to carry out dimension to described pond feature and about subtracts, thus obtains about subtracting feature for what represent described pond feature;

f, g = \arg \min_{f, g} {| | n - g (f (n)) | |}^{2};

Wherein, g represents that n represents described pond feature for about subtracting described the function that characteristic recovery is the demoder of described pond feature.

In conjunction with first aspect, or any one possible implementation of first aspect the first to two, in the 4th kind of possible implementation, also comprise before dimension about subtracts described to carry out described pond feature:

The pond feature that described M ties up is normalized.

The characteristic pattern extracting method of the image that embodiment of the present invention second aspect provides, comprising:

Extract input picture at least part of region in pixel to be extracted about subtract feature, about subtracting of described pixel to be extracted is characterized as L dimensional vector, L be greater than 0 integer;

About feature is subtracted according to described pixel to be extracted, L that obtains described input picture corresponding to described at least part of region about subtracts characteristic pattern, wherein, the value that described L p of about to subtract in characteristic pattern about subtracts arbitrary coordinate point in characteristic pattern be the corresponding pixel in described at least part of region of described arbitrary coordinate point about subtract the value of feature under p dimension, p is the positive integer being less than or equal to L;

Pixel to be extracted at least part of region of described extraction input picture about subtract feature, comprising:

Sparse coding is carried out to the area pixel in the image-region at described pixel place to be extracted, thus obtain described area pixel M dimension code coefficient M be greater than 0 integer;

In conjunction with second aspect, in the implementation that the first is possible, after acquisition described L about subtracts characteristic pattern, also comprise:

Described L is about subtracted characteristic pattern as new input picture, and then obtains L and new about subtract characteristic pattern.

The image characteristics extraction device that the embodiment of the present invention third aspect provides, comprising:

Sparse coding module, pond module and dimension about subtract module;

Described sparse coding module be used for sparse coding is carried out to the area pixel in the image-region at the pixel place to be extracted of input picture, thus obtain described area pixel M dimension code coefficient, M be greater than 0 integer; According to the code coefficient of described area pixel, obtain M corresponding to a described image-region sparse coding figure, wherein, in a kth sparse coding figure in described M sparse coding figure, the value of arbitrary coordinate point is the value of code coefficient under a kth dimension of the corresponding pixel in described image-region of described arbitrary coordinate point, and k is the positive integer being less than or equal to M;

Described pond module is used for carrying out pondization process to described M sparse coding figure respectively, thus obtains the pond feature of the M dimension of described pixel to be extracted, M dimension one_to_one corresponding of described M sparse coding figure and described pond feature;

Described dimension about subtracts module and about subtracts for carrying out dimension to described pond feature, thus obtain described pixel to be extracted about subtract feature for what represent described pond feature, the described dimension about subtracting feature is less than the dimension of described pond feature.

In conjunction with the third aspect, in the implementation that the first is possible, described sparse coding module specifically for:

The pixel samples corresponding to described area pixel carries out sparse coding, thus obtains the code coefficient of the M dimension of described area pixel, wherein, asks for the code coefficient of described area pixel according to following formula;

c_{ij} = \arg \min_{c} \frac{1}{2} {| | x_{ij} - {Zc}_{ij} | |}^{2}, s . t . {| | c_{ij} | |}_{0} \leq K;

Described x _ijbe the column vector changed listed by the value of the coordinate points in pixel samples corresponding to described area pixel, (i, j) is the coordinate of described area pixel, and described K is sparse function degree of rarefication, described c _ijfor the code coefficient of described area pixel, described Z is described x _ijdictionary.

In conjunction with the third aspect, or the first implementation of the third aspect, in the implementation that the second is possible, described pond module specifically for:

Ω_{ij}^{c} = {[\max (c_{pq}^{1}), . . ., \max (c_{pq}^{M})]}_{p, q &Element; N (i, j)};

In conjunction with the third aspect, or any one possible implementation of the third aspect the first to two, in the implementation that the third is possible, described dimension about subtract module specifically for:

f, g = \arg \min_{f, g} {| | n - g (f (n)) | |}^{2};

In conjunction with the third aspect, or any one possible implementation of the third aspect the first to two, in the 4th kind of possible implementation, described image characteristics extraction device also comprises: normalization module;

Described normalization module is used for being normalized the pond feature that described M ties up.

The image characteristics extraction device that embodiment of the present invention fourth aspect provides, comprising: the first graphics processing unit;

Described first graphics processing unit comprises: fisrt feature extraction module, fisrt feature figure acquisition module;

Described fisrt feature extraction module about subtracts feature for what extract pixel to be extracted at least part of region of input picture, and about subtracting of described pixel to be extracted is characterized as L dimensional vector, L be greater than 0 integer;

Described fisrt feature figure acquisition module is used for about subtracting feature according to described pixel to be extracted, L that obtains described input picture corresponding to described at least part of region about subtracts characteristic pattern, wherein, the value that described L p of about to subtract in characteristic pattern about subtracts arbitrary coordinate point in characteristic pattern be the corresponding pixel in described at least part of region of described arbitrary coordinate point about subtract the value of feature under p dimension, p is the positive integer being less than or equal to L;

Described fisrt feature extraction module concrete also for:

In conjunction with fourth aspect, in the implementation that the first is possible, described image characteristics extraction device also comprises: the second graphics processing unit;

Described second graphics processing unit comprises: second feature extraction module, second feature figure acquisition module;

The described L that described second feature extraction module is used for described first graphics processing unit to export is individual about subtracts characteristic pattern as new input picture, and that extracts the pixel to be extracted at least part of region of input picture about subtracts feature;

Described second feature figure acquisition module is used for about subtracting feature according to described pixel to be extracted, obtains L corresponding to described at least part of region and new about subtracts characteristic pattern.

As can be seen from the above technical solutions, the embodiment of the present invention has the following advantages:

In embodiments of the present invention, the pixel samples corresponding by the area pixel in the image-region at the pixel place to be extracted to input picture carries out sparse coding, thus obtain the code coefficient of described area pixel, and represent sample data efficiently by this code coefficient, without the need to carrying out error back pass by the image pattern of mark; Further, after pond is carried out to sparse coding figure, dimension is carried out to described pond feature and about subtracts, the image dimension of pond feature is reduced, reduces the consumption to internal memory in image characteristics extraction process, improve the efficiency of image characteristics extraction.

Term " first ", " second ", " the 3rd " " 4th " etc. (if existence) in instructions of the present invention and claims and above-mentioned accompanying drawing are for distinguishing similar object, and need not be used for describing specific order or precedence.Should be appreciated that the data used like this can be exchanged in the appropriate case, so as embodiments of the invention described herein if with except here diagram or describe those except order implement.In addition, term " comprises " and " having " and their any distortion, intention is to cover not exclusive comprising, such as, contain those steps or unit that the process of series of steps or unit, method, system, product or equipment is not necessarily limited to clearly list, but can comprise clearly do not list or for intrinsic other step of these processes, method, product or equipment or unit.

Refer to Fig. 2, in the embodiment of the present invention, image characteristic extracting method can be applied to image classification system shown in Fig. 2, concrete:

In actual applications, image classification system can be divided into the sorter training process of off-line and online test process, as shown in Figure 2: first off-line training process is carrying out feature extraction to all pictures on training dataset, then training classifier.The sorter that training stage obtains preserves and uses for on-line testing; The on-line testing stage, first similar with training process, feature extraction is carried out to input picture, here characteristic extraction procedure must be completely the same with the characteristic extraction procedure in training process, after obtaining the character representation of input picture, the sorter calling training stage generation differentiates, to identify the classification belonging to input picture.Image characteristic extracting method in the embodiment of the present invention can realize the operation of unsupervised image characteristics extraction in image training process.

Further, according to the precision needs of reality, multiple graphics processing unit can be included in image characteristics extraction device 11, described multiple graphics processing unit is linked in sequence, the feature extraction result of an above graphics processing unit is as the input picture of next graphics processing unit, further refining extraction is carried out to characteristics of image, make the extraction result degree of accuracy of characteristics of image higher.

In the embodiment of the present invention, an embodiment of image characteristic extracting method comprises:

Refer to Fig. 3, in the embodiment of the present invention, another embodiment of image characteristic extracting method comprises:

301, sparse coding is carried out to the area pixel in the image-region at the pixel place to be extracted of input picture;

Concrete, the pixel samples that graphics processing unit is corresponding to described area pixel carries out sparse coding, thus obtains the code coefficient of the M dimension of described area pixel, and the code coefficient of described area pixel is M dimensional vector, M be greater than 0 integer.

In embodiments of the present invention, pixel samples can be single pixel, and also can be the pixel region of multiple pixel composition, can also be data block; Concrete, if pixel samples is data block, then described data block can be one dimension or two-dimentional above data.

Exemplary, the image-region at described pixel place to be extracted can have following several form:

One, centered by the volume coordinate of described pixel to be extracted, take K as the image-region of neighborhood diameter, K be greater than 1 integer;

Two, using the volume coordinate of described pixel to be extracted as origin coordinates, the length with a and b being respectively and the image-region of width, a and b is respectively the integer being greater than 1; Wherein, described origin coordinates for being positioned at the origin in the image-region lower left corner, also can be able to being the coordinate of any one end points in described image-region, not limiting herein;

Three, image-region can be obtained by plucking out in the image-region described by above-mentioned one or two any one or more pixel.

Be understandable that, the above-mentioned restriction to image-region is only exemplary, and in actual applications, image-region can also have other representation, is not construed as limiting herein.

In embodiments of the present invention, the characteristic of input picture in edge, end points, striped etc. can be described with the form of sparse coding; From the angle of mathematics, sparse coding is a kind of multidimensional data describing method, and data only have minority component to be in obvious state of activation after sparse coding simultaneously, and the component after this is roughly equivalent to coding presents super-Gaussian distribution.In actual applications, sparse coding has following several advantage: encoding scheme storage capacity is large, has associative memory ability, and calculates easy, make the structure of natural sign more clear.

Exemplary, refer to Fig. 4, with RGB (RGB, Red Green Blue) be example as color input channel, original image (leftmost side of Fig. 4) is input to graphics processing unit by RGB tri-passages and processes, graphics processing unit carries out sparse coding according to the sparse coding function preset and dictionary to input picture, obtains the code coefficient of a M dimension.Concrete, the value of M is relevant to actual application demand (namely can preset), is not construed as limiting herein.

In the process of sparse coding, graphics processing unit sets the image coordinate of a pixel in the image-region at pixel place to be extracted in the input image as (i, j), (i will be positioned at, j) pixel samples (in actual applications, a pixel samples can be place's local receptor field) of position is designated as X _ij, X _ijbe the data block of a C × w × h, wherein C is port number, and w × h represents the one piece of image-region (that is, described pixel samples) in described input picture centered by (i, j), w and h represents respectively.If input picture is original image, C is the Color Channel number of image, and if input picture is the characteristic pattern that upper graphics processing unit exports, C is then the number of this characteristic pattern.By X _ijrow change into a column vector x _ij, wherein x _jjdimension be D=C × w × h.

In embodiments of the present invention, (quantity of pixel samples is according to the size of input picture to have multiple pixel samples in input picture, and the size of w and h and determine), in the process of carrying out sparse coding, graphics processing unit can according to coordinate (i, j) each pixel samples is traveled through, obtain the code coefficient that each pixel is corresponding respectively.

Exemplary, sparse coding function can with reference to following formula:

Formula one:

c_{ij} = \arg \min_{c} \frac{1}{2} {| | x_{ij} - {Zc}_{ij} | |}^{2}, s . t . {| | c_{ij} | |}_{0} \leq K;

In formula one, described in represent and meet function minimum value, || x _ij-Zc _ij|| ²representing matrix x _ij-Zc _ijtwo norms.Described s.t. represents " being limited to ", || c _ij|| ₀represent c _ijzero norm.Described x _ijbe the column vector changed listed by the value of the coordinate points in pixel samples corresponding to described area pixel, (i, j) is the coordinate of described area pixel, and described K is sparse function degree of rarefication, described c _ijfor the code coefficient of described area pixel; Described Z is described x _ijdictionary, and wherein, when using original image as input picture, Z is default initial value, and in the process of follow-up elementary area process, Z can carry out renewal adjustment according to the sparse coding result of previous elementary area process.Described it is a real number space.

In formula one, code coefficient c _ijfor the weighted value of Z, what formula one was expressed is find an optimum code coefficient, makes x _ij-Zc _ijdifference minimum.

302, according to the code coefficient of described area pixel, M corresponding to a described image-region sparse coding figure is obtained;

Graphics processing unit is according to the code coefficient of described area pixel, obtain M corresponding to a described image-region sparse coding figure, wherein, in a kth sparse coding figure in described M sparse coding figure, the value of arbitrary coordinate point is the value of code coefficient under a kth dimension of the corresponding pixel in described image-region of described arbitrary coordinate point, and k is the positive integer being less than or equal to M.

Exemplary, sparse coding figure can be the matrix image of K*K, also can be the image of other shape, be not construed as limiting herein.

303, respectively pondization process is carried out to described M sparse coding figure;

Graphics processing unit carries out pondization process to described M sparse coding figure respectively, thus obtains the pond feature of the M dimension of described pixel to be extracted, M dimension one_to_one corresponding of described M sparse coding figure and described pond feature.

Concrete, pondization process expression is by each for described sparse coding figure pixel real number, and the maximal value of real number in all pixels after obtaining described real number, namely in the region of sparse coding figure, obtain the strongest pixel of feature.

Exemplary, described pondization process specifically can with reference to following formula:

Formula two:

Ω_{ij}^{c} = {[\max (c_{pq}^{1}), . . ., \max (c_{pq}^{M})]}_{p, q &Element; N (i, j)};

In formula two, described in represent described pond feature, N (i, j) represents the coordinates regional of described sparse coding figure relative to described input picture place, described p, q ∈ N (i, j) denotation coordination (p, q) within described coordinates regional, described in volume coordinate for correspondence is the value of code coefficient under the 1st dimension of (p, q), described in volume coordinate for correspondence is the value of code coefficient under M dimension of (p, q), and max () function returns the greatest member value of input vector.

304, the pond feature that described M ties up is normalized;

Optionally, at input picture after sparse coding, code coefficient may may cause amplitude to differ greatly due to yardstick difference, therefore needs to process after normalization operation again.

Concrete, being normalized described pond feature can with reference to following formula:

Formula 3:

n_{ij} = \frac{Ω_{ij}}{{| | Ω_{ij} | |}^{2} + ϵ};

Wherein, described n _ijfor the pond feature after normalization, ε is the positive number (in order to ensure that denominator is non-zero) that a numerical value is very little.

Be understandable that, in actual applications, also have other normalization processing method, be specifically not construed as limiting herein.

305, carry out dimension to described pond feature about to subtract, thus obtain described pixel to be extracted about subtract feature for what represent described pond feature;

Graphics processing unit carries out dimension to described pond feature and about subtracts, thus obtains about subtracting feature for what represent described pond feature, and the described dimension about subtracting feature is less than the dimension of described pond feature.

For making sparse coding have stable solution, the dictionary in sparse coding was generally complete dictionary, and the dimension of output coefficient of namely encoding generally will much larger than the dimension (M > > D) of input.Along with the superposition of the number of plies, the dimension of sparse coding input coefficient increases too fast, makes dictionary become can not bear the consumption of internal memory in the training process.Therefore, need that dimension is carried out to pond feature and about subtract, the dimension of data to be encoded is about subtracted.

Concrete, employing can adopt degree of depth own coding device to realize dimension about to subtract.As shown in Figure 5, scrambler is designated as the structural representation of degree of depth own coding device demoder is designated as wherein, M represents that the dimension before about subtracting, L represent the dimension after about subtracting, specifically can by following formula optimization coder parameters:

Formula four:

f, g = \arg \min_{f, g} {| | n - g (f (n)) | |}^{2};

Wherein, the function of f presentation code device, g represents that n represents described pond feature for about subtracting described the function that characteristic recovery is the demoder of described pond feature.(that is, the characteristics of image inputted after pondization processes or after normalized).

Be understandable that, in actual applications, also have the method that other dimension about subtracts, be specifically not construed as limiting herein.

306, using described about subtract after pond feature export as the feature extraction result of described input picture.

As shown in Figure 2, first off-line training process is carrying out feature extraction to all pictures on training dataset, then training classifier.The sorter that training stage obtains preserves and uses for on-line testing; The on-line testing stage, first similar with training process, feature extraction is carried out to input picture, here characteristic extraction procedure must be completely the same with the characteristic extraction procedure in training process, after obtaining the character representation of input picture, the sorter calling training stage generation differentiates, to identify the classification belonging to input picture.Image characteristic extracting method described by the embodiment of the present invention is applied to the feature extraction phases shown in Fig. 2.

In actual applications, can also carry out feature extraction, refer to Fig. 6 at least part of region of input picture, in the embodiment of the present invention, an embodiment of the characteristic pattern extracting method of image comprises:

That 601, extracts the pixel to be extracted at least part of region of input picture about subtracts feature;

Graphics processing unit extract input picture at least part of region in pixel to be extracted about subtract feature, about subtracting of described pixel to be extracted is characterized as L dimensional vector, L be greater than 0 integer.

Concrete, described at least part of region representation can be local difference or the Zone Full of described input picture.

Concrete, the method about subtracting feature extracting the pixel to be extracted at least part of region of input picture can be:

Sparse coding is carried out to the area pixel in the image-region at described pixel place to be extracted, thus obtain described area pixel M dimension code coefficient M be greater than 0 integer; According to the code coefficient of described area pixel, obtain M corresponding to a described image-region sparse coding figure, wherein, in a kth sparse coding figure in described M sparse coding figure, the value of arbitrary coordinate point is the value of code coefficient under a kth dimension of the corresponding pixel in described image-region of described arbitrary coordinate point, and k is the positive integer being less than or equal to M; Respectively pondization process is carried out to described M sparse coding figure, thus obtain the pond feature of the M dimension of described pixel to be extracted, M dimension one_to_one corresponding of described M sparse coding figure and described pond feature; Carry out dimension to described pond feature about to subtract, thus obtain described pixel to be extracted about subtract feature for what represent described pond feature, the described dimension about subtracting feature is less than the dimension of described pond feature.

In embodiments of the present invention, the detailed process about subtracting feature acquisition is described later in detail in Fig. 3 embodiment, repeats no more herein.

602, about subtract feature according to described pixel to be extracted, L that obtains described input picture corresponding to described at least part of region about subtracts characteristic pattern.

Graphics processing unit about subtracts feature according to described pixel to be extracted, L that obtains described input picture corresponding to described at least part of region about subtracts characteristic pattern, wherein, the value that described L p of about to subtract in characteristic pattern about subtracts arbitrary coordinate point in characteristic pattern be the corresponding pixel in described at least part of region of described arbitrary coordinate point about subtract the value of feature under p dimension, p is the positive integer being less than or equal to L.

Optionally, if current graphics processing unit is also connected with another graphics processing unit, after then described graphics processing unit obtains pond characteristic pattern, described pond characteristic pattern is exported to another graphics processing unit, and using the input picture of described pond characteristic pattern as another graphics processing unit described, and then obtain L new about subtract characteristic pattern, make the extraction result degree of accuracy of characteristics of image higher.

Further, on the basis of image characteristic extracting method in embodiments of the present invention, can also by multiple dimensioned and many localities technology strengthen extract discriminating power and the robustness of feature, consult Fig. 7.

Concrete, multiple dimensioned referring to carries out multi-resolution hierarchy to input picture, then feature extraction is carried out by the input of multiple resolution respectively by image characteristic extracting method in the embodiment of the present invention, and the feature that the most multiple resolution is extracted combines, as final image feature representation.

Concrete, many localities refer to, the image-region that setting N group is different, obtain corresponding N group respectively by image characteristic extracting method in the embodiment of the present invention again and about subtract characteristic pattern, N stack features figure in same spatial location is combined and carries out follow-up operation together, make degree of depth sparse coding network can extract the character representation of complementation information further like this.

In actual applications, the image feature representation identification extracted by the embodiment of the present invention is comparatively strong, and multiple common test data set arrives higher discrimination.In addition, the image characteristic extracting method of the embodiment of the present invention makes the requirement of dictionary learning to internal memory greatly reduce in sparse coding layer, and complexity is from O (M ²) be reduced to O (L ²), coding algorithm complexity is from O (M) to O (L), and M > > L, wherein, L is the dimension after dimension about subtracts.

Be below concrete experimental data, table 1 is the scene of MITScenes67 data set, specifically comprises 67 class scenes, adds up to 15620 pictures.

Table 1

Algorithm	Performance (%)
		DPM(Deformable Part Model)	30.4
SPM(Spatial Pyramid Matching)	34.4
		ScSPM	36.9
RBoW(Reconfigurable Models)	37.9
		DPM+Gist+SPM	43.1

HMP(Hierarchical Matching Pursuit)	41.8
		VC(Visual Coneept)	46.4
CNN-AlexNet(no pretraining)	19.3
		CNN-AlexNet(with pretrain on ImageNet)	51.5
M-HMP	51.2
		The method of the embodiment of the present invention	49.4
Our Method+Multi-Path HMP	52.3

Table 1 is the scene of UIUC Sports Event data set, and specifically include the data set of eight class sport events, this data set adds up to 1792 pictures, and each class contains 137 to 250 pictures.

Table 2

Algorithm	Performance (%)
		HIK+OCSVM	83.5
ScSPM	82.7
		LScSPM	85.3
Sc ⁺SPM	83.7
		HMP(Hierarchical Matching Pursuit)	85.7
CA-TM	78.0
		VC(Visual Concept)	84.8

CNN-AlexNet(no pretrain)	65.1
		CNN-AlexNet(with pretrain on ImageNet)	89.6
The method of the embodiment of the present invention	87.1

Be described the image characteristics extraction device of the image characteristic extracting method realized in the embodiment of the present invention below, it should be noted that, method described in each embodiment of above-mentioned image characteristic extracting method may be implemented in image characteristics extraction device of the present invention.Refer to Fig. 8, an embodiment of the image characteristics extraction device in the embodiment of the present invention comprises:

Sparse coding module 801, pond module 802 and dimension about subtract module 803;

Described sparse coding module 801 carries out sparse coding for the area pixel in the image-region at the pixel place to be extracted to input picture, thus obtain described area pixel M dimension code coefficient, M be greater than 0 integer; According to the code coefficient of described area pixel, obtain M corresponding to a described image-region sparse coding figure, wherein, in a kth sparse coding figure in described M sparse coding figure, the value of arbitrary coordinate point is the value of code coefficient under a kth dimension of the corresponding pixel in described image-region of described arbitrary coordinate point, and k is the positive integer being less than or equal to M;

Described pond module 802 for carrying out pondization process to described M sparse coding figure respectively, thus obtains the pond feature of the M dimension of described pixel to be extracted, M dimension one_to_one corresponding of described M sparse coding figure and described pond feature;

Described dimension about subtracts module 803 and about subtracts for carrying out dimension to described pond feature, thus obtain described pixel to be extracted about subtract feature for what represent described pond feature, the described dimension about subtracting feature is less than the dimension of described pond feature.

Further, described sparse coding module 801 specifically for:

c_{ij} = \arg \min_{c} \frac{1}{2} {| | x_{ij} - {Zc}_{ij} | |}^{2}, s . t . {| | c_{ij} | |}_{0} \leq K;

Further, described pond module 802 specifically for:

Ω_{ij}^{c} = {[\max (c_{pq}^{1}), . . ., \max (c_{pq}^{M})]}_{p, q &Element; N (i, j)};

Further, described dimension about subtract module 803 specifically for:

f, g = \arg \min_{f, g} {| | n - g (f (n)) | |}^{2};

Further, described image characteristics extraction device also comprises: normalization module 804;

Described normalization module 804 is normalized for the pond feature tieed up described M.

In embodiments of the present invention, the concrete operations flow process of the modules of image characteristics extraction device can reference diagram 3 embodiment, repeats no more herein.

Refer to Fig. 9, another embodiment of the image characteristics extraction device in the embodiment of the present invention comprises:

First graphics processing unit 901;

Described first graphics processing unit 901 comprises: fisrt feature extraction module 9011, fisrt feature figure acquisition module 9012;

Described fisrt feature extraction module 9011 about subtracts feature for what extract pixel to be extracted at least part of region of input picture, and about subtracting of described pixel to be extracted is characterized as L dimensional vector, L be greater than 0 integer;

Described fisrt feature figure acquisition module 9012 is for about subtracting feature according to described pixel to be extracted, L that obtains described input picture corresponding to described at least part of region about subtracts characteristic pattern, wherein, the value that described L p of about to subtract in characteristic pattern about subtracts arbitrary coordinate point in characteristic pattern be the corresponding pixel in described at least part of region of described coordinate points about subtract the value of feature under p dimension, p is the positive integer being less than or equal to L;

Described fisrt feature extraction module 9011 concrete also for:

Carry out sparse coding to the area pixel in the image-region at described pixel place to be extracted, thus obtain the code coefficient of described area pixel, the code coefficient of described area pixel is M dimensional vector, M be greater than 0 integer;

According to the code coefficient of described pixel to be extracted, obtain M corresponding to a described image-region sparse coding figure, wherein, in a kth sparse coding figure in described M sparse coding figure, the value of arbitrary coordinate point is the value of code coefficient under a kth dimension of the corresponding pixel in described image-region of described arbitrary coordinate point, and k is the positive integer being less than or equal to M;

Carry out dimension to described pond feature about to subtract, thus obtain about subtracting feature for what represent described pond feature, the described dimension about subtracting feature is less than the dimension of described pond feature.

Further, described image characteristics extraction device also comprises: the second graphics processing unit 902;

Described second graphics processing unit 902 comprises: second feature extraction module 9021, second feature figure acquisition module 9022;

Described second feature extraction module 9021 about subtracts characteristic pattern as new input picture for described L of being exported by described first graphics processing unit, and that extracts the pixel to be extracted at least part of region of input picture about subtracts feature;

Described second feature figure acquisition module 9022, for about subtracting feature according to described pixel to be extracted, obtains L corresponding to described at least part of region and new about subtracts characteristic pattern.

In embodiments of the present invention, the concrete operations flow process of the modules of image characteristics extraction device can reference diagram 6 embodiment, repeats no more herein.

Fig. 9 is the computer organization schematic diagram of the image characteristics extraction device based on image characteristic extracting method in the embodiment of the present invention, and image characteristics extraction device can comprise input equipment 1010, output device 1020, processor 1030 and storer 1040.

Storer 1040 can comprise ROM (read-only memory) and random access memory, and provides instruction and data to processor 1030.A part for storer 1040 can also comprise nonvolatile RAM (NVRAM).

Storer 1040 stores following element, executable module or data structure, or their subset, or their superset:

Operational order: comprise various operational order, for realizing various operation.

Operating system: comprise various system program, for realizing various basic business and processing hardware based task.

In embodiments of the present invention, the operational order (this operational order can store in an operating system) that processor 1030 stores by calling storer 1040, performs and operates as follows:

Described processor 1030 carries out sparse coding specifically for the area pixel in the image-region at the pixel place to be extracted to input picture, thus obtain described area pixel M dimension code coefficient, M be greater than 0 integer;

According to the code coefficient of described area pixel, obtain M corresponding to a described image-region sparse coding figure, wherein, in a kth sparse coding figure in described M sparse coding figure, the value of arbitrary coordinate point is the value of code coefficient under a kth dimension of the corresponding pixel in described image-region of described arbitrary coordinate point, and k is the positive integer being less than or equal to M; Respectively pondization process is carried out to described M sparse coding figure, thus obtain the pond feature of the M dimension of described pixel to be extracted, M dimension one_to_one corresponding of described M sparse coding figure and described pond feature; Carry out dimension to described pond feature about to subtract, thus obtain described pixel to be extracted about subtract feature for what represent described pond feature, the described dimension about subtracting feature is less than the dimension of described pond feature.

Processor 930 controls the operation of image feature deriving means, and processor 1030 can also be called CPU (Central Processing Unit, CPU (central processing unit)).Storer 1040 can comprise ROM (read-only memory) and random access memory, and provides instruction and data to processor 1030.A part for storer 1040 can also comprise nonvolatile RAM (NVRAM).In concrete application, each assembly of image characteristics extraction device is coupled by bus system 1050, and wherein bus system 1050 is except comprising data bus, can also comprise power bus, control bus and status signal bus in addition etc.But for the purpose of clearly demonstrating, in the drawings various bus is all designated as bus system 1050.

The method that the invention described above embodiment discloses can be applied in processor 1030, or is realized by processor 1030.Processor 1030 may be a kind of integrated circuit (IC) chip, has the processing power of signal.In implementation procedure, each step of said method can be completed by the instruction of the integrated logic circuit of the hardware in processor 1030 or software form.Above-mentioned processor 1030 can be general processor, digital signal processor (DSP), special IC (ASIC), ready-made programmable gate array (FPGA) or other programmable logic device (PLD), discrete gate or transistor logic, discrete hardware components.Can realize or perform disclosed each method, step and the logic diagram in the embodiment of the present invention.The processor etc. of general processor can be microprocessor or this processor also can be any routine.Step in conjunction with the method disclosed in the embodiment of the present invention directly can be presented as that hardware decoding processor is complete, or combines complete by the hardware in decoding processor and software module.Software module can be positioned at random access memory, flash memory, ROM (read-only memory), in the storage medium of this area maturations such as programmable read only memory or electrically erasable programmable storer, register.This storage medium is positioned at storer 1040, and processor 1030 reads the information in storer 1040, completes the step of said method in conjunction with its hardware.

In several embodiments that the application provides, should be understood that, disclosed apparatus and method can realize by another way.Such as, device embodiment described above is only schematic, such as, the division of described unit, be only a kind of logic function to divide, actual can have other dividing mode when realizing, such as multiple unit or assembly can in conjunction with or another system can be integrated into, or some features can be ignored, or do not perform.Another point, shown or discussed coupling each other or direct-coupling or communication connection can be by some interfaces, and the indirect coupling of device or unit or communication connection can be electrical, machinery or other form.

The described unit illustrated as separating component or can may not be and physically separates, and the parts as unit display can be or may not be physical location, namely can be positioned at a place, or also can be distributed in multiple network element.Some or all of unit wherein can be selected according to the actual needs to realize the object of the present embodiment scheme.

In addition, each functional unit in each embodiment of the present invention can be integrated in a processing unit, also can be that the independent physics of unit exists, also can two or more unit in a unit integrated.Above-mentioned integrated unit both can adopt the form of hardware to realize, and the form of SFU software functional unit also can be adopted to realize.

If described integrated unit using the form of SFU software functional unit realize and as independently production marketing or use time, can be stored in a computer read/write memory medium.Based on such understanding, the part that technical scheme of the present invention contributes to prior art in essence in other words or all or part of of this technical scheme can embody with the form of software product, this computer software product is stored in a storage medium, comprising some instructions in order to make a computer equipment (can be personal computer, server, or the network equipment etc.) perform all or part of step of method described in each embodiment of the present invention.And aforesaid storage medium comprises: USB flash disk, portable hard drive, ROM (read-only memory) (ROM, Read-OnlyMemory), random access memory (RAM, Random Access Memory), magnetic disc or CD etc. various can be program code stored medium.

The above; be only the specific embodiment of the present invention, but protection scope of the present invention is not limited thereto, is anyly familiar with those skilled in the art in the technical scope that the present invention discloses; change can be expected easily or replace, all should be encompassed within protection scope of the present invention.Therefore, protection scope of the present invention should described be as the criterion with the protection domain of claim.

Accompanying drawing explanation

In order to be illustrated more clearly in the embodiment of the present invention or technical scheme of the prior art, be briefly described to the accompanying drawing used required in embodiment below, apparently, accompanying drawing in the following describes is only some embodiments of the present invention, for those of ordinary skill in the art, under the prerequisite not paying creative work, other accompanying drawing can also be obtained according to these accompanying drawings.

Fig. 1 is image characteristics extraction schematic diagram of the prior art;

Fig. 2 is the application schematic diagram of image characteristics extraction in the embodiment of the present invention;

Fig. 3 is a schematic flow sheet of image characteristic extracting method in the embodiment of the present invention;

Fig. 4 is an image characteristics extraction schematic diagram of image characteristic extracting method in the embodiment of the present invention;

Fig. 5 is that in the embodiment of the present invention, dimension about subtracts the schematic diagram recovered with dimension;

Fig. 6 is a schematic flow sheet of image characteristic extracting method in the embodiment of the present invention;

Fig. 7 is another image characteristics extraction schematic diagram of image characteristic extracting method in the embodiment of the present invention;

Fig. 8 is the schematic diagram of image characteristics extraction device in the embodiment of the present invention;

Fig. 9 is the schematic diagram of image characteristics extraction device in the embodiment of the present invention;

Figure 10 is the computer organization schematic diagram based on image characteristic extracting method in the embodiment of the present invention.

Embodiment

Below in conjunction with the accompanying drawing in the embodiment of the present invention, be clearly and completely described the technical scheme in the embodiment of the present invention, obviously, described embodiment is only the present invention's part embodiment, instead of whole embodiments.Based on the embodiment in the present invention, those of ordinary skill in the art, not making the every other embodiment obtained under creative work prerequisite, belong to the scope of protection of the invention.

Claims

1. an image characteristic extracting method, is characterized in that, comprising:

2. method according to claim 1, is characterized in that, the area pixel in the image-region at the described pixel place to be extracted to input picture carries out sparse coding, thus the code coefficient obtaining the M dimension of described area pixel comprises:

c_{ij} = \arg \min_{c} \frac{1}{2} {| | x_{ij} - {Zc}_{ij} | |}^{2}, s . t . {| | c_{ij} | |}_{0} \leq K;

3. method according to claim 1 and 2, is characterized in that, described respectively to described M sparse coding figure carry out pondization process, thus obtain described M tie up pond feature, comprising:

Ω_{ij}^{c} = {[\max (c_{pq}^{1}), . . ., \max (c_{pq}^{M})]}_{p, q &Element; N (i, j)};

4. the method according to any one of claims 1 to 3, is characterized in that, carries out dimension about subtract described pond feature, thus obtains about subtracting feature for what represent described pond feature, comprising:

f, g = \arg \min_{f, g} {| | n - g (f (n)) | |}^{2};

5. the method according to any one of claims 1 to 3, is characterized in that, also comprises before dimension about subtracts described to carry out described pond feature:

The pond feature that described M ties up is normalized.

6. a characteristic pattern extracting method for image, is characterized in that, comprising:

7. method according to claim 6, is characterized in that, after acquisition described L about subtracts characteristic pattern, also comprises:

8. an image characteristics extraction device, is characterized in that, comprising:

Sparse coding module, pond module and dimension about subtract module;

9. device according to claim 8, is characterized in that, described sparse coding module specifically for:

c_{ij} = \arg \min_{c} \frac{1}{2} {| | x_{ij} - {Zc}_{ij} | |}^{2}, s . t . {| | c_{ij} | |}_{0} \leq K;

10. device according to claim 8 or claim 9, is characterized in that, described pond module specifically for:

Ω_{ij}^{c} = {[\max (c_{pq}^{1}), . . ., \max (c_{pq}^{M})]}_{p, q &Element; N (i, j)};

Device described in 11. any one of according to Claim 8 to 10, is characterized in that, described dimension about subtract module specifically for:

f, g = \arg \min_{f, g} {| | n - g (f (n)) | |}^{2};

Device described in 12. any one of according to Claim 8 to 10, it is characterized in that, described image characteristics extraction device also comprises: normalization module;

13. 1 kinds of image characteristics extraction devices, is characterized in that, comprising: the first graphics processing unit;

Described fisrt feature extraction module concrete also for:

14. devices according to claim 13, is characterized in that, described image characteristics extraction device also comprises: the second graphics processing unit;