CN106557779A - An object recognition method based on a salient-region bag-of-words model - Google Patents
An object recognition method based on a salient-region bag-of-words model
- Publication number
- CN106557779A CN106557779A CN201610921396.2A CN201610921396A CN106557779A CN 106557779 A CN106557779 A CN 106557779A CN 201610921396 A CN201610921396 A CN 201610921396A CN 106557779 A CN106557779 A CN 106557779A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Abstract
The present invention provides an object recognition method based on a salient-region bag-of-words model, comprising the following steps: corner detection; locating the salient region of the image; SIFT feature extraction; and image region feature similarity comparison. Because the method extracts local features only within the target region, it avoids complex image segmentation techniques on the one hand, and greatly reduces the number of feature points unrelated to the object on the other.
Description
Technical field
The present invention relates to the technical field of digital image processing, and in particular to an object recognition method based on a salient-region bag-of-words model.
Background technology
With the rapid development of machine learning and pattern recognition, computer vision techniques have steadily improved, making it possible to use computers to imitate human cognitive abilities and thereby lighten or assist people's routine work. Object recognition has become a particularly important research direction within pattern recognition, with broad demand and application in both military and civilian fields, for example intelligent video surveillance, autonomous navigation, human-computer interaction, and content-based retrieval of massive image collections on the Internet.
Recognizing objects accurately and effectively, so as to meet the growing demands of machine vision, psychology, and practical applications, remains a challenging task. Research shows that about 60% of the information humans obtain from the outside world comes through vision. In computer vision, visual information mainly takes the form of images and video, with images being the principal carrier. Analyzing images, simulating the human visual system, cognitively identifying the vast variety of objects, extracting their characteristic features, building effective object representations, and establishing reasonably simple object models that distinguish one object from another are all key problems in object recognition.
Object recognition is one of the most active research topics in computer vision, and recognizing objects quickly and accurately is an important research direction. Recognition is often disturbed by viewpoint changes, scale changes, occlusion, and background clutter. To cope with these challenges, many researchers have proposed describing images with local features. Bag-of-words (BOW) representations built on local keypoints have shown good performance in a variety of visual classification tasks. Traditional BOW-based recognition methods are generally implemented with SIFT descriptors, k-means clustering, and a classifier. One defect of the BOW model is that it does not separate foreground from background, so some of the visual words that form the BOW representation are extracted from background regions. SIFT descriptors yield a large number of feature points and resist the effects of scale change and rotation well; their quantity effectively ensures the sufficiency and robustness of the image representation. Yet SIFT, though representative of local feature description algorithms, has its shortcomings: keypoints are extracted from the entire image, so many of the detected interest points come from the background. Conversely, general interest-point detectors yield too few feature points, making the image representation insufficiently rich, although the points they detect tend to concentrate on the target.
The patent document CN105654122A discloses a spatial-pyramid object recognition method based on kernel matching. It comprises the following steps: extract ED-SIFT (Efficient Dense Scale-Invariant Feature Transform) descriptors from the object image; cluster the ED-SIFT descriptors of the training samples with the k-means++ algorithm to obtain a visual dictionary; introduce a spatial pyramid and use kernel matching to obtain the visual-word histograms of the training and test samples; train an SVM classifier on the training samples and use it to classify the test samples. The ED-SIFT descriptors proposed by that method extract keypoints from the entire image, so many detected interest points come from the background and foreground and background are not separated. In addition, k-means++, as an unsupervised learning algorithm, is sensitive to abnormal data: outliers in the data set can have a non-negligible impact on the results. It also depends on the choice of k, which must be fixed in advance; this choice is critical to classification, and its suitability directly determines classification quality.
The content of the invention
To solve the above technical problems, and to address the instability, unreliability, or object-irrelevance of the local features conventionally used to build visual dictionaries, the present invention proposes an object recognition method based on a salient-region bag-of-words model. First, the method uses a strong corner detector to determine the salient region of the image. Then local features are extracted from the salient region and modeled as a bag of words; finally, a nearest-neighbor classifier gives the recognition result.
The present invention provides an object recognition method based on a salient-region bag-of-words model, comprising the following steps:
Step 1: corner detection;
Step 2: locating the salient region of the image;
Step 3: SIFT feature extraction;
Step 4: image region feature similarity comparison.
Preferably, the corners are Shi-Tomasi corners, computed from the rate of change of the gradient direction.
In any of the above schemes, preferably, the Shi-Tomasi corners are points where the image brightness changes sharply or the curvature is very large.
In any of the above schemes, preferably, step 2 converts the localization of the image key region into finding the region of corner distribution in the image.
In any of the above schemes, preferably, the localization method is: divide the image into m × n blocks, count the corners in each block, and record the per-block corner counts in an m × n matrix. If a block contains at least q corners, the contiguous concentrated area in which those corners lie is taken as the image key region, where q is the threshold on the per-block corner count, used to screen out background areas containing only isolated or few corners.
In any of the above schemes, preferably, step 3 comprises DoG extreme-point extraction and feature-vector formation.
In any of the above schemes, preferably, the DoG extreme points are extracted by scaling the original image to obtain a multi-scale image-space representation sequence, so that features are extracted at different resolutions.
In any of the above schemes, preferably, the scales include at least one of a large scale and a small scale.
In any of the above schemes, preferably, the large scale (low resolution) captures the overall appearance of the object.
In any of the above schemes, preferably, the small scale (high resolution) captures the fine details of the object.
In any of the above schemes, preferably, the scale space L(x, y, σ) of an image is defined as the convolution of the original image I(x, y) with a variable-scale 2-D Gaussian function G(x, y, σ), where σ denotes the scale. The Gaussian function is defined as:

G(x, y, σ) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))

Convolving the Gaussian kernel with the original image gives the scale space:

L(x, y, σ) = G(x, y, σ) * I(x, y)

To find stable keypoints in the multi-scale space, an image pyramid is built first; then adjacent levels of each pyramid octave are differenced to construct the difference-of-Gaussians (DoG) scale space:

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ)
In any of the above schemes, preferably, a gradient-histogram statistic is used with the extreme point as origin, accumulating the contributions of the pixels in a neighborhood to the extreme point's orientation. The histogram has 36 bins, one per 10 degrees; each neighborhood pixel's gradient magnitude is added to the bin corresponding to its gradient angle, so the height of a bin represents accumulated gradient magnitude.
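The 36-bin orientation histogram can be sketched as follows (the magnitudes and angles are made-up sample values; function and variable names are hypothetical):

```python
import numpy as np

def orientation_histogram(magnitudes, angles_deg):
    """36 bins of 10 degrees each; every neighbourhood pixel adds its
    gradient magnitude to the bin of its gradient direction."""
    hist = np.zeros(36)
    bins = (np.asarray(angles_deg) % 360 // 10).astype(int)
    np.add.at(hist, bins, magnitudes)   # accumulate magnitudes per bin
    return hist

# two pixels near 0 degrees and one near 90 degrees
hist = orientation_histogram([1.0, 2.0, 0.5], [5.0, 7.0, 95.0])
main_dir = np.argmax(hist) * 10   # principal direction = peak bin
```

The peak bin then serves as the extreme point's principal orientation, as described in the following clause.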
In any of the above schemes, preferably, the feature vectors are formed as follows: the extreme points found in the DoG scale space possess scale invariance; using the gradient-orientation distribution of the pixels in each extreme point's neighborhood, a histogram statistic determines the extreme point's principal orientation, and each extreme point is assigned an orientation parameter, yielding rotation invariance. The gradient magnitude m(x, y) and gradient direction θ(x, y) of each pixel are computed as:

m(x, y) = sqrt((L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²)
θ(x, y) = arctan((L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y)))

In any of the above schemes, preferably, before the keypoint descriptor vector is built, the coordinate axes are first rotated to the keypoint's principal orientation; then, centered on the keypoint, gradient-orientation histograms are computed for 4 × 4 subregions, each subregion being an image block of 4 × 4 pixels, so the subregions cover 16 × 16 pixels in total.
In any of the above schemes, preferably, step 4 consists of a training process and a test process over the images in the image library.
In any of the above schemes, preferably, the training process comprises the following steps:
Step a1: read in a training object picture and determine the salient region of the image;
Step a2: extract the SIFT features of the training samples within the salient region; if there are i training pictures in total and the numbers of SIFT feature points of the images are n1, n2, ..., ni, the total number of extracted SIFT features is (n1 + n2 + ... + ni);
Step a3: store all sample SIFT features in an original training matrix of size (n1 + n2 + ... + ni) × 128, and create the visual dictionary required by the BOW model with the k-means clustering algorithm, where k is the size of the visual dictionary, i.e., the dimension of the BOW histogram;
Step a4: map onto the visual dictionary and compute the BOW histogram of each training picture; each image is represented by a k-dimensional vector, and all training pictures can be stored in a new feature matrix of size i × k.
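Steps a3 and a4 can be sketched with a toy example (two-dimensional "descriptors" and a two-word dictionary stand in for 128-dimensional SIFT features and a realistic k; the plain Lloyd's k-means and its deterministic initialization are simplifying assumptions):

```python
import numpy as np

def kmeans(X, k, iters=10):
    """Plain Lloyd's k-means; the k centres are the visual dictionary."""
    centres = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()  # spread-out init
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centres[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return centres

def bow_histogram(features, dictionary):
    """Map each descriptor to its nearest visual word; return a k-dim count vector."""
    labels = np.argmin(((features[:, None] - dictionary[None]) ** 2).sum(-1), axis=1)
    return np.bincount(labels, minlength=len(dictionary))

# toy 2-D 'descriptors': two well-separated clusters stand in for SIFT features
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
dictionary = kmeans(X, k=2)
h = bow_histogram(X, dictionary)   # this image's BOW histogram
```

Each image's histogram is a fixed k-dimensional vector regardless of how many features the image produced, which is what makes the i × k feature matrix of step a4 possible.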
In any of the above schemes, preferably, the test process comprises the following steps:
Step b1: read in a test object picture and determine the salient region of the image;
Step b2: extract the SIFT features of the test samples within the salient region; if there are i test pictures in total and the numbers of SIFT feature points of the images are n1, n2, ..., ni, the total number of extracted SIFT features is (n1 + n2 + ... + ni);
Step b3: project the extracted features onto the visual dictionary to form the BOW representation of each test image; the visual dictionary used for this projection is the one established in the training process;
Step b4: map onto the visual dictionary and compute the BOW histogram of each test picture; each image is represented by a k-dimensional vector, and all test pictures can be stored in a new feature matrix of size i × k.
In any of the above schemes, preferably, in the BOW model an image is represented by a vector computed from the frequencies with which its features map to the visual dictionary. For example, image j can be expressed as:

d_j = (n_{j,0}, n_{j,1}, ..., n_{j,k-1})

where k is the size of the visual dictionary and n_{j,i} (i = 0, 1, ..., k−1) is the frequency with which image j maps to the i-th visual word; this frequency is also called a code word.
In any of the above schemes, preferably, the tf-idf weighting method from information retrieval is used to generate weighted BOW features.
In any of the above schemes, preferably, tf expresses the idea that if a keyword appears frequently in one article but rarely in others, the word has high discriminative power and contributes strongly to classifying the article.
In any of the above schemes, preferably, idf expresses the idea that if few files in the document database contain the word, the idf weight is larger, indicating that the word discriminates well.
In any of the above schemes, preferably, the tf-idf weight of word i in image j is:

w_{j,i} = w_{tf,j,i} × w_{idf,j,i} (i = 0, 1, ..., k−1)

where w_{tf,j,i} is the weight of the contribution of the i-th word to classifying image j, and w_{idf,j,i} is the weight of the contribution of the i-th word, across the document database, to classifying image j.
In any of the above schemes, preferably, the weights are computed as follows:

w_{tf,j,i} = n_{j,i} / n_j

where n_{j,i} is the number of times the i-th word occurs in image j and n_j is the total word count of image j, i.e., n_j = Σ_i n_{j,i}; and

w_{idf,j,i} = log(N / (n_d + 1))

where N is the number of images in the training library and n_d is the number of images containing word d. A word's discriminative power in the library is inversely proportional to the frequency with which it appears across different images; to avoid division by zero, the denominator is taken as n_d + 1.
In any of the above schemes, preferably, the weighted BOW feature of image j is denoted:

bof_j = w_{j,i} × d_j (i = 0, 1, ..., k−1)

In the bag-of-words model, images containing different numbers of features can all be expressed as vectors of fixed dimension, so object recognition can be realized by measuring the similarity between a test image and the images in the training library.
In any of the above schemes, preferably, the test vector is:

d_t = (n_{t,0}, n_{t,1}, ..., n_{t,k-1})

and the weighted BOW feature of the test image, denoted bof_t, is computed in the same way as for training images. Euclidean distance measures the absolute distance between two feature vectors and is directly tied to the concrete feature values; cosine distance judges similarity by the angle between two feature vectors. Compared with Euclidean distance, cosine distance emphasizes the difference in direction between two feature vectors rather than in distance or magnitude, and is the usual choice for measuring differences between high-dimensional vectors; however, because it ignores absolute feature values, this insensitivity can introduce error. Therefore, this method measures similarity with an adjusted cosine distance dis: every dimension of each feature vector has the mean of that vector's values subtracted before the cosine is taken. The larger the adjusted cosine distance dis, the smaller the angle between the feature vectors, and the more similar the two images.
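The adjusted cosine measure can be sketched as follows (the function name and sample vectors are hypothetical; the per-vector mean subtraction follows the text):

```python
import numpy as np

def adjusted_cosine(a, b):
    """Subtract each vector's own mean from every dimension, then take the
    cosine, so similarity reflects direction rather than absolute values."""
    a = np.asarray(a, float) - np.mean(a)
    b = np.asarray(b, float) - np.mean(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

s_same = adjusted_cosine([1, 2, 3], [2, 4, 6])   # same direction -> near 1
s_opp = adjusted_cosine([1, 2, 3], [3, 2, 1])    # opposite trend -> near -1
```

A nearest-neighbor classifier would then label the test image with the training image whose adjusted cosine value is largest.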
Because the proposed method extracts local features within the target region, it avoids complex image segmentation techniques on the one hand and greatly reduces the number of object-irrelevant feature points on the other. It has few parameters, is simple to compute, processes quickly, and recognizes images well.
Description of the drawings
Fig. 1 is a flow chart of a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Fig. 2 is a flow chart of building the information library in a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Fig. 3 is a DoG scale-space construction diagram of a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Fig. 4 is a DoG scale-space extreme-point detection diagram of a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Fig. 5 shows the generation of the orientation histogram in a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Fig. 6 shows the formation of the keypoint feature vector in a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Specific embodiments
The present invention is further elaborated below with specific embodiments in conjunction with the accompanying drawings.
Embodiment one
The object recognition method based on the salient-region bag-of-words model comprises three parts: salient-region localization, feature extraction and description, and image region feature similarity comparison.
Step 100 is executed: corner detection.
Shi-Tomasi corners are computed from the rate of change of the gradient direction; they are precisely the points where the image brightness changes sharply or the curvature is very large. The main idea is to characterize the local signal structure in the image with the autocorrelation matrix. Suppose the image signal at x is I(x). Convolving the image signal with the derivatives of a Gaussian G(x, σ_D) gives the first derivatives:

L_u(x, σ_D) = I(x) * G_u(x, σ_D)   (1)
L_v(x, σ_D) = I(x) * G_v(x, σ_D)   (2)

where σ_D is the differentiation scale. From formulas (1) and (2), the autocorrelation matrix is obtained as

C(x, σ_I, σ_D) = σ_D² G(σ_I) * [ L_u²(x, σ_D), L_u(x, σ_D)L_v(x, σ_D); L_u(x, σ_D)L_v(x, σ_D), L_v²(x, σ_D) ]   (3)

where σ_I is the integration scale and G is the Gaussian operator.
The eigenvalues λ1, λ2 of the autocorrelation matrix C are computed, min(λ1, λ2) is retained, and an eigenvalue image is obtained. When a threshold λ is given for Shi-Tomasi corner extraction, a point in the image satisfying min(λ1, λ2) > λ is considered a strong corner.
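The strong-corner criterion min(λ1, λ2) > λ can be sketched in pure NumPy on a synthetic image (the window size, test image, and naive per-pixel loop are illustrative; a practical implementation would use an optimized library routine):

```python
import numpy as np

def shi_tomasi_response(img, win=1):
    """min(lambda1, lambda2) of the structure matrix over a small window;
    a pixel is a strong corner where this response exceeds a threshold."""
    img = img.astype(float)
    Iy, Ix = np.gradient(img)               # first derivatives
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy
    h, w = img.shape
    R = np.zeros_like(img)
    for y in range(win, h - win):
        for x in range(win, w - win):
            sl = np.s_[y - win:y + win + 1, x - win:x + win + 1]
            a, c, b = Ixx[sl].sum(), Iyy[sl].sum(), Ixy[sl].sum()
            # closed-form smaller eigenvalue of [[a, b], [b, c]]
            R[y, x] = (a + c) / 2 - np.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return R

# a white square on black: the square's corners give the largest responses
img = np.zeros((12, 12))
img[4:8, 4:8] = 1.0
R = shi_tomasi_response(img)
```

Flat background scores zero, an edge midpoint scores low (one dominant eigenvalue), and the square's corners score highest, which is the behavior the thresholding step relies on.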
Step 110 is executed: locating the salient region of the image.
Localization of the image key region is converted into finding the region of corner distribution in the image. The localization method is: divide the image into m × n blocks, count the corners in each block, and record the per-block corner counts in an m × n matrix. If a block contains at least q corners, the contiguous concentrated area in which those corners lie is taken as the image key region, where q is the threshold on the per-block corner count, used to screen out background areas containing only isolated or few corners.
Step 120 is executed: SIFT feature extraction, which consists of the following two parts.
(1) DoG extreme-point extraction.
The sparse set of local feature points produced by a feature detector is the basis for building the BOW representation. To represent an object accurately, it must be captured at an appropriate scale. The main idea of scale-space theory is to scale the original image to obtain a multi-scale image-space representation sequence, so that features are extracted at different resolutions. The large scale (low resolution) captures the overall appearance of the object; the small scale (high resolution) captures its fine details. The scale space L(x, y, σ) of an image is defined as the convolution of the original image I(x, y) with a variable-scale 2-D Gaussian function G(x, y, σ), where σ denotes the scale. The Gaussian function is defined as:

G(x, y, σ) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))

Convolving the Gaussian kernel with the original image gives the scale space:

L(x, y, σ) = G(x, y, σ) * I(x, y)   (6)

To find stable keypoints in the multi-scale space, an image pyramid is built first; then adjacent levels of each pyramid octave are differenced to construct the difference-of-Gaussians (DoG) scale space:

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ)   (7)

The local extreme points detected in the DoG space serve as keypoints. To find the extreme points, each pixel is compared with its 8 neighbors at the same scale and the 9 × 2 corresponding points at the two adjacent scales, 26 points in all, ensuring that extreme points are detected in both scale space and the 2-D image space; a pixel larger than all 26 points, or smaller than all of them, is declared a keypoint.
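The 26-neighbor comparison can be sketched directly on a small DoG stack (the scale × row × column array layout and all names are assumptions for illustration):

```python
import numpy as np

def is_extremum(dog, s, y, x):
    """True if D(s, y, x) is strictly greater or strictly smaller than all
    26 neighbours: 8 at the same scale, 9 at each adjacent scale."""
    v = dog[s, y, x]
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    others = np.delete(cube.ravel(), 13)   # drop the centre, keep 26 neighbours
    return bool(v > others.max() or v < others.min())

dog = np.zeros((3, 5, 5))
dog[1, 2, 2] = 1.0   # a clear maximum at the middle scale
```

Only the peak at the middle scale passes the test; every pixel adjacent to it fails, since it is neither above nor below all 26 of its neighbors.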
(2) Feature-vector formation.
The extreme points found in the DoG scale space possess scale invariance. Using the gradient-orientation distribution of the pixels in each extreme point's neighborhood, a histogram statistic determines the extreme point's principal orientation, and each extreme point is assigned an orientation parameter, yielding rotation invariance. The gradient magnitude m(x, y) and gradient direction θ(x, y) of each pixel are computed using formulas (4) and (5):

m(x, y) = sqrt((L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²)   (4)
θ(x, y) = arctan((L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y)))   (5)

Using the gradient-histogram statistic with the extreme point as origin, the contributions of the pixels in a neighborhood to the extreme point's orientation are accumulated. The histogram has 36 bins, one per 10 degrees; each neighborhood pixel's gradient magnitude is added to the bin corresponding to its gradient angle, and the height of a bin represents accumulated gradient magnitude.
Before the keypoint descriptor vector is built, the coordinate axes are first rotated to the keypoint's principal orientation; then, centered on the keypoint, gradient-orientation histograms are computed for 4 × 4 subregions, each subregion being an image block of 4 × 4 pixels, so the subregions cover 16 × 16 pixels in total.
Step 130 is executed: image region feature similarity comparison.
The images in the image library are put through a training process and a test process, finally yielding the recognized object.
Embodiment two
As shown in Fig. 2, image region feature similarity comparison comprises a training process and a test process over the images in the image library.
(1) Training process
Step 200 is executed: read in a training object picture and determine the salient region of the image.
Step 210 is executed: extract the SIFT features of the training samples within the salient region; if there are i training pictures in total and the numbers of SIFT feature points of the images are n1, n2, ..., ni, the total number of extracted SIFT features is (n1 + n2 + ... + ni).
Step 211 is executed: store all sample SIFT features in an original training matrix of size (n1 + n2 + ... + ni) × 128 and create the visual dictionary required by the BOW model with the k-means clustering algorithm, where k is the size of the visual dictionary, i.e., the dimension of the BOW histogram.
Step 230 is executed: map onto the visual dictionary and compute the BOW histogram of each training picture; each image is represented by a k-dimensional vector, and all training pictures can be stored in a new feature matrix of size i × k.
(2) Test process
Step 200 is executed: read in a test object picture and determine the salient region of the image.
Step 220 is executed: extract the SIFT features of the test samples within the salient region; if there are i test pictures in total and the numbers of SIFT feature points of the images are n1, n2, ..., ni, the total number of extracted SIFT features is (n1 + n2 + ... + ni).
Step 221 is executed: extract the SIFT features within the target region and project them onto the visual dictionary to form the BOW representation of the test image.
In the BOW model an image is represented by a vector computed from the frequencies with which its features map to the visual dictionary. For example, image j can be expressed as:

d_j = (n_{j,0}, n_{j,1}, ..., n_{j,k-1})   (6)

where k is the size of the visual dictionary and n_{j,i} (i = 0, 1, ..., k−1) is the frequency with which image j maps to the i-th visual word; this frequency is also called a code word.
To highlight the features that contribute most to image representation, the tf-idf weighting method from information retrieval is used here to generate weighted BOW features. The idea of tf is that a keyword appearing frequently in one article but rarely in others has high discriminative power and contributes strongly to classifying the article. The idea of idf is that if few files in the document database contain the word, the idf weight is larger, indicating that the word discriminates well. The tf-idf weight of word i in image j is:

w_{j,i} = w_{tf,j,i} × w_{idf,j,i} (i = 0, 1, ..., k−1)   (7)

In formula (7), w_{tf,j,i} is the weight of the contribution of the i-th word to classifying image j, and w_{idf,j,i} is the weight of the contribution of the i-th word, across the document database, to classifying image j. They are computed as follows:

w_{tf,j,i} = n_{j,i} / n_j   (8)

In formula (8), n_{j,i} is the number of times the i-th word occurs in image j and n_j is the total word count of image j, i.e.:

n_j = Σ_{i=0}^{k−1} n_{j,i}   (9)

w_{idf,j,i} = log(N / (n_d + 1))   (10)

In formula (10), N is the number of images in the training library and n_d is the number of images containing word d. A word's discriminative power in the library is inversely proportional to the frequency with which it appears across different images; to avoid division by zero, the denominator is taken as n_d + 1.
The weighted BOW feature of image j is denoted:

bof_j = w_{j,i} × d_j (i = 0, 1, ..., k−1)   (11)

In the bag-of-words model, images containing different numbers of features can all be expressed as vectors of fixed dimension, so object recognition can be realized by measuring the similarity between a test image and the images in the training library. The test vector is:

d_t = (n_{t,0}, n_{t,1}, ..., n_{t,k-1})

The weighted BOW feature of the test image, denoted bof_t, is computed in the same way as for training images. Euclidean distance measures the absolute distance between two feature vectors and is directly tied to the concrete feature values; cosine distance judges similarity by the angle between two feature vectors. Compared with Euclidean distance, cosine distance emphasizes the difference in direction between two feature vectors rather than in distance or magnitude, and is the usual choice for measuring differences between high-dimensional vectors; however, because it ignores absolute feature values, this insensitivity can introduce error. Therefore, this method measures similarity with an adjusted cosine distance dis: every dimension of each feature vector has the mean of that vector's values subtracted before the cosine is taken. The larger the adjusted cosine distance dis, the smaller the angle between the feature vectors, and the more similar the two images.
The visual dictionary used for the projection in the test process is the one established in the training process.
Embodiment three
As shown in Fig. 3, to find stable keypoints in the multi-scale space, adjacent levels of each pyramid octave are first differenced to construct the difference-of-Gaussians (DoG) scale space:

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ)

The DoG difference pyramid is obtained from the Gaussian pyramid.
As shown in Fig. 4, the local extreme points detected in the DoG space serve as keypoints. To find the extreme points, each pixel is compared with its 8 neighbors at the same scale and the 9 × 2 corresponding points at the two adjacent scales, 26 points in all, ensuring that extreme points are detected in both scale space and the 2-D image space; a pixel larger than all 26 points, or smaller than all of them, is declared a keypoint.
Embodiment four
As shown in Fig. 5, the gradient-histogram statistic takes the extreme point as origin and accumulates the contributions of the pixels in a neighborhood to the extreme point's orientation. The histogram has 36 bins, one per 10 degrees; each neighborhood pixel's gradient magnitude is added to the bin corresponding to its gradient angle, and the height of a bin represents accumulated gradient magnitude. Fig. 5 is a schematic diagram, drawn with a 7-bin histogram, of the extreme point's principal orientation; the principal orientation of the neighborhood gradient histogram is the peak of the histogram.
Embodiment five
As shown in Figure 6, the left side shows an example of a SIFT descriptor formed from an 8 × 8 pixel region divided into 2 × 2 sub-regions. Each small cell represents one pixel of the scale space around the key point; the direction of an arrow represents the pixel's gradient direction, and its length represents the gradient magnitude. A gradient orientation histogram with 8 directions is computed in each 4 × 4 pixel window, and the accumulation over each gradient direction forms one seed point. The right side shows a key point formed by 4 seed points, each produced from the orientation histogram of a 4 × 4 pixel block, giving a feature vector of 2 × 2 × 8 = 32 dimensions. In practice a 16 × 16 pixel region is used, forming 4 × 4 = 16 seed points and a 4 × 4 × 8 = 128-dimensional feature vector.
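The descriptor assembly described above (16 × 16 patch, 4 × 4 seed points, 8 directions each, 128 dimensions) can be sketched as follows; the `sift_descriptor` helper is an illustrative assumption, not the patent's code:

```python
import numpy as np

def sift_descriptor(mag, ang_deg, n_sub=4, cell=4, n_bins=8):
    """Build a SIFT-style descriptor from a (n_sub*cell) x (n_sub*cell)
    gradient patch: one 8-bin orientation histogram per cell x cell
    sub-region (a 'seed point'), concatenated and L2-normalised."""
    desc = []
    for by in range(n_sub):
        for bx in range(n_sub):
            hist = np.zeros(n_bins)
            for y in range(cell):
                for x in range(cell):
                    yy, xx = by * cell + y, bx * cell + x
                    hist[int(ang_deg[yy, xx] % 360) // (360 // n_bins)] += mag[yy, xx]
            desc.append(hist)
    v = np.concatenate(desc)
    return v / (np.linalg.norm(v) + 1e-12)

# Random 16 x 16 gradient patch stands in for a real key-point neighbourhood.
rng = np.random.default_rng(0)
mag = rng.random((16, 16))
ang = rng.random((16, 16)) * 360
d = sift_descriptor(mag, ang)
print(d.shape)  # (128,) -> 4 x 4 seed points x 8 directions each
```

Full SIFT also rotates the patch to the key point's principal orientation and applies trilinear interpolation between bins, both omitted here for brevity.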
For a better understanding of the present invention, the above detailed description is given in conjunction with specific embodiments, but it does not limit the invention. Any simple modification made to the above embodiments according to the technical essence of the present invention still falls within the scope of the technical solution of the invention. Each embodiment in this specification focuses on its differences from the other embodiments; the same or similar parts of the embodiments may be cross-referenced. Since the system embodiments substantially correspond to the method embodiments, their description is relatively brief, and the relevant parts may refer to the description of the method embodiments.

The method, apparatus, and system of the present invention may be implemented in many ways, for example by software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the method steps is for illustration only; the steps of the method of the present invention are not limited to that order unless otherwise specified. Furthermore, in some embodiments the present invention may be embodied as programs recorded on a recording medium, the programs comprising machine-readable instructions for implementing the method according to the invention. Thus, the present invention also covers a recording medium storing a program for performing the method according to the invention.

The description of the present invention is given by way of example and illustration, and is not intended to be exhaustive or to limit the invention to the disclosed form. Many modifications and variations will be obvious to those of ordinary skill in the art. The embodiments were selected and described in order to better explain the principles and practical applications of the invention, and to enable those of ordinary skill in the art to understand the invention and to design various embodiments with various modifications suited to particular uses.
Claims (10)
1. An object identification method based on a salient region bag-of-words model, comprising the following steps:
Step 1: corner detection;
Step 2: locating the salient region of the image;
Step 3: SIFT feature extraction;
Step 4: comparing the similarity of image region features.
2. The object identification method based on a salient region bag-of-words model according to claim 1, wherein the corners are Shi-Tomasi corners, computed from the rate of change of the gradient direction.
3. The object identification method based on a salient region bag-of-words model according to claim 2, wherein the Shi-Tomasi corners are points where the image brightness changes sharply or the curvature is very large.
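A minimal sketch of the Shi-Tomasi response, the smaller eigenvalue of the local structure tensor, which is large only where brightness changes sharply in both directions. Plain NumPy gradients are assumed and the `shi_tomasi_response` name is illustrative:

```python
import numpy as np

def shi_tomasi_response(img, win=3):
    """Shi-Tomasi corner response: the smaller eigenvalue of the windowed
    structure tensor [[Sxx, Sxy], [Sxy, Syy]] at each pixel."""
    gy, gx = np.gradient(img.astype(float))
    ixx, iyy, ixy = gx * gx, gy * gy, gx * gy
    h, w = img.shape
    r = np.zeros((h, w))
    k = win // 2
    for y in range(k, h - k):
        for x in range(k, w - k):
            sxx = ixx[y - k:y + k + 1, x - k:x + k + 1].sum()
            syy = iyy[y - k:y + k + 1, x - k:x + k + 1].sum()
            sxy = ixy[y - k:y + k + 1, x - k:x + k + 1].sum()
            # closed-form smaller eigenvalue of the 2x2 symmetric tensor
            r[y, x] = 0.5 * (sxx + syy - np.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2))
    return r

# A white square on black: a corner responds more than an edge midpoint.
img = np.zeros((12, 12))
img[4:8, 4:8] = 1.0
resp = shi_tomasi_response(img)
print(resp[4, 4] > resp[4, 6])  # True: corner beats edge
```

In practice one would use an optimized routine (e.g. OpenCV's `goodFeaturesToTrack`) rather than this explicit double loop.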
4. The object identification method based on a salient region bag-of-words model according to claim 1, wherein step 2 locates the key region of the image, which is the region over which the corners are distributed in the image.
5. The object identification method based on a salient region bag-of-words model according to claim 4, wherein the locating method is: divide the image into m × n blocks, count the number of corners in each block, and record the corner count of each block in an m × n matrix. If the number of corners in a block is ≥ q, the contiguous concentrated area where those corners lie is taken to be the key region of the image, where q is a threshold on the number of corners per block, used to screen out background areas containing only isolated or few corners.
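The blocking-and-thresholding scheme of claim 5 can be sketched as follows; the helper names, image size, and corner coordinates are illustrative assumptions:

```python
import numpy as np

def corner_block_matrix(corners, h, w, m, n):
    """Divide an h x w image into m x n blocks and count corners per block."""
    counts = np.zeros((m, n), dtype=int)
    for (y, x) in corners:
        counts[min(y * m // h, m - 1), min(x * n // w, n - 1)] += 1
    return counts

def salient_blocks(counts, q):
    """Blocks holding at least q corners form the candidate salient region;
    sparse blocks (isolated corners, background) are screened out."""
    return counts >= q

# Hypothetical detections clustered in the top-left of a 100 x 100 image.
corners = [(5, 5), (8, 12), (12, 9), (70, 80)]
counts = corner_block_matrix(corners, 100, 100, 4, 4)
print(counts[0, 0])                       # 3 corners in the first block
print(bool(salient_blocks(counts, 2)[0, 0]))  # True: block passes q = 2
```

The isolated corner at (70, 80) leaves its block below the threshold, so that background block is discarded.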
6. The object identification method based on a salient region bag-of-words model according to claim 1, wherein step 3 comprises DoG extreme point extraction and feature vector formation.
7. The object identification method based on a salient region bag-of-words model according to claim 6, wherein the method of DoG extreme point extraction is to scale the original image so as to obtain a sequence of multi-scale image-space representations, completing feature extraction at different resolutions.
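One octave of the multi-scale representation behind claim 7 can be sketched as follows, matching the earlier relation D = L(kσ) − L(σ). The separable-blur helper is an assumption made for a self-contained example:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with a truncated 1-D kernel."""
    r = int(3 * sigma) + 1
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    tmp = np.apply_along_axis(lambda row: np.convolve(row, k, 'same'), 1, img)
    return np.apply_along_axis(lambda col: np.convolve(col, k, 'same'), 0, tmp)

def dog_octave(img, sigma=1.6, k=2 ** 0.5, levels=4):
    """One octave of the DoG pyramid: successive Gaussian blurs of the
    same image, with adjacent levels subtracted (D = L(k*sigma) - L(sigma))."""
    blurred = [gaussian_blur(img, sigma * k ** i) for i in range(levels)]
    return [blurred[i + 1] - blurred[i] for i in range(levels - 1)]

img = np.random.default_rng(1).random((32, 32))
dogs = dog_octave(img)
print(len(dogs), dogs[0].shape)  # 3 (32, 32)
```

A full pyramid would downsample the image between octaves, giving the large-scale (low-resolution) and small-scale (high-resolution) representations of claims 8-10.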
8. The object identification method based on a salient region bag-of-words model according to claim 7, wherein the scales include at least one of a large scale and a small scale.
9. The object identification method based on a salient region bag-of-words model according to claim 8, wherein the large scale (low resolution) reflects the overall appearance of the object.
10. The object identification method based on a salient region bag-of-words model according to claim 9, wherein the small scale (high resolution) reflects the fine details of the object.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610921396.2A CN106557779A (en) | 2016-10-21 | 2016-10-21 | A kind of object identification method based on marking area bag of words |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106557779A true CN106557779A (en) | 2017-04-05 |
Family
ID=58443857
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610921396.2A Pending CN106557779A (en) | 2016-10-21 | 2016-10-21 | A kind of object identification method based on marking area bag of words |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106557779A (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100169576A1 (en) * | 2008-12-31 | 2010-07-01 | Yurong Chen | System and method for sift implementation and optimization |
CN101840507A (en) * | 2010-04-09 | 2010-09-22 | 江苏东大金智建筑智能化***工程有限公司 | Target tracking method based on character feature invariant and graph theory clustering |
CN102043960A (en) * | 2010-12-03 | 2011-05-04 | 杭州淘淘搜科技有限公司 | Image grey scale and gradient combining improved sift characteristic extracting method |
CN102156888A (en) * | 2011-04-27 | 2011-08-17 | 西安电子科技大学 | Image sorting method based on local colors and distribution characteristics of characteristic points |
CN102682132A (en) * | 2012-05-18 | 2012-09-19 | 合一网络技术(北京)有限公司 | Method and system for searching information based on word frequency, play amount and creation time |
US20120301014A1 (en) * | 2011-05-27 | 2012-11-29 | Microsoft Corporation | Learning to rank local interest points |
CN102865859A (en) * | 2012-09-21 | 2013-01-09 | 西北工业大学 | Aviation sequence image position estimating method based on SURF (Speeded Up Robust Features) |
CN103077512A (en) * | 2012-10-18 | 2013-05-01 | 北京工业大学 | Feature extraction and matching method and device for digital image based on PCA (principal component analysis) |
CN103295240A (en) * | 2013-06-26 | 2013-09-11 | 山东农业大学 | Method for evaluating similarity of free-form surfaces |
CN103530603A (en) * | 2013-09-24 | 2014-01-22 | 杭州电子科技大学 | Video abnormality detection method based on causal loop diagram model |
CN104166853A (en) * | 2014-04-24 | 2014-11-26 | 中国人民解放军海军航空工程学院 | Method for quickly extracting regularized ship section from high resolution remote sensing image |
CN105631892A (en) * | 2016-02-23 | 2016-06-01 | 武汉大学 | Aviation image building damage detection method based on shadow and texture characteristics |
CN105631471A (en) * | 2015-12-23 | 2016-06-01 | 西安电子科技大学 | Aurora sequence classification method with fusion of single frame feature and dynamic texture model |
CN105856230A (en) * | 2016-05-06 | 2016-08-17 | 简燕梅 | ORB key frame closed-loop detection SLAM method capable of improving consistency of position and pose of robot |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Sun et al. | PBNet: Part-based convolutional neural network for complex composite object detection in remote sensing imagery | |
Liao et al. | Rotation-sensitive regression for oriented scene text detection | |
Mu et al. | Discriminative local binary patterns for human detection in personal album | |
CN108319964B (en) | Fire image recognition method based on mixed features and manifold learning | |
CN101976258B (en) | Video semantic extraction method by combining object segmentation and feature weighing | |
CN103077512B (en) | Based on the feature extracting and matching method of the digital picture that major component is analysed | |
JP5604256B2 (en) | Human motion detection device and program thereof | |
CN101714254A (en) | Registering control point extracting method combining multi-scale SIFT and area invariant moment features | |
CN102622607A (en) | Remote sensing image classification method based on multi-feature fusion | |
CN110263712A (en) | A kind of coarse-fine pedestrian detection method based on region candidate | |
CN103390164A (en) | Object detection method based on depth image and implementing device thereof | |
CN104182973A (en) | Image copying and pasting detection method based on circular description operator CSIFT (Colored scale invariant feature transform) | |
CN104574401A (en) | Image registration method based on parallel line matching | |
CN102945374B (en) | Method for automatically detecting civil aircraft in high-resolution remote sensing image | |
CN106682641A (en) | Pedestrian identification method based on image with FHOG- LBPH feature | |
Li et al. | Place recognition based on deep feature and adaptive weighting of similarity matrix | |
CN104881671A (en) | High resolution remote sensing image local feature extraction method based on 2D-Gabor | |
CN103632149A (en) | Face recognition method based on image feature analysis | |
CN110826575A (en) | Underwater target identification method based on machine learning | |
Qi et al. | Exploring illumination robust descriptors for human epithelial type 2 cell classification | |
CN107784263A (en) | Based on the method for improving the Plane Rotation Face datection for accelerating robust features | |
CN105760828A (en) | Visual sense based static gesture identification method | |
Su et al. | Object detection in aerial images using a multiscale keypoint detection network | |
CN110097067B (en) | Weak supervision fine-grained image classification method based on layer-feed feature transformation | |
CN105139013A (en) | Object recognition method integrating shape features and interest points |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20170405 |