CN106557779A - An object recognition method based on a salient-region bag-of-words model - Google Patents

An object recognition method based on a salient-region bag-of-words model

Info

Publication number
CN106557779A
CN106557779A (application CN201610921396.2A)
Authority
CN
China
Prior art keywords
image
bag of words
salient region
object recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610921396.2A
Other languages
Chinese (zh)
Inventor
袁家政
刘宏哲
郭燕飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Union University
Original Assignee
Beijing Union University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Union University
Priority to CN201610921396.2A
Publication of CN106557779A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions, with a fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides an object recognition method based on a salient-region bag-of-words model, comprising the following steps: corner detection; locating the salient region of the image; SIFT feature extraction; and image-region feature similarity comparison. Because the method extracts local features only within the target region, it avoids complex image segmentation techniques on the one hand, and greatly reduces the number of feature points unrelated to the object on the other.

Description

An object recognition method based on a salient-region bag-of-words model
Technical field
The present invention relates to the technical field of digital image processing, and in particular to an object recognition method based on a salient-region bag-of-words model.
Background art
With the rapid development of machine learning and pattern recognition, computer vision techniques have steadily improved, making it possible to use computers to imitate human cognitive abilities and thereby relieve or assist people in routine work. Object recognition has become a particularly important research direction in pattern recognition, with broad demand and application in both military and civilian fields, for example intelligent video surveillance, autonomous-driving navigation, human-computer interaction, and content-based retrieval of massive image collections on the Internet.
Accurately and effectively recognizing objects, so as to meet the growing demands of computer vision, psychology, and practical applications, remains a challenging task. Research shows that about 60% of the information humans obtain from the outside world comes through vision. In computer vision, visual information mainly takes the form of images and video, with images as its primary carrier. Analyzing images, simulating the human visual system, cognitively recognizing the myriad variety of objects, deciding which features to extract, building effective object representations, and establishing reasonably simple object models that can better distinguish one object from another: these are all key problems in recognizing objects.
Object recognition is one of the most active research topics in computer vision, and recognizing objects quickly and accurately is an important research direction. Object recognition is often disturbed by factors such as viewpoint, scale, occlusion, and background clutter. To cope with these challenges, many researchers have proposed building feature descriptions of images from local features. The bag-of-words (BOW) representation based on local keypoints has shown good performance in a variety of visual classification tasks. Traditional recognition methods based on the bag-of-words model are generally implemented with SIFT descriptors, k-means clustering, and a classifier. One defect of the bag-of-words model is that foreground and background are not separated, so some of the visual words that form the BOW representation are extracted from background parts. SIFT descriptors yield a huge number of feature points and resist the effects of scale change and rotation well, and this quantity effectively guarantees the adequacy and robustness of the image feature representation. As the representative local feature descriptor, SIFT also has its shortcomings: keypoints are extracted over the entire image, so many of the detected interest points come from the background. Feature points detected by generic interest-point operators, on the other hand, suffer from insufficient numbers, making the image feature representation not rich enough, although the detected points do tend to concentrate on the target.
The patent document of publication No. CN105654122A discloses a spatial-pyramid object recognition method based on kernel-function matching. It comprises the following steps: extract ED-SIFT (Efficient Dense Scale-Invariant Feature Transform) descriptors of the object images; cluster the ED-SIFT descriptors of the training samples with the k-means++ clustering algorithm to obtain a visual dictionary; introduce a spatial pyramid and use a kernel function to match the visual-word histograms of the training and test samples; and complete the training and the test-sample recognition with an SVM classifier. The ED-SIFT descriptors proposed by that method extract keypoints over the entire image, so many of the detected interest points come from background parts, leaving foreground and background unseparated. In addition, k-means++, as an unsupervised learning algorithm, is sensitive to abnormal data: once outliers appear in the data set, they have a non-negligible impact on the experimental results. It also relies on the choice of the value of k, which must be determined in advance, and the choice of k is crucial to classification: its suitability directly determines the quality of the classification.
Content of the invention
To solve the above technical problems, in view of the fact that the local features conventionally used to build visual dictionaries are unstable, unreliable, or unrelated to the object, the present invention proposes an object recognition method based on a salient-region bag-of-words model. First, the method uses a strong-corner detector to determine the salient region of the image. Local features are then extracted from the salient region and modeled as a bag of words, and finally a nearest-neighbor classifier gives the recognition result.
The present invention provides an object recognition method based on a salient-region bag-of-words model, comprising the following steps:
Step 1: corner detection;
Step 2: locating the salient region of the image;
Step 3: SIFT feature extraction;
Step 4: image-region feature similarity comparison.
Preferably, the corners are Shi-Tomasi corners, computed from the rate of change of the gradient direction.
In any of the above schemes, preferably, the Shi-Tomasi corners are points where the image brightness changes sharply or the curvature is very large.
In any of the above schemes, preferably, step 2 converts locating the key region of the image into locating the region over which the corners are distributed.
In any of the above schemes, preferably, the localization method is: divide the image into m × n blocks, count the corners in each block, and record the corner count of each block in an m × n matrix. If the number of corners in a block is ≥ q, the contiguous concentrated area where those corners lie is considered the key region of the image; here q is a threshold on the per-block corner count, used to filter out background areas containing only isolated or few corners.
In any of the above schemes, preferably, step 3 comprises DoG extreme-point extraction and feature-vector formation.
In any of the above schemes, preferably, the DoG extreme-point extraction method is to scale the original image, obtain a multi-scale image-space representation sequence, and perform feature extraction at the different resolutions.
In any of the above schemes, preferably, the scales include at least one of a large scale and a small scale.
In any of the above schemes, preferably, the large scale (low resolution) captures the overall appearance of the object.
In any of the above schemes, preferably, the small scale (high resolution) captures the fine details of the object.
In any of the above schemes, preferably, the scale space L(x, y, σ) of an image is defined as the convolution of the original image I(x, y) with a variable-scale 2-D Gaussian function G(x, y, σ), where σ denotes the scale. The Gaussian function is defined as:

G(x, y, σ) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))

Convolving the Gaussian kernel with the original image gives the scale space, defined as:

L(x, y, σ) = G(x, y, σ) * I(x, y)

To find stable keypoints in multi-scale space, an image pyramid must first be built; adjacent layers of the pyramid are then differenced to construct the difference-of-Gaussian (DoG) scale space:

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ)
In any of the above schemes, preferably, the gradient-histogram statistics method is used: with the extreme point as origin, the contributions that the pixels in a neighborhood make to the orientation of the extreme point are accumulated. The histogram has 36 bins, one per 10 degrees; the magnitude of each point in the neighborhood is added to the bin corresponding to its angle, and the height of a bin represents the accumulated gradient magnitude.
In any of the above schemes, preferably, the feature-vector formation method is: the extreme points found in the DoG scale space possess scale invariance; using the gradient-orientation distribution of the pixels in each extreme point's neighborhood, the dominant orientation of the extreme point is determined by histogram statistics and assigned to it as an orientation parameter, thereby obtaining rotation invariance. The gradient magnitude m(x, y) and gradient direction θ(x, y) of each pixel are computed as:

m(x, y) = √((L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²)

θ(x, y) = tan⁻¹((L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y)))

In any of the above schemes, preferably, before the keypoint descriptor vector is built, the coordinate axes are first rotated to the dominant orientation of the keypoint; then, centered on the keypoint, the gradient-orientation histograms of 4 × 4 subregions are computed, where each subregion is an image block of 4 × 4 pixels, so the overall window is 16 × 16 pixels.
In any of the above schemes, preferably, step 4 subjects the images in the image library to a training process and a test process.
In any of the above schemes, preferably, the training process comprises the following steps:
Step a1: read in a training object picture and determine the salient region of the image;
Step a2: extract the SIFT features of the training samples within the salient region; if there are i training pictures in total and the numbers of SIFT feature points of the individual images are n1, n2, ..., ni, the total number of extracted SIFT features is (n1 + n2 + … + ni);
Step a3: store the SIFT features of all samples in an original training matrix of size (n1 + n2 + … + ni) × 128, and create the visual dictionary required by the BOW model with the k-means clustering algorithm; k is the size of the visual dictionary, i.e. the dimension of the BOW histogram;
Step a4: map onto the visual dictionary and compute the BOW histogram of each training picture; each image is represented by a k-dimensional vector, and all training pictures can be stored in a new i × k feature matrix.
In any of the above schemes, preferably, the test process comprises the following steps:
Step b1: read in a test object picture and determine the salient region of the image;
Step b2: extract the SIFT features of the test samples within the salient region; if there are i test pictures in total and the numbers of SIFT feature points of the individual images are n1, n2, ..., ni, the total number of extracted SIFT features is (n1 + n2 + … + ni);
Step b3: extract SIFT features in the target region and project the extracted features onto the visual dictionary to form the BOW representation of the test image; the visual dictionary onto which the test process projects is the one established by the training process;
Step b4: map onto the visual dictionary and compute the BOW histogram of each test picture; each image is represented by a k-dimensional vector, and all test pictures can be stored in a new i × k feature matrix.
In any of the above schemes, preferably, in the BOW model an image is represented by a vector computed from the frequencies with which its feature words map onto the visual dictionary. For example, image j can be expressed as:

dj = (nj,0, nj,1, ..., nj,k−1)

where k is the size of the visual dictionary and nj,i (i = 0, 1, ..., k−1) is the frequency with which image j maps to the i-th visual word; these frequencies are also called code words.
In any of the above schemes, preferably, the tf-idf weighting method from information retrieval is used to generate weighted BOW features.
In any of the above schemes, preferably, tf refers to the idea that if a keyword occurs frequently in one article and rarely in other articles, the word has high discrimination and contributes strongly to classifying the article.
In any of the above schemes, preferably, idf refers to the idea that if few files in the document database contain the word, its idf weight is larger, indicating that the word discriminates well.
In any of the above schemes, preferably, the tf-idf weight of word i in image j is:

wj,i = wtf,j,i × widf,j,i (i = 0, 1, ..., k−1)

where wtf,j,i is the contribution weight of the i-th word to classifying image j, and widf,j,i is the contribution weight of the i-th word, over the document database, to classifying image j.
In any of the above schemes, preferably, these weights are specifically computed as:

wtf,j,i = nj,i / nj

where nj,i is the number of times the i-th word occurs in image j, and nj is the total word count of image j, i.e. nj = nj,0 + nj,1 + … + nj,k−1; and

widf,j,i = log(N / (nd + 1))

where N is the number of images in the training library and nd is the number of images containing word d. A word's discrimination in the library is inversely proportional to the frequency with which it appears across different images. To prevent division by zero, the denominator takes nd + 1.
In any of the above schemes, preferably, the weighted BOW feature of image j is denoted:

bofj = wj,i × dj (i = 0, 1, ..., k−1)

In the bag-of-words model, images containing unequal numbers of features can all be expressed as vectors of a fixed dimension, so object recognition can be achieved by measuring the similarity between the test image and the images in the training library.
In any of the above schemes, preferably, the test vector is:

dt = (nt,0, nt,1, ..., nt,k−1)

The weighted BOW feature of a test image is denoted boft and is computed in the same way as for training images. Euclidean distance computes the absolute distance between two feature vectors and is directly tied to their concrete values; cosine distance judges similarity by the angle between two feature vectors. Compared with Euclidean distance, cosine distance emphasizes the difference of two feature vectors in direction rather than in distance or length. Cosine distance is usually adopted to measure differences between high-dimensional vectors, but it disregards the feature magnitudes, and this insensitivity can introduce error. Therefore, an adjusted cosine distance dis is used here to measure similarity: the mean of the feature values is subtracted from every dimension of the feature vector. The larger the adjusted cosine distance dis, the smaller the angle between the feature vectors, and the more similar the two images.
The method proposed by the present invention extracts local features within the target region, which on the one hand avoids complex image segmentation techniques and on the other hand greatly reduces the number of feature points unrelated to the object. It has the following features: few parameters, simple computation, fast processing, and good image recognition performance.
Description of the drawings
Fig. 1 is a flow chart of a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Fig. 2 is a flow chart of building the information library in a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Fig. 3 is the DoG scale-space construction diagram of a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Fig. 4 is the DoG scale-space extreme-point detection diagram of a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Fig. 5 is the orientation-histogram generation diagram of a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Fig. 6 is the keypoint feature-vector formation diagram of a preferred embodiment of the object recognition method based on the salient-region bag-of-words model according to the present invention.
Specific embodiment
The present invention is further elaborated below with specific embodiments, in conjunction with the accompanying drawings.
Embodiment one
The object recognition method based on the salient-region bag-of-words model comprises three parts: salient-region localization, feature extraction and description, and image-region feature similarity comparison.
Step 100 is executed: corner detection.
Shi-Tomasi corners are computed from the rate of change of the gradient direction: they are exactly the points where the image brightness changes sharply or the curvature is very large. The main idea is to determine the local structure of the image signal with the auto-correlation matrix. Assume the image signal at point x is I(x). Convolving the image signal with the derivatives of the Gaussian function G(x, σD) gives the first-order derivatives:

Lu(x, σD) = I(x) * Gu(x, σD) (1)

Lv(x, σD) = I(x) * Gv(x, σD) (2)

LuLv(x, σD) = Lu(x, σD) · Lv(x, σD) (3)

where σD is the differentiation scale. Using formulas (1)-(3), the auto-correlation matrix is obtained:

C(x, σI, σD) = σD² · G(σI) * [ Lu²(x, σD)   LuLv(x, σD)
                               LuLv(x, σD)  Lv²(x, σD) ]

where σI is the integration scale and G is the Gaussian operator.
The eigenvalues λ1, λ2 of the correlation matrix C are computed and min(λ1, λ2) is retained, giving the eigenvalue image of the picture. During Shi-Tomasi corner extraction with a given threshold λ: if a point in the image satisfies min(λ1, λ2) > λ, the point is considered a strong corner.
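The following Python sketch illustrates this step with OpenCV's Shi-Tomasi detector (cv2.goodFeaturesToTrack), which keeps the points whose smaller auto-correlation eigenvalue min(λ1, λ2) exceeds a quality threshold. It is a minimal sketch only; the file name and parameter values are illustrative assumptions, not values fixed by the invention.

    import cv2
    import numpy as np

    img = cv2.imread("object.jpg")                 # hypothetical input image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # qualityLevel plays the role of the threshold on min(lambda1, lambda2),
    # expressed as a fraction of the strongest corner response in the image.
    corners = cv2.goodFeaturesToTrack(
        gray,
        maxCorners=500,      # upper bound on the number of strong corners
        qualityLevel=0.01,   # relative threshold on the smaller eigenvalue
        minDistance=5,       # minimum spacing between corners, in pixels
    )
    corners = corners.reshape(-1, 2).astype(int)   # (x, y) integer positions
    print(f"{len(corners)} strong corners detected")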
Step 110 is executed: locate the salient region of the image.
Locating the key region of the image is converted into locating the region over which the corners are distributed. The localization method is: divide the image into m × n blocks, count the corners in each block, and record the corner count of each block in an m × n matrix. If the number of corners in a block is ≥ q, the contiguous concentrated area where those corners lie is considered the key region of the image; here q is a threshold on the per-block corner count, used to filter out background areas containing only isolated or few corners.
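A minimal NumPy sketch of this localization step follows, under assumed values of m, n, and q; the bounding box of the dense blocks is used as a simple stand-in for the contiguous concentrated area. `corners` is the array from the corner-detection sketch above.

    import numpy as np

    def locate_salient_region(corners, img_shape, m=8, n=8, q=5):
        h, w = img_shape[:2]
        counts = np.zeros((m, n), dtype=int)         # m x n corner-count matrix
        for x, y in corners:
            counts[min(y * m // h, m - 1), min(x * n // w, n - 1)] += 1
        dense = counts >= q                          # blocks with >= q corners
        if not dense.any():                          # fall back to the whole image
            return 0, 0, w, h
        rows, cols = np.nonzero(dense)
        # Bounding box of the dense blocks, mapped back to pixel coordinates.
        x0, x1 = cols.min() * w // n, (cols.max() + 1) * w // n
        y0, y1 = rows.min() * h // m, (rows.max() + 1) * h // m
        return x0, y0, x1, y1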
Step 120 is executed: SIFT feature extraction, which consists of the following two parts:
(1) DoG extreme-point extraction.
The sparse set of local feature points produced by a feature detector is the basis for building the BOW. To represent an object accurately, it must be described at an appropriate scale. The main idea of scale-space theory is to scale the original image so as to obtain a multi-scale image-space representation sequence, and to perform feature extraction at the different resolutions. A large scale (low resolution) captures the overall appearance of an object; a small scale (high resolution) captures its fine details. The scale space L(x, y, σ) of an image is defined as the convolution of the original image I(x, y) with a variable-scale 2-D Gaussian function G(x, y, σ), where σ denotes the scale. The Gaussian function is defined as:

G(x, y, σ) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
Convolving the Gaussian kernel with the original image gives the scale space, defined as:

L(x, y, σ) = G(x, y, σ) * I(x, y) (6)
To find stable keypoints in multi-scale space, an image pyramid must first be built; adjacent layers of the pyramid are then differenced to construct the difference-of-Gaussian (DoG) scale space:

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ) (7)
The local extreme points detected in the DoG space serve as keypoints. To find the extreme points, each pixel is compared with its 8 neighbors at the same scale and the 9 × 2 corresponding points at the two adjacent scales, 26 points in total, to ensure that extreme points are detected in both scale space and the 2-D image space. A pixel that is larger than all 26 points, or smaller than all of them, is identified as a keypoint.
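The following NumPy/OpenCV sketch illustrates one octave of this construction and the 26-neighbor extremum test of formulas (6)-(7); sigma and the number of levels are illustrative values, not parameters prescribed by the invention, and `gray` is the image from the earlier sketch.

    import cv2
    import numpy as np

    def dog_octave(gray, sigma=1.6, levels=5):
        # levels - 3 usable intervals per octave, so the scale step is:
        k = 2 ** (1.0 / (levels - 3))
        gaussians = [
            cv2.GaussianBlur(gray.astype(np.float32), (0, 0), sigma * k ** i)
            for i in range(levels)
        ]
        # D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma)
        return [g2 - g1 for g1, g2 in zip(gaussians, gaussians[1:])]

    def is_extremum(dogs, s, y, x):
        # Compare a pixel with its 26 neighbours: the 3 x 3 x 3 cube spanning
        # its own DoG level and the two adjacent levels (the cube includes
        # the pixel itself, so max/min here means ">= / <= all neighbours").
        cube = np.stack([d[y - 1:y + 2, x - 1:x + 2] for d in dogs[s - 1:s + 2]])
        v = dogs[s][y, x]
        return v == cube.max() or v == cube.min()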
(2) Feature-vector formation.
The extreme points obtained in the DoG scale space possess scale invariance. Using the gradient-orientation distribution of the pixels in each extreme point's neighborhood, the dominant orientation is determined by histogram statistics and assigned to the point as an orientation parameter, yielding rotation invariance. The gradient magnitude m(x, y) and gradient direction θ(x, y) of each pixel are computed with formulas (4) and (5):

m(x, y) = √((L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²) (4)

θ(x, y) = tan⁻¹((L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y))) (5)

The gradient-histogram statistics method is used: with the extreme point as origin, the contributions that the pixels in a neighborhood make to the orientation of the extreme point are accumulated. The histogram has 36 bins, one per 10 degrees; the magnitude of each point in the neighborhood is added to the bin corresponding to its angle, and the height of a bin represents the accumulated gradient magnitude.
Before the keypoint descriptor vector is built, the coordinate axes are first rotated to the dominant orientation of the keypoint; then, centered on the keypoint, the gradient-orientation histograms of 4 × 4 subregions are computed, where each subregion is an image block of 4 × 4 pixels, so the overall window is 16 × 16 pixels.
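In practice the whole of step 120 can be approximated with OpenCV's built-in SIFT, which performs the DoG extremum detection, orientation assignment, and 128-dimensional descriptor formation described above. The minimal sketch below restricts it to the salient region with a mask and reuses `gray`, `corners`, and `locate_salient_region` from the earlier sketches.

    import cv2
    import numpy as np

    # Restrict SIFT to the salient region found earlier by passing a mask.
    x0, y0, x1, y1 = locate_salient_region(corners, gray.shape)
    mask = np.zeros_like(gray, dtype=np.uint8)
    mask[y0:y1, x0:x1] = 255

    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, mask)
    print(descriptors.shape)        # (number of keypoints, 128)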
Step 130 is executed: image-region feature similarity comparison.
The images in the image library go through a training process and a test process, finally giving the recognized object.
Embodiment two
As shown in Fig. 2, image-region feature similarity comparison comprises a training process and a test process over the images in the image library.
(1) training process
Step 200 is executed: a training object picture is read in and the salient region of the image is determined.
Step 210 is executed: the SIFT features of the training samples are extracted within the salient region. If there are i training pictures in total and the numbers of SIFT feature points of the individual images are n1, n2, ..., ni, the total number of extracted SIFT features is (n1 + n2 + … + ni).
Step 211 is executed: an original training matrix of size (n1 + n2 + … + ni) × 128 stores the SIFT features of all samples, and the visual dictionary required by the BOW model is created with the k-means clustering algorithm; k is the size of the visual dictionary, i.e. the dimension of the BOW histogram.
Step 230 is executed: the features are mapped onto the visual dictionary and the BOW histogram of each training picture is computed; each image is represented by a k-dimensional vector, and all training pictures can be stored in a new i × k feature matrix.
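A minimal sketch of steps 211 and 230 with scikit-learn's KMeans follows; `train_descriptors` is assumed to be a list of per-image descriptor arrays produced by the extraction step, and k = 200 is an illustrative dictionary size.

    import numpy as np
    from sklearn.cluster import KMeans

    k = 200                                   # illustrative visual-dictionary size
    all_desc = np.vstack(train_descriptors)   # (n1 + ... + ni) x 128 training matrix
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(all_desc)

    def bow_histogram(desc, kmeans, k):
        words = kmeans.predict(desc)          # nearest visual word per descriptor
        return np.bincount(words, minlength=k)

    # i x k matrix: one k-dimensional BOW histogram per training image.
    train_bow = np.array([bow_histogram(d, kmeans, k) for d in train_descriptors])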
(2) test process
Step 200 is executed: a test object picture is read in and the salient region of the image is determined.
Step 220 is executed: the SIFT features of the test samples are extracted within the salient region. If there are i test pictures in total and the numbers of SIFT feature points of the individual images are n1, n2, ..., ni, the total number of extracted SIFT features is (n1 + n2 + … + ni).
Step 221 is executed: SIFT features are extracted in the target region, and the extracted features are projected onto the visual dictionary to form the BOW representation of the test image.
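Continuing the training sketch above, step 221 reduces to assigning the test image's descriptors to the nearest visual words of the same `kmeans` model built during training:

    # `test_descriptors` is assumed to be the descriptor array extracted from
    # the salient region of the test image, as in the extraction sketch above.
    test_bow = bow_histogram(test_descriptors, kmeans, k)   # k-dimensional BOW vector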
In the BOW model, an image is represented by a vector computed from the frequencies with which its feature words map onto the visual dictionary. For example, image j can be expressed as:

dj = (nj,0, nj,1, ..., nj,k−1) (6)

where k is the size of the visual dictionary and nj,i (i = 0, 1, ..., k−1) is the frequency with which image j maps to the i-th visual word; these frequencies are also called code words.
To emphasize the feature words that contribute most to the image representation, the tf-idf weighting method from information retrieval is used here to generate weighted BOW features. The idea of tf is: if a keyword occurs frequently in one article and rarely in other articles, the word has high discrimination and contributes strongly to classifying the article. The idea of idf is: if few files in the document database contain the word, its idf weight is larger, indicating that the word discriminates well. The tf-idf weight of word i in image j is:

wj,i = wtf,j,i × widf,j,i (i = 0, 1, ..., k−1) (7)
In formula (7), wtf,j,i is the contribution weight of the i-th word to classifying image j, and widf,j,i is the contribution weight of the i-th word, over the document database, to classifying image j. They are specifically computed as:

wtf,j,i = nj,i / nj (8)

In formula (8), nj,i is the number of times the i-th word occurs in image j, and nj is the total word count of image j, i.e.:

nj = nj,0 + nj,1 + … + nj,k−1 (9)

widf,j,i = log(N / (nd + 1)) (10)

In formula (10), N is the number of images in the training library and nd is the number of images containing word d. A word's discrimination in the library is inversely proportional to the frequency with which it appears across different images. To prevent division by zero, the denominator takes nd + 1.
The weighted BOW feature of image j is denoted:

bofj = wj,i × dj (i = 0, 1, ..., k−1) (11)
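A minimal NumPy sketch of the weighting in formulas (7)-(11) follows. Note that, following formula (11), the weight wj,i multiplies the raw frequency vector dj, and that N and nd are statistics of the training library (applied unchanged when weighting a test image). `train_bow` is the i × k matrix from the training sketch.

    import numpy as np

    def tfidf_bow(bow, n_docs_per_word, n_images):
        tf = bow / max(bow.sum(), 1)                    # w_tf: n_ji / n_j (formula 8)
        idf = np.log(n_images / (n_docs_per_word + 1))  # w_idf: log(N / (n_d + 1))
        return tf * idf * bow                           # bof_j = w_ji * d_j (formula 11)

    n_images = train_bow.shape[0]                       # N: images in the training library
    n_docs_per_word = (train_bow > 0).sum(axis=0)       # n_d per visual word
    train_bof = np.array([tfidf_bow(b, n_docs_per_word, n_images) for b in train_bow])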
In the bag-of-words model, images containing unequal numbers of features can all be expressed as vectors of a fixed dimension, so object recognition can be achieved by measuring the similarity between the test image and the images in the training library. The test vector is:

dt = (nt,0, nt,1, ..., nt,k−1)

The weighted BOW feature of a test image is denoted boft and is computed in the same way as for training images. Euclidean distance computes the absolute distance between two feature vectors and is directly tied to their concrete values; cosine distance judges similarity by the angle between two feature vectors. Compared with Euclidean distance, cosine distance emphasizes the difference of two feature vectors in direction rather than in distance or length. Cosine distance is usually adopted to measure differences between high-dimensional vectors, but it disregards the feature magnitudes, and this insensitivity can introduce error. Therefore, an adjusted cosine distance dis is used here to measure similarity: the mean of the feature values is subtracted from every dimension of the feature vector. The larger the adjusted cosine distance dis, the smaller the angle between the feature vectors, and the more similar the two images.
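A minimal sketch of the adjusted cosine similarity and the nearest-neighbor decision follows; `train_bof` and `train_labels` are assumed to come from the training sketches above.

    import numpy as np

    def adjusted_cosine(a, b):
        a = a - a.mean()                    # subtract the mean of the feature values
        b = b - b.mean()
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

    def recognize(test_bof, train_bof, train_labels):
        # The training image with the largest adjusted cosine dis is the match.
        scores = [adjusted_cosine(test_bof, t) for t in train_bof]
        return train_labels[int(np.argmax(scores))]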
The visual dictionary onto which the test process projects is the one established by the training process.
Embodiment three
As shown in Fig. 3, to find stable keypoints in multi-scale space, adjacent layers of the pyramid are first differenced to construct the difference-of-Gaussian (DoG) scale space:

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ)

The DoG difference pyramid is obtained from the Gaussian pyramid.
As shown in Fig. 4, the local extreme points detected in the DoG space serve as keypoints. To find the extreme points, each pixel is compared with its 8 neighbors at the same scale and the 9 × 2 corresponding points at the two adjacent scales, 26 points in total, to ensure that extreme points are detected in both scale space and the 2-D image space. A pixel that is larger than all 26 points, or smaller than all of them, is identified as a keypoint.
Embodiment four
As shown in Fig. 5, the gradient-histogram statistics method is used: with the extreme point as origin, the contributions that the pixels in a neighborhood make to the orientation of the extreme point are accumulated. The histogram has 36 bins, one per 10 degrees; the magnitude of each point in the neighborhood is added to the bin corresponding to its angle, and the height of a bin represents the accumulated gradient magnitude. The figure is a schematic of an extreme point's dominant orientation drawn with a 7-bin histogram; the dominant orientation of the neighborhood gradient histogram is the peak of the histogram.
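A minimal NumPy sketch of the 36-bin dominant-orientation computation follows; the neighborhood radius is an illustrative value, and the point (x, y) is assumed to lie at least radius + 1 pixels from the image border. `L` is a 2-D array of the scale-space level containing the extreme point.

    import numpy as np

    def dominant_orientation(L, y, x, radius=8):
        patch = L[y - radius:y + radius + 1, x - radius:x + radius + 1].astype(float)
        dx = patch[1:-1, 2:] - patch[1:-1, :-2]        # L(x+1, y) - L(x-1, y)
        dy = patch[2:, 1:-1] - patch[:-2, 1:-1]        # L(x, y+1) - L(x, y-1)
        mag = np.sqrt(dx ** 2 + dy ** 2)               # gradient magnitude m(x, y)
        ang = np.degrees(np.arctan2(dy, dx)) % 360     # gradient direction theta(x, y)
        hist = np.zeros(36)                            # 36 bins, one per 10 degrees
        np.add.at(hist, (ang // 10).astype(int), mag)  # magnitude-weighted votes
        return np.argmax(hist) * 10                    # histogram peak -> dominant angle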
Embodiment five
As shown in Fig. 6, the left side shows an example SIFT descriptor formed from the 2 × 2 subregions of an 8 × 8 pixel neighborhood. Each small cell represents one pixel of the scale space around the keypoint; the arrow direction represents the pixel's gradient direction and the arrow length its magnitude. An 8-direction gradient-orientation histogram is computed in each 4 × 4 pixel window, and the accumulated value of each gradient direction forms one seed point. The right side shows a keypoint formed from 4 seed points, each produced from the orientation histogram of 4 × 4 pixels, giving a feature vector of 2 × 2 × 8 = 32 dimensions. In practice a 16 × 16 pixel neighborhood is used, forming 4 × 4 = 16 seed points and a feature vector of 4 × 4 × 8 = 128 dimensions.
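A minimal NumPy sketch of this descriptor geometry follows; `mag` and `ang` are assumed to be 16 × 16 arrays of gradient magnitudes and directions (in degrees) for the window around the keypoint, already rotated to its dominant orientation.

    import numpy as np

    def sift_descriptor(mag, ang):
        desc = np.zeros((4, 4, 8))                     # 4 x 4 cells, 8 bins each
        for cy in range(4):
            for cx in range(4):
                m = mag[4 * cy:4 * cy + 4, 4 * cx:4 * cx + 4]
                a = ang[4 * cy:4 * cy + 4, 4 * cx:4 * cx + 4]
                bins = (a % 360 // 45).astype(int)     # 8 direction bins per cell
                np.add.at(desc[cy, cx], bins.ravel(), m.ravel())
        return desc.ravel()                            # 4 x 4 x 8 = 128 dimensions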
For a better understanding of the present invention, it has been described in detail above in conjunction with specific embodiments, but this is not a limitation of the invention. Any simple modification made to the above embodiments according to the technical essence of the present invention still falls within the scope of the technical solution of the present invention. Each embodiment in this specification emphasizes its differences from the other embodiments; the same or similar parts of the embodiments can be understood with reference to one another. As for the system embodiments, since they essentially correspond to the method embodiments, their description is relatively brief, and the relevant parts may refer to the description of the method embodiments.
The method, device, and system of the present invention may be implemented in many ways, for example in software, hardware, firmware, or any combination of software, hardware, and firmware. The order described above for the steps of the method is illustrative only; the steps of the method of the present invention are not limited to that specific order unless otherwise stated. Furthermore, in some embodiments the present invention may be embodied as a program recorded on a recording medium, comprising machine-readable instructions for implementing the method according to the invention. Thus, the present invention also covers a recording medium storing a program for performing the method according to the invention.
The description of the present invention is given for the sake of example and explanation; it is not exhaustive, nor does it limit the invention to the forms disclosed. Many modifications and variations are obvious to those of ordinary skill in the art. The embodiments were selected and described to better illustrate the principles and practical applications of the invention, and to enable those of ordinary skill in the art to understand it and thus design various embodiments, with various modifications, suited to particular uses.

Claims (10)

1. An object recognition method based on a salient-region bag-of-words model, comprising the following steps:
Step 1: corner detection;
Step 2: locating the salient region of the image;
Step 3: SIFT feature extraction;
Step 4: image-region feature similarity comparison.
2. The object recognition method based on a salient-region bag-of-words model as claimed in claim 1, characterized in that the corners are Shi-Tomasi corners, computed from the rate of change of the gradient direction.
3. The object recognition method based on a salient-region bag-of-words model as claimed in claim 2, characterized in that the Shi-Tomasi corners are points where the image brightness changes sharply or the curvature is very large.
4. The object recognition method based on a salient-region bag-of-words model as claimed in claim 1, characterized in that step 2 converts locating the key region of the image into locating the region over which the corners are distributed.
5. The object recognition method based on a salient-region bag-of-words model as claimed in claim 4, characterized in that the localization method is: divide the image into m × n blocks, count the corners in each block, and record the corner count of each block in an m × n matrix; if the number of corners in a block is ≥ q, the contiguous concentrated area where those corners lie is considered the key region of the image, where q is a threshold on the per-block corner count used to filter out background areas containing only isolated or few corners.
6. The object recognition method based on a salient-region bag-of-words model as claimed in claim 1, characterized in that step 3 comprises DoG extreme-point extraction and feature-vector formation.
7. The object recognition method based on a salient-region bag-of-words model as claimed in claim 6, characterized in that the DoG extreme-point extraction method is to scale the original image, obtain a multi-scale image-space representation sequence, and perform feature extraction at the different resolutions.
8. The object recognition method based on a salient-region bag-of-words model as claimed in claim 7, characterized in that the scales include at least one of a large scale and a small scale.
9. The object recognition method based on a salient-region bag-of-words model as claimed in claim 8, characterized in that the large scale (low resolution) captures the overall appearance of the object.
10. The object recognition method based on a salient-region bag-of-words model as claimed in claim 9, characterized in that the small scale (high resolution) captures the fine details of the object.
CN201610921396.2A 2016-10-21 2016-10-21 Object recognition method based on a salient-region bag-of-words model Pending CN106557779A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610921396.2A CN106557779A (en) 2016-10-21 2016-10-21 Object recognition method based on a salient-region bag-of-words model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610921396.2A CN106557779A (en) 2016-10-21 2016-10-21 Object recognition method based on a salient-region bag-of-words model

Publications (1)

Publication Number Publication Date
CN106557779A true CN106557779A (en) 2017-04-05

Family

ID=58443857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610921396.2A Pending CN106557779A (en) Object recognition method based on a salient-region bag-of-words model

Country Status (1)

Country Link
CN (1) CN106557779A (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100169576A1 (en) * 2008-12-31 2010-07-01 Yurong Chen System and method for sift implementation and optimization
CN101840507A (en) * 2010-04-09 2010-09-22 江苏东大金智建筑智能化***工程有限公司 Target tracking method based on character feature invariant and graph theory clustering
CN102043960A (en) * 2010-12-03 2011-05-04 杭州淘淘搜科技有限公司 Image grey scale and gradient combining improved sift characteristic extracting method
CN102156888A (en) * 2011-04-27 2011-08-17 西安电子科技大学 Image sorting method based on local colors and distribution characteristics of characteristic points
CN102682132A (en) * 2012-05-18 2012-09-19 合一网络技术(北京)有限公司 Method and system for searching information based on word frequency, play amount and creation time
US20120301014A1 (en) * 2011-05-27 2012-11-29 Microsoft Corporation Learning to rank local interest points
CN102865859A (en) * 2012-09-21 2013-01-09 西北工业大学 Aviation sequence image position estimating method based on SURF (Speeded Up Robust Features)
CN103077512A (en) * 2012-10-18 2013-05-01 北京工业大学 Feature extraction and matching method and device for digital image based on PCA (principal component analysis)
CN103295240A (en) * 2013-06-26 2013-09-11 山东农业大学 Method for evaluating similarity of free-form surfaces
CN103530603A (en) * 2013-09-24 2014-01-22 杭州电子科技大学 Video abnormality detection method based on causal loop diagram model
CN104166853A (en) * 2014-04-24 2014-11-26 中国人民解放军海军航空工程学院 Method for quickly extracting regularized ship section from high resolution remote sensing image
CN105631892A (en) * 2016-02-23 2016-06-01 武汉大学 Aviation image building damage detection method based on shadow and texture characteristics
CN105631471A (en) * 2015-12-23 2016-06-01 西安电子科技大学 Aurora sequence classification method with fusion of single frame feature and dynamic texture model
CN105856230A (en) * 2016-05-06 2016-08-17 简燕梅 ORB key frame closed-loop detection SLAM method capable of improving consistency of position and pose of robot

Similar Documents

Publication Publication Date Title
Sun et al. PBNet: Part-based convolutional neural network for complex composite object detection in remote sensing imagery
Liao et al. Rotation-sensitive regression for oriented scene text detection
Mu et al. Discriminative local binary patterns for human detection in personal album
CN108319964B (en) Fire image recognition method based on mixed features and manifold learning
CN101976258B (en) Video semantic extraction method by combining object segmentation and feature weighing
CN103077512B (en) Based on the feature extracting and matching method of the digital picture that major component is analysed
JP5604256B2 (en) Human motion detection device and program thereof
CN101714254A (en) Registering control point extracting method combining multi-scale SIFT and area invariant moment features
CN102622607A (en) Remote sensing image classification method based on multi-feature fusion
CN110263712A (en) A kind of coarse-fine pedestrian detection method based on region candidate
CN103390164A (en) Object detection method based on depth image and implementing device thereof
CN104182973A (en) Image copying and pasting detection method based on circular description operator CSIFT (Colored scale invariant feature transform)
CN104574401A (en) Image registration method based on parallel line matching
CN102945374B (en) Method for automatically detecting civil aircraft in high-resolution remote sensing image
CN106682641A (en) Pedestrian identification method based on image with FHOG- LBPH feature
Li et al. Place recognition based on deep feature and adaptive weighting of similarity matrix
CN104881671A (en) High resolution remote sensing image local feature extraction method based on 2D-Gabor
CN103632149A (en) Face recognition method based on image feature analysis
CN110826575A (en) Underwater target identification method based on machine learning
Qi et al. Exploring illumination robust descriptors for human epithelial type 2 cell classification
CN107784263A (en) Based on the method for improving the Plane Rotation Face datection for accelerating robust features
CN105760828A (en) Visual sense based static gesture identification method
Su et al. Object detection in aerial images using a multiscale keypoint detection network
CN110097067B (en) Weak supervision fine-grained image classification method based on layer-feed feature transformation
CN105139013A (en) Object recognition method integrating shape features and interest points

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170405