CN104751171B - Feature-weighted naive Bayes classification method for scanned certificate images - Google Patents

Info

Publication number
CN104751171B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510100700.2A
Other languages
Chinese (zh)
Other versions
CN104751171A (en)
Inventor
龙军
祝莉媛
张昊
刘献如
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN201510100700.2A
Publication of CN104751171A
Application granted
Publication of CN104751171B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a feature-weighted naive Bayes method for classifying scanned certificate images. The circular seal in a preprocessed certificate image is located using the Hough transform, then segmented and resized, and the HSV color feature vector of the seal region and the image aspect ratio are extracted. A certificate image database is built, and every certificate image in it is processed by the same steps to obtain the seal-region HSV color feature vector and the aspect ratio of each scanned certificate image. From these feature vectors, the probabilities with which different data combinations occur in the database are computed and, after weighting, stored. Using the naive Bayes algorithm and these stored probabilities, the most probable category of the image to be classified is computed; if that probability meets the set threshold, the image is assigned to that category. The method classifies certificate images quickly and simply and can improve the efficiency of certificate image retrieval.

Description

Feature-weighted naive Bayes classification method for scanned certificate images
Technical field
The present invention relates to an image classification method, and in particular to a method for classifying scanned certificate images.
Background
In recent years image retrieval has been a very popular topic, with search targets ranging over everything that swims in the sea, flies in the sky, or walks on land. Image classification is a preprocessing stage of image retrieval and can effectively improve retrieval accuracy. Although many classification and retrieval systems exist for image datasets of various kinds, little attention has been paid to the classification and retrieval of scanned certificate images, even though such images are often important supporting material in award applications or company expansion. To ensure that certificate images are used legitimately and that the same certificate is not reused, some retrieval systems need to check the scanned images in a dedicated certificate dataset for duplicates, a task somewhat similar to document similarity checking. The image features used by today's popular content-based image classification and retrieval systems are color, texture, shape, and spatial relationships. Scanned certificate images, however, are of low quality, come in many types and layouts, and contain both meaningful logos and brief descriptions of the award. Existing algorithms alone therefore make it difficult to search a massive image library for files similar to the certificate under test. The characteristics of scanned images must be analyzed specifically, and features that better describe certificate images must be chosen. How to use computer technology to perform fast and accurate similarity detection on attached supporting material (scanned images) is a problem that national science and technology award evaluation urgently needs to solve.
Summary of the invention
The invention provides a method for classifying scanned certificate images that classifies certificate images quickly and effectively and can significantly improve the accuracy of certificate image retrieval.
To achieve the above object, the technical solution of the present invention is as follows:
A feature-weighted naive Bayes method for classifying scanned certificate images, comprising the following steps:
Step 1: build a likelihood probability index of the different data combinations of scanned certificate images;
Step 2: read the scanned certificate image to be classified and preprocess it;
Step 3: locate the circular seal in the preprocessed certificate image using the Hough transform, obtain the bounding rectangle of the seal, and extract the HSV color feature vector of the seal region;
Step 4: weight the salient feature items of the HSV color feature vector;
Step 5: compute and record the probabilities with which different data combinations occur in the extracted HSV color feature vector of the seal region;
Step 6: from the HSV color feature vector of the image to be classified, the prior probability of each class of scanned certificate image, and the likelihood probability index of the different data combinations obtained in training, compute the classification of the image using the naive Bayes algorithm, and return as the classification result the class that meets the set threshold.
The beneficial effects of the invention are as follows. In the feature-weighted naive Bayes method for classifying scanned certificate images, the circular seal in a preprocessed certificate image is located using the Hough transform, then segmented and resized, and the HSV color feature vector of the seal region and the image aspect ratio are extracted. A certificate image database is built, and every certificate image in it is processed by the same steps to obtain the seal-region HSV color feature vector and the aspect ratio of each scanned certificate image. From these feature vectors, the probabilities with which different data combinations occur in the database are computed and, after weighting, stored. Using the naive Bayes algorithm and these stored probabilities, the most probable category of the image to be classified is computed; if that probability meets the set threshold, the image is assigned to that category. With this classification method, certificate images can be classified quickly and simply, which effectively improves the efficiency of certificate image retrieval.
Brief description of the drawings
Fig. 1 is a flowchart of the image classification method according to an embodiment of the present invention.
Detailed description
The present invention is further described below with reference to the accompanying drawing and an example.
Referring to Fig. 1, the feature-weighted naive Bayes method for classifying scanned certificate images in this embodiment comprises the following steps:
A: input the scanned certificate image to be classified and preprocess it;
B: locate the circular seal in the preprocessed certificate image using the Hough transform, obtain the bounding rectangle of the seal, and extract the HSV color feature vector of the seal region;
C: weight the salient feature items of the HSV color feature vector;
D: compute and record the probabilities with which different data combinations occur in the extracted HSV color feature vector of the seal region;
Every certificate image in the certificate image database is processed according to steps A to D above. The prior probability of each class of scanned certificate image in the database, and the probabilities with which different data combinations occur in the extracted seal-region HSV color feature vectors, are computed and recorded; this builds the likelihood probability index of the different data combinations of scanned certificate images;
E: from the HSV color feature vector of the image to be classified, the prior probability of each class of scanned certificate image, and the likelihood probability index of the different data combinations obtained in training, compute the classification of the image using the naive Bayes algorithm, and return as the classification result the class that meets the set threshold;
The naive Bayes algorithm used by this method is as follows:
$$v_{NB} = \underset{v_j}{\arg\max}\; P(v_j) \prod_i P(a_i \mid v_j)$$
$$P(v_j \mid L_k) = P(v_j) \prod_i P(L_i \mid v_j)$$
The goal of this classification method is to obtain the most probable category of a certificate image from the circular-seal feature vector of the image to be classified. $P(v_j)$ is the prior probability, which is obtained simply by counting how often each category appears in the certificate image database. $v_{NB}$ denotes the target value output by the naive Bayes classifier. In general, the naive Bayes learning method estimates the various $P(v_j)$ and $P(a_i \mid v_j)$ terms from their frequencies in the training data; these estimates correspond to the learned hypothesis, which is then used to classify according to the naive Bayes rule. The naive Bayes algorithm used here differs from other classification algorithms in that it only requires counting the frequencies of different data combinations in the training samples; no search is needed.
$(L_{k0}, L_{k1}, \ldots, L_{k16})$ is the HSV color feature vector and image aspect ratio of the circular seal region of the query image, and $(L_{i0}, L_{i1}, \ldots, L_{i16})$ is the HSV color feature vector and image aspect ratio of the circular seal region of a scanned certificate image in the database.
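The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming training data is supplied as (label, feature tuple) pairs with discrete feature values; the names `train_counts` and `classify` and the toy samples are hypothetical and do not come from the patent. Probabilities are estimated purely by frequency counting, and no smoothing is applied.

```python
from collections import Counter, defaultdict

def train_counts(samples):
    """Estimate P(v_j) and P(a_i | v_j) by frequency counting.

    samples: list of (label, features) pairs; features is a tuple of
    discrete values. All names here are illustrative assumptions.
    """
    class_counts = Counter(label for label, _ in samples)
    # value_counts[label][i][value] = times `value` occurs in dimension i
    value_counts = defaultdict(lambda: defaultdict(Counter))
    for label, features in samples:
        for i, value in enumerate(features):
            value_counts[label][i][value] += 1
    total = len(samples)
    priors = {label: n / total for label, n in class_counts.items()}
    likelihoods = {
        label: {
            i: {v: c / class_counts[label] for v, c in counter.items()}
            for i, counter in dims.items()
        }
        for label, dims in value_counts.items()
    }
    return priors, likelihoods

def classify(features, priors, likelihoods):
    """Return (v_NB, score): the class maximizing P(v_j) * prod_i P(a_i | v_j)."""
    best_label, best_score = None, -1.0
    for label, prior in priors.items():
        score = prior
        for i, value in enumerate(features):
            # Values never seen for this class get probability 0 (no smoothing).
            score *= likelihoods[label][i].get(value, 0.0)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score

# Toy data with two classes and two feature dimensions (hypothetical).
samples = [
    ("patent", (4, 1)), ("patent", (4, 2)),
    ("copyright", (7, 1)), ("copyright", (7, 1)),
]
priors, likelihoods = train_counts(samples)
label, score = classify((4, 1), priors, likelihoods)
```

Training touches each sample once and classification is a table lookup per dimension, which matches the remark that only frequency counts are needed.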
In step A, preprocessing is performed using existing noise filtering and skew correction methods;
In step B, an existing circular-seal localization method is applied to the preprocessed certificate image, the bounding rectangle of the located seal is segmented and extracted to obtain the seal region, and the HSV color feature vector of the seal region is extracted;
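As a small sketch of the segmentation part of step B, the following Python computes the bounding rectangle of a located seal and crops it. It assumes the circle center and radius have already been found by some Hough circle detector (for example OpenCV's `cv2.HoughCircles`, which is not called here); the function names and the toy image are hypothetical.

```python
def seal_bounding_box(cx, cy, r, width, height):
    """Bounding rectangle (left, top, right, bottom) of a circle, clipped
    to the image. right and bottom are exclusive slice ends."""
    left = max(0, int(cx - r))
    top = max(0, int(cy - r))
    right = min(width, int(cx + r) + 1)
    bottom = min(height, int(cy + r) + 1)
    return left, top, right, bottom

def crop(image, box):
    """Crop a row-major image (a list of pixel rows) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]

# Hypothetical 8x6 image whose "pixels" are just their (x, y) coordinates.
image = [[(x, y) for x in range(8)] for y in range(6)]
box = seal_bounding_box(cx=6, cy=2, r=3, width=8, height=6)
region = crop(image, box)
```

Clipping to the image bounds keeps the crop valid when the seal lies near an edge of the scan.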
The specific steps are as follows:
1) using an existing circular-seal localization method, segment and extract the bounding rectangle of the located seal to obtain the seal region;
2) non-uniformly quantize the hue H, saturation S, and value V components into 8, 4, and 4 levels, respectively:
$$
H = \begin{cases}
0, & H \in [315, 23] \\
1, & H \in [24, 50] \\
2, & H \in [51, 75] \\
3, & H \in [76, 155] \\
4, & H \in [156, 195] \\
5, & H \in [196, 275] \\
6, & H \in [276, 290] \\
7, & H \in [290, 316]
\end{cases}
\qquad
S = \begin{cases}
0, & S \in [0, 0.08] \\
1, & S \in (0.08, 0.4] \\
2, & S \in (0.4, 0.67] \\
3, & S \in (0.67, 1.0]
\end{cases}
\qquad
V = \begin{cases}
0, & V \in [0, 0.08] \\
1, & V \in (0.08, 0.4] \\
2, & V \in (0.4, 0.67] \\
3, & V \in (0.67, 1.0]
\end{cases}
$$
The HSV space of the seal region is thus divided into $L_h + L_s + L_v$ intervals, where $L_h$, $L_s$, and $L_v$ are the numbers of quantization levels of H, S, and V, respectively. This yields a 16-dimensional color feature vector; adding the aspect ratio of the scanned image gives the final 17-dimensional feature vector;
3) the naive Bayes method tallies every data value that occurs and counts its frequency. To simplify computation, and as established through repeated experiments, the best results are obtained by reducing every feature value to a single-digit integer. The 17-dimensional feature vector selected by this method is denoted $(L_{k0}, L_{k1}, \ldots, L_{k16})$, and each component is an integer in [0, 9].
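The quantization and feature construction can be sketched as follows. Several details are assumptions made only for illustration: the 16 color dimensions are taken to be the concatenated H, S, and V histograms; each bin is reduced to a digit as the fraction of pixels scaled to 0..9; the aspect ratio is reduced to a digit using a hypothetical constant `ASPECT_SCALE`; fractional hues and the boundary values listed twice in the hue table (290, 315, 316) are resolved by the first matching interval. The standard-library `colorsys` module performs the RGB to HSV conversion.

```python
import colorsys

ASPECT_SCALE = 4  # hypothetical scale for turning width/height into a digit

def quantize_hue(h):
    """Map a hue in degrees [0, 360) to a level 0..7 following the table."""
    if h >= 315 or h <= 23:
        return 0
    for level, upper in enumerate((50, 75, 155, 195, 275, 290), start=1):
        if h <= upper:
            return level
    return 7

def quantize_sv(x):
    """Map a saturation or value in [0, 1] to a level 0..3."""
    if x <= 0.08:
        return 0
    if x <= 0.4:
        return 1
    if x <= 0.67:
        return 2
    return 3

def seal_feature_vector(pixels_rgb, width, height):
    """Build a 17-dimensional feature tuple of digits 0..9.

    pixels_rgb: (r, g, b) tuples in 0..255 taken from the seal region.
    width, height: size of the full scanned image, for the aspect ratio.
    """
    hist = [0] * 16  # 8 hue bins, then 4 saturation bins, then 4 value bins
    n = 0
    for r, g, b in pixels_rgb:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hist[quantize_hue(h * 360)] += 1
        hist[8 + quantize_sv(s)] += 1
        hist[12 + quantize_sv(v)] += 1
        n += 1
    if n == 0:
        raise ValueError("seal region contains no pixels")
    # Assumed digit mapping: fraction of pixels in each bin, scaled to 0..9.
    features = [min(9, int(10 * count / n)) for count in hist]
    features.append(min(9, int(ASPECT_SCALE * width / height)))
    return tuple(features)

# Three pure red pixels and one pure blue pixel from a 300x200 scan.
pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)]
fv = seal_feature_vector(pixels, width=300, height=200)
```

Keeping every component in 0..9 bounds the number of distinct values per dimension, so the later frequency tables stay small.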
In step C, the salient feature items of the feature vector are weighted.
Image feature distributions have the following property: within one image category, if the statistical distribution of a feature is dense and its dispersion is small, that feature is dominant for the category and is therefore an important feature. Conversely, if the distribution of a feature is scattered and its dispersion is high, it is an unimportant feature. The standard deviation describes the dispersion of data well, so this method uses the standard deviation to measure image feature weights. $W_i = \{w_{k0}, w_{k1}, \ldots, w_{k16}\}$ denotes the weights of the feature vector. The standard deviation $\sigma_i$ of the $i$-th dimension over the samples of class $j$ in the sample set is computed as:
$$\sigma_i = \sqrt{\sum_{k=1}^{n_j} \left(L_{ki} - \bar{x}_i\right)^2 / (n_j - 1)}$$
where $n_j$ is the number of samples in class $j$, $L_{ki}$ is the value of the $i$-th feature of the $k$-th sample of class $j$, and $\bar{x}_i$ is the mean of that feature. Feature importance is denoted $e_i$, with $e_i \in [0, 1]$, and is given by the formula: The weight of each feature dimension of each sample is then computed as:
$$w_{ki} = e_i / \sum_{i=0}^{16} e_i$$
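A short sketch of the weighting step follows. The standard deviation and the normalization $w_{ki} = e_i / \sum_i e_i$ follow the text; the mapping from $\sigma_i$ to $e_i$ is not reproduced in the text, so `importance` below is a purely hypothetical stand-in chosen only so that a smaller spread gives a larger importance. All function names are illustrative.

```python
import math

def feature_std(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def importance(sigma):
    """Hypothetical importance e_i in (0, 1]: smaller spread, larger e_i.
    This mapping is an assumption, not the patent's formula."""
    return 1.0 / (1.0 + sigma)

def feature_weights(class_samples):
    """class_samples: feature tuples of one class.
    Returns one weight per dimension, w_i = e_i / sum(e)."""
    columns = list(zip(*class_samples))
    e = [importance(feature_std(column)) for column in columns]
    total = sum(e)
    return [e_i / total for e_i in e]

# Three hypothetical samples of one class with three feature dimensions.
samples = [(4, 0, 9), (4, 3, 1), (4, 6, 5)]
weights = feature_weights(samples)
```

In this toy data the first dimension is constant within the class, so it receives the largest weight, while the most scattered dimension receives the smallest.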
The probabilities with which different data combinations occur in the extracted seal-region feature vectors are computed and recorded (step D) as follows:
1) count the probability with which each data value occurs in the feature vectors; for example, the probability that the value 4 occurs in dimension 2 of class 1 is 30%;
2) multiply each resulting probability by the weight computed in step C, and store the result as the probability of occurrence of that data combination.
The classification itself proceeds as follows:
1) using the probabilities of the data combinations obtained in step D and the naive Bayes algorithm, compute the probability that the certificate image to be classified belongs to each class. For example, to evaluate the hypothesis that image A belongs to class 1 when the value 4 occurs in its dimension 2, the corresponding probability is looked up among the probabilities stored in step D; the probability for the class is then computed from the values looked up in this way for all data combinations that occur;
2) once the probability of each class has been obtained, if the maximum is greater than the threshold, the certificate is assigned to the class with the maximum probability. The threshold is set to 0.048.
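The final decision can be sketched as below, assuming the per-class probabilities have already been computed. Returning `None` when no class exceeds the threshold is an assumption about how rejected images are handled; the function name and the example scores are hypothetical.

```python
THRESHOLD = 0.048  # threshold value stated in the text

def decide(class_probabilities, threshold=THRESHOLD):
    """Return the most probable class if its probability exceeds the
    threshold, otherwise None (image left unclassified, an assumption)."""
    label = max(class_probabilities, key=class_probabilities.get)
    if class_probabilities[label] > threshold:
        return label
    return None

# Hypothetical per-class probabilities for two images.
accepted = decide({"patent": 0.06, "copyright_1": 0.01, "copyright_2": 0.02})
rejected = decide({"patent": 0.03, "copyright_1": 0.01, "copyright_2": 0.02})
```

The threshold lets the classifier reject images that do not resemble any certificate class instead of forcing them into the nearest one.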
The classification results for scanned certificate images in this embodiment are shown in the table below.
Image category | Test images | Correctly classified | Misclassified | Accuracy
Type 1 software copyright certificate scans | 10 | 10 | 0 | 100%
Type 2 software copyright certificate scans | 10 | 10 | 0 | 100%
Patent certificate scans | 10 | 10 | 0 | 100%
Other distractor images | 10 | 9 | 1 | 90%

Claims (5)

1. A feature-weighted naive Bayes method for classifying scanned certificate images, characterized in that it comprises the following steps:
Step 1: build a likelihood probability index of the different data combinations of scanned certificate images;
Step 2: read the scanned certificate image to be classified and preprocess it;
Step 3: locate the circular seal in the preprocessed certificate image using the Hough transform, obtain the bounding rectangle of the seal, and extract the HSV color feature vector of the seal region;
Step 4: weight the salient feature items of the HSV color feature vector;
Step 5: compute and record the probabilities with which different data combinations occur in the extracted HSV color feature vector of the seal region;
Step 6: from the HSV color feature vector of the image to be classified, the prior probability of each class of scanned certificate image, and the likelihood probability index of the different data combinations obtained in training, compute the classification of the image using the naive Bayes algorithm, and return as the classification result the class that meets the set threshold;
wherein the specific operations of step 3 are as follows:
1) using an existing circular-seal localization method, segment and extract the bounding rectangle of the located seal to obtain the seal region;
2) non-uniformly quantize the hue H, saturation S, and value V components into 8, 4, and 4 levels, respectively:
$$
H = \begin{cases}
0, & H \in [315, 23] \\
1, & H \in [24, 50] \\
2, & H \in [51, 75] \\
3, & H \in [76, 155] \\
4, & H \in [156, 195] \\
5, & H \in [196, 275] \\
6, & H \in [276, 290] \\
7, & H \in [290, 316]
\end{cases}
\qquad
S = \begin{cases}
0, & S \in [0, 0.08] \\
1, & S \in (0.08, 0.4] \\
2, & S \in (0.4, 0.67] \\
3, & S \in (0.67, 1.0]
\end{cases}
\qquad
V = \begin{cases}
0, & V \in [0, 0.08] \\
1, & V \in (0.08, 0.4] \\
2, & V \in (0.4, 0.67] \\
3, & V \in (0.67, 1.0]
\end{cases}
$$
The HSV space of the seal region is thus divided into $L_h + L_s + L_v$ intervals, where $L_h$, $L_s$, and $L_v$ are the numbers of quantization levels of H, S, and V, respectively, yielding a 16-dimensional color feature vector; adding the aspect ratio of the scanned image gives the final 17-dimensional feature vector;
3) the extracted 17-dimensional feature vector is denoted $(L_{k0}, L_{k1}, \ldots, L_{k16})$, and each component is an integer in [0, 9];
wherein the specific operation of weighting the salient feature items of the feature vector in step 4 is: the standard deviation is used to measure image feature weights, $W_i = \{w_{k0}, w_{k1}, \ldots, w_{k16}\}$ denotes the weights of the feature vector, and the standard deviation $\sigma_i$ of the $i$-th dimension over the samples of class $j$ in the sample set is computed as:
$$\sigma_i = \sqrt{\sum_{k=1}^{n_j} \left(L_{ki} - \bar{x}_i\right)^2 / (n_j - 1)}$$
where $n_j$ is the number of samples in class $j$, $L_{ki}$ is the value of the $i$-th feature of the $k$-th sample of class $j$, and $\bar{x}_i$ is the mean of that feature,
feature importance is denoted $e_i$, with $e_i \in [0, 1]$, and is given by the formula: the weight of each feature dimension of each sample is then computed as:
$$w_{ki} = e_i / \sum_{i=0}^{16} e_i$$
2. The feature-weighted naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the likelihood probability index of the different data combinations of scanned certificate images built in step 1 is obtained by processing every certificate image in the certificate image database according to steps 2 to 5.
3. The feature-weighted naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the preprocessing in step 2 uses existing noise filtering and skew correction methods.
4. The feature-weighted naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that the specific operations of step 5, computing and recording the probabilities with which different data combinations occur in the extracted seal-region feature vector, are: count the probability with which each data value occurs in the feature vectors; multiply each resulting probability by the weight computed in step 4, and store the result as the probability of occurrence of that data combination.
5. The feature-weighted naive Bayes method for classifying scanned certificate images according to claim 1, characterized in that step 6 is specifically: using the probabilities of the data combinations obtained in step 5 and the naive Bayes algorithm, compute the probability that the certificate image to be classified belongs to each class; once the probability of each class has been obtained, if the maximum is greater than the threshold, assign the certificate to the class with the maximum probability, the threshold being set to 0.048.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510100700.2A CN104751171B (en) 2015-03-09 2015-03-09 Feature-weighted naive Bayes classification method for scanned certificate images

Publications (2)

Publication Number Publication Date
CN104751171A (en) 2015-07-01
CN104751171B (en) 2016-04-20

Family

ID=53590824

Country Status (1)

Country Link
CN (1) CN104751171B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant