CN103942797B - Scene image text detection method and system based on histogram and super-pixels - Google Patents


Publication number
CN103942797B
Authority
CN
China
Prior art keywords
pixel
edge
module
stroke width
super
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410168244.0A
Other languages
Chinese (zh)
Other versions
CN103942797A (en)
Inventor
张永铮
周宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a scene image text detection method based on a histogram and super-pixels. The scene image text detection method comprises the steps that firstly, stroke width values of text which may exist in a target image are estimated, and the stroke histogram is generated on the basis of the stroke width values; secondly, edge detection is conducted on the target image, comparison and correction are conducted, and connected domains with the highest edge detection quality are obtained; thirdly, skeletonization is conducted on the connected domains, skeleton pixels are obtained, and a high-precision stroke width is estimated according to the skeleton pixels; fourthly, characters and non-characters are filtered according to the high-precision stroke width; fifthly, the characters and the non-characters are further filtered through spatial distribution of the connected domains by means of geometric constraint, and text lines and non-text lines are filtered; sixthly, detection of the characters and the text lines in the target image is completed. According to the scene image text detection method based on the histogram and the super-pixels, a high-speed and high-precision stroke width calculation method is provided, and therefore precision and efficiency of filtering the connected domains between text and non-text can be improved.

Description

Scene image text detection method and system based on histogram and super-pixels
Technical field
The present invention relates to a scene image text detection method and system based on a histogram and super-pixels, and belongs to the fields of information security and computer vision.
Background art
In recent years, with the growing number of mobile devices with built-in cameras, the number of pictures taken in natural scenes has grown explosively. Many valuable applications, such as image retrieval based on text information, intelligent driving assistance, and reading assistance and scene understanding for visually impaired people, rely on methods for obtaining text information from images. Text extraction and recognition in natural scenes, as the key problem in processing this new data source, has therefore become a hot topic in computer vision research in recent years.
Text detection methods include methods based on connected domain analysis and methods based on sliding windows. Methods based on connected domain analysis analyze the connected domains in an image and filter characters from non-characters using constraints on the spatial distribution and geometric properties of text. Epshtein et al. [1] proposed extracting the edges in an image with an edge detection algorithm and using gradient information and the like to compute the "stroke" width of the regions formed by these edges as the basis for classification. Building on the work of Epshtein et al., Huang Lin et al. [2] proposed that the color consistency of a "stroke" must be maintained when computing the "stroke" width, and used covariance descriptors to filter the detected text lines and characters. The other class of text detection algorithms is mainly implemented with sliding windows, for example the part-based tree-structured text detection algorithm built on gradient histograms proposed by Cunzhao Shi et al. [3], and the multi-scale text detection using stroke filters proposed by Jung et al. [4]. Compared with sliding-window methods, connected-domain methods have lower computational complexity but depend more heavily on the quality of edge detection, and perform slightly worse under complex illumination and low image quality. Because the text color, font, and other properties in scene images vary widely, sliding-window methods must analyze the image at multiple scales; their computational complexity is therefore higher, and they usually need a large training set to train a classifier. Among methods based on connected domain analysis, algorithms based on "stroke" width have attracted much attention for their simplicity and effectiveness, and several improved variants have appeared. However, when text is partially occluded or noise is heavy, the accuracy of edge detection and gradient estimation suffers, and the performance of these algorithms is still unsatisfactory.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the failure of prior-art edge detection in complex environments, to use super-pixels to correct the edge detection, and to provide a scene image text detection method based on a stroke width histogram and super-pixels that improves the recall and precision of the detection algorithm.
The technical solution of the present invention to the above technical problem is a scene image text detection method based on a histogram and super-pixels, which specifically includes the following steps:
Step 1: estimate the stroke width values of text that may be present in the target image, and generate a stroke histogram based on the stroke width values;
Step 2: set the stroke width values in the stroke histogram as the step parameter of the super-pixels; perform edge detection on the target image; compare the super-pixels obtained with this step parameter against the edge detection result and correct the edges, obtaining the connected domains with the highest edge detection quality for this stroke width value;
Step 3: skeletonize the connected domains to obtain skeleton pixels, and estimate a high-precision stroke width from the skeleton pixels;
Step 4: filter the target image according to the high-precision stroke width, distinguishing characters from non-characters to obtain characters;
Step 5: further filter the obtained characters using geometric constraints on the spatial distribution of the connected domains to obtain accurate characters, and distinguish text lines from non-text lines in the target image based on the accurate characters to obtain text lines;
Step 6: complete the detection of the accurate characters and text lines in the target image.
The beneficial effects of the present invention are: the present invention improves edge detection quality by exploiting the edge characteristics of text in the text detection problem, and proposes a fast, high-precision stroke width calculation method to improve the precision and efficiency of filtering text and non-text connected domains.
On the basis of the above technical solution, the present invention can also be improved as follows.
Further, the method also includes step 7: collect statistics on the distances between the accurate characters in a text line, and set an intra-word character distance threshold and an inter-word distance threshold;
Step 8: segment the text line into accurate characters according to the character distance threshold and the inter-word distance threshold.
The beneficial effect of this further scheme is that, once the text line has been segmented into characters according to the character distance threshold and the inter-word distance threshold, subsequent character recognition is made easier.
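By way of illustration only, steps 7 and 8 might be sketched as follows. The text does not say how the two thresholds are set, so this sketch makes an assumption: it derives a single word-gap threshold from the median gap between neighbouring characters. The function name `split_text_line`, the `(x_left, x_right)` box format, and the factor `word_gap_factor` are hypothetical.

```python
from statistics import median

def split_text_line(boxes, word_gap_factor=2.0):
    """Group character extents (x_left, x_right) of one text line into words.

    Assumption: a gap larger than word_gap_factor times the median gap is
    treated as an inter-word gap; smaller gaps are intra-word gaps.
    """
    boxes = sorted(boxes)
    if not boxes:
        return []
    # Horizontal distance between each character and its right neighbour.
    gaps = [b[0] - a[1] for a, b in zip(boxes, boxes[1:])]
    word_gap = word_gap_factor * median(gaps) if gaps else 0.0
    words, current = [], [boxes[0]]
    for gap, box in zip(gaps, boxes[1:]):
        if gap > word_gap:
            words.append(current)  # close the current word
            current = [box]
        else:
            current.append(box)
    words.append(current)
    return words

# Five characters: three close together, a wide gap, then two more.
demo_words = split_text_line([(0, 8), (10, 18), (20, 28), (40, 48), (50, 58)])
```

In this toy line the gaps are 2, 2, 12 and 2, so only the gap of 12 exceeds the threshold and the line is split into a three-character word and a two-character word.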
Further, the geometric constraints in step 5 include stroke width consistency, aspect ratio, overlap between connected domains, and the like.
Further, step 1 specifically includes the following steps:
Step 1.1: compute the edge pixels in the target image using the Canny edge detection operator; compute the gradient of the target image using the Sobel operator; obtain the gradient of every edge pixel in the target image;
Step 1.2: take one edge pixel as the reference edge pixel and search along its gradient direction for edge pixels; determine whether a mapped edge pixel paired with the reference edge pixel exists; if so, go to step 1.3; otherwise, discard this reference edge pixel and return to step 1.2;
Step 1.3: determine whether the gradient direction of the mapped edge pixel differs from that of the reference edge pixel by between 150 and 210 degrees; if so, go to step 1.4; otherwise, discard this reference edge pixel and return to step 1.2;
Step 1.4: compute the distance between the mapped edge pixel and the reference edge pixel to obtain a stroke width value;
Step 1.5: determine whether any edge pixels remain; if so, return to step 1.2; otherwise, go to step 1.6;
Step 1.6: generate the stroke histogram from the stroke width values obtained in step 1.4.
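Steps 1.2 to 1.6 can be sketched in a few lines, assuming the Canny and Sobel stages of step 1.1 have already produced a set of edge pixels with gradient directions. The dictionary layout `edges[(x, y)] = direction in degrees`, the ray length limit `max_steps`, and both function names are assumptions made for this illustration, not part of the described method.

```python
import math
from collections import Counter

def estimate_stroke_widths(edges, max_steps=50):
    """edges maps (x, y) -> gradient direction in degrees (0 <= d < 360).

    For each reference edge pixel, walk along its gradient direction until
    another edge pixel is hit (step 1.2). The pair is kept only if the two
    gradient directions differ by 150 to 210 degrees (step 1.3); the
    Euclidean distance between the pair is the stroke width (step 1.4).
    """
    widths = []
    for (x0, y0), theta in edges.items():
        dx = math.cos(math.radians(theta))
        dy = math.sin(math.radians(theta))
        for step in range(1, max_steps + 1):
            p = (round(x0 + dx * step), round(y0 + dy * step))
            if p == (x0, y0) or p not in edges:
                continue
            diff = abs(edges[p] - theta) % 360
            if 150 <= diff <= 210:
                widths.append(round(math.hypot(p[0] - x0, p[1] - y0)))
            break  # the first edge pixel hit ends the search along this ray
    return widths

def stroke_histogram(widths):
    """Count how often each stroke width occurs (step 1.6)."""
    return Counter(widths)

# A vertical stroke: left edge at x=2 pointing right, right edge at x=6
# pointing left, plus one isolated noise pixel that finds no partner.
demo_edges = {(2, y): 0.0 for y in range(5)}
demo_edges.update({(6, y): 180.0 for y in range(5)})
demo_edges[(10, 10)] = 90.0
demo_hist = stroke_histogram(estimate_stroke_widths(demo_edges))
```

Reference pixels without a valid partner are simply skipped here, which plays the role of the "discard" branches in steps 1.2 and 1.3.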
Further, step 2 specifically includes the following steps:
Step 2.1: select several stroke width values with the highest frequencies in the stroke histogram as search step values for the super-pixels;
Step 2.2: find the grid points whose spacing equals the search step value, and select the position with the lowest gradient near each grid point as the initial centroid of a super-pixel;
Step 2.3: iterate steps 2.1 and 2.2, updating and computing the actual centroid and boundary of each super-pixel on the image;
Step 2.4: lower the threshold of the Canny edge detection operator and detect new wide-range edges in the image;
Step 2.5: compare the wide-range edges with the super-pixel boundaries and correct them; remove from the corrected wide-range edges the interference that differs from the current stroke width, obtaining wide-range edges of the image that conform to the stroke width rule;
Step 2.6: perform connected domain analysis on the wide-range edges of the image and compute the Euclidean distance transform map of the wide-range edges (using a common prior-art distance transform algorithm for images), obtaining the connected domains with the highest edge detection quality for this stroke width value.
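Steps 2.1 and 2.2 can be illustrated with a minimal sketch. It assumes the stroke histogram is a mapping from width to count and the gradient magnitude is a list of rows; the 3x3 search neighbourhood, the parameter `k`, and the function names are assumptions chosen for the example.

```python
from collections import Counter

def dominant_stroke_widths(hist, k=2):
    """Return the k most frequent stroke widths (step 2.1)."""
    return [w for w, _ in Counter(hist).most_common(k)]

def init_superpixel_seeds(grad_mag, step):
    """Place seeds on a grid with spacing `step`, then move each seed to the
    lowest-gradient pixel in its 3x3 neighbourhood (step 2.2).

    grad_mag is a list of rows; seeds are returned as (row, col) tuples.
    """
    h, w = len(grad_mag), len(grad_mag[0])
    seeds = []
    for gy in range(step // 2, h, step):
        for gx in range(step // 2, w, step):
            neighbours = [(y, x)
                          for y in range(max(gy - 1, 0), min(gy + 2, h))
                          for x in range(max(gx - 1, 0), min(gx + 2, w))]
            seeds.append(min(neighbours, key=lambda p: grad_mag[p[0]][p[1]]))
    return seeds

demo_steps = dominant_stroke_widths({4: 10, 9: 3, 2: 6})

# 6x6 gradient map, flat except for two low-gradient pixels.
demo_grad = [[5.0] * 6 for _ in range(6)]
demo_grad[0][0] = 0.0
demo_grad[4][5] = 1.0
demo_seeds = init_superpixel_seeds(demo_grad, step=3)
```

Moving a seed off a high-gradient position keeps the initial centroid from sitting on an edge, so that the later iterations of step 2.3 start from the interior of a region.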
Further, step 3 is specifically: compute the gradient of the Euclidean distance transform map using the Sobel operator, and mark the pixels whose gradient is close to zero as skeleton pixels; estimate the high-precision stroke width from the skeleton pixels.
The technical problem to be solved by the present invention is, in view of the failure of prior-art edge detection in complex environments, to use super-pixels to correct the edge detection, and to provide a scene image text detection system based on a stroke width histogram and super-pixels that improves the recall and precision of the detection algorithm.
The technical solution of the present invention to the above technical problem is a scene image text detection system based on a histogram and super-pixels, comprising: an estimation module, an edge detection module, a skeletonization module, a filtering module, and a secondary filtering module;
The estimation module estimates the stroke width values of text that may be present in the target image, generates a stroke histogram based on the stroke width values, and sends the stroke histogram to the edge detection module;
The edge detection module sets the stroke width values in the stroke histogram as the step parameter of the super-pixels; performs edge detection on the target image; compares the super-pixels obtained with this step parameter against the edge detection result and corrects the edges, obtaining the connected domains with the highest edge detection quality for this stroke width value; and sends the obtained connected domains to the skeletonization module;
The skeletonization module skeletonizes the connected domains to obtain skeleton pixels, estimates a high-precision stroke width from the skeleton pixels, and sends the high-precision stroke width to the filtering module;
The filtering module filters the target image according to the high-precision stroke width, distinguishing characters from non-characters to obtain characters;
The secondary filtering module further filters the obtained characters using geometric constraints on the spatial distribution of the connected domains to obtain accurate characters, and distinguishes text lines from non-text lines in the target image based on the accurate characters to obtain text lines.
The beneficial effects of the present invention are: the present invention improves edge detection quality by exploiting the edge characteristics of text in the text detection problem, and proposes a fast, high-precision stroke width calculation method to improve the precision and efficiency of filtering text and non-text connected domains.
On the basis of the above technical solution, the present invention can also be improved as follows.
Further, the system also includes a statistics module and a segmentation module;
The statistics module collects statistics on the distances between the accurate characters in a text line and sets an intra-word character distance threshold and an inter-word distance threshold;
The segmentation module segments the text line into accurate characters according to the character distance threshold and the inter-word distance threshold.
Further, the geometric constraints used by the secondary filtering module include stroke width consistency, aspect ratio, overlap between connected domains, and the like.
Further, the estimation module includes: a gradient module, a pair search module, a mapping search module, and a calculation module;
The gradient module computes the edge pixels in the target image using the Canny edge detection operator, computes the gradient of the target image using the Sobel operator, and obtains the gradient of every edge pixel in the target image;
The pair search module takes one edge pixel as the reference edge pixel, searches along its gradient direction for edge pixels, and finds the mapped edge pixel paired with the reference edge pixel;
The mapping search module finds the mapped edge pixels whose gradient direction differs from that of the reference edge pixel by between 150 and 210 degrees, and sends these mapped edge pixels to the calculation module;
The calculation module computes the distance between the mapped edge pixel and the reference edge pixel to obtain a stroke width value.
Further, the edge detection module includes: a step selection module, a centroid selection module, an iterative update module, a wide-range detection module, a correction module, and a connected domain analysis module;
The step selection module selects several stroke width values with the highest frequencies in the stroke histogram as search step values for the super-pixels;
The centroid selection module finds the grid points whose spacing equals the search step value, and selects the position with the lowest gradient near each grid point as the initial centroid of a super-pixel;
The iterative update module iteratively updates and computes the actual centroid and boundary of each super-pixel on the image;
The wide-range detection module lowers the threshold of the Canny edge detection operator and detects new wide-range edges in the image;
The correction module compares the wide-range edges with the super-pixel boundaries and corrects them, and removes from the corrected wide-range edges the interference that differs from the current stroke width, obtaining wide-range edges of the image that conform to the stroke width rule;
The connected domain analysis module performs connected domain analysis on the wide-range edges of the image and computes the Euclidean distance transform map of the wide-range edges (using a common prior-art distance transform algorithm for images).
Further, the skeletonization module is specifically configured to compute the gradient of the Euclidean distance transform map using the Sobel operator and mark the pixels whose gradient is close to zero as skeleton pixels, and to estimate the high-precision stroke width from the skeleton pixels.
Brief description of the drawings
Fig. 1 is a flow chart of the scene image text detection method based on a histogram and super-pixels according to the present invention;
Fig. 2 is a detailed flow chart of step 1 of the scene image text detection method based on a histogram and super-pixels according to the present invention;
Fig. 3 is a detailed flow chart of step 2 of the scene image text detection method based on a histogram and super-pixels according to the present invention;
Fig. 4 is a structural block diagram of the scene image text detection system based on a histogram and super-pixels according to the present invention.
In the drawings, the components denoted by the reference numerals are as follows:
1, estimation module; 2, edge detection module; 3, skeletonization module; 4, filtering module; 5, secondary filtering module; 6, statistics module; 7, segmentation module; 11, gradient module; 12, pair search module; 13, mapping search module; 14, calculation module; 21, step selection module; 22, centroid selection module; 23, iterative update module; 24, wide-range detection module; 25, correction module; 26, connected domain analysis module.
Detailed description of the embodiments
The principles and features of the present invention are described below with reference to the accompanying drawings; the examples serve only to explain the present invention and are not intended to limit its scope.
As shown in Fig. 1, the scene image text detection method based on a histogram and super-pixels according to the present invention specifically includes the following steps:
Step 1: estimate the stroke width values of text that may be present in the target image, and generate a stroke histogram based on the stroke width values;
Step 2: set the stroke width values in the stroke histogram as the step parameter of the super-pixels; perform edge detection on the target image; compare the super-pixels obtained with this step parameter against the edge detection result and correct the edges, obtaining the connected domains with the highest edge detection quality for this stroke width value;
Step 3: skeletonize the connected domains to obtain skeleton pixels, and estimate a high-precision stroke width from the skeleton pixels;
Step 4: filter the target image according to the high-precision stroke width, distinguishing characters from non-characters to obtain characters;
Step 5: further filter the obtained characters using geometric constraints on the spatial distribution of the connected domains to obtain accurate characters, and distinguish text lines from non-text lines in the target image based on the accurate characters to obtain text lines;
Step 6: complete the detection of the accurate characters and text lines in the target image;
Step 7: collect statistics on the distances between the accurate characters in a text line, and set an intra-word character distance threshold and an inter-word distance threshold;
Step 8: segment the text line into accurate characters according to the character distance threshold and the inter-word distance threshold.
The geometric constraints in step 5 include stroke width consistency, aspect ratio, overlap between connected domains, and the like.
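As a hedged illustration of the three named constraints, the following sketch filters candidate connected domains by stroke width consistency, aspect ratio, and overlap. The `Component` record, the use of the coefficient of variation as the consistency measure, and every threshold value are assumptions made for the example; the text names the constraints but gives no numeric limits.

```python
from dataclasses import dataclass

@dataclass
class Component:
    """Hypothetical record for one candidate connected domain."""
    x: int
    y: int
    w: int
    h: int
    stroke_mean: float
    stroke_std: float

def overlap_ratio(a, b):
    """Intersection area divided by the area of the smaller bounding box."""
    ix = max(0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    return ix * iy / min(a.w * a.h, b.w * b.h)

def passes_geometric_constraints(c, others, max_cv=0.5, min_ar=0.1,
                                 max_ar=10.0, max_overlap=0.5):
    # Stroke width consistency: low variation relative to the mean width.
    if c.stroke_mean <= 0 or c.stroke_std / c.stroke_mean > max_cv:
        return False
    # Aspect ratio: reject extremely wide or extremely tall boxes.
    if not (min_ar <= c.w / c.h <= max_ar):
        return False
    # Overlap: reject boxes that mostly coincide with another candidate.
    return all(overlap_ratio(c, o) <= max_overlap for o in others if o is not c)

letter = Component(0, 0, 10, 20, 4.0, 0.5)
neighbour = Component(50, 0, 10, 20, 4.0, 0.5)
noisy = Component(0, 40, 10, 20, 4.0, 3.0)   # inconsistent stroke width
banner = Component(0, 100, 300, 5, 4.0, 0.5)  # extreme aspect ratio
candidates = [letter, neighbour, noisy, banner]
kept = [c for c in candidates if passes_geometric_constraints(c, candidates)]
```

Only `letter` and `neighbour` survive in this toy set; in practice the limits would have to be tuned on real data.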
As shown in Fig. 2, step 1 specifically includes the following steps:
Step 1.1: compute the edge pixels in the target image using the Canny edge detection operator; compute the gradient of the target image using the Sobel operator; obtain the gradient of every edge pixel in the target image;
Step 1.2: take one edge pixel as the reference edge pixel and search along its gradient direction for edge pixels; determine whether a mapped edge pixel paired with the reference edge pixel exists; if so, go to step 1.3; otherwise, discard this reference edge pixel and return to step 1.2;
Step 1.3: determine whether the gradient direction of the mapped edge pixel differs from that of the reference edge pixel by between 150 and 210 degrees; if so, go to step 1.4; otherwise, discard this reference edge pixel and return to step 1.2;
Step 1.4: compute the distance between the mapped edge pixel and the reference edge pixel to obtain a stroke width value;
Step 1.5: determine whether any edge pixels remain; if so, return to step 1.2; otherwise, go to step 1.6;
Step 1.6: generate the stroke histogram from the stroke width values obtained in step 1.4.
As shown in Fig. 3, step 2 specifically includes the following steps:
Step 2.1: select several stroke width values with the highest frequencies in the stroke histogram as search step values for the super-pixels;
Step 2.2: find the grid points whose spacing equals the search step value, and select the position with the lowest gradient near each grid point as the initial centroid of a super-pixel;
Step 2.3: iterate steps 2.1 and 2.2, updating and computing the actual centroid and boundary of each super-pixel on the image;
Step 2.4: lower the threshold of the Canny edge detection operator and detect new wide-range edges in the image;
Step 2.5: compare the wide-range edges with the super-pixel boundaries and correct them; remove from the corrected wide-range edges the interference that differs from the current stroke width, obtaining wide-range edges of the image that conform to the stroke width rule;
Step 2.6: perform connected domain analysis on the wide-range edges of the image and compute the Euclidean distance transform map of the wide-range edges (using a common prior-art distance transform algorithm for images).
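The connected domain analysis of step 2.6 can be illustrated with a standard breadth-first labelling of a binary mask. The choice of 8-connectivity and the function name `label_connected_domains` are assumptions for this sketch; the text does not state which connectivity or labelling algorithm is used.

```python
from collections import deque

def label_connected_domains(mask):
    """Label the 8-connected foreground regions of a binary mask.

    mask is a list of rows of 0/1 values. Returns (labels, count), where
    labels has the same shape as mask and holds 0 for background and
    1..count for the connected domains.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or labels[sy][sx]:
                continue
            count += 1
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                # Visit the 3x3 neighbourhood, clipped at the image border.
                for ny in range(max(y - 1, 0), min(y + 2, h)):
                    for nx in range(max(x - 1, 0), min(x + 2, w)):
                        if mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

demo_mask = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
demo_labels, demo_count = label_connected_domains(demo_mask)
```

Each labelled region can then be passed on to the distance transform and skeletonization of step 3 as one candidate character.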
Step 3 is specifically: compute the gradient of the Euclidean distance transform map using the Sobel operator, and mark the pixels whose gradient is close to zero as skeleton pixels; estimate the high-precision stroke width from the skeleton pixels.
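A minimal sketch of this step on a tiny binary mask follows. It makes several assumptions: the distance transform is computed by brute force (suitable only for toy inputs), the tolerance `eps` for "close to zero" is arbitrary, and the stroke width at a skeleton pixel is taken as 2d - 1, where d is the distance to the nearest background pixel. None of these choices is stated in the text.

```python
import math

def distance_transform(mask):
    """Euclidean distance from each foreground pixel to the nearest
    background pixel, computed by brute force."""
    h, w = len(mask), len(mask[0])
    bg = [(y, x) for y in range(h) for x in range(w) if not mask[y][x]]
    return [[min(math.hypot(y - by, x - bx) for by, bx in bg)
             if mask[y][x] else 0.0
             for x in range(w)] for y in range(h)]

def sobel_magnitude(img, y, x):
    """Gradient magnitude of img at an interior pixel (3x3 Sobel kernels)."""
    gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
          - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
    gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
          - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
    return math.hypot(gx, gy)

def skeleton_stroke_width(mask, eps=1e-6):
    """Return (skeleton pixels, mean stroke width estimated on them)."""
    dist = distance_transform(mask)
    h, w = len(mask), len(mask[0])
    skeleton = [(y, x) for y in range(1, h - 1) for x in range(1, w - 1)
                if mask[y][x] and sobel_magnitude(dist, y, x) <= eps]
    # Assumption: width = 2 * distance - 1 for pixel-centre distances.
    widths = [2 * dist[y][x] - 1 for y, x in skeleton]
    return skeleton, (sum(widths) / len(widths) if widths else None)

# A horizontal stroke 5 pixels thick (rows 2..6) spanning a 9x15 image.
demo_bar = [[1 if 2 <= y <= 6 else 0 for _ in range(15)] for y in range(9)]
demo_skeleton, demo_width = skeleton_stroke_width(demo_bar)
```

On this bar the distance map peaks along the centre row, where the Sobel response vanishes, so the skeleton is that row and the estimated width is 5. For strokes of even thickness the gradient on the ridge is small but not exactly zero, so a larger `eps` would be needed on real images.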
As shown in Fig. 4, the scene image text detection system based on a histogram and super-pixels according to the present invention includes: an estimation module 1, an edge detection module 2, a skeletonization module 3, a filtering module 4, and a secondary filtering module 5;
The estimation module 1 estimates the stroke width values of text that may be present in the target image, generates a stroke histogram based on the stroke width values, and sends the stroke histogram to the edge detection module 2;
The edge detection module 2 sets the stroke width values in the stroke histogram as the step parameter of the super-pixels; performs edge detection on the target image; compares the super-pixels obtained with this step parameter against the edge detection result and corrects the edges, obtaining the connected domains with the highest edge detection quality for this stroke width value; and sends the obtained connected domains to the skeletonization module 3;
The skeletonization module 3 skeletonizes the connected domains to obtain skeleton pixels, estimates a high-precision stroke width from the skeleton pixels, and sends the high-precision stroke width to the filtering module 4;
The filtering module 4 filters the target image according to the high-precision stroke width, distinguishing characters from non-characters to obtain characters;
The secondary filtering module 5 further filters the obtained characters using geometric constraints on the spatial distribution of the connected domains to obtain accurate characters, and distinguishes text lines from non-text lines in the target image based on the accurate characters to obtain text lines.
The system also includes a statistics module 6 and a segmentation module 7;
The statistics module 6 collects statistics on the distances between the accurate characters in a text line and sets an intra-word character distance threshold and an inter-word distance threshold;
The segmentation module 7 segments the text line into accurate characters according to the character distance threshold and the inter-word distance threshold.
The geometric constraints used by the secondary filtering module 5 include stroke width consistency, aspect ratio, overlap between connected domains, and the like.
The estimation module 1 includes: a gradient module 11, a pair search module 12, a mapping search module 13, and a calculation module 14;
The gradient module 11 computes the edge pixels in the target image using the Canny edge detection operator, computes the gradient of the target image using the Sobel operator, and obtains the gradient of every edge pixel in the target image;
The pair search module 12 takes one edge pixel as the reference edge pixel, searches along its gradient direction for edge pixels, and finds the mapped edge pixel paired with the reference edge pixel;
The mapping search module 13 finds the mapped edge pixels whose gradient direction differs from that of the reference edge pixel by between 150 and 210 degrees, and sends these mapped edge pixels to the calculation module 14;
The calculation module 14 computes the distance between the mapped edge pixel and the reference edge pixel to obtain a stroke width value.
The edge detection module 2 includes: a step selection module 21, a centroid selection module 22, an iterative update module 23, a wide-range detection module 24, a correction module 25, and a connected domain analysis module 26;
The step selection module 21 selects several stroke width values with the highest frequencies in the stroke histogram as search step values for the super-pixels;
The centroid selection module 22 finds the grid points whose spacing equals the search step value, and selects the position with the lowest gradient near each grid point as the initial centroid of a super-pixel;
The iterative update module 23 iteratively updates and computes the actual centroid and boundary of each super-pixel on the image;
The wide-range detection module 24 lowers the threshold of the Canny edge detection operator and detects new wide-range edges in the image;
The correction module 25 compares the wide-range edges with the super-pixel boundaries and corrects them, and removes from the corrected wide-range edges the interference that differs from the current stroke width, obtaining wide-range edges of the image that conform to the stroke width rule;
The connected domain analysis module 26 performs connected domain analysis on the wide-range edges of the image and computes the Euclidean distance transform map of the wide-range edges (using a common prior-art distance transform algorithm for images).
The skeletonization module 3 is specifically configured to compute the gradient of the Euclidean distance transform map using the Sobel operator and mark the pixels whose gradient is close to zero as skeleton pixels, and to estimate the high-precision stroke width from the skeleton pixels.
The present invention mainly comprises two aspects: (1) improving edge detection quality by exploiting the edge characteristics of text in the text detection problem; (2) proposing a fast, high-precision stroke width calculation method to improve the precision and efficiency of filtering text and non-text connected domains.
Text detection methods based on stroke width belong to connected domain analysis methods; they assume that the stroke widths of the text within the same text line are roughly the same. The advantage of such methods is that they are simple and need no adjustment for specific languages. However, like other connected domain analysis methods, they depend heavily on high-quality edge detection. When the image is noisy, the illumination is poor, or the text is occluded by railings and the like so that edge detection fails, the text detection performance of such methods is poor. In addition, such methods suffer from relatively low precision and slow speed.
To address these problems, the present invention uses super-pixels to correct the failure of edge detection in complex environments, and proposes a fast, high-precision stroke width calculation method to improve the accuracy and efficiency of connected domain filtering. The present invention includes the following:
First, the stroke width transform (SWT) algorithm is used to estimate the stroke widths of text that may be present in the target image, and a stroke width histogram is then built from this information;
The step parameter of the super-pixels is set according to the stroke widths in the stroke histogram. Experiments show that when the stroke width value is close to the super-pixel step value, the edge detection result improves markedly and partially occluded and text-like regions can be removed. The boundaries between super-pixels are then compared with the Canny edge detection result and the edges are corrected, so as to achieve the highest edge detection quality for a given stroke width;
The detected connected domains are skeletonized using a distance transform and a gradient operator, and the skeleton pixels obtained after skeletonization are used to re-estimate a high-precision stroke width as the basis for filtering characters and non-characters;
Using the spatial distribution of the connected domains, geometric constraints such as stroke width consistency, aspect ratio, and overlap between connected domains are applied to further filter characters from non-characters and text lines from non-text lines;
Experimental results on large-scale public datasets demonstrate the effectiveness of the proposed stroke width histogram, the super-pixel algorithm, and the stroke width calculation method based on fast skeletonization of connected domains.
The super-pixel initialization method based on the stroke width histogram and the stroke width computation method based on fast skeletonization of connected domains comprise the following four steps:
(1) Compute the edges present in the picture using the Canny edge detection operator, and compute the gradient of the whole picture using the Sobel operator. Then, starting from each edge pixel, search along its gradient direction for a paired edge pixel. If a paired edge pixel is found and its gradient direction differs from that of the starting edge pixel by between 150 and 210 degrees, compute the distance between the two pixels and set the stroke width to that distance;
(2) Generate the stroke width histogram from the stroke widths obtained in step (1). To reduce computational complexity, the bin width h of the histogram is determined by minimizing an L2 risk function that depends on the mean and the standard deviation v of the number of pixels falling into the histogram bins;
(3) The simple linear iterative clustering (SLIC) algorithm is used as the super-pixel algorithm. The several dominant stroke widths with the highest frequencies in the stroke width histogram are chosen as the search step sizes of the super-pixels; correspondingly, near each lattice point of a grid spaced at the step size, the position of minimum local gradient is selected as the initial centroid of a super-pixel. The actual centroid and boundary of each super-pixel on the picture are then updated iteratively. The threshold of the Canny edge detection operator is lowered so that the edges in the picture are detected more completely, and these edges are then corrected by comparing them with the super-pixel boundaries. This removes interference inconsistent with the current stroke width, so that the edge detection result conforms to the stroke width rule as far as possible and edge detection improves.
(4) Perform connected domain analysis on the new edge detection result and compute the Euclidean distance transform map of the edges. Then compute the gradient of the distance transform map using the Sobel operator. Because the distance transform value changes slowly at the stroke center of a connected domain, pixels whose gradient is approximately zero are regarded as skeleton pixels. The stroke width of a connected domain can then be obtained from the distance transform values of these skeleton pixels.
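The pairing test of step (1) can be illustrated with a minimal sketch. It assumes that the edge map and the per-pixel gradient direction (in radians) have already been produced by Canny and Sobel operators; the function name `stroke_width_at`, the `max_steps` limit and the toy data are hypothetical and are not taken from the patent.

```python
import math


def stroke_width_at(edges, grad_dir, x, y, max_steps=50):
    """Walk from edge pixel (x, y) along its gradient direction until another
    edge pixel is met. Return the distance between the two pixels if their
    gradient directions differ by 150 to 210 degrees, otherwise None."""
    height, width = len(edges), len(edges[0])
    theta = grad_dir[y][x]
    dx, dy = math.cos(theta), math.sin(theta)
    for step in range(1, max_steps + 1):
        px = int(round(x + step * dx))
        py = int(round(y + step * dy))
        if not (0 <= px < width and 0 <= py < height):
            return None  # the ray left the picture without finding a partner
        if (px, py) == (x, y) or not edges[py][px]:
            continue  # still on the start pixel, or not an edge pixel
        diff = math.degrees(grad_dir[py][px] - theta) % 360.0
        if 150.0 <= diff <= 210.0:
            return math.hypot(px - x, py - y)
        return None  # an edge was hit, but the gradients are not opposed
    return None


# Toy picture (assumed data): two edge pixels on one row with opposite gradients.
edges = [[False] * 10 for _ in range(3)]
grad_dir = [[0.0] * 10 for _ in range(3)]
edges[1][2] = edges[1][7] = True
grad_dir[1][7] = math.pi  # opposite to the gradient at (2, 1)
width = stroke_width_at(edges, grad_dir, 2, 1)
```

Collecting the returned distances over all edge pixels gives the raw stroke widths from which the histogram of step (2) is built.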
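The text of step (2) does not reproduce the L2 risk expression, so the following sketch substitutes a well-known bin width rule of the same kind (the Shimazaki-Shinomoto cost, which uses the mean and the variance of the bin counts). Whether this matches the patent's expression is an assumption, and the source speaks of a standard deviation rather than a variance. All function names are hypothetical.

```python
def bin_counts(values, h):
    """Count how many values fall into each bin of width h."""
    lo = min(values)
    n_bins = int((max(values) - lo) / h) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - lo) / h)] += 1
    return counts


def l2_cost(values, h):
    """Assumed L2 risk estimate: (2 * mean - variance) / h**2, where mean and
    variance are taken over the bin counts (not the patent's own formula)."""
    counts = bin_counts(values, h)
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    return (2.0 * mean - var) / (h * h)


def best_bin_width(values, candidates):
    """Return the candidate bin width with the smallest estimated risk."""
    return min(candidates, key=lambda h: l2_cost(values, h))


stroke_widths = [0, 0, 1, 1, 2, 3]  # toy stroke widths in pixels
chosen = best_bin_width(stroke_widths, [1, 2])
```

In practice the candidate list would span a range of plausible widths, and the sample would contain the stroke widths of every paired edge pixel.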
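The seeding part of step (3) can be sketched as follows. The sketch picks the most frequent stroke widths as step sizes and moves each grid seed to the lowest-gradient pixel in its 3x3 neighbourhood, which is the usual SLIC initialization; the neighbourhood size, the rounding of widths to integers and the names `dominant_widths` and `seed_centroids` are assumptions. The iterative SLIC update and the edge correction are not shown.

```python
from collections import Counter


def dominant_widths(stroke_widths, top_k=2):
    """Return the top_k most frequent stroke widths, rounded to whole pixels."""
    counts = Counter(int(round(w)) for w in stroke_widths)
    return [w for w, _ in counts.most_common(top_k)]


def seed_centroids(grad_mag, step):
    """Place one seed per lattice point of a grid with spacing `step`, moved
    to the lowest-gradient pixel within the 3x3 neighbourhood of that point."""
    height, width = len(grad_mag), len(grad_mag[0])
    seeds = []
    for gy in range(step // 2, height, step):
        for gx in range(step // 2, width, step):
            best = (gx, gy)
            for ny in range(max(gy - 1, 0), min(gy + 2, height)):
                for nx in range(max(gx - 1, 0), min(gx + 2, width)):
                    if grad_mag[ny][nx] < grad_mag[best[1]][best[0]]:
                        best = (nx, ny)
            seeds.append(best)
    return seeds


# Toy data (assumed): a flat gradient map with one low-gradient pixel.
steps = dominant_widths([3.1, 2.9, 3.0, 7.2, 6.8, 12.0])
grad_mag = [[1.0] * 6 for _ in range(6)]
grad_mag[2][1] = 0.0
seeds = seed_centroids(grad_mag, steps[0])
```

Running the seeding once per dominant stroke width yields one super-pixel segmentation per candidate width, which is what allows the edge correction to be tuned to "the current stroke width".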
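Step (4) can be illustrated on a toy stroke. The sketch below uses a brute-force Euclidean distance transform and a 3x3 Sobel gradient, keeps the pixels of one connected domain whose gradient magnitude is near zero, and takes twice their mean distance value as the stroke width. The factor of two, the tolerance `eps`, the `mask` argument and all function names are assumptions made for illustration; the patent only states that the width is obtained from the distance transform values of the skeleton pixels.

```python
import math


def distance_to_edges(edges):
    """Brute-force Euclidean distance from every pixel to the nearest edge pixel."""
    h, w = len(edges), len(edges[0])
    pts = [(x, y) for y in range(h) for x in range(w) if edges[y][x]]
    return [[min(math.hypot(x - ex, y - ey) for ex, ey in pts)
             for x in range(w)] for y in range(h)]


def sobel_magnitude(d, x, y):
    """Gradient magnitude of map d at interior pixel (x, y) using 3x3 Sobel kernels."""
    gx = (d[y - 1][x + 1] + 2 * d[y][x + 1] + d[y + 1][x + 1]
          - d[y - 1][x - 1] - 2 * d[y][x - 1] - d[y + 1][x - 1])
    gy = (d[y + 1][x - 1] + 2 * d[y + 1][x] + d[y + 1][x + 1]
          - d[y - 1][x - 1] - 2 * d[y - 1][x] - d[y - 1][x + 1])
    return math.hypot(gx, gy)


def skeleton_stroke_width(edges, mask, eps=1e-6):
    """Return (stroke width, skeleton pixels) for the connected domain `mask`."""
    dist = distance_to_edges(edges)
    h, w = len(edges), len(edges[0])
    skeleton = [(x, y) for y in range(1, h - 1) for x in range(1, w - 1)
                if mask[y][x] and sobel_magnitude(dist, x, y) < eps]
    if not skeleton:
        return None, []
    mean_half = sum(dist[y][x] for x, y in skeleton) / len(skeleton)
    return 2.0 * mean_half, skeleton  # assumed: width is twice the centre distance


# Toy stroke (assumed data): edges on rows 1 and 7, stroke interior on rows 2 to 6.
edges = [[y in (1, 7) for _ in range(12)] for y in range(9)]
mask = [[2 <= y <= 6 for _ in range(12)] for y in range(9)]
width, skeleton = skeleton_stroke_width(edges, mask)
```

On real pictures the gradient at the stroke centre is small rather than exactly zero, so `eps` would need to be tuned, and a fast distance transform would replace the brute-force loop.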
After the above steps, a high-precision stroke width is available for each connected domain. The connected domains can then be filtered preliminarily according to whether the stroke width within each connected domain is consistent. Because characters in scene images seldom appear in isolation, the connected domains are filtered further using the features of text lines: within the same text line, character size, aspect ratio, stroke width and color should be similar, and connected domains that do not satisfy these constraints are filtered out. Finally, based on statistics of the distances between the characters in a text line, an intra-word character distance threshold and an inter-word distance threshold are set, and the text line is divided into characters for use by a subsequent character recognition module.
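The final splitting of a text line by distance thresholds can be sketched as below. The patent does not say which statistic defines the thresholds, so this sketch assumes a single threshold equal to a multiple of the median gap between neighbouring character boxes; the `factor` parameter, the box representation `(x_left, x_right)` and the function name are hypothetical.

```python
from statistics import median


def split_into_words(boxes, factor=2.0):
    """Group character boxes (x_left, x_right) of one text line into words.
    A gap larger than factor * median gap is treated as a word boundary."""
    boxes = sorted(boxes)
    if len(boxes) < 2:
        return [boxes] if boxes else []
    gaps = [b[0] - a[1] for a, b in zip(boxes, boxes[1:])]
    threshold = factor * median(gaps)
    words = [[boxes[0]]]
    for gap, box in zip(gaps, boxes[1:]):
        if gap > threshold:
            words.append([box])  # large gap: start a new word
        else:
            words[-1].append(box)  # small gap: same word
    return words


# Toy text line (assumed data): three characters, a wide gap, two characters.
line = [(0, 8), (10, 18), (20, 28), (40, 48), (50, 58)]
words = split_into_words(line)
```

A median-based threshold is used here only because it is robust to the few large inter-word gaps; any other statistic of the gap distribution could be substituted.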
To verify the effectiveness of the invention, experiments were carried out on the public datasets ICDAR2005 and ICDAR2011. The ICDAR2005 dataset contains 509 color pictures with resolutions between 307 × 93 and 1280 × 960; the training set and the test set contain 258 and 251 pictures respectively, and the pictures contain 1114 characters. The ICDAR2011 dataset contains 484 pictures, comprising 229 training pictures and 255 test pictures, with 1189 characters. All experimental results are evaluated at the text line level. The comparison of the present invention with other mainstream detection algorithms of recent years on ICDAR2005 and ICDAR2011 is shown in Table 1 and Table 2. The experimental results indicate that the present invention obtains the highest f-measure among the compared algorithms on both datasets.
Algorithm | Precision | Recall | f-measure
The present invention | 81% | 67% | 73%
Epshtein [1] | 73% | 60% | 66%
Fabrizio [5] | 46% | 39% | 43%
Huang [2] | 81% | 74% | 72%
Yao [6] | 69% | 66% | 67%
Table 1. Comparison of the present invention with other algorithms on ICDAR2005
Algorithm | Precision | Recall | f-measure
The present invention | 80% | 69% | 74%
Huang [2] | 82% | 75% | 73%
Neumann [7] | 73% | 65% | 69%
Yi [8] | 76% | 68% | 67%
Neumann [9] | 67% | 58% | 62%
Table 2. Comparison of the present invention with other algorithms on ICDAR2011
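As a quick check on how the third column relates to the first two, the sketch below assumes that the f-measure is the harmonic mean of precision and recall. The evaluation protocol actually used for the tables is not described in the text, so this is only an illustration applied to the rows for the present invention; the function name is hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


icdar2005 = round(f_measure(0.81, 0.67), 2)  # row "The present invention", ICDAR2005
icdar2011 = round(f_measure(0.80, 0.69), 2)  # row "The present invention", ICDAR2011
```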
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (8)

1. A scene image text detection method based on a histogram and super-pixels, characterized in that it comprises the following steps:
Step 1: estimate the stroke width values of text that may be present in the target picture, and generate a stroke histogram based on the stroke width values;
Step 2: set the stroke width values in the stroke histogram as the step parameter of the super-pixels; perform edge detection on the target picture; compare the super-pixels obtained with this step parameter against the edge detection result and correct the result, obtaining the connected domains with the highest edge detection quality for said stroke width values;
Step 3: skeletonize the connected domains to obtain skeleton pixels; estimate the stroke width from the skeleton pixels to obtain a high-precision stroke width;
Step 4: filter the target picture according to the high-precision stroke width, distinguishing characters from non-characters, to obtain characters;
Step 5: using the spatial distribution of the connected domains, further filter the obtained characters with geometric constraints to obtain accurate characters, and distinguish text lines from non-text lines in the target picture based on the accurate characters, to obtain text lines;
Step 6: complete the detection of the accurate characters and text lines in the target picture;
said step 2 specifically comprises the following steps:
Step 2.1: select the several stroke width values with the highest frequencies in the stroke histogram as the search step values of the super-pixels;
Step 2.2: find the lattice points spaced at the search step value, and select the position of minimum gradient near each lattice point as the initial centroid of a super-pixel;
Step 2.3: iterate steps 2.1 and 2.2, updating and computing the actual centroid and boundary of each super-pixel on the picture;
Step 2.4: lower the threshold of the Canny edge detection operator and detect new wide-range edges in the picture;
Step 2.5: compare the wide-range edges with the super-pixel boundaries and correct them, removing from the corrected wide-range edges the interference inconsistent with the current stroke width, to obtain wide-range edges of the picture that conform to the stroke width rule;
Step 2.6: perform connected domain analysis on the wide-range edges of the picture and compute the Euclidean distance transform map of the wide-range edges, obtaining the connected domains with the highest edge detection quality for said stroke width values.
2. The scene image text detection method based on a histogram and super-pixels according to claim 1, characterized in that it further comprises step 7: collect statistics on the distances between the accurate characters in a text line, and set an intra-word character distance threshold and an inter-word distance threshold;
Step 8: divide the text line into accurate characters according to the intra-word character distance threshold and the inter-word distance threshold.
3. The scene image text detection method based on a histogram and super-pixels according to claim 2, characterized in that said step 3 is specifically: compute the gradient of the Euclidean distance transform map using the Sobel operator, and set the pixels whose gradient is close to zero as skeleton pixels; estimate the stroke width from the skeleton pixels to obtain a high-precision stroke width;
the geometric constraints in said step 5 include stroke width consistency, aspect ratio, and overlap between connected domains.
4. The scene image text detection method based on a histogram and super-pixels according to claim 3, characterized in that step 1 specifically comprises the following steps:
Step 1.1: compute the edge pixels in the target picture using the Canny edge detection operator; compute the gradient values of the target picture using the Sobel operator; obtain the gradient values of all edge pixels in the target picture;
Step 1.2: take one edge pixel as the reference edge pixel and search along the gradient direction of the reference edge pixel among all existing edge pixels; judge whether there is a mapped edge pixel paired with the reference edge pixel; if so, execute step 1.3; otherwise, delete the edge pixel serving as the reference edge pixel and return to step 1.2;
Step 1.3: judge whether the difference between the gradient value of the mapped edge pixel and that of the reference edge pixel lies between 150 degrees and 210 degrees; if so, execute step 1.4; otherwise, delete the edge pixel serving as the reference edge pixel and return to step 1.2;
Step 1.4: compute the distance between the mapped edge pixel and the reference edge pixel to obtain a stroke width value;
Step 1.5: judge whether any edge pixels remain; if so, return to step 1.2; otherwise, execute step 1.6;
Step 1.6: generate the stroke histogram based on the stroke width values obtained in step 1.4.
5. A scene image text detection system based on a histogram and super-pixels, characterized in that it comprises: an estimation module, an edge detection module, a skeletonization module, a filtering module and a secondary filtering module;
said estimation module estimates the stroke width values of text that may be present in the target picture, generates a stroke histogram based on the stroke width values, and sends the stroke histogram to the edge detection module;
said edge detection module sets the stroke width values in the stroke histogram as the step parameter of the super-pixels; performs edge detection on the target picture; compares the super-pixels obtained with this step parameter against the edge detection result and corrects the result, obtaining the connected domains with the highest edge detection quality for said stroke width values; and sends the connected domains obtained to the skeletonization module;
said skeletonization module skeletonizes the connected domains to obtain skeleton pixels; estimates the stroke width from the skeleton pixels to obtain a high-precision stroke width, and sends the high-precision stroke width to the filtering module;
said filtering module filters the target picture according to the high-precision stroke width, distinguishing characters from non-characters, to obtain characters;
said secondary filtering module, using the spatial distribution of the connected domains, further filters the obtained characters with geometric constraints to obtain accurate characters, and distinguishes text lines from non-text lines in the target picture based on the accurate characters, to obtain text lines;
said edge detection module comprises: a step selection module, a centroid selection module, an iterative update module, a wide-range detection module, a correction module and a connected domain analysis module;
said step selection module selects the several stroke width values with the highest frequencies in the stroke histogram as the search step values of the super-pixels;
said centroid selection module finds the lattice points spaced at the search step value, and selects the position of minimum gradient near each lattice point as the initial centroid of a super-pixel;
said iterative update module is configured to iteratively update and compute the actual centroid and boundary of each super-pixel on the picture;
said wide-range detection module lowers the threshold of the Canny edge detection operator and detects new wide-range edges in the picture;
said correction module compares the wide-range edges with the super-pixel boundaries and corrects them, removing from the corrected wide-range edges the interference inconsistent with the current stroke width, to obtain wide-range edges of the picture that conform to the stroke width rule;
said connected domain analysis module is configured to perform connected domain analysis on the wide-range edges of the picture and to compute the Euclidean distance transform map of the wide-range edges.
6. The scene image text detection system based on a histogram and super-pixels according to claim 5, characterized in that it further comprises a statistics module and a segmentation module;
said statistics module is configured to collect statistics on the distances between the accurate characters in a text line, and to set an intra-word character distance threshold and an inter-word distance threshold;
said segmentation module divides the text line into accurate characters according to the intra-word character distance threshold and the inter-word distance threshold.
7. The scene image text detection system based on a histogram and super-pixels according to claim 6, characterized in that said skeletonization module is specifically configured to compute the gradient of the Euclidean distance transform map using the Sobel operator and to set the pixels whose gradient is close to zero as skeleton pixels; the stroke width is estimated from the skeleton pixels to obtain a high-precision stroke width; the geometric constraints in said secondary filtering module include stroke width consistency, aspect ratio, and overlap between connected domains.
8. The scene image text detection system based on a histogram and super-pixels according to claim 7, characterized in that said estimation module comprises: a gradient module, a pair search module, a mapping search module and a computation module;
said gradient module computes the edge pixels in the target picture using the Canny edge detection operator; computes the gradient values of the target picture using the Sobel operator; and obtains the gradient values of all edge pixels in the target picture;
said pair search module takes one edge pixel as the reference edge pixel, searches along the gradient direction of the reference edge pixel among all existing edge pixels, and searches for a mapped edge pixel paired with the reference edge pixel;
said mapping search module searches for mapped edge pixels whose gradient value differs from that of the reference edge pixel by between 150 degrees and 210 degrees, and sends the mapped edge pixels obtained to the computation module;
said computation module is configured to compute the distance between the mapped edge pixel and the reference edge pixel to obtain a stroke width value.
CN201410168244.0A 2014-04-24 2014-04-24 Scene image text detection method and system based on histogram and super-pixels Expired - Fee Related CN103942797B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410168244.0A CN103942797B (en) 2014-04-24 2014-04-24 Scene image text detection method and system based on histogram and super-pixels

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410168244.0A CN103942797B (en) 2014-04-24 2014-04-24 Scene image text detection method and system based on histogram and super-pixels

Publications (2)

Publication Number Publication Date
CN103942797A CN103942797A (en) 2014-07-23
CN103942797B true CN103942797B (en) 2017-01-25

Family

ID=51190448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410168244.0A Expired - Fee Related CN103942797B (en) 2014-04-24 2014-04-24 Scene image text detection method and system based on histogram and super-pixels

Country Status (1)

Country Link
CN (1) CN103942797B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104599275B (en) * 2015-01-27 2018-06-12 浙江大学 The RGB-D scene understanding methods of imparametrization based on probability graph model
CN105005764B (en) * 2015-06-29 2018-02-13 东南大学 The multi-direction Method for text detection of natural scene
CN106845474B (en) * 2015-12-07 2020-05-08 富士通株式会社 Image processing apparatus and method
CN107301651A (en) * 2016-04-13 2017-10-27 索尼公司 Object tracking apparatus and method
CN106446920B (en) * 2016-09-05 2019-10-01 电子科技大学 A kind of stroke width transform method based on gradient amplitude constraint
CN107844803B (en) * 2017-10-30 2021-12-28 ***股份有限公司 Picture comparison method and device
CN108573260A (en) * 2018-03-29 2018-09-25 广东欧珀移动通信有限公司 Information processing method and device, electronic equipment, computer readable storage medium
CN108921155A (en) * 2018-04-23 2018-11-30 新疆大学 A kind of hand script Chinese input equipment Uighur words Slant Rectify method
CN109117843B (en) * 2018-08-01 2022-04-15 百度在线网络技术(北京)有限公司 Character occlusion detection method and device
CN109472221A (en) * 2018-10-25 2019-03-15 辽宁工业大学 A kind of image text detection method based on stroke width transformation
CN110047083B (en) * 2019-04-01 2021-01-29 江西博微新技术有限公司 Image noise point identification method, server and storage medium
CN111639646B (en) * 2020-05-18 2021-04-13 山东大学 Test paper handwritten English character recognition method and system based on deep learning
CN111709419A (en) * 2020-06-10 2020-09-25 中国工商银行股份有限公司 Method, system and equipment for positioning banknote serial number and readable storage medium
CN112801088B (en) * 2020-12-31 2024-05-31 科大讯飞股份有限公司 Method and related device for correcting distorted text line image
CN117831037A (en) * 2024-01-04 2024-04-05 北京和气聚力教育科技有限公司 Method and device for determining answer condition of objective questions in answer sheet

Also Published As

Publication number Publication date
CN103942797A (en) 2014-07-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170125