CN104112128A - Digital image processing system applied to bill image character recognition and method - Google Patents

Digital image processing system and method applied to bill image character recognition

Info

Publication number
CN104112128A
Authority
CN
China
Prior art keywords
character
image
bill
character zone
zone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410276103.0A
Other languages
Chinese (zh)
Other versions
CN104112128B (en)
Inventor
曾修远
苏永前
王彦红
程炜华
周程伟
赵文哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN201410276103.0A
Publication of CN104112128A
Application granted
Publication of CN104112128B
Active
Anticipated expiration

Landscapes

  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a digital image processing system and method applied to bill image character recognition. The system comprises an image parameter detection module for detecting bill image parameters, an image tilt detection module for detecting the degree of tilt of the bill face in the bill image, a character region detection module for locating the character region, a character region recognizability detection module for detecting character parameters and seal pixels in the character region, a character matching degree detection module for detecting the degree of matching between the characters in the character region and templates, and an opening feature detection module for detecting the opening features of the characters in the character region. The system and method improve the accuracy of bill image character recognition during check image analysis and processing in bank bill image exchange.

Description

Digital image processing system and method applied to bill image character recognition
Technical field
The present invention relates to the fields of digital image processing and optical character recognition, and in particular to a digital image processing system and method applied to bill image character recognition.
Background
With the development of digital image processing, pattern recognition and artificial intelligence, optical character recognition (OCR) technology has found increasing application in the financial field, especially in checking the legitimacy of the face elements of bill images and in bill business process re-engineering (BPR). The main processing flow of current OCR systems, shown in Fig. 1, can be divided into four steps: image input, image preprocessing, character feature extraction, and character matching and recognition.
The first step is image input. An optical instrument (a scanner, fax machine, digital camera or other camera) scans the object to be recognized, such as a book, document or certificate, and generates the corresponding image data. Factors such as the lighting conditions during image generation and the image resolution affect the quality and precision of subsequent recognition.
The second step is image preprocessing. Based on the characteristics of the image generated in the previous stage, the image is processed to facilitate character extraction in later stages. This mainly includes color correction, tilt correction, noise filtering, and uniform conversion of the image into a black-and-white binary image or a grayscale image. The detailed preprocessing flow has to be designed according to the characteristics and parameters of the image data; for example, if images show varying degrees of color cast, the preprocessing stage must include a color correction step.
The third and fourth steps are character feature extraction and matching recognition. The character region from the previous stage is segmented into characters, and features are extracted from each single character in preparation for subsequent matching. There are currently two main kinds of features. The first kind is statistical, for example the ratio of black to white pixels within a region. When the character area is divided into several regions, the black/white pixel ratios of the individual regions together form a numerical vector, called the feature vector, and subsequent recognition only has to compare against this feature vector. The second kind is structural: after the character image is thinned, the number and positions of the stroke endpoints and intersection points are obtained, or topological parameters such as stroke segments are used as features, and the result is obtained by comparison against a feature library.
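The statistical feature described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the zone grid and the use of the black-pixel fraction per zone (rather than a black/white ratio, which is undefined for an all-black zone) are assumptions made for illustration only.

```python
from typing import Dict, List


def zone_feature_vector(glyph: List[List[int]], rows: int, cols: int) -> List[float]:
    """Split a binary glyph (1 = black ink, 0 = white) into rows x cols zones
    and return the fraction of black pixels in each zone, in row-major order."""
    height, width = len(glyph), len(glyph[0])
    features = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            black = sum(glyph[y][x] for y in range(y0, y1) for x in range(x0, x1))
            area = (y1 - y0) * (x1 - x0)
            features.append(black / area if area else 0.0)
    return features


def nearest_label(vector: List[float], library: Dict[str, List[float]]) -> str:
    """Return the label whose stored vector has the smallest squared
    Euclidean distance to `vector` (hypothetical comparison rule)."""
    return min(
        library,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vector, library[label])),
    )
```

Recognition then amounts to computing the vector for a segmented character and looking up the closest entry in a library of reference vectors.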
As described above, the main processing structure of current character recognition methods, especially the third and fourth steps, is relatively mature and stable. The key factor affecting recognition accuracy is the image preprocessing of the second step. Because objects to be recognized differ in their own characteristics, and because scanning procedures, lighting conditions and individual imaging devices differ when the image data is generated, images contain many different kinds of "noise" (hereinafter, noise denotes any factor that affects the recognition of characters in an image). For reasons of efficiency, performance, cost and feasibility, a general-purpose recognition system often cannot take into account at design time every factor that may affect recognition accuracy. In other words, the design strategy of existing general-purpose systems is to recognize characters in as many kinds of images as possible, but the image parameter model they rely on cannot describe every image problem. As a result, some images flow into the subsequent recognition steps without suitable preprocessing, which ultimately causes recognition errors. One example is check number recognition in bank bill image exchange. Because the way checks are used is unpredictable, check numbers are often covered by seals or handwritten characters, and the generated bill images also differ in lighting and color cast. General-purpose recognition systems currently do not describe or consider these characteristics quantitatively, so their recognition results for bills are often unsatisfactory and can hardly meet the strict requirements of data entry in the financial industry, especially the entry of account information. This is in fact the key obstacle to the further adoption of optical character recognition in the financial industry.
Summary of the invention
An embodiment of the present invention provides a digital image processing system applied to bill image character recognition, in order to improve the accuracy of bill image character recognition during check image analysis and processing in bank bill image exchange. The system comprises:
An image parameter detection module, for detecting bill image parameters;
An image tilt detection module, for detecting the degree of tilt of the bill face in the bill image;
A character region detection module, for locating the character region;
A character region recognizability detection module, for detecting character parameters and seal pixels in the character region;
A character matching degree detection module, for detecting the degree of matching between the characters in the character region and templates;
An opening feature detection module, for detecting the opening features of the characters in the character region.
In an embodiment, the image parameter detection module is specifically configured to:
Detect whether the bill image meets the resolution, image size and image format requirements and whether it contains the complete bill face, and determine that the bill is sent to landing processing when a requirement is not met.
In an embodiment, the image tilt detection module is specifically configured to:
Obtain the tilt angle of the bill face by scanning the edge of the bill face in the bill image; when the tilt angle does not exceed a threshold, perform tilt correction and check the correction result; when the tilt angle exceeds the threshold, or tilt remains after correction, determine that the bill is sent to landing processing.
In an embodiment, the image tilt detection module is specifically configured to:
Record, by horizontal scanning, the coordinates of a set of points in the middle of the upper edge of the check, fit a straight line to the recorded coordinates, and then rotate the bill image using bilinear interpolation according to the tilt angle of the fitted upper-edge line.
In an embodiment, the character region detection module is specifically configured to:
Obtain, according to the structure of the bill face in the bill image and the offset of the bill face, the position coordinates and size of the check number region in the bill image, and judge whether the check number region can be segmented; if it cannot, determine that the bill is sent to landing processing.
In an embodiment, the character region detection module is specifically configured to:
Measure the horizontal and vertical extent of the black background at the upper right corner of the image to locate the exact position of the upper right corner of the bill face; delimit the position and size of the character region relative to the bill face; and determine the specific size and position coordinates of the character region by dynamic adjustment.
In an embodiment, the character region recognizability detection module is specifically configured to:
Locate and segment each single digit character in the character region, and detect whether the number, width and height of the digits and of the gaps between digits meet the character parameter requirements; scan the character region in the HSV color space to detect seal pixels in the character region; if the character parameter requirements are not met or the number of seal pixels exceeds a threshold, determine that the bill is sent to landing processing.
In an embodiment, the character matching degree detection module is specifically configured to:
Perform matching degree detection between each single digit character located and segmented in the character region and the templates, and compare the result with the bill image character recognition result.
In an embodiment, the opening feature detection module is specifically configured to:
Detect the opening features of each single digit character located and segmented in the character region; when the opening features, the matching degree and the recognition result are all consistent, determine that character recognition has succeeded; otherwise determine that the bill is sent to landing processing.
An embodiment of the present invention also provides a digital image processing method applied to bill image character recognition, in order to improve the accuracy of bill image character recognition during check image analysis and processing in bank bill image exchange. The method comprises:
Detecting bill image parameters;
Detecting the degree of tilt of the bill face in the bill image;
Locating the character region;
Detecting character parameters and seal pixels in the character region;
Performing matching degree detection between the characters in the character region and templates;
Detecting the opening features of the characters in the character region.
In an embodiment, detecting bill image parameters comprises:
Detecting whether the bill image meets the resolution, image size and image format requirements and whether it contains the complete bill face, and determining that the bill is sent to landing processing when a requirement is not met.
In an embodiment, detecting the degree of tilt of the bill face in the bill image comprises:
Obtaining the tilt angle of the bill face by scanning the edge of the bill face in the bill image; when the tilt angle does not exceed a threshold, performing tilt correction and checking the correction result; when the tilt angle exceeds the threshold, or tilt remains after correction, determining that the bill is sent to landing processing.
In an embodiment, the tilt correction comprises:
Recording, by horizontal scanning, the coordinates of a set of points in the middle of the upper edge of the check, fitting a straight line to the recorded coordinates, and then rotating the bill image using bilinear interpolation according to the tilt angle of the fitted upper-edge line.
In an embodiment, locating the character region comprises:
Obtaining, according to the structure of the bill face in the bill image and the offset of the bill face, the position coordinates and size of the check number region in the bill image, and judging whether the check number region can be segmented; if it cannot, determining that the bill is sent to landing processing.
In an embodiment, locating the character region comprises:
Measuring the horizontal and vertical extent of the black background at the upper right corner of the image to locate the exact position of the upper right corner of the bill face; delimiting the position and size of the character region relative to the bill face; and determining the specific size and position coordinates of the character region by dynamic adjustment.
In an embodiment, detecting character parameters and seal pixels in the character region comprises:
Locating and segmenting each single digit character in the character region, and detecting whether the number, width and height of the digits and of the gaps between digits meet the character parameter requirements; scanning the character region in the HSV color space to detect seal pixels in the character region; if the character parameter requirements are not met or the number of seal pixels exceeds a threshold, determining that the bill is sent to landing processing.
In an embodiment, performing matching degree detection between the characters in the character region and templates comprises:
Performing matching degree detection between each single digit character located and segmented in the character region and the templates, and comparing the result with the bill image character recognition result.
In an embodiment, detecting the opening features of the characters in the character region comprises:
Detecting the opening features of each single digit character located and segmented in the character region; when the opening features, the matching degree and the recognition result are all consistent, determining that character recognition has succeeded; otherwise determining that the bill is sent to landing processing.
The digital image processing system and method applied to bill image character recognition in the embodiments of the present invention can improve the accuracy of bill image character recognition during check image analysis and processing in bank bill image exchange, can meet the strict requirements of data entry in the financial industry, especially the entry of account information, and thus favor the further adoption of optical character recognition in the financial industry.
Brief description of the drawings
In order to explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 is the processing flow chart of an existing character recognition system described in the background;
Fig. 2 is a schematic diagram of the digital image processing system applied to bill image character recognition in an embodiment of the present invention;
Fig. 3 is an example of the vertical projection result of the check number region in an embodiment of the present invention;
Fig. 4 is a schematic diagram of matching a character against a template in an embodiment of the present invention;
Fig. 5 is an example of opening structure detection in an embodiment of the present invention;
Fig. 6 is an example flow chart of the digital image processing method applied to bill image character recognition in an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments are described in further detail below with reference to the drawings. The illustrative embodiments and their description are used to explain the present invention and do not limit it.
The embodiments of the present invention overcome the shortcomings of the prior art and provide a digital image processing system applied to bill image character recognition. The system can improve the accuracy of bill image character recognition during check image analysis and processing in bank bill image exchange, and in particular the accuracy of cashier's check number recognition.
In implementation, the embodiments of the present invention consider the recognition of check numbers comprehensively. They take into account the general characteristics of checks, such as the bill face structure, the dimensions and the structural features of the font. They also take into account the noise factors that a check may encounter during use and circulation and that affect recognition correctness, for example coverage by a seal or overlap with other characters, as well as factors caused by differences in equipment and environment, such as bill face tilt and non-uniform lighting. On this basis, the embodiments adopt the strategy of "detecting and filtering out, as far as possible, images that contain noise affecting recognition", and add to the existing recognition system a new system made up of several detection and correction modules. Fig. 2 is a schematic diagram of the digital image processing system applied to bill image character recognition in an embodiment of the present invention. As shown in Fig. 2, the system comprises an image parameter detection module, an image tilt detection module, a character region (ROI) detection module, an ROI recognizability detection module, a character matching degree detection module and an opening feature detection module. These detection modules are connected to the respective modules of the existing recognition system (an image generation module, an image preprocessing module, a character feature extraction module and a character feature matching module). Each detection module takes the result and intermediate state of a particular module of the existing recognition system as input and checks, filters or corrects part of the noise that may affect the correctness of the next module's result. If the detection criteria are met, the subsequent recognition or detection module is instructed to run. Otherwise, if a detection module finds noise that can affect the correctness of the recognition result, the recognition of this bill image is stopped and the bill is sent to landing processing, which means that the bill is processed with human participation. Finally, after recognition, the recognition result is stored in a database. The functions of the modules of the digital image processing system applied to bill image character recognition are as follows:
Image parameter detection module: detects bill image parameters. For example, this module can be connected after the image generation module to detect whether the bill image it generates meets the parameter standards, including the resolution, image size and image format and whether the image contains the complete bill face. If a requirement is not met, the bill is sent to landing processing.
Image tilt detection module: detects the degree of tilt of the bill face in the bill image. For example, this module can quantitatively measure the tilt of the bill face in the generated image by scanning the upper edge of the bill face to obtain its tilt angle. If the tilt angle is small, that is, within a certain range (for example, not exceeding a threshold), tilt correction is performed and the correction result is checked. If the tilt angle is too large (exceeding the threshold), or tilt remains after correction, the bill is sent to landing processing.
Character region (ROI) detection module: locates the character region. For example, this module locates the ROI: based on the structure of the bill face and the offset of the bill face within the image, it obtains the exact position coordinates and size of the check number region in the image. During location it judges whether the check number region can be segmented; if not, the bill is sent to landing processing. It can also make a preliminary check for other characters overlapping the ROI; if the ROI cannot be segmented completely, the bill is sent to landing processing.
ROI recognizability detection module: detects character parameters and seal pixels in the character region. For example, this module locates and segments each single digit character in the ROI and detects whether the number, width and height of the digits and of the gaps between them meet the requirements. In addition, it scans the ROI in the HSV color space to detect whether the red or blue of a seal is present. If the character parameters are out of the standard range, or there are too many red or blue seal pixels (for example, the pixel count exceeds a threshold), the bill is sent to landing processing.
Character matching degree detection module: performs matching degree detection between the characters in the character region and templates. For example, this module matches the single characters segmented by the ROI recognizability detection module against the templates and compares the result with the recognition result of the final character matching and recognition module of the existing recognition system.
Opening feature detection module: detects the opening features of the characters in the character region. This module detects the opening features of each single digit character. In an embodiment, if the three results (opening features, matching degree and recognition result) are consistent, recognition is judged successful and the whole system finishes running; otherwise the bill is sent to landing processing.
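The seal pixel check of the ROI recognizability detection module can be sketched as follows. The text only states that the ROI is scanned in the HSV color space for the red or blue of a seal and that a pixel-count threshold applies; the hue ranges, the saturation and value limits, the default threshold and all function names below are assumptions chosen for illustration.

```python
import colorsys


def is_seal_pixel(rgb, min_sat=0.4, min_val=0.3):
    """Return True if an (R, G, B) pixel with 0-255 channels looks like red
    or blue seal ink. Hue ranges and limits are illustrative assumptions."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    if s < min_sat or v < min_val:
        return False  # too gray or too dark to count as colored ink
    hue_deg = h * 360.0
    is_red = hue_deg <= 20.0 or hue_deg >= 340.0
    is_blue = 200.0 <= hue_deg <= 260.0
    return is_red or is_blue


def count_seal_pixels(roi):
    """Count seal-colored pixels in an ROI given as rows of (R, G, B) tuples."""
    return sum(1 for row in roi for pixel in row if is_seal_pixel(pixel))


def roi_needs_landing(roi, max_seal_pixels=20):
    """True when the seal pixel count exceeds the (assumed) threshold."""
    return count_seal_pixels(roi) > max_seal_pixels
```

Black digits and the white bill background both have near-zero saturation, so only colored ink contributes to the count.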
As mentioned above, to solve the problems of the prior art, the embodiments of the present invention detect, filter, process or correct, before and after each processing stage of the whole recognition process, the factors in the check image that qualitatively or quantitatively affect recognition accuracy. On the basis of the prior-art scheme, the embodiments add several detection modules: the image parameter detection module, the image tilt detection module, the character region (ROI) detection module, the ROI recognizability detection module, the character matching degree detection module and the opening feature detection module. They are connected before and after the basic modules of the existing system and check the result of each basic module to ensure the correctness of the recognition result. Specifically, the image parameter detection module takes the image generated by the image generation module as input and detects the specific parameters of the generated bill image. The image tilt detection module takes the image corrected by the image preprocessing module as input, detects the degree of tilt of the bill face in that image and performs a redundant correction. The character region (ROI) detection module and the ROI recognizability detection module take the image corrected by the image preprocessing module as input and detect and extract the check number region. The character matching degree detection module takes as input the binary character images generated by the ROI recognizability detection module and the recognition result of the character feature matching module, and performs the final check on the recognition result.
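The way the detection modules gate the recognition flow can be pictured with a short Python sketch. It is a simplified model, not the patent's implementation: representing a bill as a dictionary, representing each detection module as a named predicate, and representing the three final results as digit strings are all assumptions.

```python
from typing import Callable, List, Optional, Tuple

# A check is a (name, predicate) pair; the predicate returns True when the
# bill passes that detection module. Names and predicates are hypothetical.
Check = Tuple[str, Callable[[dict], bool]]


def run_detection_chain(bill: dict, checks: List[Check]) -> Optional[str]:
    """Run the checks in order and return the name of the first failing
    check, or None when every check passes. A failure means the bill image
    leaves automatic recognition and goes to landing processing."""
    for name, passes in checks:
        if not passes(bill):
            return name
    return None


def final_decision(ocr_result: str, template_result: str, opening_result: str) -> str:
    """Accept the recognition only when the OCR result, the template
    matching result and the opening feature result all agree."""
    if ocr_result == template_result == opening_result:
        return "recognized"
    return "landing"
```

Stopping at the first failed check mirrors the description: once one module finds noise that could affect the result, later modules are not run for that image.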
In a specific example, the image parameter detection module can require the image generation module to follow certain parameter settings and can check these image parameters after the image is generated. When images are acquired, the scanning device can be required to be a current mainstream flatbed scanner, preferably one with automatic image cropping, for example the Fujitsu fi-5220C high-speed scanner. During scanning, the four sides of the check should be kept as parallel as possible to the scan frame of the scanner. The check image generated by scanning can, for example, be required to have the following characteristics when checked:
1. The image is a color image with a resolution of 200 dpi;
2. The image is 1500 ± 100 pixels wide and 650 ± 50 pixels high (hereinafter, image sizes and coordinates are all in pixels);
3. The image is stored in one of 24-bit JPG, TIFF or 256-color BMP format; 24-bit JPG is recommended;
4. The whole bill face is clearly visible in the image, and the background at the image edges outside the bill face is pure black, with RGB value (0, 0, 0);
5. The check does not tilt noticeably relative to the whole image, and the check has been examined before scanning to ensure that the bill elements, especially the check number to be recognized, have not been deliberately altered (under the rules for check use, the branch teller accepting a bill is responsible for checking that it is legible and has not been altered).
If the parameters of the generated check image do not meet the above standards, the image is judged unrecognizable or is rescanned.
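The numeric part of these requirements (items 1 to 3) can be expressed as a small Python check. This is only a sketch: the function name and the returned labels are hypothetical, the values are assumed to have been read from the image file by the caller, and bit depth and the visual requirements of items 4 and 5 are not covered.

```python
ALLOWED_FORMATS = {"JPG", "TIFF", "BMP"}  # formats listed in item 3


def check_image_parameters(width, height, dpi, fmt):
    """Return the list of requirements the image fails; an empty list means
    the image passes the numeric checks."""
    problems = []
    if dpi != 200:
        problems.append("resolution")
    if not 1400 <= width <= 1600:   # 1500 +/- 100 pixels
        problems.append("width")
    if not 600 <= height <= 700:    # 650 +/- 50 pixels
        problems.append("height")
    if fmt.upper() not in ALLOWED_FORMATS:
        problems.append("format")
    return problems
```

Returning every failed requirement, rather than stopping at the first, makes it easier to tell an operator why an image has to be rescanned.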
In a specific example, the image tilt detection module can be responsible for processing the bill image produced by the previous module: it corrects the tilt of the bill face in the image and filters out images whose tilt is too large to correct or whose size is abnormal. In an embodiment, the following quantitative rules can be set: a check image whose tilt angle exceeds ±15 degrees is judged unrecognizable; a check image whose width falls outside the range of 1400 to 1600 pixels, or whose height falls outside the range of 600 to 700 pixels, is judged unrecognizable. Images whose parameters are within these ranges undergo tilt correction: first the coordinates of a set of points in the middle of the upper edge of the check are recorded by horizontal scanning, then a straight line is fitted to the coordinates, and finally the image is rotated using bilinear interpolation according to the tilt angle of the fitted upper-edge line, which completes the correction. The specific flow of tilt correction and detection in the image tilt detection module can be as follows:
1. Detect the resolution Resolution, the height Height and the width Width of the check image. If Width is not within 1400 to 1600 pixels, or Height is not within 600 to 700 pixels, the image is judged unrecognizable; if Resolution is not 200 dpi, the image is judged unrecognizable;
2. Image edge processing: on the four edges of the image, set every pixel within the one-pixel-wide border to black, that is, change the RGB value of these pixels to (0, 0, 0);
3. Check edge detection: first, on the horizontal axis, take the point at half the image width as the center and determine the range Width * 0.5 ± 100 pixels. Within this range, start from the pixel at coordinates [Width * 0.5 - 100, 1]: with the abscissa fixed, scan the RGB values of the pixels whose ordinate runs from 1 to Height. When the values of all three RGB channels of the scanned pixel and of the two consecutive pixels below it are all greater than (50, 50, 50), the edge of the bill face region is considered found; record the coordinates of this edge point and stop scanning this column. Then start from the pixel at [Width * 0.5 - 100 + 1, 1], again fix the abscissa, and scan the pixels whose ordinate runs from 1 to Height. Repeat this step until all columns in [Width * 0.5 - 100, Width * 0.5 + 100] have been scanned. While scanning any column, if the ordinate has passed Height * 0.25 and the RGB value of the pixel has still not exceeded (50, 50, 50), stop scanning and judge the image unrecognizable; if the ordinate of any edge pixel found differs from that of the previous or next edge pixel by more than 2 pixels, stop scanning and judge the image unrecognizable;
4. Tilt angle calculation and correction: from the coordinates of the point set on the upper edge of the check, fit the slope of the upper-edge line by least squares, use the slope to obtain the tilt angle of the bill face relative to the check image, and rotate the image about its center using bilinear interpolation to correct it. After the tilt angle is obtained, if it exceeds ±15 degrees, the image is judged unrecognizable. After correction, repeat steps 3 and 4 to obtain the tilt angle of the corrected image; if the tilt angle of the bill face still exceeds ±0.5 degrees after correction, the image is judged unrecognizable.
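Steps 3 and 4 can be sketched in plain Python as follows. The sketch makes several assumptions that are not in the original: the image is a list of rows of (R, G, B) tuples with 0-based indices, a pixel belongs to the bill face when all three channels exceed the threshold (the background being black), and the function names are hypothetical. The bilinear rotation itself is left out; in practice an image library routine would perform it.

```python
import math


def find_top_edge_points(image, half_window=100, threshold=50, max_jump=2):
    """Scan the columns around the horizontal center from the top down and
    return one (x, y) upper-edge point per column, or None when the image
    should be rejected (no edge in the top quarter, or a jump of more than
    max_jump pixels between neighboring columns)."""
    height, width = len(image), len(image[0])
    center = width // 2
    limit = int(height * 0.25)
    points = []
    for x in range(max(0, center - half_window), min(width, center + half_window + 1)):
        edge_y = None
        for y in range(0, min(limit, height - 2)):
            # Edge: this pixel and the two below it are bright in all channels.
            if all(min(image[y + k][x]) > threshold for k in range(3)):
                edge_y = y
                break
        if edge_y is None:
            return None
        if points and abs(edge_y - points[-1][1]) > max_jump:
            return None
        points.append((x, edge_y))
    return points


def fit_slope(points):
    """Least-squares slope of y against x; needs at least two distinct x values."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def tilt_angle_degrees(points):
    """Tilt of the fitted upper edge in degrees (image y axis points down)."""
    return math.degrees(math.atan(fit_slope(points)))
```

A caller would reject the image when `find_top_edge_points` returns None or when the absolute angle exceeds 15 degrees, and would run the same two functions again on the rotated image to check the 0.5 degree residual.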
In instantiation, character zone (ROI) detection module is after the check image that obtains correcting, need to be further positioned at the region-of-interest (ROI of the check number in the check upper right corner, take and hereinafter arrange ROI and in image, just comprise the region of check number, be i.e. a sub regions of image) so that next module is extracted character feature.Owing to there is uncertain black background, face of the value part is fixing in check image, and the total reason that has skew of later stage printing, and the relative coordinate systems in image of check number in the upper right corner is also not bery fixing.Position coordinates and size that this module is responsible for eliminating this uncertain shift factor and is located the definite region of check number.First can measure the length on black background in the image upper right corner horizontal and vertical, thereby the face of the value upper right corner, location is position fixing position really, then can delimit Position Approximate and the size of the relative nominal value in ROI region, finally by dynamically adjusting concrete size and the position coordinates of further determining ROI.Idiographic flow can be as follows:
1. Upper-right black background detection: since both the check face and the image are rectangular and the image has been rotation-corrected, start from the midpoint of the upper edge of the image, that is, the pixel at [Width × 0.5, 1], and with the abscissa fixed scan the ordinate from 1 to Height until the RGB values of three consecutive pixels are all less than (50, 50, 50); the upper edge of the check face is then considered found. The ordinate of this edge pixel is labeled Height_Blackground. Similarly, start from the midpoint of the right edge of the image, that is, the pixel at [Width, Height × 0.5], and with the ordinate fixed scan the abscissa from Width down to 1 until the RGB values of three consecutive pixels are all less than (50, 50, 50), which marks the right edge of the check face. The difference between the image width Width and the abscissa of this edge pixel is labeled Width_Blackground;
2. Preliminary ROI placement: the check number region ROI is a rectangle that just contains the digits of the check number. Based on empirical values, the initial upper-left corner of the rectangle is taken to be 50 pixels from the upper edge of the check face and 300 pixels from its right edge. The width ROI_Width of the ROI is 200 pixels and the height ROI_Height is 53 pixels;
3. Dynamic ROI adjustment: first take the green (G) channel of the three RGB channels within the preliminary ROI to obtain the grayscale image of the ROI, labeled Gray_ROI. Then apply the maximum between-class variance method (Otsu) to this grayscale image for adaptive threshold segmentation, obtaining the binary image of the initial ROI, labeled Binary_ROI. The coordinate of the upper-left corner of the ROI is labeled [new_ROI_X, new_ROI_Y]. The binary image contains only black and white pixels; black pixels have the value 0 and white pixels the value 1. After the binary image of the ROI is obtained, its four edges, each one pixel wide, are scanned in turn. Take the left edge as an example: starting from the pixel at [Width-Width_Blackground-300, Height_Blackground+50], keep the abscissa fixed and examine the gray values of the pixels whose ordinate runs from Height_Blackground+50 to Height_Blackground+50+53. If a black pixel is present, move the left edge of the ROI one pixel to the right; that is, the set of pixels (a line segment) starting at [Width-Width_Blackground-300+1, Height_Blackground+50], with the abscissa fixed and the ordinate running from Height_Blackground+50 to Height_Blackground+50+53, becomes the new left edge of the ROI, and the new edge is checked again for black pixels. The left edge keeps moving right in this way until it contains no black pixel. The lower, right and upper edges are adjusted in the same manner: the lower edge moves up, the right edge moves left, and the upper edge moves down. When a new edge contains no black pixel, that is, consists entirely of white pixels, a suitable edge is considered found and the movement stops. If any edge still contains black pixels after being moved more than 10 pixels, the image is judged unrecognizable. After adjustment, the new Binary_ROI region is labeled new_Binary_ROI, and the coordinate of its upper-left pixel is labeled [new_ROI_X, new_ROI_Y]. If the width new_ROI_Width of new_Binary_ROI is below 180 pixels, or its height new_ROI_Height is below 45 pixels, the image is judged unrecognizable.
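The dynamic adjustment of the four edges can be illustrated with a short sketch. It assumes the ROI has already been binarized into a list of rows with 0 for black and 1 for white, and it works in coordinates local to that ROI; the function names are hypothetical. Reading "moved more than 10 pixels" as "at most 10 one-pixel moves per edge" is an interpretation, labeled as such in the code.

```python
def edge_pixels(binary, left, top, right, bottom, side):
    """Pixels lying on one edge of the current rectangle (bounds inclusive)."""
    if side == "left":
        return [binary[y][left] for y in range(top, bottom + 1)]
    if side == "right":
        return [binary[y][right] for y in range(top, bottom + 1)]
    if side == "top":
        return binary[top][left:right + 1]
    return binary[bottom][left:right + 1]

def adjust_roi(binary, max_shift=10, min_width=180, min_height=45):
    """Shrink the ROI until no edge contains a black pixel.

    Returns (left, top, right, bottom), or None when the image would be
    judged unrecognizable.
    """
    left, top = 0, 0
    right, bottom = len(binary[0]) - 1, len(binary) - 1
    for side in ("left", "bottom", "right", "top"):
        shift = 0
        while 0 in edge_pixels(binary, left, top, right, bottom, side):
            if shift == max_shift:       # assumption: at most 10 moves per edge
                return None
            if side == "left":
                left += 1
            elif side == "right":
                right -= 1
            elif side == "top":
                top += 1
            else:
                bottom -= 1
            shift += 1
            if left > right or top > bottom:
                return None
    if right - left + 1 < min_width or bottom - top + 1 < min_height:
        return None
    return left, top, right, bottom
```

The minimum width and height default to the 180 and 45 pixels given in the step and can be lowered when experimenting with small test images.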
In a specific example, after the ROI detection module has obtained the exact coordinates and size of the ROI, and before the character features of the check number in the ROI are extracted and recognized, the ROI recognizability detection module scans the ROI to detect noise that affects recognition, such as seals or overlapping handwritten characters. The seals a bank uses in check business are only pure red or pure blue, so seal detection converts pixels from the RGB color space to the HSV color space and looks for the pure red or pure blue of the seal there. In the layout of the check face, the name of the paying bank lies below the check number; a printed or handwritten paying bank name is sometimes offset so that it covers the check number. This can be detected by vertical projection of the binary image new_Binary_ROI produced by the ROI detection module. The specific implementation flow may be as follows:
1. Seal detection: first determine the extent of the ROI on the image, then convert the RGB value of each pixel in this region to an HSV (Hue, Saturation, Value) value. It may be agreed that the hue range of pure red is [0, 0.1] and [0.9, 1]; the hue range of pure blue is [0.55, 0.65]; the saturation range of a seal pixel is [0.3, 1]; and the value (brightness) range of a seal pixel is [0.6, 1]. A pixel in the ROI whose HSV value falls within the above ranges is considered a seal pixel and is recorded. If, after the whole ROI has been scanned, the number of recorded seal pixels exceeds 25, the number is considered to be covered by a seal and the image is judged unrecognizable;
2. Covering character detection: take the binary image new_Binary_ROI generated by the previous module and compute its vertical projection with the vertical projection function ROI_Projection(x) given below. Fig. 3 shows an example vertical projection of the check number region. As shown in Fig. 3, x is the abscissa in pixels, and the function value is the cumulative sum of the pixel values in each column of new_Binary_ROI, where the pixel value Pixel_value of a black pixel is 0 and that of a white pixel is 1. The formula is as follows:
$$\mathrm{ROI\_Projection}(x) = \sum_{i=1}^{\mathrm{new\_ROI\_Height}} \mathrm{Pixel\_value}(i)$$
From the accumulated result it is evident that the projection over the blank columns between characters is constant and equal to new_ROI_Height × 1. The width of a run of consecutive columns whose function value is new_ROI_Height × 1 is the width of an inter-character gap, and the width of the run between two such gaps, where the function value is below new_ROI_Height × 1, is the width of a character. The vertical projection function thus yields the widths of 10 gaps (including the leading and trailing ones) and of 8 characters; a normal check number has 8 digits. At a resolution of 200 dpi, the character width is set to the range [8, 20] (in pixels), the gap width to the range [4, 12], and the character height to the range [25, 32]. If the numbers of characters and gaps are not 8 and 10, or the width of any character or gap lies outside these ranges, the number is considered to be covered by other characters and the image is judged unrecognizable.
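A sketch of the projection-based check above, under stated assumptions: the binary image is a list of rows with 0 for black and 1 for white, the function names are hypothetical, and the expected character and gap counts are passed in by the caller, since the text does not spell out how the 10 gaps are counted. The character height check, which needs the horizontal projection, is not covered here.

```python
def vertical_projection(binary):
    """Column sums of a binary image given as rows of 0 (black) / 1 (white)."""
    return [sum(column) for column in zip(*binary)]

def split_runs(projection, height):
    """Group columns into ('gap' | 'char', start, width) runs, left to right.

    A column is a gap when its projection equals the full height, that is,
    when every pixel in it is white.
    """
    runs = []
    for x, value in enumerate(projection):
        kind = "gap" if value == height else "char"
        if runs and runs[-1][0] == kind:
            runs[-1][2] += 1
        else:
            runs.append([kind, x, 1])
    return [tuple(run) for run in runs]

def layout_ok(runs, n_chars, n_gaps, char_width=(8, 20), gap_width=(4, 12)):
    """True when run counts and widths match the expected layout."""
    chars = [w for kind, _, w in runs if kind == "char"]
    gaps = [w for kind, _, w in runs if kind == "gap"]
    if len(chars) != n_chars or len(gaps) != n_gaps:
        return False
    return (all(char_width[0] <= w <= char_width[1] for w in chars)
            and all(gap_width[0] <= w <= gap_width[1] for w in gaps))
```

The width ranges default to the 200 dpi values quoted in the text; a character run that is too wide usually means two glyphs, or a glyph and an overlapping stroke, have merged.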
In a specific example, the character matching degree detection module checks the reliability of the recognition result after the character matching recognition module has produced it. From the result of the vertical projection function ROI_Projection(x) on the new_Binary_ROI region obtained in the character region detection module, the exact position of each digit character in new_Binary_ROI can be obtained. An independent copy is made of each single digit region and compared with the character templates 0 to 9 to measure its matching degree; this quantified matching degree gives a quantitative check of the reliability of the recognition result. The specific implementation flow may be as follows:
1. Locating single characters: using the projection function ROI_Projection(x) from the previous module, scan the Binary_ROI region from left to right. For the N-th run of consecutive columns whose function value is below new_ROI_Height × 1, record its starting coordinate in the array Number_Star[N] and its width in Number_Width[N]. Number_Star and Number_Width are arrays of 8 elements, corresponding to the 8 digits of the check number. Likewise, from the horizontal projection function ROI_horizon(y) of the previous module, obtain the coordinates of the character top Number_Top and bottom Number_Down and the character height Number_Height. With these coordinates, for the N-th character of the number (N ∈ [1, 8], since the check number has only 8 digits), copy from Binary_ROI the region whose upper-left corner is [Number_Star[N], Number_Top] and whose size is Number_Width[N] × Number_Height; this gives a copy of the binary image of the N-th character, labeled Num_Binary[N];
2. Generating digit templates: select a number of clear check images that are not covered by seals or the like and process them with the modules and steps above to obtain copies of the binary images of the characters segmented from their check numbers. Select clear copies of the binary images of the 10 characters 0 to 9 as templates, labeled Num_Template[M], where M is an integer from 0 to 9. The M-th template must be the binary image of digit M, and each digit template must be 25 × 53 in size. If the height or width of a template falls short of this, white rows or columns are added directly at the upper edge and left edge of the character until the size is reached. Once selected, the templates can be reused without being regenerated, so this step only needs to be carried out once; if the check font changes, the step can be repeated to generate new templates.
3. Single-character matching degree detection: the previous two steps have produced the binary images of the digit characters. For the N-th digit of the check number, where N is an integer from 1 to 8, match Num_Binary[N] against each template Num_Template[M] in turn. Fig. 4 is a schematic diagram of matching a character against a template. As shown in Fig. 4, first put each pixel of Num_Binary[N] in one-to-one correspondence with the pixels of the region of template Num_Template[0] that spans width 1 to Number_Width[N] and height 1 to Number_Height, a region of the same shape and size as Num_Binary[N]. Over all corresponding pixel pairs, count the pairs that are both black or both white and normalize this count by dividing it by the number of pixels in Num_Binary[N]; the normalized result is defined as the matching degree. Then shift the region in Num_Template[0] one pixel to the right, that is, to width 2 to Number_Width[N]+1 and height 1 to Number_Height, and compute the matching degree against this region of the same shape and size as Num_Binary[N]. Continue shifting until Number_Width[N]+H (H being the horizontal shift in pixels) equals the template width of 25; then move the region to width 1 to Number_Width[N] and height 2 to Number_Height+1 and compute the matching degree again, and so on until Num_Binary[N] has been compared with every region of the same shape and size in the template. The highest matching degree found is labeled Match[0]. In the same way, match Num_Binary[N] against every other template Num_Template[M] to obtain the corresponding matching degree Match[M]. If Match[I] (I denoting digit I) is the largest, Num_Binary[N] is recognized as digit I; the other digits of the number are obtained likewise. The largest matching degree for Num_Binary[N] is labeled Max_Match[N]. If I is not the recognition result output by the character matching recognition module, the result is considered incorrect and the image is judged unrecognizable; if the matching degree is below 0.8, the result is likewise considered incorrect and the image is judged unrecognizable.
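The sliding comparison in step 3 can be sketched as below. The function names are hypothetical, images are lists of rows with 0 for black and 1 for white, and offsets are 0-based rather than the 1-based coordinates used in the text. The sketch simply tries every offset at which the character fits inside the template and keeps the best score.

```python
def match_degree(char, template, dx, dy):
    """Share of pixels equal between char and the template window at (dx, dy)."""
    height, width = len(char), len(char[0])
    same = sum(
        1
        for y in range(height)
        for x in range(width)
        if char[y][x] == template[y + dy][x + dx]
    )
    return same / (height * width)

def best_match(char, template):
    """Highest matching degree over all windows of the template."""
    height, width = len(char), len(char[0])
    t_height, t_width = len(template), len(template[0])
    if height > t_height or width > t_width:
        raise ValueError("character image must fit inside the template")
    return max(
        match_degree(char, template, dx, dy)
        for dy in range(t_height - height + 1)
        for dx in range(t_width - width + 1)
    )

def verify_digit(char, templates, ocr_digit, min_degree=0.8):
    """True when the best-matching template agrees with the OCR result
    and its matching degree reaches the threshold (0.8 in this step)."""
    scores = {digit: best_match(char, t) for digit, t in templates.items()}
    best_digit = max(scores, key=scores.get)
    return best_digit == ocr_digit and scores[best_digit] >= min_degree
```

Here `templates` is assumed to be a dictionary mapping each digit to its template image; counting equal pixels is the same as counting positions where the XOR of the two binary values is 0.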
In a specific example, after the character matching degree detection module has confirmed the recognition result, the opening feature detection module still needs to verify its correctness. For example, detection can be based on the opening structure of the regions of each character's binary image: if the detected opening structure agrees with the opening features of the recognized digit, the recognition is considered correct. Different digits have different opening features. The openings of the four regions of a character are detected and encoded; if the opening features of the binary image do not match those of the digit given by the recognition result, the image is judged unrecognizable. The specific flow may be as follows:
1. Opening structure detection: Fig. 5 is an example of opening structure detection. As shown in Fig. 5, the character binary image is divided into four regions: upper left, lower left, lower right and upper right. The upper-left region is verified first. Starting from the center point P_Top of the upper half of the character binary image, scan horizontally toward the left edge; the scan line is recorded as L1, and the scan stops at the first black pixel, which is recorded as P1. If the left edge is reached without meeting a black pixel, the upper-left region is directly considered open and the next region is verified. If P1 is found, scan horizontally to the right from the midpoint of the left edge of the character binary image; this scan line is recorded as L2 and stops at the first black pixel, recorded as P2. If the abscissa of P2 is smaller than that of P1, restart the horizontal scan one pixel above the midpoint of the left edge (that is, at the pixel with the smaller ordinate), and repeat until the first P2 found has a larger abscissa than P1. If the ordinate of the scan starting point reaches 0 and no such P2 has been found, the region is considered not open. Once P2 is found, scan vertically from P1 on L1 toward L2 until L2 is met; this scan line is recorded as L3. If L3 contains a black pixel, take the pixel one to the right of P1 on L1 as the new starting point of L3 and scan vertically downward again. If no L3 consisting entirely of white pixels has been found by the time the starting point reaches P_Top, the region is considered not open; otherwise it is considered open. The lower-left, lower-right and upper-right regions are judged in the same way;
2. Opening encoding: the opening detection results are encoded from top to bottom and from left to right; an open region is marked 1 and a closed one 0, so the four detection results form a four-digit code. The opening codes of digits 0, 3, 5, 6, 8 and 9 are as follows:
0:0000 3:1100 5:0101
6:0001 8:0000 9:0101
If a character has been recognized as one of the digits above but its opening detection result does not match the corresponding opening code, the image is judged unrecognizable;
3. Width detection: for a character recognized as 1, if its width exceeds 12 pixels, the image is judged unrecognizable. For a character recognized as 4, if its width exceeds 16 pixels, the image is judged unrecognizable. For a character recognized as 2, every row of its binary image is scanned: if, in the 3 rows of pixels at the bottom edge, the black pixels in each row make up less than 8/10, or if, in the three rows of pixels in the middle of the character binary image, the black pixels in each row make up more than 1/3, the image is judged unrecognizable. For a character recognized as 7, if, in the three rows of pixels at the upper edge, the black pixels in each row make up less than 8/10, or if, in each row of the lower half of the character binary image, the black pixels make up more than 1/3, the image is judged unrecognizable.
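The width and row-density rules for the digits 1, 2, 4 and 7 can be sketched as follows. The function names are hypothetical and the binary image is a list of rows with 0 for black and 1 for white. Two readings are assumptions: the sketch rejects the character as soon as any single row violates a rule (the text could also be read as requiring every row to violate it), and "the three rows in the middle" is taken to be the rows around the vertical midpoint.

```python
def black_ratio(row):
    """Fraction of black (0) pixels in one row."""
    return row.count(0) / len(row)

def shape_check(digit, binary):
    """True when the character image is plausible for the recognized digit."""
    height, width = len(binary), len(binary[0])
    if digit == 1:
        return width <= 12
    if digit == 4:
        return width <= 16
    if digit == 2:
        # solid base stroke at the bottom, thin stroke through the middle
        solid = binary[-3:]
        sparse = binary[height // 2 - 1: height // 2 + 2]
    elif digit == 7:
        # solid bar at the top, thin stroke through the lower half
        solid = binary[:3]
        sparse = binary[height // 2:]
    else:
        return True  # other digits are covered by the opening codes
    if any(black_ratio(row) < 0.8 for row in solid):
        return False
    return not any(black_ratio(row) > 1 / 3 for row in sparse)
```

The thresholds are the ones given in this step (12 and 16 pixels, 8/10 and 1/3).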
Based on the same inventive concept, an embodiment of the present invention also provides a digital image processing method applied to bill image character recognition, as described in the following embodiments. Because the principle by which the method solves the problem is similar to that of the digital image processing system applied to bill image character recognition, the implementation of the method can refer to the implementation of that system, and repeated parts are not described again.
The digital image processing method applied to bill image character recognition in the embodiment of the present invention may comprise:
Detecting bill image parameters;
Detecting the degree of tilt of the bill face in the bill image;
Locating the character region;
Detecting character parameters and seal pixels in the character region;
Performing matching degree detection between the characters in the character region and templates;
Detecting the opening features of the characters in the character region.
In a specific implementation, detecting bill image parameters may comprise:
Detecting whether the bill image meets the resolution, image size and image format requirements and whether it contains the complete bill; when the requirements are not met, the bill is routed to landing processing (that is, handed over for manual handling).
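As an illustration, the parameter check could look like the sketch below. The function name and return convention are hypothetical. The numeric requirements (200 dpi, a size of 1500 ± 100 by 600 ± 50 pixels, a pure black background) are the example values given in step 2 of the method flow in this document; the image format and completeness checks are not covered, since the text does not specify them.

```python
def check_image_parameters(width, height, dpi, background_rgb):
    """Return the list of failed checks; an empty list means the image passes."""
    failures = []
    if dpi != 200:
        failures.append("resolution")
    if abs(width - 1500) > 100 or abs(height - 600) > 50:
        failures.append("size")
    if tuple(background_rgb) != (0, 0, 0):
        failures.append("background")
    return failures
```

A non-empty result would mean regenerating the image or routing the bill to landing processing.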
In a specific implementation, detecting the degree of tilt of the bill face in the bill image may comprise:
Obtaining the tilt angle of the bill face by scanning the bill face edge in the bill image; when the tilt angle does not exceed a threshold, performing tilt correction and checking the correction result; when the tilt angle exceeds the threshold, or tilt still remains after correction, routing the bill to landing processing.
In a specific implementation, performing tilt correction may comprise:
Recording, by transverse scanning, the coordinates of the point set in the middle of the upper edge of the check; fitting a straight line to the recorded coordinates; and then rotating the bill image with bilinear interpolation according to the tilt angle of the fitted upper-edge line.
In a specific implementation, locating the character region may comprise:
Obtaining the position coordinates and size of the check number region in the bill image according to the layout of the bill face in the bill image and the offset of the bill face, and judging whether the check number region can be segmented; if it cannot, the bill is routed to landing processing.
In a specific implementation, locating the character region may comprise:
Measuring the horizontal and vertical extent of the black background in the upper right corner of the image to locate the exact coordinates of the upper right corner of the bill face; delimiting the position and size of the character region relative to the bill face; and determining the exact size and position coordinates of the character region by dynamic adjustment.
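Once the background extents are known, the preliminary placement of the character region is simple arithmetic. The sketch below uses the empirical values quoted earlier in the description (300 pixels from the right edge of the bill face, 50 pixels from its upper edge, a region of 200 × 53 pixels); the function name and the bounds check are illustrative additions.

```python
def initial_roi(image_width, image_height, width_background, height_background,
                offset_right=300, offset_top=50, roi_width=200, roi_height=53):
    """Return (left, top, width, height) of the preliminary character region.

    width_background:  width of the black background right of the bill face
    height_background: height of the black background above the bill face
    """
    left = image_width - width_background - offset_right
    top = height_background + offset_top
    if left < 0 or top + roi_height > image_height:
        raise ValueError("ROI falls outside the image")
    return left, top, roi_width, roi_height
```

With a 1500 × 600 image, a 20 pixel background on the right and a 30 pixel background on top, the preliminary region starts at (1180, 80).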
In a specific implementation, detecting character parameters and seal pixels in the character region may comprise:
Locating and segmenting each single digit character in the character region, and detecting whether the number, width and height of the digits and of the gaps between digits meet the character parameter requirements; scanning the character region in the HSV color space to detect seal pixels in the character region; if the character parameter requirements are not met, or the number of seal pixels exceeds a threshold, the bill is routed to landing processing.
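The seal scan in HSV space can be sketched with Python's standard `colorsys` module, which returns hue, saturation and value in the range 0 to 1, the same scale as the ranges quoted in the description. The function names are hypothetical. Two points are assumptions: the lower red hue band is read as [0, 0.1], and a pixel counts as a seal pixel only when hue, saturation and value are all inside their ranges.

```python
import colorsys

RED_HUE_LOW_MAX = 0.1   # assumption: the lower red band is read as [0, 0.1]

def is_seal_pixel(r, g, b):
    """True when an 8-bit RGB pixel falls in the red or blue seal ranges."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    red = h <= RED_HUE_LOW_MAX or h >= 0.9
    blue = 0.55 <= h <= 0.65
    return (red or blue) and 0.3 <= s <= 1.0 and 0.6 <= v <= 1.0

def covered_by_seal(pixels, max_seal_pixels=25):
    """True when more than max_seal_pixels seal pixels are found in the region."""
    count = sum(1 for r, g, b in pixels if is_seal_pixel(r, g, b))
    return count > max_seal_pixels
```

Note that black print has a value near 0 and white paper a saturation near 0, so neither is counted, which is why the saturation and value ranges matter as much as the hue.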
In a specific implementation, performing matching degree detection between the characters in the character region and templates may comprise:
Performing matching degree detection between each single digit character located and segmented in the character region and the templates, and comparing the result with the bill image character recognition result.
In a specific implementation, detecting the opening features of the characters in the character region may comprise:
Detecting the opening features of each single digit character located and segmented in the character region; when the opening features, the matching degree and the recognition result are all consistent, character recognition is determined to have succeeded; otherwise the bill is routed to landing processing.
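The consistency check between opening features and the recognized digit can be sketched as a table lookup. The code table is copied from the opening codes listed in the description, with the regions ordered upper left, lower left, lower right, upper right; the function names are hypothetical, and digits without a listed code (1, 2, 4, 7) are treated as passing this particular check.

```python
# Opening codes as listed in the description (1 = open, 0 = closed).
OPENING_CODES = {0: "0000", 3: "1100", 5: "0101", 6: "0001", 8: "0000", 9: "0101"}

def encode_openings(upper_left, lower_left, lower_right, upper_right):
    """Turn four open/closed flags into a four-digit code string."""
    flags = (upper_left, lower_left, lower_right, upper_right)
    return "".join("1" if flag else "0" for flag in flags)

def opening_consistent(digit, openings):
    """True when the detected openings match the code listed for the digit."""
    expected = OPENING_CODES.get(digit)
    return expected is None or encode_openings(*openings) == expected

def recognition_succeeds(ocr_digit, matched_digit, openings):
    """Recognition result, template match and opening features must all agree."""
    return ocr_digit == matched_digit and opening_consistent(ocr_digit, openings)
```

Because 0 and 8 share the code 0000 (as do 5 and 9 with 0101), the opening check alone cannot separate them; it only acts as an additional veto on top of template matching.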
As described above, the core idea of the embodiment of the present invention is to add, on the basis of existing optical character recognition, an independent detection step before or after each step of the existing recognition flow. These additional detection steps are dedicated to detecting noise that would prevent the next recognition step from recognizing correctly; if such noise is found, the image is judged unrecognizable and routed to landing processing, thereby preventing recognition errors.
Fig. 6 is an example flow diagram of the digital image processing method applied to bill image character recognition in the embodiment of the present invention. Fig. 6 shows the image parameter detection, image tilt detection, character region detection, ROI recognizability detection, character matching degree detection and opening feature detection steps, and how they combine with the existing recognition flow. Each step in Fig. 6 has a corresponding module in Fig. 2, and each of the above detection steps can be implemented by a corresponding independent module in the new system. The output of each step of the recognition flow is subject to a dedicated detection, and the result of each detection in turn serves as the condition that controls whether the next step of the recognition flow may be carried out. The specific steps of the whole method may, for example, comprise:
Step 1: generate the bill image. This step produces the digital image copy of the bill to be processed, so an optical imaging device such as a flatbed scanner is needed to obtain the digital image of the bill. The bill itself must be inspected visually by a person: the bill face must be clear and free of manual alteration, and the generated image must also be clear and legible. After the image is generated, step 2 is carried out.
Step 2: detect bill image parameters. This step takes the bill image generated in step 1 as input and checks its specific parameters, including: whether the resolution of the image is 200 dpi; whether the size of the image is (1500 ± 100) × (600 ± 50); and whether the image border background outside the bill face is pure black, that is, has the RGB value (0, 0, 0). If the generated bill image does not meet the requirements, it is judged unrecognizable, and the image is regenerated or routed to landing processing; if it passes the detection, step 3 is carried out.
Step 3: image preprocessing. This step takes the bill image generated in step 1 as input and applies color, illumination and tilt correction to it, in order to eliminate the effects on the image of factors such as different imaging devices, imaging environments and the uncertainty of manual operation. After preprocessing, step 4 is carried out.
Step 4: detect image tilt. This step takes the preprocessed image output by step 3 as input and checks the degree of tilt of the bill face in the image (this detection step is functionally independent of, and redundant with, step 3, which yields higher reliability). It measures the angle of intersection (the angle smaller than 90 degrees) between the bill face edge line and the image edge line. If the angle exceeds ±15 degrees, the bill image is judged unrecognizable; if it is within ±15 degrees, the image is corrected by rotation with bilinear interpolation and the tilt angle is checked again after correction. If a tilt of more than ±1 degree still remains, the image is judged unrecognizable and routed to landing processing; if it passes the detection, step 5 is carried out.
Step 5: extract character features. This step takes the preprocessed image output by step 3 as input and extracts the quantified features of the characters to be recognized. It first locates the position of the character region relative to the bill image, then locates in turn the position of each character relative to the character region, and extracts the quantified features of each character. After extraction, step 6 is carried out.
Step 6: detect the character region. This step takes the tilt-checked and corrected image output by step 4 as input and checks whether the character region to be recognized meets the conditions for further recognition. It first detects the position of the upper right corner of the bill face in the bill image, that is, the distance from the upper edge of the bill face to the upper edge of the image and from the right edge of the bill face to the right edge of the image, and, relying on the fact that the bill number region to be recognized (the character region) has a relatively fixed position on the bill face, gives the initial position of the character region in the image. It then applies the maximum between-class variance method (Otsu) to obtain the binary image of the character region and checks whether black pixels are present on the edges of the generated binary image. For example, if the upper edge of the binary image of the character region contains black pixels, the entire row of pixels at the upper edge is removed from the character region, so that the upper edge of the binary image moves down by one pixel; this is repeated until the new upper edge contains no black pixels. Similarly, if the left edge of the binary image of the character region contains black pixels, the left edge is moved one pixel to the right, and this is repeated until the new left edge contains no black pixels. The lower and right edges are adjusted in a like manner. As the edges are adjusted step by step, the character region shrinks, and the exact position and size of the character region are finally obtained. If the dynamic adjustment of any edge of the character region exceeds 10 pixels, the image is judged unrecognizable and routed to landing processing; otherwise step 7 is carried out.
Step 7: detect whether the characters in the character region can be recognized. This step takes the binary image of the character region output by step 6 as input and checks whether the character region can be segmented into binary images of recognizable single characters. First, the part of the bill image corresponding to the character region is copied out, the color space of this copy is converted from RGB to HSV, and the copy is scanned for pixels whose hue lies in [0, 0.1] or [0.9, 1] (pure red) or in [0.55, 0.65] (pure blue), whose saturation lies in [0.3, 1] and whose brightness lies in [0.6, 1], to determine whether the region is covered by a seal. In addition, the binary image of the character region is scanned horizontally and vertically to detect the size and number of the characters and the number and width of the character gaps. In the vertical scan, with the abscissa fixed, the gray values of a column of pixels of the binary image are scanned from the upper edge down to the lower edge; consecutive columns containing no black pixel are taken to be inter-character gaps, and consecutive columns containing black pixels are taken to be characters. In the detection result, the number of gaps must be 10 and the number of characters 8; the character width (in pixels) must lie in [8, 20], the gap width in [4, 12], and the character height in [25, 32]. If these standards are not met, the image is judged unrecognizable and routed to landing processing; otherwise step 8 is carried out.
Step 8: match character features. This step takes the character features extracted in step 5 as input and looks up and identifies the digit corresponding to each character. After it finishes, step 9 is carried out.
Step 9: detect the matching degree and opening features. This step takes the binary images of the characters segmented in step 7 and the recognition result of step 8 as input and checks the correctness of the recognition result. The character to be recognized is first matched against the character templates 0 to 9: a binary XOR is applied between the gray values of all pixels of the character binary image and the corresponding pixels of each character template, and the number of XOR results equal to 0 is normalized by dividing it by the total number of pixels of the character binary image. This value is defined as the matching degree between the character and the digit of that template. The digit with the highest matching degree is taken as the actual value of the character. If the highest matching degree is less than 0.95, or the matching result is inconsistent with the recognition result of step 8, the image is judged unrecognizable and routed to landing processing; otherwise step 10 is carried out for further detection.
Step 10: detect opening features. Like step 9, this step takes the binary images of the characters segmented in step 7 and the recognition result of step 8 as input and verifies the correctness of the recognition result. First the recognition result is examined, and then the opening features of the binary image of the character are checked against it. Starting from the upper center point of the character binary image (the center point of the upper half of the binary image), draw a horizontal line L1 toward the left edge of the binary image until a black pixel (denoted p1) or the left edge is met. If the line segment drawn contains no black pixel, the character is considered open in its upper-left part (the left half of the upper half of the character binary image). Otherwise, starting from the midpoint of the left edge of the character binary image, draw a horizontal line L2 toward the right edge until a black pixel (denoted p2) is met; if p2 lies to the left of p1, take the pixel above the starting point of L2 as the new starting point and draw L2 again. Once L1 and L2 are obtained, draw a vertical line from p1 toward L2, then take the pixel to the right of p1 on L1 as the new starting point and repeat drawing the vertical line L3 toward L2, until the intersection of L3 and L2 is p2. If a vertical line L3 containing no black pixel can be found, the upper-left part of the character is considered open; otherwise it is considered closed. The openness of the lower-left, lower-right and upper-right parts of the character binary image is detected in the same way.
Opening is defined as 1 and closure as 0. According to the opening features of the upper-left, lower-left, lower-right and upper-right parts of a character, the opening features of 0, 3, 5, 6, 8 and 9 are converted into the following 4-digit codes:
0:0000 3:1100 5:0101
6:0001 8:0000 9:0101
If the recognition result of step 8 is one of the characters above but the opening detection result does not match the corresponding code, the character is judged unrecognizable and the bill is referred for manual processing.
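As an illustration only, the comparison between the recognition result and the opening code could be sketched in Python as follows. The function name `opening_check` and the use of strings for the codes are assumptions made for this sketch and are not part of the embodiment; the code table is copied from the text exactly as it appears, in the bit order upper left, lower left, lower right, upper right.

```python
# Opening codes copied from the text as given (1 = open, 0 = closed).
# Bit order: upper left, lower left, lower right, upper right.
OPENING_CODES = {
    "0": "0000", "3": "1100", "5": "0101",
    "6": "0001", "8": "0000", "9": "0101",
}


def opening_check(recognized: str, detected_code: str) -> bool:
    """Return True if the detected opening code agrees with the recognized digit.

    `opening_check` is a hypothetical helper name. Digits that are not in
    the code table (1, 2, 4, 7) are not judged here and pass through.
    """
    expected = OPENING_CODES.get(recognized)
    if expected is None:
        return True  # not covered by the opening-code table
    return detected_code == expected


if __name__ == "__main__":
    print(opening_check("3", "1100"))  # codes agree
    print(opening_check("3", "0000"))  # codes disagree: refer for manual processing
```

A bill would be referred for manual processing as soon as this check returns False for any of its digits.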
When the recognition result is 1, 2, 4, or 7, the character shape must also be checked. When the recognition result is 1, the character is judged unrecognizable if its width exceeds 14 pixels. When the recognition result is 2, the character is judged unrecognizable if, in the bottom three pixel rows of the character's binary image, the proportion of black pixels in each row is below 80%, or if, in the middle three pixel rows of the binary image, the proportion of black pixels in each row is above 30%. When the recognition result is 4, the character is judged unrecognizable if its width exceeds 16 pixels. When the recognition result is 7, the character is judged unrecognizable if, in the top three pixel rows, the proportion of black pixels in each row is below 80%, or if, in each pixel row of the lower half of the binary image, the proportion of black pixels is above 30%.
After this step is finished, the whole recognition flow is complete and recognition of the next bill image begins.
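The width and row-proportion rules for 1, 2, 4, and 7 could be sketched as below. This is a minimal sketch under stated assumptions, not the embodiment's code: the names `black_ratio` and `passes_shape_check` are hypothetical, the binary image is assumed to be a list of rows with 1 for black and 0 for white, the "middle three rows" are assumed to be centered on row height // 2, and "in each row" is read literally as a condition that must hold for all of the rows concerned (a stricter reading would replace `all` with `any`).

```python
from typing import List, Sequence

Bitmap = List[List[int]]  # assumed convention: 1 = black pixel, 0 = white pixel


def black_ratio(row: Sequence[int]) -> float:
    """Fraction of black pixels in one pixel row."""
    return sum(row) / len(row)


def passes_shape_check(digit: str, img: Bitmap) -> bool:
    """Return False when the rule for this digit says 'unrecognizable'."""
    height, width = len(img), len(img[0])
    if digit == "1":
        return width <= 14          # wider than 14 pixels: unrecognizable
    if digit == "4":
        return width <= 16          # wider than 16 pixels: unrecognizable
    if digit == "2":
        bottom = img[-3:]
        mid = height // 2
        middle = img[mid - 1:mid + 2]   # assumed meaning of "middle three rows"
        if all(black_ratio(r) < 0.80 for r in bottom):
            return False
        if all(black_ratio(r) > 0.30 for r in middle):
            return False
        return True
    if digit == "7":
        top = img[:3]
        lower_half = img[height // 2:]
        if all(black_ratio(r) < 0.80 for r in top):
            return False
        if all(black_ratio(r) > 0.30 for r in lower_half):
            return False
        return True
    return True  # other digits are not covered by these rules
```

The thresholds (14 and 16 pixels, 80%, 30%) are the ones stated in the text; they presumably depend on the 200 dpi check images used in the embodiment.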
Compared with the prior art, the digital image processing system and method applied to bill image character recognition in the embodiment of the present invention improve mainly on the following two points:
1. Different strategy: most existing recognition techniques follow the strategy "eliminate as much of the noise that affects recognition as possible in order to guarantee correct recognition". However, such noise comes in many varieties; for reasons of efficiency, cost, and feasibility it is difficult to detect every kind, and some noise that can be detected is still difficult to remove completely. The embodiment of the present invention adopts a different strategy: "detect as many of the factors that affect recognition as possible and filter out images that are hard to recognize". It does not attempt to eliminate noise that is difficult to handle; instead it tries to detect the presence of such noise and excludes images containing it from the recognition flow. This naturally avoids the recognition errors that such noise would cause. When the number of noisy images is small (for example, in intra-city bill exchange the images filtered out by the embodiment do not exceed 30% of the total), recognition accuracy can be improved markedly while still ensuring that most images are recognized. (Note: recognition accuracy is defined here as the ratio of the number of images whose recognition result is correct to the number of images that the recognition program judges to be successfully recognized; the recognizable rate is defined as the ratio of the number of images that the recognition program judges to be successfully recognized to the number of all images to be recognized.)
2. Designed specifically for checks: for reasons of cost and generality, the prior art is not developed for particular recognition scenarios. The embodiment of the present invention is designed specifically for recognizing check numbers. It analyzes several aspects, such as the structure of the bill face, the way the check is used and circulated, and the way the image is generated, considers quantitatively the noise that may affect recognition, and provides methods for detecting it. In other words, the embodiment of the present invention gives a parameterized model of the check number. The model describes quantitatively, through a series of parameters, features such as differences in how the check image is scanned and generated, the structure of the bill face, seal coverage, character coverage, and the check number font. Each corresponding step of the recognition flow checks these parameters, and if a detection result does not reach the given index the image is judged unrecognizable. Therefore, when an image passes through the whole recognition flow and a recognition result is obtained, its accuracy is significantly higher than that of existing systems.
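The two rates defined in the note to point 1 can be written out as a short Python sketch. The function names and the counts in the usage lines are hypothetical and chosen only to illustrate the definitions; they are not experimental data from the embodiment.

```python
def recognizable_rate(accepted: int, total: int) -> float:
    """Images the recognizer judged successfully recognized / all images to be recognized."""
    return accepted / total


def recognition_accuracy(correct: int, accepted: int) -> float:
    """Images with a correct result / images the recognizer judged successfully recognized."""
    return correct / accepted


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    total, accepted, correct = 1000, 720, 720
    print(f"recognizable rate: {recognizable_rate(accepted, total):.2%}")
    print(f"recognition accuracy: {recognition_accuracy(correct, accepted):.2%}")
```

Because the denominator of the accuracy is the number of accepted images rather than the number of all images, filtering out hard images lowers the recognizable rate but can raise the accuracy, which is the trade-off the strategy in point 1 relies on.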
An example of a validation result is given below. The table below shows the result of one validation experiment based on the embodiment of the present invention. The experimental data are the intra-city exchange bill images of a certain bank branch over 30 days between October and November 2009, nearly 60,000 bill images in total, averaging 2,000 bill images per day, with an image resolution of 200 dpi. The data used to verify the correctness of the recognition results are the manual entry records of local bill processing at that branch's accounting business processing center. The whole recognition program is written in C, the development platform is VC6.0 + OpenCV, and the test data are stored in an Oracle 10g database.
In the table below, the column for numbers detected as different counts the cases in which the check number in the database field differs from the recognition result produced by the recognition program. Manual checking showed that the result produced by the recognition program was correct; the difference arose because the bill image name did not match the name field of the corresponding database record, that is, it was a data entry error. The statistics show that the average recognizable rate exceeds 70%, reaching about 72%, and that the recognition accuracy is 100%.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk memory, CD-ROM, optical memory, etc.) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The specific embodiments described above further explain the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the foregoing are only specific embodiments of the present invention and are not intended to limit its scope of protection; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (18)

1. A digital image processing system applied to bill image character recognition, characterized by comprising:
An image parameter detection module, configured to detect bill image parameters;
An image inclination detection module, configured to detect the degree of inclination of the bill face in the bill image;
A character region detection module, configured to locate the character region;
An identifiable character region detection module, configured to detect character parameters and seal pixels in the character region;
A character matching degree detection module, configured to detect the degree of matching between the characters in the character region and a template;
An opening feature detection module, configured to detect the opening features of the characters in the character region.
2. The system as claimed in claim 1, characterized in that the image parameter detection module is specifically configured to:
Detect whether the bill image meets the resolution, image size, and image format requirements and whether it contains a complete picture of the bill, and, when the requirements are not met, determine that the bill is to be referred for manual processing.
3. The system as claimed in claim 1, characterized in that the image inclination detection module is specifically configured to:
Obtain the inclination angle of the bill face by scanning the edge of the bill face in the bill image; when the inclination angle does not exceed a threshold, perform inclination correction and check the correction result; when the inclination angle exceeds the threshold, or inclination still remains after correction, determine that the bill is to be referred for manual processing.
4. The system as claimed in claim 3, characterized in that the image inclination detection module is specifically configured to:
Record, by horizontal scanning, the coordinates of a set of points on the middle section of the upper edge of the check, fit a straight line to the recorded coordinates, and then rotate the bill image using bilinear interpolation according to the inclination angle of the fitted upper-edge line.
5. The system as claimed in claim 1, characterized in that the character region detection module is specifically configured to:
Obtain, according to the structure of the bill face in the bill image and the offset of the bill face, the position coordinates and size of the check number region in the bill image, and judge whether the check number region can be segmented; if the check number region cannot be segmented, determine that the bill is to be referred for manual processing.
6. The system as claimed in claim 5, characterized in that the character region detection module is specifically configured to:
Measure the horizontal and vertical extents of the black background at the upper right corner of the image in order to locate the exact position of the upper right corner of the bill face; delimit the position and size of the character region relative to the bill face; and determine the specific size and position coordinates of the character region by dynamic adjustment.
7. The system as claimed in claim 1, characterized in that the identifiable character region detection module is specifically configured to:
Locate and segment each individual digit character in the character region, and detect whether the number of digits, the gaps between digits, and the width and height of the digits meet the character parameter requirements; scan the character region in the HSV color space to detect the seal pixels of the character region; if the character parameter requirements are not met or the number of seal pixels exceeds a threshold, determine that the bill is to be referred for manual processing.
8. The system as claimed in claim 7, characterized in that the character matching degree detection module is specifically configured to:
Perform matching degree detection between each individual digit character located and segmented in the character region and the template, and compare the result with the bill image character recognition result.
9. The system as claimed in claim 8, characterized in that the opening feature detection module is specifically configured to:
Detect the opening features of each individual digit character located and segmented in the character region; when the opening features, the matching degree, and the recognition result are all consistent, determine that character recognition has succeeded; otherwise determine that the bill is to be referred for manual processing.
10. A digital image processing method applied to bill image character recognition, characterized by comprising:
Detecting bill image parameters;
Detecting the degree of inclination of the bill face in the bill image;
Locating the character region;
Detecting character parameters and seal pixels in the character region;
Detecting the degree of matching between the characters in the character region and a template;
Detecting the opening features of the characters in the character region.
11. The method as claimed in claim 10, characterized in that detecting bill image parameters comprises:
Detecting whether the bill image meets the resolution, image size, and image format requirements and whether it contains a complete picture of the bill, and, when the requirements are not met, determining that the bill is to be referred for manual processing.
12. The method as claimed in claim 10, characterized in that detecting the degree of inclination of the bill face in the bill image comprises:
Obtaining the inclination angle of the bill face by scanning the edge of the bill face in the bill image; when the inclination angle does not exceed a threshold, performing inclination correction and checking the correction result; when the inclination angle exceeds the threshold, or inclination still remains after correction, determining that the bill is to be referred for manual processing.
13. The method as claimed in claim 12, characterized in that the inclination correction comprises:
Recording, by horizontal scanning, the coordinates of a set of points on the middle section of the upper edge of the check, fitting a straight line to the recorded coordinates, and then rotating the bill image using bilinear interpolation according to the inclination angle of the fitted upper-edge line.
14. The method as claimed in claim 10, characterized in that locating the character region comprises:
Obtaining, according to the structure of the bill face in the bill image and the offset of the bill face, the position coordinates and size of the check number region in the bill image, and judging whether the check number region can be segmented; if the check number region cannot be segmented, determining that the bill is to be referred for manual processing.
15. The method as claimed in claim 14, characterized in that locating the character region comprises:
Measuring the horizontal and vertical extents of the black background at the upper right corner of the image in order to locate the exact position of the upper right corner of the bill face; delimiting the position and size of the character region relative to the bill face; and determining the specific size and position coordinates of the character region by dynamic adjustment.
16. The method as claimed in claim 10, characterized in that detecting character parameters and seal pixels in the character region comprises:
Locating and segmenting each individual digit character in the character region, and detecting whether the number of digits, the gaps between digits, and the width and height of the digits meet the character parameter requirements; scanning the character region in the HSV color space to detect the seal pixels of the character region; if the character parameter requirements are not met or the number of seal pixels exceeds a threshold, determining that the bill is to be referred for manual processing.
17. The method as claimed in claim 16, characterized in that detecting the degree of matching between the characters in the character region and the template comprises:
Performing matching degree detection between each individual digit character located and segmented in the character region and the template, and comparing the result with the bill image character recognition result.
18. The method as claimed in claim 17, characterized in that detecting the opening features of the characters in the character region comprises:
Detecting the opening features of each individual digit character located and segmented in the character region; when the opening features, the matching degree, and the recognition result are all consistent, determining that character recognition has succeeded; otherwise determining that the bill is to be referred for manual processing.
CN201410276103.0A 2014-06-19 2014-06-19 Digital image processing system and method applied to bill image character recognition Active CN104112128B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410276103.0A CN104112128B (en) 2014-06-19 2014-06-19 Digital image processing system and method applied to bill image character recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410276103.0A CN104112128B (en) 2014-06-19 2014-06-19 Digital image processing system and method applied to bill image character recognition

Publications (2)

Publication Number Publication Date
CN104112128A true CN104112128A (en) 2014-10-22
CN104112128B CN104112128B (en) 2018-01-26

Family

ID=51708912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410276103.0A Active CN104112128B (en) 2014-06-19 2014-06-19 Digital image processing system and method applied to bill image character recognition

Country Status (1)

Country Link
CN (1) CN104112128B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567764A (en) * 2012-01-13 2012-07-11 中国工商银行股份有限公司 Bill certificate and system for improving electronic image recognition efficiency
US20140010434A1 (en) * 2012-07-09 2014-01-09 Seiko Epson Corporation Recording media processing device, control method of a recording media processing device, and non-transitory storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
娄元芳等: ""一种人民币纸币号码自动识别快速方法"", 《微计算机信息》 *
张小军: ""票据字符识别方法与应用的研究"", 《中国优秀硕士学位论文全文数据库信息科技辑》 *
张艳: ""基于结构特征的钢坯端面字符识别方法研究"", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Also Published As

Publication number Publication date
CN104112128B (en) 2018-01-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant