CN104112128B - Digital image processing system and method applied to bill image character recognition - Google Patents


Publication number
CN104112128B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410276103.0A
Other languages
Chinese (zh)
Other versions
CN104112128A (en)
Inventor
曾修远
苏永前
王彦红
程炜华
周程伟
赵文哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC

Landscapes

  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a digital image processing system and method applied to bill image character recognition. The system includes: an image parameter detection module for detecting bill image parameters; an image tilt detection module for detecting the degree of tilt of the bill face in the bill image; a character region detection module for locating the character region; a character region recognizability detection module for detecting character attributes and seal pixels in the character region; a character matching degree detection module for detecting the matching degree between the characters in the character region and templates; and an opening feature detection module for detecting the opening features of the characters in the character region. The invention can improve bill image character recognition accuracy when cheque images are analyzed and processed in bank bill image exchange.

Description

Digital image processing system and method applied to bill image character recognition
Technical field
The present invention relates to the fields of digital image processing and optical character recognition (OCR), and in particular to a digital image processing system and method applied to bill image character recognition.
Background art
With the development of digital image processing, pattern recognition and artificial intelligence, optical character recognition (OCR) technology has found increasing application in the financial field, especially in checking the validity of bill face elements in bill images and in bill business process reengineering (BPR). The main processing flow of current optical character recognition systems is shown in Fig. 1 and can be roughly divided into four steps: image input, image preprocessing, character feature extraction, and character matching and recognition.
The first step is image input. An optical instrument scans the object to be recognized (such as a book, document or certificate) to generate the corresponding image data. Optical instruments include scanners, fax machines, digital cameras and other cameras. Factors such as the illumination conditions during image generation and the image resolution affect the quality and precision of subsequent recognition.
The second step is image preprocessing. Based on the characteristics of the image generated in the previous stage, the image is processed to facilitate character extraction in later stages. This mainly includes color correction, tilt correction, noise filtering, and uniform conversion of the image into a black-and-white binary image or a grayscale image. The detailed preprocessing flow must be designed according to the characteristics and parameters of the image data; for example, if the images exhibit varying degrees of color cast, the preprocessing stage must include a color correction step.
The third and fourth steps are character feature extraction and matching recognition. The character region from the previous stage is segmented into individual characters, and features are then extracted from each single character in preparation for matching. There are currently two main kinds of features. One kind is statistical, for example the ratio of black to white pixel counts within a region. When a character is divided into several regions, the black/white pixel ratios of the regions together form a numerical vector in space, called the feature vector, and later recognition only needs to compare against this feature vector. The other kind is structural: for example, after the character image is thinned, the number and positions of the stroke endpoints and intersections are obtained, or topological parameters such as stroke segments are used as features, and the result is obtained by comparison against a feature library.
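The statistical feature described above can be illustrated with a minimal Python sketch. The function names, the 2 × 2 zoning and the nearest-vector comparison are assumptions made for illustration only; the sketch also uses the fraction of black pixels per zone rather than the black/white ratio, to avoid division by zero in all-black zones.

```python
def zone_feature_vector(glyph, rows=2, cols=2):
    """Split a binary glyph (1 = black ink, 0 = white) into rows x cols zones
    and return the fraction of black pixels in each zone, row by row."""
    height, width = len(glyph), len(glyph[0])
    features = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            black = sum(glyph[y][x] for y in range(y0, y1) for x in range(x0, x1))
            area = (y1 - y0) * (x1 - x0)
            features.append(black / area if area else 0.0)
    return features


def nearest_label(features, feature_db):
    """Return the label whose stored vector is closest to `features`
    (squared Euclidean distance). `feature_db` maps label -> vector."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, feature_db[label]))
    return min(feature_db, key=distance)
```

A real system would use a finer zoning and a feature library built from training samples; the comparison step stays the same.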
As described above, the mainstream framework of current character recognition methods, especially the third and fourth steps, is relatively mature and stable, and the key factor affecting recognition accuracy is the second step, image preprocessing. Differences in the characteristics of the objects to be recognized, in the scanning process, in the illumination conditions during scanning, and between individual imaging devices all introduce many different kinds of "noise" into the image (hereinafter, "noise" denotes any factor that affects recognition of the characters in an image). For reasons of efficiency, performance, cost and feasibility, a general-purpose recognition system usually cannot take every factor that may affect recognition accuracy into account at design time. In other words, the design strategy of existing general-purpose systems is to recognize the characters in as many kinds of images as possible, but the image parameter model on which they are based cannot describe every image problem, so some images enter the subsequent recognition steps without suitable preprocessing and are ultimately misrecognized. One example is cheque number recognition in bank bill image exchange. Because of uncertainty in how cheques are used, cheque numbers are often covered by seals or handwritten characters, and the generated bill images also differ in illumination and color cast. Current general-purpose recognition systems do not describe or account for these characteristics quantitatively, so their results on bills are often unsatisfactory and cannot meet the strict requirements of data entry in the financial industry, especially the entry of account information. This is in fact the key obstacle to wider adoption of optical character recognition in the financial industry.
Summary of the invention
An embodiment of the present invention provides a digital image processing system applied to bill image character recognition, for improving bill image character recognition accuracy when cheque images are analyzed and processed in bank bill image exchange. The system includes:
an image parameter detection module, for detecting bill image parameters;
an image tilt detection module, for detecting the degree of tilt of the bill face in the bill image;
a character region detection module, for locating the character region;
a character region recognizability detection module, for detecting character attributes and seal pixels in the character region;
a character matching degree detection module, for detecting the matching degree between the characters in the character region and templates;
an opening feature detection module, for detecting the opening features of the characters in the character region.
The character region detection module is specifically configured to:
measure the horizontal and vertical extent of the black background at the upper right corner of the image, so as to locate the exact coordinates of the upper right corner of the bill face; delimit the position and size of the character region relative to the bill face; and determine the specific size and position coordinates of the character region by dynamic adjustment;
obtain the position coordinates and size of the cheque number region in the bill image according to the bill face structure and the bill face offset in the bill image, and judge whether the cheque number region can be segmented; if the cheque number region cannot be segmented, determine that the bill is routed to manual (landing) processing.
In one embodiment, the image parameter detection module is specifically configured to:
detect whether the bill image meets the resolution, image size and image format requirements and whether it contains the complete bill, and determine that the bill is routed to manual (landing) processing when the requirements are not met.
In one embodiment, the image tilt detection module is specifically configured to:
obtain the tilt angle of the bill face by scanning the bill face edge in the bill image; perform tilt correction and check the correction result when the tilt angle does not exceed a threshold; and determine that the bill is routed to manual (landing) processing when the tilt angle exceeds the threshold or tilt remains after correction.
In one embodiment, the image tilt detection module is specifically configured to:
record, by horizontal scanning, the coordinates of a set of points in the middle part of the cheque's top edge; fit a straight line to the recorded coordinates; and apply a bilinear rotation (rotation with bilinear interpolation) to the bill image according to the tilt angle of the fitted top-edge line.
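As a rough illustration of the rotation step, the following Python sketch rotates a grayscale image about its center using inverse mapping and bilinear interpolation. The function name, the list-of-rows image representation and the `fill` value for pixels that map outside the source are assumptions; the patent does not specify an implementation, and the sign convention of the angle would have to be matched to the fitted edge slope.

```python
import math


def rotate_bilinear(image, angle_deg, fill=0.0):
    """Rotate a grayscale image (list of rows of numbers) about its centre.
    Each output pixel is mapped back into the source image and sampled
    with bilinear interpolation; pixels mapping outside get `fill`."""
    height, width = len(image), len(image[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Inverse mapping: where in the source does this output pixel come from?
            sx = cos_a * dx + sin_a * dy + cx
            sy = -sin_a * dx + cos_a * dy + cy
            if not (0 <= sx <= width - 1 and 0 <= sy <= height - 1):
                continue
            x0, y0 = int(sx), int(sy)
            x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
            fx, fy = sx - x0, sy - y0
            # Blend the four neighbouring source pixels.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bottom * fy
    return out
```

For a color image the same sampling would be applied to each channel separately.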
In one embodiment, the character region recognizability detection module is specifically configured to:
locate and segment each individual digit character in the character region, and detect whether the number, width and height of the digits and of the gaps between digits meet the character parameter requirements; scan the character region in HSV color space to detect seal pixels in the character region; and, if the character attribute requirements are not met or the number of seal pixels exceeds a threshold, determine that the bill is routed to manual (landing) processing.
In one embodiment, the character matching degree detection module is specifically configured to:
detect the matching degree between each located and segmented digit character in the character region and templates, and compare the result with the bill image character recognition result.
In one embodiment, the opening feature detection module is specifically configured to:
detect the opening features of each located and segmented digit character in the character region; when the opening features, the matching degree and the recognition result are consistent, determine that character recognition has succeeded; otherwise determine that the bill is routed to manual (landing) processing.
An embodiment of the present invention also provides a digital image processing method applied to bill image character recognition, for improving bill image character recognition accuracy when cheque images are analyzed and processed in bank bill image exchange. The method includes:
detecting bill image parameters;
detecting the degree of tilt of the bill face in the bill image;
locating the character region;
detecting character attributes and seal pixels in the character region;
detecting the matching degree between the characters in the character region and templates;
detecting the opening features of the characters in the character region.
Locating the character region includes:
measuring the horizontal and vertical extent of the black background at the upper right corner of the image, so as to locate the exact coordinates of the upper right corner of the bill face; delimiting the position and size of the character region relative to the bill face; and determining the specific size and position coordinates of the character region by dynamic adjustment;
obtaining the position coordinates and size of the cheque number region in the bill image according to the bill face structure and the bill face offset in the bill image, and judging whether the cheque number region can be segmented; if the cheque number region cannot be segmented, determining that the bill is routed to manual (landing) processing.
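The first two positioning steps can be sketched in Python as follows. The offsets (50 and 300 pixels) and the ROI size (200 × 53 pixels) are the empirical values given later in the detailed description, and the three-pixel run is taken from the same place; the function names, the 0-based coordinates and the treatment of channel values at or below 50 as black background are assumptions of this sketch. The dynamic adjustment step is not shown.

```python
BLACK_LIMIT = 50  # assumed: channel values at or below this count as black background


def is_background(pixel):
    """True when all three RGB channels are dark enough to be background."""
    return all(channel <= BLACK_LIMIT for channel in pixel)


def first_face_index(pixels, run=3):
    """Index of the first pixel that starts `run` consecutive non-background
    pixels, or None if no such run exists."""
    count = 0
    for i, pixel in enumerate(pixels):
        count = 0 if is_background(pixel) else count + 1
        if count == run:
            return i - run + 1
    return None


def initial_roi(image, offset_top=50, offset_right=300, roi_width=200, roi_height=53):
    """Return (left, top, width, height) of the preliminary cheque number ROI,
    or None when the bill face or the ROI cannot be placed inside the image."""
    height, width = len(image), len(image[0])
    column = [image[y][width // 2] for y in range(height)]  # scan down from top centre
    row_from_right = image[height // 2][::-1]               # scan left from right centre
    face_top = first_face_index(column)
    from_right = first_face_index(row_from_right)
    if face_top is None or from_right is None:
        return None
    face_right = width - 1 - from_right
    left, top = face_right - offset_right, face_top + offset_top
    if left < 0 or left + roi_width > width or top + roi_height > height:
        return None
    return left, top, roi_width, roi_height
```

Returning `None` corresponds to the case where the cheque number region cannot be segmented and the bill would go to manual processing.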
In one embodiment, detecting the bill image parameters includes:
detecting whether the bill image meets the resolution, image size and image format requirements and whether it contains the complete bill, and determining that the bill is routed to manual (landing) processing when the requirements are not met.
In one embodiment, detecting the degree of tilt of the bill face in the bill image includes:
obtaining the tilt angle of the bill face by scanning the bill face edge in the bill image; performing tilt correction and checking the correction result when the tilt angle does not exceed a threshold; and determining that the bill is routed to manual (landing) processing when the tilt angle exceeds the threshold or tilt remains after correction.
In one embodiment, performing the tilt correction includes:
recording, by horizontal scanning, the coordinates of a set of points in the middle part of the cheque's top edge; fitting a straight line to the recorded coordinates; and applying a bilinear rotation (rotation with bilinear interpolation) to the bill image according to the tilt angle of the fitted top-edge line.
In one embodiment, detecting the character attributes and seal pixels in the character region includes:
locating and segmenting each individual digit character in the character region, and detecting whether the number, width and height of the digits and of the gaps between digits meet the character parameter requirements; scanning the character region in HSV color space to detect seal pixels in the character region; and, if the character attribute requirements are not met or the number of seal pixels exceeds a threshold, determining that the bill is routed to manual (landing) processing.
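A minimal Python sketch of the digit segmentation and attribute check follows. It assumes that segmentation is done on the vertical projection of the binary character region (Fig. 3 shows such a projection), with black pixels stored as 0 and white as 1. The function names and the limits `expected_count`, `min_width`, `max_width` and `max_gap` are hypothetical, and the height check is omitted for brevity.

```python
def segment_digits(binary_roi):
    """Split a binary ROI (0 = black, 1 = white) into column ranges that
    contain ink, using the vertical projection. Returns (start, end) pairs
    with `end` exclusive."""
    width = len(binary_roi[0])
    projection = [sum(1 for row in binary_roi if row[x] == 0) for x in range(width)]
    segments, start = [], None
    for x, black_count in enumerate(projection):
        if black_count and start is None:
            start = x                      # a digit begins
        elif not black_count and start is not None:
            segments.append((start, x))    # a digit ends at the first empty column
            start = None
    if start is not None:
        segments.append((start, width))
    return segments


def digits_plausible(segments, expected_count, min_width, max_width, max_gap):
    """Check the digit count, digit widths and the gaps between neighbouring digits."""
    if len(segments) != expected_count:
        return False
    if any(not min_width <= end - start <= max_width for start, end in segments):
        return False
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(segments, segments[1:])]
    return all(gap <= max_gap for gap in gaps)
```

A `False` result here corresponds to the character attribute requirements not being met.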
In one embodiment, detecting the matching degree between the characters in the character region and templates includes:
detecting the matching degree between each located and segmented digit character in the character region and templates, and comparing the result with the bill image character recognition result.
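The text above does not define how the matching degree is computed. As one possible illustration only, the Python sketch below uses the fraction of pixels on which a binary character image and a same-sized binary template agree; the function names and this agreement measure are assumptions, not the measure used by the invention.

```python
def matching_degree(glyph, template):
    """Fraction of pixels on which two same-sized binary images agree."""
    if len(glyph) != len(template) or any(len(g) != len(t) for g, t in zip(glyph, template)):
        raise ValueError("glyph and template must have the same size")
    total = sum(len(row) for row in glyph)
    agree = sum(g == t for g_row, t_row in zip(glyph, template) for g, t in zip(g_row, t_row))
    return agree / total


def best_template(glyph, templates):
    """Return (label, degree) of the best-matching template.
    `templates` maps a digit label to its binary template image."""
    label = max(templates, key=lambda key: matching_degree(glyph, templates[key]))
    return label, matching_degree(glyph, templates[label])
```

The label returned by `best_template` could then be compared with the digit reported by the existing recognition system.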
In one embodiment, detecting the opening features of the characters in the character region includes:
detecting the opening features of each located and segmented digit character in the character region; when the opening features, the matching degree and the recognition result are consistent, determining that character recognition has succeeded; otherwise determining that the bill is routed to manual (landing) processing.
The digital image processing system and method applied to bill image character recognition in the embodiments of the present invention can improve bill image character recognition accuracy when cheque images are analyzed and processed in bank bill image exchange, can meet the strict requirements of data entry in the financial industry, especially the entry of account information, and help optical character recognition gain wider adoption in the financial industry.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Fig. 1 is a processing flowchart of the existing character recognition system described in the background art;
Fig. 2 is a schematic diagram of the digital image processing system applied to bill image character recognition in an embodiment of the present invention;
Fig. 3 is an example of the vertical projection result of the cheque number region in an embodiment of the present invention;
Fig. 4 is a schematic diagram of matching a character against a template in an embodiment of the present invention;
Fig. 5 is an example of opening structure detection in an embodiment of the present invention;
Fig. 6 is an example flowchart of the digital image processing method applied to bill image character recognition in an embodiment of the present invention.
Detailed description of the embodiments
To make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments are described in further detail below with reference to the drawings. The illustrative embodiments and their description are used to explain the present invention and do not limit it.
The embodiment of the present invention overcomes the shortcomings of the prior art and provides a digital image processing system applied to bill image character recognition. The system can improve bill image character recognition accuracy when cheque images are analyzed and processed in bank bill image exchange, and in particular improves the recognition accuracy of bank cheque numbers.
In implementation, the embodiment considers cheque number recognition more comprehensively. It takes into account general features of cheques, such as the bill face structure, dimensions and font structure. It also takes into account noise factors that may affect recognition correctness while the cheque is in circulation, such as the number being covered by a seal or overlapping with other characters, and factors caused by differences in equipment and environment, such as bill face tilt and non-uniform illumination. On this basis, the embodiment adopts the strategy of "detecting and filtering out, as far as possible, images containing noise that affects recognition", and adds to the existing recognition system a new system made up of several detection and correction modules. Fig. 2 is a schematic diagram of the digital image processing system applied to bill image character recognition in the embodiment. As shown in Fig. 2, the system includes an image parameter detection module, an image tilt detection module, a character region (ROI) detection module, an ROI recognizability detection module, a character matching degree detection module and an opening feature detection module. These detection modules are connected to the respective modules of the existing recognition system (which includes an image generation module, an image preprocessing module, a character feature extraction module and a character feature matching module). Each detection module takes the result and intermediate state of one module of the existing recognition system as input, and checks, filters or corrects noise that could affect the correctness of the next module's result. If the detection standard is met, the subsequent recognition or detection module is executed; conversely, if a detection module detects noise that would affect the correctness of the recognition result, recognition of that bill image is terminated and the bill goes to landing processing, which means that the bill is handled manually. After recognition finishes, the recognition result is stored in the database. The functions of the modules of the digital image processing system applied to bill image character recognition are described below:
Image parameter detection module: detects bill image parameters. For example, the module can be connected after the image generation module to detect whether the bill image it generates meets the parameter standards, including resolution, image size, image format and whether the image contains the complete bill; if the requirements are not met, the bill goes to landing processing.
Image tilt detection module: detects the degree of tilt of the bill face in the bill image. For example, the module can quantitatively detect the tilt of the bill face in the generated image by scanning the top edge of the bill face to obtain its tilt angle. If the tilt angle is small, that is, within a certain range such as not exceeding a threshold, tilt correction is performed and the correction result is checked; if the tilt angle is too large (exceeds the threshold) or tilt remains after correction, the bill goes to landing processing.
Character region (ROI) detection module: locates the character region. For example, the module is responsible for locating the ROI: according to the structure of the bill face and the offset of the bill face within the image, it obtains the exact position coordinates and size of the cheque number region in the image and judges whether the cheque number region can be segmented; if the region cannot be segmented during positioning, the bill goes to landing processing. The module can also make a preliminary check of whether other characters overlap the ROI; if the ROI cannot be segmented completely, the bill goes to landing processing.
ROI recognizability detection module: detects character attributes and seal pixels in the character region. For example, the module is responsible for locating and segmenting each individual digit character in the ROI and for detecting whether the number, width and height of the digits and of their gaps meet the requirements. In addition, it scans the ROI in HSV color space to detect whether the red or blue of a seal is present. If the character attributes are out of tolerance, or too many red or blue seal pixels are present (for example, the pixel count exceeds a threshold), the bill goes to landing processing.
Character matching degree detection module: detects the matching degree between the characters in the character region and templates. For example, the module matches each single character segmented by the ROI recognizability detection module against templates, and compares the result with the recognition result of the final character matching and recognition module of the existing recognition system.
Opening feature detection module: detects the opening features of the characters in the character region. The module detects the opening features of each single digit character. In the embodiment, if the three results (opening features, matching degree and recognition result) are consistent, recognition is judged successful and the whole system run is complete; otherwise the bill goes to landing processing.
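The seal check performed by the ROI recognizability detection module can be sketched in Python with the standard `colorsys` module. The hue ranges used for "red" and "blue", the saturation and value limits, and the pixel-count threshold are all assumptions chosen for illustration; the description states only that red or blue seal pixels are counted in HSV space and compared with a threshold.

```python
import colorsys


def count_seal_pixels(roi_rgb, min_saturation=0.4, min_value=0.3):
    """Count pixels in an RGB region (rows of (r, g, b) tuples, 0-255)
    whose HSV colour looks like red or blue seal ink."""
    count = 0
    for row in roi_rgb:
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            hue_deg = h * 360.0
            is_red = hue_deg <= 20.0 or hue_deg >= 340.0   # assumed red hue band
            is_blue = 200.0 <= hue_deg <= 260.0            # assumed blue hue band
            # Saturation and value limits keep black text and white paper out.
            if s >= min_saturation and v >= min_value and (is_red or is_blue):
                count += 1
    return count


def too_much_seal(roi_rgb, max_seal_pixels=30):
    """True when the region holds more seal-coloured pixels than allowed."""
    return count_seal_pixels(roi_rgb) > max_seal_pixels
```

Working in HSV rather than RGB makes the test depend mainly on hue, so dark printed digits (low saturation) are not mistaken for seal ink.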
As mentioned above, to solve the problems of the prior art, the embodiment of the present invention detects, filters, processes or corrects, qualitatively or quantitatively, cheque images containing factors that affect recognition accuracy, before and after each processing stage of the recognition process. On the basis of the prior art, the embodiment adds several detection modules: the image parameter detection module, the image tilt detection module, the character region (ROI) detection module, the ROI recognizability detection module, the character matching degree detection module and the opening feature detection module. They are connected before and after the basic modules of the existing system and check the result of each basic module to ensure the correctness of the image recognition result. Specifically, the image parameter detection module takes the image generated by the image generation module as input and detects the specific parameters of the generated bill image; the image tilt detection module takes the image corrected by the image preprocessing module as input, detects the tilt of the bill face in that image and performs a redundant correction; the character region (ROI) detection module and the ROI recognizability detection module take the image corrected by the image preprocessing module as input, and detect and extract the cheque number region; and the character matching degree detection module takes the binary character images generated by the ROI recognizability detection module and the recognition result of the character feature matching module as input, and performs the final check on the recognition result.
In a specific example, the image parameter detection module can require the image generation module to use certain parameter settings, and detects these image parameters after the image is generated. The scanning device used to acquire images can be required to be a current mainstream flatbed scanner; a scanner with automatic image cropping, such as the Fujitsu fi-5220c high-speed scanner, is recommended. During scanning, the four sides of the cheque should be kept as parallel as possible to the scan frame of the scanner. After detection, a scanned cheque image can, for example, have the following characteristics:
1. The image is a color image with a resolution of 200 dpi;
2. The image is 1500 ± 100 pixels wide and 650 ± 50 pixels high (hereinafter, all image dimensions and coordinates are in pixels);
3. The image is stored in one of 24-bit JPG, TIFF or 256-color BMP format; 24-bit JPG is the recommended output format;
4. The whole bill face is clearly visible in the image, and the background at the image edges outside the bill face is black, i.e. its RGB value is (0, 0, 0);
5. The cheque is not noticeably tilted relative to the whole image, and the cheque has been checked before scanning so that the bill elements, especially the cheque number to be recognized, have not been deliberately altered (under the cheque handling rules, the branch teller is responsible for checking, when accepting a bill, that it is clear and unaltered).
If the parameters of the generated cheque image do not meet the above standards, the image is judged unrecognizable or is rescanned.
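The numeric part of this check (items 1 to 3) can be sketched in Python as below. The `ScanInfo` container and the function name are hypothetical; the limits are the ones listed above. Bit depth, bill face visibility and tilt (items 4 and 5) are not covered by this sketch.

```python
from dataclasses import dataclass

ALLOWED_FORMATS = {"JPG", "TIFF", "BMP"}  # 24-bit JPG, TIFF, 256-colour BMP


@dataclass
class ScanInfo:
    """Hypothetical container for the metadata of one scanned cheque image."""
    width: int         # pixels
    height: int        # pixels
    dpi: int           # scan resolution
    file_format: str   # e.g. "JPG"


def parameter_problems(scan):
    """Return a list of reasons why the scan fails the parameter standard.
    An empty list means the image may proceed to the next module."""
    problems = []
    if scan.dpi != 200:
        problems.append("resolution is not 200 dpi")
    if not 1400 <= scan.width <= 1600:
        problems.append("width outside 1500 +/- 100 pixels")
    if not 600 <= scan.height <= 700:
        problems.append("height outside 650 +/- 50 pixels")
    if scan.file_format.upper() not in ALLOWED_FORMATS:
        problems.append("unsupported image format")
    return problems
```

Returning the list of reasons rather than a single flag makes it easier to decide between rescanning and manual handling.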
In a specific example, the image tilt detection module can take the bill image generated and processed by the preceding module, perform tilt correction on the bill face in the image, and filter out images that are tilted too much, cannot be corrected, or have an abnormal size. The embodiment can specify quantitatively that a cheque image whose tilt angle exceeds ±15 degrees is judged unrecognizable, and that a cheque image whose overall width falls outside the range of 1400 to 1600 pixels, or whose height falls outside the range of 600 to 700 pixels, is judged unrecognizable. Images whose parameters fall within these ranges are tilt-corrected: the coordinates of a set of points in the middle part of the cheque's top edge are first recorded by horizontal scanning, a straight line is fitted to the coordinates, and a bilinear rotation (rotation with bilinear interpolation) is then applied to the image according to the tilt angle of the fitted top-edge line, which completes the tilt correction. The specific flow of tilt correction and detection in the image tilt detection module can be as follows:
1. Detect the resolution Resolution, height Height and width Width of the cheque image. If Width is not in the range of 1400 to 1600 pixels, or Height is not in the range of 600 to 700 pixels, the image is judged unrecognizable; if Resolution is not 200 dpi, the image is judged unrecognizable;
2. Image edge processing: set all pixels on the four edges of the image, i.e. the one-pixel-wide border on each side, to black by changing their RGB values to (0, 0, 0);
3. Cheque edge detection: first, on the horizontal axis, take the midpoint of the image width as the center and determine the abscissa range Width × 0.5 ± 100 pixels. Within this range, start from the pixel at [Width × 0.5 - 100, 1]: with the abscissa fixed, scan the RGB values of the pixels as the ordinate runs from 1 to Height. When the scanned pixel and the two consecutive pixels below it all have R, G and B channel values greater than (50, 50, 50) (the background is black, so this marks the start of the bill face), the edge of the bill face region is considered found; record the coordinates of this edge point and stop scanning this column. Then start again from the pixel at [Width × 0.5 - 100 + 1, 1], again with the abscissa fixed and the ordinate running from 1 to Height, and repeat the scanning step column by column until all columns in [Width × 0.5 - 100, Width × 0.5 + 100] have been scanned. While scanning any column, if the ordinate exceeds Height × 0.25 and the RGB values still do not exceed (50, 50, 50), stop scanning and judge the image unrecognizable. If the ordinate of any edge pixel found differs from the ordinate of the previous or next edge pixel by more than 2 pixels, stop scanning and judge the image unrecognizable;
4. Tilt angle calculation and correction: from the coordinates of the point set on the cheque's top edge, fit the slope of the top-edge line by least squares, and from this slope obtain the tilt angle of the bill face relative to the cheque image. According to the tilt angle, apply a bilinear rotation to the image about the image center to correct the tilt. After the tilt angle is obtained, if it exceeds ±15 degrees, the image is judged unrecognizable. After correction, repeat steps 3 and 4 to obtain the tilt angle of the corrected image; if the tilt angle of the bill face after correction exceeds ±0.5 degrees, the image is judged unrecognizable.
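Steps 3 and 4 can be sketched in Python as follows. The function names, the 0-based coordinates and the list-of-rows image of `(r, g, b)` tuples are assumptions of this sketch, and the edge test treats a pixel as bill face when all of its channels exceed 50, on the reading that the scan starts in the black background. The rotation itself is not shown.

```python
import math


def top_edge_points(image, half_span=100, limit=50, max_depth_ratio=0.25, max_jump=2):
    """Scan the columns around the horizontal centre from the top down and
    return one (x, y) top-edge point per column, or None when the image
    should be treated as unrecognisable."""
    height, width = len(image), len(image[0])
    centre = width // 2
    max_depth = int(height * max_depth_ratio)
    points = []
    for x in range(max(0, centre - half_span), min(width, centre + half_span + 1)):
        edge_y = None
        for y in range(min(max_depth, height - 2)):
            # Edge found when this pixel and the two below it are all brighter than `limit`.
            if all(min(image[y + k][x]) > limit for k in range(3)):
                edge_y = y
                break
        if edge_y is None:
            return None                    # no edge within the top quarter
        if points and abs(edge_y - points[-1][1]) > max_jump:
            return None                    # edge jumps by more than 2 pixels
        points.append((x, edge_y))
    return points


def fit_slope(points):
    """Least-squares slope of y against x for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def tilt_angle_degrees(points):
    """Tilt angle of the fitted top-edge line relative to the image rows."""
    return math.degrees(math.atan(fit_slope(points)))
```

With these helpers, an image would be rejected when `top_edge_points` returns `None` or when `abs(tilt_angle_degrees(points))` exceeds 15 before correction, or 0.5 after correction.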
In instantiation, character zone (ROI) detection module is after the check image corrected, it is necessary to further fixed Interest region (ROI, hereinafter arrange ROI as image in just to include check of the position positioned at the check number in the check upper right corner One sub-regions in the region of numbering, i.e. image) so as to next module extraction character feature.Due to the uncertain black back of the body be present Scape, i.e. face of the value part are not fixed in check image, and later stage printing always has the reason for skew, the branch in the upper right corner The relative coordinate systems in image of ticket numbering is simultaneously not bery fixed.The module is responsible for eliminating the uncertain shift factor and positions check number The position coordinates and size of definite region.Can measure first black background in the image upper right corner it is horizontal and vertical on length Degree, so as to position face of the value upper right corner position fixing position really, then can delimit Position Approximate of the ROI region with respect to nominal value And size, the specific size and position coordinates for further determining that ROI are adjusted finally by dynamic.Idiographic flow can be as follows:
1st, upper right corner black background detection: Since both the check face and the image are rectangular and the tilt has already been corrected, scanning starts from the center point of the top edge of the image, i.e. the pixel at coordinate [Width × 0.5, 1]. With the abscissa fixed, the ordinate is scanned from 1 to Height until three consecutive pixels whose RGB values are all less than (50, 50, 50) are encountered, at which point the top edge of the check face is considered found. The ordinate of this edge pixel is labeled Height_Blackground. Similarly, starting from the center point of the right edge of the image, i.e. the pixel at coordinate [Width, Height × 0.5], with the ordinate fixed, the abscissa is scanned from Width down to 1 until three consecutive pixels whose RGB values are all less than (50, 50, 50) are encountered, marking the right edge of the check face. The difference between the image width Width and the abscissa of this edge pixel is labeled Width_Blackground;
2nd, preliminary ROI setting: The check number region ROI is the rectangle that just contains the digits of the check number. Based on empirical values, the initial upper-left corner of the rectangle can be placed 50 pixels from the top edge of the check face and 300 pixels from the right edge of the check face. The width ROI_Width of the ROI is 200 pixels and the height ROI_Height is 53 pixels;
3rd, dynamic ROI adjustment: First, the green (G) channel of the three RGB channels of the preliminary ROI can be taken as the grayscale map of the ROI, labeled Gray_ROI. Adaptive threshold segmentation is then applied to this grayscale map using the maximum between-class variance method (Otsu's method, OSTU) to obtain the binary map of the initial ROI, labeled Binary_ROI. The coordinate of the upper-left corner of the ROI is labeled [new_ROI_X, new_ROI_Y]. The binary map contains only black and white pixels; the pixel value of a black pixel is set to 0 and that of a white pixel to 1. After the binary map of the ROI is obtained, its four edges, each one pixel wide, are scanned in turn. For the left edge, for example, the scan starts from the pixel at [Width-Width_Blackground-300, Height_Blackground+50] and, with the abscissa held constant, examines the pixels whose ordinate ranges from Height_Blackground+50 to Height_Blackground+50+53. If a black pixel is present, the left edge of the ROI is moved one pixel to the right: the set of pixels (a line segment) starting at [Width-Width_Blackground-300+1, Height_Blackground+50], with the abscissa constant and the ordinate ranging from Height_Blackground+50 to Height_Blackground+50+53, becomes the new left edge, which is checked again for black pixels. The left edge keeps moving right until the new left edge contains no black pixel. The lower edge, right edge and top edge are adjusted in a similar way: the lower edge moves up, the right edge moves left and the top edge moves down. An edge stops moving once the new edge contains no black pixel, i.e. consists entirely of white pixels, and is then taken as the suitable edge. If any edge has moved more than 10 pixels and the new edge still contains black pixels, the image is judged unrecognizable. After adjustment, the new Binary_ROI region is labeled new_Binary_ROI and the coordinate of its upper-left pixel is labeled [new_ROI_X, new_ROI_Y]. If the width new_ROI_Width of new_Binary_ROI is less than 180 pixels, or its height new_ROI_Height is less than 45 pixels, the image is judged unrecognizable.
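The binarization and edge adjustment of the ROI can be sketched roughly as below. This is an illustrative sketch only: the helper names `otsu_threshold`, `binarize_green` and `shrink_roi` are hypothetical, the ROI is assumed to be a list of rows of (R, G, B) tuples with 8-bit channels, and the final size check (180 × 45 pixels) is left to the caller.

```python
def otsu_threshold(gray_values):
    """Return the threshold t that maximizes the between-class variance."""
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize_green(roi_rgb):
    """Use the G channel as the gray level; 0 = black, 1 = white, as in the text."""
    t = otsu_threshold([g for row in roi_rgb for (_, g, _) in row])
    return [[1 if g > t else 0 for (_, g, _) in row] for row in roi_rgb]


def shrink_roi(binary, max_shift=10):
    """Move each edge inward until it is all white.

    Returns (left, top, right, bottom) as inclusive indices, or None when an
    edge would have to move more than max_shift pixels (unrecognizable).
    """
    top, left = 0, 0
    bottom, right = len(binary) - 1, len(binary[0]) - 1

    def column_has_black(x):
        return any(binary[y][x] == 0 for y in range(top, bottom + 1))

    def row_has_black(y):
        return any(binary[y][x] == 0 for x in range(left, right + 1))

    shifts = 0
    while column_has_black(left):        # left edge moves right
        shifts += 1
        if shifts > max_shift or left >= right:
            return None
        left += 1
    shifts = 0
    while row_has_black(bottom):         # lower edge moves up
        shifts += 1
        if shifts > max_shift or bottom <= top:
            return None
        bottom -= 1
    shifts = 0
    while column_has_black(right):       # right edge moves left
        shifts += 1
        if shifts > max_shift or right <= left:
            return None
        right -= 1
    shifts = 0
    while row_has_black(top):            # top edge moves down
        shifts += 1
        if shifts > max_shift or top >= bottom:
            return None
        top += 1
    return left, top, right, bottom


W, K = (255, 255, 255), (0, 0, 0)
roi = [
    [W, W, W, K, W, W, W, W],
    [K, W, W, W, W, W, W, W],
    [W, W, W, W, W, W, W, W],
    [W, W, W, K, K, W, W, W],
    [W, W, W, W, W, W, W, W],
    [W, W, W, W, W, W, W, W],
]
print(shrink_roi(binarize_green(roi)))
```

In this toy ROI the stray black pixels on the left column and top row are trimmed away, while the black pixels in the interior (standing in for a character) are left inside the adjusted region.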
In a specific example, once the ROI detection module has obtained the exact coordinates and size of the ROI, and before the character features of the check number in the ROI are extracted and recognized, the ROI recognizability detection module can scan the ROI to detect noise that affects recognition, such as seals or overlapping handwritten characters. When banks handle check business, the seals they use come in only two colors, pure red or pure blue, so seal detection converts pixels from the RGB color space to the HSV color space and detects the pure red or pure blue of the seal there. In the layout of the check face, the name of the paying bank appears below the check number, and an offset print or a handwritten paying bank name occasionally covers the check number. This can be detected by vertical projection of the binary map new_Binary_ROI generated by the ROI detection module. The specific flow can be as follows:
1st, seal detection: The extent of the ROI is first determined on the image, and then, for each pixel in the region, its RGB value is converted to an HSV (Hue, Saturation, Value) value. By convention, the hue range of pure red is [0, 0.1] and [0.9, 1]; the hue range of pure blue is [0.55, 0.65]; the saturation range of a seal pixel is [0.3, 1]; and the brightness (Value) range of a seal pixel is [0.6, 1]. For any pixel of the ROI, if its HSV value falls within the above ranges, it is considered a seal pixel (a seal pixel being a pixel whose HSV value lies within the above ranges) and is recorded. If, after the whole ROI has been scanned, the number of recorded seal pixels exceeds 25, the number is considered to be covered by a seal and is judged unrecognizable;
2nd, covering character detection: The binary map new_Binary_ROI of the region generated by the previous module is taken, and its vertical projection is computed with the vertical projection function ROI_Projection(x) shown below. Fig. 3 is an example of the vertical projection result of the check number region in this example. As shown in Fig. 3, x denotes the abscissa in pixels, and the value of ROI_Projection(x) is the cumulative sum of the pixel values of the column of pixels at abscissa x in the binary map new_Binary_ROI, with the pixel value pixel_value of a black pixel set to 0 and that of a white pixel set to 1. The specific formula is as follows:
$$\mathrm{ROI\_Projection}(x) = \sum_{i=1}^{\mathrm{new\_ROI\_Height}} \mathrm{pixel\_value}(x, i)$$
where i is the ordinate and new_ROI_Height is the height of the binary map new_Binary_ROI;
After the accumulation result is obtained, it can be clearly seen that, at abscissas in the blank separating regions between characters, ROI_Projection(x) takes the constant value new_ROI_Height × 1. The width of a continuous range of abscissas over which ROI_Projection(x) equals new_ROI_Height × 1 is the width of a blank separating region between characters; the width between two blank separating regions, i.e. the width of the region where ROI_Projection(x) is less than new_ROI_Height × 1, is the width of a character. The vertical projection function is used to measure the widths of the 10 gaps (including the leading and trailing ones) and of the 8 characters; a normal check number has 8 digits. At a resolution of 200 dpi, the character width is required to lie in [8, 20] (in pixels), the gap width in [4, 12] and the character height in [25, 32]. If the number of characters is not 8 or the number of gaps is not 10, or if the width of any character or gap falls outside the above ranges, the number is considered to be covered by other characters and is judged unrecognizable.
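The two checks of this module, seal pixel counting in HSV space and gap/character measurement by vertical projection, might be sketched as follows. The function names are hypothetical, channels are assumed to be 8-bit, hue is on the 0 to 1 scale returned by Python's `colorsys`, and reading the first red hue range as [0, 0.1] is an assumption. A seal pixel is taken here to require hue, saturation and brightness to all be in range.

```python
import colorsys


def is_seal_pixel(r, g, b):
    """True when the pixel's hue, saturation and value all lie in the seal ranges."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    is_red = h <= 0.1 or h >= 0.9
    is_blue = 0.55 <= h <= 0.65
    return (is_red or is_blue) and s >= 0.3 and v >= 0.6


def covered_by_seal(roi_rgb, max_seal_pixels=25):
    """Count seal pixels over the ROI and compare with the limit from the text."""
    count = sum(1 for row in roi_rgb for (r, g, b) in row if is_seal_pixel(r, g, b))
    return count > max_seal_pixels


def vertical_projection(binary):
    """Column sums of a binary map where black = 0 and white = 1."""
    height = len(binary)
    return [sum(binary[y][x] for y in range(height)) for x in range(len(binary[0]))]


def split_runs(projection, height):
    """Return (gap_widths, char_widths) from runs of full and non-full columns."""
    gaps, chars = [], []
    run_is_gap, run_len = None, 0
    for value in projection:
        is_gap = value == height
        if is_gap == run_is_gap:
            run_len += 1
        else:
            if run_is_gap is not None:
                (gaps if run_is_gap else chars).append(run_len)
            run_is_gap, run_len = is_gap, 1
    if run_is_gap is not None:
        (gaps if run_is_gap else chars).append(run_len)
    return gaps, chars


toy = [
    [1, 1, 0, 0, 1, 0, 1],
    [1, 1, 1, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
projection = vertical_projection(toy)
print(projection, split_runs(projection, len(toy)))
```

The lists returned by `split_runs` can then be compared with the expected counts and width ranges given in the text.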
In a specific example, after the character matching recognition module has produced a recognition result, the character matching degree detection module can check the reliability of that result. From the vertical projection function ROI_Projection(x) computed on new_Binary_ROI by the character region detection module, the exact position of each digit character within new_Binary_ROI can be obtained. Each single digit character region can be copied out separately, compared with the character templates for 0 to 9, and its matching degree measured; this quantified matching degree gives a quantitative check of the reliability of the recognition result. The specific flow can be as follows:
1st, single character location: According to the projection function ROI_Projection(x) from the previous module, Binary_ROI is scanned from left to right. The starting coordinate of the N-th continuous region whose function value is less than new_ROI_Height × 1 is recorded in the array Number_Star[N], and the width of that region is recorded in Number_Width[N]. Number_Star and Number_Width are arrays of 8 elements, corresponding to the 8 digits of the check number. Similarly, the horizontal projection function ROI_horizon(y) from the previous module gives the coordinates of the top Number_Top and bottom Number_Down of the characters and the character height Number_Height. With these coordinates, for the N-th character of the number (N ∈ [1, 8], since the check number has only 8 digits), the region of Binary_ROI with upper-left corner [Number_Star[N], Number_Top] and size Number_Width[N] × Number_Height is copied to obtain the binary map of the N-th character, labeled Num_Binary[N];
2nd, digit template generation: A number of clear check images free of seals and other coverings are selected and processed by the modules and steps above to obtain copies of the binary maps of the segmented characters of their check numbers. Clear binary maps of the 10 characters 0 to 9 are chosen as templates, labeled Num_Template[M], where M is an integer from 0 to 9. The M-th template must be the binary map of the digit M, and every digit template must be 25 × 53 in size. If a template falls short of the required height or width, rows or columns of white pixels are added at the top edge and left edge of the character until the required size is reached. Once selected, the templates can be reused and need not be regenerated, so this step is performed only once; if the check font changes, the step can be repeated to generate new templates.
3rd, single character matching degree detection: The binary maps of the digit characters are generated by the previous two steps. For the N-th digit of the check number, where N is an integer from 1 to 8, Num_Binary[N] is matched against each template Num_Template[M] in turn. Fig. 4 is a schematic diagram of matching a character against a template in this example. As shown in Fig. 4, each pixel of Num_Binary[N] is first paired with the corresponding pixel of the region of template Num_Template[0] that has the same shape and size as Num_Binary[N], spanning 1 to Number_Width[N] horizontally and 1 to Number_Height vertically. Over all pixel pairs, the number of pairs that are both black or both white is counted, and this count is normalized by dividing by the number of pixels in Num_Binary[N]; the normalized result is defined as the matching degree. The region in Num_Template[0] is then moved one pixel to the right, i.e. to the region of the same shape and size spanning 2 to Number_Width[N]+1 horizontally and 1 to Number_Height vertically, and the matching degree is computed again. The region keeps moving in this way until Number_Width[N]+H equals the template width of 25 (H being the horizontal offset); the region is then moved to span 1 to Number_Width[N] horizontally and 2 to Number_Height+1 vertically and the matching degree is computed, and so on until the matching degree has been computed for every distinct region of the same shape and size as Num_Binary[N] on the template. The highest of these matching degrees is selected and labeled Match[0]. In the same way, Num_Binary[N] is matched against every other template Num_Template[M] to obtain the corresponding matching degree Match[M]. If Match[I] is the largest (I denoting the digit I), Num_Binary[N] can be recognized as the digit I; the other digits of the number are obtained likewise. The maximum matching degree for Num_Binary[N] is labeled Max_Match[N]. If I is not the recognition result output by the character matching recognition module, the result is considered incorrect and judged unrecognizable; if the matching degree is less than 0.8, the result is likewise considered incorrect and judged unrecognizable.
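The sliding comparison against the templates can be sketched as below. This is a minimal sketch assuming binary maps stored as lists of rows with 0 for black and 1 for white; `match_degree`, `best_match` and `recognize` are hypothetical names, and the toy 3 × 3 "templates" stand in for the 25 × 53 digit templates of the text.

```python
def match_degree(char, template, dx, dy):
    """Fraction of character pixels equal to the template pixels at offset (dx, dy)."""
    h, w = len(char), len(char[0])
    same = sum(
        1
        for y in range(h)
        for x in range(w)
        if char[y][x] == template[y + dy][x + dx]
    )
    return same / (h * w)


def best_match(char, template):
    """Highest matching degree over every placement of the character on the template."""
    h, w = len(char), len(char[0])
    th, tw = len(template), len(template[0])
    return max(
        match_degree(char, template, dx, dy)
        for dy in range(th - h + 1)
        for dx in range(tw - w + 1)
    )


def recognize(char, templates):
    """Return (digit, matching degree) for the template that matches best."""
    scores = {digit: best_match(char, t) for digit, t in templates.items()}
    digit = max(scores, key=scores.get)
    return digit, scores[digit]


# Toy data: key 0 is an all-white template, key 1 has a black 2 x 2 block.
templates = {
    0: [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
    1: [[1, 1, 1], [1, 0, 0], [1, 0, 0]],
}
char = [[0, 0], [0, 0]]
print(recognize(char, templates))
```

The caller would then compare the returned digit with the output of the recognition module and the returned degree with the 0.8 threshold.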
In a specific example, after the recognition result has passed the character matching degree detection module, its correctness still needs to be verified by the opening feature detection module. For example, the opening structure of each region of the character's binary map can be detected; if the detected opening structure agrees with the opening features of the recognized digit, the recognition is considered correct. Different digits have different opening features, so a character can be encoded by detecting whether each of its four regions is open. If the opening features of the binary map do not match those of the digit given by the recognition result, the image can be judged unrecognizable. The specific flow can be as follows:
1st, opening structure detection: Fig. 5 is an example diagram of opening structure detection. As shown in Fig. 5, the character binary map is divided into four regions: upper left, lower left, lower right and upper right. The upper-left region is checked first. Starting from the center point P_Top of the upper half of the character binary map, a horizontal scan line, recorded as L1, runs toward the left boundary and stops at the first black pixel, which is recorded as P1. If the scan reaches the left edge without meeting a black pixel, the upper-left region is directly considered open and the next region is checked. If P1 is found, a horizontal scan to the right starts from the midpoint of the left edge of the character binary map; this scan line is recorded as L2 and stops at the first black pixel, recorded as P2. If the abscissa of P2 is smaller than that of P1, the horizontal scan is restarted from the pixel one above the left-edge midpoint (i.e. the pixel with the smaller ordinate), and this is repeated until a P2 is found whose abscissa is larger than that of P1. If the ordinate of the scan starting point reaches 0 and no such P2 has been found, the region is considered not open. Once P2 is found, a vertical scan toward L2 starts from P1 on scan line L1 and continues until it meets L2; this scan line is recorded as L3. If there is a black pixel on L3, the starting point is moved one pixel to the right of P1 along L1 and the downward vertical scan is restarted. If, by the time the starting point reaches P_Top, no L3 consisting entirely of white pixels has been found, the region is considered not open; otherwise it is considered open. The lower-left, lower-right and upper-right regions are judged in the same way;
2nd, opening coding: The opening detection results are encoded from top to bottom and from left to right. An open region is marked 1 and a closed region 0, so the detection results of the four regions are encoded as four bits. The opening codes of the digits 0, 3, 5, 6, 8 and 9 are as follows:
0:0000 3:1100 5:0101
6:0001 8:0000 9:0101
If a character has been recognized as one of the above digits but the result of the opening detection does not agree with the corresponding opening code, it is judged unrecognizable;
3rd, width detection: A character recognized as 1 is judged unrecognizable if its width exceeds 12 pixels. A character recognized as 4 is judged unrecognizable if its width exceeds 16 pixels. For a character recognized as 2, every row of pixels of its binary map is scanned; if, in the 3 rows of pixels at the bottom edge, the proportion of black pixels in each row is less than 8/10, or, in the three rows of pixels in the middle of the character binary map, the proportion of black pixels in each row is more than 1/3, it is judged unrecognizable. For a character recognized as 7, if, in the three rows of pixels at the top edge, the proportion of black pixels in each row is less than 8/10, or the proportion of black pixels in each row of the lower half of the character binary map is more than 1/3, it is judged unrecognizable.
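The opening codes and the simple width limits can be expressed as lookup tables, as in the sketch below. The names `OPENING_CODES`, `MAX_WIDTH`, `encode_openings` and `passes_shape_checks` are hypothetical. The bit order (upper left, lower left, lower right, upper right) is assumed from the order in which the four regions are listed, and the row-proportion rules for the digits 2 and 7 are left out for brevity.

```python
# Codes copied from the table in the text.
OPENING_CODES = {0: "0000", 3: "1100", 5: "0101", 6: "0001", 8: "0000", 9: "0101"}
# Width limits in pixels for the digits 1 and 4, as stated in the text.
MAX_WIDTH = {1: 12, 4: 16}


def encode_openings(upper_left, lower_left, lower_right, upper_right):
    """Encode four open/closed flags as a 4-character string (open = 1)."""
    flags = (upper_left, lower_left, lower_right, upper_right)
    return "".join("1" if is_open else "0" for is_open in flags)


def passes_shape_checks(digit, openings, width):
    """Return False when the opening code or the width contradicts the digit."""
    expected = OPENING_CODES.get(digit)
    if expected is not None and encode_openings(*openings) != expected:
        return False
    limit = MAX_WIDTH.get(digit)
    if limit is not None and width > limit:
        return False
    return True


print(passes_shape_checks(3, (True, True, False, False), 14))
print(passes_shape_checks(8, (True, False, False, False), 14))
```

Digits without an entry in a table simply skip that check, which mirrors the text: only 0, 3, 5, 6, 8 and 9 have opening codes.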
Based on the same inventive concept, an embodiment of the present invention also provides a digital image processing method applied to bill image character recognition, as described in the following embodiments. Since the principle by which this method solves the problem is similar to that of the digital image processing system applied to bill image character recognition, the implementation of the method may refer to the implementation of the system, and repeated parts are not described again.
The digital image processing method applied to bill image character recognition in an embodiment of the present invention can include:
detecting bill image parameters;
detecting the degree of tilt of the bill face in the bill image;
locating the character region;
detecting the character attributes and seal pixels in the character region;
performing matching degree detection between the characters in the character region and the templates;
detecting the opening features of the characters in the character region.
In specific implementation, detecting bill image parameters can include:
detecting whether the bill image meets the resolution, image size and image format requirements and whether it contains the complete bill picture, and determining that the bill is to be handled by landing processing when the requirements are not met.
In specific implementation, detecting the degree of tilt of the bill face in the bill image can include:
obtaining the tilt angle of the bill face by scanning the edges of the bill face in the bill image; performing tilt correction and checking the correction result when the tilt angle does not exceed a threshold; and determining that the bill is to be handled by landing processing when the tilt angle exceeds the threshold or tilt remains after correction.
In specific implementation, performing tilt correction can include:
recording the coordinates of the point set in the middle part of the top edge of the check by transverse scanning, fitting a straight line to the recorded coordinates, and rotating the bill image with bilinear interpolation according to the tilt angle of the fitted top-edge line.
In specific implementation, locating the character region can include:
obtaining the position coordinates and size of the check number region in the bill image according to the bill face structure and the bill face offset in the bill image, and judging whether the check number region can be segmented; if it cannot, determining that the bill is to be handled by landing processing.
In specific implementation, locating the character region can include:
measuring the horizontal and vertical extent of the black background at the upper right corner of the image to locate the exact coordinates of the upper right corner of the check face; delimiting the position and size of the character region relative to the bill face; and determining the exact size and position coordinates of the character region by dynamic adjustment.
In specific implementation, detecting the character attributes and seal pixels in the character region can include:
locating and segmenting each single digit character in the character region, and detecting whether the number, width and height of the digits and of the gaps between digits meet the character parameter requirements; scanning the character region in the HSV color space to detect seal pixels in the character region; and, if the character attribute requirements are not met or the number of seal pixels exceeds a threshold, determining that the bill is to be handled by landing processing.
In specific implementation, performing matching degree detection between the characters in the character region and the templates can include:
performing matching degree detection between each single digit character segmented from the character region and the templates, and comparing the result with the bill image character recognition result.
In specific implementation, detecting the opening features of the characters in the character region can include:
detecting the opening features of each single digit character segmented from the character region; when the opening features, the matching degree and the recognition result are consistent, determining that character recognition has succeeded, and otherwise determining that the bill is to be handled by landing processing.
As mentioned above, the core idea of the embodiments of the present invention is to add, on the basis of existing OCR, an independent detection step before or after each step of the existing recognition flow. These additional detection steps are dedicated to detecting noise that would prevent the next recognition step from recognizing correctly; if such noise is found, the image is judged unrecognizable and handed to landing processing, thereby avoiding possible recognition errors.
Fig. 6 is an example flow diagram of the digital image processing method applied to bill image character recognition in an embodiment of the present invention. Fig. 6 shows the image parameter detection, image tilt detection, character region detection, ROI recognizability detection, character matching degree detection and opening feature detection steps, and how they are combined with the existing recognition flow. Each step in Fig. 6 has a corresponding module in Fig. 2, i.e. each of the above detection steps can be implemented by a corresponding independent module in the new system. The output of each step of the recognition flow is subjected to a dedicated check, and the result of each check in turn serves as the condition that controls whether the next step of the recognition flow is carried out. The specific steps of the whole method can for example include:
Step 1: Generate the bill image. This step generates a digital image copy of the bill to be processed, so an optical imaging device such as a flatbed scanner is needed to obtain the digital image of the bill. The bill itself must be visually inspected by a person: the bill face must be clear and free of alterations, and the generated image must also be clear and legible. Step 2 is performed after the image is generated.
Step 2: Detect bill image parameters. This step takes the bill image generated in step 1 as input and checks its specific parameters, including: whether the resolution of the image is 200 dpi; whether the size of the image is (1500 ± 100) × (600 ± 50); and whether the background at the image edges outside the check face is pure black, i.e. has an RGB value of (0, 0, 0). If the generated bill image does not meet the requirements, it is judged unrecognizable and either the image is regenerated or the bill is handed to landing processing; if it passes, step 3 is performed.
Step 3: Image preprocessing. This step takes the bill image generated in step 1 as input and applies color, illumination and tilt correction to it, in order to eliminate the effects on the image of uncertain factors such as different imaging devices, imaging environments and manual operation. Step 4 is performed after preprocessing.
Step 4: Detect image tilt. This step takes the preprocessed image output by step 3 as input and checks the degree of tilt of the bill face in the image (this detection step is functionally redundant with, and independent of, step 3, which gives higher reliability). It detects the angle of intersection (the angle smaller than 90 degrees) between the straight line of the bill face edge and the straight line of the image edge. If the angle exceeds ±15 degrees, the bill image is judged unrecognizable; if it is within ±15 degrees, the image is corrected by rotation with bilinear interpolation and the tilt angle is checked again after correction. If the remaining tilt still exceeds ±1 degree, the image is judged unrecognizable and handed to landing processing; if it passes, step 5 is performed.
Step 5: Extract character features. This step takes the preprocessed image output by step 3 as input and extracts the quantized features of the characters to be recognized. It first locates the position of the character region in the bill image, then locates each character within the character region one by one, and extracts the quantized features of each character. Step 6 is performed after extraction.
Step 6: Detect the character region. This step takes the image output by step 4 after tilt detection and correction as input and checks whether the character region to be recognized meets the conditions for further recognition. It first detects the position of the upper right corner of the bill face in the bill image, i.e. the distance from the top edge of the bill face to the top edge of the image and from the right edge of the bill face to the right edge of the image, and, since the position of the bill number region to be recognized (i.e. the character region) is relatively fixed on the bill face, derives the initial position of the character region in the image. The binary map of the character region is then obtained with the maximum between-class variance method, and the edges of the generated binary map are checked for black pixels. For example, if the top edge of the binary map of the character region contains a black pixel, the entire row of pixels at the top edge is removed from the character region so that the top edge of the binary map moves down by one pixel, and this is repeated until the new top edge contains no black pixel. Similarly, if the left edge of the binary map contains a black pixel, the left edge is moved one pixel to the right, and this is repeated until the new left edge contains no black pixel. The lower edge and right edge are adjusted in the same manner. By adjusting the edges around the region step by step, the character region shrinks gradually until its exact position and size are obtained. If the adjustment of any edge of the character region exceeds 10 pixels, the image is judged unrecognizable and handed to landing processing; otherwise step 7 is performed.
Step 7: Detect whether the characters in the character region are recognizable. This step takes the character region binary map output by step 6 as input and checks whether binary maps of recognizable single characters can be segmented from the character region. First, the part of the bill image corresponding to the character region is copied out and the color space of the copy is converted from RGB to HSV; the copy is then scanned for pixels whose hue lies in [0, 0.1] or [0.9, 1] (pure red) or in [0.55, 0.65] (pure blue), whose saturation lies in [0.3, 1] and whose brightness lies in [0.6, 1], i.e. for coverage by a seal. In addition, the binary map of the character region is scanned horizontally and vertically to detect the size and number of the characters in the character region and the number and width of the gaps between them. In the vertical scan, for each fixed horizontal coordinate of the character region binary map, the column of pixels from the top edge down to the lower edge is scanned; consecutive columns containing no black pixel are taken to be a gap between characters, and consecutive columns containing black pixels are taken to be a character. In the detection result, the number of gaps must be 10 and the number of characters 8, the character width (in pixels) must lie in [8, 20], the gap width in [4, 12], and the character height in [25, 32]. If these criteria are not met, the image is judged unrecognizable and handed to landing processing; otherwise step 8 is performed.
Step 8: Match character features. This step takes the character features extracted in step 5 as input and looks up and identifies the digit corresponding to each character. Step 9 is performed after it finishes.
Step 9: Detect matching degree and opening features. This step takes the binary map of each character segmented in step 7 and the recognition result of step 8 as input and checks the correctness of the recognition result. The character to be recognized is first matched against the character templates 0 to 9: a binary XOR is taken between the value of every pixel of the character binary map and the corresponding pixel of each character template, the number of XOR results equal to 0 is counted, and this count is normalized by dividing by the total number of pixels of the character binary map. This value is defined as the matching degree between the character and the digit corresponding to that template, and the digit with the highest matching degree is taken as the actual value of the character. If the highest matching degree is less than 0.95, or the matching result is inconsistent with the recognition result of step 8, the image is judged unrecognizable and handed to landing processing; otherwise step 10 is performed for further checking.
Step 10: Detect opening features. Like step 9, this step takes the binary map of each character segmented in step 7 and the recognition result of step 8 as input and verifies the correctness of the recognition result. First the recognition result is determined, and then whether the opening features of the binary map of the character match the recognition result is verified. Starting from the upper center point of the character binary map (the center point of the upper half of the character's binary map), a horizontal line L1 is drawn toward the left edge of the binary map until a black pixel (denoted p1) or the left edge is met. If there is no black pixel on the line drawn, the character is considered open at the upper left (the left half of the upper half of the character binary map). Otherwise, starting from the midpoint of the left edge of the character binary map, a horizontal line L2 is drawn toward the right edge until a black pixel (denoted p2) is met; if p2 lies to the left of p1, L2 is redrawn starting from the pixel one above the previous starting point of L2. After L1 and L2 are obtained, a vertical line L3 is drawn from p1 to L2, and then redrawn repeatedly toward L2 from a new starting point one pixel further to the right of p1 on L1, until the intersection of the vertical line L3 with L2 is p2. If a vertical line L3 without black pixels can be found, the upper left of the character is considered open; otherwise it is considered closed. The opening properties of the lower left, lower right and upper right of the character binary map can be detected likewise.
Defining open as 1 and closed as 0, and taking the quadrants in the order upper left, lower left, lower right, upper right, the opening features of the characters 0, 3, 5, 6, 8 and 9 are converted into the following 4-bit codes:
0:0000 3:1100 5:0101
6:0001 8:0000 9:0101
If the recognition result of step 8 is one of the above characters but the detected openings do not agree with the code above, the image is judged unrecognizable and sent to landing processing.
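The comparison against the code table can be sketched as a simple lookup. The table values are copied unchanged from the text above; the function names and the choice to let digits outside the table pass this particular check are assumptions made for illustration.

```python
# 4-bit opening codes as listed in the text, in the order
# upper left, lower left, lower right, upper right (1 = open, 0 = closed).
OPENING_CODES = {0: "0000", 3: "1100", 5: "0101", 6: "0001", 8: "0000", 9: "0101"}


def encode_openings(upper_left, lower_left, lower_right, upper_right):
    """Turn four open/closed flags into a 4-character code string."""
    flags = (upper_left, lower_left, lower_right, upper_right)
    return "".join("1" if flag else "0" for flag in flags)


def opening_check_passes(digit, upper_left, lower_left, lower_right, upper_right):
    """Return False when the detected openings contradict the recognized digit."""
    expected = OPENING_CODES.get(digit)
    if expected is None:
        # Digits without an entry in the table are not judged by this check.
        return True
    return encode_openings(upper_left, lower_left, lower_right, upper_right) == expected
```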
When the recognition result is 1, 2, 4 or 7, character width detection is required. When the recognition result is 1, the character is judged unrecognizable if its width exceeds 14 pixels. When the recognition result is 2, the character is judged unrecognizable if the proportion of black pixels in the bottom three rows of the character's binary image is below 80%, or if the proportion of black pixels in each of the middle three rows of the binary image is above 30%. When the recognition result is 4, the character is judged unrecognizable if its width exceeds 16 pixels. When the recognition result is 7, the character is judged unrecognizable if the proportion of black pixels in each of the three rows at the top edge is below 80%, or if the proportion of black pixels in each row of the lower half of the character's binary image exceeds 30%.
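The checks for the digits 1, 2, 4 and 7 can be sketched as below. The thresholds come from the text; everything else is an assumption for illustration: black is 0 and white is 1, the character width is taken as the width of a tightly cropped binary image, the "middle three rows" are centered on mid-height, and the per-row conditions are read as having to hold for every row in the band (the text is ambiguous on this point). The names `black_ratio` and `shape_check_passes` are hypothetical.

```python
BLACK = 0  # assumed convention: 0 = black (ink), 1 = white


def black_ratio(rows):
    """Proportion of black pixels in the given rows."""
    total = sum(len(row) for row in rows)
    black = sum(1 for row in rows for pixel in row if pixel == BLACK)
    return black / total


def shape_check_passes(digit, img):
    """Return False when the character should be judged unrecognizable."""
    height, width = len(img), len(img[0])
    if digit == 1:
        return width <= 14
    if digit == 4:
        return width <= 16
    if digit == 2:
        bottom = img[-3:]
        mid_start = height // 2 - 1
        middle = img[mid_start:mid_start + 3]
        bar_missing = black_ratio(bottom) < 0.80
        middle_too_dense = all(black_ratio([row]) > 0.30 for row in middle)
        return not (bar_missing or middle_too_dense)
    if digit == 7:
        top = img[:3]
        lower_half = img[height // 2:]
        bar_missing = all(black_ratio([row]) < 0.80 for row in top)
        lower_too_dense = all(black_ratio([row]) > 0.30 for row in lower_half)
        return not (bar_missing or lower_too_dense)
    return True  # other digits are handled by the opening-feature check
```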
Once this step is finished, the whole recognition process is complete and recognition of the next bill image begins.
Compared with the prior art, the digital image processing system and method applied to bill image character recognition of the embodiments of the present invention improve mainly on the following two points:
1. A different strategy. Most existing recognition techniques follow the strategy of "eliminating as far as possible the noise in the image that affects recognition, so as to ensure correct recognition". However, the "noise that affects recognition" comes in many kinds. For reasons of efficiency, cost and feasibility it is difficult, indeed impossible, to detect every kind of noise, and even noise that can be detected is hard to remove completely. The embodiments of the present invention adopt a different strategy: "detect as far as possible the factors that affect recognition and filter out the images that are hard to recognize". That is, no attempt is made to eliminate noise that is hard to handle; instead, the presence of such noise is detected and the images that contain it are excluded from the recognition process. This naturally and effectively avoids the recognition errors that such noise would cause. When the number of noisy images is small (for example, in intra-city bill exchange the images filtered out by the embodiments of the present invention do not exceed 30% of the total), the recognition accuracy can be improved significantly while still ensuring that most images are recognized. (Note: the recognition accuracy is defined here as the ratio of the number of images whose recognition result is correct to the number of images that the recognition program judges to be successfully recognized; the recognizable rate is defined as the ratio of the number of images that the recognition program judges to be successfully recognized to the number of all images to be recognized.)
2. Designed specifically for checks. For reasons of cost and generality, the prior art is not developed for any one specific recognition scenario. The embodiments of the present invention are designed specifically for recognizing check numbers: the structure of the bill face, the use and circulation process, image generation and other stages are analyzed, every kind of noise that may affect recognition is considered quantitatively, and detection methods are given for each. In other words, the embodiments of the present invention provide a parameterized model of the check number. The model describes features such as differences in how check images are scanned and generated, the structure of the bill face, seal coverage, character coverage and the check number font quantitatively with a series of parameters, and each corresponding step of the recognition process tests these parameters. If a test result does not reach the given criterion, the image is judged unrecognizable. As a result, the recognition result obtained for an image that passes through the whole recognition flow is significantly more accurate than that of existing systems.
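The two rates defined in the note under point 1 can be written compactly. The symbols below are introduced here only for illustration and do not appear in the original: N_total is the number of images submitted for recognition, N_accepted the number that the recognition program judges to be successfully recognized, and N_correct the number of those whose result is actually correct.

```latex
\text{recognizable rate} = \frac{N_{\text{accepted}}}{N_{\text{total}}},
\qquad
\text{recognition accuracy} = \frac{N_{\text{correct}}}{N_{\text{accepted}}}
```

Under these definitions, filtering out more images lowers the recognizable rate, while the recognition accuracy is measured only over the images that were not filtered out.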
Experimental results are given below as an example. The following table records the results of one verification experiment based on the embodiment of the present invention. The experimental data are the intra-city exchange bill images of a certain bank over a total of 30 days between October and November 2009, nearly 60,000 bill images in all, about 2,000 bill images per day on average, with an image resolution of 200 dpi. The data used to check the correctness of the recognition results are the manually entered records of local bills from the branch's accounting business processing center. The whole recognition program is written in C; the development platform is VC6.0 + OpenCV, and the test data are stored in an Oracle 10g database.
The "different numbers detected" column of the following table gives the number of cases in which the check number in the database field differs from the result produced by the recognition program. Manual verification showed that the result of the recognition program was correct in these cases; the difference was an entry error caused by a mismatch between the bill image name and the name field of the corresponding database record. The statistics show that the average recognizable rate exceeds 70%, at about 72%, and that the recognition accuracy is 100%.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The specific embodiments described above explain the purpose, technical solution and beneficial effects of the present invention in further detail. It should be understood that the foregoing is only a specific embodiment of the present invention and is not intended to limit its scope of protection; any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (14)

  1. A digital image processing system applied to bill image character recognition, characterized by comprising:
    an image parameter detection module, for detecting parameters of the bill image;
    an image tilt detection module, for detecting the degree of tilt of the bill face in the bill image;
    a character region detection module, for locating the character region;
    a character region recognizability detection module, for detecting character properties and seal pixels in the character region;
    a character matching degree detection module, for performing matching degree detection between the characters in the character region and templates;
    an opening feature detection module, for detecting the opening features of the characters in the character region;
    wherein the character region detection module is specifically configured to:
    measure the horizontal and vertical extent of the black background at the upper right corner of the image, so as to determine the coordinates of the upper right corner of the bill face; delimit the position and size of the character region relative to the bill face; and determine the specific size and position coordinates of the character region by dynamic adjustment;
    according to the structure of the bill face in the bill image and the offset of the bill face, obtain the position coordinates and size of the check number region in the bill image, judge whether the check number region can be segmented, and, if the check number region cannot be segmented, determine that the bill is to undergo landing processing;
    and the character region recognizability detection module is specifically configured to:
    perform vertical projection detection on the binary image new_Binary_ROI of the character region generated by the character region detection module: when performing covered-character detection, obtain the binary image new_Binary_ROI and obtain its vertical projection from the vertical projection function ROI_Projection(x), where x is the abscissa in pixels and the value of ROI_Projection(x) is the sum of the pixel values of the pixels in the column of the binary image new_Binary_ROI whose abscissa is x; the pixel value pixel_value of a black pixel is set to 0 and that of a white pixel to 1; the formula is as follows:
    $$\mathrm{ROI\_Projection}(x) = \sum_{i=1}^{\mathrm{new\_ROI\_Height}} \mathrm{pixel\_value}(x, i);$$
    where i is the ordinate and new_ROI_Height is the height (column length) of the binary image new_Binary_ROI;
    after the sums are obtained, ROI_Projection(x) takes a constant value, namely new_ROI_Height × 1, at abscissas that lie in the blank separating regions between characters; the width of a run of consecutive abscissas at which ROI_Projection(x) equals new_ROI_Height × 1 is the width of a blank separating region between characters; the width between two blank separating regions, that is, the width of a region in which ROI_Projection(x) is less than new_ROI_Height × 1, is the width of a character; the widths of the 10 gaps and the 8 characters are measured with the vertical projection function; a normal check number has 8 digits, and at a resolution of 200 dpi the character width is set to the range [8, 20], the gap width to the range [4, 12], and the character height to the range [25, 32]; if the number of characters or the number of gaps is not 8 and 10 respectively, or the width of any character or of any gap falls outside the above ranges, the characters are considered to be covered by other characters and the image is judged unrecognizable.
  2. The system as claimed in claim 1, characterized in that the image parameter detection module is specifically configured to:
    detect whether the bill image meets the resolution, image size and image format requirements and whether it contains the complete bill picture, and determine that the bill is to undergo landing processing when the requirements are not met.
  3. The system as claimed in claim 1, characterized in that the image tilt detection module is specifically configured to:
    obtain the tilt angle of the bill face by scanning the edge of the bill face in the bill image; perform tilt correction and check the correction result when the tilt angle does not exceed a threshold; and determine that the bill is to undergo landing processing when the tilt angle exceeds the threshold or tilt remains after correction.
  4. The system as claimed in claim 3, characterized in that the image tilt detection module is specifically configured to:
    record, by horizontal scanning, the coordinates of a set of points in the middle part of the upper edge of the check, fit a straight line to the recorded coordinates, and then perform bilinear rotation of the bill image according to the tilt angle of the fitted upper-edge line.
  5. The system as claimed in claim 1, characterized in that the character region recognizability detection module is specifically configured to:
    locate and segment each single digit character in the character region, and detect whether the number, width and height of the digits and of the gaps between digits meet the character parameter requirements; scan the character region in the HSV color space and detect the seal pixels in the character region; and, if the character property requirements are not met or the number of seal pixels exceeds a threshold, determine that the bill is to undergo landing processing.
  6. The system as claimed in claim 5, characterized in that the character matching degree detection module is specifically configured to:
    perform matching degree detection between each single digit character located and segmented in the character region and the templates, and compare the result with the bill image character recognition result.
  7. The system as claimed in claim 6, characterized in that the opening feature detection module is specifically configured to:
    detect the opening features of each single digit character located and segmented in the character region; determine that character recognition has succeeded when the opening features, the matching degree and the recognition result are consistent, and otherwise determine that the bill is to undergo landing processing.
  8. A digital image processing method applied to bill image character recognition, characterized by comprising:
    detecting parameters of the bill image;
    detecting the degree of tilt of the bill face in the bill image;
    locating the character region;
    detecting character properties and seal pixels in the character region;
    performing matching degree detection between the characters in the character region and templates;
    detecting the opening features of the characters in the character region;
    wherein locating the character region comprises:
    measuring the horizontal and vertical extent of the black background at the upper right corner of the image, so as to determine the coordinates of the upper right corner of the bill face; delimiting the position and size of the character region relative to the bill face; and determining the specific size and position coordinates of the character region by dynamic adjustment;
    according to the structure of the bill face in the bill image and the offset of the bill face, obtaining the position coordinates and size of the check number region in the bill image, judging whether the check number region can be segmented, and, if the check number region cannot be segmented, determining that the bill is to undergo landing processing;
    and detecting the character properties in the character region specifically comprises:
    performing vertical projection detection on the binary image new_Binary_ROI of the character region generated by the character region detection module: when performing covered-character detection, obtaining the binary image new_Binary_ROI and obtaining its vertical projection from the vertical projection function ROI_Projection(x), where x is the abscissa in pixels and the value of ROI_Projection(x) is the sum of the pixel values of the pixels in the column of the binary image new_Binary_ROI whose abscissa is x; the pixel value pixel_value of a black pixel is set to 0 and that of a white pixel to 1; the formula is as follows:
    $$\mathrm{ROI\_Projection}(x) = \sum_{i=1}^{\mathrm{new\_ROI\_Height}} \mathrm{pixel\_value}(x, i);$$
    where i is the ordinate and new_ROI_Height is the height (column length) of the binary image new_Binary_ROI;
    after the sums are obtained, ROI_Projection(x) takes a constant value, namely new_ROI_Height × 1, at abscissas that lie in the blank separating regions between characters; the width of a run of consecutive abscissas at which ROI_Projection(x) equals new_ROI_Height × 1 is the width of a blank separating region between characters; the width between two blank separating regions, that is, the width of a region in which ROI_Projection(x) is less than new_ROI_Height × 1, is the width of a character; the widths of the 10 gaps and the 8 characters are measured with the vertical projection function; a normal check number has 8 digits, and at a resolution of 200 dpi the character width is set to the range [8, 20], the gap width to the range [4, 12], and the character height to the range [25, 32]; if the number of characters or the number of gaps is not 8 and 10 respectively, or the width of any character or of any gap falls outside the above ranges, the characters are considered to be covered by other characters and the image is judged unrecognizable.
  9. The method as claimed in claim 8, characterized in that detecting the parameters of the bill image comprises:
    detecting whether the bill image meets the resolution, image size and image format requirements and whether it contains the complete bill picture, and determining that the bill is to undergo landing processing when the requirements are not met.
  10. The method as claimed in claim 8, characterized in that detecting the degree of tilt of the bill face in the bill image comprises:
    obtaining the tilt angle of the bill face by scanning the edge of the bill face in the bill image; performing tilt correction and checking the correction result when the tilt angle does not exceed a threshold; and determining that the bill is to undergo landing processing when the tilt angle exceeds the threshold or tilt remains after correction.
  11. The method as claimed in claim 10, characterized in that performing the tilt correction comprises:
    recording, by horizontal scanning, the coordinates of a set of points in the middle part of the upper edge of the check, fitting a straight line to the recorded coordinates, and then performing bilinear rotation of the bill image according to the tilt angle of the fitted upper-edge line.
  12. The method as claimed in claim 8, characterized in that detecting the character properties and seal pixels in the character region comprises:
    locating and segmenting each single digit character in the character region, and detecting whether the number, width and height of the digits and of the gaps between digits meet the character parameter requirements; scanning the character region in the HSV color space and detecting the seal pixels in the character region; and, if the character property requirements are not met or the number of seal pixels exceeds a threshold, determining that the bill is to undergo landing processing.
  13. The method as claimed in claim 12, characterized in that performing matching degree detection between the characters in the character region and the templates comprises:
    performing matching degree detection between each single digit character located and segmented in the character region and the templates, and comparing the result with the bill image character recognition result.
  14. The method as claimed in claim 13, characterized in that detecting the opening features of the characters in the character region comprises:
    detecting the opening features of each single digit character located and segmented in the character region; determining that character recognition has succeeded when the opening features, the matching degree and the recognition result are consistent, and otherwise determining that the bill is to undergo landing processing.
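The vertical projection test recited in claims 1 and 8 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the binary image is a nested list with black = 0 and white = 1 as in the claims, the expected numbers of characters and gaps are passed in by the caller (the claims state 8 and 10), the character height check is omitted, and the function names are hypothetical.

```python
WHITE = 1  # convention from the claims: black = 0, white = 1


def vertical_projection(img):
    """ROI_Projection(x): sum of the pixel values in column x."""
    height, width = len(img), len(img[0])
    return [sum(img[i][x] for i in range(height)) for x in range(width)]


def split_runs(projection, height):
    """Group columns into alternating runs of ('gap', w) and ('char', w).

    A column belongs to a gap when its projection equals height * 1,
    that is, when every pixel in the column is white.
    """
    runs = []
    for value in projection:
        kind = "gap" if value == height * WHITE else "char"
        if runs and runs[-1][0] == kind:
            runs[-1][1] += 1
        else:
            runs.append([kind, 1])
    return [(kind, width) for kind, width in runs]


def region_is_recognizable(img, n_chars, n_gaps,
                           char_width_range=(8, 20), gap_width_range=(4, 12)):
    """Check run counts and widths; the ranges are those given for 200 dpi."""
    runs = split_runs(vertical_projection(img), len(img))
    char_widths = [w for kind, w in runs if kind == "char"]
    gap_widths = [w for kind, w in runs if kind == "gap"]
    if len(char_widths) != n_chars or len(gap_widths) != n_gaps:
        return False
    lo_c, hi_c = char_width_range
    lo_g, hi_g = gap_width_range
    chars_ok = all(lo_c <= w <= hi_c for w in char_widths)
    gaps_ok = all(lo_g <= w <= hi_g for w in gap_widths)
    return chars_ok and gaps_ok
```

A seal stroke or stray character that bridges two digits merges their runs, which changes the run counts or pushes a width outside its range; that is how this test flags covered characters.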
CN201410276103.0A 2014-06-19 2014-06-19 Digital image processing system and method applied to bill image character recognition Active CN104112128B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410276103.0A CN104112128B (en) 2014-06-19 2014-06-19 Digital image processing system and method applied to bill image character recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410276103.0A CN104112128B (en) 2014-06-19 2014-06-19 Digital image processing system and method applied to bill image character recognition

Publications (2)

Publication Number Publication Date
CN104112128A CN104112128A (en) 2014-10-22
CN104112128B true CN104112128B (en) 2018-01-26

Family

ID=51708912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410276103.0A Active CN104112128B (en) 2014-06-19 2014-06-19 Digital image processing system and method applied to bill image character recognition

Country Status (1)

Country Link
CN (1) CN104112128B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant