CN105225218A - Distortion correction method and apparatus for document images - Google Patents

Info

Publication number
CN105225218A
Authority
CN
China
Prior art keywords
baseline
text
short
file
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410286333.5A
Other languages
Chinese (zh)
Other versions
CN105225218B (en
Inventor
魏晓明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to CN201410286333.5A priority Critical patent/CN105225218B/en
Publication of CN105225218A publication Critical patent/CN105225218A/en
Application granted granted Critical
Publication of CN105225218B publication Critical patent/CN105225218B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Processing (AREA)

Abstract

The present invention relates to a distortion correction method and apparatus for document images. The distortion correction method comprises: a baseline extraction step of extracting the baselines of the text lines in a text region of the document image, where each text line corresponds to one baseline; a baseline extension step of extending the short baselines among the extracted baselines based on the long baselines among the extracted baselines; and a correction step of correcting the distortion of the document image based on the extracted long baselines and the extended short baselines.

Description

Distortion correction method and apparatus for document images
Technical field
The present invention relates to a method and apparatus for correcting the distortion of a document image. More specifically, the present invention relates to a method and apparatus that correct the distortion of a document image at least by extending the short text lines contained in the document image.
Background art
In recent years, information technology has developed rapidly in fields such as computer vision and image processing and understanding. Electronic document processing in particular has attracted growing attention and found wide use.
In electronic document processing, document image recognition (such as OCR) has been applied in many applications and devices. These range from professional office equipment for document processing that has an image pickup device (such as a scanner) to personal devices (such as PCs, PDAs, and handheld devices) that have a device for capturing document images (such as a camera) and can capture and process document images to clearly recognize the content of such documents.
With the development of camera-equipped handheld devices, there is an urgent need for mobile (camera-based) OCR applications in a variety of settings, such as digitizing documents outside the office, recognizing foreign-language road signs, and text-to-speech input for visually impaired persons.
However, because scanners and cameras differ, mobile OCR poses new challenges. In scanner-based document capture, the document is usually pressed against the platen and is therefore essentially flat, so there is almost no distortion caused by the physical state of the document. In camera-based document capture, the document is usually unconstrained and may not be flat, so distortion caused by physical warping of the document is common. Distortion greatly reduces OCR accuracy, because a non-flat document shape makes typical text processing of the captured image (for example, segmentation into text lines and characters, and recognition) difficult even after local rotation. Distortion correction of camera-captured document images is therefore a necessary step for mobile OCR.
There have been many studies on distortion correction of document images in the prior art.
One line of research is 3D-based techniques. In a typical implementation, the technique obtains the 3D surface shape of the original document by approximating the document surface, and then corrects the resulting 3D surface (for example, by flattening it based on some special-purpose model). The document surface can be approximated by physical modeling, for example by projecting light onto a 3D grid, or derived from the shading distribution in a single image using the SFS (shape from shading) technique.
However, such methods have limitations. Physical modeling requires special and complex additional equipment, which can be inconvenient and time-consuming. Shape from shading requires that certain assumptions hold (for example, a near point light source) and that certain camera parameters be known (for example, focal length), which can only be obtained through an accurate camera calibration process. 3D-based methods therefore usually need more complex equipment and time-consuming operations.
In view of these drawbacks of 3D-based methods, other methods have been proposed that work on the 2D image of the document to determine and correct its distortion. One such technique assumes that the distortion is of a particular type known in advance. In an exemplary implementation, the technique assumes that the surface of the curved document is of a certain geometric type (for example, cylindrical), and performs correction (for example, low-rank matrix recovery and sparse error correction) for the document image on that assumed surface.
However, the real surface of the document being captured usually has a more complex shape than the simple assumed surface, so this technique cannot correct the document image effectively.
Another approach is the boundary-based method, which corrects the document image based on its boundaries. Reference [1] discloses a boundary-based method that extracts the document boundaries and uses them to describe the distortion. This suits the distortion commonly encountered in imaging (for example, binding distortion), because the method assumes that the document surface is formed by two opposite boundary curves (for example, an opened thick book). However, the method cannot handle cases where the boundaries are not opposite. Another limitation is that it assumes all four boundaries of the document are complete; when boundaries are in fact incomplete, the warping mesh for the whole document cannot be generated.
A further approach estimates the distortion of the document image from its text information (for example, the baselines of the text in a text region, or the warping mesh of the text region), so that distortion correction can be performed based on that text information. Some methods use the baseline information of the text directly. Reference [2] proposes a method for straightening curved text lines: it finds text line curves by clustering connected components and moves those components to restore straight horizontal baselines. Reference [3] estimates text orientation and recovers the document image from word segmentation results. Reference [4] uses splines to describe the baselines and to build a two-dimensional mesh, and rectifies the image with an image warping technique, assuming that the distance between adjacent columns in the target mesh is uniform.
U.S. patent application US2010/0073735 discloses a camera-based document imaging method and proposes a text-based method. The method assumes that the distortion of a local area within a text region is linear and can be resolved by perspective distortion correction, where the local distortion information can be collected from the text lines in the document. The text-based method divides the distorted document image into multiple grid cells based on the estimated line and character directions, transforms each cell into a square, and assembles the cells to obtain the fully restored image.
Fig. 1A illustrates the process of the method disclosed in U.S. patent application US2010/0073735. The baselines of all detected text lines in the input document image are extracted first, and the vertical boundaries of each paragraph are then determined using the Hough transform (assuming the vertical boundaries are linear). A warping mesh is generated from those baselines and vertical boundaries, perspective distortion correction is performed in each cell of the mesh, and the image is finally dewarped.
However, the prior-art methods described above have several drawbacks.
First, such text-based methods usually assume that most of the text lines in a text region of the document image are long and complete, and determine and correct the distortion of the text region on that basis.
However, with such a method, when the text region of the document image actually contains multiple short text lines (lines much shorter than the other text lines), these short text lines are simply discarded as noise and receive no processing. Yet the discarded short text lines affect the determination of the left and right boundaries of the text region, and the local distortion near the discarded lines cannot be estimated accurately. As a result, such a text-based method cannot determine the distortion information of the text region accurately and cannot correct the document image effectively. The text-based methods of the prior art therefore cannot handle text regions that contain multiple short text lines.
Fig. 1B shows the result of using the prior art to correct a document image whose text region contains multiple short text lines. As shown in Fig. 1B, the prior-art correction method ignores the area with short text lines, marked by the left circle in Fig. 1B, so the correction for that area is inaccurate: for example, some words are discarded and some short text (such as caption text) may even be lost, and the corresponding boundary of the text region is inaccurate, as marked by the right circle in Fig. 1B.
Second, the prior-art method assumes that the vertical distortion of the text region of the document image is linear, and uses the Hough transform to derive the vertical boundaries of the text region under that assumption.
However, when the vertical distortion of the text region is actually nonlinear, such a text-based method cannot determine the boundaries accurately, and therefore cannot determine the distortion information of the text region accurately or correct the document image effectively. Such a method thus cannot handle nonlinear vertical distortion.
Fig. 1C shows the result of using the prior art to correct a document image whose vertical boundaries are actually nonlinear. As shown in Fig. 1C, the prior-art correction method simply assumes that the vertical distortion is linear, so the correction of the document image is inaccurate. For example, as indicated in particular by the "-" symbols at the boundary of the text region, the boundary is irregular (for example, not aligned) at some positions, and the distortion of the boundary is not fully corrected.
The prior-art distortion correction techniques for document images therefore still need improvement.
Cited documents
[1] Y. C. Tsoi and M. S. Brown. "Geometric and shading correction for images of printed materials: a unified approach using boundary". CVPR, pages 240–246, 2004.
[2] Z. Zhang and C. L. Tan. "Correcting document image warping based on regression of curved text lines". In Proceedings of the International Conference on Document Analysis and Recognition, volume 1, pages 589–593, 2003.
[3] B. Gatos, I. Pratikakis and K. Ntirogiannis. "Segmentation Based Recovery of Arbitrary Warped Document Image". Proc. 9th International Conference on Document Analysis and Recognition, pages 989–993, 2007.
[4] C. Wu and G. Agam. "Document image de-warping for text/graphics recognition". In Proceedings of Joint IAPR 2002 and SPR 2002, 2002.
[5] US patent application US2010/0073735
Summary of the invention
The present invention was developed for distortion correction of document images and is intended to solve the problems described above.
One object of the present invention is to determine the distortion of a text region of a document image accurately, so that it can be corrected effectively, even when the document image contains multiple short text lines.
Another object of the present invention is to determine the boundaries of a text region of a document image accurately, so that the image can be corrected effectively.
In one aspect, the present invention provides a distortion correction apparatus for document images, comprising: a baseline extraction unit configured to extract the baselines of the text lines in a text region of the document image, where each text line corresponds to one baseline; a baseline extension unit configured to extend the short baselines among the extracted baselines based on the long baselines among the extracted baselines; and a correction unit configured to correct the distortion of the document image based on the extracted long baselines and the extended short baselines.
In another aspect, the present invention provides a distortion correction method for document images, comprising: a baseline extraction step of extracting the baselines of the text lines in a text region of the document image, where each text line corresponds to one baseline; a baseline extension step of extending the short baselines among the extracted baselines based on the long baselines among the extracted baselines; and a correction step of correcting the distortion of the document image based on the extracted long baselines and the extended short baselines.
Preferably, a long baseline may be an extracted baseline whose length is greater than or equal to a first threshold, and a short baseline may be an extracted baseline whose length is less than the first threshold.
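The long/short distinction can be illustrated with a minimal Python sketch. It assumes that each baseline is available as a polyline of (x, y) points and that its length is measured along that polyline; the function names and the way the first threshold is supplied are illustrative assumptions, not taken from the specification.

```python
import math

def polyline_length(points):
    """Sum of Euclidean segment lengths along a baseline given as (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def split_by_length(baselines, first_threshold):
    """Partition baselines into (long, short) lists using the first threshold.

    A baseline whose length is >= first_threshold is long; otherwise it is short.
    """
    long_lines, short_lines = [], []
    for line in baselines:
        if polyline_length(line) >= first_threshold:
            long_lines.append(line)
        else:
            short_lines.append(line)
    return long_lines, short_lines
```

How the first threshold is chosen is not stated in this passage, so the sketch leaves it to the caller.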
Preferably, the baseline extension step may further comprise: a sub-region division step of dividing the text region into at least one sub-region, where, starting from the first long baseline among the extracted long baselines, every two adjacent long baselines among the extracted long baselines define one of the at least one sub-region; and a sub-region baseline extension step of, for each of the at least one sub-region, when the sub-region contains at least one short baseline, extending the at least one short baseline contained in the sub-region based on the two long baselines that define the sub-region.
Preferably, when the text lines in the document image run in the horizontal direction, the first long baseline is the long baseline, among the extracted baselines, closest to the top of the text lines of the document image, and the sub-region division step may proceed in order from the top of the text region to the bottom.
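The sub-region division can be sketched as follows, assuming the extracted baselines are already ordered from top to bottom and each carries a flag saying whether it is long. The tuple layout and the function name are hypothetical choices for illustration only.

```python
def divide_into_subregions(baselines):
    """Pair up adjacent long baselines, from top to bottom.

    baselines: list of (name, is_long) tuples ordered top to bottom.
    Returns a list of (upper_long, shorts_between, lower_long) tuples,
    one per sub-region. Short baselines above the first long baseline or
    below the last one belong to no sub-region here; the specification
    handles top and bottom short baselines separately.
    """
    long_indices = [i for i, (_, is_long) in enumerate(baselines) if is_long]
    regions = []
    for upper, lower in zip(long_indices, long_indices[1:]):
        shorts = [name for name, _ in baselines[upper + 1:lower]]
        regions.append((baselines[upper][0], shorts, baselines[lower][0]))
    return regions
```

A sub-region with an empty `shorts_between` list needs no extension and can simply be skipped by the caller.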
Preferably, the sub-region baseline extension step may further comprise: selecting the short baseline with the greatest length among the at least one short baseline contained in the sub-region; extending the selected short baseline based on the two long baselines that define the sub-region; and dividing the sub-region into two new sub-regions using the extended short baseline, where one of the two new sub-regions is defined by one of the two long baselines and the extended short baseline, and the other is defined by the extended short baseline and the other of the two long baselines. The selecting, extending, and dividing are performed in turn for each of the two new sub-regions until every short baseline contained in the sub-region has been extended.
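The select, extend, and divide recursion can be sketched in Python under strong simplifying assumptions. Baselines are modeled as dictionaries mapping x positions on a shared uniform grid to y values (y grows downward), full-width baselines cover every grid position, and a short baseline is extended by keeping it at a constant relative height between its two bounding baselines. That interpolation rule is an assumption made for this example; this passage does not specify how the extension is computed.

```python
def extend_short(short, upper, lower):
    """Fill the missing x positions of a short baseline.

    ASSUMPTION: the short baseline keeps a constant relative position t
    between the upper and lower bounding baselines (which must not touch).
    """
    ratios = [(y - upper[x]) / (lower[x] - upper[x]) for x, y in short.items()]
    t = sum(ratios) / len(ratios)
    return {x: short.get(x, upper[x] + t * (lower[x] - upper[x])) for x in upper}

def extend_subregion(upper, lower, shorts):
    """Recursively extend every short baseline between two full-width baselines.

    Returns the extended baselines ordered from top to bottom.
    """
    if not shorts:
        return []
    # On a uniform grid, the baseline with the most samples is the longest one.
    longest = max(shorts, key=len)
    extended = extend_short(longest, upper, lower)
    rest = [s for s in shorts if s is not longest]

    def mean_offset(s):
        # Negative when s lies above the newly extended baseline.
        return sum(y - extended[x] for x, y in s.items()) / len(s)

    above = [s for s in rest if mean_offset(s) < 0]
    below = [s for s in rest if mean_offset(s) >= 0]
    # The extended baseline splits the sub-region into two new sub-regions.
    return (extend_subregion(upper, extended, above)
            + [extended]
            + extend_subregion(extended, lower, below))
```

Extending the longest short baseline first means each later, shorter baseline is interpolated between the closest available full-width neighbors.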
Preferably, the baseline extension step may further comprise, for a short baseline located at the top or bottom of the text region, extending that short baseline based on the two baselines nearest to it among all of the extracted long baselines and the extended short baselines.
Preferably, the method may further comprise a text region boundary determination step of determining the boundaries of the text region in the document image based on the endpoints of the extracted long baselines and the extended short baselines. The text region boundary determination step may comprise: an unaligned baseline identification step of identifying, among all of the extracted long baselines and the extended short baselines, the baselines whose endpoints are unaligned; an unaligned baseline rectification step of, for each baseline identified as having an unaligned endpoint, rectifying that unaligned endpoint based on the two baselines nearest to the identified baseline among all of the extracted long baselines and the extended short baselines; and a boundary generation step of generating the boundaries of the text region of the document image from the endpoints of all baselines, including the rectified unaligned baselines.
Preferably, the unaligned baseline identification step may comprise, for each baseline among all of the extracted long baselines and the extended short baselines: generating a ruling line based on the endpoints of a predetermined number of baselines adjacent to that baseline among all of the extracted long baselines and the extended short baselines; and identifying, based on the ruling line, whether the endpoint of that baseline is an unaligned endpoint.
Preferably, the ruling line is generated by directly connecting, or fitting, the endpoints of the predetermined number of baselines.
Preferably, when identifying whether an endpoint of a baseline is unaligned based on the ruling line: a left endpoint is identified as unaligned if it lies to the right of the ruling line and its distance from the ruling line is greater than a third threshold; and a right endpoint is identified as unaligned if it lies to the left of the ruling line and its distance from the ruling line is greater than a fourth threshold.
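The endpoint test can be sketched as below. The sketch assumes the ruling line is the straight line through two neighboring endpoints `p` and `q` (the direct-connection case), that the ruling line is not horizontal, and that "distance" means perpendicular distance; all names are illustrative.

```python
import math

def ruling_x_at(p, q, y):
    """x coordinate of the line through p and q at height y (line must not be horizontal)."""
    (x1, y1), (x2, y2) = p, q
    return x1 + (x2 - x1) * (y - y1) / (y2 - y1)

def distance_to_line(p, q, point):
    """Perpendicular distance from point to the infinite line through p and q."""
    (x1, y1), (x2, y2), (x0, y0) = p, q, point
    numerator = abs((x2 - x1) * (y1 - y0) - (x1 - x0) * (y2 - y1))
    return numerator / math.hypot(x2 - x1, y2 - y1)

def is_unaligned(endpoint, p, q, threshold, side):
    """Apply the third threshold (side='left') or fourth threshold (side='right').

    A left endpoint is unaligned if it lies right of the ruling line by more
    than the threshold; a right endpoint, if it lies left of it by more than
    the threshold.
    """
    x_line = ruling_x_at(p, q, endpoint[1])
    if side == "left":
        on_inner_side = endpoint[0] > x_line
    else:
        on_inner_side = endpoint[0] < x_line
    return on_inner_side and distance_to_line(p, q, endpoint) > threshold
```

An endpoint that sticks out beyond the ruling line is never flagged, which matches the one-sided conditions in the text: only endpoints that fall short of the expected margin are treated as unaligned.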
Preferably, the unaligned baseline rectification step may further comprise: generating a line by directly connecting, or fitting, the endpoints of the two baselines nearest to the baseline; and extending the baseline toward the generated line until they intersect, the intersection point being used as the rectified endpoint of the baseline.
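The rectification can be sketched for a left endpoint as follows. The sketch assumes the generated line is the straight connection between the two neighboring endpoints, and that the baseline is extended linearly along the direction of its first two points; the linear extrapolation and the function names are assumptions for illustration.

```python
def line_intersection(a1, a2, b1, b2):
    """Intersection of the infinite lines a1-a2 and b1-b2, or None if they are parallel."""
    d = (a1[0] - a2[0]) * (b1[1] - b2[1]) - (a1[1] - a2[1]) * (b1[0] - b2[0])
    if d == 0:
        return None
    cross_a = a1[0] * a2[1] - a1[1] * a2[0]
    cross_b = b1[0] * b2[1] - b1[1] * b2[0]
    x = (cross_a * (b1[0] - b2[0]) - (a1[0] - a2[0]) * cross_b) / d
    y = (cross_a * (b1[1] - b2[1]) - (a1[1] - a2[1]) * cross_b) / d
    return (x, y)

def rectify_left_endpoint(baseline, neighbor_end_a, neighbor_end_b):
    """Extend the left end of a baseline to the line through two neighboring endpoints.

    baseline: list of (x, y) points ordered left to right (at least two points).
    Returns a new baseline whose first point is the intersection point, or the
    baseline unchanged if no intersection exists.
    """
    hit = line_intersection(baseline[0], baseline[1], neighbor_end_a, neighbor_end_b)
    return baseline if hit is None else [hit] + baseline
```

A right endpoint could be handled symmetrically by using the last two points of the baseline and appending the intersection point instead.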
Preferably, the boundaries of the text region are generated by directly connecting, or curve-fitting, the endpoints of all baselines, including the rectified unaligned baselines.
Preferably, the correction step may further comprise: a warping mesh generation step of generating a warping mesh based on all of the extracted long baselines and the extended short baselines and on the text region boundaries determined from them; and correcting the distortion of the document image based on the generated warping mesh.
[Technical effect]
The present invention provides a new distortion correction method for text regions and can effectively solve the technical problems of determining and correcting the distortion of document images.
In particular, one solution of the present invention extends at least one short text line contained in a text region of the document image to determine distortion information, and then performs correction based on that distortion information.
Compared with the text-based methods of the prior art, this solution makes effective use of the short text lines that are usually ignored, and thereby determines the distortion information of the text region of the document image more accurately for correction.
Therefore, for a document image that may contain multiple short text lines (including multiple short text lines, short text lines at the top, and short lines at the end of a paragraph), this solution accurately and effectively determines and corrects the distortion of the document image by extending the short baselines of the short text lines.
Another solution of the present invention rectifies the endpoints of the baselines of the text lines in a text region of the document image to determine the boundaries of the text region, and then performs correction based on those boundaries.
Compared with the boundary-based methods of the prior art, this solution does not need to assume that the document surface is formed by two opposite boundary curves, and can therefore handle arbitrary cases.
Therefore, for a document image whose vertical distortion is nonlinear, this solution accurately and effectively determines and corrects the distortion of the document image by accurately determining the left and right boundaries of the text region.
In addition, for a document image that contains multiple short text lines and whose vertical distortion is nonlinear, this solution can still accurately and effectively determine and correct the distortion of the document image.
The solutions of the present invention can therefore provide satisfactory correction results even in cases of complex indentation or unaligned text.
Compared with the 3D-based methods of the prior art, the present invention does not rely on any additional equipment and can obtain accurate correction results from the captured image alone.
Other characteristic features and advantages of the present invention will become clear from the following description with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the present invention and, together with the description, serve to explain the principles of the present invention. In the drawings, similar reference numerals indicate similar items.
Fig. 1A illustrates a typical prior-art process for distortion correction of a document image, and Figs. 1B and 1C illustrate correction results obtained for document images by the prior-art distortion correction method.
Fig. 2 is a block diagram showing an exemplary hardware configuration of a computer system that can implement embodiments of the present invention.
Fig. 3 is a flowchart illustrating the distortion correction method for a document image according to the first embodiment of the present invention.
Fig. 4 is a flowchart illustrating the process in the baseline extraction step according to the first embodiment of the present invention.
Fig. 5 schematically shows the result of connected component (CC) analysis in the baseline extraction step.
Fig. 6 schematically shows the result of text line tracing in the baseline extraction step.
Fig. 7 schematically shows the result of spline fitting in the baseline extraction step.
Fig. 8 is a flowchart illustrating the process in the baseline extension step according to the first embodiment of the present invention.
Fig. 9 illustrates an example of a text region divided into sub-regions.
Fig. 10 is a flowchart illustrating the process in the sub-region baseline extension step according to the first embodiment of the present invention.
Fig. 11 illustrates an example of extending a short baseline in a sub-region.
Fig. 12 illustrates an example of splitting a sub-region based on an extended baseline in the sub-region.
Fig. 13 illustrates the result for a text region in which the short baselines have been extended.
Fig. 14 illustrates an example of a short baseline located at the bottom of a text region.
Fig. 15 illustrates an example result for a text region in which the top or bottom short baselines have been extended.
Fig. 16 illustrates an example of generating the warping mesh of a text region of a document image.
Fig. 17 illustrates an example of correcting the distortion of a document image based on the generated warping mesh.
Fig. 18 is a block diagram illustrating the distortion correction apparatus according to the first embodiment of the present invention.
Fig. 19 illustrates a comparison between the distortion correction results obtained by the prior-art method and by the method according to the first embodiment of the present invention.
Fig. 20 is a flowchart illustrating the process in the text region boundary determination step of the method according to the second embodiment of the present invention.
Fig. 21 illustrates an example of rectifying the left endpoint of a baseline.
Fig. 22 illustrates an example of rectifying the right endpoint of a baseline.
Fig. 23 illustrates an example result of generating the left and right boundaries of a text region based on the rectified endpoints of the baselines.
Fig. 24 is a block diagram illustrating the baseline endpoint rectification unit according to the second embodiment of the present invention.
Fig. 25 illustrates a comparison between the text region boundaries generated by the prior-art method and by the method according to the second embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Note that similar reference numerals and letters indicate similar items in the drawings, so once an item has been defined in one drawing it need not be discussed again for subsequent drawings.
The meaning of some terms used in the context of the present disclosure is explained first, to aid understanding of the present invention.
In the context of the present disclosure, an image can be an image of any type (for example, a color image or a grayscale image) and usually contains at least one text region. Note that in the context of this specification the image type is not particularly limited, as long as the distortion of the image can be determined and corrected. In the context of this specification, an image containing a text region means an image that includes the text region as an object.
A text region in a document image is an image area of continuous text content. It usually contains consecutive lines of text characters or other similar text lines, and can contain at least one continuous text paragraph including, for example, a heading line. A document image mainly contains at least one text region, so the distortion and correction of a document image generally refer to the distortion and correction of the text regions it contains. The text regions contained in a document image may be adjacent to one another (in which case they can be treated as a single text region) or separate (in which case they can be processed individually). Unless specifically stated otherwise, the processing described in this specification for a text region applies equally to each of the text regions contained in the document image.
In the context of this specification, the horizontal direction refers to the direction along the text lines, and the vertical direction is the direction perpendicular to the horizontal direction. The horizontal direction is not limited to the strictly horizontal direction, but also covers a substantially horizontal, slightly tilted direction caused by the distortion of the document image. Likewise, the vertical direction is not limited to the strictly vertical direction, but also covers a substantially vertical, slightly tilted direction caused by the distortion of the document image.
In the present disclosure, the terms "first", "second", and so on are used only to distinguish elements or steps, and are not intended to indicate chronological order, preference, or importance.
Fig. 2 is a block diagram showing the hardware configuration of a computer system 1000 that can implement embodiments of the present invention.
As shown in Fig. 2, the computer system comprises a computer 1110. The computer 1110 comprises a processing unit 1120, a system memory 1130, a non-removable non-volatile memory interface 1140, a removable non-volatile memory interface 1150, a user input interface 1160, a network interface 1170, a video interface 1190, and an output peripheral interface 1195, which are connected by a system bus 1121.
The system memory 1130 comprises a ROM (read-only memory) 1131 and a RAM (random access memory) 1132. A BIOS (basic input/output system) 1133 resides in the ROM 1131. An operating system 1134, application programs 1135, other program modules 1136, and some program data 1137 reside in the RAM 1132.
A non-removable non-volatile memory 1141 (such as a hard disk) is connected to the non-removable non-volatile memory interface 1140. The non-removable non-volatile memory 1141 can store, for example, an operating system 1144, application programs 1145, other program modules 1146, and some program data 1147.
Removable non-volatile memories (such as a floppy disk drive 1151 and a CD-ROM drive 1155) are connected to the removable non-volatile memory interface 1150. For example, a floppy disk 1152 can be inserted into the floppy disk drive 1151, and a CD (compact disc) 1156 can be inserted into the CD-ROM drive 1155.
Input devices such as a mouse 1161 and a keyboard 1162 are connected to the user input interface 1160.
The computer 1110 can be connected to a remote computer 1180 through the network interface 1170. For example, the network interface 1170 can be connected to the remote computer 1180 via a local area network 1171. Alternatively, the network interface 1170 can be connected to a modem (modulator-demodulator) 1172, and the modem 1172 is connected to the remote computer 1180 via a wide area network 1173.
The remote computer 1180 can comprise a memory 1181, such as a hard disk, which stores remote application programs 1185.
The video interface 1190 is connected to a monitor 1191.
The output peripheral interface 1195 is connected to a printer 1196 and a loudspeaker 1197.
The computer system shown in Fig. 2 is merely illustrative and is in no way intended to limit the present invention, its application, or its uses.
The computer system shown in Fig. 2 can be implemented for any embodiment as a stand-alone computer or as a processing system in a device, in which one or more unnecessary components can be removed or one or more additional components can be added.
Distortion correction method and equipment according to an embodiment of the invention are hereafter described with reference to the accompanying drawings.
[First Embodiment]
A distortion correction method according to an embodiment of the present invention will be described below with reference to Figs. 3 to 17. In this method, the short text lines extracted from the text region contained in a document image are extended, so that the distortion information of the text region of the document image can be determined accurately based on the extended short text lines.
Fig. 3 is a flowchart illustrating the distortion correction method for a document image according to the first embodiment of the present invention.
In an implementation of the method according to the first embodiment, in step S100 (also referred to as the baseline extraction step), the baselines of the text lines in the text region contained in the document image are extracted, wherein each text line corresponds to one baseline.
In step S200 (also referred to as the baseline extension step), the short baselines among the extracted baselines are extended based on the long baselines among the extracted baselines.
In step S300 (also referred to as the correction step), the text region contained in the document image is corrected based on all of the extracted long baselines and the extended short baselines.
The processing in each step of the method according to the first embodiment of the present invention will be described in detail below.
Fig. 4 is a flowchart illustrating the processing in the baseline extraction step of the method according to the first embodiment of the present invention.
In baseline extraction for the input document image, first, character CCs are extracted using CC (connected component) analysis (S110). Then, the CCs are clustered into different text lines based on the top, the center point, or the bottom of each CC (S120). For example, the bottoms of the CCs are used to cluster the CCs into different text lines; of course, the tops or center points of the CCs can also be used for clustering. Finally, each text line is regularized by spline fitting (S130), so that the baseline of each text line contained in the document image can be obtained.
In CC analysis, first, a set of CCs is extracted from the input document image. Various methods can be used for CC extraction, such as color clustering, adaptive binarization, morphological processing, and so on. In this embodiment, the CCs are generated from an adaptive binarization result. It should be noted that the CC extraction method is not limited to this, and other methods in the art are also possible.
Preferably, CC filtering can be applied to remove non-text CCs (including noise CCs and CCs of picture regions, for example, pictures and charts in the document) from the extracted CCs. Features used for filtering include the CC size, the CC aspect ratio, and the pixel stroke length of the CC in the vertical and horizontal directions. It should be noted that CC filtering can be implemented as in the prior art and is not particularly limited. After such CC filtering, the remaining CCs are text CCs.
Then, fragment CCs (for example, parts of a character) are merged into character CCs in the vertical direction and in the horizontal direction, respectively. This processing mainly deals with fragment CCs that would cause incorrect text line tracing. Features used for merging include the distance between CCs, the nesting relationship of CCs, the overlap ratio in the horizontal direction, the overlap ratio in the vertical direction, and the CC height after merging.
The CC analysis result is shown in Fig. 5. As shown in Fig. 5, in the view labeled "after CC extraction", the content enclosed by white boxes consists of CCs that have been extracted and possibly filtered, and content that is not enclosed (such as the dot at the top of the character "i") is regarded as fragment CCs. In the view labeled "after CC merging", those fragments have been merged with character CCs so as to follow the text lines.
In text line tracing, for the CCs that remain after CC merging, the bottoms of the CCs grouped into a text line can be connected according to a connection criterion. The connection criterion mainly relates to, for example, the distance between CCs and the overlap ratio in the horizontal projection.
In particular, when a text line is traced by connecting the bottoms of CCs, if the number of CCs in the text line is less than a threshold N (for example, N=4), the traced line is removed.
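The tracing and pruning described above can be sketched as follows. This is only an illustrative sketch: the `CC` box type, the greedy left-to-right grouping, and the `max_gap` and `min_overlap` thresholds are assumptions introduced here, since the description names the features of the connection criterion but not its exact form. Only the minimum count N=4 comes from the text.

```python
from dataclasses import dataclass


@dataclass
class CC:
    """Bounding box of a connected component (hypothetical helper type)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height (assumed > 0)

    @property
    def bottom(self):
        return self.y + self.h


def y_overlap_ratio(a, b):
    """Overlap of the vertical extents, relative to the shorter box."""
    overlap = min(a.bottom, b.bottom) - max(a.y, b.y)
    return max(0, overlap) / min(a.h, b.h)


def trace_text_lines(ccs, max_gap=20, min_overlap=0.5, min_ccs=4):
    """Group CCs into lines and return the bottom points of each kept line."""
    lines = []
    for cc in sorted(ccs, key=lambda c: c.x):
        for line in lines:
            last = line[-1]
            gap = cc.x - (last.x + last.w)  # horizontal distance between CCs
            if gap <= max_gap and y_overlap_ratio(last, cc) >= min_overlap:
                line.append(cc)
                break
        else:
            lines.append([cc])  # no existing line accepts this CC
    # Remove traced lines with fewer than N CCs (N=4 in the description).
    kept = [line for line in lines if len(line) >= min_ccs]
    return [[(c.x + c.w / 2, c.bottom) for c in line] for line in kept]
```

Each returned line is a list of (x, y) points taken at the bottom center of its CCs, which is the kind of point sequence the subsequent spline fitting would operate on.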
The result of text line tracing is shown in Fig. 6, in which the white line below each text line indicates the traced text line.
After text line tracing, spline fitting is performed on the traced text lines. Specifically, considering that a traced text line usually contains some superscripts or subscripts that affect baseline detection, spline fitting is performed to revise the traced line, for example by correcting each point contained in the traced text line using its neighboring points.
For the current point in a traced text line, first, the local distortion direction (local line) is estimated based on the left neighboring points and the right neighboring points of this point. If the current point is far from the local line, the current point may be a superscript or a subscript, and the Y-axis coordinate of the current point is corrected according to its X-axis coordinate and the local line function.
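A minimal sketch of this point correction is given below. The least-squares local line, the neighborhood size `k`, and the distance tolerance `tol` are assumptions made for illustration; the description does not state how the local line is estimated or how "far" is measured. Points without neighbors on both sides are left unchanged in this sketch.

```python
def fit_line(points):
    """Least-squares line y = slope * x + intercept through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct_outliers(points, k=2, tol=3.0):
    """Snap points that lie far from their local line back onto that line."""
    corrected = []
    for i, (x, y) in enumerate(points):
        left = points[max(0, i - k):i]
        right = points[i + 1:i + 1 + k]
        if not left or not right:
            corrected.append((x, y))  # endpoints are kept as they are
            continue
        slope, intercept = fit_line(left + right)
        y_line = slope * x + intercept
        if abs(y - y_line) > tol:
            corrected.append((x, y_line))  # likely superscript or subscript
        else:
            corrected.append((x, y))
    return corrected
```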
After spline fitting, each corrected text line is expressed by natural cubic spline (NCS) interpolation, and the corresponding baseline is then obtained. The result of spline fitting is shown in Fig. 7, in which the white lines in the right part of the view are the final baselines extracted from the text region of the input document image.
It should be noted that the description of baseline extraction is merely exemplary and not restrictive, and other baseline extraction processes used in OCR of document images are also possible. For example, text line correction can be implemented in ways other than spline fitting.
The processing in the baseline extension step according to the first embodiment of the present invention will be described below with reference to Fig. 8.
For baseline extension, all baselines extracted from the text region of the document image are first divided into two subsets: long baselines and short baselines. In one implementation, a long baseline is an extracted baseline whose length is greater than or equal to a specific threshold (hereinafter referred to as the "first threshold"), and a short baseline is an extracted baseline whose length is less than the specific threshold. Based on this definition, the short baselines among all the baselines extracted from the text region of the document image can be identified and extended for use in estimating the distortion of the text region of the document image.
For example, such classification into long baselines and short baselines can be implemented as follows.
First, all baselines extracted from the text region of the document image (for example, N baselines, where N is the number of baselines) are sorted according to their lengths. For example, the length of a baseline can refer to the number of pixels contained in the baseline along the direction of the corresponding text line.
Then, the longest baseline is selected and added to a candidate set, and iterative processing is performed for the remaining baselines i (i=2, 3, ..., N).
In each step of the iterative processing, for the current baseline i, the length (Li) of the current baseline is compared with the average length (Avg) of the current candidate set. If Li > α*Avg (α is an empirical value, 0.7 < α <= 1, for example, α=0.9), the current baseline is regarded as a long baseline and is added to the current candidate set, and the Avg of the candidate set is updated accordingly. This processing is then performed on the next baseline, until all the extracted baselines have been classified. Finally, the baselines in the final candidate set are the long baselines, and the remaining baselines are the short baselines.
In the above processing, the value α*Avg corresponds to the above-mentioned specific threshold. It should be noted that this specific threshold is described here by way of example, and other values, such as a constant threshold, can be adopted for the classification.
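The classification procedure above can be sketched in a few lines. The function name `classify_baselines` and the representation of baselines by a plain list of lengths are assumptions for illustration; the rule Li > α*Avg with α=0.9 follows the description.

```python
def classify_baselines(lengths, alpha=0.9):
    """Split baseline indices into (long_ids, short_ids) by length."""
    if not lengths:
        return [], []
    # Visit baselines from longest to shortest.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    long_ids = [order[0]]          # the longest baseline seeds the candidate set
    total = lengths[order[0]]
    short_ids = []
    for i in order[1:]:
        avg = total / len(long_ids)    # average length of the candidate set
        if lengths[i] > alpha * avg:
            long_ids.append(i)
            total += lengths[i]        # Avg is updated through the running total
        else:
            short_ids.append(i)
    return sorted(long_ids), sorted(short_ids)
```

For instance, lengths of 500, 480, 120, 510, 60 and 495 pixels would place the baselines with indices 0, 1, 3 and 5 in the long set and those with indices 2 and 4 in the short set.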
It should be noted that such classification is merely exemplary, and other classification manners are also possible, such as classifying with respect to the short baselines.
In baseline extension, in step S210 (also referred to as the sub-region division step), starting from the first long baseline among the extracted long baselines, the text region is divided into at least one sub-region, wherein each sub-region is defined by two adjacent long baselines among the extracted long baselines.
In step S220 (also referred to as the sub-region baseline extension step), for each of the at least one sub-region, when the sub-region contains at least one short baseline, the short baselines contained in the sub-region can be extended based on the long baselines of the sub-region.
In step S230, a short baseline located at the top or the bottom of the text region can be extended based on the two baselines immediately adjacent to this short baseline among all of the extracted long baselines and the extended short baselines. It should be noted that the processing in step S230 is optional, and when no top or bottom short baseline exists in the text region of the document image, step S230 does not need to be performed.
The processing in each step will be described in detail below.
In the sub-region division step, usually, when the text lines in the document image run in the horizontal direction, the first long baseline is the long baseline among the extracted baselines that is closest to the top of the text region of the document image (that is, the beginning of the text region), and sub-region division is therefore performed sequentially from the top of the text region of the document image to its bottom.
Based on the long baselines in the determined candidate set, the text region can be divided into at least one smaller text region (also referred to as a sub-region). In each sub-region, both the starting baseline and the ending baseline should be long baselines. In one implementation, two adjacent sub-regions share one long baseline; that is, the ending baseline of a sub-region is the same baseline as the starting baseline of the next sub-region adjacent to it. Such division into sub-regions is illustrated in Fig. 9, in which four sub-regions (sub-regions 1 to 4) are divided and shown with different line styles, and the bottom baseline of each sub-region is the same baseline as the top baseline of the next adjacent sub-region. For example, long baseline 1 is shared by sub-region 1 and the sub-region immediately preceding sub-region 1 (not shown in Fig. 9), long baseline 2 is shared by sub-region 1 and sub-region 2, long baseline 3 is shared by sub-region 2 and sub-region 3, and so on.
It should be noted that sub-region division is not limited to this order and can be performed in other orders. For example, sub-region division can be performed from the bottom of the text region to its top, or from the middle of the text region toward its top and bottom, respectively, as long as the text region of the document image is divided into at least one sub-region (each sub-region being bounded by two adjacent long baselines). Even when the text lines in the document image run in another direction (such as the vertical direction or an oblique direction), sub-region division can be performed similarly.
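As a rough sketch of the top-to-bottom division, assume the baselines are listed from top to bottom and flagged as long or short. The function name and the dictionary layout are hypothetical. Short baselines above the first long baseline or below the last one fall outside every sub-region and are returned separately, since they are the ones handled in step S230.

```python
def divide_subregions(is_long):
    """is_long[i] is True when baseline i (ordered top to bottom) is long."""
    long_idx = [i for i, flag in enumerate(is_long) if flag]
    subregions = []
    # Each pair of adjacent long baselines bounds one sub-region, and
    # neighbouring sub-regions share the long baseline between them.
    for top, bottom in zip(long_idx, long_idx[1:]):
        subregions.append({
            "top": top,
            "bottom": bottom,
            "short": list(range(top + 1, bottom)),
        })
    if long_idx:
        outside = (list(range(0, long_idx[0]))
                   + list(range(long_idx[-1] + 1, len(is_long))))
    else:
        outside = list(range(len(is_long)))
    return subregions, outside
```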
The processing of the sub-region baseline extension step in the method according to the first embodiment will be described in detail below with reference to Fig. 10. This processing is performed sequentially for each of the at least one divided sub-region, and can be performed in any order, for example, from top to bottom or from bottom to top, as long as all sub-regions are processed.
In sub-region baseline extension, for each sub-region, if at least one short baseline exists in the sub-region, the short baseline with the maximum length is selected from among all of the at least one short baseline contained therein; otherwise, the processing proceeds from this sub-region to the next sub-region. Here, the length of the selected short baseline is defined in the same way as above.
Then, the selected short baseline is extended based on the two long baselines of the current sub-region (that is, the starting long baseline and the ending long baseline).
Here, the processing of extending a short baseline in a sub-region will be described with reference to Fig. 11. For the sake of description, the text lines in the text region are assumed to be left-aligned, so extending a short baseline means extending the right endpoint of the short baseline to the right boundary.
As shown in Fig. 11, starting from the right endpoint of the current short baseline, the short baseline can be extended toward the right boundary of the text region with a fixed step size (here, the step size along the x-axis is 1 pixel; of course, other step sizes are also possible), as shown by the dashed ellipse in Fig. 11. At each extended position, the y-axis position is determined so as to satisfy the following condition:
$$\frac{d_2'}{d_1'} = \frac{d_2}{d_1},$$
Here, d1 and d2 are the distances (measured along the y-axis) between the current endpoint of the short baseline and the top and bottom long baselines of the sub-region, respectively, and d1' and d2' are the distances between the position to which the current endpoint of the short baseline is extended by one step and the top and bottom long baselines of the sub-region, respectively.
The horizontal coordinate x' of the extended position is defined as:
$$x' = \frac{d_2}{d_1 + d_2} \cdot x_{top} + \frac{d_1}{d_1 + d_2} \cdot x_{bottom},$$
Here, x_top and x_bottom are the x-axis coordinates of the points on the top long baseline and the bottom long baseline, respectively, that correspond to the extended position and have the y-axis coordinate defined above.
Thus, the short baseline is extended step by step until its final extension point (x_e, y_e) is determined based on the coordinates (x1, y1) and (x2, y2) of the corresponding endpoints of the top long baseline and the bottom long baseline.
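The stepwise extension can be illustrated with the sketch below for the left-aligned case. Several points are assumptions rather than statements of the description: the long baselines are modeled as functions y(x), the image y-axis is taken to point downward (so the top baseline has the smaller y), and x_top and x_bottom are interpreted as the x-coordinates of the right endpoints of the two long baselines, so that the weighted formula for x' yields the final x-coordinate of the extension. The y-coordinate at each step keeps the ratio d2/d1 unchanged.

```python
def extend_short_baseline(end_point, y_top, y_bottom,
                          x_top_end, x_bottom_end, step=1.0):
    """Extend a short baseline to the right between two long baselines.

    end_point     -- (x, y) of the current right endpoint of the short baseline
    y_top, y_bottom -- callables giving y on the top / bottom long baseline
    x_top_end, x_bottom_end -- x of the right endpoints of the long baselines
    """
    x, y = end_point
    d1 = y - y_top(x)        # distance to the top long baseline
    d2 = y_bottom(x) - y     # distance to the bottom long baseline
    if d1 < 0 or d2 < 0 or d1 + d2 == 0:
        raise ValueError("short baseline must lie between the long baselines")
    w = d1 / (d1 + d2)
    # x' = d2/(d1+d2) * x_top + d1/(d1+d2) * x_bottom
    x_end = (1.0 - w) * x_top_end + w * x_bottom_end
    points = []
    while x + step <= x_end:
        x += step
        # Placing y at fraction w of the local gap keeps d2'/d1' == d2/d1.
        points.append((x, y_top(x) + w * (y_bottom(x) - y_top(x))))
    if x < x_end:
        points.append((x_end, y_top(x_end) + w * (y_bottom(x_end) - y_top(x_end))))
    return points
```

Keeping the fraction w fixed means the extended part bends with both long baselines: where the gap between them widens, the short baseline moves proportionally.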
It should be noted that this description is merely exemplary, and the processing is equally applicable to the right-aligned case, in which the left endpoint of the short baseline is extended to the left boundary, and to the center-aligned case, in which the right endpoint and the left endpoint of the short baseline are extended to the right boundary and the left boundary, respectively.
Then, the current sub-region is divided into two new sub-regions by the extended short baseline. One of the two new sub-regions is defined by one of the two long baselines (for example, the top long baseline) and the extended short baseline (these two baselines serve as the top long baseline and the bottom long baseline of that new sub-region), and the other new sub-region is defined by the extended short baseline and the other of the two long baselines (for example, the bottom long baseline) (these two baselines serve as the top long baseline and the bottom long baseline of that new sub-region).
Fig. 12 shows an example of splitting a sub-region based on an extended baseline in the sub-region. As shown in Fig. 12, sub-region 1 shown in Fig. 9 is divided into two new sub-regions (sub-regions 11 and 12). In the newly divided sub-region 11, the top long baseline is the original top long baseline of sub-region 1 (long baseline 1 shown in Fig. 9), and the bottom long baseline is the current extended baseline, as shown by the solid rectangular box. In the newly divided sub-region 12, the top long baseline is the current extended baseline, and the bottom long baseline is the original bottom long baseline of sub-region 1 (long baseline 2 shown in Fig. 9).
Next, for each of the new sub-regions, the above selection, extension, and division processing is performed similarly and sequentially, until all the short baselines contained in the new sub-region have been extended. Thus, all the short baselines contained in the current sub-region are extended. After that, the above selection, extension, and division processing proceeds to the next sub-region, until all sub-regions have been processed, so that the short baselines in all sub-regions (that is, all the short baselines in the text region) are extended.
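The select, extend, and divide loop lends itself to a recursive formulation. The sketch below only computes the processing plan, that is, which short baseline is extended against which pair of reference baselines. It assumes that baselines are identified by their top-to-bottom index and that `lengths` can be indexed by that identifier; the function name is hypothetical.

```python
def extension_order(top, bottom, shorts, lengths):
    """Return (short, upper_ref, lower_ref) triples in processing order.

    top, bottom -- indices of the baselines bounding the sub-region
    shorts      -- indices of the short baselines inside it
    lengths     -- lengths[i] is the length of baseline i
    """
    if not shorts:
        return []
    # Select the longest short baseline; it is extended against both bounds.
    longest = max(shorts, key=lambda i: lengths[i])
    plan = [(longest, top, bottom)]
    # The extended baseline then splits the sub-region into two new ones.
    upper = [i for i in shorts if i < longest]
    lower = [i for i in shorts if i > longest]
    plan += extension_order(top, longest, upper, lengths)
    plan += extension_order(longest, bottom, lower, lengths)
    return plan
```

Because the longest short baseline is extended first, each shorter one is later extended against its nearest already extended neighbors rather than against the more distant original long baselines.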
Fig. 13 shows the result for a text region in which the short baselines have been extended.
It should be noted that this description is merely exemplary, and the short baseline extension processing is equally applicable to the right-aligned case, in which the left endpoint of the short baseline is extended to the left boundary (that is, the case where the blank region is on the left side of the short baseline), and to the center-aligned case, in which the right endpoint and the left endpoint of the short baseline are extended to the right boundary and the left boundary, respectively (that is, the case where blank regions are on both the left and right sides of the short baseline).
Consider the special case in which the top or bottom baseline of the text region is a short baseline. Because such a short baseline is not contained in any of the sub-regions divided as above, it does not undergo the above baseline extension processing. Therefore, such a short baseline is extended separately.
The extension of such a top or bottom short baseline is based on its adjacent long baselines and will be described with reference to Fig. 14, which schematically shows the extension of a bottom short baseline (corresponding to step S230).
As shown in Fig. 14, the bottom baseline (L0) of the text region is a short baseline.
In the extension processing, the two long baselines (L1 and L2) closest to this short baseline are selected. In one implementation, the two closest long baselines are selected from the sub-region that is adjacent to this short baseline and has undergone the above sub-region extension; therefore, the two closest long baselines are the two closest baselines among all of the extracted long baselines and the extended short baselines.
Starting from the right endpoint of the current short baseline, this short baseline is extended toward the right boundary of the text region with a fixed step size (here, the step size along the x-axis is 1 pixel; of course, other step sizes are also possible). At each extended position, the y-axis position is determined so as to satisfy the following condition:
$$\frac{h_{12}'}{h_{01}'} = \frac{h_{12}}{h_{01}},$$
Here, h_01 and h_12 are, at the current endpoint of the short baseline, the distance between this short baseline and the first closest long baseline and the distance between the first closest long baseline and the second closest long baseline, respectively (measured along the y-axis), and h_01' and h_12' are the corresponding distances at the position to which the current endpoint of this short baseline is extended by one step.
The horizontal coordinate x' of the extended position is defined as:
$$x' = \frac{h_{01} + h_{12}}{2h_{01} + h_{12}} \cdot x_1 + \frac{h_{01}}{2h_{01} + h_{12}} \cdot x_2,$$
Here, x_1 and x_2 are the x-axis coordinates of the points on the first and second closest long baselines, respectively, that correspond to this extended position and have the y-axis coordinate defined above.
Thus, the short baseline is extended step by step until its final extension point (x_e, y_e) is determined based on the coordinates of the corresponding endpoints of the first and second long baselines.
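A sketch of this outer extension, under assumptions similar to those used for a sub-region, is given below. The reference baselines L1 and L2 are modeled as functions y(x), x_1 and x_2 are interpreted as the x-coordinates of their right endpoints, and signed distances are used so that the same code covers a bottom short baseline (below L1) and a top short baseline (above L1). None of these modeling choices is stated in the description.

```python
def extend_outer_baseline(end_point, y_l1, y_l2, x1_end, x2_end, step=1.0):
    """Extend a top or bottom short baseline L0 to the right.

    y_l1, y_l2 -- callables for the closest (L1) and second closest (L2)
                  reference baselines
    x1_end, x2_end -- x of the right endpoints of L1 and L2
    """
    x, y = end_point
    h01 = y - y_l1(x)            # signed distance from L1 to L0
    h12 = y_l1(x) - y_l2(x)      # signed distance from L2 to L1
    if h12 == 0 or h01 * h12 < 0:
        raise ValueError("L1 must lie between the short baseline and L2")
    ratio = h01 / h12
    a, b = abs(h01), abs(h12)
    # x' = (h01+h12)/(2*h01+h12) * x1 + h01/(2*h01+h12) * x2
    x_end = ((a + b) * x1_end + a * x2_end) / (2 * a + b)
    points = []
    while x + step <= x_end:
        x += step
        # Keeps h12'/h01' == h12/h01 at every extended position.
        points.append((x, y_l1(x) + ratio * (y_l1(x) - y_l2(x))))
    if x < x_end:
        points.append((x_end, y_l1(x_end) + ratio * (y_l1(x_end) - y_l2(x_end))))
    return points
```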
It should be noted that the above bottom baseline extension processing is equally applicable to a top baseline that is a short baseline. The extension results for the top and bottom short baselines are shown in Fig. 15.
In addition, as mentioned above, although the above bottom baseline extension processing is described for the left-aligned case, it is equally applicable to the right-aligned case, in which the left endpoint of the short baseline is extended to the left boundary (that is, the case where the blank region is on the left side of the short baseline), and to the center-aligned case, in which the right endpoint and the left endpoint of the short baseline are extended to the right boundary and the left boundary, respectively (that is, the case where blank regions are on both the left and right sides of the short baseline).
It should be noted that the above manner of extending a short baseline (that is, the manner of determining the coordinates of the extension points of the short baseline) is merely exemplary. The extension is mainly intended to make the extended short baseline follow the reference long baselines (the top and bottom long baselines in sub-region extension, or the two closest long baselines in top and bottom short baseline extension). The manner of extension is therefore not limited to the one shown, and other manners of extension are also possible; the reference baselines can also be baselines other than the above long baselines.
The processing of the correction step in the method according to the first embodiment of the present invention will be described below with reference to Figs. 16 and 17. In general, the correction first derives information about the distortion of the text region of the document image based on the obtained baselines of the text region (including all of the extracted long baselines and the extended short baselines), and then corrects the distortion of the document image based on the distortion information.
In a typical implementation of the correction processing, the information about the distortion of the text region of the document image can be a warping mesh, and the correction step thus generates such a warping mesh based on the obtained baselines and performs the correction using it.
A mesh reflecting the text region generally comprises mesh lines that intersect one another so that the mesh of the text region is divided into a plurality of cells, and the shape of the cells is determined by the arrangement of the mesh lines. The mesh lines can be formed with reference to the text lines in the text region of the document image. In a typical implementation, the mesh lines consist of horizontal mesh lines and vertical mesh lines, wherein the horizontal mesh lines are consistent with the text lines and substantially identical to the baselines extracted for the text lines, and the vertical mesh lines are substantially perpendicular to the horizontal mesh lines, so that the cells of the mesh are quadrilaterals.
In other implementations, the mesh lines can be arranged in any other way, for example to form triangular cells or any other suitable polygonal cells, as long as one set of mesh lines is consistent with the text lines (for example, substantially identical to the baselines extracted for the text lines).
Usually, in addition to the endpoints of the baselines, the warping mesh is also generated based on the boundaries of the text region of the document image, and these boundaries are obtained from the endpoints of the baselines. When the text lines of the document image run in the horizontal direction, the boundaries of the text region refer to the left and right boundaries of the text region. Of course, if the text lines run in the vertical direction, the boundaries are the upper and lower boundaries.
It should be noted that the boundaries of the text region can be determined from the obtained baselines (including the extracted long baselines and the extended short baselines) in various ways (for example, by directly connecting the endpoints of the baselines, or by fitting a curve to the endpoints of the baselines), and the manner is not particularly limited. Because the boundary determination makes use of the extended short baselines instead of ignoring these short baselines as in the prior art, the boundaries of the text region of the document image, and hence its distortion information, can be obtained accurately, even if the boundary determination method itself is a prior-art method.
Considering that the generated warping mesh should cover the entire text region of the document image, and in particular should include the first text line and the last text line of the text region, the generation of the warping mesh should include processing that ensures that the first and last text lines are covered by the generated mesh. The processing for the first text line will be described below as an example, and this example is equally applicable to the last text line.
In the processing for the first text line, first, the average mesh height over all the obtained baselines (the mesh lines of the mesh) is calculated and denoted avg_H. Then, the text height of the first text line is calculated (using the NCS function of the top line of the first text line calculated in step S100) and denoted H_text. Next, the left text boundary is extended to serve as the starting point of the top boundary, where the extension distance can be the maximum of avg_H and H_text; then, from this starting point, a line is extended by following the baseline of the first text line to obtain the top line of the mesh. It should be noted that this extension can be implemented in a manner similar to the above manner of extending a top baseline, but the extension is not limited thereto, and other manners of extension can also be used as long as a suitable top line of the warping mesh can be obtained.
Using all the obtained baselines of the text region of the document image (including the extracted long baselines and the extended short baselines), the top and bottom lines, and the determined left and right boundaries, each of the baselines and the top and bottom lines is divided into segments of equal length (this length is calculated from the start point and end point of the line and a set number of segments, where the number of segments is determined by the ratio of the maximum baseline length to the average mesh height), so that the warping mesh of the text region of the document image can be generated, as shown in Fig. 16.
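The segmentation that produces the mesh vertices can be sketched as follows. The representation of each line as a tuple (x_start, x_end, y_of_x), the rounding of the segment count, and the use of equal increments along the x-axis as the meaning of "equal length" are assumptions made for illustration.

```python
def mesh_vertices(lines, avg_mesh_height):
    """Sample every mesh line at the same number of equally spaced points.

    lines -- list of (x_start, x_end, y_of_x), ordered top to bottom, where
             y_of_x is a callable giving y on that line
    """
    max_len = max(x_end - x_start for x_start, x_end, _ in lines)
    # Number of segments: maximum baseline length / average mesh height.
    n = max(1, round(max_len / avg_mesh_height))
    grid = []
    for x_start, x_end, y_of_x in lines:
        seg = (x_end - x_start) / n   # segment length for this line
        row = []
        for k in range(n + 1):
            x = x_start + k * seg
            row.append((x, y_of_x(x)))
        grid.append(row)
    return grid
```

Vertices with the same column index on consecutive rows can then be joined to form the vertical mesh lines, which gives quadrilateral cells.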
Next, distortion correction of the document image is performed based on the generated mesh.
For each cell in the warping mesh, a transformation map is generated, by which each point in the distorted image can be mapped to the corrected image. For example, with reference to an edge interpolation technique, the image mapping from the input distorted document image to the corrected image can be built from the obtained warping mesh. After the transformation map is obtained, the corrected image is generated by performing bilinear interpolation on the pixels in the original document region. Thus, distortion correction of the document image can be achieved according to this mapping, as shown in Fig. 17.
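The final resampling step can be illustrated with a small pure-Python sketch of bilinear interpolation. The `mapping` callable stands in for the transformation map and is a hypothetical placeholder: the description does not give its form, so here it is simply any function that returns, for an output pixel (u, v), the source position (x, y) in the distorted image. The image is assumed to be a non-empty grayscale image stored as a list of rows.

```python
def bilinear_sample(image, x, y):
    """Sample a grayscale image (list of rows) at a fractional position."""
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)   # clamp to the image area
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def remap(image, mapping, out_w, out_h):
    """Build the corrected image by sampling the distorted one."""
    return [[bilinear_sample(image, *mapping(u, v)) for u in range(out_w)]
            for v in range(out_h)]
```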
It should be noted that the description of the correction step is merely exemplary, and when the information about the distortion of the document image is represented in any other way, the correction step of the method according to the first embodiment can be implemented in other ways different from the above.
A distortion correction apparatus according to the first embodiment of the present invention will be described below with reference to Fig. 18, which is a block diagram illustrating the distortion correction apparatus according to the first embodiment of the present invention.
The distortion correction apparatus 1800 can comprise: a baseline extraction unit 1801 configured to extract the baselines of the text lines in the text region contained in the document image, wherein each text line corresponds to one baseline; a baseline extension unit 1802 configured to extend the short baselines among the extracted baselines based on the long baselines among the extracted baselines; and a correction unit 1803 configured to correct the distortion of the document image based on the extracted long baselines and the extended short baselines.
Preferably, the baseline extension unit 1802 can comprise: a sub-region division unit 1802-1 configured to divide the text region into at least one sub-region starting from the first long baseline among the extracted long baselines, wherein each of the at least one sub-region is defined by two adjacent long baselines among the extracted long baselines; and a sub-region baseline extension unit 1802-2 configured to, for each of the at least one sub-region, when the sub-region contains at least one short baseline, extend the at least one short baseline contained in the sub-region based on the two long baselines of the sub-region.
Preferably and optionally, the baseline extension unit 1802 can comprise a unit 1802-3 configured to, for a short baseline located at the top or the bottom of the text region, extend this short baseline based on the two baselines immediately adjacent to it among all of the extracted long baselines and the extended short baselines.
Preferably, the sub-region baseline extension unit 1802-2 can further comprise: a unit configured to select the short baseline with the maximum length from the at least one short baseline contained in the sub-region; a unit configured to extend the selected short baseline based on the two long baselines of the sub-region; and a unit configured to divide the sub-region into two new sub-regions by the extended short baseline, wherein one of the two new sub-regions is defined by one of the two long baselines and the extended short baseline, and the other of the two new sub-regions is defined by the extended short baseline and the other of the two long baselines,
wherein, for each of the two new sub-regions, the selection, extension, and division are performed sequentially until all of the at least one short baseline contained in the sub-region has been extended.
[Advantageous Effects]
As described above, the present invention proposes a solution that extends at least one short baseline extracted from the text region of a document image to determine information about the distortion of the text region of the document image, and then performs correction based on this distortion information.
Compared with the text-based methods of the prior art, the solution of the present invention makes effective use of the short text lines that are usually ignored, and thereby determines the distortion information of the text region of the document image more accurately for correction.
Therefore, for a document image that may contain multiple short text lines (including many short text lines, top short text lines, and short lines at paragraph ends), the solution of the present invention accurately and effectively determines and corrects the distortion of the document image by extending the short baselines of the short text lines.
Fig. 19 shows a comparison between the distortion correction results obtained by the prior art and by the method according to the first embodiment of the present invention. As shown in Fig. 19, for an original input document image with some extremely short text lines (such as the header line "birth background (Song typeface 14)"), the prior-art method loses such a header line, as shown in the middle view of Fig. 19, whereas the solution of the present invention detects such a header line accurately, as shown in the right view of Fig. 19.
[the second embodiment]
A distortion correction method according to the second embodiment of the present invention is described below with reference to the drawings. To correct the distortion of a text region (for example, by building a distortion mesh of the text region), the boundaries of the text region in the document image (for example, the left and right boundaries of the text region when the text lines run horizontally) should be determined. The prior art assumes that the left and right boundaries of a paragraph are nearly straight and models them with the Hough transform, but this assumption does not hold when the vertical distortion is nonlinear. The method according to the second embodiment is provided to address this situation. For clarity, elements and steps of the second embodiment that are identical to those of the first embodiment are denoted by the same reference numerals, and their description is omitted.
In this method, the end points of the baselines of the text region of the document image are further corrected so that the boundaries of the text region are determined accurately; the determined boundaries can then also be used to obtain more accurate distortion information of the text region for correction.
In one implementation, such correction can be applied directly to the end points of baselines that have been extracted from the text region by a prior-art process, in order to obtain more accurate text region boundaries. Unlike prior-art methods, which perform no such correction, the distortion of the text region can then be corrected based on the extracted baselines and the boundaries obtained as described above. This is especially effective, for example, for a text region that may contain no short baselines.
In such a case, the process of obtaining the grid of the text region can: extract the baselines of the text lines in the text region contained in the document image; identify, among all extracted baselines, the baselines whose end points are unaligned end points; for each identified baseline, correct its unaligned end point based on the two normal baselines immediately adjacent to it among all extracted baselines; and generate the boundaries of the text region of the document image using the end points of all baselines, including the corrected ones. Grid lines are then formed, based on the extracted baselines and the generated text region boundaries, to obtain the grid of the text region.
In a preferred implementation, when the text region contains at least one short baseline, such correction can be performed after the baseline extension described in the first embodiment; that is, the solutions of the first and second embodiments can be combined to obtain a more advantageous effect.
In the process of obtaining the grid of the text region in this case, after the baselines have been extracted and the short baselines among them have been extended as in the first embodiment, the processing of the second embodiment: identifies, among all of the extracted long baselines and the extended short baselines, the baselines whose end points are unaligned end points; for each identified baseline, corrects its unaligned end point based on the two normal baselines immediately adjacent to it among all of the extracted long baselines and the extended short baselines; and generates the boundaries of the text region of the document image using the end points of all baselines, including the corrected ones. Grid lines are then formed, based on the extracted long baselines, the extended short baselines, and the generated text region boundaries, to obtain the grid of the text region.
The processing of the text boundary determining step of the method according to the second embodiment of the present invention is described below with reference to Figure 20.
In step S2010 (also called the unaligned baseline identification step), the baselines whose end points are unaligned end points are identified among all obtained baselines. When the text boundary determining step is applied to the first embodiment, the obtained baselines are the extracted long baselines and the extended short baselines obtained as in the first embodiment.
In the unaligned baseline identification processing, for each of the obtained baselines, a decision line is generated based on the end points of a predetermined number of baselines adjacent to that baseline among all obtained baselines, and whether the end point of the baseline is an unaligned end point is then identified based on the generated decision line.
The decision line may be generated by directly connecting, or fitting, the end points of the baselines immediately before and after the baseline. In another implementation, the decision line is generated by fitting the end points of the predetermined number of baselines adjacent to the baseline. The fitting may be any mathematical fitting known in the prior art.
In step S2020 (also called the unaligned baseline correction step), for each baseline whose end point has been identified as an unaligned end point, the unaligned end point is corrected based on the two normal baselines immediately adjacent to the identified baseline among all obtained baselines.
In the unaligned baseline correction step, the end points of the baselines immediately before and after the identified baseline are directly connected or fitted to generate a decision line; the identified baseline is then extended toward the generated decision line until it intersects it, and the intersection point is used as the corrected end point of the baseline.
In step S2030, the boundaries of the text region are determined based on the end points of all baselines, including the corrected baselines.
An implementation of the text boundary determination processing for the left and right boundaries is described in detail below with reference to Figures 21 to 23. The description addresses a left-aligned text region; those skilled in the art will appreciate that the same processing applies equally to other cases (for example, a right-aligned or centered text region).
For the left end point of a baseline to be examined, the decision line is generated by directly connecting, or fitting, the left end points of the baselines adjacent to that baseline (for example, the baselines immediately before and after it). In this case, the predetermined number of adjacent baselines is two.
Whether the left end point of the baseline is unaligned can then be determined based on the generated decision line, for example by judging whether the left end point lies to the right of the decision line at more than a certain distance from it (a third threshold, for example half the text height; other values are of course possible). If so, the left end point is marked as an indented or unaligned end point. As shown in Figure 21, the end points indicated by circles are identified as unaligned end points.
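The left end point test described above can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: the function name `is_left_endpoint_unaligned`, the `(x, y)` point representation, the `ratio` parameter, and the use of the horizontal offset from the decision line (instead of a perpendicular distance) are assumptions made for the example.

```python
def is_left_endpoint_unaligned(prev_pt, pt, next_pt, text_height, ratio=0.5):
    """Return True if the left end point `pt` is indented or unaligned.

    Points are (x, y) tuples in image coordinates (x grows to the right).
    `prev_pt` and `next_pt` are the left end points of the baselines
    immediately before and after the tested baseline.
    """
    (x0, y0), (x, y), (x1, y1) = prev_pt, pt, next_pt
    if y0 == y1:
        raise ValueError("neighbouring end points must differ in y")
    # x-coordinate of the decision line at the height of the tested point.
    t = (y - y0) / (y1 - y0)
    x_line = x0 + t * (x1 - x0)
    # Unaligned when the point lies to the right of the line by more than
    # the threshold (assumed here to be `ratio` times the text height).
    return (x - x_line) > ratio * text_height
```

With neighbouring left end points at (10, 0) and (12, 40) and a text height of 20, an end point at (30, 20) would be flagged, while one at (13, 20) would be treated as normal.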
Then, each baseline whose left end point has been identified as an indented or unaligned end point is extended to the left until it intersects the decision line; the intersection point serves as the corrected left end point of the baseline and is used to generate the left boundary of the text region. It should be noted that the extension of the baseline can be implemented in several ways. In a preferred implementation, the baseline is extended along its tangent at the left end point.
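The extension step can be sketched as a ray-line intersection. The sketch assumes, for illustration only, that a baseline is available as a sequence of points and that the tangent at the end point is approximated by the direction from the neighbouring baseline point (`inner_pt`) to the end point; the function name `extend_to_line` and its parameters are hypothetical.

```python
def extend_to_line(end_pt, inner_pt, line_a, line_b):
    """Extend a baseline beyond `end_pt` until it meets the decision line.

    `inner_pt` is the baseline point next to `end_pt`; the extension
    direction (end_pt - inner_pt) approximates the tangent at the end point.
    The decision line is the infinite line through `line_a` and `line_b`.
    Returns the intersection point, used as the corrected end point.
    """
    px, py = end_pt
    dx, dy = px - inner_pt[0], py - inner_pt[1]
    ax, ay = line_a
    ex, ey = line_b[0] - ax, line_b[1] - ay
    # 2D cross product of the two directions; zero means no unique intersection.
    denom = dx * ey - dy * ex
    if denom == 0:
        raise ValueError("baseline direction is parallel to the decision line")
    # Solve end_pt + s * (dx, dy) = line_a + u * (ex, ey) for s.
    s = ((ax - px) * ey - (ay - py) * ex) / denom
    return (px + s * dx, py + s * dy)
```

For a horizontal baseline ending at (30, 20) and a decision line through (10, 0) and (12, 40), the corrected left end point lands at approximately (11, 20).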
The processing described above for the left end point of a baseline can also be applied to its right end point to correct an unaligned right end point.
However, considering that consecutive indented or unaligned end points are usually present on the right side (whereas, for left-aligned text, consecutive indented or unaligned end points usually do not occur on the left boundary of the text region), processing different from that for the left end point is preferably applied to obtain a better correction result for the right end point.
In the processing for correcting the right end point of a baseline, a decision line is first generated by fitting the end points of a predetermined number of baselines adjacent to the baseline. The fitting may be any mathematical fitting known in the prior art, such as linear regression or RANSAC (random sample consensus), based on the assumption that the local distortion is linear. The number of adjacent baselines can be set freely, for example to 5, and is not limited to this value.
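As an illustration of the fitting step, the following sketch fits a decision line to the right end points of the neighbouring baselines by ordinary least squares. It covers only the linear regression option mentioned above, not RANSAC. The parameterization x = a*y + b (chosen because the boundary is close to vertical) and the name `fit_decision_line` are assumptions made for the example.

```python
def fit_decision_line(points):
    """Least-squares fit of x = a * y + b through end points (x, y).

    x is regressed on y because a text region boundary is close to
    vertical, so x changes slowly with y.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two end points are required")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points)
    if var_y == 0:
        raise ValueError("end points must not all share the same y")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = cov_xy / var_y
    b = mean_x - a * mean_y
    return a, b
```

The value of the decision line at the height y of a tested right end point is then a * y + b, which can be compared with the x-coordinate of that end point.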
Figure 22(a) shows the decision line obtained for the right end point of a baseline. This decision line is used to estimate the vertical distortion of the local region near the right end point of the current baseline; the local region consists of the end regions of N (for example, 5) baselines.
Whether the right end point of the baseline is unaligned can then be determined based on the decision line generated from the normal end points (end points that are neither indented nor unaligned) of the adjacent baselines.
For the right end point of a baseline, it is judged whether the end point lies to the left of the decision line at more than a certain distance from it (a fourth threshold, for example half the text height; other values are of course possible). If so, the right end point is marked as an unaligned end point (indicated by the circles in Figure 22(b)). Otherwise, it is marked as a normal end point.
Next, the identified unaligned right end point of a short baseline is corrected. For the identified unaligned right end point (for example, the one indicated by the circle near the word "power"), the two normal end points closest to it (for example, the end points indicated by the circles near the words "elder generation" and "he", respectively) are selected. A line is then fitted using these two normal end points. The identified baseline is then extended to the right until it intersects the fitted line; the intersection point becomes the corrected right end point of the baseline and is used to generate the right boundary of the text region (as shown in Figure 22(c)). It should be noted that the extension of the baseline can be implemented in several ways. In a preferred implementation, the baseline is extended in the tangential direction at the right end point.
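The right end point correction can be sketched as below. Several details are assumptions made for illustration: "closest" is interpreted as closest in the vertical direction, the tangent is approximated by the direction from the neighbouring baseline point `inner_pt` to the end point, and the name `correct_right_endpoint` is hypothetical.

```python
def correct_right_endpoint(end_pt, inner_pt, normal_endpoints):
    """Correct an unaligned right end point of a baseline.

    The two normal end points nearest to `end_pt` (by vertical distance)
    define a line; the baseline is extended from `end_pt` along the
    direction (end_pt - inner_pt) until it meets that line.
    """
    if len(normal_endpoints) < 2:
        raise ValueError("two normal end points are required")
    a, b = sorted(normal_endpoints, key=lambda p: abs(p[1] - end_pt[1]))[:2]
    px, py = end_pt
    dx, dy = px - inner_pt[0], py - inner_pt[1]
    ex, ey = b[0] - a[0], b[1] - a[1]
    denom = dx * ey - dy * ex
    if denom == 0:
        raise ValueError("no unique intersection with the fitted line")
    # Solve end_pt + s * (dx, dy) = a + u * (ex, ey) for s.
    s = ((a[0] - px) * ey - (a[1] - py) * ex) / denom
    return (px + s * dx, py + s * dy)
```

For a short horizontal baseline ending at (60, 20) with normal right end points at (100, 0), (104, 40) and (110, 100), the two nearest normal end points are the first two, and the corrected right end point is approximately (102, 20).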
Based on the corrected end points of the baselines, improved text region boundaries can be generated. As shown in Figure 23, the left and right boundaries of the text region are generated from the corrected left end points and right end points, respectively. A boundary is generated by connecting the end points of adjacent baselines. Other ways of generating the boundary are of course possible.
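One simple way to represent a boundary generated by connecting end points is a piecewise-linear curve that can be evaluated at any height. The sketch below is an assumption-laden illustration: the name `boundary_x_at` is hypothetical, and clamping to the first or last end point outside the covered range is a choice made for the example, not something stated in the description.

```python
def boundary_x_at(endpoints, y):
    """Evaluate a piecewise-linear boundary through `endpoints` at height y.

    `endpoints` are (x, y) tuples (for example, corrected left end points).
    Outside the covered y-range the boundary is clamped to the first or
    last end point.
    """
    pts = sorted(endpoints, key=lambda p: p[1])
    if not pts:
        raise ValueError("at least one end point is required")
    if y <= pts[0][1]:
        return float(pts[0][0])
    if y >= pts[-1][1]:
        return float(pts[-1][0])
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if y0 <= y <= y1:
            if y1 == y0:
                return float(x0)
            # Linear interpolation between two consecutive end points.
            t = (y - y0) / (y1 - y0)
            return x0 + t * (x1 - x0)
```

A curve fit through the same end points could be substituted for the linear interpolation without changing how the boundary is used afterwards.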
It should be noted that, in the above description, the unaligned end point of a baseline is corrected using the predetermined number of baselines around that baseline; this description is merely exemplary and not restrictive. For a top or bottom baseline whose end point is unaligned, the correction uses the predetermined number of adjacent baselines located before or after that baseline.
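The choice of adjacent baselines, including the one-sided case for the top and bottom baselines, can be sketched as an index selection. The function name `neighbour_indices` and the tie-breaking rule (prefer the earlier baseline when two are equally close) are assumptions made for illustration.

```python
def neighbour_indices(i, count, k=2):
    """Indices of up to k baselines nearest to baseline i (excluding i).

    Baselines are indexed from top (0) to bottom (count - 1). For the top
    or bottom baseline, all neighbours come from the one side that exists.
    """
    if not 0 <= i < count:
        raise IndexError("baseline index out of range")
    others = [j for j in range(count) if j != i]
    # Nearest first; ties are resolved in favour of the smaller index.
    nearest = sorted(others, key=lambda j: (abs(j - i), j))[:k]
    return sorted(nearest)
```

For six baselines and k = 2, the top baseline uses baselines 1 and 2, an interior baseline such as 3 uses baselines 2 and 4, and the bottom baseline uses baselines 3 and 4.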
The text region boundary determining unit according to the second embodiment of the present invention is described below with reference to Figure 24, which is a block diagram illustrating that unit.
The text region boundary determining unit can be applied directly to an existing distortion correction apparatus, to correct all obtained baselines and thereby determine the text region boundaries. In a preferred implementation, the boundary determining unit is used in the distortion correction apparatus according to the first embodiment of the present invention, to correct all of the extracted long baselines and the extended short baselines and thereby determine the text region boundaries.
In the second embodiment, the text region boundary determining unit 2400 may be located in the baseline extension unit 1802 or the correcting unit 1803 according to the first embodiment, or may be located outside the baseline extension unit 1802 or the correcting unit 1803 according to the first embodiment of the present invention and interact with it.
The text region boundary determining unit 2400 may even be located outside the distortion correction apparatus according to the first embodiment.
The text region boundary determining unit 2400 may comprise: an unaligned baseline identification unit 2401, configured to identify, among all of the extracted long baselines and the extended baselines, the baselines whose end points are unaligned end points; an unaligned baseline correction unit 2402, configured to correct, for each baseline whose end point has been identified as an unaligned end point, the unaligned end point based on the two normal baselines immediately adjacent to the identified baseline among all of the extracted long baselines and the extended baselines; and a boundary generation unit 2403, configured to generate the text region boundaries of the document image using the end points of all baselines, including the corrected unaligned baselines.
Preferably, the unaligned baseline identification unit 2401 may comprise: a unit configured to generate a decision line based on the end points of a predetermined number of baselines adjacent to a given baseline among all of the extracted long baselines and the extended baselines; and a unit configured to identify, based on the decision line, whether the end point of the baseline is an unaligned end point, wherein the generation and identification are performed in turn for each of the extracted long baselines and the extended short baselines.
Preferably, the unaligned baseline correction unit 2402 may further comprise: a unit configured to directly connect or fit the baseline end points of the two baselines nearest to the baseline to generate a line; and a unit configured to extend the baseline toward the generated line until they intersect, so that the intersection point is used as the corrected end point of the baseline.
[advantageous effects]
As described above, a solution of the present invention has been proposed that corrects the end points of the baselines of the text lines in the text region of a document image in order to determine the text region boundaries accurately, and then performs correction based on those boundaries.
Compared with the boundary-based methods of the prior art, this solution of the present invention can handle any case in which all four boundaries are nonlinear curves.
Therefore, for a document image in which the vertical distortion is nonlinear, this solution of the present invention accurately and effectively determines and corrects the distortion of the document image by accurately determining the left and right boundaries of the text region.
Figure 25 shows a comparison between the baseline correction results obtained by the prior-art method and by the method according to the second embodiment of the present invention. As shown in Figure 25, for the original input document image, the left boundary of the first paragraph of the text region obtained by the prior art, for example, is still slightly distorted (as the middle view of Figure 25 shows), whereas the solution of the present invention detects and represents this left boundary accurately (as the right view of Figure 25 shows).
In addition, when combined with the first embodiment, for a document image in which the vertical distortion is nonlinear, the solution of the present invention still accurately and effectively determines and corrects the distortion by accurately determining the left and right boundaries of the text region, even if the text image is distorted in a complex way and may contain multiple short text lines.
[industrial applicability]
The present invention can be used in many applications. For example, the present invention can be used to recognize and process document images captured by a camera, and is especially advantageous for handheld devices equipped with a camera (for example, camera-equipped mobile phones).
It should be noted that the methods and apparatuses described in this specification can be implemented as software, firmware, hardware, or any combination thereof. Some components may, for example, be implemented as software running on a digital signal processor or a microprocessor. Other components may, for example, be implemented as hardware and/or application-specific integrated circuits.
The methods and apparatuses of the present invention can be carried out in many ways, for example by software, hardware, firmware, or any combination thereof. The order of the method steps described above is merely illustrative, and, unless specifically stated otherwise, the steps of the method of the present invention are not limited to the order specifically described above. In addition, in some embodiments, the present invention can also be embodied as a program recorded on a recording medium, comprising machine-readable instructions for implementing the method according to the present invention. The present invention therefore also covers a recording medium storing a program for implementing the method according to the present invention.
Although the present invention has been described with reference to example embodiments, those skilled in the art will understand that the above examples are merely illustrative and are not intended to limit the scope of the present invention. Those skilled in the art will also understand that the above embodiments can be modified without departing from the scope and spirit of the present invention. The scope of the present invention is defined by the appended claims, which are to be given the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims (13)

1. A distortion correction apparatus for a document image, comprising:
a baseline extraction unit, configured to extract the baselines of the text lines in a text region contained in the document image, wherein each text line corresponds to one baseline;
a baseline extension unit, configured to extend the short baselines among the extracted baselines based on the long baselines among the extracted baselines; and
a correcting unit, configured to correct the distortion of the document image based on the extracted long baselines and the extended short baselines.
2. The apparatus according to claim 1, wherein a long baseline is an extracted baseline whose length is greater than or equal to a first threshold, and a short baseline is an extracted baseline whose length is less than the first threshold.
3. The apparatus according to claim 1, wherein the baseline extension unit further comprises:
a sub-region dividing unit, configured to divide the text region into at least one sub-region, wherein, starting from the first long baseline among the extracted long baselines, each of the at least one sub-region is bounded by two adjacent long baselines among the extracted long baselines; and
a sub-region baseline extension unit, configured to, for each of the at least one sub-region, when the sub-region contains at least one short baseline, extend the at least one short baseline contained in the sub-region based on the two long baselines bounding the sub-region.
4. The apparatus according to claim 3, wherein the sub-region baseline extension unit further comprises:
a unit configured to select the short baseline having the greatest length among the at least one short baseline contained in the sub-region;
a unit configured to extend the selected short baseline based on the two long baselines bounding the sub-region; and
a unit configured to divide the sub-region into two new sub-regions by the extended short baseline, wherein one of the two new sub-regions is bounded by one of the two long baselines and the extended short baseline, and the other of the two new sub-regions is bounded by the extended short baseline and the other of the two long baselines,
wherein the selection, extension, and division are performed in turn for each of the two new sub-regions until all of the at least one short baseline contained in the sub-region have been extended.
5. The apparatus according to claim 3, wherein the baseline extension unit further comprises:
a unit configured to, for a short baseline located at the top or bottom of the text region, extend the short baseline based on the two baselines immediately adjacent to it among all of the extracted long baselines and the extended short baselines.
6. The apparatus according to claim 1, further comprising a text region boundary determining unit, configured to determine the boundaries of the text region contained in the document image based on the end points of the extracted long baselines and the extended short baselines, the text region boundary determining unit comprising:
an unaligned baseline identification unit, configured to identify, among all of the extracted long baselines and the extended short baselines, the baselines whose end points are unaligned end points;
an unaligned baseline correction unit, configured to correct, for each baseline whose end point has been identified as an unaligned end point, the unaligned end point based on the two normal baselines immediately adjacent to the identified baseline among all of the extracted long baselines and the extended short baselines; and
a boundary generation unit, configured to generate the text region boundaries of the document image using the end points of all baselines, including the corrected unaligned baselines.
7. The apparatus according to claim 6, wherein the unaligned baseline identification unit comprises:
a unit configured to generate a decision line based on the end points of a predetermined number of baselines adjacent to a given baseline among all of the extracted long baselines and the extended baselines; and
a unit configured to identify, based on the decision line, whether the end point of the baseline is an unaligned end point,
wherein the generation and identification are performed in turn for each of the extracted long baselines and the extended short baselines.
8. The apparatus according to claim 7, wherein the decision line is generated by directly connecting or fitting the baseline end points of the predetermined number of baselines adjacent to the baseline.
9. The apparatus according to claim 7, wherein, when identifying based on the decision line whether the end point of a baseline is an unaligned end point,
for a left end point, if the left end point is located to the right of the decision line and its distance from the decision line is greater than a third threshold, the left end point is identified as an unaligned end point, and
for a right end point, if the right end point is located to the left of the decision line and its distance from the decision line is greater than a fourth threshold, the right end point is identified as an unaligned end point.
10. The apparatus according to claim 7, wherein the unaligned baseline correction unit further comprises:
a unit configured to directly connect or fit the baseline end points of the two baselines nearest to the baseline to generate a line; and
a unit configured to extend the baseline toward the generated line until they intersect, so that the intersection point is used as the corrected end point of the baseline.
11. The apparatus according to claim 7, wherein the text region boundaries are generated by directly connecting, or curve fitting, the end points of all baselines, including the corrected unaligned baselines.
12. The apparatus according to claim 1, wherein the correcting unit further comprises:
a distortion mesh generation unit, configured to generate a distortion mesh based on all of the extracted long baselines and the extended short baselines and on the text region boundaries determined from them; and
a unit configured to correct the distortion of the document image based on the generated distortion mesh.
13. A distortion correction method for a document image, comprising:
a baseline extraction step of extracting the baselines of the text lines in a text region contained in the document image, wherein each text line corresponds to one baseline;
a baseline extension step of extending the short baselines among the extracted baselines based on the long baselines among the extracted baselines; and
a correction step of correcting the distortion of the document image based on the extracted long baselines and the extended short baselines.
CN201410286333.5A 2014-06-24 2014-06-24 Distortion correction method and equipment for file and picture Active CN105225218B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410286333.5A CN105225218B (en) 2014-06-24 2014-06-24 Distortion correction method and equipment for file and picture

Publications (2)

Publication Number Publication Date
CN105225218A true CN105225218A (en) 2016-01-06
CN105225218B CN105225218B (en) 2018-12-21

Family

ID=54994168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410286333.5A Active CN105225218B (en) 2014-06-24 2014-06-24 Distortion correction method and equipment for file and picture

Country Status (1)

Country Link
CN (1) CN105225218B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070206877A1 (en) * 2006-03-02 2007-09-06 Minghui Wu Model-based dewarping method and apparatus
CN101460937A (en) * 2006-03-02 2009-06-17 计算机连接管理中心公司 Model- based dewarping method and apparatus
CN101515984A (en) * 2008-02-19 2009-08-26 佳能株式会社 Electronic document producing device and electronic document producing method
US20100073735A1 (en) * 2008-05-06 2010-03-25 Compulink Management Center, Inc. Camera-based document imaging
CN101789122A (en) * 2009-01-22 2010-07-28 佳能株式会社 Method and system for correcting distorted document image

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106249236A (en) * 2016-07-12 2016-12-21 北京航空航天大学 A kind of spaceborne InSAR long-short baselines image associating method for registering
CN106249236B (en) * 2016-07-12 2019-01-22 北京航空航天大学 A kind of spaceborne InSAR long-short baselines image joint method for registering
CN108875744A (en) * 2018-03-05 2018-11-23 南京理工大学 Multi-oriented text lines detection method based on rectangle frame coordinate transform
CN109241966A (en) * 2018-08-22 2019-01-18 东北农业大学 A kind of plant leaf blade nondestructive collection method
CN109829437A (en) * 2019-02-01 2019-05-31 北京旷视科技有限公司 Image processing method, text recognition method, device and electronic system
CN109829437B (en) * 2019-02-01 2022-03-25 北京旷视科技有限公司 Image processing method, text recognition device and electronic system
CN110852229A (en) * 2019-11-04 2020-02-28 泰康保险集团股份有限公司 Method, device and equipment for determining position of text area in image and storage medium
CN111724320A (en) * 2020-06-19 2020-09-29 北京波谱华光科技有限公司 Blind pixel filling method and system
CN111967463A (en) * 2020-06-23 2020-11-20 南昌大学 Method for detecting curve fitting of curved text in natural scene

Also Published As

Publication number Publication date
CN105225218B (en) 2018-12-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant