CN105868759A - Method and apparatus for segmenting image characters - Google Patents

Info

Publication number: CN105868759A
Authority: CN (China)
Prior art keywords: binary image, character, straight line, image, connected domain
Legal status: Granted, active
Application number: CN201510031629.7A
Other languages: Chinese (zh)
Other versions: CN105868759B (en)
Inventors: 王楠, 杜志军
Current assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd
Original assignee: Alibaba Group Holding Ltd
Application filed by: Alibaba Group Holding Ltd
Priority: CN201510031629.7A (granted as CN105868759B)
Landscapes: Character Input
Abstract

The invention provides a method and apparatus for segmenting image characters. The method includes the following steps: binarization is performed on an original image containing a single line of characters to be segmented, to obtain a binarized image; straight-line detection is performed on the binarized image to obtain a tilt correction parameter, and any background straight line that is detected is removed from the binarized image; tilt correction is performed on the binarized image according to the tilt correction parameter; the character region containing the single line of characters is determined according to the pixel connected domains in the corrected binarized image; and character segmentation positions are determined based on the character region. The method and apparatus can improve the accuracy of character segmentation.

Description

Method and apparatus for segmenting image characters
[Technical Field]
The present application relates to the technical field of image processing, and in particular to a method and apparatus for segmenting image characters.
[Background]
With the development of smartphones, more and more mobile phones have a camera function. Users photograph identity cards, bank cards, business cards, bills and similar items with their phones for later use. The text, digits and other characters in these images are usually the key information, so recognizing them accurately and quickly is of great importance. Recognition first requires segmenting the characters, and the quality of the segmentation directly affects the accuracy of the recognition result.
When an image is captured, special circumstances often arise: for example, the shooting angle is tilted, or the photographed object carries interfering straight lines. These circumstances all degrade segmentation and lead to low segmentation accuracy.
[Summary of the Invention]
Aspects of the present application provide a method and apparatus for segmenting image characters, in order to improve the accuracy of image character segmentation.
One aspect of the present application provides a method for segmenting image characters, including:
performing binarization on an original image containing a single line of characters to be segmented, to obtain a binarized image;
performing straight-line detection on the binarized image to obtain a tilt correction parameter and, when a background straight line is detected, removing the background straight line from the binarized image;
performing tilt correction on the binarized image according to the tilt correction parameter;
determining the character region containing the single line of characters according to the pixel connected domains in the corrected binarized image; and
determining character segmentation positions based on the character region.
Another aspect of the present application provides an apparatus for segmenting image characters, including:
a binarization module, configured to perform binarization on an original image containing a single line of characters to be segmented, to obtain a binarized image;
a straight-line detection module, configured to perform straight-line detection on the binarized image to obtain a tilt correction parameter and, when a background straight line is detected, to remove the background straight line from the binarized image;
a tilt correction module, configured to perform tilt correction on the binarized image according to the tilt correction parameter;
a character region determination module, configured to determine the character region containing the single line of characters according to the pixel connected domains in the corrected binarized image; and
a segmentation position determination module, configured to determine character segmentation positions based on the character region.
In the present application, binarization is performed on the original image containing the single line of characters to be segmented, to obtain a binarized image, and straight-line detection is performed on the binarized image. On the one hand this yields the tilt correction parameter; on the other hand, any straight line that is detected is removed, so that it does not interfere with determining the character segmentation positions. Tilt correction is then performed on the binarized image according to the tilt correction parameter, to reduce the impact of image tilt on segmentation accuracy. Next, the character region containing the single line of characters is determined according to the pixel connected domains in the corrected binarized image, and the character segmentation positions are determined based on that region. Narrowing the area occupied by the single line of characters in the image, and determining the segmentation positions within the narrowed area, further improves the accuracy of the determined positions. The technical solution of the present application can therefore improve the accuracy of character segmentation positions.
[Brief Description of the Drawings]
To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the method for segmenting image characters provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of positions where a background straight line intersects characters, provided by an embodiment of the present application;
Fig. 3 is an enlarged schematic diagram of a position where a background straight line intersects a character, provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of several situations that may occur for the bounding rectangle, provided by an embodiment of the present application;
Fig. 5 is a schematic structural diagram of the apparatus for segmenting image characters provided by an embodiment of the present application;
Fig. 6 is a schematic structural diagram of the apparatus for segmenting image characters provided by another embodiment of the present application.
[Detailed Description]
To make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present application.
Fig. 1 is a schematic flowchart of the method for segmenting image characters provided by an embodiment of the present application. As shown in Fig. 1, the method includes:
101. Perform binarization on the original image containing the single line of characters to be segmented, to obtain a binarized image.
The method of this embodiment may be executed by an apparatus for segmenting image characters. The apparatus may be any device that needs to recognize and display images, such as a camera, a mobile phone, a computer or an iPad.
The method of this embodiment is mainly suited to segmenting a single line of characters and is therefore described using a single line of characters as an example, but it is not limited to this case. The characters may be digits, words, letters, various symbols and so on. In the embodiments of the present application, character segmentation mainly refers to the process of determining character segmentation positions.
The apparatus obtains the original image containing the single line of characters to be segmented in advance. The original image may be a grayscale image or a color image. Optionally, if the original image is a color image, the apparatus may first convert it to a grayscale image, which reduces the amount of image information and the processing load.
In one application scenario, the apparatus directly obtains a complete image captured by a camera device (the complete image is likewise a grayscale or color image) and extracts from it the original image containing the single line of characters to be segmented; in this case the original image is a part of the complete image. For example, the camera device may photograph a whole identity card to obtain an identity card image; the apparatus extracts from that image the partial image containing the identity card number. The identity card number is the single line of characters to be segmented, and the partial image is the original image of this embodiment. As another example, the camera device may photograph a whole form to obtain a form image. The form includes several information rows: for instance, the payee organization is one information row, the payer name is another, and the date together with the product name is a third. The apparatus extracts from the form image the partial image containing one information row. The information in that row is the single line of characters to be segmented, and the partial image is the original image of this embodiment.
In another application scenario, the apparatus has a camera module that provides a photo prompt frame for capturing a single line of characters. Specifically, when the user opens the camera module, a rectangular frame is displayed in the central imaging area as the photo prompt frame. Characters inside the prompt frame are segmented and extracted; characters outside it are ignored. After the user has framed the characters to be captured, the camera module starts autofocus and, once focus has been stable for a certain time, photographs the characters inside the prompt frame, thereby obtaining the original image containing the characters.
For example, the user moves the apparatus so that the identity card number to be captured lies inside the photo prompt frame and then takes the picture with the apparatus's camera module, obtaining the original image containing the identity card number. As another example, the user moves the apparatus so that an information row of the form lies inside the photo prompt frame and then takes the picture with the camera module, obtaining the original image containing that information row.
After obtaining the original image containing the single line of characters, the apparatus performs binarization on it to obtain a binarized image. Binarization here means converting a color or grayscale image into a binary image whose pixel values are only 0 and 1. In the binarized image of this embodiment, pixels in character regions have the value 0 and all other (background) pixels have the value 1.
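The 0/1 convention described above can be illustrated with a minimal sketch. The source does not say how the threshold is chosen, so the fixed threshold of 128, the assumption of dark characters on a light background, and the name `binarize` are all illustrative assumptions rather than part of the described method.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map dark pixels to 0 (character) and bright pixels to 1 (background).

    The fixed threshold is an assumption; the source does not specify one.
    """
    return [[0 if value < threshold else 1 for value in row] for row in gray]


# A tiny grayscale patch: values near 0 are ink, values near 255 are paper.
gray = [
    [250, 240, 30, 245],
    [248, 20, 25, 251],
]
binary = binarize(gray)
```

In practice an adaptive or Otsu-style threshold would probably be more robust to uneven lighting than a fixed value.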
102. Perform straight-line detection on the binarized image to obtain a tilt correction parameter and, when a background straight line is detected, remove the background straight line from the binarized image.
In practice, for printed documents that contain tables, such as bank bills and payment slips, photographing the characters on them usually also captures horizontal or vertical rules of the document. That is, besides the single line of characters to be segmented, the original image also contains horizontal or vertical straight lines as background, referred to as background straight lines. Depending on the positional relationship between the background line and the characters, two cases are distinguished. In the first case the background line does not intersect the characters: the characters are not printed on a table line, and the two can be separated directly without affecting each other. In the second case the background line intersects the characters: the characters are printed over a table line, and the two cannot be separated directly.
In either case, a background line in the original image usually has an adverse effect on character segmentation, for example by causing segmentation errors. To address this, after obtaining the binarized image containing the single line of characters, the apparatus performs straight-line detection on it. This serves two purposes: first, to detect whether the binarized image contains a background line and, if so, to remove it from the binarized image so that it does not reduce segmentation accuracy; second, to obtain the tilt correction parameter needed for the subsequent tilt correction.
In an optional embodiment, straight-line detection on the binarized image may be performed in, but is not limited to, the following way:
Apply a Hough transform to the binarized image to detect whether a background straight line exists. For example, for an image captured with the photo prompt frame, determine whether the length of a horizontal line exceeds a specified percentage of the prompt frame's width, or whether the height of a vertical line exceeds a specified percentage of the prompt frame's height; if so, the horizontal or vertical line is determined to be a background straight line. The specified percentage may be, for example, 60%, 70% or 80%.
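The length test applied to a detected line can be sketched as follows. The Hough transform itself is not implemented here; the sketch assumes some detector has already returned line segments as end-point pairs. The `Segment` type, the `is_background_line` name, and the rule of classifying a segment as horizontal or vertical by its dominant direction are illustrative assumptions.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """End points of a detected line segment (hypothetical detector output)."""
    x1: int
    y1: int
    x2: int
    y2: int


def is_background_line(seg: Segment, frame_width: int, frame_height: int,
                       ratio: float = 0.7) -> bool:
    """Return True if the segment is long enough to count as a background line.

    Near-horizontal segments are compared with the prompt frame width,
    near-vertical ones with the prompt frame height.
    """
    dx = abs(seg.x2 - seg.x1)
    dy = abs(seg.y2 - seg.y1)
    if dx >= dy:  # dominant direction is horizontal
        return dx > ratio * frame_width
    return dy > ratio * frame_height


long_rule = Segment(10, 50, 500, 58)    # slightly tilted table rule
short_stroke = Segment(10, 50, 200, 50) # probably part of a character
```

With a 600 x 100 prompt frame and the 70% ratio mentioned in the text, `long_rule` is accepted as a background line and `short_stroke` is not.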
When a background straight line is detected in the binarized image, the line must be removed from the binarized image. Before it is removed, its slope and intercept are obtained and recorded, so that the tilt correction parameter needed by the subsequent tilt correction step can be derived from them.
For the first case above, in which the background line is separate from the characters, a numerical operation can be applied to the values of the non-line pixels surrounding the background line to obtain a substitute pixel value, and the pixel values on the background line are replaced directly with this substitute value, thereby removing the line. One way to obtain the substitute value is to average the values of the surrounding non-line pixels. Another is to take a weighted average of those values. Note that this embodiment does not restrict the extent of the "surrounding non-line pixels".
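A minimal sketch of the averaging variant for a horizontal background line follows. Because the source leaves the neighborhood open, this sketch assumes it consists of just the pixel directly above and the pixel directly below the line in each column, and rounds the average back to 0 or 1; `remove_separate_line` is a hypothetical name.

```python
from typing import List

Image = List[List[int]]


def remove_separate_line(binary: Image, top: int, bottom: int) -> Image:
    """Replace rows top..bottom (a horizontal line) with a substitute value.

    For each column the substitute is the rounded mean of the pixel just
    above and just below the line (an assumed neighborhood).
    """
    height = len(binary)
    result = [row[:] for row in binary]
    for x in range(len(binary[0])):
        neighbors = []
        if top - 1 >= 0:
            neighbors.append(binary[top - 1][x])
        if bottom + 1 < height:
            neighbors.append(binary[bottom + 1][x])
        # With no neighbors at all, fall back to background (1).
        mean = sum(neighbors) / len(neighbors) if neighbors else 1.0
        substitute = 1 if mean >= 0.5 else 0
        for y in range(top, bottom + 1):
            result[y][x] = substitute
    return result


image = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],  # one-pixel-high background line
    [1, 1, 1, 1],
]
cleaned = remove_separate_line(image, top=1, bottom=1)
```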
For the second case above, in which the background line intersects the characters, the parts of the background line that do not intersect a character are removed and the parts that do are retained, thereby removing the line. Taking a horizontal background line as an example, the steps are as follows:
a. In the binarized image, determine the height range of the background line.
b. Traverse the line to find the positions where it intersects characters; Fig. 2 shows one such arrangement of background line and intersecting characters.
c. Determine the four intersection points of the background line with a character, two on the upper side and two on the lower side, such that the distance between the two upper points is close to the distance between the two lower points and the horizontal offset between the two pairs is not too large; Fig. 3 shows an enlarged view of four such intersection points.
Here, "close" means that the difference between the distance between the two upper intersection points and the distance between the two lower intersection points does not exceed a preset threshold, and "not too large" means that the horizontal offset between the two pairs does not exceed a specified offset.
d. Form a quadrilateral from the four intersection points, and treat every position inside it as a pixel position where the background line and the character intersect.
e. Keep the pixel values at those intersection positions unchanged, and replace the values at all other pixel positions on the background line with the result of a numerical operation on the values of the surrounding non-line pixels (the substitute pixel value), thereby removing the line.
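The acceptance test in step c can be sketched as a small predicate. The source does not define exactly how the horizontal offset between the two pairs is measured, so this sketch assumes it is the distance between the midpoints of the upper and lower pairs; the function name and the default thresholds are likewise illustrative.

```python
from typing import Tuple

Pair = Tuple[int, int]  # x coordinates of the left and right intersection


def is_stroke_crossing(upper: Pair, lower: Pair,
                       max_width_diff: int = 2, max_offset: int = 3) -> bool:
    """Check whether four intersection points plausibly belong to one stroke.

    upper/lower hold the x coordinates where a character stroke meets the
    upper and lower edge of a horizontal background line.
    """
    upper_width = upper[1] - upper[0]
    lower_width = lower[1] - lower[0]
    # Assumed definition: offset between the midpoints of the two pairs.
    offset = abs((upper[0] + upper[1]) / 2 - (lower[0] + lower[1]) / 2)
    return abs(upper_width - lower_width) <= max_width_diff and offset <= max_offset


same_stroke = is_stroke_crossing((10, 14), (11, 15))
far_apart = is_stroke_crossing((10, 14), (30, 34))
width_mismatch = is_stroke_crossing((10, 14), (10, 22))
```

Only point sets that pass this check would go on to form the quadrilateral of step d.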
Note that the method of this embodiment determines the character segmentation positions mainly from the binarized image, but in practice the original image may be kept as well; besides removing the background line from the binarized image, the background line in the original image may also be removed.
When no background straight line is detected in the binarized image, the pixel connected domains located in the central region of the binarized image are fitted to obtain a fitted line, and the tilt correction parameter needed by the subsequent tilt correction step is obtained from the slope and intercept of the fitted line.
Because the binarized image of this embodiment contains a single line of characters, it generally has the following characteristics: the single line of characters usually lies at mid-height in the binarized image; extraneous character information may appear at the top and bottom of the image, but the pixel height it occupies does not exceed the pixel height occupied by the characters to be segmented.
Therefore, when no background straight line is detected, no line removal is needed; instead, the pixel connected domains located in the central region of the binarized image are fitted to obtain a fitted line, from which the tilt correction parameter is determined. The fitted line may lie in the central region of the binarized image. The central region of the binarized image is the region within a specified range of the image's mid-height.
Based on these characteristics, three fixed, parallel horizontal lines can be set in advance in the binarized image, thereby defining three classes. One horizontal line lies at the mid-height of the binarized image; the other two lie in the upper and lower parts of the image, and can typically be set to the top edge and bottom edge of the image respectively. The apparatus clusters the pixel connected domains in the binarized image to obtain those located in the central region. In the binarized image, if a pixel with the same value exists in the eight-neighborhood of a given pixel, the two pixels are considered connected; based on this definition, the apparatus can extract the pixel connected domains of the binarized image.
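The eight-neighborhood definition of connectivity can be turned into a short labeling routine. The sketch below is a plain breadth-first search over character pixels (value 0); `connected_domains` is a hypothetical name, and a production system would more likely call an image library's connected-component function.

```python
from collections import deque
from typing import List, Tuple

Image = List[List[int]]
Pixel = Tuple[int, int]  # (x, y)


def connected_domains(binary: Image) -> List[List[Pixel]]:
    """Group character pixels (value 0) into 8-connected domains."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    domains: List[List[Pixel]] = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 0 or seen[y][x]:
                continue
            # Breadth-first search from an unvisited character pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            pixels: List[Pixel] = []
            while queue:
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and not seen[ny][nx] and binary[ny][nx] == 0):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            domains.append(pixels)
    return domains


image = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
]
domains = connected_domains(image)
```

Here the two diagonal pixels at the top left form one domain because diagonal neighbors count as connected under the eight-neighborhood rule.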
Specifically, for each pixel connected domain, the Euclidean distance from the center of the connected domain to each of the three horizontal lines is computed, the horizontal line with the smallest distance is identified, and the connected domain is assigned to the class of that line. In this way the apparatus can determine which pixel connected domains in the binarized image are clustered into the class of the middle line; these are the pixel connected domains located in the central region of the binarized image.
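This nearest-line assignment can be sketched as follows. For a horizontal line, the Euclidean distance from a point to the line reduces to the vertical distance, which the sketch uses directly. Placing the upper and lower lines on the image edges follows the typical setting mentioned in the text; `cluster_by_line` and the class labels are hypothetical names.

```python
from typing import Dict, List, Tuple

Center = Tuple[float, float]  # (x, y) center of a connected domain


def cluster_by_line(centers: List[Center], image_height: int) -> Dict[str, List[int]]:
    """Assign each connected-domain center to the nearest of three horizontal lines."""
    lines = {
        "top": 0.0,
        "middle": (image_height - 1) / 2,
        "bottom": float(image_height - 1),
    }
    clusters: Dict[str, List[int]] = {name: [] for name in lines}
    for index, (_, cy) in enumerate(centers):
        # Distance from a point to a horizontal line is the vertical distance.
        nearest = min(lines, key=lambda name: abs(cy - lines[name]))
        clusters[nearest].append(index)
    return clusters


centers = [(10, 48), (30, 52), (50, 5), (70, 97)]
clusters = cluster_by_line(centers, image_height=101)
```

The indices listed under `"middle"` identify the connected domains treated as lying in the central region.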
Because the single line of characters to be segmented generally lies at the mid-height of the binarized image, the pixel connected domains corresponding to it are relatively close to the middle line. Even if characters are broken because of their structure (for example, Chinese characters with a top-bottom structure) or because of printing or shooting problems, the connected domains of most characters still lie near the middle line. Connected domains that do not belong to the single line of characters to be segmented are assigned to the classes of the upper and lower lines.
Note that, besides the clustering method above, other methods can be used to obtain the pixel connected domains located in the central region of the binarized image. For example, the extent of the central region can be specified in advance; for each pixel connected domain, the percentage of the connected domain that lies inside the central region is computed, and connected domains whose percentage exceeds a specified ratio threshold are taken to be located in the central region.
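The percentage-based alternative can be sketched in a few lines. The band limits and the 50% ratio threshold below are illustrative assumptions, since the source leaves both unspecified, and `in_central_region` is a hypothetical name.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]  # (x, y)


def in_central_region(pixels: List[Pixel], band_top: int, band_bottom: int,
                      min_ratio: float = 0.5) -> bool:
    """Return True if enough of a connected domain lies inside the central band.

    The band is the inclusive row range band_top..band_bottom.
    """
    inside = sum(1 for _, y in pixels if band_top <= y <= band_bottom)
    return inside / len(pixels) > min_ratio


mostly_central = in_central_region([(0, 4), (0, 5), (0, 6), (0, 9)], 4, 6)
mostly_outside = in_central_region([(0, 0), (0, 1), (0, 5)], 4, 6)
```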
After obtaining the pixel connected domains located in the central region of the binarized image, the apparatus can use their center coordinates and pixel counts as fitting parameters and perform a least-squares fit to obtain the fitted line.
Specifically, for each pixel connected domain located in the central region of the binarized image, the apparatus uses the number of pixels in the connected domain as its weight and the center coordinates of the connected domain as its coordinates, performs a least-squares fit with these fitting parameters to obtain the fitted line, and then obtains the tilt correction parameter from the slope and intercept of the fitted line. This approach reduces the effect on the overall result of connected domains with unusually few or unusually many pixels, which binarization may produce.
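A weighted least-squares line fit of this kind has a closed-form solution, sketched below. The fit minimizes the sum of w_i * (y_i - k * x_i - b)^2 over slope k and intercept b, with each connected domain's pixel count as its weight w_i. The function name is hypothetical and the sample data is made up for illustration.

```python
from typing import List, Tuple

Center = Tuple[float, float]  # (x, y) center of a connected domain


def weighted_line_fit(centers: List[Center], weights: List[float]) -> Tuple[float, float]:
    """Fit y = slope * x + intercept by weighted least squares."""
    total = sum(weights)
    sx = sum(w * x for (x, _), w in zip(centers, weights))
    sy = sum(w * y for (_, y), w in zip(centers, weights))
    sxx = sum(w * x * x for (x, _), w in zip(centers, weights))
    sxy = sum(w * x * y for (x, y), w in zip(centers, weights))
    denominator = total * sxx - sx * sx
    if denominator == 0:
        raise ValueError("all centers share the same x coordinate")
    slope = (total * sxy - sx * sy) / denominator
    intercept = (sy - slope * sx) / total
    return slope, intercept


# Three character centers on a slightly rising text line, weighted by pixel count.
centers = [(0, 10), (10, 11), (20, 12)]
pixel_counts = [30, 40, 35]
slope, intercept = weighted_line_fit(centers, pixel_counts)
```

Because the three sample centers are exactly collinear, the fit recovers a slope of about 0.1 and an intercept of about 10 regardless of the weights.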
103. Perform tilt correction on the binarized image according to the tilt correction parameter.
During image capture, a tilted camera device (such as a camera or mobile phone) can cause the captured characters not to lie on a horizontal line, so the binarized image needs to be rotated to a horizontal orientation in order to segment the characters more accurately. The apparatus therefore performs tilt correction on the binarized image according to the tilt correction parameter obtained in the preceding step.
Specifically, the tilt correction parameter is usually a tilt correction angle, and tilt correction mainly consists of rotating the binarized image by that angle.
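One way to turn the recorded slope into a rotation is sketched below. The sketch assumes the correction angle is the arctangent of the slope, rotates about the image center, uses nearest-neighbor sampling, and fills uncovered pixels with background (1); none of these choices is stated in the source, and `deskew` is a hypothetical name.

```python
import math
from typing import List

Image = List[List[int]]


def deskew(binary: Image, slope: float) -> Image:
    """Rotate the image so that a line with the given slope becomes horizontal."""
    angle = math.atan(slope)  # assumed tilt correction angle, in radians
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    height, width = len(binary), len(binary[0])
    cx, cy = (width - 1) / 2, (height - 1) / 2
    result = [[1] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: find the tilted source pixel for each output pixel.
            dx, dy = x - cx, y - cy
            src_x = round(cx + cos_a * dx - sin_a * dy)
            src_y = round(cy + sin_a * dx + cos_a * dy)
            if 0 <= src_x < width and 0 <= src_y < height:
                result[y][x] = binary[src_y][src_x]
    return result


# A 5 x 5 image whose character pixels lie on a diagonal (slope 1).
tilted = [[0 if x == y else 1 for x in range(5)] for y in range(5)]
level = deskew(tilted, slope=1.0)
```

After correction the diagonal stroke occupies the middle row of `level`; a slope of 0 leaves the image unchanged.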
104. Determine the character region containing the single line of characters according to the pixel connected domains in the corrected binarized image.
After tilt correction, the character region containing the single line of characters is determined from the pixel connected domains in the corrected binarized image. This further narrows the area containing the single line of characters; segmenting within a smaller area further improves segmentation accuracy.
In an optional embodiment, the apparatus clusters the pixel connected domains in the corrected binarized image to obtain those located in its central region, and then removes the isolated pixel connected domains from among them to obtain the character region.
Specifically, if three parallel lines were preset in the preceding step, the apparatus can in this step cluster the pixel connected domains directly on the basis of those three lines in the corrected binarized image, obtaining the connected domains located in its central region. Otherwise, three fixed, parallel horizontal lines must first be set in the corrected binarized image, thereby defining three classes: one horizontal line lies at the mid-height of the binarized image, and the other two lie in its upper and lower parts, typically at its top edge and bottom edge respectively. The pixel connected domains are then clustered on the basis of the three lines set in the corrected binarized image, obtaining those located in its central region.
Specifically, for each pixel connected domain, the Euclidean distance from the center of the connected domain to each of the three horizontal lines is computed, the horizontal line with the smallest distance is identified, and the connected domain is assigned to the class of that line. In this way the apparatus can determine which pixel connected domains in the corrected binarized image are clustered into the class of the middle line; these are the pixel connected domains located in the central region of the corrected binarized image.
Note that, because the binarized image has already been tilt-corrected, pixel connected domains that were displaced from the image's mid-height by a tilted shooting angle, and might therefore have been assigned to other classes, can now be correctly clustered into the class of the middle line.
Note also that, besides the clustering method above, other methods can be used to obtain the pixel connected domains located in the central region of the corrected binarized image. For example, the extent of the central region can be specified in advance; for each pixel connected domain, the percentage of the connected domain that lies inside the central region is computed, and connected domains whose percentage exceeds a specified ratio threshold are taken to be located in the central region.
The pixel connected domains located in the central region of the corrected binarized image may include isolated ones, which generally represent interference. These isolated connected domains are removed, and the remaining connected domains are taken as the character region containing the single line of characters. An isolated pixel connected domain is one whose distance to every other pixel connected domain exceeds a preset distance threshold.
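The isolation rule can be sketched as follows. The source does not say how the distance between two connected domains is measured, so this sketch assumes the distance between their centers; it also keeps a connected domain that has no neighbors to compare against. Both choices, and the name `remove_isolated`, are assumptions.

```python
import math
from typing import List, Tuple

Center = Tuple[float, float]  # (x, y) center of a connected domain


def remove_isolated(centers: List[Center], max_gap: float) -> List[int]:
    """Return the indices of connected domains that are not isolated.

    A domain is isolated when its distance to every other domain exceeds
    max_gap (distance measured between centers, an assumed definition).
    """
    kept = []
    for i, (xi, yi) in enumerate(centers):
        distances = [math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(centers) if j != i]
        if not distances or min(distances) <= max_gap:
            kept.append(i)
    return kept


# Three evenly spaced characters and one stray blob far to the right.
centers = [(10, 50), (30, 50), (50, 51), (200, 50)]
kept = remove_isolated(centers, max_gap=40)
```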
105, Character segmentation position is determined based on above-mentioned character zone.
Character zone determined by based on determines that Character segmentation position, beneficially raising determine Character segmentation position The accuracy put.
In an optional embodiment, the apparatus for segmenting image characters takes the columns of the character region that consist entirely of background pixels as vertical dividing lines, and determines the character segmentation position from the vertical dividing lines together with the highest and lowest pixel positions in each sub-region separated by those lines.
Specifically, the apparatus examines each column of pixels in the character region. If every pixel in a column is a background pixel (that is, has the value 1), the column is taken as a vertical dividing line for splitting the single-row characters vertically. Next, the highest and lowest pixel positions in each sub-region separated by the vertical dividing lines are determined, and the vertical dividing lines together with these highest and lowest pixel positions define the bounding rectangle of the sub-region. This bounding rectangle constitutes the character segmentation position for that sub-region.
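The column scan described above can be sketched as follows, assuming the character region is a list of rows of 0/1 values with background equal to 1, as stated in the text. The function name and the `(left, top, right, bottom)` box layout with inclusive coordinates are illustrative choices.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def segment_columns(region: List[List[int]], background: int = 1) -> List[Box]:
    """Split a character region at all-background columns and return the
    bounding rectangle of each sub-region between dividing lines."""
    if not region or not region[0]:
        return []
    height, width = len(region), len(region[0])
    # A column is a vertical dividing line when every pixel is background.
    is_divider = [all(region[r][c] == background for r in range(height))
                  for c in range(width)]
    boxes = []
    c = 0
    while c < width:
        if is_divider[c]:
            c += 1
            continue
        left = c
        while c < width and not is_divider[c]:
            c += 1
        right = c - 1
        # Highest and lowest rows holding a foreground pixel in this run.
        rows = [r for r in range(height)
                if any(region[r][x] != background
                       for x in range(left, right + 1))]
        boxes.append((left, rows[0], right, rows[-1]))
    return boxes
```

Touching characters produce no all-background column between them and therefore end up in one box, while a broken character may yield several boxes; both cases are discussed further in the text.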
In this embodiment, the original image containing the single-row characters to be segmented is binarized to obtain a binarized image, and straight line detection is performed on the binarized image. On the one hand, this yields the tilt correction parameter; on the other hand, when a straight line is detected it is removed, eliminating the interference of straight lines with segmentation accuracy. The binarized image is then tilt-corrected according to the tilt correction parameter, reducing the effect of image tilt on segmentation accuracy. Next, the character region containing the single-row characters is determined from the pixel connected domains in the corrected binarized image, and the character segmentation position is determined based on that character region. Narrowing the area of the image occupied by the single-row characters, and determining the segmentation position within that narrowed area, further improves the accuracy of the determined segmentation position. This embodiment can therefore improve the accuracy with which character segmentation positions are determined.
In addition, the method provided by this embodiment requires no user participation; determining the character segmentation position is simple to perform and easy to implement, and character segmentation is efficient.
After the character segmentation position of the single-row characters has been determined, a character recognition engine may further perform character recognition according to this segmentation position. Because the segmentation position provided by this embodiment is relatively accurate, the recognition result has higher confidence. It is worth noting that the character recognition engine may, according to the segmentation position determined in the above embodiment, perform character recognition on the grayscale image from which the background straight line has been removed; alternatively, it may perform character recognition on the corrected binarized image according to that segmentation position.
It is worth noting that, although the character segmentation positions determined in this embodiment are relatively accurate, the following three situations may still arise for a bounding rectangle, as shown in Fig. 4:
1) the bounding rectangle contains exactly one complete character;
2) characters are stuck together, so that the bounding rectangle contains multiple characters;
3) a character is broken because the printing or the shot is unclear, so that one character is split across multiple bounding rectangles.
Based on the above, when a bounding rectangle contains one complete character, the character recognition engine can recognize that character with high confidence. When a bounding rectangle encloses too many or too few characters, the character recognition engine may be unable to recognize the characters in the bounding rectangle and returns recognition failure information; it may further return the reason for the segmentation failure, for example that too many or too few characters were enclosed.
Based on the above, the apparatus for segmenting image characters may adjust the character segmentation position according to the recognition failure information in combination with the segmentation failure reason. For example, a bounding rectangle enclosing too many characters is split, and bounding rectangles enclosing too few characters are merged using the positional relationship between bounding rectangles, so that the character recognition engine can perform character recognition again according to the adjusted segmentation position, improving recognition accuracy.
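One possible form of this adjustment is sketched below. The text does not specify how a rectangle is split or which rectangles are merged, so the rules here are assumptions made only for illustration: a rectangle flagged `"too_many"` is halved at its horizontal midpoint, and two neighbouring rectangles both flagged `"too_few"` are merged into their common bounding rectangle. The reason strings and the function name are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def adjust_boxes(boxes: List[Box], reasons: List[str]) -> List[Box]:
    """Adjust left-to-right ordered boxes using per-box failure reasons
    ("ok", "too_many", "too_few"); the rules are illustrative only."""
    adjusted = []
    i = 0
    while i < len(boxes):
        left, top, right, bottom = boxes[i]
        if reasons[i] == "too_many" and right > left:
            # Assumed rule: split a crowded box at its horizontal midpoint.
            mid = (left + right) // 2
            adjusted.append((left, top, mid, bottom))
            adjusted.append((mid + 1, top, right, bottom))
            i += 1
        elif (reasons[i] == "too_few" and i + 1 < len(boxes)
              and reasons[i + 1] == "too_few"):
            # Assumed rule: merge two neighbouring fragments into one box.
            n_left, n_top, n_right, n_bottom = boxes[i + 1]
            adjusted.append((min(left, n_left), min(top, n_top),
                             max(right, n_right), max(bottom, n_bottom)))
            i += 2
        else:
            adjusted.append(boxes[i])
            i += 1
    return adjusted
```

The adjusted boxes would then be passed back to the recognition engine; a real implementation would likely choose split points from the pixel content rather than the midpoint.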
It should be noted that for aforesaid each method embodiment, in order to be briefly described, therefore by its all table Stating as a series of combination of actions, but those skilled in the art should know, the application is by being retouched The restriction of the sequence of movement stated because according to the application, some step can use other orders or with Shi Jinhang.Secondly, those skilled in the art also should know, embodiment described in this description all belongs to In preferred embodiment, necessary to involved action and module not necessarily the application.
In the above-described embodiments, the description to each embodiment all emphasizes particularly on different fields, and does not has in certain embodiment The part described in detail, may refer to the associated description of other embodiments.
Fig. 5 is a schematic structural diagram of an apparatus for segmenting image characters provided by an embodiment of the application. As shown in Fig. 5, the apparatus includes: a binarization processing module 51, a straight line detection module 52, a tilt correction module 53, a character region determining module 54, and a segmentation position determining module 55.
The binarization processing module 51 is configured to perform binarization processing on the original image in which the single-row characters to be segmented are located, to obtain a binarized image.
The straight line detection module 52 is connected to the binarization processing module 51 and is configured to perform straight line detection processing on the binarized image obtained by the binarization processing module 51, to obtain a tilt correction parameter, and, when a background straight line is detected, to remove the background straight line from the binarized image.
The tilt correction module 53 is connected to the straight line detection module 52 and is configured to perform tilt correction on the binarized image according to the tilt correction parameter obtained by the straight line detection module 52.
The character region determining module 54 is connected to the tilt correction module 53 and is configured to determine the character region in which the single-row characters are located according to the pixel connected domains in the binarized image corrected by the tilt correction module 53.
The segmentation position determining module 55 is connected to the character region determining module 54 and is configured to determine the character segmentation position based on the character region determined by the character region determining module 54.
In an optional embodiment, as shown in Fig. 6, one implementation structure of the straight line detection module 52 includes: a detection unit 521, a first obtaining unit 522, a second obtaining unit 523, and a line removal unit 524.
The detection unit 521 is configured to perform a Hough transform on the binarized image to detect whether a background straight line exists in the binarized image.
The first obtaining unit 522 is connected to the detection unit 521 and is configured to, when the detection unit 521 detects that a background straight line exists in the binarized image, obtain the slope and intercept of the background straight line and obtain the tilt correction parameter according to that slope and intercept.
The second obtaining unit 523 is connected to the detection unit 521 and is configured to, when the detection unit 521 detects that no background straight line exists in the binarized image, perform fitting processing on the pixel connected domains located in the central region of the binarized image to obtain a fitted straight line, and obtain the tilt correction parameter according to the slope and intercept of the fitted straight line.
The line removal unit 524 is connected to the detection unit 521 and is configured to, when the detection unit 521 detects that a background straight line exists in the binarized image, remove the background straight line from the binarized image.
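The roles of the first obtaining unit 522 and the line removal unit 524 can be illustrated with a short sketch. It assumes the slope and intercept of the background straight line have already been obtained from a Hough transform, that the tilt correction parameter is a rotation angle derived from the slope (the text does not state the exact form of the parameter), and that background pixels have the value 1. Both function names and the `half_width` tolerance are hypothetical.

```python
import math
from typing import List


def tilt_angle_degrees(slope: float) -> float:
    """Angle of a line y = slope * x + intercept relative to the x axis.
    Using this angle as the tilt correction parameter is an assumption."""
    return math.degrees(math.atan(slope))


def remove_line(image: List[List[int]], slope: float, intercept: float,
                half_width: float = 1.0, background: int = 1) -> List[List[int]]:
    """Return a copy of the image in which pixels within half_width rows
    of the line y = slope * x + intercept are set to background."""
    cleaned = [row[:] for row in image]
    height = len(cleaned)
    width = len(cleaned[0]) if cleaned else 0
    for x in range(width):
        y_line = slope * x + intercept
        for y in range(height):
            if abs(y - y_line) <= half_width:
                cleaned[y][x] = background
    return cleaned
```

Note that this simple erase also clears character strokes that cross the line; preserving such strokes would need additional logic that is outside the scope of this sketch.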
In an optional embodiment, the second obtaining unit 523 is specifically configured to:
perform clustering processing on the pixel connected domains in the binarized image to obtain the pixel connected domains located in the central region of the binarized image;
use the center coordinates and pixel counts of the pixel connected domains located in the central region of the binarized image as fitting parameters and perform least squares fitting to obtain the fitted straight line.
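The fitting step can be sketched as a weighted least squares line fit. The text names the center coordinates and pixel counts as fitting parameters without stating how the pixel counts enter the fit; the sketch assumes each center is weighted by its pixel count, so that larger connected domains influence the line more. The function name is hypothetical.

```python
from typing import List, Tuple


def weighted_line_fit(centers: List[Tuple[float, float]],
                      pixel_counts: List[int]) -> Tuple[float, float]:
    """Fit y = slope * x + intercept to domain centers (x, y), weighting
    each center by its pixel count (weighting scheme assumed)."""
    total = sum(pixel_counts)
    if total <= 0:
        raise ValueError("pixel counts must sum to a positive value")
    mean_x = sum(w * x for (x, _y), w in zip(centers, pixel_counts)) / total
    mean_y = sum(w * y for (_x, y), w in zip(centers, pixel_counts)) / total
    sxx = sum(w * (x - mean_x) ** 2
              for (x, _y), w in zip(centers, pixel_counts))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for (x, y), w in zip(centers, pixel_counts))
    if sxx == 0:
        raise ValueError("all centers share one x coordinate; slope undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

The resulting slope and intercept play the same role as those of a detected background straight line when deriving the tilt correction parameter.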
In an optional embodiment, the character region determining module 54 is specifically configured to:
perform clustering processing on the pixel connected domains in the corrected binarized image to obtain the pixel connected domains located in the central region of the corrected binarized image;
remove the isolated pixel connected domains from the pixel connected domains located in the central region of the corrected binarized image to obtain the character region.
In an optional embodiment, the segmentation position determining module 55 is specifically configured to:
take the columns of the character region that consist entirely of background pixels as vertical dividing lines;
determine the character segmentation position according to the vertical dividing lines and the highest and lowest pixel positions in each sub-region separated by the vertical dividing lines.
In an optional embodiment, as shown in Fig. 6, the apparatus further includes a photographing module 56, configured to photograph the single-row characters to obtain the original image when it is detected that the single-row characters are located within a photographing prompt frame. The photographing module 56 is connected to the binarization processing module 51 and provides the original image to the binarization processing module 51.
The apparatus for segmenting image characters provided by this embodiment may be any device that needs to recognize and display images, such as a camera, mobile phone, computer, or iPad.
The apparatus for segmenting image characters provided by this embodiment binarizes the original image containing the single-row characters to be segmented to obtain a binarized image, and performs straight line detection on the binarized image. On the one hand, this yields the tilt correction parameter; on the other hand, when a straight line is detected it is removed, eliminating the interference of straight lines with segmentation accuracy. The binarized image is then tilt-corrected according to the tilt correction parameter, reducing the effect of image tilt on segmentation accuracy. Next, the character region containing the single-row characters is determined from the pixel connected domains in the corrected binarized image, and the character segmentation position is determined based on that character region. Narrowing the area of the image occupied by the single-row characters, and determining the segmentation position within that narrowed area, further improves the accuracy of the determined segmentation position. Using the apparatus provided by this embodiment to segment single-row characters can therefore improve the accuracy with which character segmentation positions are determined.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in the application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the application. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the application.

Claims (12)

1. A method for segmenting image characters, characterized by comprising:
performing binarization processing on an original image in which single-row characters to be segmented are located, to obtain a binarized image;
performing straight line detection processing on the binarized image to obtain a tilt correction parameter, and, when a background straight line is detected, removing the background straight line from the binarized image;
performing tilt correction on the binarized image according to the tilt correction parameter;
determining, according to pixel connected domains in the corrected binarized image, a character region in which the single-row characters are located;
determining a character segmentation position based on the character region.
2. The method according to claim 1, characterized in that performing straight line detection processing on the binarized image to obtain the tilt correction parameter comprises:
performing a Hough transform on the binarized image to detect whether a background straight line exists in the binarized image;
when it is detected that a background straight line exists in the binarized image, obtaining the slope and intercept of the background straight line, and obtaining the tilt correction parameter according to the slope and intercept of the background straight line;
when it is detected that no background straight line exists in the binarized image, performing fitting processing on the pixel connected domains located in the central region of the binarized image to obtain a fitted straight line, and obtaining the tilt correction parameter according to the slope and intercept of the fitted straight line.
3. The method according to claim 2, characterized in that performing fitting processing on the pixel connected domains located in the central region of the binarized image to obtain the fitted straight line comprises:
performing clustering processing on the pixel connected domains in the binarized image to obtain the pixel connected domains located in the central region of the binarized image;
using the center coordinates and pixel counts of the pixel connected domains located in the central region of the binarized image as fitting parameters and performing least squares fitting to obtain the fitted straight line.
4. The method according to claim 1, characterized in that determining, according to the pixel connected domains in the corrected binarized image, the character region in which the single-row characters are located comprises:
performing clustering processing on the pixel connected domains in the corrected binarized image to obtain the pixel connected domains located in the central region of the corrected binarized image;
removing the isolated pixel connected domains from the pixel connected domains located in the central region of the corrected binarized image to obtain the character region.
5. The method according to any one of claims 1 to 4, characterized in that determining the character segmentation position based on the character region comprises:
taking the columns of the character region that consist entirely of background pixels as vertical dividing lines;
determining the character segmentation position according to the vertical dividing lines and the highest and lowest pixel positions in each sub-region separated by the vertical dividing lines.
6. The method according to any one of claims 1 to 4, characterized in that, before performing binarization processing on the original image in which the single-row characters to be segmented are located to obtain the binarized image, the method further comprises:
when it is detected that the single-row characters are located within a photographing prompt frame, photographing the single-row characters to obtain the original image.
7. An apparatus for segmenting image characters, characterized by comprising:
a binarization processing module, configured to perform binarization processing on an original image in which single-row characters to be segmented are located, to obtain a binarized image;
a straight line detection module, configured to perform straight line detection processing on the binarized image to obtain a tilt correction parameter, and, when a background straight line is detected, to remove the background straight line from the binarized image;
a tilt correction module, configured to perform tilt correction on the binarized image according to the tilt correction parameter;
a character region determining module, configured to determine, according to pixel connected domains in the corrected binarized image, a character region in which the single-row characters are located;
a segmentation position determining module, configured to determine a character segmentation position based on the character region.
8. The apparatus according to claim 7, characterized in that the straight line detection module comprises:
a detection unit, configured to perform a Hough transform on the binarized image to detect whether a background straight line exists in the binarized image;
a first obtaining unit, configured to, when it is detected that a background straight line exists in the binarized image, obtain the slope and intercept of the background straight line, and obtain the tilt correction parameter according to the slope and intercept of the background straight line;
a second obtaining unit, configured to, when it is detected that no background straight line exists in the binarized image, perform fitting processing on the pixel connected domains located in the central region of the binarized image to obtain a fitted straight line, and obtain the tilt correction parameter according to the slope and intercept of the fitted straight line;
a line removal unit, configured to, when it is detected that a background straight line exists in the binarized image, remove the background straight line from the binarized image.
9. The apparatus according to claim 8, characterized in that the second obtaining unit is specifically configured to:
perform clustering processing on the pixel connected domains in the binarized image to obtain the pixel connected domains located in the central region of the binarized image;
use the center coordinates and pixel counts of the pixel connected domains located in the central region of the binarized image as fitting parameters and perform least squares fitting to obtain the fitted straight line.
10. The apparatus according to claim 7, characterized in that the character region determining module is specifically configured to:
perform clustering processing on the pixel connected domains in the corrected binarized image to obtain the pixel connected domains located in the central region of the corrected binarized image;
remove the isolated pixel connected domains from the pixel connected domains located in the central region of the corrected binarized image to obtain the character region.
11. The apparatus according to any one of claims 7 to 10, characterized in that the segmentation position determining module is specifically configured to:
take the columns of the character region that consist entirely of background pixels as vertical dividing lines;
determine the character segmentation position according to the vertical dividing lines and the highest and lowest pixel positions in each sub-region separated by the vertical dividing lines.
12. The apparatus according to any one of claims 7 to 10, characterized by further comprising:
a photographing module, configured to, when it is detected that the single-row characters are located within a photographing prompt frame, photograph the single-row characters to obtain the original image.
CN201510031629.7A 2015-01-22 2015-01-22 The method and device of segmented image character Active CN105868759B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510031629.7A CN105868759B (en) 2015-01-22 2015-01-22 The method and device of segmented image character

Publications (2)

Publication Number Publication Date
CN105868759A true CN105868759A (en) 2016-08-17
CN105868759B CN105868759B (en) 2019-11-05

Family

ID=56623193

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510031629.7A Active CN105868759B (en) 2015-01-22 2015-01-22 The method and device of segmented image character

Country Status (1)

Country Link
CN (1) CN105868759B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101847209A (en) * 2010-06-01 2010-09-29 福建新大陆电脑股份有限公司 Character image correction method
CN102236791A (en) * 2011-07-14 2011-11-09 青岛海信网络科技股份有限公司 Method for subdividing characters of slant license plate
CN102915522A (en) * 2012-09-12 2013-02-06 康佳集团股份有限公司 Smart phone name card extraction system and realization method thereof
CN103440492A (en) * 2013-09-05 2013-12-11 湖北省烟草公司荆州市公司 Hand-held cigarette recognizer

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
杜晓刚: ""车牌识别***中牌照定位、倾斜校正及字符分割技术的研究"", 《中国优秀硕士学位论文全文数据库-信息科技辑》 *
金晅红、王海: ""一种改进的图像水平倾斜角度测量算法的应用"", 《传感器与微***》 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106372638A (en) * 2016-08-31 2017-02-01 佛山绿怡信息科技有限公司 Method and system for identifying brand and model of electronic equipment
CN106778744A (en) * 2016-12-17 2017-05-31 北京小米移动软件有限公司 A kind of method and apparatus of the information of ID card
CN106991667A (en) * 2017-03-08 2017-07-28 浙江大学 A kind of prawn integrality method of discrimination for building characteristics of image spectrum
CN107886093A (en) * 2017-11-07 2018-04-06 广东工业大学 A kind of character detection method, system, equipment and computer-readable storage medium
CN108846802A (en) * 2018-01-25 2018-11-20 湖南省自兴人工智能研究院 A kind of removal chromosomal G-banding mid-term gray level image Noise Method
CN108537229A (en) * 2018-04-24 2018-09-14 大连民族大学 Block letter language of the Manchus recognition methods based on language of the Manchus component cutting
CN108805116B (en) * 2018-05-18 2022-06-24 浙江蓝鸽科技有限公司 Image text detection method and system
CN108805116A (en) * 2018-05-18 2018-11-13 浙江蓝鸽科技有限公司 Image text detection method and its system
CN109977959A (en) * 2019-03-29 2019-07-05 国家电网有限公司 A kind of train ticket character zone dividing method and device
CN110766630A (en) * 2019-10-18 2020-02-07 天津津航计算技术研究所 Character edge tracing device based on assembly line
CN112890736A (en) * 2019-12-03 2021-06-04 精微视达医疗科技(武汉)有限公司 Method and device for obtaining field mask of endoscopic imaging system
CN112890736B (en) * 2019-12-03 2023-06-09 精微视达医疗科技(武汉)有限公司 Method and device for obtaining field mask of endoscopic imaging system
CN113449729A (en) * 2020-03-26 2021-09-28 富士通株式会社 Image processing apparatus, image processing method, and storage medium for eliminating lines
CN111582259A (en) * 2020-04-10 2020-08-25 支付宝实验室(新加坡)有限公司 Machine-readable code identification method and device, electronic equipment and storage medium
CN111582259B (en) * 2020-04-10 2024-04-16 支付宝实验室(新加坡)有限公司 Machine-readable code identification method, device, electronic equipment and storage medium
CN111860502A (en) * 2020-07-15 2020-10-30 北京思图场景数据科技服务有限公司 Picture table identification method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN105868759B (en) 2019-11-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20201010

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201010

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Patentee before: Alibaba Group Holding Ltd.