CN102567732A - Method and system for detecting document setting type - Google Patents


Info

Publication number
CN102567732A
Authority
CN
China
Prior art keywords
file
picture
document
boundary rectangle
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011104457934A
Other languages
Chinese (zh)
Other versions
CN102567732B (en)
Inventor
胡希驰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Founder International Co Ltd
Founder International Beijing Co Ltd
Original Assignee
Founder International Co Ltd
Founder International Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Founder International Co Ltd, Founder International Beijing Co Ltd filed Critical Founder International Co Ltd
Priority to CN 201110445793 priority Critical patent/CN102567732B/en
Publication of CN102567732A publication Critical patent/CN102567732A/en
Application granted granted Critical
Publication of CN102567732B publication Critical patent/CN102567732B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a system for detecting the typesetting type of a document, and belongs to the field of document layout detection. The method exploits the parallelism and periodicity of the text rows/columns in a document, together with the fact that the line pitch is generally greater than the character pitch. The center-point coordinates of the minimum bounding rectangles of the character connected components are used as the input point set of a Hough transform; votes are accumulated in the Hough parameter space, the voting extrema are analyzed for periodicity, and the typesetting type of the document is determined from the periodicity of the maxima in different directions of the parameter space. The method and system can determine the typesetting type under the various layout conditions of a document.

Description

Method and system for detecting the typesetting type of a document
Technical field
The present invention relates to the field of document layout detection, and in particular to a method and a system for detecting the typesetting type of a document.
Background art
Document images are typeset either horizontally or vertically. In more complex layouts, horizontal and vertical typesetting may be mixed, that is, one region is set horizontally and another region is set vertically. The typesetting direction is an important piece of information in layout analysis, and many algorithms depend on it and must be adjusted accordingly. In high-volume book processing, entering this information manually is tedious, hinders automation of the overall process, and reduces overall processing efficiency.
A commonly used way to determine the typesetting mode of a document is to project the black pixels of the document image along the row direction and along the column direction. If the projection in one direction shows a larger variance, or separated projection peaks, the corresponding typesetting direction can be inferred. This approach, however, is easily affected by noise, illustrations, and similar factors. The patent with application number 200910084862.6, entitled "Method for judging typesetting directions of text regions", proposes one such method: it uses projection, performs a statistical analysis of the resulting projection histogram, and decides whether the text is set horizontally or vertically from the aspect ratio of the bounding rectangle of the text region. Although that method can decide whether the basic typesetting type of a document is horizontal or vertical, it cannot handle mixed typesetting (both horizontal and vertical text present), and it is likewise easily affected by noise, illustrations, and similar factors.
Summary of the invention
In view of the defects of the prior art, the object of the present invention is to provide a method and a system for detecting the typesetting type of a document that determine the various typesetting modes of a document image by analyzing the periodicity of the character arrangement in the Hough transform parameter space.
To achieve this object, the present invention adopts the following technical solution:
A method for detecting the typesetting type of a document comprises the following steps:
(1) selecting a document image to be detected, and binarizing the document image to obtain a binary image;
(2) computing the connected components of the binary image, and computing the minimum bounding rectangle of each connected component and the center-point coordinates of that rectangle;
(3) using the center-point coordinates of the minimum bounding rectangles as the input point set of the Hough transform, and voting in the Hough transform parameter space to compute the accumulator matrix A(θ, ρ);
wherein θ is the X axis of the accumulator matrix A(θ, ρ); θ is the angle between the positive X axis and the normal of the straight line through the center points of the minimum bounding rectangles of each row or column of the document image, 0° ≤ θ ≤ 180°; ρ is the distance between the X axis and the straight line through the center points of the minimum bounding rectangles of each row or column in the document image space, -r ≤ ρ ≤ +r, where r is half the length of the diagonal of the document image;
(4) detecting the voting extrema of the accumulator matrix, analyzing the extrema for periodicity, and determining the typesetting type of the document from the periodicity of the extrema in the different θ directions.
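For orientation, the parameters θ and ρ in step (3) can be read against the standard normal-form line equation of the Hough transform. The source does not write this equation out; placing the coordinate origin at the image center is an assumption made here because it is consistent with the stated range -r ≤ ρ ≤ +r, with r half the image diagonal.

```latex
\[
\rho = x\cos\theta + y\sin\theta,
\qquad 0^\circ \le \theta \le 180^\circ,
\qquad -r \le \rho \le r
\]
\[
\theta = 0^\circ \;\Rightarrow\; \rho = x
\quad(\text{vertical lines, i.e. text columns}),
\qquad
\theta = 90^\circ \;\Rightarrow\; \rho = y
\quad(\text{horizontal lines, i.e. text rows})
\]
```

Under this reading, every center point (x, y) casts one vote per θ, and center points lying on the same text row all vote for the same ρ at θ = 90°.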
Further, in the method described above, in step (1) the document image is preprocessed before binarization; the preprocessing includes gray-level adjustment and noise reduction.
Further, in the method described above, in step (3), while voting in the Hough transform parameter space to compute the accumulator matrix, the correspondence between the voting points in the parameter space and the points in the original document image space is recorded.
Further, in the method described above, the extrema are the maxima of the parameter space, and analyzing the extrema for periodicity means analyzing periodicity in the θ = 0° and θ = 90° directions.
Further, in the method described above, in step (4) the periodicity analysis of the extrema determines the typesetting type of the document as follows:
a) when the extrema are periodic in only one angular direction, periodicity at θ = 90° indicates horizontal typesetting, and periodicity at θ = 0° indicates vertical typesetting;
b) when the extrema are periodic in both angular directions, the type is determined as follows:
b1) if there is only one periodic sequence in each of the 0° and 90° directions, the document has a single typesetting type; if the period at θ = 0° is greater than the period at θ = 90°, the document is purely vertical, and if the period at θ = 0° is less than the period at θ = 90°, the document is purely horizontal;
b2) if there are two or more periodic sequences in the 0° or the 90° direction, the typesetting is mixed.
Further, in the method described above, the period of the periodic extrema is greater than the length or width of the minimum bounding rectangle of a character in the document image, and less than k times that length or width, 2 ≤ k ≤ 6.
Further, in the method described above, the length or width of the minimum bounding rectangle of a character is the maximum among the lengths or widths of the minimum bounding rectangles of all connected components in the document.
Still further, in the method described above, the threshold for the vote value of a maximum lies in the range (3, 10).
Further, in the method described above, the preferred threshold for the vote value of a maximum is 5.
A system for detecting the typesetting type of a document comprises:
a binarization device, configured to binarize the document image to be detected to obtain a binary image;
a connected-component computing device, configured to compute the connected components of the binary image, and to compute the minimum bounding rectangle of each connected component and the center-point coordinates of that rectangle;
a Hough transform device, configured to use the center-point coordinates of the minimum bounding rectangles as the input point set of the Hough transform, and to vote in the Hough transform parameter space to compute the accumulator matrix A(θ, ρ);
wherein θ is the X axis of the accumulator matrix A(θ, ρ); θ is the angle between the positive X axis and the normal of the straight line through the center points of the minimum bounding rectangles of each row or column of the document image, 0° ≤ θ ≤ 180°; ρ is the distance between the X axis and the straight line through the center points of the minimum bounding rectangles of each row or column in the document image space, -r ≤ ρ ≤ +r, where r is half the length of the diagonal of the document image;
a typesetting-type determining device, configured to detect the voting extrema of the accumulator matrix, analyze the extrema for periodicity, and determine the typesetting type of the document from the periodicity of the extrema in the different θ directions.
The effect of the present invention is as follows. The method and system are based on the parallelism and periodicity of the text rows/columns in a document image and on the fact that the line pitch is generally greater than the character pitch. The centers of the minimum bounding rectangles of the character connected components are used as input data, and the periodicity of the character arrangement is analyzed in the Hough transform parameter space to determine the typesetting mode of the document. The invention can thus judge whether a document image is typeset horizontally, vertically, or in a mixed layout, which overcomes the inability of existing typesetting-type detection methods to handle mixed typesetting.
Brief description of the drawings
Fig. 1 is a structural block diagram of the system for detecting the typesetting type of a document according to the present invention;
Fig. 2 is a flowchart of the method for detecting the typesetting type of a document according to the present invention;
Fig. 3 is the horizontally typeset document image to be detected in the embodiment;
Fig. 4 is a schematic diagram of the minimum bounding rectangles of the connected components of the document image in Fig. 3;
Fig. 5 shows the result of the periodicity analysis of the extrema in the Hough transform parameter space for the document image in Fig. 3;
Fig. 6 is the vertically typeset document image to be detected in the embodiment;
Fig. 7 shows the result of the periodicity analysis of the extrema in the Hough transform parameter space for the document image in Fig. 6;
Fig. 8 is the mixed-typesetting document image to be detected in the embodiment;
Fig. 9 shows the result of the periodicity analysis of the extrema in the Hough transform parameter space for the document image in Fig. 8.
Detailed description of the embodiments
The main idea of the present invention is as follows. The method and system rely mainly on the parallelism and periodicity of the text rows (columns) in a document, and on the fact that the line pitch is generally greater than the character pitch, to determine the typesetting type of the document. The centers of the bounding rectangles of the character connected components are used as input data, the periodicity of the character arrangement is analyzed in the parameter space of the Hough transform, and the relationship between line pitch and character pitch is used in the decision. At 90 degrees and 0 degrees in the parameter space, two columns of peak points appear; these points are generally arranged periodically, and the period of their spacing represents the line pitch or the character pitch. For a document whose characters are aligned in only one direction, the extrema are periodic at only one angle. If the characters are aligned in both directions, the size of the periods can be used for the decision, because the line pitch of a typical document is greater than its character pitch. In this way, the horizontally and vertically typeset regions of a mixed-layout image can also be determined.
The present invention is described in further detail below with reference to the drawings and the embodiments.
Fig. 1 shows a structural block diagram of the system for detecting the typesetting type of a document according to the present invention. As the figure shows, the system mainly comprises the following devices:
Binarization device 11: configured to binarize the document image to be detected to obtain a binary image;
Connected-component computing device 12: configured to compute the connected components of the binary image, and to compute the minimum bounding rectangle of each connected component and the center-point coordinates of that rectangle;
Hough transform device 13: configured to use the center-point coordinates of the minimum bounding rectangles as the input point set of the Hough transform, and to vote in the Hough transform parameter space to compute the accumulator matrix A(θ, ρ);
Typesetting-type determining device 14: configured to detect the voting extrema of the accumulator matrix, analyze the extrema for periodicity, and determine the typesetting type of the document from the periodicity of the extrema in the different θ directions.
Fig. 2 shows a flowchart of the method for detecting the typesetting type of a document, based on the detection system of Fig. 1. As the figure shows, the method mainly comprises the following steps:
Step S21: binarize the document image to obtain a binary image;
Select a document image to be detected, such as the one in Fig. 3, and binarize it to obtain a binary image. The Hough transform algorithm is mainly applied to binary pixels, and the method of the present invention uses the center points of the minimum bounding rectangles of characters or radicals as input points. Obtaining the center points requires computing connected components, and the document image to be detected must be converted into a binary image before connected components can be computed. Because the noise commonly found in document images strongly affects the quality of the detection result, the document image is generally preprocessed before binarization; the preprocessing includes gray-level adjustment and noise reduction.
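Step S21 can be sketched in a few lines. The source does not specify a binarization algorithm, so the fixed global threshold below, the `binarize` function name, and the convention that dark pixels become 1 are all assumptions for illustration; the gray-level adjustment and noise reduction mentioned above are omitted.

```python
def binarize(gray, threshold=128):
    """Return a 0/1 image: 1 marks dark (ink) pixels, 0 marks background.

    `gray` is a list of rows of 0-255 gray levels. The fixed threshold is an
    illustrative assumption, not a value taken from the source.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


# Tiny hypothetical gray-level image: small numbers are dark ink.
gray = [
    [255, 250, 30, 240],
    [245, 20, 25, 235],
    [250, 240, 245, 10],
]
binary = binarize(gray)
```

In practice an adaptive or histogram-based threshold would likely be more robust on scanned pages; the fixed value only keeps the sketch short.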
Step S22: compute the connected components of the binary image, the minimum bounding rectangle of each connected component, and the center-point coordinates of each minimum bounding rectangle;
Compute the connected components of the binary image, and compute the minimum bounding rectangle of each connected component and the center-point coordinates of that rectangle. Connected-component labeling of a binary image uses existing techniques and is not described in detail in this embodiment. The result of computing the connected components of the document image in Fig. 3 and their minimum bounding rectangles is shown in Fig. 4.
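Since the source leaves connected-component labeling to existing techniques, the following is only one possible sketch of step S22: a breadth-first flood fill that returns the center and size of each component's bounding rectangle. The use of 8-connectivity, the axis-aligned rectangle, and the name `bounding_box_centers` are assumptions.

```python
from collections import deque


def bounding_box_centers(binary):
    """Label 8-connected components of 1-pixels and return, for each one,
    (center_x, center_y, width, height) of its axis-aligned bounding rectangle."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    results = []
    for y0 in range(height):
        for x0 in range(width):
            if binary[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            results.append(((min_x + max_x) / 2, (min_y + max_y) / 2,
                            max_x - min_x + 1, max_y - min_y + 1))
    return results


# Two separate blobs: a 2x2 square and a 1x2 vertical bar.
example = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
boxes = bounding_box_centers(example)
```

The width and height are returned alongside the centers because the later period check compares periods against character size.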
Step S23: vote in the parameter space, using the center-point coordinates of the minimum bounding rectangles as the input point set of the Hough transform;
The center-point coordinates of the minimum bounding rectangles of the connected components computed in step S22 are used as the input point set of the Hough transform, and votes are accumulated in the Hough transform parameter space to compute the accumulator matrix A(θ, ρ). In the Hough transform, ρ generally denotes the distance (radius) from the origin of the image space to a straight line in the image space, and θ denotes the angle between that radius through the origin and the positive X axis. For characters in the same row or the same column of a document image, the center points of the minimum bounding rectangles of their connected components should lie on one straight line. In the present invention, therefore, θ is the X axis of the accumulator matrix A(θ, ρ); θ is the angle between the positive X axis and the normal of the straight line through the center points of the minimum bounding rectangles of each row or column of the document image, 0° ≤ θ ≤ 180°; ρ is the distance between the X axis and the straight line through the center points of the minimum bounding rectangles of each row or column in the document image space, -r ≤ ρ ≤ +r, where r is half the length of the diagonal of the document image. While voting in the Hough transform parameter space to compute the accumulator matrix, the correspondence between the voting points in the parameter space and the points in the original document image space is recorded at the same time.
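The voting in step S23 might look like the sketch below. It uses the standard normal form ρ = x·cos θ + y·sin θ with the origin moved to the image center, which is an assumption chosen so that ρ stays within ±r; the 1° angular step, the rounding of ρ to integer bins, and the name `hough_votes` are likewise illustrative. Storing point indices in each cell is one simple way to keep the correspondence back to image space that the text asks for.

```python
import math
from collections import defaultdict


def hough_votes(centers, width, height, theta_step=1):
    """Vote the center points into an accumulator keyed by (theta_deg, rho).

    Returns a dict mapping (theta_deg, rho) to the list of indices of the
    center points that voted for that cell. The vote count of a cell is the
    length of its list, and the list links the cell back to image space.
    """
    cx, cy = width / 2, height / 2  # assumed origin: the image center
    accumulator = defaultdict(list)
    for index, (x, y) in enumerate(centers):
        for theta_deg in range(0, 181, theta_step):
            theta = math.radians(theta_deg)
            rho = (x - cx) * math.cos(theta) + (y - cy) * math.sin(theta)
            accumulator[(theta_deg, round(rho))].append(index)
    return accumulator


# Three character centers on one horizontal text row of a 100 x 40 image.
centers = [(20, 10), (50, 10), (80, 10)]
votes = hough_votes(centers, width=100, height=40)
row_cell = votes[(90, -10)]  # all three points agree at theta = 90 degrees
```

At θ = 0° the same three points fall into three different ρ bins, which is why a single text row produces a strong peak only at θ = 90°.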
Step S24: detect and analyze the voting extrema of the accumulator matrix, and determine the typesetting type of the document from the periodicity of the extrema.
Detect the voting extrema of the accumulator matrix in the parameter space, analyze them for periodicity, and determine the typesetting type of the document from the periodicity of the extrema in different directions. The extrema in the present invention are maxima. In this embodiment, a maximum is a point whose vote value in the accumulator matrix is greater than the vote values of both the point before it and the point after it. For example, a point with a vote value of 5 is a maximum if the vote values of both neighboring points are less than 5; if either or both of the neighboring vote values are not less than 5, the point is not a maximum. Analyzing the extrema for periodicity means analyzing periodicity in the θ = 0° and θ = 90° directions, which correspond to the column direction of vertical typesetting and the row direction of horizontal typesetting, respectively.
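The maximum test described above translates directly into code. The sketch assumes the votes for one fixed θ have been laid out as a list over consecutive ρ bins; skipping the two end bins, which have only one neighbor, is an assumption the source does not address.

```python
def local_maxima(votes):
    """Return indices i where votes[i] is strictly greater than both neighbors.

    `votes` holds the vote counts of consecutive rho bins for one fixed theta.
    The first and last bins are skipped because they lack one neighbor.
    """
    return [i for i in range(1, len(votes) - 1)
            if votes[i - 1] < votes[i] > votes[i + 1]]


# Index 1 and index 7 are maxima; the plateau at indices 4-5 is not,
# because each of those bins has a neighbor that is not smaller.
column = [0, 5, 0, 0, 5, 5, 0, 6, 2]
peaks = local_maxima(column)
```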
For a document image, whether typeset horizontally or vertically, the text rows/columns are arranged periodically, and the row/column pitch is generally greater than the character pitch within a row/column. In the parameter space of the Hough transform, two columns of peak points appear at θ = 90 degrees and θ = 0 degrees; these points are generally arranged periodically, and the period of their spacing represents the line pitch or the character pitch of the document. By analyzing this periodicity, the typesetting type of the document can be determined. In this embodiment, the typesetting type is judged as follows:
a) When the extrema are periodic in only one angular direction, periodicity at θ = 90° indicates horizontal typesetting, and periodicity at θ = 0° indicates vertical typesetting. This is because θ is the angle between the positive X axis and the normal of the straight line through the center points of the minimum bounding rectangles of each row or column of the document image, that is, the angle between the X axis and the normal of the straight line on which the characters of each row or column lie. If θ is 90°, the straight line on which the characters lie is parallel to the X axis, and since the X axis is horizontal, the layout can be judged to be horizontal.
b) When the extrema are periodic in both directions, that is, at both θ = 0° and θ = 90°, the following two cases are distinguished:
b1) If there is only one periodic sequence in each of the 0° and 90° directions, that is, the period in each direction is a single period, the document has a single typesetting type. In that case, if the period in the 0° direction is greater than the period in the 90° direction, the document is purely vertical; if the period in the 0° direction is less than the period in the 90° direction, the document is purely horizontal.
b2) If there are two or more periodic sequences in the 0° or the 90° direction, the typesetting type of the document is mixed.
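Rules a), b1) and b2) can be summarized as a small decision function. This is a sketch only: the function name, the representation of each direction as a list of distinct period values, and the "undetermined" result for situations the rules do not cover (no periodicity at all, or equal periods) are assumptions.

```python
def classify_layout(periods_0, periods_90):
    """Apply rules a), b1) and b2) to the periods found at 0 and 90 degrees.

    periods_0 / periods_90: distinct period values found among the peaks at
    theta = 0 deg and theta = 90 deg (an empty list means no periodicity).
    """
    if not periods_0 and not periods_90:
        return "undetermined"        # not covered by the source rules
    if not periods_0:
        return "horizontal"          # rule a): periodic only at 90 deg
    if not periods_90:
        return "vertical"            # rule a): periodic only at 0 deg
    if len(periods_0) >= 2 or len(periods_90) >= 2:
        return "mixed"               # rule b2): several periods in a direction
    # Rule b1): one period per direction; the larger one is the line pitch.
    if periods_0[0] > periods_90[0]:
        return "vertical"
    if periods_0[0] < periods_90[0]:
        return "horizontal"
    return "undetermined"            # equal periods: not covered by the source


# Hypothetical period values, in pixels.
only_rows = classify_layout([], [40])
both_single = classify_layout([40], [22])
two_at_90 = classify_layout([30], [24, 40])
```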
Empirically, in a vertically typeset document the characters are normally also aligned neatly in the horizontal direction, so the extrema are periodic in both the 0° and the 90° direction of the Hough transform space. When both directions are periodic, check whether the period value in the 0° and 90° directions is unique. If it is, this is case b1: if the period in the 0° direction is greater than the period in the 90° direction, the document is purely vertical; if the period in the 90° direction is greater than the period in the 0° direction, the document is purely horizontal. This follows from the fact that the row/column pitch in a document image is generally greater than the spacing between two adjacent characters in the same row/column. If the period in the 0° or the 90° direction is not unique, this is case b2, and the layout can be judged to be mixed.
In addition, in actual typesetting judgment, the vote values are subject to a minimum threshold when the accumulator matrix is computed by voting in the Hough transform parameter space: only a maximum whose vote value exceeds the minimum threshold is used in the judgment, so the maxima need to be screened. Empirically, this minimum threshold generally lies between 3 and 10 (greater than 3 and less than 10), with a preferred empirical value of 5. The method of the present invention judges by the periodicity of the extrema, and these periods are in fact the row/column pitch and the character pitch of the document image. In this embodiment, the row/column pitch is the distance between the straight line through the center points of the minimum bounding rectangles of the characters in one row/column and the straight line through the center points of the minimum bounding rectangles of the characters in the adjacent row/column; the character pitch is the distance between the center points of the minimum bounding rectangles of two adjacent characters in the same row/column. Because the row/column pitch and the character pitch are generally greater than the width or height of a single character, the period should also be greater than this value; a period that is too small may be caused by noise. In this embodiment, therefore, the period is greater than the length or width of the minimum bounding rectangle of a character in the document image and less than k times that length or width, 2 ≤ k ≤ 6; k generally ranges from 2 to 6, with a preferred value of 3. The most frequent length and width values are obtained statistically from the minimum bounding rectangles of all characters in the document image; when judging the typesetting type, a period is accepted as a normal period only if it exceeds the length/width value that occurs with the highest statistical probability among all length and width values.
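The screening described above can be sketched as two helpers. The first keeps only maxima whose vote value exceeds the threshold (5 is the preferred value given in the text). The second derives period values from the gaps between adjacent surviving peaks and keeps a gap only if it lies between the character size and k times the character size. Treating a period as the gap between adjacent peaks, merging gaps within a small `tolerance`, and both function names are assumptions; the source does not describe how period sequences are extracted.

```python
def select_peaks(maxima, vote_threshold=5):
    """Keep the rho positions of maxima whose vote value exceeds the threshold.

    `maxima` is a list of (rho, vote_value) pairs for one theta direction.
    """
    return [rho for rho, value in maxima if value > vote_threshold]


def distinct_periods(peak_rhos, char_size, k=3, tolerance=2):
    """Return the distinct spacings between adjacent peaks that qualify as periods.

    A gap counts only if char_size < gap < k * char_size. Gaps within
    `tolerance` of an already-found period are treated as the same period.
    """
    positions = sorted(peak_rhos)
    periods = []
    for left, right in zip(positions, positions[1:]):
        gap = right - left
        if not (char_size < gap < k * char_size):
            continue  # too small (possibly noise) or too large
        if all(abs(gap - p) > tolerance for p in periods):
            periods.append(gap)
    return periods


# Hypothetical peaks: six rows 40 px apart followed by three rows 30 px apart,
# with a typical character size of 20 px.
strong = select_peaks([(10, 7), (50, 5), (90, 12)])
single = distinct_periods([10, 50, 90, 130], char_size=20)
double = distinct_periods([0, 40, 80, 120, 160, 200, 230, 260, 290], char_size=20)
```

Two distinct period values in one direction, as in the last call, would correspond to case b2 (mixed typesetting).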
The present invention is further explained below with reference to specific examples.
Embodiment
For the document image shown in Fig. 3, binarization is performed first to obtain a binary image. The connected components of the binary image are then computed and labeled, and their minimum bounding rectangles are computed, as shown in Fig. 4. The center-point coordinates of the minimum bounding rectangles are then used as the input point set of the Hough transform, and votes are accumulated in the Hough transform parameter space to compute the accumulator matrix A(θ, ρ), while the correspondence between the voting points in the parameter space and the points in the original image space is recorded. Next, the maxima of the accumulator matrix in the parameter space are detected, the extrema are analyzed for periodicity in the θ = 0° and θ = 90° directions, and the extrema are screened. The result is shown in Fig. 5 (the horizontal axis in the figure is θ). As the figure shows, the extrema are periodic only in the 90-degree direction, so the typesetting mode of the document image in Fig. 3 is judged to be horizontal.
For the document image in Fig. 6, the extrema are analyzed in the Hough transform parameter space and screened; the result is shown in Fig. 7. As the figure shows, the extrema are periodic in both the 0-degree and the 90-degree direction, with only one period value in each direction. This is case b1, a single typesetting type, and since the period in the 0-degree direction is greater than the period in the 90-degree direction, the document is judged to be vertically typeset.
For the document image in Fig. 8, the extrema are analyzed in the Hough transform parameter space and screened (that is, the vote values of the selected maxima exceed the set vote threshold); the result is shown in Fig. 9. As the figure shows, the extrema are periodic in both the 0-degree and the 90-degree direction. There is only one period value in the 0-degree direction, but there are two period values in the 90-degree direction (the first six extrema in the 90-degree direction share one period value, and the later extrema share another). This is case b2, mixed typesetting. The actual layout of the document in Fig. 8 shows that the vertically typeset part of the mixed layout is also periodic in the horizontal direction, so the presence of two period values in the 90-degree direction is consistent with the actual situation.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If such changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to cover them as well.

Claims (10)

1. A method for detecting the typesetting type of a document, comprising the following steps:
(1) selecting a document image to be detected, and binarizing the document image to obtain a binary image;
(2) computing the connected components of the binary image, and computing the minimum bounding rectangle of each connected component and the center-point coordinates of that rectangle;
(3) using the center-point coordinates of the minimum bounding rectangles as the input point set of the Hough transform, and voting in the Hough transform parameter space to compute the accumulator matrix A(θ, ρ);
wherein θ is the X axis of the accumulator matrix A(θ, ρ); θ is the angle between the positive X axis and the normal of the straight line through the center points of the minimum bounding rectangles of each row or column of the document image, 0° ≤ θ ≤ 180°; ρ is the distance between the X axis and the straight line through the center points of the minimum bounding rectangles of each row or column in the document image space, -r ≤ ρ ≤ +r, where r is half the length of the diagonal of the document image;
(4) detecting the voting extrema of the accumulator matrix, analyzing the extrema for periodicity, and determining the typesetting type of the document from the periodicity of the extrema in the different θ directions.
2. the detection method of a kind of document typesetting type as claimed in claim 1 is characterized in that: in the step (1), file and picture carried out binary conversion treatment before, file and picture is carried out pre-service, said pre-service comprises gray scale adjustment and noise reduction process.
3. the detection method of a kind of document typesetting type as claimed in claim 1; It is characterized in that: in the step (3); When the space ballot paper account of Hough transformation parameter adds up matrix, the corresponding relation of polling place and original document image space mid point in the recording parameters space.
4. the detection method of a kind of document typesetting type as claimed in claim 3; It is characterized in that: in the step (4); Said extreme point is meant the maximum point of parameter space, extreme point is carried out periodicity analysis be meant the enterprising line period property analysis of direction 0 ° and 90 ° to θ.
5. The method for detecting a document typesetting type as claimed in claim 4, characterized in that: in step (4), the typesetting type of the document is determined from the periodicity analysis of the extreme points as follows:
A) when the extreme points are periodic in only one angular direction, periodicity in the θ = 90° direction indicates a horizontal layout, and periodicity in the θ = 0° direction indicates a vertical layout;
B) when the extreme points are periodic in both angular directions, the type is determined as follows:
B1) if only one periodic sequence exists at 0° or at 90°, the document has a single typesetting type: if the period value in the θ = 0° direction is greater than the period value in the θ = 90° direction, it is a single vertical layout; if the period value in the θ = 0° direction is less than the period value in the θ = 90° direction, it is a single horizontal layout;
B2) if two or more periodic sequences exist at 0° or at 90°, the document has a mixed layout.
6. The method for detecting a document typesetting type as claimed in claim 5, characterized in that: the period value of the periodic extreme points is greater than the length or width of the minimum bounding rectangle of a character in the document image, and less than k times that length or width, where 2 ≤ k ≤ 6.
7. The method for detecting a document typesetting type as claimed in claim 6, characterized in that: the length or width of the minimum bounding rectangle of a character in the document image is the maximum among the lengths or widths of the minimum bounding rectangles of all connected components in the document.
8. The method for detecting a document typesetting type as claimed in any one of claims 4 to 6, characterized in that: the threshold for the vote value of the maximum points lies in the range (3, 10).
9. The method for detecting a document typesetting type as claimed in claim 8, characterized in that: the preferred threshold for the vote value of the maximum points is 5.
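The decision rules of claims 5 to 9 can be illustrated with a short Python sketch. It is only one possible reading of the claims, not the patented implementation. The function names, the tolerance `tol`, the default `k = 4`, the requirement of at least three maxima, and the strict "greater than" comparison against the vote threshold are all assumptions made for illustration; the claims only fix the threshold range (3, 10), the preferred value 5, and 2 ≤ k ≤ 6.

```python
from statistics import median


def strong_maxima(votes, threshold=5):
    """Return sorted rho positions whose vote count exceeds the threshold.

    votes maps rho -> vote count along one theta direction.
    The strict '>' comparison is an assumption; claim 9 only names 5 as preferred.
    """
    return sorted(rho for rho, count in votes.items() if count > threshold)


def find_period(rhos, char_size, k=4, tol=0.2):
    """Return the common spacing of the sorted rho values, or None.

    A spacing counts as a period only if every gap is close to the median gap
    (tol is a hypothetical tolerance) and the period lies strictly between
    char_size and k * char_size, as in claim 6.
    """
    if len(rhos) < 3:  # assumed minimum number of maxima for a periodic sequence
        return None
    gaps = [b - a for a, b in zip(rhos, rhos[1:])]
    period = median(gaps)
    if any(abs(g - period) > tol * period for g in gaps):
        return None
    if not (char_size < period < k * char_size):
        return None
    return period


def classify_layout(periods_0, periods_90):
    """Apply the rules of claim 5 to the periods found at theta = 0 and 90 degrees.

    Each argument is the list of period values of the periodic sequences
    detected in that direction (an empty list means no periodicity).
    """
    if not periods_0 and not periods_90:
        return "unknown"
    if not periods_0:
        return "horizontal"  # rule A: periodic only at 90 degrees
    if not periods_90:
        return "vertical"    # rule A: periodic only at 0 degrees
    if len(periods_0) == 1 and len(periods_90) == 1:
        if periods_0[0] > periods_90[0]:
            return "vertical"    # rule B1
        if periods_0[0] < periods_90[0]:
            return "horizontal"  # rule B1
        return "unknown"         # equal periods are not covered by the claim
    return "mixed"               # rule B2
```

In this sketch `char_size` stands for the character size of claim 7, that is, the largest bounding-rectangle length or width among all connected components. A caller would run `strong_maxima` and `find_period` once per direction and pass the resulting periods to `classify_layout`.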
10. A system for detecting a document typesetting type, comprising:
A binarization device, configured to binarize the document image to be detected and obtain a binary image;
A connected-component calculation device, configured to compute the connected components of the binary image, and to compute the minimum bounding rectangle of each connected component and the center point coordinates of that rectangle;
A Hough transform device, configured to take the center point coordinates of the minimum bounding rectangles as the input point set of the Hough transform and to compute the accumulator matrix A(θ, ρ) by voting in the Hough transform parameter space;
wherein the accumulator matrix A(θ, ρ) takes θ as its X axis; θ denotes the angle between the positive X axis and the normal of the straight line on which the center points of the minimum bounding rectangles of each row or each column of the document image lie, 0 ≤ θ ≤ 180; ρ denotes the distance between the X axis and the straight line on which the center points of the minimum bounding rectangles of each row or each column in the document image space lie, -r ≤ ρ ≤ +r, where r is half the diagonal length of the document image;
A typesetting type determination device, configured to detect the voting extreme points of the accumulator matrix, perform periodicity analysis on the extreme points, and determine the typesetting type of the document according to the periodicity of the extreme points in the different θ directions.
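The voting step performed by the Hough transform device, together with the point correspondence recorded in claim 3, might look like the following Python sketch. It is a minimal illustration under stated assumptions: the origin is placed at the image center so that |ρ| never exceeds half the diagonal, ρ is rounded to whole pixels, and the function name `hough_votes` and its parameters are hypothetical rather than taken from the patent.

```python
import math
from collections import defaultdict


def hough_votes(centers, width, height, thetas=range(0, 181)):
    """Vote in (theta, rho) space using bounding-rectangle center points.

    Returns a dict mapping (theta, rho) to the list of indices of the input
    points that voted for that cell, so len(cell) is the accumulator value
    A(theta, rho) and the list itself is the recorded correspondence.
    The origin at the image center is an assumption of this sketch.
    """
    cx, cy = width / 2.0, height / 2.0
    acc = defaultdict(list)
    for idx, (x, y) in enumerate(centers):
        for theta in thetas:
            t = math.radians(theta)
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = (x - cx) * math.cos(t) + (y - cy) * math.sin(t)
            acc[(theta, round(rho))].append(idx)
    return acc


# Three text rows of four characters each on an 80 x 100 image.
centers = [(x, y) for y in (20, 50, 80) for x in (10, 30, 50, 70)]
acc = hough_votes(centers, width=80, height=100, thetas=(0, 90))
```

With this toy input, the cells at θ = 90° collect one group of votes per text row and the cells at θ = 0° collect one group per character column, which is the structure the periodicity analysis then examines. Restricting `thetas` to 0 and 90 reflects claim 4; a full sweep over 0 to 180 is the default.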
CN 201110445793 2011-12-28 2011-12-28 Method and system for detecting document setting type Active CN102567732B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110445793 CN102567732B (en) 2011-12-28 2011-12-28 Method and system for detecting document setting type

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110445793 CN102567732B (en) 2011-12-28 2011-12-28 Method and system for detecting document setting type

Publications (2)

Publication Number Publication Date
CN102567732A true CN102567732A (en) 2012-07-11
CN102567732B CN102567732B (en) 2013-11-06

Family

ID=46413105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110445793 Active CN102567732B (en) 2011-12-28 2011-12-28 Method and system for detecting document setting type

Country Status (1)

Country Link
CN (1) CN102567732B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101593277A (en) * 2008-05-30 2009-12-02 电子科技大学 A kind of complicated color image Chinese version zone automatic positioning method and device
CN101833648A (en) * 2009-03-13 2010-09-15 汉王科技股份有限公司 Method for correcting text image
CN101882215A (en) * 2009-05-25 2010-11-10 汉王科技股份有限公司 Method for judging typesetting directions of text regions
CN102496018A (en) * 2011-12-08 2012-06-13 方正国际软件有限公司 Document skew detection method and system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103577026A (en) * 2013-11-15 2014-02-12 浪潮(北京)电子信息产业有限公司 Method for inspecting direction of words on single board
CN103577026B (en) * 2013-11-15 2016-05-18 浪潮(北京)电子信息产业有限公司 Method for inspecting direction of words on single board
CN107798355A (en) * 2017-11-17 2018-03-13 山西同方知网数字出版技术有限公司 A kind of method automatically analyzed based on file and picture format with judging

Also Published As

Publication number Publication date
CN102567732B (en) 2013-11-06

Similar Documents

Publication Publication Date Title
US20190294663A1 (en) Method and device for positioning table in pdf document
CN103093181A (en) License plate image locating method and device
CN101408937A (en) Method and apparatus for locating character row
CN103955659B (en) Batch true-known code identification method
CN102496018B (en) Document skew detection method and system
CN101320422B (en) Normative decision method and apparatus for cross, connection and separation relationship of handwritten Chinese character strokes
CN104504717A (en) Method and device for detection of image information
JP6515010B2 (en) Method and system for determining ground truth value in lane departure warning
CN102213767A (en) Positioning control method for closed region of vehicle-mounted GPS (Global Positioning System)
CN106446888A (en) Camera module multi-identifier identification method and camera module multi-identifier identification equipment
CN102567732B (en) Method and system for detecting document setting type
CN104156725A (en) Novel Chinese character stroke combination method based on angle between stroke segments
CN101877062A (en) Method for profile analysis in image layout area
CN101791912A (en) Image processing equipment, PRN device and image processing method
CN117854402A (en) Abnormal display detection method and device of display screen and terminal equipment
CN104636708A (en) Partial document image comparison method and system
CN105069455A (en) Method and device for filtering official seal of invoice
CN104778432A (en) Image recognition method
US8588507B2 (en) Computing device and method for analyzing profile tolerances of products
US20180121139A1 (en) Information processing system, component lifetime determining method, and non-transitory recording medium
CN103473518A (en) Waybill information input and black-and-white block coding and decoding system
CN101365043B (en) Spot array stage pixel point color calibrating method and device
US8308064B2 (en) Barcode evaluation method and barcode evaluation apparatus
CN108154497B (en) Automatic detection method and system for graphic road conditions
CN106095826A (en) A kind of method and system uploading papery document

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant