CN103281474B - Image and text separation method for scanned image of multifunctional integrated printer - Google Patents
- Publication number
- CN103281474B (application number CN201310159078.3)
- Authority
- CN
- China
- Prior art keywords
- image
- small block
- pixel
- threshold value
- separated
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Facsimile Image Signal Circuits (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an image and text separation method for scanned images of a multifunction printer. Conventional image and text separation methods for scanned images comprise analysis algorithms based on connected domains and segmentation algorithms based on texture features. With connected-domain analysis, connected domains are difficult to extract accurately when the image has a complex background or poor quality; with texture-based segmentation, good results are difficult to achieve on ideographic text. The method takes the characteristics of multifunction printers into account and is simple, fast and easy to implement in hardware. It exploits the different characteristics of text regions and image regions, using two separation thresholds and the crossing counts of small image blocks to separate text from images. With this technical scheme, the image regions and text regions of an image can be separated accurately.
Description
Technical field
The invention belongs to the field of digital image processing and relates to an image and text separation method for scanned images of a multifunction printer.
Background art
One function of a multifunction printer is to scan an image and then print it to obtain a copy. For a scanned image that contains both text and pictures, people usually want a copy in which the text regions are sharp and the image regions are smooth. Applying a single enhancement or a single smoothing operation to the whole image rarely gives this result. If instead the text regions and image regions are separated, with enhancement applied to the text regions and smoothing applied to the image regions, the desired effect can be achieved. Image and text separation for scanned images of multifunction printers has therefore become a technical problem in urgent need of a solution.
Traditional image and text separation algorithms fall roughly into two classes: analysis algorithms based on connected domains, and segmentation algorithms based on texture features. The former exploit the regularity of character strings to derive thresholds for cutting text from images. They are simple, direct and fast, but when the image background is complex or the image quality is poor they have difficulty extracting connected domains accurately. In addition, their rules and thresholds are determined from a specific image set, so they are confined to particular applications, lack robustness and are hard to generalize. The latter exploit the fact that characters and graphics have different textures. However, most texture-based algorithms target alphabetic scripts with small character sets, typified by English, and struggle to give good results on ideographic scripts such as Chinese. Moreover, choosing effective and general texture features is itself difficult, and computing texture is comparatively complex, which makes these algorithms slow.
Summary of the invention
To overcome the shortcomings of the prior art, the present invention proposes an image and text separation method for scanned images of a multifunction printer.
The technical solution adopted by the invention is an image and text separation method for scanned images of a multifunction printer, characterized in that it comprises the following steps:
Step 1: convert the scanned original image A to grayscale mode to obtain an image B with 256 gray levels;
Step 2: for each gray level in image B, compute the ratio of the number of pixels at that level to the total number of pixels, giving the frequency of the gray level; with gray level as the abscissa and frequency as the ordinate, obtain the gray histogram curve of image B by curve fitting;
Step 3: within gray levels 128 to 256 of the gray histogram curve of image B, find the gray level with the highest frequency and denote it G; take the gray level at the trough nearest to G as the first separation threshold T1, and take T1-40 as the second separation threshold T2;
Step 4: divide image B into small blocks of M × N pixels each, with the crossing count C of each block initialized to 0, where M ≥ 1 and N ≥ 1;
Step 5: determine whether each pixel in the block crosses the first separation threshold T1 and the second separation threshold T2, and accumulate the crossing count C of the block;
Step 6: evaluate the crossing count C of the block:
If 0.2 × M × N < C, the block is a text block;
Otherwise, the block is an image block;
Step 7: store the pixels of all text blocks of the original image A in a text file, merging them into one text file; store the pixels of all image blocks of the original image A in an image file, merging them into one image file.
Preferably, image B is divided into small blocks of 16 × 16 pixels each.
Preferably, whether a pixel in the block crosses the first separation threshold T1 or the second separation threshold T2 is determined as follows: take a pixel in the block; if the pixel and a separation threshold T satisfy
$((x_m, y_n) - T) \times ((x_{m+1}, y_n) - T) < 0$ or $((x_m, y_n) - T) \times ((x_m, y_{n+1}) - T) < 0$,
then the pixel crosses the separation threshold T, where $(x_m, y_n)$ is the gray value at row m, column n of the block; $(x_{m+1}, y_n)$ is the gray value at row m+1, column n of the block; and $(x_m, y_{n+1})$ is the gray value at row m, column n+1 of the block.
When the pixel lies on the lower boundary of the block, $(x_{m+1}, y_n)$ does not exist, and only the condition $((x_m, y_n) - T) \times ((x_m, y_{n+1}) - T) < 0$ needs to be checked; if it holds, the pixel crosses the separation threshold T.
When the pixel lies on the right boundary of the block, $(x_m, y_{n+1})$ does not exist, and only the condition $((x_m, y_n) - T) \times ((x_{m+1}, y_n) - T) < 0$ needs to be checked; if it holds, the pixel crosses the separation threshold T.
When neither $(x_{m+1}, y_n)$ nor $(x_m, y_{n+1})$ exists, the pixel is deemed to cross the separation threshold T.
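The per-pixel test above, including its three boundary cases, can be sketched in plain Python. This is a minimal illustration rather than the patented implementation: the function name `crosses` and the list-of-rows block layout are assumptions made for the example.

```python
from typing import Sequence


def crosses(block: Sequence[Sequence[int]], m: int, n: int, t: float) -> bool:
    """Return True if the pixel at row m, column n of `block` crosses threshold t.

    `block` is a hypothetical list-of-rows layout of gray values (0..255).
    """
    rows, cols = len(block), len(block[0])
    has_below = m + 1 < rows   # False on the lower boundary of the block
    has_right = n + 1 < cols   # False on the right boundary of the block
    if not has_below and not has_right:
        # Neither neighbour exists: the text deems the pixel to cross.
        return True
    d = float(block[m][n]) - t
    # A negative product means the pixel and its neighbour lie on opposite sides of t.
    if has_below and d * (float(block[m + 1][n]) - t) < 0:
        return True
    if has_right and d * (float(block[m][n + 1]) - t) < 0:
        return True
    return False
```

For instance, with `block = [[200, 50], [200, 200]]` and `t = 128`, the pixel at (0, 0) crosses because its right neighbour lies on the other side of the threshold, while the pixel at (1, 0) does not.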
Preferably, the crossing count C of a block is computed as follows: traverse all pixels in the block and test each one against the first separation threshold T1 and the second separation threshold T2; if the traversed pixel crosses both T1 and T2, the crossing count C of the block is incremented by 1.
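A vectorized sketch of the count C for one block, assuming the block is held as a 2-D NumPy array of gray values. The names `crossing_mask` and `crossing_count` are hypothetical; the neighbour comparisons and the treatment of the bottom-right pixel follow the threshold test described in the text.

```python
import numpy as np


def crossing_mask(block: np.ndarray, t: float) -> np.ndarray:
    """Boolean mask of the pixels in `block` that cross threshold t."""
    d = block.astype(np.float64) - t            # signed distance to the threshold
    mask = np.zeros(block.shape, dtype=bool)
    mask[:-1, :] |= d[:-1, :] * d[1:, :] < 0    # sign change towards the pixel below
    mask[:, :-1] |= d[:, :-1] * d[:, 1:] < 0    # sign change towards the pixel to the right
    mask[-1, -1] = True                         # no neighbour exists: deemed to cross
    return mask


def crossing_count(block: np.ndarray, t1: float, t2: float) -> int:
    """Count the pixels that cross both thresholds (the count C of the block)."""
    both = crossing_mask(block, t1) & crossing_mask(block, t2)
    return int(np.count_nonzero(both))
```

Because the bottom-right pixel is always deemed to cross, even a perfectly uniform block has a count of 1 under this reading.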
As preferably, the method for described matching is the method for cubic polynomial matching.
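One possible reading of the histogram fitting step, sketched with NumPy's `np.bincount`, `np.polyfit` and `np.polyval`. The function name `fitted_histogram` and the rescaling of the abscissa to [0, 1] are assumptions made for numerical conditioning; the source only states that a cubic polynomial is fitted.

```python
import numpy as np


def fitted_histogram(gray: np.ndarray, degree: int = 3) -> np.ndarray:
    """Fit a polynomial to the gray-level frequencies and sample it at levels 0..255.

    `gray` is assumed to be a uint8 grayscale image.
    """
    counts = np.bincount(gray.ravel(), minlength=256)
    freq = counts / gray.size              # frequency of each gray level
    x = np.arange(256) / 255.0             # assumption: rescale levels for a stable fit
    coeffs = np.polyfit(x, freq, degree)   # least-squares polynomial fit
    return np.polyval(coeffs, x)           # fitted curve at the 256 gray levels
```

Note that a single cubic has at most one local minimum, so a literal cubic fit smooths the histogram heavily; whether the source intends one global cubic or a piecewise fit is not stated.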
Preferably, when the scanned original image A is converted to grayscale mode, the gray value is computed as 0.2989*R+0.5870*G+0.1140*B, where R, G and B are the red, green and blue components of a pixel in the original image A.
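The weighted sum above can be applied to a whole image at once. The sketch below assumes an H × W × 3 uint8 array in R, G, B channel order; the function name `to_gray` and the round-then-clip step are assumptions, since the source does not say how fractional values are mapped to the 256 gray levels.

```python
import numpy as np


def to_gray(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB image to 256-level grayscale."""
    weights = np.array([0.2989, 0.5870, 0.1140])       # coefficients from the text
    gray = rgb[..., :3].astype(np.float64) @ weights   # weighted sum per pixel
    # Assumption: round to the nearest level and clip to 0..255.
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)
```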
The invention separates text from images according to the crossing count of each small block of the scanned image, and can automatically and accurately separate the image regions of an image from its text regions. The gray histogram curve of the grayscale image is obtained by cubic polynomial fitting. Furthermore, the first and second separation thresholds used when computing a block's crossing count are not fixed: they are determined from the gray level at the trough nearest to the most frequent gray level within gray levels 128 to 256 of the histogram curve. In practice, the method can be integrated into the scanner driver so that scanned images are processed automatically for the user.
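A sketch of how the two adaptive thresholds might be derived from an already fitted curve sampled at gray levels 0..255. The function name `separation_thresholds`, the discrete definition of a trough, the tie-breaking towards the lower gray level and the error raised when no trough exists are all assumptions; the source does not cover those cases.

```python
from typing import Tuple

import numpy as np


def separation_thresholds(curve: np.ndarray, offset: int = 40) -> Tuple[int, int]:
    """Return (T1, T2) from a histogram curve sampled at gray levels 0..255."""
    # Peak level G: highest frequency in the upper half (levels 128..255 here).
    g = 128 + int(np.argmax(curve[128:]))
    # Assumption: a trough is an interior sample lower than its left neighbour
    # and not higher than its right neighbour.
    idx = np.arange(1, len(curve) - 1)
    troughs = idx[(curve[idx] < curve[idx - 1]) & (curve[idx] <= curve[idx + 1])]
    if troughs.size == 0:
        raise ValueError("curve has no trough")   # case not covered by the source
    # Nearest trough to G; argmin picks the lower gray level on a tie.
    t1 = int(troughs[np.argmin(np.abs(troughs - g))])
    return t1, t1 - offset
```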
The invention takes the characteristics of multifunction printers into account and provides an image and text separation method that is simple, fast and easy to implement in hardware.
Brief description of the drawings
Fig. 1: flow chart of an embodiment of the present invention.
Embodiment
The solution of the invention is further described below with reference to the drawing and a specific embodiment.
Referring to Fig. 1, the technical solution adopted by the invention is an image and text separation method for scanned images of a multifunction printer, comprising the following steps:
Step 1: convert the scanned original image A to grayscale mode to obtain an image B with 256 gray levels;
In the specific embodiment, if the scanned original image A is not already in grayscale mode, it must be converted. An RGB image is converted to grayscale using 0.2989*R+0.5870*G+0.1140*B, where R, G and B are the red, green and blue components of a pixel in image A;
Step 2: for each gray level in image B, compute the ratio of the number of pixels at that level to the total number of pixels, giving the frequency of the gray level; with gray level as the abscissa and frequency as the ordinate, obtain the gray histogram curve of image B by curve fitting;
In the specific implementation, the gray histogram curve is fitted by cubic polynomial fitting, which is prior art and is not described further here.
Step 3: within gray levels 128 to 256 of the gray histogram curve of image B, find the gray level with the highest frequency and denote it G; take the gray level at the trough nearest to G as the first separation threshold T1, and take T1-40 as the second separation threshold T2.
Step 4: divide image B into small blocks of M × N pixels each, with the crossing count C of each block initialized to 0, where M ≥ 1 and N ≥ 1. In the specific implementation, each block contains 16 × 16 pixels.
Step 5: determine whether each pixel in the block crosses the first separation threshold T1 and the second separation threshold T2, and accumulate the crossing count C of the block;
The crossing test is as follows: take a pixel in the block; if the pixel and a separation threshold T satisfy
$((x_m, y_n) - T) \times ((x_{m+1}, y_n) - T) < 0$ or $((x_m, y_n) - T) \times ((x_m, y_{n+1}) - T) < 0$,
then the pixel crosses the separation threshold T, where $(x_m, y_n)$ is the gray value at row m, column n of the block; $(x_{m+1}, y_n)$ is the gray value at row m+1, column n of the block; and $(x_m, y_{n+1})$ is the gray value at row m, column n+1 of the block;
When the pixel lies on the lower boundary of the block, $(x_{m+1}, y_n)$ does not exist, and only the condition $((x_m, y_n) - T) \times ((x_m, y_{n+1}) - T) < 0$ needs to be checked; if it holds, the pixel crosses the separation threshold T;
When the pixel lies on the right boundary of the block, $(x_m, y_{n+1})$ does not exist, and only the condition $((x_m, y_n) - T) \times ((x_{m+1}, y_n) - T) < 0$ needs to be checked; if it holds, the pixel crosses the separation threshold T;
When neither $(x_{m+1}, y_n)$ nor $(x_m, y_{n+1})$ exists, the pixel is deemed to cross the separation threshold T.
The crossing count C of a block is computed as follows: traverse all pixels in the block and test each one against the first separation threshold T1 and the second separation threshold T2; if the traversed pixel crosses both T1 and T2, the crossing count C of the block is incremented by 1. The crossing counts of the blocks in image B are computed in turn until every block in image B has been traversed, giving the crossing count C of each block.
Step 6: evaluate the crossing count C of the block:
If 0.2 × M × N < C, the block is a text block;
Otherwise, the block is an image block;
In the specific implementation, the formula 0.2 × M × N < C is evaluated on the crossing count C of each block. If it holds, the block is labeled 1, indicating that the traversed block is text; otherwise the block is labeled -1, indicating that the traversed block is image.
Step 7: store the pixels of all text blocks of the original image A in a text file, merging them into one text file; store the pixels of all image blocks of the original image A in an image file, merging them into one image file;
In the specific implementation, the label of each block is examined: if the block is labeled 1, the corresponding pixels of the original image A are stored in the text file; if the block is labeled -1, the corresponding pixels of the original image A are stored in the image file.
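Steps 4 to 7 can be sketched end to end as follows, assuming the thresholds T1 and T2 are already known and the images are NumPy arrays. The names `separate` and `_crossing_mask`, the in-memory output layers standing in for the text file and image file, the white fill value for uncovered areas and the handling of partial blocks at the image border are all assumptions for illustration.

```python
import numpy as np


def _crossing_mask(block: np.ndarray, t: float) -> np.ndarray:
    """Pixels whose lower or right neighbour lies on the other side of t."""
    d = block.astype(np.float64) - t
    mask = np.zeros(block.shape, dtype=bool)
    mask[:-1, :] |= d[:-1, :] * d[1:, :] < 0
    mask[:, :-1] |= d[:, :-1] * d[:, 1:] < 0
    mask[-1, -1] = True                          # no neighbour: deemed to cross
    return mask


def separate(original, gray, t1, t2, m=16, n=16, fill=255):
    """Split `original` into (text_layer, image_layer, labels) using M x N blocks of `gray`."""
    h, w = gray.shape
    text_layer = np.full_like(original, fill)    # assumption: blank areas are white
    image_layer = np.full_like(original, fill)
    labels = np.empty((-(-h // m), -(-w // n)), dtype=np.int8)
    for bi, r in enumerate(range(0, h, m)):
        for bj, c in enumerate(range(0, w, n)):
            block = gray[r:r + m, c:c + n]
            count = np.count_nonzero(_crossing_mask(block, t1) & _crossing_mask(block, t2))
            # Step 6: 0.2*M*N < C; block.size is used so partial edge blocks scale too.
            is_text = 0.2 * block.size < count
            labels[bi, bj] = 1 if is_text else -1
            target = text_layer if is_text else image_layer
            target[r:r + m, c:c + n] = original[r:r + m, c:c + n]
    return text_layer, image_layer, labels
```

The returned `labels` array uses 1 for text blocks and -1 for image blocks, matching the labeling in the embodiment; writing the two layers to files is left to the caller.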
The specific embodiment described herein merely illustrates the spirit of the invention. Those skilled in the art may modify or supplement the described embodiment, or substitute similar approaches, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.
Claims (5)
1. An image and text separation method for scanned images of a multifunction printer, characterized in that it comprises the following steps:
Step 1: convert the scanned original image A to grayscale mode to obtain an image B with 256 gray levels;
Step 2: for each gray level in image B, compute the ratio of the number of pixels at that level to the total number of pixels, giving the frequency of the gray level; with gray level as the abscissa and frequency as the ordinate, obtain the gray histogram curve of image B by curve fitting;
Step 3: within gray levels 128 to 256 of the gray histogram curve of image B, find the gray level with the highest frequency and denote it G; take the gray level at the trough nearest to G as the first separation threshold T1, and take T1-40 as the second separation threshold T2;
Step 4: divide image B into small blocks of M × N pixels each, with the crossing count C of each block initialized to 0, where M ≥ 1 and N ≥ 1;
Step 5: determine whether each pixel in the block crosses the first separation threshold T1 and the second separation threshold T2, and accumulate the crossing count C of the block; the crossing count C of the block is computed by traversing all pixels in the block and testing each one against the first separation threshold T1 and the second separation threshold T2, and if the traversed pixel crosses both T1 and T2, the crossing count C of the block is incremented by 1;
Step 6: evaluate the crossing count C of the block:
If 0.2 × M × N < C, the block is a text block;
Otherwise, the block is an image block;
Step 7: store the pixels of all text blocks of the original image A in a text file, merging them into one text file; store the pixels of all image blocks of the original image A in an image file, merging them into one image file.
2. The image and text separation method for scanned images of a multifunction printer according to claim 1, characterized in that image B is divided into small blocks of 16 × 16 pixels each.
3. The image and text separation method for scanned images of a multifunction printer according to claim 1, characterized in that whether a pixel in the block crosses the first separation threshold T1 or the second separation threshold T2 is determined as follows: take a pixel in the block; if the pixel and a separation threshold T satisfy
$((x_m, y_n) - T) \times ((x_{m+1}, y_n) - T) < 0$ or $((x_m, y_n) - T) \times ((x_m, y_{n+1}) - T) < 0$,
then the pixel crosses the separation threshold T, where $(x_m, y_n)$ is the gray value at row m, column n of the block; $(x_{m+1}, y_n)$ is the gray value at row m+1, column n of the block; and $(x_m, y_{n+1})$ is the gray value at row m, column n+1 of the block;
When the pixel lies on the lower boundary of the block, $(x_{m+1}, y_n)$ does not exist, and only the condition $((x_m, y_n) - T) \times ((x_m, y_{n+1}) - T) < 0$ needs to be checked; if it holds, the pixel crosses the separation threshold T;
When the pixel lies on the right boundary of the block, $(x_m, y_{n+1})$ does not exist, and only the condition $((x_m, y_n) - T) \times ((x_{m+1}, y_n) - T) < 0$ needs to be checked; if it holds, the pixel crosses the separation threshold T;
When neither $(x_{m+1}, y_n)$ nor $(x_m, y_{n+1})$ exists, the pixel is deemed to cross the separation threshold T.
4. The image and text separation method for scanned images of a multifunction printer according to claim 1, characterized in that the fitting method is cubic polynomial fitting.
5. The image and text separation method for scanned images of a multifunction printer according to claim 1, characterized in that, when the scanned original image A is converted to grayscale mode, the gray value is computed as 0.2989*R+0.5870*G+0.1140*B, where R, G and B are the red, green and blue components of a pixel in the original image A.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310159078.3A CN103281474B (en) | 2013-05-02 | 2013-05-02 | Image and text separation method for scanned image of multifunctional integrated printer |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310159078.3A CN103281474B (en) | 2013-05-02 | 2013-05-02 | Image and text separation method for scanned image of multifunctional integrated printer |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103281474A (en) | 2013-09-04 |
CN103281474B (en) | 2015-04-15 |
Family ID: 49063909
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310159078.3A Expired - Fee Related CN103281474B (en) | 2013-05-02 | 2013-05-02 | Image and text separation method for scanned image of multifunctional integrated printer |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103281474B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1588431A (en) * | 2004-07-02 | 2005-03-02 | 清华大学 | Character extracting method from complicated background color image based on run-length adjacent map |
CN101599125A (en) * | 2009-06-11 | 2009-12-09 | 上海交通大学 | Binarization method for image processing under complex backgrounds |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8103104B2 (en) * | 2002-01-11 | 2012-01-24 | Hewlett-Packard Development Company, L.P. | Text extraction and its application to compound document image compression |
Also Published As
Publication number | Publication date |
---|---|
CN103281474A (en) | 2013-09-04 |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2015-04-15; termination date: 2020-05-02 |