WO2017118356A1 - Text image processing method and apparatus - Google Patents
Text image processing method and apparatus
- Publication number
- WO2017118356A1 (application PCT/CN2016/113843)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- character
- blocks
- height
- block
- text image
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/60—Editing figures and text; Combining figures or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/15—Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/18—Extraction of features or characteristics of the image
- G06V30/18086—Extraction of features or characteristics of the image by performing operations within image blocks or by using histograms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/414—Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Definitions
- the present application relates to the field of character recognition technology, and in particular, to a text image processing method and apparatus.
- character segmentation plays an important role in the field of character recognition. It mainly segments the characters at their positions within the text area of an image.
- Character segmentation methods usually include projection segmentation, clustering, and template matching.
- the projection segmentation method takes the binarized image obtained by preprocessing and determines the region in which each character is located by projection;
- the clustering method takes the connected regions of the characters and merges the character blocks of those connected regions according to the character distribution characteristics of the overall page; the template matching method mainly applies to fixed fonts or characters, so it is not widely used.
- the above character segmentation methods can segment characters to a certain extent, but are often limited in practical applications.
- the projection segmentation method tends to segment multiple characters into one block when the characters themselves are tilted, and
- the template matching method has low usability and can only be used for specific text.
- the above character segmentation methods therefore have many problems: they are limited in practical applications, and their versatility and accuracy are not high.
- the present application provides a text image processing method and a text image processing apparatus, which are capable of improving the versatility and accuracy of character segmentation.
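- a text image processing method comprising the following steps:
- preprocessing a text image to obtain a binarized image, the binarized image comprising a plurality of connected regions;
- obtaining, by a convex hull algorithm, the convex hull corresponding to each of the plurality of connected regions, and obtaining a character region circumscribing each convex hull;
- performing character segmentation on the obtained character regions to obtain a plurality of character blocks; and
- merging the character blocks according to the heights of the character blocks to obtain the word blocks of the text image.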
- a text image processing apparatus comprising:
- a preprocessing module configured to preprocess the text image to obtain a binarized image, the binarized image comprising a plurality of connected regions;
- a convex hull obtaining module configured to obtain, by a convex hull algorithm, the convex hull corresponding to each of the plurality of connected regions, and to obtain a character region circumscribing each convex hull;
- a segmentation module configured to perform character segmentation on the obtained character regions to obtain a plurality of character blocks; and
- a merging processing module configured to merge the character blocks according to the heights of the character blocks to obtain the word blocks of the text image.
- in the text image processing, the text image is first preprocessed to obtain a binarized image and the plurality of connected regions it contains; a convex hull algorithm is used to obtain the convex hull corresponding to each connected region and the character region circumscribing each convex hull;
- character segmentation is then performed in the character regions to obtain a plurality of character blocks distributed in the binarized image, and the character blocks are merged according to their heights to obtain the word blocks contained in the text image.
- The character segmentation and the height-based merging of character blocks in this text image processing separate individual characters from one another while keeping together characters that have an upper-lower structure within a character line, thereby improving the accuracy of character segmentation.
- Because the process relies only on the distribution and height of the characters in the text and is otherwise unrestricted, the versatility of character segmentation is also improved.
- FIG. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
- FIG. 2 is a flow chart of a text image processing method according to an embodiment of the present invention.
- FIG. 3 is a flow chart of a method for performing character segmentation on a character region to obtain a plurality of character blocks of a binarized image, in accordance with one embodiment of the present invention.
- FIG. 4 is a flow chart of a method of locating a connection portion of characters in accordance with one embodiment of the present invention.
- FIG. 5 is a schematic diagram of a character region formed by two characters in accordance with one embodiment of the present invention.
- FIG. 6 is a schematic diagram of the connection portion located in the character region of FIG. 5.
- FIG. 7 is a schematic diagram of the character blocks obtained by segmenting the character region of FIG. 5.
- FIG. 8 is a flow chart of a method for merging character blocks according to their heights to obtain the word blocks of a text image, according to an embodiment of the present invention.
- FIG. 9 is a schematic structural diagram of a text image processing apparatus according to an embodiment of the present invention.
- FIG. 10 is a schematic structural diagram of a segmentation module according to an embodiment of the present invention.
- FIG. 11 is a schematic structural diagram of a connection positioning unit according to an embodiment of the present invention.
- FIG. 12 is a schematic structural diagram of a merge processing module according to an embodiment of the present invention.
- existing implementations of character segmentation are usually accurate only in specific scenes; in other scenes the accuracy of character segmentation is low, which affects the accuracy of content recognition in text recognition applications.
- the present application proposes a text image processing method and a text image processing apparatus.
- the method includes: preprocessing a text image to obtain a binarized image, wherein the binarized image includes a plurality of connected regions; obtaining, by a convex hull algorithm, the convex hull corresponding to each of the plurality of connected regions; obtaining a character region circumscribing each convex hull; performing character segmentation on the obtained character regions to obtain a plurality of character blocks; and merging the character blocks according to the heights of the character blocks.
- FIG. 1 shows the structure of an electronic device in accordance with an embodiment of the present invention.
- the electronic device 100 is merely an example of the application of the present invention and is not to be considered as providing any limitation as to the scope of use of the present invention.
- electronic device 100 includes a processor 110, a memory 120, and a system bus 130.
- processor 110 is hardware for executing computer program instructions through basic arithmetic and logic operations in a computer system.
- Memory 120 is a physical device for temporarily or permanently storing computer programs or data.
- program instructions and a plurality of text images are stored in the memory 120; the processor 110 executes the program instructions in the memory 120 to process the text images.
- the electronic device 100 also includes various input interfaces 170 and input devices 140 to enable input of various operations.
- the input device 140 can be at least one of a touch screen, a button, a keyboard, and a mouse.
- the electronic device 100 may also include a local area network interface 150 and a mobile communication unit 160 for performing communication functions.
- the electronic device 100 also includes a storage device 180, which can be selected from a variety of computer-readable storage media, that is, any available media that can be accessed, including both removable and fixed media.
- computer-readable media include, but are not limited to, flash memory (e.g., a micro SD card), CD-ROM, digital versatile disc (DVD) or other optical discs, magnetic tape cartridges, tape storage or other storage devices, or any other medium that can store the required information and can be accessed.
- the electronic device 100 can perform the various operations of the text image processing of the embodiment of the present invention; that is, the steps of the text image processing method are performed by the processor 110 running the program instructions in the memory 120.
- the present invention can equally be implemented by a hardware circuit or by a hardware circuit in combination with software instructions; thus, the present invention is not limited to any specific hardware circuit, software, or combination of the two.
- the text image processing method is as shown in FIG. 2, and includes:
- Step 210 Preprocess the text image to obtain a binarized image, the binarized image comprising a plurality of connected regions.
- a text image is an image containing arbitrary text content; it includes the characters that make up the text, which may be arranged in one or more lines, and it may also include spaces and punctuation between characters.
- the text image is preprocessed to obtain a binarized image that best reflects the image information and includes a plurality of connected regions (also referred to as connected domains).
- the direction of the character line is referred to as the horizontal direction.
- the preprocessing includes: smoothing the text image and performing edge detection to obtain the edges in the text image, and then using morphological operations to obtain the distribution areas of the characters, thereby obtaining the connected regions of the characters.
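The connected regions that preprocessing produces can be illustrated with a minimal sketch. It assumes the image has already been smoothed, edge-detected, and binarized into rows of 0/1 values, and it assumes 8-connectivity, which the text does not specify; the function name `label_connected_regions` is hypothetical.

```python
from collections import deque


def label_connected_regions(binary):
    """Return a list of connected regions, each a list of (row, col) foreground pixels.

    Assumption: `binary` is a non-empty list of equal-length rows holding 0 or 1,
    and pixels are treated as connected when they touch in any of 8 directions.
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append(region)
    return regions


image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]
regions = label_connected_regions(image)
print(len(regions))  # two separate groups of foreground pixels
```

A production pipeline would more likely rely on an image-processing library for smoothing, edge detection, and morphology; the pure-Python version is only meant to make the notion of a connected region concrete.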
- Step 230 Obtain, by a convex hull algorithm, the convex hull corresponding to each of the plurality of connected regions and a character region circumscribing each convex hull.
- the convex hull corresponding to each connected region is obtained by the convex hull algorithm, and a character region circumscribing the convex hull is obtained by framing the convex hull.
- the maximum convex hull corresponding to each connected region is obtained by the convex hull algorithm, so that information related to the characters is not removed and its integrity is ensured.
- the character region circumscribing the convex hull is a rectangular region, for example, the rectangular region obtained by framing the convex hull with a minimum rectangle; this adapts to the shape of the character and further ensures the accuracy of the text image processing.
- each of the convex hulls corresponding to the connected regions has a corresponding character region, so that a plurality of character regions are obtained in the binarized image.
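As a rough illustration of step 230, the sketch below computes a convex hull and its circumscribing rectangle for the pixels of one connected region. The text does not name a specific convex hull algorithm; Andrew's monotone chain is used here as a stand-in, and the rectangle is assumed to be axis-aligned. The names `convex_hull` and `bounding_rect` are hypothetical.

```python
def convex_hull(points):
    """Convex hull of (x, y) points via Andrew's monotone chain (an assumed choice)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Z-component of the cross product of vectors o->a and o->b.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def bounding_rect(hull):
    """Axis-aligned rectangle (x, y, width, height) in pixels that encloses the hull."""
    xs = [p[0] for p in hull]
    ys = [p[1] for p in hull]
    return min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1


region_pixels = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]  # (x, y) pairs
hull = convex_hull(region_pixels)
rect = bounding_rect(hull)
print(hull, rect)
```

Interior pixels such as (2, 1) drop out of the hull, and the rectangle is then taken from the hull vertices alone.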
- Step 250 Perform character segmentation on the obtained character regions to obtain a plurality of character blocks of the binarized image.
- Step 270 Merge the character blocks according to the heights of the character blocks to obtain the word blocks of the text image.
- the merging is performed according to the heights of all the character blocks in the binarized image, so that character blocks forming an upper-lower structure in the same character line are merged.
- character blocks of a character that was divided into two parts are merged by the height-based merging, thereby improving the subsequent recognition rate.
- through the character segmentation, each character line in the text image is divided finely enough to separate single characters as far as possible.
- the merging performed on this basis causes the character blocks that together constitute an upper-lower structure in the same character line to be merged, which facilitates subsequent character recognition.
- step 250 is as shown in FIG. 3, and includes:
- step 251, locating the connection portion of characters in the character region.
- the character region obtained from the convex hull of a connected region is essentially a preliminary framing of characters, and a character region often contains characters that are connected to each other; to split the characters in the character region, in the embodiment of the present invention the connection portion between characters is located in the vertical direction.
- if a connection portion is located in the character region, the character region contains characters that are connected to each other, and the character region needs to be segmented in the horizontal direction based on the connection portion.
- step 253, segmenting the character region according to the connection portion to obtain a plurality of character blocks of the binarized image.
- the pixel values corresponding to the connection portion are set to 0 to complete the segmentation of the character region according to the connection portion.
- the character blocks are obtained by dividing the character region in the horizontal direction, and correspond to characters subdivided as finely as possible.
- for a character region containing a connection portion, segmenting the region in the horizontal direction according to the connection portion yields at least two character blocks; for a character region in which no connection portion exists, the character region itself is a character block.
- the segmentation of the character regions is completed by the above process, and a plurality of character blocks distributed in the binarized image are obtained; the segmentation is fine enough that each character block in each character line is, as far as possible, a separate character, which greatly improves the accuracy of character segmentation.
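Step 253 can be sketched as follows, assuming the connection pixels have already been located and that, once they are set to 0, character blocks are separated by columns containing no foreground. The function name `split_region` and the representation of a block as a column range are illustrative assumptions, not details from the source.

```python
def split_region(region, connection_pixels):
    """Zero the connection pixels, then cut the region at empty columns.

    Returns a list of (start_col, end_col) ranges, one per character block.
    Assumption: `region` is a list of equal-length rows holding 0 or 1.
    """
    grid = [row[:] for row in region]  # work on a copy of the character region
    for r, c in connection_pixels:
        grid[r][c] = 0  # pixel value of the connection portion is set to 0
    width = len(grid[0])
    blocks, start = [], None
    for c in range(width):
        has_foreground = any(row[c] == 1 for row in grid)
        if has_foreground and start is None:
            start = c  # a new character block begins
        elif not has_foreground and start is not None:
            blocks.append((start, c - 1))  # the block ended at the previous column
            start = None
    if start is not None:
        blocks.append((start, width - 1))
    return blocks


region = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1],
]
with_connection = split_region(region, [(1, 2), (1, 3)])
without_connection = split_region(region, [])
print(with_connection, without_connection)
```

With the two connection pixels removed the region yields two character blocks; with no connection portion the whole region remains a single block, matching the two cases described above.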
- the step 251 includes:
- Step 2511 Compare the pixel values of adjacent pixels in each column of pixels of the character region to obtain the number of consecutive pixels belonging to the foreground portion in each column of pixels.
- each of the plurality of character regions in the binarized image is composed of a plurality of pixels; therefore, the connection portion is located column by column in each character region of the binarized image.
- the pixels that belong to the foreground portion and are consecutive in a column of pixels are obtained by pairwise comparison of adjacent pixels.
- the foreground portion is the portion in which the pixel value is 1;
- the consecutive pixels belonging to the foreground portion in a column of pixels are pixels that have a pixel value of 1 and are consecutive in that column.
- step 2513, for each column of pixels, it is determined whether the number of consecutive pixels belonging to the foreground portion is less than or equal to the preset number. If yes, the process proceeds to step 2515; if not, the process ends.
- step 2515, it is determined that the consecutive pixels belonging to the foreground portion are a connection portion of characters.
- a preset number is used for identification of the connected portion.
- the preset number can be determined in advance according to experience.
- for example, the preset number may be 3; if the number of consecutive pixels belonging to the foreground portion in a column is no more than 3, these pixels are a connection portion of characters.
- the height and width of the character region are first calculated; in one embodiment, these are the height and width of the rectangular region.
- rect_width denotes the width and rect_hight denotes the height,
- and 1 ≤ i ≤ rect_width, 1 ≤ j ≤ rect_hight is defined.
- the value of line_num_1 for the i-th column of pixels is obtained, which is the number of pixels in the i-th column that belong to the foreground portion and are consecutive.
- in this way the connection portion 330 in the character region 310 shown in FIG. 6 is obtained, that is, three consecutive pixels in the vertical direction (the three pixels are in the same column) whose pixel values are 1.
- according to the connection portion, the division is performed to obtain the two character blocks shown in FIG. 7, that is, the character block 410 in which the character S is located and the character block 430 in which the character a is located.
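Steps 2511 to 2515 can be read in more than one way; the sketch below takes one plausible reading. It assumes a column counts as a connection portion only when all of its foreground pixels form a single vertical run whose length does not exceed the preset number. That single-run condition is an interpretation rather than something the text states, and `locate_connection_pixels` is a hypothetical name.

```python
def locate_connection_pixels(region, preset_number=3):
    """Return the (row, col) pixels judged to be connection portions.

    Assumption: `region` is a list of equal-length rows holding 0 or 1,
    with 1 marking the foreground portion.
    """
    height, width = len(region), len(region[0])
    connection = []
    for c in range(width):
        # Collect the vertical runs of consecutive foreground pixels in column c
        # by comparing each pixel with the one above it.
        runs, run = [], []
        for r in range(height):
            if region[r][c] == 1:
                run.append((r, c))
            elif run:
                runs.append(run)
                run = []
        if run:
            runs.append(run)
        # Interpretation: the column is a connection portion only when it holds
        # exactly one run and that run is no longer than the preset number.
        if len(runs) == 1 and len(runs[0]) <= preset_number:
            connection.extend(runs[0])
    return connection


region = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
]
connection = locate_connection_pixels(region)
print(connection)
```

In this toy region the tall strokes on either side have runs of five pixels and are left alone, while the one-pixel bridge in the middle column is flagged. The preset number of 3 follows the example value given in the text.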
- step 270 includes:
- Step 271 Calculate the height of each character block in the binarized image to obtain the height distribution of the character blocks in the binarized image.
- the total height of the character blocks in the binarized image, that is, the sum of the heights of all the character blocks in the binarized image, is also calculated in this step.
- the binarized image contains a plurality of character blocks; the height of each character block is calculated, and the number of character blocks of each height is then counted to obtain the height distribution of the character blocks in the binarized image.
- that is, the heights of all the character blocks in the binarized image are collected, the character blocks having the same height are grouped together, and the number of character blocks in each group is counted.
- the height of each group of character blocks in the binarized image is stored in an array static_height[n], and the number of character blocks corresponding to each height is stored in the array num_rect[n], where n ≥ 1.
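The height statistics of step 271 amount to a histogram. The following sketch builds the two arrays named in the text, `static_height` and `num_rect`, from a list of block heights; the function name `height_distribution` and the sample heights are made up for illustration, and Python lists are 0-indexed whereas the text indexes from 1.

```python
from collections import Counter


def height_distribution(block_heights):
    """Group character blocks by height and count each group."""
    counts = Counter(block_heights)
    static_height = sorted(counts)  # distinct heights, ascending
    num_rect = [counts[h] for h in static_height]  # number of blocks per height
    total_height = sum(block_heights)  # sum of the heights of all blocks
    return static_height, num_rect, total_height


heights = [20, 20, 8, 20, 12, 8, 20]  # illustrative block heights in pixels
static_height, num_rect, total_height = height_distribution(heights)
print(static_height, num_rect, total_height)
```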
- Step 273 Based on the obtained height distribution of the character blocks in the binarized image, determine the character blocks whose heights sum to a proportion of the total height of the character blocks in the binarized image that exceeds a preset value.
- that is, target character blocks are selected from the character blocks of the binarized image such that the ratio of the sum of the heights of the target character blocks to the total height of the character blocks of the binarized image exceeds the preset value.
- in terms of the groups above, the ratio of the sum of the heights of the character blocks in the selected groups to the sum of the heights of all the character blocks in the binarized image is greater than the preset value.
- the preset value is a value greater than 50%, for example, 80%.
- the array static_height[n] may be sorted according to the array num_rect[n], in either descending or ascending order of num_rect[n].
- the heights of the first k groups of character blocks are then taken out.
- the heights taken out are static_height[1], ..., static_height[k], where k is the first value for which the ratio condition above holds.
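The selection in step 273 might look like the sketch below. The exact formula is not reproduced in this text, so the code rests on assumptions: each group contributes its height multiplied by its block count, groups are visited in descending order of block count (one of the two orders the text allows), and selection stops at the first k for which the accumulated share of the total height exceeds the preset value. The name `select_target_groups` is hypothetical.

```python
def select_target_groups(static_height, num_rect, preset_value=0.8):
    """Pick height groups until their share of the total height exceeds preset_value.

    Returns the selected groups as (height, count) pairs.
    """
    total_height = sum(h * n for h, n in zip(static_height, num_rect))
    # Assumed order: most frequent heights first.
    groups = sorted(zip(static_height, num_rect), key=lambda g: g[1], reverse=True)
    selected, running = [], 0
    for h, n in groups:
        selected.append((h, n))
        running += h * n
        if running / total_height > preset_value:
            break  # first k for which the ratio condition holds
    return selected


selected = select_target_groups([8, 12, 20], [2, 1, 4])
print(selected)
```

With the sample data the group of height 20 alone covers 80 of 108 pixels of total height, which is below 80%, so the next most frequent group is added as well.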
- Step 275 Calculate the height average of the target character blocks.
- the height average is calculated from the character blocks selected in step 273 and the sum of the heights of these character blocks.
- that is, a height average operation is performed on the obtained static_height[1], ..., static_height[k] to obtain the height average height_avg.
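One way to compute the height average of step 275 is shown below. Since the formula itself is missing from this text, the sketch assumes a per-block average, that is, the sum of the heights of the selected character blocks divided by the number of selected character blocks; an unweighted mean over the k distinct heights would be another possible reading. The name `height_average` is hypothetical.

```python
def height_average(selected_groups):
    """Average height over all blocks in the selected (height, count) groups."""
    total = sum(h * n for h, n in selected_groups)  # sum of the selected block heights
    count = sum(n for _, n in selected_groups)  # number of selected blocks
    return total / count


height_avg = height_average([(20, 4), (8, 2)])
print(height_avg)
```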
- Step 277 According to the height average, merge the character blocks that are in the same character line and overlap in the horizontal direction in the binarized image to obtain the word blocks of the text image.
- once the height average has been obtained, for any character line in the binarized image, if the sum of the heights of two character blocks is less than the height average and the two blocks overlap in the horizontal direction, the two character blocks are merged.
- middle_rect_x(i) is the x-axis coordinate of the center of the i-th character block,
- middle_rect_width(i) is the width of the i-th character block, and
- rect_height(i) is the height of the i-th character block.
- the i-th character block is compared with the remaining character blocks. If the sum of the heights of a character block k and the current i-th character block is less than the height average and the two blocks overlap in the horizontal direction, the two conditions above are satisfied and the two character blocks are merged.
- in this way, character blocks that form an upper-lower structure and whose combined height is less than the height average are merged, while the distribution of the character blocks in the horizontal direction is preserved, so that the character blocks can be combined and recognized in the horizontal direction in the subsequent recognition process.
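The merging rule of step 277 can be sketched as below for the blocks of a single character line. The overlap test is an assumption: two blocks are taken to overlap horizontally when the distance between their center x-coordinates is less than half the sum of their widths, which is the usual interval-overlap test expressed with the center and width quantities the text defines. Blocks are represented as hypothetical (x, y, width, height) tuples, and the function names are illustrative.

```python
def overlaps_horizontally(a, b):
    """True when the x-extents of two (x, y, width, height) blocks overlap."""
    centre_a = a[0] + a[2] / 2
    centre_b = b[0] + b[2] / 2
    return abs(centre_a - centre_b) < (a[2] + b[2]) / 2


def union(a, b):
    """Smallest rectangle that contains both blocks."""
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)


def merge_line(blocks, height_avg):
    """Merge blocks of one character line until no pair satisfies both conditions."""
    blocks = list(blocks)
    merged = True
    while merged:
        merged = False
        for i in range(len(blocks)):
            for k in range(i + 1, len(blocks)):
                a, b = blocks[i], blocks[k]
                # Condition 1: combined height below the height average.
                # Condition 2: the two blocks overlap in the horizontal direction.
                if a[3] + b[3] < height_avg and overlaps_horizontally(a, b):
                    blocks[i] = union(a, b)
                    del blocks[k]
                    merged = True
                    break
            if merged:
                break  # restart the scan after every merge
    return blocks


line = [(0, 0, 10, 6), (1, 8, 9, 7), (12, 0, 10, 16)]
word_blocks = merge_line(line, height_avg=16.0)
print(word_blocks)
```

The first two blocks sit one above the other with a combined height of 13, so they are merged into one word block; the third block is a full-height character and stays separate.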
- the method as described above further comprises:
- recognizing the text image containing the word blocks, and, during recognition, combining the word blocks according to their order in the text image to obtain the text content of the text image.
- during recognition, the word blocks are processed according to a strategy set as needed; for example, the word blocks are selectively merged.
- selective merging is based on the average width and height of the word blocks in a character line:
- adjacent word blocks that are too narrow are merged, and word blocks that are too wide are divided more finely.
- the recognition algorithm used may be one based on character feature extraction, using, for example, gray image gradient features and character HOG features.
- a text image processing apparatus is also provided accordingly.
- the apparatus includes a preprocessing module 510, a convex hull obtaining module 530, a segmentation module 550, and a merging processing module 570, wherein:
- the preprocessing module 510 is configured to preprocess a text image to obtain a binarized image, where the binarized image includes a plurality of connected regions;
- the convex hull obtaining module 530 is configured to obtain, by a convex hull algorithm, the convex hull corresponding to each of the plurality of connected regions, and to obtain a character region circumscribing each convex hull;
- the segmentation module 550 is configured to perform character segmentation on the obtained character regions to obtain a plurality of character blocks of the binarized image; and
- the merging processing module 570 is configured to merge the character blocks according to the heights of the character blocks to obtain the word blocks of the text image.
- the segmentation module 550 includes a connection positioning unit 551 and a segmentation execution unit 553, where:
- the connection positioning unit 551 is configured to locate the connection portion of characters in the character region.
- the segmentation execution unit 553 is configured to segment the character region according to the connection portion to obtain a plurality of character blocks of the binarized image.
- the connection positioning unit 551 includes a pixel comparison sub-unit 5511 and a judgment sub-unit 5513, wherein:
- the pixel comparison sub-unit 5511 is configured to compare the pixel values of adjacent pixels in each column of pixels of the character region to obtain the number of consecutive pixels belonging to the foreground portion in each column of pixels.
- the judgment sub-unit 5513 is configured to determine, for each column of pixels, whether the number of consecutive pixels belonging to the foreground portion is less than or equal to a preset number, and if so, to determine that the consecutive pixels belonging to the foreground portion are a connection portion of characters.
- the merging processing module 570 includes a distribution statistics unit 571, a pixel selection unit 573, an average value calculation unit 575, and a merge execution unit 577, where:
- the distribution statistics unit 571 is configured to calculate the height of each character block in the binarized image to obtain the height distribution of the character blocks in the binarized image.
- the distribution statistics unit 571 also calculates the total height of the character blocks, that is, the sum of the heights of all the character blocks in the binarized image.
- the pixel selection unit 573 is configured to select, based on the obtained height distribution of the character blocks in the binarized image, target character blocks from the character blocks such that the ratio of the sum of the heights of the target character blocks to the total height of the character blocks exceeds a preset value.
- the average value calculation unit 575 is configured to calculate the height average of the target character blocks.
- the merge execution unit 577 is configured to merge, according to the height average, the character blocks that are in the same character line and overlap in the horizontal direction in the binarized image to obtain the word blocks of the text image.
- the apparatus as described above further includes a recognition module configured to recognize the text image containing the word blocks and, during recognition, to combine the word blocks according to their order in the text image to obtain the text content of the text image.
- a person skilled in the art may understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the related hardware, and the program may be stored in a computer-readable storage medium.
- the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Geometry (AREA)
- Computer Graphics (AREA)
- Artificial Intelligence (AREA)
- Character Input (AREA)
Abstract
Description
Claims (12)
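- A text image processing method, comprising: preprocessing a text image to obtain a binarized image, wherein the binarized image comprises a plurality of connected regions; obtaining, by a convex hull algorithm, the convex hull corresponding to each of the plurality of connected regions; obtaining a character region circumscribing the convex hull; performing character segmentation on the obtained character region to obtain a plurality of character blocks; and merging the character blocks according to the heights of the character blocks to obtain word blocks of the text image.
- The method according to claim 1, wherein the step of performing character segmentation on the obtained character region to obtain a plurality of character blocks comprises: locating a connection portion of characters in the character region; and segmenting the character region according to the connection portion to obtain the plurality of character blocks.
- The method according to claim 2, wherein the step of locating a connection portion of characters in the character region comprises: comparing the pixel values of adjacent pixels in each column of pixels of the character region to obtain the number of consecutive pixels belonging to the foreground portion in each column of pixels; and, for each column of pixels, determining whether the number of consecutive pixels belonging to the foreground portion is less than or equal to a preset number, and if so, determining that the consecutive pixels belonging to the foreground portion are a connection portion of characters.
- The method according to claim 1, wherein the step of merging the character blocks according to the heights of the character blocks comprises: calculating the heights of the character blocks to obtain a height distribution of the character blocks and a total height of the character blocks; selecting target character blocks from the character blocks, the ratio of the sum of the heights of the target character blocks to the total height of the character blocks exceeding a preset value; calculating a height average of the target character blocks; and merging, according to the height average, character blocks that are in the same character line and overlap in the horizontal direction in the binarized image.
- The method according to claim 1, further comprising: combining the word blocks according to the order of the word blocks in the text image to obtain the text content of the text image.
- A text image processing apparatus, comprising: a preprocessing module configured to preprocess a text image to obtain a binarized image, the binarized image comprising a plurality of connected regions; a convex hull obtaining module configured to obtain, by a convex hull algorithm, the convex hull corresponding to each of the plurality of connected regions and to obtain a character region circumscribing the convex hull; a segmentation module configured to perform character segmentation on the obtained character region to obtain a plurality of character blocks; and a merging processing module configured to merge the character blocks according to the heights of the character blocks to obtain word blocks of the text image.
- The apparatus according to claim 6, wherein the segmentation module comprises: a connection positioning unit configured to locate a connection portion of characters in the character region; and a segmentation execution unit configured to segment the character region according to the connection portion to obtain the plurality of character blocks.
- The apparatus according to claim 6, wherein the connection positioning unit comprises: a pixel comparison sub-unit configured to compare the pixel values of adjacent pixels in each column of pixels of the character region to obtain the number of consecutive pixels belonging to the foreground portion in each column of pixels; and a judgment sub-unit configured to determine, for each column of pixels, whether the number of consecutive pixels belonging to the foreground portion is less than or equal to a preset number, and if so, to determine that the consecutive pixels belonging to the foreground portion are a connection portion of characters.
- The apparatus according to claim 6, wherein the merging processing module comprises: a distribution statistics unit configured to calculate the heights of the character blocks to obtain a height distribution of the character blocks and a total height of the character blocks; a pixel selection unit configured to select target character blocks from the character blocks, the ratio of the sum of the heights of the target character blocks to the total height of the character blocks exceeding a preset value; an average value calculation unit configured to calculate a height average of the target character blocks; and a merge execution unit configured to merge, according to the height average, character blocks that are in the same character line and overlap in the horizontal direction in the binarized image.
- The apparatus according to claim 6, further comprising: a recognition module configured to combine the word blocks according to the order of the word blocks in the text image to obtain the text content of the text image.
- A text image processing device, comprising: one or more processors; and a memory storing program instructions that, when executed by the processors, configure the device to perform the method according to any one of claims 1 to 5.
- A computer-readable storage medium comprising program instructions that, when executed by a processor of a computing device, configure the device to perform the method according to any one of claims 1 to 5.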
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2017559607A JP6628442B2 (ja) | 2016-01-05 | 2016-12-30 | Text image processing method and apparatus |
KR1020177032664A KR102012819B1 (ko) | 2016-01-05 | 2016-12-30 | Text image processing method and apparatus |
MYPI2017703995A MY184167A (en) | 2016-01-05 | 2016-12-30 | Text image processing method and apparatus |
EP16883481.0A EP3401842B1 (en) | 2016-01-05 | 2016-12-30 | Text image processing method and apparatus |
US15/802,913 US10572728B2 (en) | 2016-01-05 | 2017-11-03 | Text image processing method and apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610004431.4 | 2016-01-05 | ||
CN201610004431.4A CN106940799B (zh) | 2016-01-05 | 2016-01-05 | Text image processing method and apparatus |
Related Child Applications (1)
Application Number | Filing Date | Title |
---|---|---|
US15/802,913 Continuation US10572728B2 (en) | 2017-11-03 | Text image processing method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017118356A1 (zh) | 2017-07-13 |
Family
ID=59273298
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2016/113843 WO2017118356A1 (zh) | 2016-01-05 | 2016-12-30 | Text image processing method and apparatus |
Country Status (7)
Country | Link |
---|---|
US (1) | US10572728B2 (zh) |
EP (1) | EP3401842B1 (zh) |
JP (1) | JP6628442B2 (zh) |
KR (1) | KR102012819B1 (zh) |
CN (1) | CN106940799B (zh) |
MY (1) | MY184167A (zh) |
WO (1) | WO2017118356A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
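CN111127478A (zh) * | 2019-12-13 | 2020-05-08 | 上海众源网络有限公司 | View block segmentation method and apparatus |
CN113076952A (zh) * | 2021-03-02 | 2021-07-06 | 西安中诺通讯有限公司 | Method and apparatus for automatic text recognition and enhancement |
CN116092087A (zh) * | 2023-04-10 | 2023-05-09 | 上海蜜度信息技术有限公司 | OCR recognition method, system, storage medium and electronic device |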
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
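CN110135417A (zh) * | 2018-02-09 | 2019-08-16 | 北京世纪好未来教育科技有限公司 | Sample labeling method and computer storage medium |
CN109299718B (zh) * | 2018-09-21 | 2021-09-24 | 新华三信息安全技术有限公司 | Character recognition method and apparatus |
CN111353511B (zh) * | 2018-12-20 | 2024-03-08 | 富士通株式会社 | Number recognition apparatus and method |
CN110020655B (zh) * | 2019-04-19 | 2021-08-20 | 厦门商集网络科技有限责任公司 | Binarization-based character denoising method and terminal |
CN110085709B (zh) * | 2019-05-05 | 2020-08-11 | 无锡职业技术学院 | Fully automatic counting and statistics system for LED images |
JP7406884B2 (ja) * | 2019-06-27 | 2023-12-28 | キヤノン株式会社 | Information processing apparatus, program, and control method |
CN110728129B (zh) * | 2019-09-03 | 2023-06-23 | 北京字节跳动网络技术有限公司 | Method, apparatus, medium and device for typesetting text content in a picture |
CN111325214B (zh) * | 2020-02-27 | 2023-02-14 | 珠海格力智能装备有限公司 | Jet-printed character extraction processing method and apparatus, storage medium and electronic device |
CN111461126B (zh) * | 2020-03-23 | 2023-08-18 | Oppo广东移动通信有限公司 | Method and apparatus for recognizing spaces in a text line, electronic device and storage medium |
CN111738326B (zh) * | 2020-06-16 | 2023-07-11 | 中国工商银行股份有限公司 | Method and apparatus for generating sentence-granularity labeled training samples |
CN112418204A (zh) * | 2020-11-18 | 2021-02-26 | 杭州未名信科科技有限公司 | Paper-document-based text recognition method, system and computer medium |
JP7344916B2 (ja) * | 2021-03-22 | 2023-09-14 | 楽天グループ株式会社 | Information processing apparatus, information processing method, and program |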
US20230368445A1 (en) * | 2022-05-13 | 2023-11-16 | Adobe Inc. | Layout-aware text rendering and effects execution |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101251892A (zh) * | 2008-03-07 | 2008-08-27 | 北大方正集团有限公司 | Character segmentation method and apparatus |
CN104951741A (zh) * | 2014-03-31 | 2015-09-30 | 阿里巴巴集团控股有限公司 | Character recognition method and apparatus |
CN104978576A (zh) * | 2014-04-02 | 2015-10-14 | 阿里巴巴集团控股有限公司 | Character recognition method and apparatus |
CN105046254A (zh) * | 2015-07-17 | 2015-11-11 | 腾讯科技(深圳)有限公司 | Character recognition method and apparatus |
CN105117706A (zh) * | 2015-08-28 | 2015-12-02 | 小米科技有限责任公司 | Image processing method and apparatus, and character recognition method and apparatus |
US20150356740A1 (en) * | 2014-06-05 | 2015-12-10 | Xerox Corporation | System for automated text and halftone segmentation |
CN105184289A (zh) * | 2015-10-10 | 2015-12-23 | 北京百度网讯科技有限公司 | Character recognition method and apparatus |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH01124082A (ja) * | 1987-11-10 | 1989-05-16 | Matsushita Electric Ind Co Ltd | Character recognition apparatus |
JP2995818B2 (ja) * | 1990-08-10 | 1999-12-27 | ソニー株式会社 | Character segmentation method |
EP0587450B1 (en) * | 1992-09-11 | 2004-11-17 | Canon Kabushiki Kaisha | Image processing method and apparatus |
JP3548234B2 (ja) * | 1994-06-29 | 2004-07-28 | キヤノン株式会社 | Character recognition method and apparatus |
JP3400151B2 (ja) * | 1994-12-08 | 2003-04-28 | 株式会社東芝 | Character string region extraction apparatus and method |
CN101515325B (zh) * | 2009-04-08 | 2012-05-23 | 北京邮电大学 | Character extraction method for digital video based on character segmentation and color clustering |
CN101751569B (zh) * | 2010-01-15 | 2012-01-04 | 西安电子科技大学 | Character segmentation method for offline handwritten Uyghur words |
CN102169542B (zh) * | 2010-02-25 | 2012-11-28 | 汉王科技股份有限公司 | Method and apparatus for segmenting touching characters in character recognition |
US8194983B2 (en) * | 2010-05-13 | 2012-06-05 | Hussein Khalid Al-Omari | Method and system for preprocessing an image for optical character recognition |
CN102456136B (zh) * | 2010-10-29 | 2013-06-05 | 方正国际软件(北京)有限公司 | Image-text segmentation method and system |
US9922263B2 (en) * | 2012-04-12 | 2018-03-20 | Tata Consultancy Services Limited | System and method for detection and segmentation of touching characters for OCR |
US10062001B2 (en) * | 2016-09-29 | 2018-08-28 | Konica Minolta Laboratory U.S.A., Inc. | Method for line and word segmentation for handwritten text images |
2016
- 2016-01-05 CN CN201610004431.4A patent/CN106940799B/zh active Active
- 2016-12-30 WO PCT/CN2016/113843 patent/WO2017118356A1/zh active Application Filing
- 2016-12-30 EP EP16883481.0A patent/EP3401842B1/en active Active
- 2016-12-30 JP JP2017559607A patent/JP6628442B2/ja active Active
- 2016-12-30 MY MYPI2017703995A patent/MY184167A/en unknown
- 2016-12-30 KR KR1020177032664A patent/KR102012819B1/ko active IP Right Grant
2017
- 2017-11-03 US US15/802,913 patent/US10572728B2/en active Active
Non-Patent Citations (1)
Title |
---|
See also references of EP3401842A4 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
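CN111127478A (zh) * | 2019-12-13 | 2020-05-08 | 上海众源网络有限公司 | View block segmentation method and apparatus |
CN111127478B (zh) * | 2019-12-13 | 2023-09-05 | 上海众源网络有限公司 | View block segmentation method and apparatus |
CN113076952A (zh) * | 2021-03-02 | 2021-07-06 | 西安中诺通讯有限公司 | Method and apparatus for automatic text recognition and enhancement |
CN113076952B (zh) * | 2021-03-02 | 2024-05-28 | 西安中诺通讯有限公司 | Method and apparatus for automatic text recognition and enhancement |
CN116092087A (zh) * | 2023-04-10 | 2023-05-09 | 上海蜜度信息技术有限公司 | OCR recognition method, system, storage medium and electronic device |
CN116092087B (zh) * | 2023-04-10 | 2023-08-08 | 上海蜜度信息技术有限公司 | OCR recognition method, system, storage medium and electronic device |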
Also Published As
Publication number | Publication date |
---|---|
EP3401842B1 (en) | 2022-02-09 |
MY184167A (en) | 2021-03-24 |
JP2018519574A (ja) | 2018-07-19 |
US20180053048A1 (en) | 2018-02-22 |
EP3401842A1 (en) | 2018-11-14 |
KR20170137170A (ko) | 2017-12-12 |
US10572728B2 (en) | 2020-02-25 |
EP3401842A4 (en) | 2019-08-28 |
KR102012819B1 (ko) | 2019-08-21 |
JP6628442B2 (ja) | 2020-01-08 |
CN106940799A (zh) | 2017-07-11 |
CN106940799B (zh) | 2020-07-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
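WO2017118356A1 (zh) | | Text image processing method and apparatus |
WO2018103608A1 (zh) | | Text detection method and apparatus, and storage medium |
CN109815788B (zh) | | Picture clustering method and apparatus, storage medium and terminal device |
US10452893B2 (en) | | Method, terminal, and storage medium for tracking facial critical area |
WO2016066042A1 (zh) | | Method and apparatus for segmenting commodity pictures |
WO2019119396A1 (zh) | | Facial expression recognition method and apparatus |
JP6794197B2 (ja) | | Information processing apparatus, information processing method, and program |
WO2023050651A1 (zh) | | Image semantic segmentation method, apparatus, device and storage medium |
WO2021164515A1 (zh) | | Method and apparatus for detecting tampered images |
KR102421604B1 (ko) | | Image processing method, apparatus and electronic device |
US11055526B2 (en) | | Method, system and apparatus for processing a page of a document |
WO2023109086A1 (zh) | | Character recognition method, apparatus, device and storage medium |
WO2020244076A1 (zh) | | Face recognition method and apparatus, electronic device and storage medium |
JP2006133941A (ja) | | Image processing apparatus, image processing method, image processing program, and portable terminal |
CN113139539B (zh) | | Method and apparatus for arbitrary-shaped scene text detection with asymptotic boundary regression |
JP2016081472A (ja) | | Image processing apparatus, image processing method, and program |
CN107480616B (zh) | | Image-analysis-based skin color detection unit analysis method and system |
KR20190122178A (ko) | | Image preprocessing apparatus and method for character recognition |
CN112749704A (zh) | | Text region detection method, apparatus and server |
JP4011859B2 (ja) | | Word image normalization apparatus, word image normalization program recording medium, and word image normalization program |
Choudhury et al. | | A Framework for Segmentation of Characters and Words from In-Air Handwritten Assamese Text |
CN117808749A (zh) | | Image detection method, and deep learning model training method and apparatus |
CN116778542A (zh) | | Face shape recognition method, apparatus, device and storage medium |
CN116740727A (zh) | | Bill image processing method and apparatus, electronic device and storage medium |
CN116206356A (zh) | | Behavior recognition apparatus and method, and electronic device |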
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 16883481; Country of ref document: EP; Kind code of ref document: A1 |
REEP | Request for entry into the European phase | Ref document number: 2016883481; Country of ref document: EP |
ENP | Entry into the national phase | Ref document number: 2017559607; Country of ref document: JP; Kind code of ref document: A |
ENP | Entry into the national phase | Ref document number: 20177032664; Country of ref document: KR; Kind code of ref document: A |
NENP | Non-entry into the national phase | Ref country code: DE |