CN109299718B - Character recognition method and device - Google Patents

Info

Publication number
CN109299718B
CN109299718B (application CN201811109858.6A)
Authority
CN
China
Prior art keywords
image
identified
image area
area
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811109858.6A
Other languages
Chinese (zh)
Other versions
CN109299718A (en)
Inventor
贺佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Security Technologies Co Ltd
Original Assignee
New H3C Security Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Security Technologies Co Ltd filed Critical New H3C Security Technologies Co Ltd
Priority to CN201811109858.6A priority Critical patent/CN109299718B/en
Publication of CN109299718A publication Critical patent/CN109299718A/en
Application granted granted Critical
Publication of CN109299718B publication Critical patent/CN109299718B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

The embodiment of the application provides a character recognition method and a character recognition device, which relate to the technical field of information processing, wherein the method comprises the following steps: determining an image area to be recognized where characters to be recognized are located in an image; judging whether the image area to be identified needs to be segmented or not according to the width and the height of the image area to be identified; if so, determining the segmentation position of the image area to be identified, and segmenting the image area to be identified according to the determined segmentation position to obtain an image sub-area of the image area to be identified; and respectively carrying out character recognition on each image subregion, and obtaining a character recognition result of the image region to be recognized according to the recognition result. By applying the scheme provided by the embodiment of the application to character recognition, the accuracy of character recognition can be improved.

Description

Character recognition method and device
Technical Field
The present application relates to the field of information processing technologies, and in particular, to a character recognition method and apparatus.
Background
Conventionally, character recognition is performed directly on the entire character region to be recognized in an image. Because the characters to be recognized are usually related to one another, treating the character region as a whole takes this relevance into account and can yield a more accurate recognition result.
However, whichever algorithm is used for character recognition, the number of nodes available for storing character recognition paths is limited during the recognition process. When the number of characters to be recognized is large, the optimal character recognition path is therefore easily lost partway through the process, which lowers the accuracy of character recognition. Here, one character recognition path corresponds to one character recognition result.
Disclosure of Invention
An object of the embodiments of the present application is to provide a character recognition method and device, so as to improve accuracy of character recognition. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present application provides a character recognition method, where the method includes:
determining an image area to be recognized where characters to be recognized are located in an image;
judging whether the image area to be identified needs to be segmented or not according to the width and the height of the image area to be identified;
if so, determining the segmentation position of the image area to be identified, and segmenting the image area to be identified according to the determined segmentation position to obtain an image sub-area of the image area to be identified;
and respectively carrying out character recognition on each image subregion, and obtaining a character recognition result of the image region to be recognized according to the recognition result.
In a second aspect, an embodiment of the present application provides a character recognition apparatus, including:
the area determining module is used for determining an image area to be recognized where the character to be recognized is located in the image;
the segmentation judging module is used for judging whether the image area to be identified needs to be segmented according to the width and the height of the image area to be identified, and if so, the position determining module is triggered;
the position determining module is used for determining the segmentation position of the image area to be identified,
the region segmentation module is used for segmenting the image region to be identified according to the determined segmentation position to obtain an image subregion of the image region to be identified;
and the character recognition module is used for respectively carrying out character recognition on each image subregion and obtaining a character recognition result of the image region to be recognized according to the recognition result.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the machine-executable instructions cause the processor to implement the steps of the character recognition method of the first aspect.
In a fourth aspect, embodiments of the present application provide a machine-readable storage medium storing machine-executable instructions that, when invoked and executed by a processor, cause the processor to: the steps of the character recognition method of the first aspect are implemented.
As can be seen from the above, in the scheme provided in the embodiment of the present application, after the image area to be recognized where the character to be recognized is located in the image is determined, if it is determined that the image area to be recognized needs to be segmented according to the width and the height of the image area to be recognized, the segmentation position of the image area to be recognized is determined, the image area to be recognized is segmented according to the determined segmentation position, image sub-areas of the image area to be recognized are obtained, character recognition is performed on each image sub-area, and a character recognition result of the image area to be recognized is obtained according to a recognition result. Therefore, in the scheme provided by the embodiment of the application, not only is the image area where the character is located taken as a unit for character recognition, but also the image area is segmented according to the width and the height of the image area, so that the number of the characters needing to be recognized is not too large in the character recognition process, the loss of the optimal character recognition path corresponding to the character recognition result in the character recognition process is not easy to cause, and the accuracy of character recognition can be improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1a is a schematic flowchart of a first character recognition method according to an embodiment of the present application;
fig. 1b is a schematic diagram of an image sub-region according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of a second character recognition method according to an embodiment of the present application;
FIG. 3 is a schematic illustration of a region provided in an embodiment of the present application;
fig. 4 is a schematic structural diagram of a first character recognition apparatus according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a second character recognition apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In order to solve the technical problem, embodiments of the present application provide a character recognition method and device.
In one embodiment of the present application, there is provided a character recognition method, including:
determining an image area to be recognized where characters to be recognized are located in an image;
judging whether the image area to be identified needs to be segmented according to the width and the height of the image area to be identified;
if so, determining the segmentation position of the image area to be identified, and performing segmentation processing on the image area to be identified according to the determined segmentation position to obtain an image sub-area of the image area to be identified;
and respectively carrying out character recognition on each image subregion, and obtaining a character recognition result of the image region to be recognized according to the recognition result.
As can be seen from the above, in the scheme provided by this embodiment, not only the image area where the character is located is used as a unit to perform character recognition, but also the image area is segmented according to the width and height of the image area, so that in the process of performing character recognition, the number of characters to be recognized is not too large, and the optimal character recognition path corresponding to the character recognition result is not easily lost in the character recognition process, thereby improving the accuracy of character recognition.
The following description will first describe part of variable identifiers related to the embodiments of the present application.
line_left represents the abscissa of the leftmost edge point of the image area to be recognized in the image.
line_right represents the abscissa of the rightmost edge point of the image area to be recognized in the image.
line_bottom represents the ordinate of the lowermost edge point of the image area to be recognized in the image.
line_top represents the ordinate of the uppermost edge point of the image area to be recognized in the image.
The following describes the execution body of the character recognition method provided in the embodiments of the present application. The execution body may be any of various electronic devices, such as a desktop computer, a notebook computer, a tablet computer, or a smartphone. The present application does not limit the specific form of the execution body.
The following describes the character recognition method provided in the embodiments of the present application in detail by using specific embodiments.
Fig. 1a is a schematic flowchart of a first character recognition method according to an embodiment of the present application; the method includes the following steps.
S101: and determining the image area to be recognized where the character to be recognized is located in the image.
The image is an image containing characters to be recognized. The image may be a black and white dot matrix image obtained by optically capturing an image of a paper document or the like.
Specifically, the characters to be recognized included in the image may be chinese characters, english characters, french characters, german characters, and the like, which is not limited in the present application.
Since the document may include a plurality of character lines, the image may also include a plurality of character lines, and each character line occupies a certain image area in the image. In order to facilitate character recognition, the image area where each character line in the image is located may be determined. When the image area where each character line in the image is located is determined, the method for determining the area where the character line in the image is located in the prior art may be used for determining the area where the character line in the image is located, and the present application does not limit this.
The image area to be recognized may be an image area where any character line in the image is located.
S102: and judging whether the image area to be identified needs to be segmented or not according to the width and the height of the image area to be identified. If so, S103 is executed.
After the image area to be recognized is determined, the width and the height of the image area to be recognized can be obtained, and generally, the width and the height are expressed by the number of pixel points.
In an embodiment of the application, whether the image area to be recognized needs to be segmented can be judged by checking whether the ratio calculated with the following expression is smaller than a preset threshold. If the ratio is smaller than the preset threshold, it is judged that the image area to be recognized does not need to be segmented; if the ratio is not smaller than the preset threshold, it is judged that the image area to be recognized does need to be segmented.
ratio=line_width/line_height
where ratio represents the above ratio, line_width represents the width of the image area to be recognized, and line_height represents the height of the image area to be recognized.
The preset threshold may be 25, 50, etc.
Since the height of the image area to be recognized characterizes, to some extent, the height of the characters in its character line, and the width and height of a character are usually roughly proportional, the ratio reflects the number of characters in the image area to be recognized. A ratio not smaller than the preset threshold indicates that the image area contains many characters, so the image area can be segmented to ensure the accuracy of character recognition.
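As a concrete illustration of this width-to-height test, the decision of step S102 can be sketched as a small function (a sketch only; the function name and the default threshold of 25 are illustrative, taken from the examples above):

```python
def needs_segmentation(line_width, line_height, threshold=25):
    """Judge whether a text-line image area should be segmented.

    ratio = line_width / line_height reflects the number of characters
    in the line; segmentation is judged necessary when the ratio is
    not smaller than the preset threshold (e.g. 25 or 50).
    """
    ratio = line_width / line_height
    return ratio >= threshold
```

For a line 1000 pixels wide and 20 pixels high the ratio is 50, so the area would be segmented; a 200 x 20 line (ratio 10) would not.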
In another embodiment of the application, whether the image area to be recognized needs to be segmented can also be judged as follows: first judge whether the height of the image area to be recognized is within a first preset range;
if so, judge whether the width of the image area to be recognized exceeds a second preset range, obtaining a judgment result.
If the judgment result indicates that the width of the image area to be recognized exceeds the second preset range, it is judged that the image area to be recognized needs to be segmented;
if the judgment result indicates that the width of the image area to be recognized does not exceed the second preset range, it is judged that the image area to be recognized does not need to be segmented.
In view of the above, in an embodiment of the present application, if the height of the image area to be recognized is not within the first preset range, the heights of the characters contained in the area may be considered abnormal; in that case the process of judging whether segmentation is required can be ended, and the image area can be treated as not requiring segmentation.
In addition, the first preset range and the second preset range may be set by a developer according to empirical values. Alternatively, the second preset range may be determined from the first preset range, the normal width-to-height ratio of a character, and the minimum number of characters an image area must contain before segmentation is required.
In one example, the second preset range is the product of those three values. For instance, if the first preset range is [10, 30] pixels, the width-to-height ratio is 1, and the minimum number of characters is 10, the second preset range may be set as [100, 300] pixels, or simply as "greater than 100 pixels".
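The product rule in this example can be written out directly (a sketch; `second_preset_range` is a hypothetical helper name, not from the patent):

```python
def second_preset_range(first_range, width_height_ratio, min_chars):
    """Derive the second preset range (for the width) as the product of
    the first preset range (for the height), the normal width-to-height
    ratio of a character, and the minimum number of characters an area
    must contain before segmentation is required."""
    low, high = first_range
    return (low * width_height_ratio * min_chars,
            high * width_height_ratio * min_chars)
```

With the values above, `second_preset_range((10, 30), 1, 10)` yields (100, 300) pixels.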
It should be noted that the above is merely an example; the present application does not limit the manner of judging whether to segment the image area to be recognized.
S103: and determining the segmentation position of the image area to be identified, and segmenting the image area to be identified according to the determined segmentation position to obtain the image sub-area of the image area to be identified.
When the image area to be recognized is subjected to segmentation processing, the number of image sub-areas into which the image area to be recognized is segmented can be determined according to the specific condition of the image area to be recognized. For example, since the ratio can characterize the number of characters contained in the image region to be recognized to some extent, the image region to be recognized can be divided into several image sub-regions according to the ratio.
For example, when the proportion is in the range of [25, 50), the image area to be identified is divided into two image sub-areas;
when the proportion is in the range of [50, 100), the image area to be identified is divided into three image sub-areas, and the like.
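One reading of the mapping above is that the number of sub-areas grows by one each time the ratio doubles past 25. The text only gives the first two intervals ("and the like"), so the continuation below is an assumption:

```python
def num_subregions(ratio):
    """Map the width/height ratio of a line to a number of sub-areas.

    [0, 25) -> 1 (no segmentation), [25, 50) -> 2, [50, 100) -> 3, ...
    Intervals beyond [50, 100) are extrapolated by doubling the bound.
    """
    if ratio < 25:
        return 1
    count, upper = 2, 50
    while ratio >= upper:
        count += 1
        upper *= 2
    return count
```

Under this extrapolation a ratio of 100 would give four sub-areas; other continuations of the example are equally possible.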
Of course, when the image region to be identified needs to be segmented, the number of the image sub-regions obtained by segmentation may also be preset.
In view of the above description, the segmentation position determined in this step may be one position or multiple positions.
In an embodiment of the present application, when determining the segmentation position of the image region to be identified, the following manner may be implemented.
Determining an initial segmentation position of an image area to be identified;
determining, as a first character area, the area of the image area to be recognized whose abscissa lies in the range [line_slice_initial - offset1, line_slice_initial + offset1], where line_slice_initial represents the abscissa of the initial segmentation position and offset1 represents a preset first offset;
determining the abscissa of the pixel column with the minimum sum of pixel values in the first character area as a first value;
and determining the position of the image area to be identified, of which the abscissa is the first value, as a segmentation position.
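The four steps above can be sketched with NumPy, assuming a grayscale line image with dark characters on a light background (the function name is illustrative):

```python
import numpy as np

def refine_split_position(line_img, line_slice_initial, offset1):
    """Search the window [line_slice_initial - offset1,
    line_slice_initial + offset1] (the first character area) for the
    pixel column whose pixel-value sum is smallest, and return the
    abscissa of that column as the segmentation position."""
    left = max(0, line_slice_initial - offset1)
    right = min(line_img.shape[1] - 1, line_slice_initial + offset1)
    window = line_img[:, left:right + 1]
    column_sums = window.sum(axis=0)           # one sum per pixel column
    return left + int(np.argmin(column_sums))  # abscissa of darkest column
```

A column containing the most black (character) pixels has the smallest sum and is returned as the split abscissa.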
The following describes the coordinate system of the image area to be recognized with reference to fig. 1b. The coordinate system may take the bottom-left vertex of the area where "Xinhua three information" is located as the origin, the horizontal direction as the abscissa axis, and the vertical direction as the ordinate axis. This is only an example and does not limit the present application.
In the scheme provided by this embodiment, after the initial segmentation position is determined, the initial segmentation position is not directly determined as the final segmentation position, but a more appropriate segmentation position is further found around the initial segmentation position, so that the accuracy of the finally determined segmentation position of the image region to be recognized can be improved, and the accuracy of character recognition is further improved.
In addition, as those skilled in the art will understand, when one pixel is represented by 1 bit, the image may be a binary image in which 0 and 1 represent black and white respectively; when one pixel is represented by 8 bits, 0 and 255 may represent black and white respectively. Typically, characters in an image are black and the background is white. Since 0 is smaller than 1 and smaller than 255, the pixel value of a black pixel is smaller than that of a white pixel, so the pixel column with the smallest sum of pixel values in the first character area contains the most black pixels (i.e., the pixels constituting characters), and the edges of most characters contain many such pixels. Taking the abscissa of the pixel column with the smallest pixel-value sum as the first value therefore makes the cut fall at a character edge when the image area is segmented; that is, it reduces, to a certain extent, the chance of cutting through the middle of a character, which in turn safeguards accurate subsequent character recognition.
Of course, the characters in an image may instead be represented by white and the background by black, with the pixel value of a black pixel still smaller than that of a white pixel. In that case, the pixel column with the largest sum of pixel values in the first character area contains the most white (character) pixels, so the abscissa of the pixel column with the largest pixel-value sum may be determined as the first value.
In addition to the above two cases, it may be determined that the abscissa of the pixel column in which the sum of the pixel values in the first character region is closest to the preset value is the first value.
The present application is described only by way of example, and the manner of determining the first value is not limited.
The value of the offset1 can be the height of the image area to be recognized.
Determining the initial segmentation positions of the image area to be recognized may be implemented as follows: obtain each initial segmentation position by equal division according to the number of image sub-areas to be produced.
The number of initial segmentation positions is the same as the number of segmentation positions; that is, each segmentation position is obtained from its corresponding initial segmentation position.
Suppose the image area to be recognized needs to be segmented into 2 image sub-areas. Then there is one segmentation position and one corresponding initial segmentation position, whose abscissa can be calculated with the following expression:
line_slice_initial=line_left+line_width/2
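Generalizing this formula, the initial segmentation positions for n sub-areas can be obtained by equal division (a sketch; the helper name is illustrative):

```python
def initial_split_positions(line_left, line_width, n_subregions):
    """Abscissas of the initial segmentation positions obtained by
    dividing the line into n_subregions equal parts; for n == 2 this
    reduces to line_slice_initial = line_left + line_width / 2."""
    return [line_left + (line_width * k) // n_subregions
            for k in range(1, n_subregions)]
```

For example, a line starting at abscissa 10 with width 90 split into three sub-areas gets initial positions at abscissas 40 and 70.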
In addition, the first character area may be a rectangular area, represented either by its bottom-left vertex (line_slice_initial - offset1, line_bottom) and top-right vertex (line_slice_initial + offset1, line_top), or by its top-left vertex (line_slice_initial - offset1, line_top) and bottom-right vertex (line_slice_initial + offset1, line_bottom).
S104: and respectively carrying out character recognition on each image subregion, and obtaining a character recognition result of the image region to be recognized according to the recognition result.
Specifically, character recognition may be performed on each image sub-area based on, for example, the LSTM (Long Short-Term Memory) algorithm.
Since an image sub-area may contain multiple characters, multiple recognition results may be obtained when each character is recognized. After each character in the sub-area is recognized, its recognition results are combined in sequence with one result for each previously recognized character; each such combination is called a character recognition path. Character recognition of an image sub-area may therefore yield multiple character recognition paths and hence multiple character recognition results. In addition, during recognition, a confidence can be obtained for each character recognition result. The confidence of a character recognition result can be understood as the probability that the characters in the image sub-area are the characters in that recognition result.
In this case, the character recognition result with the highest confidence among the multiple character recognition results may be taken as the final character recognition result of the image sub-area.
For example, suppose that the character actually contained in the image subregion is ABC.
When A is identified, the obtained identification result is as follows: A. n, namely, the character recognition algorithm considers that the recognized A has a certain probability of A and a certain probability of N.
When B is identified, the obtained identification result is as follows: B. e, namely, the character recognition algorithm considers that the recognized B has a certain probability of B and a certain probability of E.
Four character recognition paths are formed:
A->B、A->E、N->B、N->E。
when C is identified, the obtained identification result is as follows: C. l, that is, the character recognition algorithm considers that the recognized C has a certain probability of being C and a certain probability of being L.
At this time, eight character recognition paths are formed:
A->B->C、A->B->L、A->E->C、A->E->L、N->B->C、N->B->L、N->E->C、N->E->L。
since the image sub-region contains three characters in total, after C is recognized, the character recognition of the image sub-region is completed, and thus the eight character recognition paths also represent eight character recognition results.
It is assumed that the confidence degrees of the eight character recognition results corresponding to the eight character recognition paths are:
98%、25%、45%、50%、73%、30%、56%、67%
the character recognition result with the highest confidence is therefore ABC, the result corresponding to the character recognition path A->B->C.
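The eight-path example above can be reproduced by enumerating every candidate combination. Taking a path's confidence as the product of its per-character probabilities is one plausible definition, assumed here since the text only states that each path carries a confidence:

```python
from itertools import product

def best_recognition_path(candidates):
    """candidates: for each character position, a list of
    (character, probability) pairs. Enumerates every character
    recognition path and returns the text and confidence of the path
    with the highest confidence, where confidence is taken as the
    product of the per-character probabilities."""
    best_text, best_conf = None, -1.0
    for path in product(*candidates):
        conf = 1.0
        for _, prob in path:
            conf *= prob
        if conf > best_conf:
            best_conf = conf
            best_text = "".join(ch for ch, _ in path)
    return best_text, best_conf

candidates = [[("A", 0.9), ("N", 0.1)],   # results for the first character
              [("B", 0.8), ("E", 0.2)],   # results for the second
              [("C", 0.7), ("L", 0.3)]]   # results for the third
```

`best_recognition_path(candidates)` walks all eight paths A->B->C through N->E->L and selects ABC, mirroring the example above.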
In an embodiment of the application, character recognition can be performed on each image subregion, and then the recognition results of each image subregion are directly combined to obtain the character recognition result of the image region to be recognized.
In another embodiment of the present application, the character recognition for each image sub-region can be implemented through the following steps a-E until each image sub-region is traversed.
Step A: determining the first image sub-area in the image area to be recognized as the object to be recognized.
Step B: performing character recognition on the object to be recognized to obtain a first recognition result.
Step C: judging whether the number of characters contained in the first recognition result is smaller than a first preset number; if so, executing step D; if not, executing step E.
Specifically, the first preset number may be 3, 4, and so on.
Step D: directly taking the next image sub-area as the next object to be recognized, and returning to step B.
Step E: determining the next object to be recognized from the area occupied in the current object by the last second-preset-number of characters of the first recognition result, together with the next image sub-area, and returning to step B.
Since a determined segmentation position may split a common word (for example, the word for "technology"), which is unfavorable for character recognition, step E includes the trailing part of the current object in the next object to be recognized; the split word then appears within a single object to be recognized, which helps improve the accuracy of the recognition result.
Specifically, the second preset number may be 2, 3, 4, and so on.
In addition, the first preset number and the second preset number may be set by a developer according to an empirical value.
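Steps A to E can be sketched as a loop. For a self-contained illustration, the image sub-areas are stood in for by strings and the recognizer is passed as a parameter; in the actual scheme the carried-over part is a pixel region re-cropped from the image, not text:

```python
def recognize_line(subregions, recognize, first_preset=3, second_preset=2):
    """Sketch of steps A to E: recognize each object; if a result has at
    least first_preset characters, carry its last second_preset
    characters into the next object (step E); otherwise move on to the
    next sub-area directly (step D)."""
    pieces = []
    carry = ""                      # stands in for the carried-over area
    for i, sub in enumerate(subregions):
        obj = carry + sub           # step A / step E: form the object
        result = recognize(obj)     # step B: character recognition
        last = (i == len(subregions) - 1)
        if not last and len(result) >= first_preset:  # step C -> step E
            carry = result[-second_preset:]
            pieces.append(result[:-second_preset])
        else:                                         # step C -> step D
            carry = ""
            pieces.append(result)
    return "".join(pieces)
```

With a perfect recognizer (the identity function on these string stand-ins), the three sub-areas of the fig. 1b example recombine into the full character line.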
The above steps A to E are explained below with reference to FIG. 1 b.
Assume that the image area is as shown in fig. 1b and is divided into three image sub-areas, corresponding to the three rectangular frames in fig. 1b. One image sub-area, called sub-area X, contains: "Xinhua three information"; another image sub-area, called sub-area Y, contains: "security technology"; and another image sub-area, called sub-area Z, contains: "limited company". The first preset number is 3, and the second preset number is 2.
the process of character recognition for sub-areas X, Y and Z is as follows:
Step A1: take sub-area X as the object to be identified.
Step B1: perform character recognition on the object to be recognized, i.e. on sub-area X, to obtain a recognition result 1, the content of which is: "Xinhua three information", containing 5 Chinese characters.
Step C1: recognition result 1 contains 5 characters, which is not smaller than 3.
Step E1: take as the object to be recognized the new area 1 formed by the area, within sub-area X, of the last two characters of recognition result 1 (i.e. the area of the two characters "information") together with sub-area Y.
Step B2: perform character recognition on the new area 1 to obtain a recognition result 2, the content of which is: "information security technology", containing 6 Chinese characters.
Step C2: recognition result 2 contains 6 characters, which is not smaller than 3.
Step E2: take as the object to be recognized the new area 2 formed by the area, within sub-area Y, of the last two characters of recognition result 2 together with sub-area Z.
Step B3: perform character recognition on the new area 2 to obtain a recognition result 3, the content of which is: "Technology, Inc.".
Since sub-area Z is the last sub-area in the image area, after the above Step B3 is executed, the steps corresponding to the above Steps C-E may either continue to be executed or no longer be executed.
For convenience of description, an area of the last second preset number of characters in the first recognition result in the object to be recognized is referred to as a first area.
Since the image region to be recognized belongs to a region in the image, the respective image subareas are also part regions in the image, on the basis of which the respective characters in the above-mentioned first recognition result also correspond to part regions in the image.
In addition, since the second preset number of characters is the last second preset number of characters in the first recognition result, the areas of the first area and the next image sub-area in the image are adjacent to each other.
In addition, since there may be an error when determining the first region, after determining the first region, the first region may be corrected based on the abscissa of the leftmost edge point of the first region, and then a region obtained by combining the corrected first region and a next image sub-region may be used as a next object to be recognized.
Specifically, when the first region is corrected, the position of the image region to be recognized, whose abscissa is the second value, may be determined as the corrected left edge of the first region.
Wherein the second value is: the abscissa of the pixel column having the smallest sum of pixel values in a second character region, and the second character region is: the area of the image area to be recognized whose abscissa lies within the range [box_left - offset2, box_left + offset3], where box_left represents the abscissa of the leftmost edge point of the above first area, and offset2 and offset3 represent preset second and third offsets.
Under normal conditions the characters in an image are black and the background is white, and the pixel value of a black pixel is smaller than that of a white pixel, so the pixel column with the smallest sum of pixel values in the second character region contains the most black pixels (that is, pixels forming the characters). Since the edges of most characters contain relatively many pixels, taking the abscissa of that pixel column as the second value makes it likely that, when the image region is segmented, the cut falls at a character edge; this reduces, to a certain extent, the cases in which a character is cut through its middle, and thus helps keep subsequent character recognition accurate.
Of course, the characters in the image may instead be white and the background black, with the pixel value of a black pixel still smaller than that of a white pixel; in this case the pixel column with the largest sum of pixel values in the second character region contains the most white pixels, so the abscissa of the pixel column with the largest sum of pixel values may be determined as the second value.
In addition to the above two cases, the abscissa of the pixel column in the second character region whose sum of pixel values is closest to a preset value may also be determined as the second value.
The present application is described only by way of example, and the manner of determining the second value is not limited.
The values of offset2 and offset3 can be determined according to the height of the image area to be recognized, for example offset2 = 0.2 * line_height and offset3 = 0.15 * line_height. The coefficients multiplying line_height can be adjusted according to the specific situation, to prevent characters with a left-right structure from being mistakenly split down the middle.
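As a minimal sketch, the second value could be computed as follows for black text on a white background. The grayscale image is assumed to be a 2-D array, the 0.2/0.15 coefficients are the example values from the text, and the function name is an illustrative assumption.

```python
import numpy as np

def second_value(img, box_left, line_height):
    # Search window around the left edge of the first area; the
    # coefficients 0.2 and 0.15 are the example values from the text.
    offset2 = int(0.2 * line_height)
    offset3 = int(0.15 * line_height)
    lo = max(0, box_left - offset2)
    hi = min(img.shape[1], box_left + offset3 + 1)
    # Sum each pixel column of the second character region.
    col_sums = img[:, lo:hi].sum(axis=0)
    # Black text on white background: the column with the smallest
    # pixel sum contains the most character pixels.
    return lo + int(np.argmin(col_sums))
```

For white text on a black background, `np.argmin` would simply be replaced by `np.argmax`, per the alternative case described above.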
Since the next object to be recognized is determined according to the first region and the next image sub-region in step E, there may be repeated recognition results for the first region in the recognition results of two adjacent image sub-regions, and in this case, the recognition result with the highest confidence in the recognition results for the first region may be selected as the final recognition result for the first region.
As can be seen from the above, in the scheme provided by this embodiment, after the image area to be recognized where the characters to be recognized are located is determined in the image, if it is judged from the width and height of that area that segmentation is needed, the segmentation position is determined, the area is segmented at that position to obtain its image sub-areas, character recognition is performed on each image sub-area, and the character recognition result of the image area is obtained from the recognition results. Thus, in this scheme, character recognition does not simply take the whole image area where the characters are located as one unit; instead, the image area is segmented according to its width and height, so that the number of characters recognized at a time is not too large. This makes it less likely that the optimal character recognition path corresponding to the character recognition result is lost during recognition, and can therefore improve the accuracy of character recognition.
In an embodiment of the present application, referring to fig. 2, a flowchart of a second character recognition method is provided, and compared with the foregoing embodiment shown in fig. 1, in this embodiment, after S101 determines an image area to be recognized where a character to be recognized is located in an image, repositioning each edge of the image area to be recognized is further performed in the following manner.
S105: determining a region extending outward from an edge of the image region to be recognized by a preset edge offset as a target region.
Since most methods in the prior art may clip character edges when determining the image area to be recognized in an image, in order to ensure that a more accurate image area to be recognized is obtained, each edge of the image area to be recognized is repositioned using a preset edge offset.
Different edge offsets may be set for different edges of the image area to be identified. Assuming that the image area to be recognized is a rectangular area, the area has four edges, i.e., upper, lower, left, and right, and for this purpose, edge shift amounts may be set for the four edges, respectively.
The edge refers to any edge of the character of the image area to be recognized.
Taking the image area to be recognized as a rectangular area as an example, when the edge is the upper edge of the rectangular area, the target area is: and extending the area with the preset edge offset upwards from the upper edge, wherein the left edge and the right edge of the target area are upwards extended lines of the left edge and the right edge of the rectangular area. As shown in fig. 3, the area 1 is the rectangular area, and the area 2 is the target area.
When the edge is the left edge of the rectangular area, the target area is: and a region with a preset edge offset is extended leftwards from the left edge, and the upper edge and the lower edge of the target region are extension lines leftwards of the upper edge and the lower edge of the rectangular region. As shown in region 3 of fig. 3.
S106: performing edge relocation on the image area to be identified by using the pixel units in the target area.
Wherein, the pixel unit is: the pixel row or pixel column where the edge is located.
Since the edge of the area may be the upper edge, the lower edge, the left edge or the right edge of the area, and since the upper edge and the lower edge correspond to the pixel rows in the image and the left edge and the right edge correspond to the pixel columns in the image, the pixel units may be the pixel rows or the pixel columns when the edges are different. For example, for the upper and lower edges, the pixel cell may be the pixel row where the edge is located. For the left and right edges, the pixel cell may be the pixel column in which the edge is located.
In an embodiment of the present application, when performing edge relocation on an image area to be recognized by using pixel units in a target area, the following steps F to G may be performed.
Step F: determining a pair of adjacent first pixel units satisfying the following expression in the target region:
|Sum1-Sum2|<Th3
wherein Sum1 represents the sum of the pixel values of all pixels of one pixel unit in the first pixel unit pair, Sum2 represents the sum of the pixel values of all pixels of the other pixel unit in the first pixel unit pair, and Th3 represents a third preset number.
The Th3 may be the same or different for different edges.
Specifically, the value of Th3 may be 10, 20, 30, or the like. The value of Th3 can be set according to the specific situation of the image in the specific application scene.
In determining the first pixel cell pair in the target region, the first pixel cell pair may be confirmed pixel row by pixel row or pixel column by pixel column in a certain order.
For example, to ensure that the first pixel unit pair closest to the upper edge of the image area to be recognized is quickly found, the first pixel unit pair may be identified pixel by pixel row in the order from bottom to top;
in order to ensure that the first pixel unit pair closest to the lower edge of the previous image area to be recognized is quickly found, the first pixel unit pair can be confirmed pixel by pixel in a row from top to bottom;
in order to ensure that the first pixel unit pair closest to the left edge of the previous image area to be recognized is quickly found, the first pixel unit pair can be confirmed pixel by pixel in a right-to-left sequence;
in order to ensure that the first pixel unit pair closest to the right edge of the previous image area to be recognized is found quickly, the first pixel unit pair can be confirmed pixel by pixel in a left-to-right sequence.
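Step F's scan for a first pixel unit pair might look like the following sketch, assuming the target region is a 2-D grayscale array and that the pixel units are rows; the function name and the `order` parameter are illustrative assumptions, and "bottom_up" here means scanning from the largest row index toward the smallest.

```python
import numpy as np

def find_first_pair_rows(region, th3, order):
    # Step F: scan adjacent pixel-row pairs of the target region in the
    # given order and return the index y of the first pair (y, y+1)
    # whose pixel sums differ by less than th3; None if no pair exists.
    row_sums = region.sum(axis=1)
    idxs = range(len(row_sums) - 1)
    if order == "bottom_up":
        idxs = reversed(range(len(row_sums) - 1))
    for y in idxs:
        if abs(int(row_sums[y]) - int(row_sums[y + 1])) < th3:
            return y  # first pixel unit pair found
    return None  # no first pixel unit pair in the target region
```

The same scan applies to left/right edges by summing over `axis=0` so that the pixel units become columns.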
Step G: the edge of the image area to be identified is repositioned to be either pixel cell in the first pair of pixel cells.
In one embodiment of the present application, when no first pixel unit pair exists in the target region, a pair of adjacent second pixel units having the largest first absolute value in the target region may also be determined, where the first absolute value = |Sum1 - Sum2|; the edge of the image area to be identified is then repositioned to either pixel unit in the second pixel unit pair.
When the image area to be recognized is determined, under the influence of factors such as algorithm precision and the like, the determined image area to be recognized may have the situation that the character edge is cut.
In view of the above situation, in this embodiment, the step S102 of judging whether to perform the segmentation process on the image region to be recognized according to the width and the height of the image region to be recognized specifically includes:
S102A: and judging whether the image area to be identified after the edge is repositioned needs to be segmented or not according to the width and the height of the image area to be identified after the edge is repositioned.
As can be seen from the above, in the scheme provided by this embodiment, after the image area to be recognized is determined, the edge of the image area to be recognized is also repositioned, so that the probability of cutting the character edge is reduced, and the accuracy of character recognition can be further improved.
In the following, the edge relocation process of the region is described by taking the image region to be identified as a rectangular region as an example.
Repositioning the upper edge of the area
Assume that the preset edge offset corresponding to the upper edge is offset4, which may be, for example, 0.1 * line_height, and that the third preset number corresponding to the upper edge is Th3_t.
The target area is determined as: the rectangular region whose bottom-left corner is (line_left, line_top) and whose top-right corner is (line_right, line_top + offset4).
Traverse the pixel rows in the target area in bottom-to-top order, and determine the adjacent pixel rows in the target area whose pixel sums differ by less than Th3_t, denoted: pixel(y1), pixel(y1 + 1).
The upper edge of the area is repositioned to the pixel row with ordinate y1.
Repositioning the lower edge of the area
Assume that the preset edge offset corresponding to the lower edge is offset5, which may be, for example, 0.1 * line_height, and that the third preset number corresponding to the lower edge is Th3_b.
The target area is determined as: the rectangular region whose bottom-left corner is (line_left, line_bottom - offset5) and whose top-right corner is (line_right, line_bottom).
Traverse the pixel rows in the target area in top-to-bottom order, and determine the adjacent pixel rows in the target area whose pixel sums differ by less than Th3_b, denoted: pixel(y2), pixel(y2 + 1).
The lower edge of the area is repositioned to the pixel row with ordinate y2.
Thirdly, repositioning the left edge of the area
Assume that the preset edge offset corresponding to the left edge is offset6, which may be, for example, 0.35 * line_height, and that the third preset number corresponding to the left edge is Th3_l.
The target area is determined as: the rectangular region whose bottom-left corner is (line_left - offset6, line_bottom) and whose top-right corner is (line_left, line_top).
Traverse the pixel columns in the target area in right-to-left order, and determine the adjacent pixel columns in the target area whose pixel sums differ by less than Th3_l, denoted: pixel(x1), pixel(x1 + 1).
The left edge of the area is repositioned to the pixel column with abscissa x1.
Fourthly, repositioning the right edge of the area
Assume that the preset edge offset corresponding to the right edge is offset7, which may be, for example, 0.35 * line_height, and that the third preset number corresponding to the right edge is Th3_r.
The target area is determined as: the rectangular region whose bottom-left corner is (line_right, line_bottom) and whose top-right corner is (line_right + offset7, line_top).
Traverse the pixel columns in the target area in left-to-right order, and determine the adjacent pixel columns in the target area whose pixel sums differ by less than Th3_r, denoted: pixel(x2), pixel(x2 + 1).
The right edge of the area is repositioned to the pixel column with abscissa x2.
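Putting the pieces together for one edge, a sketch of upper-edge relocation could look like the following, assuming standard image coordinates (row 0 at the top, so the target strip lies above the box) and folding in the fallback to the largest-difference second pixel unit pair; all names and parameters are illustrative assumptions.

```python
import numpy as np

def relocate_top_edge(img, left, top, right, offset4, th3_t):
    # Target area: the strip of height offset4 extending upward
    # (outside the box) from the current top edge.
    y0 = max(0, top - offset4)
    strip = img[y0:top + 1, left:right + 1]
    sums = strip.sum(axis=1).astype(int)
    # Scan row pairs from the row nearest the old edge upward; the
    # first pair whose sums differ by less than th3_t gives the edge.
    best_y, best_diff = None, -1
    for y in range(len(sums) - 2, -1, -1):
        diff = abs(sums[y] - sums[y + 1])
        if diff < th3_t:
            return y0 + y
        if diff > best_diff:
            best_y, best_diff = y, diff
    if best_y is None:
        return top  # degenerate strip: keep the original edge
    # Fallback (no first pixel unit pair): use the adjacent pair with
    # the largest absolute difference, as in the second-pair rule.
    return y0 + best_y
```

The other three edges follow the same pattern with the strip taken below, left of, or right of the box and the sums computed over columns instead of rows.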
Corresponding to the character recognition method, the embodiment of the application also provides a character recognition device.
Fig. 4 is a schematic structural diagram of a first character recognition apparatus according to an embodiment of the present application, where the apparatus includes:
the region determining module 401 is configured to determine a region of the image to be recognized, where a character to be recognized in the image is located;
a segmentation judging module 402, configured to judge whether segmentation processing needs to be performed on the image region to be identified according to the width and the height of the image region to be identified, and if so, trigger a position determining module 403;
the position determining module 403 is configured to determine a segmentation position of the image region to be identified;
a region segmentation module 404, configured to perform segmentation processing on the image region to be identified according to the determined segmentation position, so as to obtain an image sub-region of the image region to be identified;
and the character recognition module 405 is configured to perform character recognition on each image sub-region respectively, and obtain a character recognition result of the image region to be recognized according to the recognition result.
In an embodiment of the present application, the position determining module 403 includes:
the initial position determining unit is used for determining an initial segmentation position of the image area to be identified;
a first region determining unit, configured to determine, as a first character region, the region of the image region to be recognized whose abscissa lies within the range [line_slice_initial - offset1, line_slice_initial + offset1], where line_slice_initial represents the abscissa of the initial segmentation position, and offset1 represents a preset first offset;
a value determination unit, configured to determine that an abscissa of a pixel column having a smallest sum of pixel values in the first character region is a first value;
and the segmentation position determining unit is used for determining the position of the image area to be identified, of which the abscissa is the first value, as the segmentation position.
In an embodiment of the present application, the character recognition module 405 includes:
the first object determining unit is used for determining a first image subarea in the image area to be identified as an object to be identified;
the character recognition unit is used for carrying out character recognition on the object to be recognized to obtain a first recognition result;
the number judging unit is used for judging whether the number of characters contained in the first recognition result is smaller than a first preset number or not, if so, the second object determining unit is triggered, and if not, the third object determining unit is triggered;
the second object determining unit is used for directly taking the next image sub-area as a next object to be identified and triggering the character identifying unit until each image sub-area is traversed;
and the third object determining unit is used for determining a next object to be recognized according to the area of the last second preset number of characters in the object to be recognized and the next image sub-area in the first recognition result, and triggering the character recognizing unit until each image sub-area is traversed.
As can be seen from the above, in the scheme provided by this embodiment, after the image area to be recognized where the characters to be recognized are located is determined in the image, if it is judged from the width and height of that area that segmentation is needed, the segmentation position is determined, the area is segmented at that position to obtain its image sub-areas, character recognition is performed on each image sub-area, and the character recognition result of the image area is obtained from the recognition results. Thus, in this scheme, character recognition does not simply take the whole image area where the characters are located as one unit; instead, the image area is segmented according to its width and height, so that the number of characters recognized at a time is not too large. This makes it less likely that the optimal character recognition path corresponding to the character recognition result is lost during recognition, and can therefore improve the accuracy of character recognition.
In an embodiment of the present application, referring to fig. 5, a schematic structural diagram of a second character recognition apparatus is provided, and compared with the foregoing embodiment shown in fig. 4, in this embodiment, the character recognition apparatus further includes:
an edge relocation module 406, configured to, after the region determination module 401 determines an image region to be identified in an image, relocate each edge of the image region to be identified;
wherein the edge relocation module 406 includes:
a second region determining unit 406A, configured to determine, as a target region, a region extending from an edge to outside the image region to be recognized by a preset edge offset;
an edge relocation unit 406B, configured to perform edge relocation on the image area to be identified by using pixel units in the target area, where the pixel units are: the pixel row or the pixel column where the edge is located;
the segmentation judging module 402 is specifically configured to judge whether segmentation processing needs to be performed on the image area to be identified after the edge is repositioned according to the width and the height of the image area to be identified after the edge is repositioned.
In an embodiment of the present application, the edge relocation unit 406B includes:
a first unit pair determining subunit, configured to determine a first pixel unit pair that satisfies the following expression and is adjacent to the target region:
|Sum1-Sum2|<Th3
wherein Sum1 represents the sum of the pixel values of all pixels of one pixel unit in the first pixel unit pair, Sum2 represents the sum of the pixel values of all pixels of the other pixel unit in the first pixel unit pair, and Th3 represents a third preset number;
and the first edge repositioning subunit is used for repositioning the edge of the image area to be identified into any pixel unit in the first pixel unit pair.
In an embodiment of the present application, the edge relocation unit 406B further includes:
a second unit pair determining subunit, configured to determine, when no first pixel unit pair exists in the target region, a pair of adjacent second pixel units having the largest first absolute value in the target region, where the first absolute value = |Sum1 - Sum2|;
A second edge relocation subunit, configured to relocate an edge of the image area to be identified to any pixel unit in the second pixel unit pair.
As can be seen from the above, in the scheme provided by this embodiment, after the image area to be recognized is determined, the edge of the image area to be recognized is also repositioned, so that the probability of cutting the character edge is reduced, and the accuracy of character recognition can be further improved.
Corresponding to the character recognition method, the embodiment of the application also provides electronic equipment.
Fig. 6 provides a schematic structural diagram of an electronic device, including: a processor 601 and a machine-readable storage medium 602, the machine-readable storage medium 602 storing machine-executable instructions executable by the processor 601, the processor 601 caused by the machine-executable instructions to: the character recognition method provided by the embodiment of the application is realized.
In one embodiment of the present application, there is provided a character recognition method, including:
determining an image area to be recognized where characters to be recognized are located in an image;
judging whether the image area to be identified needs to be segmented or not according to the width and the height of the image area to be identified;
if so, determining the segmentation position of the image area to be identified, and segmenting the image area to be identified according to the determined segmentation position to obtain an image sub-area of the image area to be identified;
and respectively carrying out character recognition on each image subregion, and obtaining a character recognition result of the image region to be recognized according to the recognition result.
It should be noted that other embodiments of the character recognition method implemented by the processor 601 through machine executable instructions are the same as the other embodiments mentioned in the previous embodiments of the method, and are not described herein again.
The machine-readable storage medium may include a Random Access Memory (RAM) and a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the machine-readable storage medium may be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components.
As can be seen from the above, when performing character recognition, the electronic device provided in this embodiment not only performs character recognition by using the image area where the character is located as a unit, but also performs segmentation processing on the image area according to the width and height of the image area, so that in the process of performing character recognition, the number of characters to be recognized is not too large, and the optimal character recognition path corresponding to the character recognition result is not easily lost in the character recognition process, thereby improving the accuracy of character recognition.
In accordance with the above character recognition method, embodiments of the present application further provide a machine-readable storage medium storing machine-executable instructions, which when invoked and executed by a processor, cause the processor to: the character recognition method provided by the embodiment of the application is realized.
In one embodiment of the present application, there is provided a character recognition method, including:
determining an image area to be recognized where characters to be recognized are located in an image;
judging whether the image area to be identified needs to be segmented or not according to the width and the height of the image area to be identified;
if so, determining the segmentation position of the image area to be identified, and segmenting the image area to be identified according to the determined segmentation position to obtain an image sub-area of the image area to be identified;
and respectively carrying out character recognition on each image subregion, and obtaining a character recognition result of the image region to be recognized according to the recognition result.
It should be noted that other embodiments of the character recognition method implemented by the processor caused by the machine executable instructions are the same as the other embodiments mentioned in the previous embodiment of the method, and are not described herein again.
As can be seen from the above, when performing character recognition by executing the machine executable instruction stored in the machine readable storage medium provided in this embodiment, not only the image area where the character is located is taken as a unit to perform character recognition, but also the image area is subjected to segmentation processing according to the width and height of the image area, so that in the process of performing character recognition, the number of characters to be recognized is not too large, and the optimal character recognition path corresponding to the character recognition result is not easily lost in the process of character recognition, thereby improving the accuracy of character recognition.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus, electronic device, and machine-readable storage medium embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference may be made to some descriptions of the method embodiments for relevant points.
The above description is only for the preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims (10)

1. A method of character recognition, the method comprising:
determining an image area to be recognized where characters to be recognized are located in an image;
judging whether the image area to be identified needs to be segmented or not according to the width and the height of the image area to be identified;
if so, determining the segmentation position of the image area to be identified, and segmenting the image area to be identified according to the determined segmentation position to obtain an image sub-area of the image area to be identified;
respectively carrying out character recognition on each image subregion, and obtaining a character recognition result of the image region to be recognized according to the recognition result;
the judging whether the image area to be identified needs to be segmented according to the width and the height of the image area to be identified comprises the following steps:
judging whether the ratio of the width to the height of the image area to be identified is smaller than a preset threshold value or not;
if the ratio is smaller than the preset threshold, judging that the image area to be identified does not need to be segmented;
if the ratio is not smaller than the preset threshold, judging that the image area to be identified needs to be segmented;
the determining the segmentation position of the image region to be identified comprises:
determining an initial segmentation position of the image area to be identified;
determining a region of the image area to be identified whose abscissa lies in the range [line_slice_initial - offset1, line_slice_initial + offset1] as a first character region, where line_slice_initial represents the abscissa of the initial segmentation position, and offset1 represents a preset first offset;
determining, as a first value, the abscissa of the pixel column whose sum of pixel values in the first character region is minimum/maximum/closest to a preset value;
determining the position of the image area to be identified, of which the abscissa is the first value, as a segmentation position;
the character recognition for each image subregion respectively comprises the following steps:
determining a first image subregion in the image region to be identified as an object to be identified;
performing character recognition on the object to be recognized to obtain a first recognition result;
judging whether the number of characters contained in the first recognition result is smaller than a first preset number or not;
if so, directly taking the next image subregion as a next object to be identified;
if not, determining the next object to be recognized according to the next image sub-region and the area, in the current object to be recognized, occupied by the last second preset number of characters of the first recognition result;
returning to the step of performing character recognition on the object to be recognized to obtain a first recognition result until traversing each image subregion;
and the image area to be identified is the image area where any character line in the image is located.
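Claim 1's width/height test and split-position search can be sketched as follows. This is an illustrative sketch only, not the patented implementation; the function names `needs_split` and `find_split_x` are hypothetical, and the region is assumed to be a grayscale array with dark characters on a light background, so a column with a small pixel-value sum is a likely inter-character gap.

```python
import numpy as np

def needs_split(region: np.ndarray, ratio_threshold: float) -> bool:
    """Claim 1's test: segment only when width/height is not smaller
    than the preset threshold."""
    height, width = region.shape
    return (width / height) >= ratio_threshold

def find_split_x(region: np.ndarray, line_slice_initial: int, offset1: int) -> int:
    """Search the first character region, i.e. the columns whose abscissa
    lies in [line_slice_initial - offset1, line_slice_initial + offset1],
    for the column whose pixel-value sum is minimal, and return its abscissa."""
    _, width = region.shape
    lo = max(0, line_slice_initial - offset1)
    hi = min(width, line_slice_initial + offset1 + 1)
    col_sums = region[:, lo:hi].sum(axis=0)   # one sum per candidate column
    return lo + int(np.argmin(col_sums))       # abscissa of the deepest gap
```

Taking the minimum column sum is only one of the three alternatives the claim names (minimum/maximum/closest to a preset value); the other two substitute `argmax` or a nearest-to-target search.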
2. The method according to claim 1, wherein after determining the image area to be recognized where the character to be recognized is located in the image, the method further comprises:
repositioning each edge of the image area to be identified in the following manner:
determining a region extending from the edge to the outside of the image region to be recognized by a preset edge offset as a target region;
adopting a pixel unit in the target area to perform edge relocation on the image area to be identified, wherein the pixel unit is as follows: the pixel row or the pixel column where the edge is located;
the judging whether the image area to be identified needs to be segmented according to the width and the height of the image area to be identified comprises the following steps:
and judging whether the image area to be identified after the edge is repositioned needs to be segmented or not according to the width and the height of the image area to be identified after the edge is repositioned.
3. The method according to claim 2, wherein the performing edge relocation on the image area to be identified by using pixel units in the target area comprises:
determining a first pixel unit pair of adjacent pixel units in the target region that satisfies the following expression:
|Sum1 - Sum2| < Th3
where Sum1 represents the sum of pixel values of all pixel points of one pixel unit in the first pixel unit pair, Sum2 represents the sum of pixel values of all pixel points of the other pixel unit in the first pixel unit pair, and Th3 represents a third preset value;
repositioning the edge of the image area to be identified to either pixel unit of the first pixel unit pair.
4. The method of claim 3, further comprising, when the first pixel unit pair is not present in the target region:
determining a second pixel unit pair of adjacent pixel units in the target region that has the largest first absolute value, where the first absolute value = |Sum1 - Sum2|;
repositioning the edge of the image area to be identified to either pixel unit of the second pixel unit pair.
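The edge-relocation rule of claims 3 and 4 can be sketched as below. This is a hedged illustration, not the claimed implementation: a "pixel unit" is taken to be a column of the target area (as for a left or right edge), and `relocate_edge` is a hypothetical name.

```python
import numpy as np

def relocate_edge(target: np.ndarray, th3: int) -> int:
    """Return the column index within `target` to move the edge to.

    Scan adjacent column pairs; the first pair whose pixel-sum difference
    |Sum1 - Sum2| is below Th3 wins (claim 3). If no pair qualifies, fall
    back to the pair with the largest difference (claim 4)."""
    sums = target.sum(axis=0)            # Sum of each pixel unit (column)
    diffs = np.abs(np.diff(sums))        # |Sum1 - Sum2| for each adjacent pair
    below = np.nonzero(diffs < th3)[0]
    if below.size:                       # claim 3: a qualifying pair exists
        return int(below[0])             # either unit of the pair may be used
    return int(np.argmax(diffs))         # claim 4: largest |Sum1 - Sum2|
```

The index returned is that of the left unit of the chosen pair; per the claims, either unit of the pair is an acceptable new edge position.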
5. An apparatus for character recognition, the apparatus comprising:
the area determining module is used for determining an image area to be recognized where the character to be recognized is located in the image;
the segmentation judging module is used for judging whether the image area to be identified needs to be segmented according to the width and the height of the image area to be identified, and if so, the position determining module is triggered;
the position determining module is used for determining the segmentation position of the image area to be identified,
the region segmentation module is used for segmenting the image region to be identified according to the determined segmentation position to obtain an image subregion of the image region to be identified;
the character recognition module is used for respectively carrying out character recognition on each image subregion and obtaining a character recognition result of the image region to be recognized according to the recognition result;
the segmentation judging module is specifically used for judging whether the ratio of the width to the height of the image area to be identified is smaller than a preset threshold; if the ratio is smaller than the preset threshold, judging that the image area to be identified does not need to be segmented; if the ratio is not smaller than the preset threshold, judging that the image area to be identified needs to be segmented;
the position determination module includes:
the initial position determining unit is used for determining an initial segmentation position of the image area to be identified;
a first region determining unit, configured to determine, as a first character region, a region in the image area to be identified whose abscissa lies in the range [line_slice_initial - offset1, line_slice_initial + offset1], where line_slice_initial represents the abscissa of the initial segmentation position, and offset1 represents a preset first offset;
a value determining unit, configured to determine, as a first value, the abscissa of the pixel column whose sum of pixel values in the first character region is minimum/maximum/closest to a preset value;
the segmentation position determining unit is used for determining the position of the image area to be identified, of which the abscissa is the first value, as a segmentation position;
the character recognition module includes:
the first object determining unit is used for determining a first image subarea in the image area to be identified as an object to be identified;
the character recognition unit is used for carrying out character recognition on the object to be recognized to obtain a first recognition result;
the number judging unit is used for judging whether the number of characters contained in the first recognition result is smaller than a first preset number or not, if so, the second object determining unit is triggered, and if not, the third object determining unit is triggered;
the second object determining unit is used for directly taking the next image sub-area as a next object to be identified and triggering the character identifying unit until each image sub-area is traversed;
the third object determining unit is configured to determine the next object to be recognized according to the next image sub-region and the area, in the current object to be recognized, occupied by the last second preset number of characters of the first recognition result, and to trigger the character recognition unit until each image sub-region is traversed;
and the image area to be identified is the image area where any character line in the image is located.
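The sub-region traversal shared by claims 1 and 5 can be sketched as follows. This is a hedged sketch: the recognizer `ocr` is a stand-in for any per-image character recognizer, and modelling sub-regions as strings (with `ocr` returning the text it "sees") keeps the control flow visible without committing to a particular OCR engine; in the real method the tail that is carried over is the image area of the last characters, not the characters themselves.

```python
from typing import Callable, List

def recognize_line(subregions: List[str], ocr: Callable[[str], str],
                   first_preset: int, second_preset: int) -> str:
    """When a result reaches first_preset characters, re-recognize the area
    of its last second_preset characters together with the next sub-region,
    so a character cut by the segmentation can still be recovered."""
    pieces = []
    obj = subregions[0]                  # first sub-region is the first object
    for nxt in subregions[1:]:
        result = ocr(obj)
        if len(result) < first_preset:
            pieces.append(result)        # short result: accept it whole
            obj = nxt                    # next sub-region becomes the object
        else:
            pieces.append(result[:-second_preset])  # accept all but the tail
            obj = result[-second_preset:] + nxt     # tail re-recognized with nxt
    pieces.append(ocr(obj))              # recognize the final object
    return "".join(pieces)
```

With an identity recognizer, a line split as `["abcdef", "gh"]` with `first_preset=5`, `second_preset=2` re-recognizes `"ef"` together with `"gh"` and still yields `"abcdefgh"`.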
6. The apparatus of claim 5, further comprising:
the edge repositioning module is used for repositioning each edge of the image area to be identified after the area determining module determines the image area to be identified in the image;
wherein the edge relocation module comprises:
the second area determining unit is used for determining an area which extends from the edge to the outside of the image area to be identified and is provided with a preset edge offset as a target area;
an edge relocation unit, configured to perform edge relocation on the image area to be identified by using a pixel unit in the target area, where the pixel unit is: the pixel row or the pixel column where the edge is located;
the segmentation judging module is specifically configured to judge whether segmentation processing needs to be performed on the image area to be identified after the edge is repositioned according to the width and the height of the image area to be identified after the edge is repositioned.
7. The apparatus of claim 6, wherein the edge relocation unit comprises:
a first unit pair determining subunit, configured to determine a first pixel unit pair of adjacent pixel units in the target region that satisfies the following expression:
|Sum1 - Sum2| < Th3
where Sum1 represents the sum of pixel values of all pixel points of one pixel unit in the first pixel unit pair, Sum2 represents the sum of pixel values of all pixel points of the other pixel unit in the first pixel unit pair, and Th3 represents a third preset value;
a first edge repositioning subunit, configured to reposition the edge of the image area to be identified to either pixel unit of the first pixel unit pair.
8. The apparatus of claim 7, wherein the edge relocation unit further comprises:
a second unit pair determining subunit, configured to determine, when the first pixel unit pair is not present in the target region, a second pixel unit pair of adjacent pixel units in the target region that has the largest first absolute value, where the first absolute value = |Sum1 - Sum2|;
a second edge repositioning subunit, configured to reposition the edge of the image area to be identified to either pixel unit of the second pixel unit pair.
9. An electronic device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to: carrying out the method steps of any one of claims 1 to 4.
10. A machine-readable storage medium having stored thereon machine-executable instructions that, when invoked and executed by a processor, cause the processor to: carrying out the method steps of any one of claims 1 to 4.
CN201811109858.6A 2018-09-21 2018-09-21 Character recognition method and device Active CN109299718B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811109858.6A CN109299718B (en) 2018-09-21 2018-09-21 Character recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811109858.6A CN109299718B (en) 2018-09-21 2018-09-21 Character recognition method and device

Publications (2)

Publication Number Publication Date
CN109299718A CN109299718A (en) 2019-02-01
CN109299718B true CN109299718B (en) 2021-09-24

Family

ID=65163833

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811109858.6A Active CN109299718B (en) 2018-09-21 2018-09-21 Character recognition method and device

Country Status (1)

Country Link
CN (1) CN109299718B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111553364B (en) * 2020-04-28 2022-10-11 支付宝(杭州)信息技术有限公司 Picture processing method and device
CN111563495B (en) * 2020-05-09 2023-10-27 北京奇艺世纪科技有限公司 Method and device for recognizing characters in image and electronic equipment
CN111680688B (en) * 2020-06-10 2023-08-08 创新奇智(成都)科技有限公司 Character recognition method and device, electronic equipment and storage medium

Citations (1)

Publication number Priority date Publication date Assignee Title
CN105069456A (en) * 2015-07-30 2015-11-18 北京邮电大学 License plate character segmentation method and apparatus

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
CN106940799B (en) * 2016-01-05 2020-07-24 腾讯科技(深圳)有限公司 Text image processing method and device
CN106599896A (en) * 2016-11-08 2017-04-26 广州视源电子科技股份有限公司 Character segmentation method, character segmentation device, element detection method, and element detection device
CN107330430B (en) * 2017-06-27 2020-12-04 司马大大(北京)智能***有限公司 Tibetan character recognition device and method
CN108446702B (en) * 2018-03-14 2022-05-31 深圳怡化电脑股份有限公司 Image character segmentation method, device, equipment and storage medium


Also Published As

Publication number Publication date
CN109299718A (en) 2019-02-01

Similar Documents

Publication Publication Date Title
WO2020140698A1 (en) Table data acquisition method and apparatus, and server
US11113618B2 (en) Detecting the bounds of borderless tables in fixed-format structured documents using machine learning
CN109299718B (en) Character recognition method and device
CN110069767B (en) Typesetting method based on electronic book, electronic equipment and computer storage medium
KR102435365B1 (en) Certificate recognition method and apparatus, electronic device, computer readable storage medium
CN110569830A (en) Multi-language text recognition method and device, computer equipment and storage medium
KR102208683B1 (en) Character recognition method and apparatus thereof
JP6951905B2 (en) How to cut out lines and words for handwritten text images
US20120128249A1 (en) Script-agnostic text reflow for document images
US9330331B2 (en) Systems and methods for offline character recognition
US11080464B2 (en) Correction techniques of overlapping digital glyphs
US10318803B1 (en) Text line segmentation method
KR102472821B1 (en) Method, device, chip circuit and computer program product for recognizing mixed typeset texts
CN104966092B (en) A kind of image processing method and device
CN110647882A (en) Image correction method, device, equipment and storage medium
CN111368638A (en) Spreadsheet creation method and device, computer equipment and storage medium
US8559718B1 (en) Defining a layout of text lines of CJK and non-CJK characters
US9519404B2 (en) Image segmentation for data verification
US8787702B1 (en) Methods and apparatus for determining and/or modifying image orientation
US8989485B2 (en) Detecting a junction in a text line of CJK characters
CN113887375A (en) Text recognition method, device, equipment and storage medium
CN112800824A (en) Processing method, device and equipment for scanning file and storage medium
JP2016038821A (en) Image processing apparatus
US8704850B2 (en) Two-dimensional object packing
CN111881050B (en) Text layer cutting method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant