CN110751140A - Character batch recognition method and device and computer equipment - Google Patents

Info

Publication number
CN110751140A
Authority
CN
China
Prior art keywords
character
target
image
result
proofreading
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910872233.3A
Other languages
Chinese (zh)
Inventor
张凡
魏华
陈志�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Guoxin Synthetic Technology Co Ltd
Original Assignee
Shenzhen Guoxin Synthetic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Guoxin Synthetic Technology Co Ltd filed Critical Shenzhen Guoxin Synthetic Technology Co Ltd
Priority to CN201910872233.3A priority Critical patent/CN110751140A/en
Publication of CN110751140A publication Critical patent/CN110751140A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

An embodiment of the invention discloses a character batch recognition method, a device, and computer equipment. The method comprises: acquiring a target image set, wherein the target image set comprises a plurality of target images, each target image comprises a character region to be recognized, and the character region comprises a plurality of target characters; segmenting and recognizing the target characters in the character region to be recognized to obtain a single-character image of each target character and a recognition result corresponding to the single-character image, and displaying the single-character image together with its recognition result; obtaining a proofreading result produced by proofreading the recognition result; and obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result. In this way, proofreading efficiency can be improved.

Description

Character batch recognition method and device and computer equipment
Technical Field
The invention relates to the technical field of image recognition, in particular to a method and a device for batch recognition of characters and computer equipment.
Background
In many application scenarios, filling in paper documents is important work, and linking paper documents with an electronic information system is key to efficient information transfer within an enterprise. At present, paper documents are converted into electronic documents mainly by manual entry, which greatly increases labor cost and is inefficient. With the development of Optical Character Recognition (OCR) technology, there is a trend toward replacing manual entry with machine recognition to improve entry efficiency. However, because of noise, contamination, imaging quality, and other constraints under working conditions, OCR cannot achieve a 100% recognition rate, so OCR recognition results need to be checked manually.
Manual checking typically includes the following steps: 1. memorizing the content on the paper document picture; 2. shifting the line of sight to the OCR recognition result area to confirm the recognition result; 3. if a character recognition error is found, checking the content on the paper document picture again for confirmation. Because human short-term memory generally holds no more than 6 characters, when the content on the paper document picture is long, for example an identity card number, memorization and line-of-sight shifting must be repeated many times, and checking efficiency is low.
Disclosure of Invention
In view of the above, it is necessary to provide a method, an apparatus and a computer device for efficient batch recognition of characters.
A batch character recognition method, comprising:
acquiring a target image set, wherein the target image set comprises a plurality of target images, the target images comprise character areas to be recognized, and the character areas comprise a plurality of target characters;
carrying out segmentation recognition on the target characters in the character region to be recognized to obtain a single character image of each target character and a recognition result corresponding to the single character image, and displaying the single character image and the recognition result corresponding to the single character image;
obtaining a proofreading result produced by proofreading the recognition result;
and obtaining the recognition result of the character area to be recognized in each target image according to the proofreading result.
In one embodiment, the target image comprises a plurality of character areas to be recognized; the obtaining of the recognition result of the character area to be recognized in each target image according to the proofreading result includes: and obtaining the recognition results corresponding to the character areas to be recognized in each target image according to the proofreading result.
In one embodiment, displaying the single-character image and the recognition result corresponding to the single-character image includes: generating a plurality of homogeneous sets according to the recognition results corresponding to the single-character images, wherein the single-character images in each homogeneous set have the same recognition result; and displaying the single-character images and their corresponding recognition results according to each homogeneous set.
In one embodiment, displaying the single-character images and their corresponding recognition results according to each homogeneous set includes: acquiring a category identifier input by a user, and determining a target homogeneous set according to the category identifier; and displaying each single-character image in the target homogeneous set, together with its recognition result, in a separate character display area.
In one embodiment, after each single-character image in the target homogeneous set and its recognition result are displayed in separate character display areas, the method further includes: if a first preset operation is detected in a character display area, determining a first single-character image corresponding to the first preset operation, and determining a target character region corresponding to the first single-character image; displaying the single-character image of each target character in the target character region in a first display area; and displaying the recognition result of the single-character image of each target character in the target character region in a second display area.
In one embodiment, obtaining a proofreading result produced by proofreading the recognition result includes: if a second preset operation is detected in a character display area, determining a second single-character image corresponding to the second preset operation, and displaying a proofreading result input box; and acquiring the proofreading result of the second single-character image that the user inputs through the proofreading result input box.
In one embodiment, obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result includes: acquiring a character position table corresponding to each target image, wherein the character position table records the position, within its character region, of each target character in the target image; and combining the proofreading results corresponding to the single-character images according to the character position table to obtain the recognition result of the character region to be recognized in each target image.
A batch character recognition apparatus comprising:
a first acquisition module, configured to acquire a target image set, wherein the target image set comprises a plurality of target images, the target images comprise character areas to be recognized, and the character areas comprise a plurality of target characters;
a segmentation recognition module, configured to perform segmentation recognition on the target characters in the character region to be recognized to obtain a single-character image of each target character and a recognition result corresponding to the single-character image, and to display the single-character image and the recognition result corresponding to the single-character image;
a second acquisition module, configured to obtain a proofreading result produced by proofreading the recognition result;
and a proofreading recognition module, configured to obtain the recognition result of the character region to be recognized in each target image according to the proofreading result.
In one embodiment, the target image comprises a plurality of character regions to be recognized; the proofreading recognition module comprises: a multi-region proofreading module, configured to obtain the recognition results corresponding to the character regions to be recognized in each target image according to the proofreading result.
In one embodiment, the segmentation recognition module includes: a homogeneous summarizing module, configured to generate a plurality of homogeneous sets according to the recognition results corresponding to the single-character images, wherein the single-character images in each homogeneous set have the same recognition result; and a homogeneous display module, configured to display the single-character images and their corresponding recognition results according to each homogeneous set.
In one embodiment, the homogeneous display module includes: an identifier acquisition module, configured to acquire a category identifier input by a user and determine a target homogeneous set according to the category identifier; and an identifier display module, configured to display each single-character image in the target homogeneous set, together with its recognition result, in a separate character display area.
In one embodiment, the apparatus further comprises: a first detection module, configured to, if a first preset operation is detected in a character display area, determine a first single-character image corresponding to the first preset operation and determine a target character region corresponding to the first single-character image; a first display module, configured to display the single-character image of each target character in the target character region in a first display area; and a second display module, configured to display the recognition result of the single-character image of each target character in the target character region in a second display area.
In one embodiment, the second acquisition module includes: a second detection module, configured to, if a second preset operation is detected in a character display area, determine a second single-character image corresponding to the second preset operation and display a proofreading result input box; and a user proofreading module, configured to acquire the proofreading result of the second single-character image that the user inputs through the proofreading result input box.
In one embodiment, the proofreading recognition module includes: a character table acquisition module, configured to acquire a character position table corresponding to each target image, wherein the character position table records the position, within its character region, of each target character in the target image; and a position combination module, configured to combine the proofreading results corresponding to the single-character images according to the character position table to obtain the recognition results of the character regions to be recognized in the target images.
A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of:
acquiring a target image set, wherein the target image set comprises a plurality of target images, the target images comprise character areas to be recognized, and the character areas comprise a plurality of target characters;
carrying out segmentation recognition on the target characters in the character region to be recognized to obtain a single character image of each target character and a recognition result corresponding to the single character image, and displaying the single character image and the recognition result corresponding to the single character image;
obtaining a proofreading result produced by proofreading the recognition result;
and obtaining the recognition result of the character area to be recognized in each target image according to the proofreading result.
A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring a target image set, wherein the target image set comprises a plurality of target images, the target images comprise character areas to be recognized, and the character areas comprise a plurality of target characters;
carrying out segmentation recognition on the target characters in the character region to be recognized to obtain a single character image of each target character and a recognition result corresponding to the single character image, and displaying the single character image and the recognition result corresponding to the single character image;
obtaining a proofreading result produced by proofreading the recognition result;
and obtaining the recognition result of the character area to be recognized in each target image according to the proofreading result.
The embodiment of the invention has the following beneficial effects:
The invention provides a character batch recognition method, a device, and computer equipment. First, a target image set is acquired, wherein the target image set comprises a plurality of target images, the target images comprise character regions to be recognized, and the character regions comprise a plurality of target characters. Then, segmentation recognition is performed on the target characters in the character region to be recognized to obtain a single-character image of each target character and a recognition result corresponding to the single-character image, and the single-character image and its recognition result are displayed. Next, a proofreading result produced by proofreading the recognition result is obtained. Finally, the recognition result of the character region to be recognized in each target image is obtained according to the proofreading result. Because the characters in the character regions of a plurality of target images are proofread at the same time, batch proofreading is achieved and proofreading efficiency is improved to a certain extent. Because each single-character image is displayed with its own recognition result, this one-to-one comparison is less error-prone than region-level proofreading (for example, proofreading a region of more than ten digits at once), so proofreading accuracy is improved. Finally, because proofreading is done character by character, the user does not need to memorize content repeatedly as in region-level proofreading and can compare directly, which further improves efficiency.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Wherein:
FIG. 1 is a schematic diagram illustrating a flow chart of an implementation of a batch character recognition method according to an embodiment;
FIG. 2 is a diagram of a single character image in one embodiment;
FIG. 3 is a diagram illustrating a display of a single character image and recognition results in one embodiment;
FIG. 4 is a schematic illustration of various display regions in one embodiment;
FIG. 5 is a flow diagram illustrating an implementation of step 106 in one embodiment;
FIG. 6 is a flow diagram illustrating an implementation of step 108 in one embodiment;
FIG. 7 is a block diagram showing the structure of a character batch recognition apparatus according to an embodiment;
FIG. 8 is a block diagram of a computer device in one embodiment.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, in one embodiment, a character batch recognition method is provided. The method is executed by a device capable of implementing the character batch recognition method of the embodiment of the present invention. The device may include, but is not limited to, a terminal and a server; the terminal may include, but is not limited to, a mobile phone, a tablet computer, a desktop computer, and a notebook computer, and the server may include, but is not limited to, a high-performance computer and a high-performance computer cluster. The character batch recognition method specifically comprises the following steps:
Step 102, acquiring a target image set, wherein the target image set comprises a plurality of target images, the target images comprise character regions to be recognized, and the character regions comprise a plurality of target characters.
A character region is formed by a plurality of characters in the target image. For example, the target image shown in FIG. 2 is a check image, in which 19105371 or 7788283596 is a character region. A target image may include a plurality of character regions; the check image shown in FIG. 2 includes three character regions, namely a check number character region, a check amount character region, and a payment bank character region.
The characters include Chinese characters, numbers, letters, and symbols.
Step 104, performing segmentation recognition on the target characters in the character region to be recognized to obtain a single-character image of each target character and a recognition result corresponding to the single-character image, and displaying the single-character image and the recognition result corresponding to the single-character image.
A single-character image is an image that includes only one character, as shown in FIG. 2.
The target characters in the character region to be recognized are first segmented to obtain a single-character image of each target character; each single-character image is then recognized to obtain its recognition result. The target characters may be segmented with an image segmentation algorithm, which may be a spatial-domain or a frequency-domain segmentation algorithm and is not specifically limited here; the image recognition method may include image matching recognition and recognition by a neural network model, and is likewise not specifically limited here.
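The segment-then-recognize flow of step 104 can be sketched as follows. This is only an illustration under stated assumptions: the text does not fix a segmentation algorithm, so a simple vertical-projection split (one possible spatial-domain approach) is swapped in, the character region is assumed to be a binarized bitmap, and the names `split_columns`, `crop`, and `segment_and_recognize` as well as the `recognize` callback are hypothetical.

```python
from typing import Callable, List, Tuple

Bitmap = List[List[int]]  # rows of pixels; 1 = ink, 0 = background (assumed binarized)


def split_columns(region: Bitmap) -> List[Tuple[int, int]]:
    """Return (start, end) column spans, end exclusive, that contain ink."""
    if not region:
        return []
    width = len(region[0])
    # A column counts as inked if any row has ink in it.
    inked = [any(row[x] for row in region) for x in range(width)]
    spans: List[Tuple[int, int]] = []
    start = None
    for x, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = x                  # a character begins
        elif not has_ink and start is not None:
            spans.append((start, x))   # a blank column ends the character
            start = None
    if start is not None:
        spans.append((start, width))   # character touching the right edge
    return spans


def crop(region: Bitmap, span: Tuple[int, int]) -> Bitmap:
    """Cut one single-character image out of the character region."""
    start, end = span
    return [row[start:end] for row in region]


def segment_and_recognize(
    region: Bitmap, recognize: Callable[[Bitmap], str]
) -> List[Tuple[Bitmap, str]]:
    """Pair every single-character image with its recognition result."""
    images = [crop(region, span) for span in split_columns(region)]
    return [(image, recognize(image)) for image in images]
```

The `recognize` callback stands in for whichever recognizer is used (image matching or a neural network model); touching or overlapping characters would need a more robust segmentation than this projection split.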
Illustratively, FIG. 3 shows two ways of displaying single-character images and their corresponding recognition results. In the first, single-character images with the same recognition result are displayed in the same row; in the second, single-character images with the same recognition result are displayed in the same column.
In one embodiment, the displaying the single-character image and the recognition result corresponding to the single-character image in step 104 includes:
Step 104A, generating a plurality of homogeneous sets according to the recognition results corresponding to the single-character images, wherein the single-character images in each homogeneous set have the same recognition result.
The single-character images of all target characters of all target images in the target image set are grouped by recognition result to generate the homogeneous sets. For example, assuming that the recognition results of the target characters are only 1 to 9, 9 homogeneous sets are generated, and every recognition result in a given homogeneous set is the same single value: only 1, or only 2, …, or only 9.
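The grouping in step 104A amounts to bucketing single-character images by recognition result across the whole target image set. A minimal sketch, assuming a hypothetical `SingleChar` record whose field names are chosen only for illustration:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SingleChar:
    image_id: str      # identifier of the single-character image (hypothetical)
    target_image: str  # target image the character was cut from
    region_id: str     # character region inside that target image
    position: int      # position of the character inside the region
    recognized: str    # recognition result for the single-character image


def build_homogeneous_sets(chars: Iterable[SingleChar]) -> Dict[str, List[SingleChar]]:
    """Group single-character images so each set shares one recognition result."""
    sets: Dict[str, List[SingleChar]] = defaultdict(list)
    for ch in chars:
        sets[ch.recognized].append(ch)
    return dict(sets)
```

Each record keeps its target image, region, and position so that the proofread characters can later be put back in place.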
Step 104B, displaying the single-character images and their corresponding recognition results according to each homogeneous set.
Illustratively, different homogeneous sets are displayed in different rows or in different columns.
Illustratively, the homogeneous set to display is chosen by the user. As shown in FIG. 4, the user has selected homogeneous set 0, so the single-character images in the homogeneous set whose recognition result is 0 are displayed in the display interface.
Specifically, when the target image set contains many target images and many target characters, a single screen cannot show the single-character images of all recognition results at once, so the single-character images of one category are displayed according to the user's selection. Specifically, displaying the single-character images and their corresponding recognition results according to each homogeneous set in step 104B includes:
Step 104B1, acquiring the category identifier input by the user, and determining the target homogeneous set according to the category identifier.
The category identifier uniquely identifies one homogeneous set.
Step 104B2, displaying each single-character image in the target homogeneous set, together with its recognition result, in a separate character display area.
A character display area displays one single-character image and the recognition result of that single-character image.
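Steps 104B1 and 104B2 can be pictured as selecting one set and laying its members out in a grid of character display areas. The sketch below makes two assumptions that the text does not state: the category identifier is taken to be the recognition result itself, and the display areas are arranged in rows of a fixed width. The function name `layout_target_set` and the `per_row` parameter are hypothetical.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, str]  # (single-character image id, recognition result shown with it)


def layout_target_set(
    sets: Dict[str, List[str]], category_id: str, per_row: int = 10
) -> List[List[Cell]]:
    """Pick the target homogeneous set and arrange it into rows of display areas."""
    members = sets.get(category_id, [])  # unknown identifier -> nothing to display
    cells = [(image_id, category_id) for image_id in members]
    # Chunk the cells into rows of at most per_row character display areas.
    return [cells[i:i + per_row] for i in range(0, len(cells), per_row)]
```

For example, `layout_target_set({"0": ["a", "b", "c"]}, "0", per_row=2)` yields two rows, the first with two display areas and the second with one.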
In one embodiment, after step 104B2, the method further comprises: step 104B3, if a first preset operation is detected in a character display area, determining a first single-character image corresponding to the first preset operation, and determining a target character region corresponding to the first single-character image; step 104B4, displaying the single-character image of each target character in the target character region in a first display area; step 104B5, displaying the recognition result of the single-character image of each target character in the target character region in a second display area.
The first preset operation comprises an operation of clicking a character display area. The user clicks a certain character display region, thereby determining a single-character image in the character display region clicked by the user as a first single-character image.
The association between single-character images and character regions is established in advance, as shown in Table 1, so that the corresponding character region can be determined directly from a single-character image, or the single-character images can be determined from a character region. It should be noted that the character regions in the target images need to be distinguishable. In the first mode, the character regions in different target images have unique region identifiers; for example, the character regions in target image 1 are Quyu1-1 and Quyu1-2, and the character regions in target image 2 are Quyu2-1 and Quyu2-2. In the second mode, the character regions in different target images share the same region identifiers; in this case, the target image is first determined from the correspondence between single-character images and target images, and the character region is then determined. For example, the single-character image DzftxN is first determined to belong to image 1, and then to belong to Quyu1 in image 1.
TABLE 1
[Table 1 is provided as an image in the original publication (Figure BDA0002203184680000091).]
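The two identification modes can be sketched with a single lookup whose region key is either a globally unique identifier (first mode) or an (image, region) pair (second mode). Table 1 is not available here, so every entry below is hypothetical sample data; only the identifier style (DzftxN, Quyu1-1) follows the text, and the function name `region_members` is an assumption.

```python
from typing import Dict, Hashable, List


def region_members(char_id: str, char_to_region: Dict[str, Hashable]) -> List[str]:
    """Return all single-character images in the same character region as char_id."""
    key = char_to_region[char_id]
    return [cid for cid, region in char_to_region.items() if region == key]


# First mode: region identifiers are unique across target images.
unique_ids = {"DzftxN": "Quyu1-1", "AbcdeF": "Quyu1-1", "GhijkL": "Quyu2-1"}

# Second mode: region identifiers repeat, so the target image is part of the key.
shared_ids = {
    "DzftxN": ("image1", "Quyu1"),
    "AbcdeF": ("image1", "Quyu1"),
    "GhijkL": ("image2", "Quyu1"),
}
```

In both modes, `region_members("DzftxN", ...)` returns `["DzftxN", "AbcdeF"]`: in the second mode the image component of the key keeps Quyu1 of image 1 apart from Quyu1 of image 2.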
The content of the character display areas, the content of the first display area, and the content of the second display area are all displayed on the currently displayed interface. As shown in FIG. 4, the first display area may be located in the lower right corner of the currently displayed interface, with the single-character images displayed in order from left to right; the second display area may be located in the lower left corner of the currently displayed interface, with the recognition results displayed from left to right in the same order as the corresponding single-character images in the first display area.
In this way, the user can use the first preset operation to view the character region that a given single-character image belongs to and check all characters in that character region directly, which improves checking efficiency.
Furthermore, the accuracy of the recognition result of the first single-character image is displayed in a third display area to alert the user, and single-character images with low accuracy can be highlighted. Further, if the accuracy of the recognition result of a first single-character image is lower than a preset accuracy, the single-character image displayed in the first display area is marked, for example with a red frame; alternatively, the character display area where the first single-character image is located is marked, for example with a red frame, or the first single-character image is marked with a specific color.
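The highlighting rule reduces to a threshold test on the accuracy reported for each recognition result. In the sketch below, the accuracy is assumed to be a score between 0 and 1 supplied by the recognizer, the default threshold of 0.9 is an arbitrary illustrative value, and `low_accuracy_ids` is a hypothetical name.

```python
from typing import Dict, List


def low_accuracy_ids(accuracy: Dict[str, float], preset: float = 0.9) -> List[str]:
    """Return the single-character images whose accuracy is below the preset value."""
    return [image_id for image_id, score in accuracy.items() if score < preset]
```

The returned identifiers are the ones a user interface would draw with a red frame or a specific color.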
In one embodiment, to expand the displayed content beyond that of the above embodiment, the single-character images of the target characters of every character region in a target image are displayed. Specifically, after step 104B2, the method further includes: step 104B6, if a first preset operation is detected in a character display area, determining a first single-character image corresponding to the first preset operation, and determining a first target image corresponding to the first single-character image; step 104B7, displaying the single-character image of each target character in the first target image in the first display area; step 104B8, displaying the recognition result of the single-character image of each target character in the first target image in the second display area.
The correspondence between target images and single-character images is established in advance, as shown in Table 2.
TABLE 2
[Table 2 is provided as an image in the original publication (Figure BDA0002203184680000101).]
Step 106, obtaining a proofreading result produced by proofreading the recognition result.
Specifically, a user proofreading mode is provided. As shown in FIG. 5, obtaining a proofreading result produced by proofreading the recognition result in step 106 includes:
Step 106A, if a second preset operation is detected in a character display area, determining a second single-character image corresponding to the second preset operation, and displaying a proofreading result input box.
The second preset operation includes, but is not limited to, a double-click operation. For example, suppose a single-character image shows 8 but its recognition result is 0. The single-character image is then classified into homogeneous set 0 and is displayed when homogeneous set 0 is displayed. On noticing the misclassification, the user double-clicks the character display area containing the single-character image 8; a proofreading result input box pops up, and the user enters 8 in it, which yields the proofreading result of the single-character image.
Step 106B, acquiring the proofreading result of the second single-character image that the user inputs through the proofreading result input box.
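Steps 106A and 106B can be modeled as a small piece of state: the second preset operation selects a single-character image, and the value typed into the input box is stored against it. The class below is a user-interface-free sketch; `ProofreadingSession` and its method names are hypothetical, and the pop-up input box is represented only by the `submit` call.

```python
from typing import Dict, List, Optional


class ProofreadingSession:
    """Tracks which single-character image is being corrected and the corrections made."""

    def __init__(self, displayed: List[str]) -> None:
        self.displayed = list(displayed)       # image ids, one per character display area
        self.corrections: Dict[str, str] = {}  # image id -> proofreading result
        self.pending: Optional[str] = None     # image whose input box is open

    def on_second_operation(self, area_index: int) -> str:
        """Step 106A: a double-click selects the image and opens the input box."""
        self.pending = self.displayed[area_index]
        return self.pending

    def submit(self, text: str) -> None:
        """Step 106B: store what the user typed as the proofreading result."""
        if self.pending is None:
            raise RuntimeError("no single-character image is selected")
        self.corrections[self.pending] = text
        self.pending = None
```

For the example in the text, the user double-clicks the area showing the image of 8 inside homogeneous set 0 and submits "8", after which `corrections` maps that image to "8".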
Step 108, obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result.
In one embodiment, a way of determining the recognition result of a character region is provided. As shown in FIG. 6, obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result in step 108 includes:
Step 108A, acquiring a character position table corresponding to each target image, wherein the character position table records the position, within its character region, of each target character in the target image.
Each target image in the target image set has its own character position table. The position of each character in a character region may be determined by the order in which the characters appear, which may be from left to right, from right to left, from top to bottom, or from bottom to top. Taking left to right as an example, if a character region is 3749, the position of the character 3 is Quyu1-1, the position of the character 7 is Quyu1-2, the position of the character 4 is Quyu1-3, and the position of the character 9 is Quyu1-4.
The character position table may be as shown in Table 3, where a character identifier uniquely identifies one character and a position identifier uniquely identifies one position.
TABLE 3
Figure BDA0002203184680000111
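As an illustration of how such a character position table could be built, the following sketch assigns position identifiers by appearance order. It assumes, hypothetically, that each segmented character comes with the (x, y) coordinate of its bounding box; the function name, the character identifiers, and the coordinate input are assumptions for this example, and only the `Quyu<region>-<index>` identifier format is taken from the description above.

```python
from typing import Callable, Dict, Tuple

Point = Tuple[int, int]  # (x, y) of a character's bounding box, assumed input

# One sort key per appearance order named in the description.
ORDER_KEYS: Dict[str, Callable[[Point], int]] = {
    "left_to_right": lambda p: p[0],
    "right_to_left": lambda p: -p[0],
    "top_to_bottom": lambda p: p[1],
    "bottom_to_top": lambda p: -p[1],
}


def build_position_table(region_no: int,
                         chars: Dict[str, Point],
                         order: str = "left_to_right") -> Dict[str, str]:
    """Map each character identifier to a position identifier such as 'Quyu1-3'."""
    key = ORDER_KEYS[order]
    ordered = sorted(chars, key=lambda char_id: key(chars[char_id]))
    return {char_id: f"Quyu{region_no}-{i}"
            for i, char_id in enumerate(ordered, start=1)}


# Region 1 reads 3749 from left to right; identifiers are hypothetical.
boxes = {"char_3": (10, 5), "char_7": (20, 5), "char_4": (30, 5), "char_9": (40, 5)}
table = build_position_table(1, boxes)
```

With the left-to-right order, `table` maps `char_3` to `Quyu1-1` and `char_9` to `Quyu1-4`, matching the 3749 example in the text.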
Step 108B: combine the proofreading results corresponding to the single-character images according to the character position table to obtain the recognition result of the character region to be recognized in each target image.
It should be noted that if the recognition result of a single-character image is correct, no correction is needed, and the recognition result of that single-character image is used directly as its proofreading result.
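Step 108B, together with the fallback rule just noted, can be sketched as follows. This is a minimal illustration under stated assumptions: the position table is a dictionary from a hypothetical character identifier to a `Quyu<region>-<index>` position identifier, and corrections are stored only for characters the user actually changed. The function name and dictionary layout are not from the application.

```python
from typing import Dict


def combine_region(position_table: Dict[str, str],
                   recognized: Dict[str, str],
                   proofread: Dict[str, str]) -> str:
    """Join per-character results in position order for a single character region."""

    def index_of(position_id: str) -> int:
        # 'Quyu1-12' -> 12; compare as integers so that 10 sorts after 9.
        return int(position_id.rsplit("-", 1)[1])

    ordered = sorted(position_table, key=lambda cid: index_of(position_table[cid]))
    # An uncorrected character keeps its recognition result as its proofreading result.
    return "".join(proofread.get(cid, recognized[cid]) for cid in ordered)


position_table = {"c1": "Quyu1-1", "c2": "Quyu1-2", "c3": "Quyu1-3", "c4": "Quyu1-4"}
recognized = {"c1": "3", "c2": "7", "c3": "4", "c4": "0"}   # last character misread
proofread = {"c4": "9"}                                      # user's single correction
region_text = combine_region(position_table, recognized, proofread)
```

Sorting on the numeric part of the position identifier, rather than on the identifier string, keeps regions with ten or more characters in the correct order.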
In one embodiment, the target image includes a plurality of character regions to be recognized. Step 108, obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result, includes: obtaining, according to the proofreading result, the recognition result corresponding to each character region to be recognized in each target image.
Because a target image may contain multiple character regions, the position of a character is expressed by combining a character identifier with a position identifier. For example, the character Zifu &3 has the position identifier Quyu1-3, which indicates that the character Zifu &3 is at position 3 in region 1.
In this way, multiple character regions in one target image can be recognized.
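The multi-region case can be illustrated by parsing each position identifier into a region number and an index and grouping characters by region. The sketch below assumes the `Quyu<region>-<index>` format shown above; the regular expression, function names, and character identifiers are illustrative assumptions rather than part of the described method.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

POSITION_RE = re.compile(r"^Quyu(\d+)-(\d+)$")


def parse_position(position_id: str) -> Tuple[int, int]:
    """Split 'Quyu1-3' into (region number 1, index 3)."""
    match = POSITION_RE.match(position_id)
    if match is None:
        raise ValueError(f"unexpected position identifier: {position_id!r}")
    return int(match.group(1)), int(match.group(2))


def combine_all_regions(position_table: Dict[str, str],
                        results: Dict[str, str]) -> Dict[int, str]:
    """Return one recognized string per character region of a target image."""
    regions: Dict[int, List[Tuple[int, str]]] = defaultdict(list)
    for char_id, position_id in position_table.items():
        region_no, index = parse_position(position_id)
        regions[region_no].append((index, results[char_id]))
    # Sort each region's characters by index, then join them.
    return {region_no: "".join(ch for _, ch in sorted(items))
            for region_no, items in sorted(regions.items())}


table = {"a": "Quyu1-2", "b": "Quyu1-1", "c": "Quyu2-1", "d": "Quyu2-2"}
final_chars = {"a": "7", "b": "3", "c": "X", "d": "9"}  # proofread results per character
per_region = combine_all_regions(table, final_chars)
```

Here `per_region` holds the string `37` for region 1 and `X9` for region 2, so one target image yields one result per character region.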
In the above character batch recognition method, a target image set is first acquired, where the target image set includes a plurality of target images, each target image includes a character region to be recognized, and the character region includes a plurality of target characters. The target characters in the character region to be recognized are then segmented and recognized to obtain a single-character image of each target character and the recognition result corresponding to that single-character image, and the single-character images and their recognition results are displayed. A proofreading result produced by proofreading the recognition results is then obtained. Finally, the recognition result of the character region to be recognized in each target image is obtained according to the proofreading result. Because the characters in the character regions of multiple target images are proofread together, proofreading is done in batches, which improves proofreading efficiency to a certain extent. Because each single-character image is displayed alongside its recognition result, this one-to-one comparison is less error-prone than proofreading a whole region at once (for example, a region containing more than ten digits), which improves proofreading accuracy. Finally, because proofreading is done one character at a time, the user does not need to memorize and re-check a string repeatedly, as in region-level proofreading, and can compare directly, which further improves efficiency.
As shown in Fig. 7, a character batch recognition apparatus 700 is provided, which specifically includes:
a first obtaining module 702, configured to obtain a target image set, where the target image set includes a plurality of target images, each target image includes a character region to be recognized, and the character region includes a plurality of target characters;
a segmentation recognition module 704, configured to segment and recognize the target characters in the character region to be recognized, obtain a single-character image of each target character and the recognition result corresponding to the single-character image, and display the single-character image and the recognition result corresponding to the single-character image;
a second obtaining module 706, configured to obtain a proofreading result produced by proofreading the recognition result; and
a proofreading recognition module 708, configured to obtain the recognition result of the character region to be recognized in each target image according to the proofreading result.
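The division of work among modules 702 to 708 can be sketched as a small class with one method per module. This is only an illustrative skeleton under strong assumptions: segmentation and recognition are delegated to a caller-supplied function (here a stand-in that returns fixed results), display is omitted, each image is assumed to contain a single character region, and every class, method, and identifier name is hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed interface: image name -> [(char_id, recognized_char), ...] in reading order.
Recognizer = Callable[[str], List[Tuple[str, str]]]


class BatchRecognitionSketch:
    def __init__(self, recognize: Recognizer) -> None:
        self.recognize = recognize
        self.images: List[str] = []
        self.results: Dict[str, List[Tuple[str, str]]] = {}
        self.corrections: Dict[str, str] = {}

    def obtain_images(self, images: Iterable[str]) -> None:
        """Role of the first obtaining module 702."""
        self.images = list(images)

    def segment_and_recognize(self) -> Dict[str, List[Tuple[str, str]]]:
        """Role of the segmentation recognition module 704 (display omitted)."""
        self.results = {img: self.recognize(img) for img in self.images}
        return self.results

    def obtain_proofreading(self, corrections: Dict[str, str]) -> None:
        """Role of the second obtaining module 706: char_id -> corrected character."""
        self.corrections = dict(corrections)

    def final_results(self) -> Dict[str, str]:
        """Role of the proofreading recognition module 708."""
        return {img: "".join(self.corrections.get(cid, ch) for cid, ch in chars)
                for img, chars in self.results.items()}


def fake_recognizer(image: str) -> List[Tuple[str, str]]:
    # Stand-in for a real segmenter and recognizer; returns canned results.
    canned = {"a.png": "3740", "b.png": "12"}
    return [(f"{image}-{i}", ch) for i, ch in enumerate(canned[image])]


device = BatchRecognitionSketch(fake_recognizer)
device.obtain_images(["a.png", "b.png"])
device.segment_and_recognize()
device.obtain_proofreading({"a.png-3": "9"})   # user corrects the misread last digit
batch_output = device.final_results()
```

Keeping corrections keyed by character identifier lets one proofreading pass cover characters from every image in the batch before any per-image result is assembled.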
The above character batch recognition apparatus first acquires a target image set, where the target image set includes a plurality of target images, each target image includes a character region to be recognized, and the character region includes a plurality of target characters. It then segments and recognizes the target characters in the character region to be recognized to obtain a single-character image of each target character and the recognition result corresponding to that single-character image, and displays the single-character images and their recognition results. It then obtains a proofreading result produced by proofreading the recognition results. Finally, it obtains the recognition result of the character region to be recognized in each target image according to the proofreading result. Because the characters in the character regions of multiple target images are proofread together, proofreading is done in batches, which improves proofreading efficiency to a certain extent. Because each single-character image is displayed alongside its recognition result, this one-to-one comparison is less error-prone than proofreading a whole region at once (for example, a region containing more than ten digits), which improves proofreading accuracy. Finally, because proofreading is done one character at a time, the user does not need to memorize and re-check a string repeatedly, as in region-level proofreading, and can compare directly, which further improves efficiency.
In one embodiment, the target image includes a plurality of character regions to be recognized; the proofreading recognition module 708 includes: a multi-region proofreading module, configured to obtain, according to the proofreading result, the recognition result corresponding to each character region to be recognized in each target image.
In one embodiment, the segmentation recognition module 704 includes: a same-class summarizing module, configured to generate a plurality of same-class sets according to the recognition results corresponding to the single-character images, where the single-character images in each same-class set have the same recognition result; and a same-class display module, configured to display the single-character images and their corresponding recognition results by same-class set.
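Generating same-class sets amounts to grouping single-character images by their recognition result. A minimal sketch follows, assuming recognition results are held in a dictionary from a hypothetical character identifier to the recognized character; the function name is illustrative.

```python
from collections import defaultdict
from typing import Dict, List


def build_same_class_sets(recognition_results: Dict[str, str]) -> Dict[str, List[str]]:
    """Group character identifiers so each set shares one recognition result."""
    sets: Dict[str, List[str]] = defaultdict(list)
    for char_id, result in recognition_results.items():
        sets[result].append(char_id)
    return dict(sets)


results = {"img1-c1": "0", "img1-c2": "8", "img2-c1": "0", "img3-c4": "0"}
same_class_sets = build_same_class_sets(results)
target_set = same_class_sets["0"]   # the set a user would review for category 0
```

Because every image in a set is supposed to show the same character, a misrecognized image (for example an 8 placed in set 0) stands out visually when the set is displayed.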
In one embodiment, the same-class display module includes: an identifier acquisition module, configured to acquire a category identifier input by the user and determine a target same-class set according to the category identifier; and an identifier display module, configured to display each single-character image in the target same-class set and its corresponding recognition result in different character display areas.
In one embodiment, the apparatus 700 further includes: a first detection module, configured to, if a first preset operation is detected in a character display area, determine the first single-character image corresponding to the first preset operation and determine the target character region corresponding to the first single-character image; a first display module, configured to display the single-character image of each target character in the target character region in a first display area; and a second display module, configured to display the recognition result of the single-character image of each target character in the target character region in a second display area.
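The lookup performed by the first detection module and the two display modules can be sketched as follows. The record layout, field names, and function name are hypothetical, and "display" is reduced to returning the content intended for the first and second display areas.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CharRecord:
    """Hypothetical record linking a single-character image to its source region."""
    char_id: str
    image_id: str
    region_no: int
    index: int          # position of the character inside its region
    recognized: str


def region_context(selected_id: str,
                   records: List[CharRecord]) -> Tuple[List[str], str]:
    """Return (single-character images for area 1, recognition results for area 2)."""
    by_id = {r.char_id: r for r in records}
    selected = by_id[selected_id]
    same_region = sorted(
        (r for r in records
         if r.image_id == selected.image_id and r.region_no == selected.region_no),
        key=lambda r: r.index,
    )
    first_area = [r.char_id for r in same_region]             # images, in order
    second_area = "".join(r.recognized for r in same_region)  # their results
    return first_area, second_area


records = [
    CharRecord("c1", "img1", 1, 1, "3"),
    CharRecord("c2", "img1", 1, 2, "7"),
    CharRecord("c3", "img1", 1, 3, "4"),
    CharRecord("c4", "img1", 1, 4, "0"),
    CharRecord("c5", "img2", 1, 1, "0"),
]
images_shown, results_shown = region_context("c4", records)
```

Showing the whole source region next to the selected character gives the user context for judging whether an ambiguous character was recognized correctly.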
In one embodiment, the second obtaining module 706 includes: a second detection module, configured to, if a second preset operation is detected in a character display area, determine the second single-character image corresponding to the second preset operation and display a proofreading result input box; and a user proofreading module, configured to acquire the proofreading result of the second single-character image that the user enters through the proofreading result input box.
In one embodiment, the proofreading recognition module 708 includes: a character table acquisition module, configured to acquire the character position table corresponding to each target image, where the character position table records the position, within its character region, of each target character in the target image; and a position combination module, configured to combine the proofreading results corresponding to the single-character images according to the character position table to obtain the recognition result of the character region to be recognized in each target image.
Fig. 8 is a diagram illustrating the internal structure of a computer device in one embodiment. The computer device may specifically be a terminal or a server. As shown in Fig. 8, the computer device includes a processor, a memory, and a network interface connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the character batch recognition method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the character batch recognition method. Those skilled in the art will appreciate that the structure shown in Fig. 8 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, the character batch recognition method provided by the present application can be implemented in the form of a computer program, and the computer program can run on a computer device as shown in Fig. 8. The memory of the computer device can store the program modules that make up the character batch recognition apparatus, for example the first obtaining module 702, the second obtaining module 706, and the segmentation recognition module 704.
A computer device is provided, comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the following steps:
acquiring a target image set, where the target image set includes a plurality of target images, each target image includes a character region to be recognized, and the character region includes a plurality of target characters;
segmenting and recognizing the target characters in the character region to be recognized to obtain a single-character image of each target character and the recognition result corresponding to the single-character image, and displaying the single-character image and the recognition result corresponding to the single-character image;
obtaining a proofreading result produced by proofreading the recognition result; and
obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result.
The above computer device first acquires a target image set, where the target image set includes a plurality of target images, each target image includes a character region to be recognized, and the character region includes a plurality of target characters. It then segments and recognizes the target characters in the character region to be recognized to obtain a single-character image of each target character and the recognition result corresponding to that single-character image, and displays the single-character images and their recognition results. It then obtains a proofreading result produced by proofreading the recognition results. Finally, it obtains the recognition result of the character region to be recognized in each target image according to the proofreading result. Because the characters in the character regions of multiple target images are proofread together, proofreading is done in batches, which improves proofreading efficiency to a certain extent. Because each single-character image is displayed alongside its recognition result, this one-to-one comparison is less error-prone than proofreading a whole region at once (for example, a region containing more than ten digits), which improves proofreading accuracy. Finally, because proofreading is done one character at a time, the user does not need to memorize and re-check a string repeatedly, as in region-level proofreading, and can compare directly, which further improves efficiency.
In one embodiment, the target image includes a plurality of character regions to be recognized; obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result includes: obtaining, according to the proofreading result, the recognition result corresponding to each character region to be recognized in each target image.
In one embodiment, displaying the single-character image and the recognition result corresponding to the single-character image includes: generating a plurality of same-class sets according to the recognition results corresponding to the single-character images, where the single-character images in each same-class set have the same recognition result; and displaying the single-character images and their corresponding recognition results by same-class set.
In one embodiment, displaying the single-character images and their corresponding recognition results by same-class set includes: acquiring a category identifier input by the user, and determining a target same-class set according to the category identifier; and displaying each single-character image in the target same-class set and its corresponding recognition result in different character display areas.
In one embodiment, the computer program, when executed by the processor, further causes the processor to perform the following steps: after the single-character images in the target same-class set and their corresponding recognition results are displayed in different character display areas, if a first preset operation is detected in a character display area, determining the first single-character image corresponding to the first preset operation, and determining the target character region corresponding to the first single-character image; displaying the single-character image of each target character in the target character region in a first display area; and displaying the recognition result of the single-character image of each target character in the target character region in a second display area.
In one embodiment, obtaining the proofreading result produced by proofreading the recognition result includes: if a second preset operation is detected in a character display area, determining the second single-character image corresponding to the second preset operation, and displaying a proofreading result input box; and acquiring the proofreading result of the second single-character image that the user enters through the proofreading result input box.
In one embodiment, obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result includes: acquiring the character position table corresponding to each target image, where the character position table records the position, within its character region, of each target character in the target image; and combining the proofreading results corresponding to the single-character images according to the character position table to obtain the recognition result of the character region to be recognized in each target image.
In one embodiment, a computer-readable storage medium is provided, which stores a computer program that, when executed by a processor, causes the processor to perform the following steps:
acquiring a target image set, where the target image set includes a plurality of target images, each target image includes a character region to be recognized, and the character region includes a plurality of target characters;
segmenting and recognizing the target characters in the character region to be recognized to obtain a single-character image of each target character and the recognition result corresponding to the single-character image, and displaying the single-character image and the recognition result corresponding to the single-character image;
obtaining a proofreading result produced by proofreading the recognition result; and
obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result.
With the above computer-readable storage medium, a target image set is first acquired, where the target image set includes a plurality of target images, each target image includes a character region to be recognized, and the character region includes a plurality of target characters. The target characters in the character region to be recognized are then segmented and recognized to obtain a single-character image of each target character and the recognition result corresponding to that single-character image, and the single-character images and their recognition results are displayed. A proofreading result produced by proofreading the recognition results is then obtained. Finally, the recognition result of the character region to be recognized in each target image is obtained according to the proofreading result. Because the characters in the character regions of multiple target images are proofread together, proofreading is done in batches, which improves proofreading efficiency to a certain extent. Because each single-character image is displayed alongside its recognition result, this one-to-one comparison is less error-prone than proofreading a whole region at once (for example, a region containing more than ten digits), which improves proofreading accuracy. Finally, because proofreading is done one character at a time, the user does not need to memorize and re-check a string repeatedly, as in region-level proofreading, and can compare directly, which further improves efficiency.
In one embodiment, the target image includes a plurality of character regions to be recognized; obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result includes: obtaining, according to the proofreading result, the recognition result corresponding to each character region to be recognized in each target image.
In one embodiment, displaying the single-character image and the recognition result corresponding to the single-character image includes: generating a plurality of same-class sets according to the recognition results corresponding to the single-character images, where the single-character images in each same-class set have the same recognition result; and displaying the single-character images and their corresponding recognition results by same-class set.
In one embodiment, displaying the single-character images and their corresponding recognition results by same-class set includes: acquiring a category identifier input by the user, and determining a target same-class set according to the category identifier; and displaying each single-character image in the target same-class set and its corresponding recognition result in different character display areas.
In one embodiment, the computer program, when executed by the processor, further causes the processor to perform the following steps: after the single-character images in the target same-class set and their corresponding recognition results are displayed in different character display areas, if a first preset operation is detected in a character display area, determining the first single-character image corresponding to the first preset operation, and determining the target character region corresponding to the first single-character image; displaying the single-character image of each target character in the target character region in a first display area; and displaying the recognition result of the single-character image of each target character in the target character region in a second display area.
In one embodiment, obtaining the proofreading result produced by proofreading the recognition result includes: if a second preset operation is detected in a character display area, determining the second single-character image corresponding to the second preset operation, and displaying a proofreading result input box; and acquiring the proofreading result of the second single-character image that the user enters through the proofreading result input box.
In one embodiment, obtaining the recognition result of the character region to be recognized in each target image according to the proofreading result includes: acquiring the character position table corresponding to each target image, where the character position table records the position, within its character region, of each target character in the target image; and combining the proofreading results corresponding to the single-character images according to the character position table to obtain the recognition result of the character region to be recognized in each target image.
It should be noted that the above character batch recognition method, character batch recognition apparatus, computer device, and computer-readable storage medium belong to one general inventive concept, and the contents of their respective embodiments are mutually applicable.
Those skilled in the art will understand that all or part of the processes of the methods in the above embodiments can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (10)

1. A character batch recognition method is characterized by comprising the following steps:
acquiring a target image set, wherein the target image set comprises a plurality of target images, the target images comprise character areas to be recognized, and the character areas comprise a plurality of target characters;
carrying out segmentation recognition on the target characters in the character region to be recognized to obtain a single character image of each target character and a recognition result corresponding to the single character image, and displaying the single character image and the recognition result corresponding to the single character image;
obtaining a proofreading result obtained by proofreading the recognition result;
and obtaining the recognition result of the character area to be recognized in each target image according to the proofreading result.
2. The method of claim 1, wherein the target image includes a plurality of character regions to be recognized;
the obtaining of the recognition result of the character area to be recognized in each target image according to the proofreading result includes: and obtaining the recognition results corresponding to the character areas to be recognized in each target image according to the proofreading result.
3. The method of claim 1, wherein the displaying the single-character image and the recognition result corresponding to the single-character image comprises:
generating a plurality of homogeneous sets according to the recognition result corresponding to the single character image, wherein the recognition result of the single character image in each homogeneous set is the same;
and displaying the single character images and the recognition results corresponding to the single character images according to each homogeneous set.
4. The method of claim 3, wherein displaying the single-character images and the recognition results corresponding to the single-character images according to each of the homogeneous sets comprises:
acquiring a category identification input by a user, and determining a target homogeneous set according to the category identification;
and displaying each single-character image in the target same-class set and the corresponding recognition result of the single-character image in different character display areas.
5. The method of claim 4, wherein after displaying the respective single-character images in the target homogeneous set and the recognition results corresponding to the single-character images in different character display regions, further comprising:
if a first preset operation is detected in a character display area, determining a first single character image corresponding to the first preset operation, and determining a target character area corresponding to the first single character image;
displaying the single character image of each target character in the target character area in a first display area;
and displaying the recognition result of the single character image of each target character in the target character area in a second display area.
6. The method as claimed in claim 4, wherein said obtaining a proofreading result obtained by proofreading the recognition result comprises:
if a second preset operation is detected in the character display area, determining a second single character image corresponding to the second preset operation, and displaying a proofreading result input box;
and acquiring a proofreading result of the second single-character image input by the user through the proofreading result input box.
7. The method according to any one of claims 1 to 6, wherein obtaining the recognition result of the character region to be recognized in each target image according to the collation result comprises:
acquiring a character position table corresponding to each target image, wherein the character position table records the positions of target characters in the target images in a character area;
and combining the proofreading results corresponding to the single character images according to the character position table to obtain the recognition result of the character area to be recognized in each target image.
8. A batch character recognition apparatus, comprising:
the first acquisition module is used for acquiring a target image set, wherein the target image set comprises a plurality of target images, the target images comprise character areas to be recognized, and the character areas comprise a plurality of target characters;
the segmentation recognition module is used for carrying out segmentation recognition on the target characters in the character region to be recognized to obtain a single character image of each target character and a recognition result corresponding to the single character image, and displaying the single character image and the recognition result corresponding to the single character image;
the second acquisition module is used for acquiring a proofreading result obtained by proofreading the recognition result;
and the proofreading recognition module is used for obtaining the recognition result of the character area to be recognized in each target image according to the proofreading result.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the batch recognition method of characters as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the batch recognition method of characters according to any one of claims 1 to 7.
CN201910872233.3A 2019-09-16 2019-09-16 Character batch recognition method and device and computer equipment Pending CN110751140A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910872233.3A CN110751140A (en) 2019-09-16 2019-09-16 Character batch recognition method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910872233.3A CN110751140A (en) 2019-09-16 2019-09-16 Character batch recognition method and device and computer equipment

Publications (1)

Publication Number Publication Date
CN110751140A true CN110751140A (en) 2020-02-04

Family

ID=69276434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910872233.3A Pending CN110751140A (en) 2019-09-16 2019-09-16 Character batch recognition method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN110751140A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114220111A (en) * 2021-12-22 2022-03-22 深圳市伊登软件有限公司 Image-text batch identification method and system based on cloud platform

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5852685A (en) * 1993-07-26 1998-12-22 Cognitronics Imaging Systems, Inc. Enhanced batched character image processing
JPH117492A (en) * 1997-06-19 1999-01-12 Baazu Joho Kagaku Kenkyusho:Kk Method and device for editing key entry
CN1426017A (en) * 2001-12-14 2003-06-25 全景软体股份有限公司 Method and its system for checking multiple electronic files
US20050232495A1 (en) * 2004-04-19 2005-10-20 International Business Machines Corporation Device for Outputting Character Recognition Results, Character Recognition Device, and Method and Program Therefor
US20130085746A1 (en) * 2011-09-30 2013-04-04 International Business Machines Corporation Proof reading of text data generated through optical character recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
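Li Chengcheng et al.: "Research and Implementation of OCR-Based Vertical Text Proofreading", Application Research of Computers (《计算机应用研究》) *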

Similar Documents

Publication Publication Date Title
CN109993112B (en) Method and device for identifying table in picture
CN107798299B (en) Bill information identification method, electronic device and readable storage medium
CN111476227B (en) Target field identification method and device based on OCR and storage medium
US9626555B2 (en) Content-based document image classification
WO2021012382A1 (en) Method and apparatus for configuring chat robot, computer device and storage medium
WO2021143088A1 (en) Synchronous check method and apparatus for multiple certificate types, and computer device and storage medium
CN110490190B (en) Structured image character recognition method and system
CN107358148B (en) Anti-cheating network investigation method and device based on handwriting recognition
CN111242124A (en) Certificate classification method, device and equipment
US20190384971A1 (en) System and method for optical character recognition
CN113111880B (en) Certificate image correction method, device, electronic equipment and storage medium
CN112396047B (en) Training sample generation method and device, computer equipment and storage medium
CN110147787A (en) Bank's card number automatic identifying method and system based on deep learning
US20210390299A1 (en) Techniques to determine document recognition errors
CN111858977B (en) Bill information acquisition method, device, computer equipment and storage medium
CN113255642A (en) Medical bill information integration method for injury claims
CN113837151A (en) Table image processing method and device, computer equipment and readable storage medium
CN112232336A (en) Certificate identification method, device, equipment and storage medium
CN111414914A (en) Image recognition method and device, computer equipment and storage medium
CN114005126A (en) Table reconstruction method and device, computer equipment and readable storage medium
CN114386013A (en) Automatic student status authentication method and device, computer equipment and storage medium
CN114332883A (en) Invoice information identification method and device, computer equipment and storage medium
CN110751140A (en) Character batch recognition method and device and computer equipment
CN110909816A (en) Picture identification method and device
CN114220103B (en) Image recognition method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200204
