CN109635798B - Information extraction method and device - Google Patents

Information extraction method and device

Info

Publication number
CN109635798B
CN109635798B
Authority
CN
China
Prior art keywords
area
image
position information
processed
information
Prior art date
Legal status
Active
Application number
CN201811487626.4A
Other languages
Chinese (zh)
Other versions
CN109635798A (en)
Inventor
潘鹏举
何春江
王根
Current Assignee
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN201811487626.4A
Publication of CN109635798A
Application granted
Publication of CN109635798B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an information extraction method and device. Template information is corrected according to the position information of regions in the image to be processed that are similar to the reference regions, and the corrected template information is then used to extract information from the image to be processed. Because the reference characters are content shared by the template image and the image to be processed, rather than calibration points added specifically for correction, they are distributed more uniformly and widely across the image than typical calibration points. Updating the template information based on the positions of the reference characters in the image to be processed therefore avoids the nonlinear error propagation caused by relying on calibration points, achieves higher accuracy, and improves the accuracy of the information extracted from the image to be processed.

Description

Information extraction method and device
Technical Field
The present application relates to the field of image processing, and in particular, to an information extraction method and apparatus.
Background
Techniques for automatically recognizing characters in images are widely used in many fields. To improve the accuracy of character recognition, a reference region in a template image is usually used to locate the corresponding reference region in an image to be processed; a region of interest is then cut from the image to be processed according to that reference region, and the characters in the region of interest are recognized. Taking intelligent marking (automated exam scoring) as an example: the image of a blank answer sheet serves as the template image, the image of an answer sheet filled out by an examinee serves as the image to be processed, and the area occupied by a question number in the template image serves as a reference region. The areas occupied by the question numbers in the image to be processed are located, the area between two such question-number areas is cut out as the region of interest, and the characters in the region of interest are then recognized.
However, due to factors in the imaging process, the image to be processed may deviate from the template image. For example, while scanning an answer sheet filled out by an examinee, the relative position of the answer sheet and the camera may differ from the relative position used when the template image was captured, so that the resulting image to be processed is tilted by some angle with respect to the template image. In this case, extracting the region of interest with the reference region of the template image may be off target, causing characters that should be recognized to be missing from the region of interest or interfering characters to be included in it.
To avoid positioning deviation of the region of interest, the prior art sets calibration points in both the template image and the image to be processed, calculates the deviation between the two images from the calibration points, and then corrects the image to be processed according to that deviation.
Disclosure of Invention
In the course of research, the applicant found that, when the image to be processed is large, correcting it with calibration points suffers from nonlinear error propagation: regions far from the calibration points are corrected poorly, so the accuracy of the extracted region of interest still needs improvement.
The application provides an information extraction method and device, aiming to better correct the deviation between the image to be processed and the template information so that more accurate information can be extracted from the image to be processed.
In order to achieve the above object, the present application provides the following technical solutions:
an information extraction method, comprising:
acquiring template information generated by a template image, wherein the template information comprises reference characters and position information of reference areas, and any one reference area is an area occupied by at least one reference character in the template image;
deleting non-reference characters from the image to be processed to obtain a reference image;
searching a target area in the reference image, wherein the target area is an area of which the similarity with any one reference area meets a preset condition;
and correcting the position information of the reference area according to the position information of the target area, wherein the corrected position information of the reference area is used for extracting information from the image to be processed.
Optionally, before correcting the position information of the reference region, the method further includes:
acquiring first position information, wherein the first position information is position information of a first reference area in the template image, and the first reference area is any one reference area;
intercepting the area indicated by the first position information from the image to be processed to obtain a contrast area;
and determining that the similarity between the first reference area and the contrast area is smaller than a preset threshold value.
Optionally, the number of the target regions is the same as the number of the reference regions;
the correcting the position information of the reference region according to the position information of the target region includes:
and replacing the position information of the reference area with the position information of the corresponding target area.
Optionally, the number of the target regions is smaller than the number of the reference regions;
the correcting the position information of the reference region according to the position information of the target region includes:
replacing the position information of the reference area with the position information of the corresponding target area;
calculating the deviation amount of the image to be processed and the template image according to the position information of the target area and the position information of the corresponding reference area;
and updating the position information of other reference areas according to the deviation amount, wherein the other reference areas are reference areas without corresponding target areas.
Optionally, the searching for the target region in the reference image includes:
searching a covering area of a second reference area on the reference image according to a preset step length, wherein the covering area is an area overlapped with the second reference area; the second reference region is any one of the reference regions;
calculating the similarity between the second reference area and the coverage area;
and taking the coverage area with the similarity meeting the preset condition as a target area of the second reference area.
Optionally, the deleting the non-reference character from the image to be processed includes:
dividing characters in the image to be processed into the reference characters and the non-reference characters by using a discrimination model obtained by pre-training;
and deleting the non-reference character from the image to be processed.
Optionally, the dividing, by using a pre-trained discrimination model, the characters in the image to be processed into the reference characters and the non-reference characters, and deleting the non-reference characters from the image to be processed includes:
inputting the image to be processed into the pre-trained discrimination model to obtain an output result, wherein the output result comprises position information and a probability of a candidate region, the candidate region is a region occupied by the non-reference characters, and the probability is the probability that the characters included in the candidate region are the non-reference characters;
and deleting the candidate region with the probability greater than a preset threshold value from the image to be processed according to the position information of the candidate region.
An information extraction apparatus comprising:
the acquisition module is used for acquiring template information generated from a template image, wherein the template information comprises reference characters and position information of reference areas, and any one of the reference areas is an area occupied by at least one of the reference characters in the template image;
the deleting module is used for deleting the non-reference characters from the image to be processed to obtain a reference image;
the searching module is used for searching a target area in the reference image, wherein the similarity between the target area and any one of the reference areas meets a preset condition;
and the correcting module is used for correcting the position information of the reference region according to the position information of the target region, and the corrected position information of the reference region is used for extracting information from the image to be processed.
An information extraction device comprising:
a memory and a processor;
the memory is used for storing one or more programs;
the processor is configured to execute the one or more programs to cause the information extraction device to implement the aforementioned information extraction method.
A computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to execute the aforementioned information extraction method.
According to the information extraction method and device provided by the application, the template information is corrected according to the position information of regions found in the image to be processed that are similar to the reference regions, and the corrected template information is then used to extract information from the image to be processed. Because the reference characters are content shared by the template image and the image to be processed, rather than calibration points added specifically for correction, they are distributed more uniformly and widely across the image than typical calibration points. Updating the template information based on the positions of the reference characters in the image to be processed therefore avoids the nonlinear error propagation caused by relying on calibration points, achieves higher accuracy, and improves the accuracy of the information extracted from the image to be processed.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart of an information extraction method disclosed in an embodiment of the present application;
FIG. 2 is a schematic diagram of a classifier designed based on a deep neural network;
FIG. 3 is a flow chart of another information extraction method disclosed in the embodiments of the present application;
FIG. 4 is an exemplary diagram of a blank answer sheet image;
FIG. 5 is an exemplary diagram of an image of an answering card of a test taker;
FIG. 6 is an exemplary diagram of a reference image obtained after handwritten characters are deleted from an image of an answering card of a test taker;
FIG. 7 is a schematic diagram of the similarity matrix generated by sliding;
FIG. 8 is a flow chart of yet another information extraction method disclosed in an embodiment of the present application;
fig. 9 is a schematic structural diagram of an information extraction apparatus disclosed in an embodiment of the present application.
Detailed Description
The basic difference between the information extraction method disclosed in the application and the existing method of correcting the image to be processed with calibration points so that it fits the template image is this: the application corrects the template information so that the template information fits the image to be processed.
The information extraction method disclosed in the embodiments of the application can serve as a preprocessing step for character recognition. Its purpose is to correct the template information so that, in the subsequent character recognition process, a more accurate region of interest can be extracted from the image to be processed using the template information, improving the accuracy of character recognition.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 shows an information extraction method disclosed in an embodiment of the present application, including the following steps:
S101: Acquiring a template image.
Alternatively, the template image may be input by the user, for example, the user uploads a scanned or captured image of a blank (i.e., unanswered) answer sheet as the template image.
S102: and acquiring template information.
The template information is generated from the template image. The template information includes reference characters and position information of the reference area. Wherein, any one reference area is the area occupied by at least one reference character in the template image.
Optionally, the template information is obtained as follows: the template image is displayed, the reference areas (usually rectangular areas) framed on the template image by the user are obtained through human-computer interaction, and the position information of each reference area is determined. The reference characters may be received from the user through human-computer interaction, or obtained by recognizing the characters in the reference areas.
For example, the template image shown in fig. 4 is displayed on a human-computer interaction interface that, in addition to the image, includes an input box for the reference area and an input box for the reference character. The user frames a rectangle around any question number in fig. 4 with the mouse; after the selection, the coordinates of the rectangle are automatically filled into the reference-area input box, and the question number contained in the rectangle is automatically filled into the reference-character input box (obtained by recognizing the characters in the reference area). Alternatively, the user may type the question number contained in the framed rectangle into the reference-character input box.
It should be noted that the template information input by the user may also be directly received without receiving and displaying the template image.
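For concreteness, the template information can be pictured as a list of reference characters paired with the positions they occupy in the template image. The following is a minimal Python sketch of one possible in-memory layout; the type and field names (ReferenceArea, TemplateInfo, and so on) are illustrative assumptions, not taken from the embodiment.

```python
# Hypothetical layout of the template information described above: each reference
# area pairs at least one reference character with its position in the template image.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ReferenceArea:
    character: str   # reference character(s), e.g. a printed question number such as "13"
    x: int           # left coordinate of the area in the template image (pixels)
    y: int           # top coordinate of the area in the template image (pixels)
    width: int
    height: int

@dataclass
class TemplateInfo:
    reference_areas: List[ReferenceArea] = field(default_factory=list)

# Example: a template whose user-framed reference areas are two question numbers.
template_info = TemplateInfo(reference_areas=[
    ReferenceArea(character="13", x=52, y=410, width=40, height=26),
    ReferenceArea(character="14", x=52, y=620, width=40, height=26),
])
```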
S103: and acquiring an image to be processed.
Alternatively, the image to be processed may be input by the user, for example, the user uploads a scanned or photographed image of the answer sheet filled out by the examinee as the image to be processed. Images to be processed may also be received from other devices.
S104: and deleting the non-reference characters from the image to be processed to obtain a reference image.
Specifically, the pre-trained discrimination model may be used to divide the characters in the image to be processed into reference characters and non-reference characters, and delete the non-reference characters.
Further, as shown in fig. 2, the discrimination model is a classifier designed based on a deep neural network, and includes an input layer, a feature extraction layer, a target region extraction layer, and an output layer.
The input layer receives any image to be processed (the "image" in fig. 2). The feature extraction layer adopts a convolutional neural network (CNN) structure: the image input at the input layer is convolved by the CNN to produce a feature map, which the feature extraction layer passes to the target region extraction layer. The target region extraction layer adopts a region proposal network (RPN) structure; an RPN is a neural network that slides a window over the feature map to perform object classification and region position regression, and the target region extraction layer generates a number of candidate regions to be classified in the feature map, each being a region that contains characters. The target region extraction layer then applies an ROI pooling operation to each candidate region to produce a fixed-length feature vector, which is fed to the output layer. The output layer consists of a fully connected layer followed by two branches: a softmax classifier, which outputs the candidate regions (those candidate regions to be classified that are judged to contain non-reference characters) and the probability that the characters they contain are non-reference characters, and a bounding-box regressor, which outputs the position information of the candidate regions.
The model shown in fig. 2 may be trained as follows: the parameters of the deep neural network are initialized layer by layer without supervision using a deep learning training method, and the parameters of each layer are then tuned in a supervised manner with the back-propagation algorithm. The specific parameters of each layer and the training and tuning methods can be found in the prior art and are not repeated here.
After the position information and probabilities of the candidate regions are obtained, the candidate regions whose probability of containing non-reference characters exceeds a preset threshold are deleted from the image to be processed according to the position information output by the model, yielding the reference image. One specific way of deleting such a candidate region is to set the candidate region containing non-reference characters to the background.
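To make the deletion step concrete, the sketch below assumes the discrimination model has already returned candidate boxes with probabilities and simply paints the confident ones with the background value. It is an illustrative Python sketch, not the embodiment's implementation, and the tuple layout of the candidates is an assumption.

```python
import numpy as np

def delete_non_reference_regions(image, candidates, threshold=0.5, background=255):
    """Return a reference image with likely non-reference regions removed.

    `image` is assumed to be a grayscale numpy array and `candidates` an iterable of
    (x, y, w, h, prob) tuples from the discrimination model, where prob is the
    probability that the region contains non-reference characters.
    Hypothetical helper for illustration only.
    """
    reference_image = image.copy()
    for x, y, w, h, prob in candidates:
        if prob > threshold:
            # "Deleting" a candidate region here means setting it to the background value.
            reference_image[y:y + h, x:x + w] = background
    return reference_image
```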
It should be noted that the above discrimination model and its functions are merely examples. Optionally, the discrimination model may directly output a discrimination result that no longer includes a probability but instead directly indicates the areas to be deleted; in that case the regions to be deleted do not need to be selected according to a probability.
S105: Searching a target area in the reference image, wherein the target area is an area whose similarity with any one reference area meets a preset condition.
Specifically, for any one reference area, every area that the reference area can cover (a coverage area for short) is searched for in the reference image. A coverage area has the same size and shape as the reference area, that is, any coverage area can completely overlap with the reference area. The similarity between the reference area and each coverage area is calculated, and the coverage area whose similarity to the reference area meets a preset condition (for example, the maximum among all the calculated similarities) is taken as the target area of the reference area.
Further, the coverage areas may be found by sliding matching: the reference area is slid over the reference image with a preset step (for example, one pixel), and each area covered by the reference area during the sliding is a coverage area. Of course, sliding matching is only one specific way of finding coverage areas, and other ways may be used.
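As a sketch of the sliding search just described (step size and similarity function left as parameters; the names are illustrative assumptions rather than the embodiment's code):

```python
import numpy as np

def find_target_area(reference_image, reference_patch, similarity, step=1):
    """Slide `reference_patch` over `reference_image` and return the top-left corner
    of the coverage area with the highest similarity, together with that similarity.

    `similarity` is any function that maps two equally sized grayscale patches to a
    score, for example a normalized correlation. Illustrative sketch only.
    """
    h, w = reference_patch.shape
    H, W = reference_image.shape
    best_score, best_pos = -np.inf, None
    for y in range(0, H - h + 1, step):
        for x in range(0, W - w + 1, step):
            coverage = reference_image[y:y + h, x:x + w]  # same size and shape as the patch
            score = similarity(reference_patch, coverage)
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_pos, best_score
```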
In this embodiment, a target area may be searched for every reference area, in which case the number of target areas obtained equals the number of reference areas. It is also possible to search for target areas of only some of the reference areas, in which case the number of target areas is smaller than the number of reference areas.
In this embodiment, a target area found by any one of the reference areas is referred to as a target area corresponding to the reference area.
S106: the position information of the reference area is corrected based on the position information of the target area.
Specifically, when target areas have been found for all the reference areas, the position information of each reference area is replaced with the position information of its corresponding target area.
When target areas have been found for only some of the reference areas, the position information of each reference area whose target area was found is replaced with the position information of that target area. For a reference area whose target area was not found (called an other reference area), the deviation between the image to be processed and the template image is calculated from the position information of the found target areas and the position information of their corresponding reference areas, and the position information of the other reference areas is updated according to that deviation. Specifically, the deviation is determined from the deviation between each pair of corresponding target area and reference area.
For example, the deviation may be the mean of the per-pair deviations between corresponding target and reference areas. The deviation may include a displacement (with the displacement direction represented by its sign), and adding the displacement to the position coordinates of an other reference area yields its updated position coordinates.
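The bookkeeping for the reference areas without a matched target area can be sketched as follows, assuming positions are (x, y) top-left corners and using the mean displacement mentioned above (the function name and data layout are illustrative):

```python
def update_unmatched_areas(matched_pairs, unmatched_positions):
    """Shift the reference areas that found no target area by the mean displacement
    observed over the matched pairs.

    matched_pairs: list of ((ref_x, ref_y), (target_x, target_y)) tuples.
    unmatched_positions: list of (x, y) positions of the other reference areas.
    Illustrative sketch; the mean is only one possible way to aggregate the deviations.
    """
    dx = sum(tx - rx for (rx, _), (tx, _) in matched_pairs) / len(matched_pairs)
    dy = sum(ty - ry for (_, ry), (_, ty) in matched_pairs) / len(matched_pairs)
    # The sign of dx and dy encodes the displacement direction.
    return [(x + dx, y + dy) for (x, y) in unmatched_positions]
```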
At this point, the correction of the template information is completed.
After the template information is corrected, the position information of the reference areas in the corrected template information can be used to locate the reference areas in the image to be processed, and the region of interest is cut out of the image to be processed according to those positions.
In the flow shown in fig. 1, areas similar to the reference areas are searched for in the image to be processed according to similarity, so each area found is very likely to contain the same characters as the corresponding reference area; in other words, the positions of the reference characters in the image to be processed can be determined. Because the reference characters are content shared by the template image and the image to be processed, rather than calibration points added specifically for correction, they are distributed more uniformly and widely across the image than typical calibration points. Updating the template information based on the positions of the reference characters in the image to be processed therefore avoids the nonlinear error propagation caused by relying on calibration points and achieves higher accuracy.
The flow shown in fig. 1 is described in detail below, taking intelligent marking as an example.
Fig. 3 shows another information extraction method disclosed in an embodiment of the present application, including the following steps:
S301: Acquiring the answer sheet template information.
The answer sheet template information includes position information of the reference areas and the reference characters. In this embodiment, the reference characters are printed characters. The template image used to generate the template information is a blank answer sheet image, as shown in fig. 4: a blank answer sheet contains only printed characters (such as question numbers) and no characters written by an examinee.
In practice, the printed question numbers or the characters in the question stems, rather than the horizontal answer lines, are generally used as reference characters.
S302: and obtaining an image of the answer sheet of the examinee.
As shown in fig. 5, the examinee's answer sheet image contains printed characters as well as the handwritten characters written by the examinee.
S303: Taking the examinee's answer sheet image as the input of the deep neural network model shown in fig. 2 to obtain an output result, wherein the output result comprises the position information of candidate regions and the probability that the characters in each candidate region are handwritten.
S304: Deleting from the examinee's answer sheet image the candidate regions whose probability of containing handwritten characters is greater than a preset threshold, to obtain a reference image.
Fig. 6 is a reference image obtained after deleting handwritten characters.
S305: and carrying out binarization processing on the template image, and carrying out binarization processing on the reference image.
The binarization processing helps remove noise from the images.
S306: For any reference area in the binarized template image, taking the reference area containing "13" shown in fig. 7 as an example, the reference area is slid over the binarized reference image with a step of one pixel, and the similarity between the reference area and the area it covers after each slide is calculated, yielding a similarity matrix.
The size (length and width) of the similarity matrix coincides with the size of the reference image because it is in steps of one pixel.
Each similarity in the similarity matrix is calculated according to the following formula:
R(x, y) = Σ_{x',y'} [ T'(x', y') · I'(x + x', y + y') ] / sqrt( Σ_{x',y'} T'(x', y')² · Σ_{x',y'} I'(x + x', y + y')² )
where x', y' are the coordinates of a pixel within the reference area, T'(x', y') is the pixel value at that position in the reference area, x, y are the coordinates in the reference image of the current coverage position, and I'(x + x', y + y') is the corresponding pixel value in the reference image.
S307: and taking the area corresponding to the maximum value in the similarity matrix as the target area of the reference area.
The area corresponding to the maximum value is the coverage area that produced that maximum similarity while the reference area was slid.
The target region of each reference region is obtained in accordance with S306 to S307.
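Steps S305 to S307 correspond closely to standard template matching. The following is a minimal sketch using OpenCV; the patent does not name a library, and the choice of TM_CCORR_NORMED as the similarity measure is an assumption consistent with the normalized-correlation form given above.

```python
import cv2

def locate_reference_area(reference_image, reference_patch):
    """Binarize both grayscale images, slide the patch with a one-pixel step, and
    return the top-left corner of the coverage area with the maximum similarity.
    Illustrative sketch only; inputs are assumed to be 8-bit grayscale arrays."""
    _, bin_image = cv2.threshold(reference_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, bin_patch = cv2.threshold(reference_patch, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # matchTemplate produces the similarity matrix described above: one score for each
    # coverage position, computed with a one-pixel step.
    sim_matrix = cv2.matchTemplate(bin_image, bin_patch, cv2.TM_CCORR_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(sim_matrix)
    return max_loc, max_val  # (x, y) of the target area and its similarity
```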
S308: the position information of each reference area is replaced with the position information of the corresponding target area.
Thus, updated template information is obtained.
S309: and extracting the region of interest from the image to be processed by using the position information of the reference region in the updated template information.
That is, using the position information of the reference areas in the updated template information, the reference areas in the image to be processed (the areas where the question numbers are located) are located; the area between the areas occupied by two question numbers is cut out as the region of interest, whose characters are the handwritten characters written by the examinee; the characters in the region of interest are then recognized, and the examinee's answer can be judged from the recognized characters.
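Cutting the region of interest between two corrected question-number areas then reduces to a crop. A minimal sketch under the assumption that positions are (x, y, w, h) boxes and that the crop spans the full image width:

```python
def cut_region_of_interest(image, upper_box, lower_box):
    """Crop the area between two vertically adjacent question-number areas.

    upper_box and lower_box are (x, y, w, h) boxes taken from the corrected template
    information. The crop runs from the bottom of the upper box to the top of the
    lower box over the full width. Illustrative only.
    """
    _, y_upper, _, h_upper = upper_box
    _, y_lower, _, _ = lower_box
    return image[y_upper + h_upper:y_lower, :]
```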
The information extraction method shown in fig. 1 or fig. 3 is applicable to any single image to be processed: before the region of interest is extracted from the image, the template information is corrected using the method of fig. 1 or fig. 3 so that a more accurate cutting region (the region of interest) is obtained. When there are many images to be processed, for example the large number of examination answer sheets handled by intelligent marking, it is possible to first judge, for each image to be processed, whether the current template information is applicable; if not, the template information is corrected using the method of fig. 1 or fig. 3, and if so, the correction step can be skipped to save computing resources. On this basis, the embodiment of the present application also discloses the method shown in fig. 8, which includes the following steps:
S801: Acquiring template information, wherein the template information comprises reference characters and position information of the reference areas.
S802: any one of the images to be processed is acquired.
S803: and intercepting a contrast area from the image to be processed, wherein the contrast area is an area with the same position information as the reference area.
Specifically, a contrast area may be cut out according to any one reference area, or the respective contrast areas of several reference areas may be cut out to obtain multiple contrast areas.
S804: and calculating the similarity of the reference region and the contrast region.
When multiple contrast areas are obtained, the similarity between each reference area and its contrast area may be calculated, and a similarity result may then be obtained from the multiple similarities, for example by taking their average.
S805: and judging whether the similarity is greater than a preset threshold value, if not, executing S806, and if so, extracting the region of interest in the image to be processed by using the template information without correcting the template information.
S806: and correcting the template information.
The specific steps of correcting the template information are as in S104-106, or S303-S308, which are not described herein again.
Because the similarity between the contrast area and the reference area reflects the similarity between the image to be processed and the template image, a similarity above the preset threshold indicates that the deviation of the image to be processed from the template image is too small to affect the accuracy of the character recognition result, so the template information does not need to be corrected.
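A sketch of the applicability check in S803 to S805, reusing the TemplateInfo layout from the earlier sketch; the threshold value, the averaging, and the use of a normalized correlation as the similarity are assumptions:

```python
import cv2
import numpy as np

def template_needs_correction(image_to_process, template_image, template_info, threshold=0.8):
    """Return True if the template information should be corrected for this image.

    For each reference area, a contrast area at the same position is cut from the
    image to be processed and compared with the reference area cut from the template
    image; the similarities are averaged. Inputs are assumed to be 8-bit grayscale
    arrays. Illustrative sketch only.
    """
    scores = []
    for area in template_info.reference_areas:
        x, y, w, h = area.x, area.y, area.width, area.height
        reference_patch = template_image[y:y + h, x:x + w]
        contrast_area = image_to_process[y:y + h, x:x + w]
        # With equally sized inputs matchTemplate returns a single similarity value.
        score = cv2.matchTemplate(contrast_area, reference_patch, cv2.TM_CCORR_NORMED)[0, 0]
        scores.append(float(score))
    return float(np.mean(scores)) < threshold
```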
For character recognition over a large number of images to be processed, such as in intelligent marking, the scanning of examinees' answer sheets is a continuous process. For any examinee's answer sheet image, the template information used can therefore be the template information corrected most recently (that is, corrected for the previous answer sheet image). When the most recently corrected template information is still applicable, repeated correction can be avoided, which significantly improves efficiency.
Of course, the initial template information may also be used, and the embodiment of the present application is not limited.
Fig. 9 is an information extraction apparatus disclosed in an embodiment of the present application, including: the device comprises an acquisition module, a deletion module, a search module and a correction module. Optionally, a determining module is further included.
The acquisition module is used for acquiring template information generated by a template image, the template information comprises reference characters and position information of reference areas, and any one reference area is an area occupied by at least one reference character in the template image. And the deleting module is used for deleting the non-reference characters from the image to be processed to obtain a reference image. The searching module is used for searching a target area in the reference image, wherein the target area is an area of which the similarity with any one reference area meets a preset condition. The correction module is used for correcting the position information of the reference area according to the position information of the target area, and the corrected position information of the reference area is used for extracting information from the image to be processed.
The determining module is used for acquiring first position information before the correcting module corrects the position information of the reference area, wherein the first position information is the position information of the first reference area in the template image, and the first reference area is any one of the reference areas. And intercepting the area indicated by the first position information from the image to be processed to obtain a comparison area, and determining that the similarity between the first reference area and the comparison area is smaller than a preset threshold value.
Further, the correction module corrects the position information of the reference region according to the position information of the target region in a specific implementation manner as follows: and replacing the position information of the reference area with the position information of the corresponding target area when the number of the target areas is the same as the number of the reference areas. Replacing the position information of the reference area with the position information of the corresponding target area when the number of the target areas is smaller than the number of the reference areas; calculating the deviation amount of the image to be processed and the template image according to the position information of the target area and the position information of the corresponding reference area; and updating the position information of other reference areas according to the deviation amount, wherein the other reference areas are reference areas without corresponding target areas.
The specific implementation manner of the searching module for searching the target area in the reference image is as follows: searching a covering area of a second reference area on the reference image according to a preset step length, wherein the covering area is an area overlapped with the second reference area; the second reference region is any one of the reference regions; calculating the similarity between the second reference area and the coverage area; and taking the coverage area with the similarity meeting the preset condition as a target area of the second reference area.
The specific implementation mode of deleting the non-reference character from the image to be processed by the deleting module is as follows: dividing characters in the image to be processed into the reference characters and the non-reference characters by using a discrimination model obtained by pre-training; and deleting the non-reference character from the image to be processed.
Further, the image to be processed is input into the pre-trained discrimination model to obtain an output result, where the output result includes position information and a probability of a candidate region, the candidate region being a region occupied by the non-reference characters and the probability being the probability that the characters included in the candidate region are the non-reference characters; the candidate regions whose probability is greater than a preset threshold are then deleted from the image to be processed according to their position information.
The information extraction apparatus shown in fig. 9 corrects the template information using the reference characters, so that more accurate information can be extracted from the image to be processed using the template information, improving the accuracy of subsequent character recognition.
The embodiment of the application also discloses an information extraction device, which comprises: a memory and a processor. The memory is used for storing one or more programs, and the processor is used for executing the one or more programs so as to enable the information extraction device to realize the information extraction method.
The embodiment of the application also discloses a computer readable storage medium, wherein the computer readable storage medium is stored with instructions, and when the computer readable storage medium runs on a computer, the computer is enabled to execute the information extraction method.
The functions described in the method of the embodiment of the present application, if implemented in the form of software functional units and sold or used as independent products, may be stored in a storage medium readable by a computing device. Based on such understanding, part of the contribution to the prior art of the embodiments of the present application or part of the technical solution may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An information extraction method, comprising:
acquiring template information generated by a template image, wherein the template information comprises reference characters and position information of reference areas, and any one reference area is an area occupied by at least one reference character in the template image;
deleting non-reference characters from the image to be processed to obtain a reference image;
for any one of the reference areas, searching a target area in the reference image, wherein the target area is an area with the maximum similarity with any one of the reference areas;
and correcting the position information of the reference area according to the position information of the target area, wherein the corrected position information of the reference area is used for extracting information from the image to be processed.
2. The method of claim 1, further comprising, prior to said correcting the position information of the reference region:
acquiring first position information, wherein the first position information is position information of a first reference area in the template image, and the first reference area is any one reference area;
intercepting the area indicated by the first position information from the image to be processed to obtain a contrast area;
and determining that the similarity between the first reference area and the contrast area is smaller than a preset threshold value.
3. The method according to claim 1 or 2, characterized in that the number of target areas is the same as the number of reference areas;
the correcting the position information of the reference region according to the position information of the target region includes:
and replacing the position information of the reference area with the position information of the corresponding target area.
4. The method according to claim 1 or 2, characterized in that the number of target regions is smaller than the number of reference regions;
the correcting the position information of the reference region according to the position information of the target region includes:
replacing the position information of the reference area with the position information of the corresponding target area;
calculating the deviation amount of the image to be processed and the template image according to the position information of the target area and the position information of the corresponding reference area;
and updating the position information of other reference areas according to the deviation amount, wherein the other reference areas are reference areas without corresponding target areas.
5. The method of claim 1, wherein the finding the target region in the reference image comprises:
searching a covering area of a second reference area on the reference image according to a preset step length, wherein the covering area is an area overlapped with the second reference area; the second reference region is any one of the reference regions;
calculating the similarity between the second reference area and the coverage area;
and taking the coverage area with the similarity meeting the preset condition as a target area of the second reference area.
6. The method of claim 1, wherein the deleting non-reference characters from the image to be processed comprises:
dividing characters in the image to be processed into the reference characters and the non-reference characters by using a discrimination model obtained by pre-training;
and deleting the non-reference character from the image to be processed.
7. The method of claim 6, wherein the using of the pre-trained discrimination model to divide the characters in the image to be processed into the reference characters and the non-reference characters, and the deleting of the non-reference characters from the image to be processed, comprise:
inputting the image to be processed into the pre-trained discrimination model to obtain an output result, wherein the output result comprises position information and a probability of a candidate region, the candidate region is a region occupied by the non-reference characters, and the probability is the probability that the characters included in the candidate region are the non-reference characters;
and deleting the candidate region with the probability greater than a preset threshold value from the image to be processed according to the position information of the candidate region.
8. An information extraction apparatus characterized by comprising:
an acquisition module, configured to acquire template information generated from a template image, wherein the template information comprises reference characters and position information of reference areas, and any one of the reference areas is an area occupied by at least one of the reference characters in the template image;
the deleting module is used for deleting the non-reference characters from the image to be processed to obtain a reference image;
the searching module is used for searching a target area in the reference image for any one reference area, wherein the target area is an area with the maximum similarity with any one reference area;
and the correcting module is used for correcting the position information of the reference region according to the position information of the target region, and the corrected position information of the reference region is used for extracting information from the image to be processed.
9. An information extraction device characterized by comprising:
a memory and a processor;
the memory is used for storing one or more programs;
the processor is configured to execute the one or more programs to cause the information extraction device to implement the information extraction method of any one of claims 1 to 7.
10. A computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to execute the information extraction method of any one of claims 1-7.
CN201811487626.4A 2018-12-06 2018-12-06 Information extraction method and device Active CN109635798B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811487626.4A CN109635798B (en) 2018-12-06 2018-12-06 Information extraction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811487626.4A CN109635798B (en) 2018-12-06 2018-12-06 Information extraction method and device

Publications (2)

Publication Number Publication Date
CN109635798A CN109635798A (en) 2019-04-16
CN109635798B (en) 2022-02-25

Family

ID=66071630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811487626.4A Active CN109635798B (en) 2018-12-06 2018-12-06 Information extraction method and device

Country Status (1)

Country Link
CN (1) CN109635798B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626280B (en) * 2020-04-13 2021-09-07 北京邮电大学 Method and device for identifying answer sheet without positioning point
CN113689939B (en) * 2021-10-26 2022-04-08 萱闱(北京)生物科技有限公司 Image storage method, system and computing device for image feature matching

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8724924B2 (en) * 2008-01-18 2014-05-13 Mitek Systems, Inc. Systems and methods for processing mobile images to identify and extract content from forms
CN103383773B (en) * 2013-03-26 2016-09-28 中国科学院遥感与数字地球研究所 The remote sensing satellite image of a kind of dynamic extraction Image Control Point is the most just penetrating framework and the method for correction
CN103310211B (en) * 2013-04-26 2016-04-06 四川大学 A kind ofly fill in mark recognition method based on image procossing
CN103954334B (en) * 2014-04-28 2017-02-01 华中师范大学 Fully automatic image pickup type water meter verification system and operating method thereof
CN108460728B (en) * 2017-02-17 2022-05-31 北京大豪科技股份有限公司 Automatic template correction method and device

Also Published As

Publication number Publication date
CN109635798A (en) 2019-04-16

Similar Documents

Publication Publication Date Title
CN110046529B (en) Two-dimensional code identification method, device and equipment
US11164027B2 (en) Deep learning based license plate identification method, device, equipment, and storage medium
CN107220640B (en) Character recognition method, character recognition device, computer equipment and computer-readable storage medium
US11836969B2 (en) Preprocessing images for OCR using character pixel height estimation and cycle generative adversarial networks for better character recognition
CN108108734B (en) License plate recognition method and device
CN107977658B (en) Image character area identification method, television and readable storage medium
CN112183038A (en) Form identification and typing method, computer equipment and computer readable storage medium
CN112966537B (en) Form identification method and system based on two-dimensional code positioning
CN111737478B (en) Text detection method, electronic device and computer readable medium
CN101896920A (en) Image processing method and device based on motion scan
CN110647882A (en) Image correction method, device, equipment and storage medium
CN110751500B (en) Processing method and device for sharing pictures, computer equipment and storage medium
CN111985465A (en) Text recognition method, device, equipment and storage medium
CN110443235B (en) Intelligent paper test paper total score identification method and system
CN110598566A (en) Image processing method, device, terminal and computer readable storage medium
CN109635798B (en) Information extraction method and device
CN110969052A (en) Operation correction method and equipment
CN109741273A (en) A kind of mobile phone photograph low-quality images automatically process and methods of marking
US8787702B1 (en) Methods and apparatus for determining and/or modifying image orientation
CN113139535A (en) OCR document recognition method
CN114429649B (en) Target image identification method and device
CN109508716B (en) Image character positioning method and device
CN114694161A (en) Text recognition method and equipment for specific format certificate and storage medium
CN112348019B (en) Answer sheet correction method and device, electronic equipment and storage medium
De Nardin et al. Few-shot pixel-precise document layout segmentation via dynamic instance generation and local thresholding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant