CN110008944B - OCR recognition method and device based on template matching and storage medium - Google Patents
OCR recognition method and device based on template matching and storage medium
- Publication number
- CN110008944B (application CN201910127136.1A)
- Authority
- CN
- China
- Prior art keywords
- document
- document picture
- identified
- identification
- picture
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application discloses an OCR (optical character recognition) method and device based on template matching, a storage medium, and computer equipment, and relates to the technical field of information processing. The method comprises the following steps: collecting sample document pictures in different specified typesetting modes; performing frame selection on each sample document picture to obtain a recognition template corresponding to that sample document picture; establishing a recognition template database that stores the recognition template corresponding to each sample document picture; collecting a document picture to be recognized, and recognizing its border and title to obtain its document type; and calling the corresponding recognition template from the recognition template database according to that document type, and performing OCR recognition on the document picture to be recognized. Because a recognition template database is established, the method can adapt to documents with various typesetting formats, and the accuracy of OCR recognition is improved.
Description
Technical Field
The present invention relates to the field of information processing technologies, and in particular, to an OCR recognition method and apparatus based on template matching, a storage medium, and a computer device.
Background
Optical character recognition (OCR) refers to obtaining an electronic image of a paper document with an electronic device (e.g., a scanner or a digital camera), cutting the character strings in the image apart into small pictures that each contain a single character, and then recognizing the cut characters by some method.
Because the character layouts of pictures to be recognized vary widely, conventional OCR recognition methods can accurately recognize only pictures with a fixed character layout, such as identity cards and bank cards, and perform poorly on pictures of other documents.
Disclosure of Invention
In view of this, the present application provides an OCR recognition method and apparatus, a storage medium, and a computer device based on template matching, which mainly aims to solve the problem of poor recognition effect of the existing OCR recognition method.
According to one aspect of the present application, there is provided an OCR recognition method based on template matching, the method comprising:
collecting sample document pictures of different appointed typesetting modes;
performing frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture;
establishing an identification template database, wherein the identification template database stores identification templates corresponding to the document pictures of the samples;
collecting a document picture to be identified, and identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified;
and calling a corresponding recognition template in the recognition template database according to the document type of the document picture to be recognized, and performing OCR recognition on the document picture to be recognized.
Optionally, the identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified includes:
performing binarization processing on the document picture to be identified to obtain a binarized table image;
performing tilt correction on the binarized table image using a tilt correction algorithm based on perspective transformation;
extracting the border of the document picture to be identified from the tilt-corrected binarized table image by an image morphology processing method;
performing OCR on a preset area of the tilt-corrected binarized table image to obtain the title of the document picture to be identified;
and obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified.
Optionally, the performing OCR recognition on the document picture to be recognized includes:
and adopting a convolutional recurrent neural network (CRNN) model to perform OCR recognition on the document picture to be recognized.
Optionally, the convolutional recurrent neural network model includes a convolutional neural network (CNN), a bidirectional long short-term memory (LSTM) recurrent neural network, and a connectionist temporal classification (CTC) model;
the adopting the convolutional recurrent neural network model to perform OCR recognition on the document picture to be recognized comprises the following steps:
the CNN extracts features of the identification area of the document picture to be identified and generates a feature sequence of the identification area;
the bidirectional LSTM determines a label distribution list corresponding to each feature in the feature sequence;
and the CTC model determines the characters of the identification area according to the label distribution list corresponding to each feature.
Optionally, the collecting the document picture to be identified includes:
and acquiring a document picture to be identified by a high-definition camera with an automatic shooting angle adjusting function.
Optionally, before invoking the corresponding recognition template in the recognition template database to perform OCR recognition on the document picture to be recognized, the method further includes:
adjusting the brightness and contrast of the document picture to be identified;
carrying out gray processing on the document picture to be identified;
and receiving an angle adjustment instruction of a user on the document picture to be identified after gray processing, and adjusting the angle of the document picture to be identified.
Optionally, a first sample document picture is configured with a plurality of region identification templates in the identification template database, each region identification template being used for identifying a partial region of the first sample document picture.
Optionally, the performing frame selection on each sample document picture includes:
carrying out overall automatic frame selection on each sample document picture to obtain the identification areas of each sample document picture;
and adjusting the automatically selected identification areas: merging a plurality of identification areas that the automatic frame selection wrongly produced for a single area into one identification area, and splitting one identification area that the automatic frame selection wrongly produced for several areas into a plurality of identification areas.
According to another aspect of the present application, there is provided an OCR recognition device based on template matching, the device comprising:
the sample document picture collecting unit is used for collecting sample document pictures of different appointed typesetting modes;
the identification template acquisition unit is used for carrying out frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture;
the identification template database establishing unit is used for establishing an identification template database, and the identification template database stores identification templates corresponding to the document pictures of the samples;
the document type acquisition unit is used for acquiring a document picture to be identified, identifying the border and the title of the document picture to be identified and obtaining the document type of the document picture to be identified;
and the OCR recognition unit is used for calling a corresponding recognition template in the recognition template database according to the document type of the document picture to be recognized, so as to perform OCR recognition on the document picture to be recognized.
Optionally, the document type obtaining unit is further configured to:
performing binarization processing on the document picture to be identified to obtain a binarized table image;
performing tilt correction on the binarized table image using a tilt correction algorithm based on perspective transformation;
extracting the border of the document picture to be identified from the tilt-corrected binarized table image by an image morphology processing method;
performing OCR on a preset area of the tilt-corrected binarized table image to obtain the title of the document picture to be identified;
and obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified.
Optionally, the OCR recognition unit is further configured to:
and adopting a convolutional recurrent neural network (CRNN) model to perform OCR recognition on the document picture to be recognized.
Specifically, the convolutional recurrent neural network model comprises a convolutional neural network (CNN), a bidirectional long short-term memory (LSTM) recurrent neural network, and a connectionist temporal classification (CTC) model;
the adopting the convolutional recurrent neural network model to perform OCR recognition on the document picture to be recognized comprises the following steps:
the CNN extracts features of the identification area of the document picture to be identified and generates a feature sequence of the identification area;
the bidirectional LSTM determines a label distribution list corresponding to each feature in the feature sequence;
and the CTC model determines the characters of the identification area according to the label distribution list corresponding to each feature.
Optionally, the apparatus further comprises:
the picture adjusting unit is used for adjusting the brightness and the contrast of the document picture to be identified;
the gray processing unit is used for carrying out gray processing on the document picture to be identified;
and the angle adjusting unit is used for receiving an angle adjusting instruction of a user on the document picture to be identified after the gray level processing and adjusting the angle of the document picture to be identified.
Optionally, a first sample document picture is configured with a plurality of region identification templates in the identification template database, each region identification template being used for identifying a partial region of the first sample document picture.
Specifically, the identification template acquisition unit is further configured to:
carry out overall automatic frame selection on each sample document picture to obtain the identification areas of each sample document picture;
and adjust the automatically selected identification areas: merge a plurality of identification areas that the automatic frame selection wrongly produced for a single area into one identification area, and split one identification area that the automatic frame selection wrongly produced for several areas into a plurality of identification areas.
According to yet another aspect of the present application, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described template matching-based OCR recognition method.
According to yet another aspect of the present application, there is provided a computer device comprising a storage medium, a processor and a computer program stored on the storage medium and executable on the processor, the processor implementing the above-described template matching-based OCR recognition method when executing the program.
By means of the technical scheme, the OCR recognition method and device based on template matching, the storage medium and the computer equipment establish a recognition template database, can adapt to recognition of documents with different typesetting formats, and improve accuracy of OCR recognition.
In addition, a high-definition camera is used to collect the document picture to be identified, which eliminates the influence of shooting light and angle on OCR recognition. A specific region of a document picture can also be recognized based on a region identification template, which improves recognition efficiency.
The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the present application are more readily apparent, specific embodiments of the present application are set forth below.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 shows a schematic flow chart of an OCR recognition method based on template matching according to an embodiment of the present application;
FIG. 2 shows a schematic diagram of a sample document provided by an embodiment of the present application;
FIG. 3 shows a schematic structural diagram of an OCR recognition device based on template matching according to an embodiment of the present application.
Detailed Description
The present application will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other.
To address the poor recognition effect of existing OCR recognition methods, this embodiment provides an OCR recognition method based on template matching, which can adapt to documents in a plurality of different typesetting formats and improve the accuracy of OCR recognition. As shown in FIG. 1, the method comprises the following steps:
S11: collecting sample document pictures of different specified typesetting modes;
In practical application, the sample document pictures in the different specified typesetting modes are collected by a high-definition camera with an automatic shooting angle adjustment function.
S12: performing frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture;
It should be noted that, in the embodiment of the present application, a sample document picture is collected, the identification area ranges of the sample document picture are determined, and an identification template corresponding to the document picture to be identified is established; the identification template includes the coordinate position and area name of each identification area.
It can be understood that, in OCR recognition, the quality of image segmentation directly affects the recognition rate. When OCR recognition is performed on an incorrectly segmented image, a correct recognition result is often not obtained. Therefore, the present application establishes a template for the document picture to be identified and records the coordinate position and area name of each identification area in the template. The printed areas of the document picture to be identified are not recognized; instead, the area names in the template are used directly as the characters of the printed areas, which improves recognition efficiency.
S13: establishing an identification template database, wherein the identification template database stores identification templates corresponding to the document pictures of the samples;
It can be understood that the embodiment of the present application establishes a recognition template database containing the recognition templates corresponding to the various sample document pictures, so that document pictures to be recognized in different typesetting modes can be recognized more accurately.
S14: collecting a document picture to be identified, and identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified;
S15: calling the corresponding recognition template in the recognition template database according to the document type of the document picture to be recognized, and performing OCR recognition on the document picture to be recognized.
It can be understood that, in the embodiment of the present application, the identification areas and area names of the document picture to be identified are determined from the identification template that matches it, so the identification areas can be located accurately; OCR recognition is performed on the identification areas (in practice, the handwritten areas), and the area names in the identification template are used as the characters of the printed areas.
The OCR recognition method based on template matching provided by this embodiment establishes a recognition template database, can adapt to documents in various typesetting formats, and improves the accuracy of OCR recognition.
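Steps S12 to S15 can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the class and function names (`Region`, `RecognitionTemplate`, `TemplateDatabase`, `recognize_document`), the in-memory dictionary standing in for the database, and the pluggable `ocr` callable are all hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:
    """One identification area: its area name and coordinate position."""
    name: str    # area name, also usable as the text of the printed label
    box: tuple   # (left, top, right, bottom) in pixels, right/bottom exclusive


@dataclass
class RecognitionTemplate:
    """Template for one document type (hypothetical structure)."""
    doc_type: str
    regions: list = field(default_factory=list)


class TemplateDatabase:
    """In-memory stand-in for the recognition template database."""

    def __init__(self):
        self._templates = {}

    def add(self, template):
        self._templates[template.doc_type] = template

    def lookup(self, doc_type):
        return self._templates[doc_type]


def recognize_document(image, doc_type, db, ocr):
    """Apply the template matching `doc_type` and OCR each of its regions.

    `image` is a list of pixel rows; `ocr` is any callable that maps a
    cropped region (a list of rows) to text.
    """
    template = db.lookup(doc_type)
    result = {}
    for region in template.regions:
        left, top, right, bottom = region.box
        crop = [row[left:right] for row in image[top:bottom]]
        result[region.name] = ocr(crop)
    return result
```

In this sketch only the template regions are passed to the recognizer, and each area name is taken from the template rather than recognized from the picture.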
In an optional implementation of the embodiment of the present application, building on the method of FIG. 1, identifying the border and the title of the document picture to be identified in step S14 to obtain its document type includes:
performing binarization processing on the document picture to be identified to obtain a binarized table image;
performing tilt correction on the binarized table image using a tilt correction algorithm based on perspective transformation;
extracting the border of the document picture to be identified from the tilt-corrected binarized table image by an image morphology processing method;
performing OCR on a preset area of the tilt-corrected binarized table image to obtain the title of the document picture to be identified;
and obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified.
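The binarization step can be sketched as follows. The application does not say which thresholding method is used; the sketch assumes Otsu's method, and the function names `otsu_threshold` and `binarize` are hypothetical. The perspective-based tilt correction is not shown.

```python
def otsu_threshold(gray):
    """Return the gray level that maximises between-class variance.

    `gray` is a list of rows of integers in 0..255. Otsu's method is an
    assumption here; the source does not name a thresholding method.
    """
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    weighted_total = sum(level * count for level, count in enumerate(hist))

    best_threshold, best_variance = 0, -1.0
    dark_count = 0
    dark_sum = 0
    for level in range(256):
        dark_count += hist[level]
        dark_sum += level * hist[level]
        light_count = total - dark_count
        if dark_count == 0 or light_count == 0:
            continue  # one class is empty, variance is undefined
        dark_mean = dark_sum / dark_count
        light_mean = (weighted_total - dark_sum) / light_count
        variance = dark_count * light_count * (dark_mean - light_mean) ** 2
        if variance > best_variance:
            best_threshold, best_variance = level, variance
    return best_threshold


def binarize(gray):
    """Map dark pixels (ink, table lines) to 1 and light pixels to 0."""
    threshold = otsu_threshold(gray)
    return [[1 if value <= threshold else 0 for value in row] for row in gray]
```

Marking dark pixels as 1 is a convention chosen so that table lines and text become the foreground for later morphological processing.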
It should be noted that the border of the document picture to be identified refers to the horizontal and vertical border lines of the table in the document to be identified. The specific process for extracting the border of the document picture to be identified is as follows:
performing an opening operation on the tilt-corrected binarized table image with a horizontal structuring element and with a vertical structuring element, respectively, to obtain a table horizontal-line image and a table vertical-line image;
performing an AND operation on the table horizontal-line image and the table vertical-line image to obtain a table frame image;
and thinning the table frame image to extract the skeleton of the table frame lines, that is, the border of the document picture to be identified.
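The opening operations above can be sketched in plain Python on a small binary image. The function names and the `min_length` parameter are hypothetical. The text combines the two line images with an AND operation; because a pixel-wise AND of a horizontal-line image and a vertical-line image keeps only the pixels where lines cross, the sketch returns both the pixel-wise AND and the pixel-wise OR so that the two results can be compared. The thinning step is omitted.

```python
def open_rect(binary, height, width):
    """Morphological opening with a height x width rectangle.

    The result is the union of every placement of the rectangle that
    fits entirely inside the foreground (pixels equal to 1).
    """
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows - height + 1):
        for x in range(cols - width + 1):
            fits = all(binary[y + dy][x + dx]
                       for dy in range(height) for dx in range(width))
            if fits:
                for dy in range(height):
                    for dx in range(width):
                        out[y + dy][x + dx] = 1
    return out


def extract_table_lines(binary, min_length):
    """Return (horizontal, vertical, union, crossings) line images."""
    horizontal = open_rect(binary, 1, min_length)   # horizontal element
    vertical = open_rect(binary, min_length, 1)     # vertical element
    union = [[h | v for h, v in zip(h_row, v_row)]
             for h_row, v_row in zip(horizontal, vertical)]
    crossings = [[h & v for h, v in zip(h_row, v_row)]
                 for h_row, v_row in zip(horizontal, vertical)]
    return horizontal, vertical, union, crossings
```

Strokes shorter than `min_length`, such as characters or isolated noise, do not survive the opening, which is why only the table lines remain.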
It can be understood that the area in the middle of the upper portion of the document picture to be recognized contains its title. The major class of the document picture, such as the financial class or the object class, is determined from the title; the subclass, such as a money application form or a reimbursement approval form, can be further determined from the border. Both of these subclasses belong to the financial class, but each has a different border and therefore corresponds to a different document type.
In another optional implementation of the embodiment of the present application, the performing OCR recognition on the document picture to be recognized in step S15 includes:
adopting a convolutional recurrent neural network (CRNN) model to perform OCR recognition on the document picture to be recognized.
Specifically, the convolutional recurrent neural network model comprises a convolutional neural network (CNN), a bidirectional long short-term memory (LSTM) recurrent neural network, and a connectionist temporal classification (CTC) model;
the adopting the convolutional recurrent neural network model to perform OCR recognition on the document picture to be recognized comprises the following steps:
the CNN extracts features of the identification area of the document picture to be identified and generates a feature sequence of the identification area;
the bidirectional LSTM determines a label distribution list corresponding to each feature in the feature sequence;
and the CTC model determines the characters of the identification area according to the label distribution list corresponding to each feature.
Specifically, the label distribution list corresponding to a feature is a softmax vector that gives the probability of each label for that feature. After the probabilities of all features are passed to the CTC model, the most probable labels are output, and the final label sequence, that is, the text of the identification area, is obtained through operations such as removing blanks.
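The decoding described here can be sketched with greedy (best-path) CTC decoding: take the most probable label at each step, merge consecutive repeats, and drop the blank label. Greedy decoding is an assumption, since the application does not say whether a best-path or a beam-search decoder is used, and the function name and the choice of index 0 for the blank are hypothetical.

```python
def ctc_greedy_decode(distributions, labels, blank=0):
    """Decode per-feature softmax vectors into text.

    `distributions` is a list of probability vectors, one per feature in
    the sequence; `labels[i]` is the character for index i, and index
    `blank` is the CTC blank.
    """
    # Most probable label index at each step.
    best_path = [max(range(len(p)), key=p.__getitem__) for p in distributions]

    decoded = []
    previous = None
    for index in best_path:
        # Keep a label only if it starts a new run and is not the blank.
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)
```

The blank between two runs of the same label is what lets a doubled character survive the merging of repeats.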
It can be understood that handwritten characters are less regular than printed characters, so OCR recognition of handwritten characters is comparatively poor.
Preferably, the collecting the document picture to be identified includes:
and acquiring a document picture to be identified by a high-definition camera with an automatic shooting angle adjusting function.
It should be noted that the high-definition camera can adjust the ISO value or the exposure according to the intensity of ambient light to improve the quality of the document picture. In a digital camera, ISO indicates the light sensitivity of the CCD or CMOS sensor; a higher ISO value indicates a higher sensitivity.
In general, the lower the ISO value, the higher the quality of the photo and the finer its detail. The higher the ISO value, the brighter the photo; however, photo quality degrades and noise becomes more severe as the ISO value increases. A high ISO value can nevertheless compensate for insufficient light.
In addition, the high-definition camera can adjust the shooting angle according to the position of the document, so that skewed text in the document does not degrade the OCR recognition result.
OCR recognition requires a high-quality input image, and the user often needs to provide a high-quality image to obtain good recognition quality: the resolution cannot be too low, the colors cannot be too rich, the contrast cannot be too low, and the text on the image cannot be skewed.
Preferably, in the embodiment of the present application, before the corresponding recognition template in the recognition template database is called to perform OCR recognition on the document picture to be recognized, the method further includes preprocessing the document picture to be recognized:
adjusting the brightness and contrast of the document picture to be identified;
carrying out gray processing on the document picture to be identified;
and receiving an angle adjustment instruction of a user on the document picture to be identified after gray processing, and adjusting the angle of the document picture to be identified.
In an alternative embodiment, the present application uses the Tesseract-OCR open-source framework to automatically adjust the brightness and contrast of the image; uses OpenCV, an open-source cross-platform computer vision library, to convert the image to grayscale, so that the image becomes black and white with clear contrast; and displays an image interaction interface so that the user can correct the image manually: the user can adjust the angle of the image through functions built into the application, so that the characters on the image are no longer skewed.
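The brightness, contrast, and gray-processing steps can be illustrated without any library. The sketch below is not the Tesseract-OCR or OpenCV code path named in the text; it assumes a simple linear contrast stretch around mid-gray and the common ITU-R BT.601 luma weights, and the names `adjust_channel` and `preprocess` are hypothetical. The manual angle adjustment is not shown.

```python
def adjust_channel(value, contrast, brightness):
    """Linear adjustment around mid-gray, clamped to 0..255 (assumed form)."""
    adjusted = contrast * (value - 128) + 128 + brightness
    return max(0, min(255, int(round(adjusted))))


def preprocess(rgb_image, contrast=1.0, brightness=0):
    """Adjust brightness and contrast, then convert to grayscale.

    `rgb_image` is a list of rows of (r, g, b) tuples with values 0..255.
    The luma weights 0.299 / 0.587 / 0.114 are the BT.601 coefficients.
    """
    gray_image = []
    for row in rgb_image:
        gray_row = []
        for r, g, b in row:
            r, g, b = (adjust_channel(c, contrast, brightness) for c in (r, g, b))
            gray_row.append(int(round(0.299 * r + 0.587 * g + 0.114 * b)))
        gray_image.append(gray_row)
    return gray_image
```

The order follows the text: brightness and contrast are adjusted first, and the gray processing is applied to the adjusted picture.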
In an alternative embodiment of the present application, a first sample document picture is configured with a plurality of region identification templates in the identification template database, and each region identification template is used for identifying a partial region of the first sample document picture.
It should be noted that the first sample document picture is any sample document picture in the recognition template database. By configuring a plurality of region identification templates for a sample document picture, recognition of partial regions of the document picture to be identified can be realized, and the efficiency of OCR recognition is improved.
In practical application, taking FIG. 2 as an example, the sample document is a bill document configured with a region identification template A and a region identification template B. The identification area range of region identification template A includes the money amount, the money department, and the contract number; the identification area range of region identification template B includes the money amount, the money department, the applicant, and the department responsible person. Different region identification templates can be set according to actual service requirements.
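One way to picture region identification templates A and B is as named subsets of the areas of one sample document. The sketch below is illustrative only: the dictionary layout, the area keys, and the pixel boxes are hypothetical and are not taken from FIG. 2.

```python
# Hypothetical coordinates (left, top, right, bottom) for the areas of
# one sample bill document; the real positions would come from frame
# selection on the sample document picture.
ALL_REGIONS = {
    "money_amount": (120, 80, 300, 110),
    "money_department": (120, 120, 300, 150),
    "contract_number": (120, 160, 300, 190),
    "applicant": (400, 80, 560, 110),
    "department_responsible_person": (400, 120, 560, 150),
}

# Each region identification template lists the areas it covers.
REGION_TEMPLATES = {
    "A": ["money_amount", "money_department", "contract_number"],
    "B": ["money_amount", "money_department", "applicant",
          "department_responsible_person"],
}


def select_regions(template_name, all_regions=ALL_REGIONS):
    """Return only the areas that the chosen region template covers."""
    return {name: all_regions[name] for name in REGION_TEMPLATES[template_name]}
```

Choosing template A instead of the full set means only three areas are cropped and recognized, which is where the efficiency gain described above would come from.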
In another embodiment of the present solution, in order to improve the efficiency of creating the template of the document picture to be identified, the frame selection performed on each sample document picture to obtain the corresponding identification template may further include:
carrying out overall automatic frame selection on each sample document picture to obtain the identification areas of each sample document picture;
and adjusting the automatically selected identification areas: merging a plurality of identification areas that the automatic frame selection wrongly produced for a single area into one identification area, and splitting one identification area that the automatic frame selection wrongly produced for several areas into a plurality of identification areas.
It can be understood that, in the embodiment of the present application, overall automatic frame selection is first performed on the document picture to be identified, and the result is then adjusted, so as to establish the identification template corresponding to the document picture to be identified; the identification template includes the coordinate position and area name of each identification area. During automatic area marking, a frame may be drawn anywhere inside an area, and the frame lines automatically expand to the boundary of the area; a frame may also be drawn in the blank space outside the four boundaries of the area, and the frame lines automatically contract to the boundary of the area.
Specifically: one-standard-number adjustment is carried out on the whole automatic frame selection result, and a plurality of areas marked by errors are combined into one area; and (3) performing multi-label one adjustment on the whole automatic frame selection result, and splitting the error mark into a plurality of areas.
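The two adjustments described above can be sketched as simple box operations. This is an illustration only: the `(left, top, right, bottom)` box representation and the function names `merge_boxes` and `split_box` are assumptions, and the application does not specify how the cut positions for a split are chosen.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), assumed representation


def merge_boxes(boxes: List[Box]) -> Box:
    """Merge several erroneously separated areas into their smallest enclosing box."""
    if not boxes:
        raise ValueError("at least one box is required")
    lefts, tops, rights, bottoms = zip(*boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


def split_box(box: Box, x_cuts: List[int]) -> List[Box]:
    """Split one erroneously combined area at the given x positions.

    Cut positions outside the box are ignored.
    """
    left, top, right, bottom = box
    edges = [left] + sorted(x for x in x_cuts if left < x < right) + [right]
    return [(a, top, b, bottom) for a, b in zip(edges, edges[1:])]
```

For example, `merge_boxes([(10, 10, 50, 30), (55, 12, 120, 30)])` yields one area spanning both boxes, and `split_box` with a single cut turns one area into two adjacent ones.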
The OCR recognition method establishes a template matched with the document to be recognized, can adapt to documents with different typesetting formats, and improves the recognition rate and accuracy of OCR recognition. Using a high-definition camera also eliminates the influence of shooting light and angle on OCR recognition. In addition, the method can recognize only a specific region of a document picture based on a region recognition template, which improves recognition efficiency.
Fig. 3 is a schematic structural diagram of an OCR recognition device based on template matching according to an embodiment of the present application. As shown in fig. 3, the apparatus of the embodiment of the present application includes:
a sample document picture collecting unit 31, configured to collect sample document pictures in different specified typesetting formats;
In practical application, the sample document picture collecting unit 31 collects sample document pictures in different specified typesetting formats through a high-definition camera with an automatic shooting-angle adjustment function.
An identification template obtaining unit 32, configured to perform frame selection on each sample document picture, and obtain an identification template corresponding to each sample document picture;
It should be noted that, in the embodiment of the present application, sample document pictures are collected, the identification area range of each sample document picture is determined, and an identification template corresponding to the document picture to be identified is established; the identification template includes the coordinate positions and area names of the identification areas.
It can be understood that in OCR recognition, the quality of image segmentation directly affects the recognition rate. When OCR recognition is performed on an incorrectly segmented image, a correct recognition result often cannot be obtained. Therefore, the present application establishes a template of the document picture to be identified and records the coordinate positions and area names of the identification areas in the template. The printed area of the document picture to be identified is not recognized; instead, the area name in the template is used directly as the text of the printed area, which improves recognition efficiency.
An identification template database creation unit 33, configured to create an identification template database, where identification templates corresponding to the respective sample document pictures are stored in the identification template database;
It can be understood that the embodiment of the application establishes a recognition template database containing the recognition templates corresponding to the various sample document pictures, so that document pictures to be recognized in different typesetting formats can be recognized more accurately.
The document type obtaining unit 34 is configured to collect a document picture to be identified, and identify a border and a title of the document picture to be identified, so as to obtain a document type of the document picture to be identified.
An OCR recognition unit 35, configured to collect a document picture to be recognized and call the corresponding recognition template in the recognition template database to perform OCR recognition on the document picture to be recognized.
It can be understood that, according to the embodiment of the application, the identification areas and area names of the document picture to be identified are determined from the identification template matched with it, so the identification areas can be located accurately; OCR recognition is performed on the identification areas (in practice, the handwritten areas), and the area names in the identification template are used as the text of the printed areas.
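A minimal sketch of template-driven recognition, under stated assumptions, is given below. The `Area` class, the function `recognize_with_template`, and the stand-in recognizer `fake_ocr` are all hypothetical; in a real system the `ocr` callable would wrap an actual recognition model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), assumed representation


@dataclass(frozen=True)
class Area:
    name: str  # area name from the template, reused as the printed label text
    box: Box   # coordinate position of the handwritten identification area


def recognize_with_template(
    image: object,
    areas: List[Area],
    ocr: Callable[[object, Box], str],
) -> Dict[str, str]:
    """Run OCR only inside each template area; printed labels are never recognized.

    The returned keys come straight from the template's area names.
    """
    return {area.name: ocr(image, area.box) for area in areas}


def fake_ocr(image: object, box: Box) -> str:
    """Stand-in recognizer for illustration; it only reports where it was called."""
    return f"text@{box[0]},{box[1]}"


template = [Area("applicant", (420, 80, 160, 30)), Area("contract number", (120, 160, 200, 30))]
result = recognize_with_template(None, template, fake_ocr)
```

The recognizer is invoked once per template area, so the printed label text costs no recognition work at all.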
The OCR recognition device based on the template matching establishes a recognition template database, can adapt to recognition of documents in various typesetting formats, and improves the accuracy of OCR recognition.
Optionally, the document type obtaining unit 34 is further configured to:
performing binarization processing on the document picture to be identified to obtain a binarization table image;
performing tilt correction on the binarized table image using a tilt correction algorithm based on perspective transformation;
extracting the border of the document picture to be identified from the tilt-corrected binarized table image by an image morphology processing method;
performing OCR on a preset area of the tilt-corrected binarized table image to obtain the title of the document picture to be identified;
and obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified.
It should be noted that the border of the document picture to be identified refers to the horizontal and vertical border lines of the table in the document to be identified. The specific process for extracting the border of the document picture to be identified is as follows:
performing a morphological opening operation on the tilt-corrected binarized table image with a horizontal structuring element and with a vertical structuring element, respectively, to obtain a table horizontal-line image and a table vertical-line image;
performing an AND operation on the table horizontal-line image and the table vertical-line image to obtain a table frame image;
and thinning the table frame image to extract the skeleton of the table frame lines, i.e., the border of the document picture to be identified.
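The opening step can be illustrated with a small pure-Python sketch on a binary grid (1 = ink, 0 = background). It relies on the fact that opening with a 1×k horizontal line element keeps exactly the horizontal runs of at least k pixels. The function names and the tiny test image are assumptions, the thinning step is not covered, and a production system would more likely use an image library. In this sketch a pixel-wise AND keeps only the crossing points of the lines, whereas a pixel-wise OR keeps the whole grid; both are computed so they can be compared.

```python
import operator
from typing import Callable, List

Image = List[List[int]]  # binary image: 1 = foreground, 0 = background


def open_rows(img: Image, k: int) -> Image:
    """Opening with a 1 x k horizontal element: keep horizontal runs of length >= k."""
    out = [[0] * len(row) for row in img]
    for y, row in enumerate(img):
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                if x - start >= k:
                    for i in range(start, x):
                        out[y][i] = 1
            else:
                x += 1
    return out


def transpose(img: Image) -> Image:
    return [list(col) for col in zip(*img)]


def open_cols(img: Image, k: int) -> Image:
    """Opening with a k x 1 vertical element, done by transposing."""
    return transpose(open_rows(transpose(img), k))


def combine(a: Image, b: Image, op: Callable[[int, int], int]) -> Image:
    """Apply a pixel-wise binary operation to two images of equal size."""
    return [[op(p, q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


# A 2-cell table with two stray "text" pixels at (row 2, col 2) and (row 2, col 5).
table = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
horizontal = open_rows(table, 4)
vertical = open_cols(table, 4)
joints = combine(horizontal, vertical, operator.and_)  # line crossings only
grid = combine(horizontal, vertical, operator.or_)     # full set of table lines
```

The stray text pixels disappear from both line images because they never form a run of four pixels in either direction.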
It can be understood that the area in the upper middle of the document picture to be recognized contains its title. The major class of the document picture to be identified, such as the financial class or the goods class, is determined from the title; the subclass, such as a payment application form or a reimbursement approval form, can then be determined from the border. Both of these forms belong to the financial class but have different borders, and therefore correspond to different document types.
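The two-stage lookup described above might be sketched as follows. Everything in this fragment is an assumption made for illustration: the keyword table, the use of a (horizontal line count, vertical line count) pair as a border signature, and the function name `document_type` do not come from the application.

```python
from typing import Dict, Optional, Tuple

BorderSignature = Tuple[int, int]  # hypothetical: (horizontal lines, vertical lines)

# Hypothetical keyword table mapping title keywords to a major class.
MAJOR_CLASS_BY_KEYWORD: Dict[str, str] = {
    "payment": "financial",
    "reimbursement": "financial",
    "goods": "goods",
}

# Hypothetical table mapping (major class, border signature) to a document type.
TYPE_BY_BORDER: Dict[Tuple[str, BorderSignature], str] = {
    ("financial", (6, 4)): "payment application form",
    ("financial", (8, 5)): "reimbursement approval form",
}


def document_type(title: str, border: BorderSignature) -> Optional[str]:
    """First pick the major class from the title, then the subclass from the border."""
    lowered = title.lower()
    major = next(
        (cls for kw, cls in MAJOR_CLASS_BY_KEYWORD.items() if kw in lowered), None
    )
    if major is None:
        return None
    return TYPE_BY_BORDER.get((major, border))
```

The returned document type would then serve as the key for fetching the matching recognition template from the template database.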
Optionally, the OCR recognition unit 35 is further configured to:
performing OCR recognition on the document picture to be recognized by adopting a convolutional recurrent neural network (CRNN) model.
Specifically, the convolutional recurrent neural network model comprises a convolutional neural network (CNN), a bidirectional long short-term memory (LSTM) recurrent neural network, and a connectionist temporal classification (CTC) model;
performing OCR recognition on the document picture to be recognized by adopting the convolutional recurrent neural network model comprises the following steps:
the CNN extracts features of the identification area of the document picture to be identified and generates a feature sequence of the identification area;
the bidirectional LSTM determines the label distribution corresponding to each feature in the feature sequence;
and the CTC model determines the text of the identification area according to the label distribution corresponding to each feature.
Specifically, the label distribution corresponding to a feature is a softmax vector that gives the probability of each label for that feature. After the distributions of all features are passed to the CTC model, the most probable label is output for each feature, and the final label sequence, i.e., the text of the identification area, is obtained through operations such as removing blanks.
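The final decoding step can be illustrated with standard greedy (best-path) CTC decoding: take the most probable label at each step, collapse consecutive repeats, and drop the blank label. This is a sketch of the general technique rather than the application's exact procedure; the function name, the label string, and the convention that index 0 is the blank are assumptions.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    distributions: Sequence[Sequence[float]],
    labels: str,
    blank: int = 0,
) -> str:
    """Greedy CTC decoding over per-feature softmax vectors.

    distributions[t][i] is the probability of label i at feature t.
    """
    # Most probable label index at each feature (time step).
    best: List[int] = [max(range(len(d)), key=d.__getitem__) for d in distributions]
    chars: List[str] = []
    previous = None
    for index in best:
        # Emit a label only when it differs from the previous step and is not blank.
        if index != previous and index != blank:
            chars.append(labels[index])
        previous = index
    return "".join(chars)


# Labels: index 0 is the blank ("-"), 1 is "a", 2 is "b".
example = [
    [0.1, 0.8, 0.1],  # a
    [0.1, 0.7, 0.2],  # a (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.2, 0.6, 0.2],  # a (new, because a blank separates it)
    [0.1, 0.2, 0.7],  # b
    [0.1, 0.1, 0.8],  # b (repeat, collapsed)
    [0.8, 0.1, 0.1],  # blank
]
decoded = ctc_greedy_decode(example, "-ab")
```

The blank between the second and fourth steps is what allows the doubled letter to survive the collapse of repeats.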
It can be understood that handwritten characters are less regular than printed characters, so OCR recognition of handwritten characters tends to be poorer.
OCR recognition requires a high-quality input image, and the user often needs to provide one to obtain good recognition results: the resolution must not be too low, the colors must not be too rich, the contrast must not be too low, and the text in the image must not be skewed.
Optionally, the apparatus further comprises:
the picture adjusting unit is used for adjusting the brightness and the contrast of the document picture to be identified;
the gray processing unit is used for carrying out gray processing on the document picture to be identified;
and the angle adjusting unit is used for receiving an angle adjusting instruction of a user on the document picture to be identified after the gray level processing and adjusting the angle of the document picture to be identified.
The picture adjusting unit automatically adjusts the brightness and contrast of the image by using the open-source Tesseract-OCR framework; the gray processing unit uses OpenCV, an open-source cross-platform computer vision library, to convert the image to grayscale so that it becomes black and white with sharp contrast; the angle adjusting unit displays an image interaction interface through which the user can manually correct the image, adjusting its angle with a built-in function of the application program so that the text in the image is no longer skewed.
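To make the brightness, contrast, and grayscale steps concrete, here is a small pure-Python sketch. It does not use the Tesseract-OCR or OpenCV APIs mentioned above; the function names, the linear `alpha * value + beta` adjustment, and the choice of the common BT.601 luma weights are assumptions made for illustration.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def to_gray(pixel: RGB) -> int:
    """Convert one RGB pixel to a gray level using BT.601 luma weights."""
    r, g, b = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def adjust(value: int, alpha: float = 1.0, beta: float = 0.0) -> int:
    """Linear contrast (alpha) and brightness (beta) adjustment, clipped to 0..255."""
    return max(0, min(255, int(round(alpha * value + beta))))


def preprocess(image: List[List[RGB]], alpha: float, beta: float) -> List[List[int]]:
    """Grayscale every pixel, then apply the contrast and brightness adjustment."""
    return [[adjust(to_gray(p), alpha, beta) for p in row] for row in image]
```

An `alpha` above 1 stretches the contrast and a positive `beta` brightens the image; values that leave the 0..255 range are clipped.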
In an alternative embodiment of the present application, a first sample document picture is configured with a plurality of region identification templates in the identification template database, and each region identification template is used for identifying a partial region of the first sample document picture.
It should be noted that the first sample document picture is any sample document picture in the recognition template database. By configuring a plurality of region identification templates for a sample document picture, only partial regions of the document picture to be identified need to be recognized, which improves the efficiency of OCR recognition.
In practical application, taking Fig. 2 as an example, the sample document is a bill document configured with an area identification template A and an area identification template B. The identification area range of template A includes the payment amount, the payment department, and the contract number; that of template B includes the payment amount, the payment department, the applicant, and the department head. Different area identification templates can be set according to actual service requirements.
Specifically, the recognition template acquiring unit 32 is further configured to:
carrying out overall automatic frame selection on each sample document picture to obtain the identification areas of each sample document picture;
and adjusting the automatically selected identification areas: merging a plurality of identification areas that automatic frame selection erroneously separated into one identification area, and splitting one identification area that automatic frame selection erroneously combined into a plurality of identification areas.
The OCR recognition device establishes a template matched with the document to be recognized, can adapt to documents with various typesetting formats, and improves the recognition rate and accuracy of OCR recognition. Using a high-definition camera also eliminates the influence of shooting light and angle on OCR recognition. In addition, the device can recognize only a specific region of a document picture based on a region recognition template, which improves recognition efficiency.
It should be noted that, for other corresponding descriptions of each functional unit related to the OCR recognition device based on the template matching provided in the embodiment of the present application, reference may be made to corresponding descriptions in fig. 1 and fig. 2.
Based on the method shown in fig. 1, correspondingly, the embodiment of the application also provides a storage medium, on which a computer program is stored, and when the program is executed by a processor, the OCR recognition method based on template matching shown in fig. 1 is implemented.
Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.), and includes several instructions for causing a computer device (may be a personal computer, a server, or a network device, etc.) to perform the methods described in various implementation scenarios of the present application.
Based on the method shown in Fig. 1 and the virtual device embodiment shown in Fig. 3, in order to achieve the above objective, the embodiment of the present application further provides a computer device, which may specifically be a personal computer, a server, a network device, etc. The physical device includes a storage medium and a processor: the storage medium stores a computer program, and the processor executes the computer program to implement the template matching-based OCR recognition method shown in Fig. 1.
Optionally, the computer device may also include a user interface, a network interface, a camera, Radio Frequency (RF) circuitry, sensors, audio circuitry, a Wi-Fi module, and the like. The user interface may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and may optionally also include a USB interface, a card reader interface, etc. The network interface may optionally include a standard wired interface, a wireless interface (e.g., a Bluetooth interface or a Wi-Fi interface), etc.
It will be appreciated by those skilled in the art that the computer device structure provided in this embodiment does not limit the physical device, which may include more or fewer components, combine certain components, or arrange the components differently.
The storage medium may also include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the computer device and supports the execution of information processing programs as well as other software and/or programs. The network communication module is used for realizing communication among the components within the storage medium and communication with other hardware and software in the physical device.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by means of software plus necessary general hardware platforms, or may be implemented by hardware. By applying the technical scheme of the application.
It should be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the description of the present application, numerous specific details are set forth. It may be appreciated, however, that embodiments of the present application may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description. Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the application and aiding in the understanding of one or more of the various inventive aspects. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application. Those skilled in the art will also appreciate that the modules of an apparatus in an implementation scenario may be distributed in the apparatus as described for that scenario, or may be changed accordingly and located in one or more apparatuses different from that of the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The above serial numbers of the present application are merely for description and do not represent the advantages or disadvantages of the implementation scenarios. The foregoing discloses merely a few specific implementations of the present application, but the present application is not limited thereto, and any variation that can be conceived by a person skilled in the art shall fall within the protection scope of the present application.
Claims (7)
1. An OCR recognition method based on template matching, comprising:
collecting sample document pictures in different specified typesetting formats;
performing frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture;
establishing an identification template database, wherein the identification template database stores identification templates corresponding to the document pictures of the samples;
collecting a document picture to be identified, and identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified;
invoking a corresponding recognition template in the recognition template database according to the document type of the document picture to be recognized obtained by recognition to perform OCR recognition on the document picture to be recognized;
the identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified includes:
performing binarization processing on the document picture to be identified to obtain a binarization table image;
performing tilt correction on the binarized table image using a tilt correction algorithm based on perspective transformation;
extracting the border of the document picture to be identified from the tilt-corrected binarized table image by an image morphology processing method;
performing OCR on a preset area of the tilt-corrected binarized table image to obtain the title of the document picture to be identified;
obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified;
the OCR for the document picture to be identified comprises the following steps:
performing OCR (optical character recognition) on the document picture to be recognized by adopting a convolutional recurrent neural network model;
the frame selection of each sample document picture comprises the following steps:
carrying out overall automatic frame selection on each sample document picture to obtain the identification areas of each sample document picture;
and adjusting the automatically selected identification areas: merging a plurality of identification areas that automatic frame selection erroneously separated into one identification area, and splitting one identification area that automatic frame selection erroneously combined into a plurality of identification areas.
2. The method of claim 1, wherein the convolutional recurrent neural network model comprises a convolutional neural network (CNN), a bidirectional long short-term memory (LSTM) recurrent neural network, and a connectionist temporal classification (CTC) model;
performing OCR (optical character recognition) on the document picture to be recognized by adopting the convolutional recurrent neural network model comprises the following steps:
the CNN extracts features of the identification area of the document picture to be identified and generates a feature sequence of the identification area;
the bidirectional LSTM determines the label distribution corresponding to each feature in the feature sequence;
and the CTC model determines the text of the identification area according to the label distribution corresponding to each feature.
3. The method of claim 1, wherein prior to invoking the corresponding recognition template in the recognition template database to OCR recognize the document picture to be recognized, the method further comprises:
adjusting the brightness and contrast of the document picture to be identified;
carrying out gray processing on the document picture to be identified;
and receiving an angle adjustment instruction of a user on the document picture to be identified after gray processing, and adjusting the angle of the document picture to be identified.
4. The method of claim 1, wherein a first sample document picture is configured with a plurality of region identification templates in the identification template database, each region identification template for identifying a partial region of the sample document picture.
5. An OCR recognition device based on template matching, comprising:
the sample document picture collecting unit is used for collecting sample document pictures in different specified typesetting formats;
the identification template acquisition unit is used for carrying out frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture;
the identification template database establishing unit is used for establishing an identification template database, and the identification template database stores identification templates corresponding to the document pictures of the samples;
the document type acquisition unit is used for acquiring a document picture to be identified, identifying the border and the title of the document picture to be identified and obtaining the document type of the document picture to be identified;
the OCR recognition unit is used for calling a corresponding recognition template in the recognition template database according to the document type of the document picture to be recognized obtained by recognition to perform OCR recognition on the document picture to be recognized;
the document type acquisition unit is configured to:
performing binarization processing on the document picture to be identified to obtain a binarization table image;
performing tilt correction on the binarized table image using a tilt correction algorithm based on perspective transformation;
extracting the border of the document picture to be identified from the tilt-corrected binarized table image by an image morphology processing method;
performing OCR on a preset area of the tilt-corrected binarized table image to obtain the title of the document picture to be identified;
obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified;
the OCR recognition unit is used for:
performing OCR (optical character recognition) on the document picture to be recognized by adopting a convolutional recurrent neural network model;
the identification template acquisition unit is used for:
carrying out overall automatic frame selection on each sample document picture to obtain the identification areas of each sample document picture;
and adjusting the automatically selected identification areas: merging a plurality of identification areas that automatic frame selection erroneously separated into one identification area, and splitting one identification area that automatic frame selection erroneously combined into a plurality of identification areas.
6. A storage medium having stored thereon a computer program, which when executed by a processor implements the template matching based OCR recognition method of any one of claims 1 to 4.
7. A computer device comprising a storage medium, a processor and a computer program stored on the storage medium and executable on the processor, characterized in that the processor implements the template matching based OCR recognition method of any one of claims 1 to 4 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910127136.1A CN110008944B (en) | 2019-02-20 | 2019-02-20 | OCR recognition method and device based on template matching and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110008944A CN110008944A (en) | 2019-07-12 |
CN110008944B true CN110008944B (en) | 2024-02-13 |
Family
ID=67165937
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910127136.1A Active CN110008944B (en) | 2019-02-20 | 2019-02-20 | OCR recognition method and device based on template matching and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110008944B (en) |
- 2019-02-20: CN application CN201910127136.1A (CN110008944B), status: Active
Also Published As
Publication number | Publication date |
---|---|
CN110008944A (en) | 2019-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110008944B (en) | OCR recognition method and device based on template matching and storage medium | |
CN109492643B (en) | Certificate identification method and device based on OCR, computer equipment and storage medium | |
CN111652223B (en) | Certificate identification method and device | |
US8908975B2 (en) | Apparatus and method for automatically recognizing a QR code | |
CN111353497B (en) | Identification method and device for identity card information | |
CN110766014A (en) | Bill information positioning method, system and computer readable storage medium | |
CN110705534B (en) | Wrong problem book generation method suitable for electronic typoscope | |
US8755595B1 (en) | Automatic extraction of character ground truth data from images | |
US20060256388A1 (en) | Semantic classification and enhancement processing of images for printing applications | |
CN104143094A (en) | Automatic test paper marking method and system without answer sheet | |
US11341739B2 (en) | Image processing device, image processing method, and program recording medium | |
CN102360419A (en) | Method and system for computer scanning reading management | |
CN112434699A (en) | Automatic extraction and intelligent scoring system for handwritten Chinese characters or components and strokes | |
CN108304562B (en) | Question searching method and device and intelligent terminal | |
CN110598566A (en) | Image processing method, device, terminal and computer readable storage medium | |
CN110909740A (en) | Information processing apparatus and storage medium | |
US11881043B2 (en) | Image processing system, image processing method, and program | |
CN111915635A (en) | Test question analysis information generation method and system supporting self-examination paper marking | |
CN112686257A (en) | Storefront character recognition method and system based on OCR | |
US20150379339A1 (en) | Techniques for detecting user-entered check marks | |
CN105678301B (en) | method, system and device for automatically identifying and segmenting text image | |
US20210209393A1 (en) | Image processing system, image processing method, and program | |
CN116092231A (en) | Ticket identification method, ticket identification device, terminal equipment and storage medium | |
CN113159029A (en) | Method and system for accurately capturing local information in picture | |
CN111213157A (en) | Express information input method and system based on intelligent terminal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||