CN110008944B - OCR recognition method and device based on template matching and storage medium - Google Patents

OCR recognition method and device based on template matching and storage medium

Publication number
CN110008944B
Authority
CN
China
Prior art keywords
document
document picture
identified
identification
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910127136.1A
Other languages
Chinese (zh)
Other versions
CN110008944A (en)
Inventor
高梁梁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201910127136.1A
Publication of CN110008944A
Application granted
Publication of CN110008944B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Character Input (AREA)

Abstract

The application discloses an OCR (optical character recognition) method and device based on template matching, a storage medium and computer equipment, and relates to the technical field of information processing. The method comprises the following steps: collecting sample document pictures of different appointed typesetting modes; performing frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture; establishing an identification template database, wherein the identification template database stores the identification templates corresponding to the sample document pictures; collecting a document picture to be identified, and identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified; and calling the corresponding identification template in the identification template database according to the document type of the document picture to be identified, and performing OCR recognition on the document picture to be identified. Because an identification template database is established, the method can adapt to recognition of documents with various typesetting formats, which improves the accuracy of OCR recognition.

Description

OCR recognition method and device based on template matching and storage medium
Technical Field
The present invention relates to the field of information processing technologies, and in particular, to an OCR recognition method and apparatus based on template matching, a storage medium, and a computer device.
Background
Optical character recognition (OCR) refers to obtaining an electronic document of a paper document by an electronic device (e.g., a scanner or a digital camera), cutting the character strings in the electronic document apart to form small pictures each containing a single character, and then recognizing the cut characters by using a certain method.
Because the character typesetting of the pictures to be recognized varies widely, the conventional OCR recognition method can accurately recognize only pictures with fixed character typesetting, such as identity cards and bank cards, and has a poor recognition effect on pictures of other documents.
Disclosure of Invention
In view of this, the present application provides an OCR recognition method and apparatus, a storage medium, and a computer device based on template matching, which mainly aims to solve the problem of poor recognition effect of the existing OCR recognition method.
According to one aspect of the present application, there is provided an OCR recognition method based on template matching, the method comprising:
collecting sample document pictures of different appointed typesetting modes;
performing frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture;
establishing an identification template database, wherein the identification template database stores the identification templates corresponding to the sample document pictures;
collecting a document picture to be identified, and identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified;
and calling a corresponding recognition template in the recognition template database according to the document type of the document picture to be recognized, and performing OCR recognition on the document picture to be recognized.
Optionally, the identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified includes:
performing binarization processing on the document picture to be identified to obtain a binarized table image;
performing tilt correction on the binarized table image using a tilt correction algorithm based on perspective transformation;
extracting the border of the document picture to be identified from the tilt-corrected binarized table image by an image morphology processing method;
performing OCR on a preset area of the tilt-corrected binarized table image to obtain the title of the document picture to be identified;
and obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified.
Optionally, the performing OCR recognition on the document picture to be recognized includes:
and adopting a convolutional recurrent neural network (CRNN) model to perform OCR recognition on the document picture to be recognized.
Optionally, the convolutional recurrent neural network model includes a convolutional neural network (CNN), a bidirectional recurrent neural network (LSTM), and a connectionist temporal classification (CTC) model;
the adopting the convolutional recurrent neural network model to perform OCR recognition on the document picture to be recognized comprises the following steps:
the convolutional neural network CNN extracts the features of the identification area of the document picture to be identified and generates a feature sequence of the identification area;
the bidirectional recurrent neural network LSTM determines a label distribution list corresponding to each feature in the feature sequence;
and the connectionist temporal classification CTC model determines the characters of the identification area according to the label distribution list corresponding to each feature.
Optionally, the collecting the document picture to be identified includes:
and acquiring a document picture to be identified by a high-definition camera with an automatic shooting angle adjusting function.
Optionally, before invoking the corresponding recognition template in the recognition template database to perform OCR recognition on the document picture to be recognized, the method further includes:
adjusting the brightness and contrast of the document picture to be identified;
carrying out gray processing on the document picture to be identified;
and receiving an angle adjustment instruction of a user on the document picture to be identified after gray processing, and adjusting the angle of the document picture to be identified.
Optionally, the first sample document picture is configured with a plurality of region identification templates in the identification template database, each region identification template being used for identifying a partial region of the sample document picture.
Optionally, the framing each sample document picture includes:
carrying out integral automatic frame selection on each sample document picture to obtain an identification area of each sample document picture;
and adjusting the identification areas obtained by the automatic frame selection: merging a plurality of identification areas that the automatic frame selection erroneously produced into one identification area, and splitting one identification area that the automatic frame selection erroneously produced into a plurality of identification areas.
According to another aspect of the present application, there is provided an OCR recognition device based on template matching, the device comprising:
the sample document picture collecting unit is used for collecting sample document pictures of different appointed typesetting modes;
the identification template acquisition unit is used for carrying out frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture;
the identification template database establishing unit is used for establishing an identification template database, and the identification template database stores the identification templates corresponding to the sample document pictures;
the document type acquisition unit is used for acquiring a document picture to be identified, identifying the border and the title of the document picture to be identified and obtaining the document type of the document picture to be identified;
and the OCR recognition unit is used for calling the corresponding identification template in the identification template database according to the obtained document type of the document picture to be identified, so as to perform OCR recognition on the document picture to be identified.
Optionally, the document type obtaining unit is further configured to:
performing binarization processing on the document picture to be identified to obtain a binarized table image;
performing tilt correction on the binarized table image using a tilt correction algorithm based on perspective transformation;
extracting the border of the document picture to be identified from the tilt-corrected binarized table image by an image morphology processing method;
performing OCR on a preset area of the tilt-corrected binarized table image to obtain the title of the document picture to be identified;
and obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified.
Optionally, the OCR recognition unit is further configured to:
and adopting a convolutional recurrent neural network (CRNN) model to perform OCR recognition on the document picture to be recognized.
Specifically, the convolutional recurrent neural network model comprises a convolutional neural network (CNN), a bidirectional recurrent neural network (LSTM) and a connectionist temporal classification (CTC) model;
the adopting the convolutional recurrent neural network model to perform OCR recognition on the document picture to be recognized comprises the following steps:
the convolutional neural network CNN extracts the features of the identification area of the document picture to be identified and generates a feature sequence of the identification area;
the bidirectional recurrent neural network LSTM determines a label distribution list corresponding to each feature in the feature sequence;
and the connectionist temporal classification CTC model determines the characters of the identification area according to the label distribution list corresponding to each feature.
Optionally, the apparatus further comprises:
the picture adjusting unit is used for adjusting the brightness and the contrast of the document picture to be identified;
the gray processing unit is used for carrying out gray processing on the document picture to be identified;
and the angle adjusting unit is used for receiving an angle adjusting instruction of a user on the document picture to be identified after the gray level processing and adjusting the angle of the document picture to be identified.
Optionally, the first sample document picture is configured with a plurality of region identification templates in the identification template database, each region identification template being used for identifying a partial region of the sample document picture.
Specifically, the identification template acquisition unit is further configured to:
carrying out integral automatic frame selection on each sample document picture to obtain an identification area of each sample document picture;
and adjusting the identification areas obtained by the automatic frame selection: merging a plurality of identification areas that the automatic frame selection erroneously produced into one identification area, and splitting one identification area that the automatic frame selection erroneously produced into a plurality of identification areas.
According to yet another aspect of the present application, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described template matching-based OCR recognition method.
According to yet another aspect of the present application, there is provided a computer device comprising a storage medium, a processor and a computer program stored on the storage medium and executable on the processor, the processor implementing the above-described template matching-based OCR recognition method when executing the program.
By means of the technical scheme, the OCR recognition method and device based on template matching, the storage medium and the computer equipment establish a recognition template database, can adapt to recognition of documents with different typesetting formats, and improve accuracy of OCR recognition.
In addition, a high-definition camera is adopted to collect the document picture to be identified, so that the influence of shooting light and angles on OCR recognition can be eliminated. A specific region of a document picture can also be identified based on a region identification template, which improves the recognition efficiency.
The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the present application are more readily apparent, specific embodiments of the present application are described below.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 shows a schematic flow chart of an OCR recognition method based on template matching according to an embodiment of the present application;
FIG. 2 shows a schematic diagram of a sample document provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an OCR recognition device based on template matching according to an embodiment of the present application.
Detailed Description
The present application will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other.
To address the poor recognition effect of the existing OCR recognition method, this embodiment provides an OCR recognition method based on template matching, which can adapt to recognition of documents in a plurality of different typesetting formats and improve the accuracy of OCR recognition. As shown in FIG. 1, the method comprises the following steps:
S11: collecting sample document pictures of different appointed typesetting modes;
In practical application, sample document pictures in different appointed typesetting modes are collected through a high-definition camera with an automatic shooting angle adjusting function.
S12: performing frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture;
It should be noted that, in the embodiment of the present application, a sample document picture is collected, the identification area range of the sample document picture is determined, and an identification template corresponding to the document picture to be identified is established. The identification template includes the coordinate position and the area name of each identification area.
It can be understood that in OCR recognition, the quality of image segmentation directly affects the recognition rate. When OCR recognition is performed on an incorrectly segmented image, a correct recognition result is often not obtained. Therefore, the present application establishes a template of the document picture to be identified and records the coordinate position and the area name of each identification area in the template. The printed areas of the document picture to be identified are not recognized; instead, the area names in the template are used directly as the characters of the printed areas, which improves the recognition efficiency.
S13: establishing an identification template database, wherein the identification template database stores the identification templates corresponding to the sample document pictures;
It can be understood that the embodiment of the application establishes an identification template database comprising the identification templates corresponding to various sample document pictures, so that document pictures to be identified with different typesetting modes can be recognized more accurately.
S14: collecting a document picture to be identified, and identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified;
S15: calling the corresponding identification template in the identification template database according to the document type of the document picture to be identified, and performing OCR recognition on the document picture to be identified.
It can be understood that, in the embodiment of the application, the identification areas and area names of the document picture to be identified are determined from the identification template matched with that picture. The identification areas can therefore be located accurately; OCR recognition is performed on the identification areas (in practice, the handwritten areas), and the area names in the identification template are used as the characters of the printed areas.
According to the OCR recognition method based on the template matching, a recognition template database is established, the method can adapt to recognition of documents in various typesetting formats, and accuracy of OCR recognition is improved.
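Steps S12 to S15 can be illustrated with a minimal sketch, offered under stated assumptions rather than as the implementation described in this application. The names `Region`, `RecognitionTemplate`, `TemplateDatabase`, `recognize` and `fake_ocr` are hypothetical, the page is modeled as a list of text rows instead of pixels, and the stub `fake_ocr` simply crops text in place of a real OCR engine.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); an assumed convention


@dataclass(frozen=True)
class Region:
    name: str  # area name, reused as the text of the printed label
    box: Box   # coordinate position of the area to be recognized


@dataclass
class RecognitionTemplate:
    doc_type: str
    regions: List[Region] = field(default_factory=list)


class TemplateDatabase:
    """Maps a document type to its recognition template (step S13)."""

    def __init__(self) -> None:
        self._templates: Dict[str, RecognitionTemplate] = {}

    def add(self, template: RecognitionTemplate) -> None:
        self._templates[template.doc_type] = template

    def get(self, doc_type: str) -> RecognitionTemplate:
        return self._templates[doc_type]


def recognize(image, doc_type: str, db: TemplateDatabase,
              ocr_region: Callable[[object, Box], str]) -> Dict[str, str]:
    """Step S15: look up the template and run OCR only on its regions."""
    template = db.get(doc_type)
    return {r.name: ocr_region(image, r.box) for r in template.regions}


def fake_ocr(image: List[str], box: Box) -> str:
    """Stand-in for an OCR engine: crops characters from a text 'image'."""
    x, y, w, h = box
    return "".join(row[x:x + w] for row in image[y:y + h]).strip()


db = TemplateDatabase()
db.add(RecognitionTemplate("payment_request", [
    Region("amount", (8, 0, 6, 1)),
    Region("applicant", (11, 1, 5, 1)),
]))
page = ["Amount: 1200  ", "Applicant: Li  "]
result = recognize(page, "payment_request", db, fake_ocr)
print(result)
```

In this sketch the printed labels ("Amount", "Applicant") are never passed to the OCR stub; the region names from the template stand in for them, which mirrors the efficiency argument made above.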
In an alternative implementation of the embodiment of the present application, similar to the method in FIG. 1, identifying the border and the title of the document picture to be identified in step S14 to obtain the document type of the document picture to be identified includes:
performing binarization processing on the document picture to be identified to obtain a binarized table image;
performing tilt correction on the binarized table image using a tilt correction algorithm based on perspective transformation;
extracting the border of the document picture to be identified from the tilt-corrected binarized table image by an image morphology processing method;
performing OCR on a preset area of the tilt-corrected binarized table image to obtain the title of the document picture to be identified;
and obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified.
It should be noted that the border of the document picture to be identified refers to the horizontal and vertical border lines of the table in the document to be identified. The specific process for extracting the border of the document picture to be identified is as follows:
selecting a horizontal structural element and a vertical structural element respectively, and performing a morphological opening operation on the tilt-corrected binarized table image with each of them to obtain a table horizontal line image and a table vertical line image;
performing an AND operation on the table horizontal line image and the table vertical line image to obtain a table frame image;
and performing thinning on the table frame image to extract the skeleton of the table frame lines, that is, the border of the document picture to be identified.
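The line-extraction steps above can be sketched in plain Python on a tiny binary grid. This is an illustration under assumptions, not the application's implementation: line pixels are assumed to be 1 on a 0 background, the helper names are hypothetical, and the thinning step is omitted. Opening with a flat line element of length k keeps exactly the runs of at least k consecutive foreground pixels, which is what `open_horizontal` computes. The text specifies an AND operation; under the polarity assumed here AND keeps only the crossing points while OR keeps the full grid, so both are shown because the polarity used by the source is not stated.

```python
import operator
from typing import Callable, List

Grid = List[List[int]]


def open_horizontal(img: Grid, k: int) -> Grid:
    """Morphological opening with a 1 x k line: keep runs of >= k ones."""
    out = [[0] * len(row) for row in img]
    for y, row in enumerate(img):
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                if x - start >= k:  # the line element fits inside this run
                    for i in range(start, x):
                        out[y][i] = 1
            else:
                x += 1
    return out


def transpose(img: Grid) -> Grid:
    return [list(col) for col in zip(*img)]


def open_vertical(img: Grid, k: int) -> Grid:
    """Opening with a k x 1 line, done by transposing the image."""
    return transpose(open_horizontal(transpose(img), k))


def combine(a: Grid, b: Grid, op: Callable[[int, int], int]) -> Grid:
    return [[op(p, q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


# A 5 x 5 table outline with one noise pixel in the middle.
table = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
horizontal = open_horizontal(table, 3)
vertical = open_vertical(table, 3)
frame = combine(horizontal, vertical, operator.or_)    # full grid of lines
joints = combine(horizontal, vertical, operator.and_)  # line crossings only
for row in frame:
    print(row)
```

The noise pixel in the centre is removed because it is shorter than the structuring element in both directions.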
It can be understood that the area in the upper middle of the document picture to be identified contains its title. The major category of the document picture to be identified, such as the financial category or the goods category, is determined from the title. The subcategory, such as a payment request form or a reimbursement approval form, can be further determined from the border: both forms belong to the financial category but have different borders, and therefore correspond to different document types.
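The two-stage decision described above (title for the major category, border for the subcategory) might look like the following sketch. Everything in it is a hypothetical illustration: the keyword table, the use of a (horizontal line count, vertical line count) pair as the border signature, and the type names are assumptions, since the application does not state how borders are compared.

```python
from typing import Dict, Optional, Tuple

BorderSignature = Tuple[int, int]  # assumed: (horizontal lines, vertical lines)

# Hypothetical lookup tables; real titles and border layouts would come
# from the sample documents used to build the template database.
MAJOR_BY_KEYWORD: Dict[str, str] = {
    "finance": "financial",
    "materials": "goods",
}
SUBTYPE_BY_BORDER: Dict[Tuple[str, BorderSignature], str] = {
    ("financial", (6, 4)): "payment_request_form",
    ("financial", (8, 5)): "reimbursement_approval_form",
}


def classify(title: str, border: BorderSignature) -> Optional[str]:
    """Return the document type, or None if the title or border is unknown."""
    lowered = title.lower()
    major = next((cat for kw, cat in MAJOR_BY_KEYWORD.items() if kw in lowered), None)
    if major is None:
        return None
    return SUBTYPE_BY_BORDER.get((major, border))


doc_type = classify("Finance Department Form", (6, 4))
print(doc_type)
```

Two forms with the same title keyword but different border signatures resolve to different document types, and therefore to different templates.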
In another optional implementation of the embodiment of the present application, performing OCR recognition on the document picture to be recognized in step S15 includes:
and adopting a convolutional recurrent neural network (CRNN) model to perform OCR recognition on the document picture to be recognized.
Specifically, the convolutional recurrent neural network model comprises a convolutional neural network (CNN), a bidirectional recurrent neural network (LSTM) and a connectionist temporal classification (CTC) model;
the adopting the convolutional recurrent neural network model to perform OCR recognition on the document picture to be recognized comprises the following steps:
the convolutional neural network CNN extracts the features of the identification area of the document picture to be identified and generates a feature sequence of the identification area;
the bidirectional recurrent neural network LSTM determines a label distribution list corresponding to each feature in the feature sequence;
and the connectionist temporal classification CTC model determines the characters of the identification area according to the label distribution list corresponding to each feature.
Specifically, the label distribution list corresponding to a feature is a softmax vector that gives the probability of each label for that feature. After the probabilities of all the features are passed to the CTC model, the most probable label is output for each feature, and the final label sequence, namely the text of the identification area, is obtained through operations such as removing blanks.
It can be understood that handwritten characters are less regular than printed characters, so OCR recognition of handwritten characters tends to be less accurate.
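The decoding step can be illustrated with greedy (best-path) CTC decoding, a common way to turn per-feature label distributions into text. The sketch below is an assumption about how the decoding could be done, not a statement of the application's exact procedure: it takes the most probable label at each step, merges consecutive repeats, and drops the blank label. The function name, the three-label alphabet and the probabilities are made up for illustration.

```python
from typing import List, Sequence


def ctc_greedy_decode(probs: Sequence[Sequence[float]], labels: str,
                      blank: int = 0) -> str:
    """Best-path CTC decoding over a sequence of softmax vectors."""
    # Most probable label index for each feature in the sequence.
    best: List[int] = [max(range(len(p)), key=p.__getitem__) for p in probs]
    out: List[str] = []
    prev = None
    for idx in best:
        # Emit a label only when it differs from the previous step
        # and is not the blank symbol.
        if idx != prev and idx != blank:
            out.append(labels[idx])
        prev = idx
    return "".join(out)


# Index 0 is the blank; "-" is only a placeholder so indices line up.
alphabet = "-ab"
label_distributions = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, merged)
    [0.90, 0.05, 0.05],  # blank (separates the two a's)
    [0.10, 0.60, 0.30],  # a
    [0.10, 0.20, 0.70],  # b
    [0.20, 0.10, 0.70],  # b (repeat, merged)
]
text = ctc_greedy_decode(label_distributions, alphabet)
print(text)  # prints "aab"
```

The blank between the second and fourth steps is what allows the doubled "a" to survive the merging of repeats.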
Preferably, the collecting the document picture to be identified includes:
and acquiring a document picture to be identified by a high-definition camera with an automatic shooting angle adjusting function.
It should be noted that the high-definition camera can adjust the ISO value or the exposure according to the intensity of the ambient light to improve the quality of the document picture. In a digital camera, ISO indicates the light sensitivity of the CCD or CMOS photosensitive element; a higher ISO value indicates a higher sensitivity of the photosensitive element.
In general, the lower the ISO value, the higher the quality of the photo and the finer its detail. The higher the ISO value, the brighter the photo; the quality of the photo decreases as the ISO value increases and the noise becomes more serious, but a high ISO value can compensate for insufficient light.
In addition, the high-definition camera can adjust the shooting angle according to the position of the document, which prevents skewed text in the document from affecting the OCR recognition result.
OCR recognition requires a high-quality input image, and the user usually needs to provide a high-quality image to obtain a good recognition result. The resolution must not be too low, the colors must not be too varied, the contrast must not be too low, and the text on the image must not be skewed.
Preferably, in the embodiment of the present application, before invoking a corresponding recognition template in the recognition template database to perform OCR recognition on the document picture to be recognized, the preprocessing of the document picture to be recognized further includes:
adjusting the brightness and contrast of the document picture to be identified;
carrying out gray processing on the document picture to be identified;
and receiving an angle adjustment instruction of a user on the document picture to be identified after gray processing, and adjusting the angle of the document picture to be identified.
In an alternative embodiment, the present application uses the Tesseract-OCR open source framework to automatically adjust the brightness and contrast of the image, and uses OpenCV, an open source cross-platform computer vision library, to perform gray processing on the image so that its colors become black and white with clear contrast. An image interaction interface is displayed so that the user can correct the image manually: the user adjusts the angle of the image through the functions provided by the application program until the characters on the image are no longer skewed.
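The gray processing and contrast adjustment can be sketched in plain Python, without calling Tesseract or OpenCV, to show what the two operations do to pixel values. The sketch assumes the widely used 0.299/0.587/0.114 luma weights for the grayscale conversion and a simple linear contrast stretch; the application does not say which formulas are used, and the function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def to_gray(rgb: List[List[Pixel]]) -> List[List[int]]:
    """Weighted luma conversion in integer arithmetic, rounded to nearest."""
    return [[(299 * r + 587 * g + 114 * b + 500) // 1000 for r, g, b in row]
            for row in rgb]


def stretch_contrast(gray: List[List[int]]) -> List[List[int]]:
    """Linearly map the darkest pixel to 0 and the brightest to 255."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in gray]
    span = hi - lo
    return [[((p - lo) * 255 + span // 2) // span for p in row] for row in gray]


picture = [
    [(100, 100, 100), (140, 140, 140)],
    [(200, 200, 200), (100, 100, 100)],
]
gray = to_gray(picture)
enhanced = stretch_contrast(gray)
print(gray)
print(enhanced)
```

After the stretch, the pixel values span the full 0 to 255 range, which is the "clear contrast" effect the paragraph above refers to.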
In an alternative embodiment of the present application, the first sample document picture is configured with a plurality of region identification templates in the identification template database, and each region identification template is used for identifying a partial region of the sample document picture.
It should be noted that the first sample picture is any sample document picture in the recognition template database. By configuring a plurality of region identification templates for a certain sample document picture, the identification of partial regions of the document picture to be identified can be realized, and the efficiency of OCR identification is improved.
In practical application, taking FIG. 2 as an example, the sample document is a bill document configured with a region identification template A and a region identification template B. The identification area range of region identification template A includes the requested amount, the requesting department and the contract number; the identification area range of region identification template B includes the requested amount, the requesting department, the applicant and the department head. Different region identification templates can be set according to actual service requirements.
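One way to picture region identification templates A and B is as named subsets of the regions of one sample document. The sketch below is illustrative only: the coordinates are invented placeholders, and the names `ALL_REGIONS`, `REGION_TEMPLATES` and `regions_for` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); placeholder values

# Every fill-in region of the sample bill document (coordinates invented).
ALL_REGIONS: Dict[str, Box] = {
    "requested_amount": (120, 80, 200, 30),
    "requesting_department": (120, 120, 200, 30),
    "contract_number": (120, 160, 200, 30),
    "applicant": (420, 80, 160, 30),
    "department_head": (420, 120, 160, 30),
}

# Each region template lists only the regions a given business case needs.
REGION_TEMPLATES: Dict[str, List[str]] = {
    "A": ["requested_amount", "requesting_department", "contract_number"],
    "B": ["requested_amount", "requesting_department", "applicant",
          "department_head"],
}


def regions_for(template_id: str) -> Dict[str, Box]:
    """Return the name-to-box mapping that OCR should be run on."""
    return {name: ALL_REGIONS[name] for name in REGION_TEMPLATES[template_id]}


print(list(regions_for("A")))
print(list(regions_for("B")))
```

Running OCR on only three or four boxes instead of the whole page is where the efficiency gain mentioned above would come from.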
In another embodiment of the present solution, in order to improve the efficiency of creating the template of the document picture to be identified, performing frame selection on each sample document picture to obtain the identification template corresponding to each sample document picture may further include:
carrying out integral automatic frame selection on each sample document picture to obtain an identification area of each sample document picture;
and adjusting the identification areas obtained by the automatic frame selection: merging a plurality of identification areas that the automatic frame selection erroneously produced into one identification area, and splitting one identification area that the automatic frame selection erroneously produced into a plurality of identification areas.
It can be understood that, in the embodiment of the present application, integral automatic frame selection is performed on the document picture to be identified, and the result of the automatic frame selection is then adjusted, so as to establish an identification template corresponding to the document picture to be identified. The identification template includes the coordinate position and the area name of each identification area. When a region is marked automatically, the frame may be drawn anywhere inside the region and its lines are adjusted automatically to the region boundary; the frame may also be drawn in the blank space outside the four boundaries of the region, in which case its lines shrink automatically to the region boundary.
Specifically: where one region was erroneously marked as several, the automatic frame selection result is adjusted by merging the erroneously marked regions into one region; where several regions were erroneously marked as one, the result is adjusted by splitting the erroneously marked region into a plurality of regions.
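The two adjustments can be sketched as simple bounding-box operations. This is a minimal illustration under assumptions: boxes are (left, top, right, bottom) tuples, merging takes the enclosing rectangle, and splitting cuts a box at given x positions. The function names and the choice of vertical cuts are hypothetical; the application does not specify how the split positions are chosen.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def merge_boxes(boxes: Sequence[Box]) -> Box:
    """Merge regions wrongly marked as several into their enclosing box."""
    lefts, tops, rights, bottoms = zip(*boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


def split_box(box: Box, cut_xs: Sequence[int]) -> List[Box]:
    """Split a region wrongly marked as one at the given x positions."""
    left, top, right, bottom = box
    edges = [left, *sorted(cut_xs), right]
    return [(a, top, b, bottom) for a, b in zip(edges, edges[1:])]


# One field that automatic frame selection broke into two pieces.
merged = merge_boxes([(10, 10, 50, 30), (55, 12, 120, 30)])
# Two adjacent fields that automatic frame selection marked as one.
pieces = split_box((0, 0, 100, 20), [40])
print(merged)
print(pieces)
```

In an interactive tool the cut positions would presumably come from the person correcting the template; here they are passed in directly.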
By establishing templates matched with the documents to be recognized, the OCR recognition method can adapt to documents with different layout formats and improves the recognition rate and accuracy of OCR recognition. Collecting the pictures with the high-definition camera can eliminate the influence of shooting light and angle on OCR recognition. The OCR recognition method can also recognize only a specific region of a document picture based on the area identification template, which improves recognition efficiency.
Fig. 3 is a schematic structural diagram of an OCR recognition device based on template matching according to an embodiment of the present application. As shown in fig. 3, the apparatus of the embodiment of the present application includes:
a sample document picture collecting unit 31, configured to collect sample document pictures in different specified layout formats;
in practical application, the sample document picture collecting unit 31 collects the sample document pictures in different specified layout formats through a high-definition camera that automatically adjusts its shooting angle.
An identification template obtaining unit 32, configured to perform frame selection on each sample document picture, and obtain an identification template corresponding to each sample document picture;
it should be noted that, in the embodiment of the present application, a sample document picture is collected, an identification area range of the sample document picture is determined, an identification template corresponding to the document picture to be identified is established, and the identification template includes coordinate positions and area names of each identification area.
It can be understood that in OCR recognition, the quality of image segmentation directly affects the recognition rate. When OCR recognition is performed on an incorrectly segmented image, a correct recognition result often cannot be obtained. Therefore, the present application establishes a template of the document picture to be identified and records the coordinate positions and the area names of the identification areas in the template. The printed areas of the document picture to be identified are not recognized; instead, the area name in the template is used directly as the text of the printed area, which improves recognition efficiency.
An identification template database creation unit 33, configured to create an identification template database, where identification templates corresponding to the respective sample document pictures are stored in the identification template database;
it can be understood that, the embodiment of the application establishes the recognition template database comprising the recognition templates corresponding to various sample document pictures, so that the document pictures to be recognized with different typesetting modes can be recognized more accurately.
The document type obtaining unit 34 is configured to collect a document picture to be identified, and identify a border and a title of the document picture to be identified, so as to obtain a document type of the document picture to be identified.
And the OCR recognition unit 35 is configured to collect a document picture to be recognized, and call a corresponding recognition template in the recognition template database to perform OCR recognition on the document picture to be recognized.
It can be understood that, according to the embodiment of the application, the identification areas and area names of the document picture to be identified are determined from the identification template matched with that picture, so the identification areas can be located accurately. OCR recognition is performed only on the identification areas (in practice, the handwritten areas), and the area name in the identification template is used as the text of the printed area.
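The division of labour described here, where the printed label comes from the template and only the fill-in area goes through OCR, can be sketched as below. The template layout, the `recognize` callable and the function names are assumptions for illustration; any OCR engine could stand behind `recognize`.

```python
def crop(image, box):
    """Cut the (x0, y0, x1, y1) box out of an image stored as a list of rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def apply_template(image, template, recognize):
    """Return one record per identification area.

    The printed label is copied from the template's area name; only the
    cropped identification area is passed to the recognizer.
    """
    records = []
    for area in template:
        text = recognize(crop(image, area["box"]))
        records.append({"label": area["name"], "text": text})
    return records
```

Because the recognizer is passed in as a parameter, the same loop works whether the areas contain handwriting or print.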
The OCR recognition device based on the template matching establishes a recognition template database, can adapt to recognition of documents in various typesetting formats, and improves the accuracy of OCR recognition.
Optionally, the document type obtaining unit 34 is further configured to:
performing binarization processing on the document picture to be identified to obtain a binarized table image;
performing tilt correction on the binarized table image using a tilt correction algorithm based on perspective transformation;
extracting the border of the document picture to be identified from the tilt-corrected binarized table image using an image morphology processing method;
performing OCR on a preset area of the tilt-corrected binarized table image to obtain the title of the document picture to be identified;
and obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified.
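The first step of the list above, binarization, can be sketched in plain Python. The source does not say how the threshold is chosen; the sketch assumes Otsu's method, a common choice, and assumes an 8-bit grayscale image stored as a list of rows with dark ink on a light background. The function names are hypothetical.

```python
def otsu_threshold(gray):
    """Return the threshold that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for row in gray:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray):
    """Map dark pixels (ink and table lines) to 1 and background to 0."""
    t = otsu_threshold(gray)
    return [[1 if px <= t else 0 for px in row] for row in gray]
```

Marking ink as 1 is a convention chosen here so that later morphological steps can treat table lines as foreground.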
It should be noted that the border of the document picture to be identified refers to the horizontal and vertical frame lines of the table in the document to be identified. The specific process for extracting the border of the document picture to be identified is as follows:
selecting a horizontal structuring element and a vertical structuring element respectively, and performing an opening operation on the tilt-corrected binarized table image with each to obtain a table horizontal line image and a table vertical line image;
performing an AND operation on the table horizontal line image and the table vertical line image to obtain a table frame image;
and thinning the table frame image to extract the skeleton of the table frame lines, that is, the border of the document picture to be identified.
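The opening step can be sketched without any image library. For a straight-line structuring element, opening a 0/1 image keeps exactly the runs of foreground pixels that are at least as long as the element, which is what `open_rows` below does; the vertical case reuses it on the transposed image. The function names and the choice of 1 as foreground are assumptions. Note that the result of combining the two line images depends on polarity, which the source does not state: with lines stored as 1, a pixel-wise AND keeps only the crossings and a pixel-wise OR keeps every line pixel, whereas with dark lines stored as 0 on a background of 1 the roles are swapped and AND keeps every line pixel. The thinning step is not shown.

```python
def open_rows(binary, length):
    """Opening with a 1 x length horizontal element on a 0/1 image.

    Keeps only runs of at least `length` consecutive foreground pixels.
    """
    result = []
    for row in binary:
        kept = [0] * len(row)
        start = None
        for x, pixel in enumerate(list(row) + [0]):  # trailing 0 closes the last run
            if pixel and start is None:
                start = x
            elif not pixel and start is not None:
                if x - start >= length:
                    kept[start:x] = [1] * (x - start)
                start = None
        result.append(kept)
    return result


def transpose(binary):
    """Swap rows and columns."""
    return [list(column) for column in zip(*binary)]


def table_lines(binary, length):
    """Return (horizontal line image, vertical line image)."""
    horizontal = open_rows(binary, length)
    vertical = transpose(open_rows(transpose(binary), length))
    return horizontal, vertical


def pixelwise(op, a, b):
    """Combine two equally sized images pixel by pixel with op."""
    return [[op(p, q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]
```

Short strokes such as characters and noise are shorter than the structuring element and therefore drop out of both line images.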
It can be understood that the area in the middle of the upper portion of the document picture to be recognized contains its title. The major class of the document picture to be identified, such as the financial class or the goods class, is determined according to the title. The subclass, such as a payment request form or a reimbursement approval form, can be further determined according to the border: both forms belong to the financial class but have different borders, and therefore correspond to different document types.
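The two-level decision, title for the major class and border for the subclass, can be sketched as a pair of lookups. Everything concrete below is hypothetical: the keywords, the summary of a border as a `(rows, columns)` grid shape, and the type names are illustrative assumptions, since the source does not say how borders are compared.

```python
# Hypothetical title keywords mapped to major classes.
TITLE_KEYWORDS = {
    "payment request": "financial",
    "reimbursement": "financial",
    "goods": "goods",
}

# Hypothetical border signatures: (rows, columns) of the extracted table grid.
DOCUMENT_TYPES = {
    ("financial", (6, 4)): "payment_request_form",
    ("financial", (8, 5)): "reimbursement_approval_form",
}


def classify_document(title, grid_shape):
    """Pick the major class from the title, then the subclass from the border."""
    lowered = title.lower()
    major = next((cls for kw, cls in TITLE_KEYWORDS.items() if kw in lowered), None)
    return DOCUMENT_TYPES.get((major, grid_shape))
```

The returned document type would then serve as the key for fetching the matching identification template from the template database.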
Optionally, the OCR recognition unit 35 is further configured to:
performing OCR recognition on the document picture to be recognized by using a convolutional recurrent neural network (CRNN) model.
Specifically, the convolutional recurrent neural network model comprises a convolutional neural network (CNN), a bidirectional long short-term memory (LSTM) recurrent neural network, and a connectionist temporal classification (CTC) model;
performing OCR recognition on the document picture to be recognized by using the convolutional recurrent neural network model comprises the following steps:
the CNN extracts features of the identification area of the document picture to be identified and generates a feature sequence of the identification area;
the bidirectional LSTM determines a label distribution list corresponding to each feature in the feature sequence;
and the CTC model determines the characters of the identification area according to the label distribution list corresponding to each feature.
Specifically, the label distribution list corresponding to a feature is a softmax vector that gives the probability of each label for that feature. After the probabilities of all the features are passed to the CTC model, the most probable labels are output, and the final label sequence, namely the text of the identification area, is obtained through operations such as removing blanks.
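The decoding step described above corresponds to what is usually called greedy (best-path) CTC decoding: take the most probable label at each step, collapse consecutive repeats, then drop the blank label. The sketch below assumes that reading; the function name, the use of index 0 for the blank, and the toy alphabet are assumptions, and the source does not state which decoding variant is used.

```python
def ctc_greedy_decode(label_distributions, alphabet, blank=0):
    """Decode a sequence of per-feature probability vectors into text.

    label_distributions: list of softmax vectors, one per feature.
    alphabet: sequence where alphabet[i] is the character for label i;
              the entry at `blank` is never emitted.
    """
    # Most probable label for each feature.
    best = [max(range(len(dist)), key=dist.__getitem__) for dist in label_distributions]
    decoded = []
    previous = None
    for label in best:
        # Collapse repeats first, then skip blanks.
        if label != previous and label != blank:
            decoded.append(alphabet[label])
        previous = label
    return "".join(decoded)
```

The blank between two identical labels is what allows a doubled character to survive the collapse of repeats.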
It can be understood that handwritten characters are less regular than printed characters, so OCR recognition performs worse on handwritten characters.
OCR recognition requires a high-quality input image, so the user usually needs to supply one to obtain good recognition results: the resolution must not be too low, the colors must not be too varied, the contrast must not be too low, and the text in the image must not be skewed.
Optionally, the apparatus further comprises:
the picture adjusting unit is used for adjusting the brightness and the contrast of the document picture to be identified;
the gray processing unit is used for carrying out gray processing on the document picture to be identified;
and the angle adjusting unit is used for receiving an angle adjusting instruction of a user on the document picture to be identified after the gray level processing and adjusting the angle of the document picture to be identified.
The picture adjusting unit automatically adjusts the brightness and contrast of the image using the open-source Tesseract-OCR framework. The gray processing unit uses OpenCV, an open-source cross-platform computer vision library, to perform gray processing on the image, turning it black and white so that a sharp contrast is formed. The angle adjusting unit displays an image interaction interface that lets the user correct the image manually: the user can adjust the angle of the image through a built-in function provided by the application program, so that the characters in the image are no longer skewed.
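The text attributes these steps to Tesseract-OCR and OpenCV; the sketch below calls neither library and only shows, in plain Python, the kind of arithmetic involved. It assumes an RGB image stored as rows of `(r, g, b)` tuples, the ITU-R BT.601 luma weights for gray conversion, and a simple min-max stretch as a stand-in for contrast adjustment. The function names are hypothetical.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to 8-bit gray levels (BT.601 weights)."""
    return [
        [(299 * r + 587 * g + 114 * b + 500) // 1000 for (r, g, b) in row]
        for row in rgb_image
    ]


def stretch_contrast(gray):
    """Linearly rescale gray levels so the darkest is 0 and the brightest 255."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [list(row) for row in gray]
    return [[(px - lo) * 255 // (hi - lo) for px in row] for row in gray]
```

Integer arithmetic is used so that results are identical across platforms.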
In an alternative embodiment of the present application, a first sample document picture is configured with a plurality of area identification templates in the identification template database, and each area identification template is used for identifying a partial area of the sample document picture.
It should be noted that the first sample document picture is any sample document picture in the identification template database. By configuring a plurality of area identification templates for a sample document picture, only partial areas of the document picture to be identified need to be recognized, which improves the efficiency of OCR recognition.
In practical application, taking fig. 2 as an example, the sample document is a payment request form configured with an area identification template A and an area identification template B. The identification area range of template A covers the requested amount, the requesting department and the contract number; the identification area range of template B covers the requested amount, the requesting department, the applicant and the department head. Different area identification templates can be set according to actual business requirements.
Specifically, the recognition template acquiring unit 32 is further configured to:
performing overall automatic frame selection on each sample document picture to obtain the identification areas of each sample document picture;
and adjusting the automatically frame-selected identification areas: merging a plurality of identification areas that were erroneously frame-selected separately into one identification area, and splitting one erroneously frame-selected identification area into a plurality of identification areas.
By establishing templates matched with the documents to be recognized, the OCR recognition device can adapt to documents with various layout formats and improves the recognition rate and accuracy of OCR recognition. Collecting the pictures with the high-definition camera can eliminate the influence of shooting light and angle on OCR recognition. The OCR recognition device can also recognize only a specific region of a document picture based on the area identification template, which improves recognition efficiency.
It should be noted that, for other corresponding descriptions of each functional unit related to the OCR recognition device based on the template matching provided in the embodiment of the present application, reference may be made to corresponding descriptions in fig. 1 and fig. 2.
Based on the method shown in fig. 1, correspondingly, the embodiment of the application also provides a storage medium, on which a computer program is stored, and when the program is executed by a processor, the OCR recognition method based on template matching shown in fig. 1 is implemented.
Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.), and includes several instructions for causing a computer device (may be a personal computer, a server, or a network device, etc.) to perform the methods described in various implementation scenarios of the present application.
Based on the method shown in fig. 1 and the virtual device embodiment shown in fig. 3, in order to achieve the above objective, the embodiment of the present application further provides a computer device, which may specifically be a personal computer, a server, a network device, etc. The physical device includes a storage medium and a processor: the storage medium stores a computer program, and the processor executes the computer program to implement the template matching-based OCR recognition method shown in fig. 1.
Optionally, the computer device may also include a user interface, a network interface, a camera, radio frequency (RF) circuitry, sensors, audio circuitry, a Wi-Fi module, and the like. The user interface may include a display screen and an input unit such as a keyboard, and optionally may also include a USB interface, a card reader interface, etc. The network interface may optionally include a standard wired interface, a wireless interface (e.g., a Bluetooth interface or a Wi-Fi interface), etc.
It will be appreciated by those skilled in the art that the computer device structure provided in this embodiment does not limit the physical device, which may include more or fewer components, combine certain components, or use a different arrangement of components.
The storage medium may also include an operating system, a network communication module. An operating system is a program that manages the hardware and software resources of a computer device, supporting the execution of information handling programs, as well as other software and/or programs. The network communication module is used for realizing communication among all components in the storage medium and communication with other hardware and software in the entity equipment.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by means of software plus necessary general hardware platforms, or may be implemented by hardware. By applying the technical scheme of the application.
It should be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the description of the present application, numerous specific details are set forth. It will be appreciated, however, that embodiments of the present application may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description. Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the application and aiding the understanding of one or more of the various inventive aspects. However, this manner of disclosure should not be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application. Those skilled in the art will also appreciate that the modules of an apparatus in an implementation scenario may be distributed in that apparatus as described for the implementation scenario, or may be located, with corresponding changes, in one or more apparatuses different from that of the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The above serial numbers of the application are for description only and do not represent the merits of the implementation scenarios. The foregoing discloses only a few specific implementation scenarios of the present application; the present application is not limited thereto, and any variation that a person skilled in the art can conceive of shall fall within the protection scope of the present application.

Claims (7)

1. An OCR recognition method based on template matching, comprising:
collecting sample document pictures of different appointed typesetting modes;
performing frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture;
establishing an identification template database, wherein the identification template database stores identification templates corresponding to the document pictures of the samples;
collecting a document picture to be identified, and identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified;
invoking a corresponding recognition template in the recognition template database according to the document type of the document picture to be recognized obtained by recognition to perform OCR recognition on the document picture to be recognized;
the identifying the border and the title of the document picture to be identified to obtain the document type of the document picture to be identified includes:
performing binarization processing on the document picture to be identified to obtain a binarization table image;
performing tilt correction on the binarized table image based on a tilt correction algorithm of perspective change;
extracting the frame of the document picture to be identified by adopting an image morphology processing method based on the binarized table image after inclination correction;
OCR is carried out on a preset area of the binarized table image after inclination correction, so that a title of the document picture to be identified is obtained;
obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified;
the performing OCR recognition on the document picture to be recognized comprises the following steps:
performing OCR recognition on the document picture to be recognized by using a convolutional recurrent neural network model;
the frame selection of each sample document picture comprises the following steps:
carrying out integral automatic frame selection on each sample document picture to obtain an identification area of each sample document picture;
and adjusting the identification area selected by the automatic frame, adjusting a plurality of identification areas selected by the error automatic frame into one identification area, and splitting one identification area selected by the error automatic frame into a plurality of identification areas.
2. The method of claim 1, wherein the convolutional recurrent neural network model comprises a convolutional neural network (CNN), a bidirectional long short-term memory (LSTM) recurrent neural network, and a connectionist temporal classification (CTC) model;
the performing OCR recognition on the document picture to be recognized by using the convolutional recurrent neural network model comprises the following steps:
the CNN extracts features of the identification area of the document picture to be identified and generates a feature sequence of the identification area;
the bidirectional LSTM determines a label distribution list corresponding to each feature in the feature sequence;
and the CTC model determines the characters of the identification area according to the label distribution list corresponding to each feature.
3. The method of claim 1, wherein prior to invoking the corresponding recognition template in the recognition template database to OCR recognize the document picture to be recognized, the method further comprises:
adjusting the brightness and contrast of the document picture to be identified;
carrying out gray processing on the document picture to be identified;
and receiving an angle adjustment instruction of a user on the document picture to be identified after gray processing, and adjusting the angle of the document picture to be identified.
4. The method of claim 1, wherein a first sample document picture is configured with a plurality of region identification templates in the identification template database, each region identification template for identifying a partial region of the sample document picture.
5. An OCR recognition device based on template matching, comprising:
the sample document picture collecting unit is used for collecting sample document pictures of different appointed typesetting modes;
the identification template acquisition unit is used for carrying out frame selection on each sample document picture to obtain an identification template corresponding to each sample document picture;
the identification template database establishing unit is used for establishing an identification template database, and the identification template database stores identification templates corresponding to the document pictures of the samples;
the document type acquisition unit is used for acquiring a document picture to be identified, identifying the border and the title of the document picture to be identified and obtaining the document type of the document picture to be identified;
the OCR recognition unit is used for calling a corresponding recognition template in the recognition template database according to the document type of the document picture to be recognized obtained by recognition to perform OCR recognition on the document picture to be recognized;
the document type acquisition unit is configured to:
performing binarization processing on the document picture to be identified to obtain a binarization table image;
performing tilt correction on the binarized table image based on a tilt correction algorithm of perspective change;
extracting the frame of the document picture to be identified by adopting an image morphology processing method based on the binarized table image after inclination correction;
OCR is carried out on a preset area of the binarized table image after inclination correction, so that a title of the document picture to be identified is obtained;
obtaining the document type of the document picture to be identified according to the border and the title of the document picture to be identified;
the OCR recognition unit is used for:
performing OCR recognition on the document picture to be recognized by using a convolutional recurrent neural network model;
the identification template acquisition unit is used for:
carrying out integral automatic frame selection on each sample document picture to obtain an identification area of each sample document picture;
and adjusting the identification area selected by the automatic frame, adjusting a plurality of identification areas selected by the error automatic frame into one identification area, and splitting one identification area selected by the error automatic frame into a plurality of identification areas.
6. A storage medium having stored thereon a computer program, which when executed by a processor implements the template matching based OCR recognition method of any one of claims 1 to 4.
7. A computer device comprising a storage medium, a processor and a computer program stored on the storage medium and executable on the processor, characterized in that the processor implements the template matching based OCR recognition method of any one of claims 1 to 4 when executing the computer program.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910127136.1A CN110008944B (en) 2019-02-20 2019-02-20 OCR recognition method and device based on template matching and storage medium

Publications (2)

Publication Number Publication Date
CN110008944A CN110008944A (en) 2019-07-12
CN110008944B true CN110008944B (en) 2024-02-13

Family

ID=67165937

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910127136.1A Active CN110008944B (en) 2019-02-20 2019-02-20 OCR recognition method and device based on template matching and storage medium

Country Status (1)

Country Link
CN (1) CN110008944B (en)

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443317A (en) * 2019-08-09 2019-11-12 上海尧眸电气科技有限公司 A kind of method, apparatus and electronic equipment of paper shelves electronic data processing
CN110909733A (en) * 2019-10-28 2020-03-24 世纪保众(北京)网络科技有限公司 Template positioning method and device based on OCR picture recognition and computer equipment
CN110866457A (en) * 2019-10-28 2020-03-06 世纪保众(北京)网络科技有限公司 Electronic insurance policy obtaining method and device, computer equipment and storage medium
CN110931097A (en) * 2019-11-04 2020-03-27 武汉市纽艾云健康科技有限公司 Processing and analyzing system for inspection report
CN111046736B (en) * 2019-11-14 2021-04-16 北京房江湖科技有限公司 Method, device and storage medium for extracting text information
CN110866388A (en) * 2019-11-19 2020-03-06 重庆华龙网海数科技有限公司 Publishing PDF layout analysis and identification method based on mixing of multiple neural networks
CN111126380A (en) * 2019-12-02 2020-05-08 贵州电网有限责任公司 Method and system for identifying signature of nameplate of power equipment
CN111047261B (en) * 2019-12-11 2023-06-16 青岛盈智科技有限公司 Warehouse logistics order identification method and system
CN111178365A (en) * 2019-12-31 2020-05-19 五八有限公司 Picture character recognition method and device, electronic equipment and storage medium
CN111291742B (en) 2020-02-10 2023-08-04 北京百度网讯科技有限公司 Object recognition method and device, electronic equipment and storage medium
CN111507324B (en) * 2020-03-16 2024-05-31 平安科技(深圳)有限公司 Card frame recognition method, device, equipment and computer storage medium
CN111476227B (en) * 2020-03-17 2024-04-05 平安科技(深圳)有限公司 Target field identification method and device based on OCR and storage medium
CN111428725A (en) * 2020-04-13 2020-07-17 北京令才科技有限公司 Data structuring processing method and device and electronic equipment
CN113537221A (en) * 2020-04-15 2021-10-22 阿里巴巴集团控股有限公司 Image recognition method, device and equipment
CN111680679A (en) * 2020-06-03 2020-09-18 重庆数道科技有限公司 Automatic document identification method based on OCR
CN111695566B (en) * 2020-06-18 2023-03-14 郑州大学 Method and system for identifying and processing fixed format document
CN111782808A (en) * 2020-06-29 2020-10-16 北京市商汤科技开发有限公司 Document processing method, device, equipment and computer readable storage medium
CN112149399B (en) * 2020-09-25 2024-06-04 北京来也网络科技有限公司 Table information extraction method, device, equipment and medium based on RPA and AI
CN112348022B (en) * 2020-10-28 2024-05-07 富邦华一银行有限公司 Free-form document identification method based on deep learning
CN112364790B (en) * 2020-11-16 2022-10-25 中国民航大学 Airport work order information identification method and system based on convolutional neural network
CN112418215A (en) * 2020-11-17 2021-02-26 峰米(北京)科技有限公司 Video classification identification method and device, storage medium and equipment
CN112508011A (en) * 2020-12-02 2021-03-16 上海逸舟信息科技有限公司 OCR (optical character recognition) method and device based on neural network
CN112580499A (en) * 2020-12-17 2021-03-30 上海眼控科技股份有限公司 Text recognition method, device, equipment and storage medium
CN112801084A (en) * 2021-01-29 2021-05-14 杭州大拿科技股份有限公司 Image processing method and device, electronic equipment and storage medium
CN112818961A (en) * 2021-03-26 2021-05-18 北京东方金朔信息技术有限公司 Image feature identification method and device
CN112906695B (en) * 2021-04-14 2022-03-08 数库(上海)科技有限公司 Form recognition method adapting to multi-class OCR recognition interface and related equipment
CN113111829B (en) * 2021-04-23 2023-04-07 杭州睿胜软件有限公司 Method and device for identifying document
CN114564912A (en) * 2021-11-30 2022-05-31 中国电子科技集团公司第十五研究所 Intelligent checking and correcting method and system for document format
CN115830620B (en) * 2023-02-14 2023-05-30 江苏联著实业股份有限公司 Archive text data processing method and system based on OCR
CN116137077B (en) * 2023-04-13 2023-08-08 宁波为昕科技有限公司 Method and device for establishing electronic component library, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103258198A (en) * 2013-04-26 2013-08-21 四川大学 Extraction method for characters in form document image
CN103810485A (en) * 2014-01-22 2014-05-21 深圳市东信时代信息技术有限公司 Recognition device, character recognition system and method
CN108549881A (en) * 2018-05-02 2018-09-18 杭州创匠信息科技有限公司 The recognition methods of certificate word and device
CN109086714A (en) * 2018-07-31 2018-12-25 国科赛思(北京)科技有限公司 Table recognition method, identifying system and computer installation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8014604B2 (en) * 2008-04-16 2011-09-06 International Business Machines Corporation OCR of books by word recognition
RU2651144C2 (en) * 2014-03-31 2018-04-18 Общество с ограниченной ответственностью "Аби Девелопмент" Data input from images of the documents with fixed structure

Also Published As

Publication number Publication date
CN110008944A (en) 2019-07-12

Similar Documents

Publication Publication Date Title
CN110008944B (en) OCR recognition method and device based on template matching and storage medium
CN109492643B (en) Certificate identification method and device based on OCR, computer equipment and storage medium
CN111652223B (en) Certificate identification method and device
US8908975B2 (en) Apparatus and method for automatically recognizing a QR code
CN111353497B (en) Identification method and device for identity card information
CN110766014A (en) Bill information positioning method, system and computer readable storage medium
CN110705534B (en) Wrong problem book generation method suitable for electronic typoscope
US8755595B1 (en) Automatic extraction of character ground truth data from images
US20060256388A1 (en) Semantic classification and enhancement processing of images for printing applications
CN104143094A (en) Test paper automatic test paper marking processing method and system without answer sheet
US11341739B2 (en) Image processing device, image processing method, and program recording medium
CN102360419A (en) Method and system for computer scanning reading management
CN112434699A (en) Automatic extraction and intelligent scoring system for handwritten Chinese characters or components and strokes
CN108304562B (en) Question searching method and device and intelligent terminal
CN110598566A (en) Image processing method, device, terminal and computer readable storage medium
CN110909740A (en) Information processing apparatus and storage medium
US11881043B2 (en) Image processing system, image processing method, and program
CN111915635A (en) Test question analysis information generation method and system supporting self-examination paper marking
CN112686257A (en) Storefront character recognition method and system based on OCR
US20150379339A1 (en) Techniques for detecting user-entered check marks
CN105678301B (en) Method, system and device for automatically identifying and segmenting text image
US20210209393A1 (en) Image processing system, image processing method, and program
CN116092231A (en) Ticket identification method, ticket identification device, terminal equipment and storage medium
CN113159029A (en) Method and system for accurately capturing local information in picture
CN111213157A (en) Express information input method and system based on intelligent terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant