CN114038004A

CN114038004A - Certificate information extraction method, device, equipment and storage medium

Info

Publication number: CN114038004A
Application number: CN202111375715.1A
Authority: CN
Inventors: 刘瑞; 李玉惠; 傅强; 蔡琳; 阿曼太; 梁彧; 马寒军; 田野; 王杰; 杨满智; 金红; 陈晓光
Original assignee: Beijing Hengan Jiaxin Safety Technology Co ltd
Current assignee: Beijing Hengan Jiaxin Safety Technology Co ltd
Priority date: 2021-11-19
Filing date: 2021-11-19
Publication date: 2022-02-11

Abstract

An embodiment of the invention discloses a certificate information extraction method, apparatus, device and storage medium. The method comprises: performing, on an input certificate image, a processing operation that corresponds to the shooting scene of that image, to obtain a processed certificate image; performing character recognition on the processed certificate image to obtain the characters in the certificate image; parsing key field information from the characters in the certificate image according to the certificate category, to obtain certificate key field information corresponding to the certificate category; and verifying and correcting the certificate key field information, and outputting accurate certificate information. With this technical scheme, certificate information can be extracted conveniently and quickly in multiple scenes while the recognition result remains reliable, which improves both the efficiency of certificate information extraction and the user experience.

Description

Certificate information extraction method, device, equipment and storage medium

Technical Field

The embodiment of the invention relates to an image recognition technology, in particular to a certificate information extraction method, a certificate information extraction device, certificate information extraction equipment and a storage medium.

Background

At present, many companies provide certificate information extraction services, but most vendors place strict restrictions on the original certificate image to be recognized. For example, the shooting angle of the certificate must be adjusted, the front and back sides of the certificate must be recognized separately, and different types of certificates must be recognized separately.

Because of these restrictions on the original certificate image, users cannot extract certificate information conveniently and quickly, and the user experience suffers. How to provide a certificate information extraction method that is convenient and fast and whose recognition result is reliable is therefore a problem that urgently needs to be solved.

Disclosure of Invention

The embodiment of the invention provides a certificate information extraction method, a certificate information extraction device, certificate information extraction equipment and a storage medium, which can realize convenient and fast certificate information extraction under multiple scenes and guarantee the recognition result.

In a first aspect, an embodiment of the present invention provides a certificate information extraction method, including:

according to the shooting scene of the input certificate image, corresponding processing operation is carried out on the input certificate image to obtain a processed certificate image;

performing character recognition on the processed certificate image to obtain characters in the certificate image;

analyzing key field information of characters in the certificate image according to the certificate category to obtain certificate key field information corresponding to the certificate category;

and carrying out information verification and correction on the key field information of the certificate, and outputting accurate certificate information.

In a second aspect, an embodiment of the present invention further provides a certificate information extraction device, where the device includes:

the certificate area preprocessing module is used for executing corresponding processing operation on the input certificate image according to the shooting scene of the input certificate image to obtain a processed certificate image;

the text detection and identification module is used for carrying out character identification on the processed certificate image to obtain characters in the certificate image;

the key field information extraction module is used for analyzing the key field information of the characters in the certificate image according to the certificate category to obtain the certificate key field information corresponding to the certificate category;

and the information checking and correcting module is used for checking and correcting the information of the key field information of the certificate and outputting accurate certificate information.

In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:

one or more processors;

a storage device for storing one or more programs,

when the one or more programs are executed by the one or more processors, the one or more processors implement the certificate information extraction method according to any embodiment of the invention.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the certificate information extraction method according to any embodiment of the present invention.

According to the technical scheme of the embodiment of the invention, the corresponding processing is carried out on the input certificate image according to the shooting scene of the input certificate image, then the character recognition is carried out on the processed certificate image, the key field information analysis is carried out on the characters in the certificate image according to the certificate category to obtain the certificate key field information corresponding to the certificate category, and finally, the information verification and correction are carried out on the obtained certificate key field information to output accurate certificate information.

Drawings

FIG. 1 is a flow chart of a certificate information extraction method provided by an embodiment of the invention;

FIG. 2a is a flowchart of a certificate information extraction method according to an embodiment of the present invention;

FIG. 2b is a flowchart illustrating a method for text line detection and tilt correction according to an embodiment of the present invention;

FIG. 2c is a schematic flowchart of a certificate correction, interpolation fixed dimension and certificate image masking method according to an embodiment of the present invention;

FIG. 2d is a schematic flowchart of an instance segmentation algorithm provided by an embodiment of the present invention;

FIG. 2e is a schematic flowchart of a key field information parsing method according to an embodiment of the present invention;

FIG. 3 is a schematic flowchart of a certificate information extraction method in a designated scene according to an embodiment of the present invention;

FIG. 4 is a schematic flowchart of a certificate information extraction method in a natural scene according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a certificate information extraction device provided by an embodiment of the invention;

FIG. 6 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Fig. 1 is a flowchart of a certificate information extraction method according to an embodiment of the present invention, where the embodiment is applicable to a case where certificate information is extracted in multiple scenarios, and the method may be executed by a certificate information extraction apparatus, which may be implemented in a hardware and/or software manner and may be generally integrated in a computer device.

As shown in FIG. 1, a certificate information extraction method provided by an embodiment of the present invention includes the following specific steps:

S110: Performing a corresponding processing operation on the input certificate image according to the shooting scene of the input certificate image, to obtain a processed certificate image.

The shooting scene of the input certificate image may be either a designated scene, in which the image is shot with a designated application (App), for example when an identity card is uploaded for authentication in a banking App, or a natural scene. Notably, in an image shot with a designated App the certificate occupies the main part of the shot image and the background is clean and tidy, whereas in an image shot in a natural scene the certificate is not the main part of the shot image and the background is cluttered. A processing operation corresponding to the shooting scene therefore needs to be performed on the input certificate image; illustratively, the background of a certificate image shot in a natural scene is removed, so that subsequent operations can extract the certificate information from the processed certificate image.

In an optional implementation of the embodiment of the present invention, performing the corresponding processing operation on the input certificate image according to its shooting scene includes: if the shooting scene of the input certificate image is the designated scene, performing text line detection and tilt correction on the input certificate image; and if the shooting scene of the input certificate image is a natural scene, performing certificate segmentation and certificate image preprocessing on the input certificate image. The designated scene may be a scene in which the input certificate image is shot with a designated App. Certificate segmentation may refer to the operation of segmenting the certificate image out of the input certificate image. Certificate image preprocessing may refer to operations such as tilt correction or resizing to a fixed size that are applied to the segmented certificate image so that the processed certificate image meets the requirements of subsequent operations.

Specifically, when the shooting scene is the designated scene, the certificate is guaranteed to occupy the main part of the input certificate image, so the image does not need to be segmented to obtain the certificate outline. The image may still be tilted to some degree when shot, however, so tilt correction is needed for the subsequent certificate information extraction to proceed smoothly. Correspondingly, when the shooting scene is a natural scene, the certificate is not the main part of the input certificate image and the background is cluttered, so certificate segmentation and certificate image preprocessing need to be performed on the input certificate image to ensure that certificate information extraction can proceed smoothly.

S120: Performing character recognition on the processed certificate image to obtain the characters in the certificate image.

Character recognition may refer to recognizing the characters in the processed certificate image with a suitable character recognition algorithm. Illustratively, a Convolutional Recurrent Neural Network (CRNN) from deep learning may be used. The CRNN algorithm addresses image-based sequence recognition problems, in particular scene text recognition. Its main characteristic is that it converts text recognition into a sequence learning problem with temporal dependencies, that is, image-based sequence recognition, without first cutting out individual characters. Because CRNN treats text recognition as sequence prediction, it uses a Recurrent Neural Network (RNN), which is suited to sequence prediction: image features are extracted by a CNN, the sequence is then predicted by the RNN, and the final result is obtained through a Connectionist Temporal Classification (CTC) transcription layer.
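
As an illustration of what the CTC step does with the per-frame predictions of the recurrent layers, the following is a minimal sketch of greedy CTC decoding. It is not taken from the embodiment: the function name `ctc_greedy_decode`, the choice of index 0 as the blank label, and the convention that `alphabet[0]` is a placeholder for the blank are all assumptions made for the example.

```python
def ctc_greedy_decode(frame_labels, alphabet, blank=0):
    """Collapse per-frame label indices into a text string.

    frame_labels: best label index for each time step (assumed input).
    alphabet: string indexed by label; alphabet[blank] is a placeholder.
    """
    chars = []
    prev = blank
    for label in frame_labels:
        # Emit a character only when it is not blank and differs from the
        # previous frame; a blank between two equal labels keeps both.
        if label != blank and label != prev:
            chars.append(alphabet[label])
        prev = label
    return "".join(chars)


print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0], "-AB"))
```

Repeated labels are merged and blanks are dropped, so the character recognizer never needs explicit per-character segmentation.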

S130: Parsing key field information from the characters in the certificate image according to the certificate category, to obtain the certificate key field information corresponding to the certificate category.

The certificate category may refer to the category of certificate to which the input certificate image belongs, such as an identity card, a driver's license, or a vehicle license (vehicle registration certificate). Key field information parsing may refer to breaking down and parsing the key field information in the certificate image according to the certificate category, to obtain the certificate key field information corresponding to that category. Optionally, the certificate key field information includes the key fields supported for identity cards, the key fields supported for driver's licenses, and the key fields supported for vehicle licenses. In the embodiment of the invention, the key fields supported for identity cards may include name, gender, ethnicity, date of birth, address, identity card number, issuing authority and validity period. The key fields supported for driver's licenses may include license number, name, gender, nationality, date of birth, date of first issue, validity period, license class, permitted vehicle class and file number. The key fields supported for vehicle licenses may include plate number, vehicle type, owner, use character, brand and model, Vehicle Identification Number (VIN), engine number, registration date and issue date.

S140: Verifying and correcting the certificate key field information, and outputting accurate certificate information.

Information verification may refer to verifying part of the parsed certificate key field information against the encoding rules of the certificate information. The objects of verification are numeric key fields, or key fields that can be derived from a corresponding encoding rule, such as an identity card number or a driver's license file number.

For example, if the certificate type is an identity card, the identity card number consists of a 17-digit body code and a 1-digit check code, and the encoding rule may be as follows. The first 6 digits are the address code, which represents the administrative division code of the place of household registration. Digits 7 to 14 are the date of birth in YYYYMMDD format; for a date of birth of November 1, 2021, digits 7 to 14 are 20211101. Digits 15 to 17 are the sequence code, which is the sequence number assigned to people born on the same date within the area identified by the same address code; odd sequence codes are assigned to males and even sequence codes to females, so an odd 17th digit indicates male and an even 17th digit indicates female. The 18th digit is the check code, computed with the ISO 7064:1983 MOD 11-2 check character system.

If the certificate type is a driver's license, the license number is the identity card number and follows the same encoding rule, and the file number encoding rule may be as follows. The file number is a 12-digit sequential code: the first 2 digits are the province code of the license, digits 3 to 4 are the area code of the license, and the last 8 digits are the serial number of the license, with number ranges allocated by the local vehicle administration offices. The province code and area code each fall within a known range, so they can be verified against those ranges.

If the certificate type is a vehicle license, the VIN encoding rule may be as follows. The VIN consists of 17 letters or digits; to avoid confusion with the digits 1 and 0, the letters 'I', 'O' and 'Q' are not used. Each character of the VIN corresponds to a numeric value, and each position has a weight. Starting from the first position, the value of each character is multiplied by the weight of its position, the products for all 17 positions are summed, the sum is divided by 11, and the remainder is the check value at the ninth position.
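
The VIN rule above does not list the character values or the position weights. The sketch below assumes the commonly published transliteration table and weight vector for the VIN check digit, and the convention that a remainder of 10 is written as 'X'; these tables and the function names `vin_check_char` and `vin_is_valid` are assumptions for illustration, not details given in the embodiment.

```python
# Assumed transliteration table: digits map to themselves, letters to 1-9.
VIN_VALUES = {str(d): d for d in range(10)}
VIN_VALUES.update({
    "A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8,
    "J": 1, "K": 2, "L": 3, "M": 4, "N": 5, "P": 7, "R": 9,
    "S": 2, "T": 3, "U": 4, "V": 5, "W": 6, "X": 7, "Y": 8, "Z": 9,
})  # 'I', 'O' and 'Q' are deliberately absent

# Assumed position weights; position 9 (index 8) holds the check value itself.
VIN_WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]


def vin_check_char(vin):
    """Return the expected ninth character for a 17-character VIN."""
    vin = vin.upper()
    if len(vin) != 17 or any(c not in VIN_VALUES for c in vin):
        raise ValueError("VIN must be 17 characters without I, O or Q")
    remainder = sum(VIN_VALUES[c] * w for c, w in zip(vin, VIN_WEIGHTS)) % 11
    return "X" if remainder == 10 else str(remainder)


def vin_is_valid(vin):
    """True when the ninth character matches the computed check value."""
    try:
        return vin_check_char(vin) == vin.upper()[8]
    except ValueError:
        return False


print(vin_is_valid("1M8GDM9AXKP042788"))
```

A failed check only signals that at least one character was misrecognized; it does not say which one, so a pipeline would typically combine it with per-character confidence.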

Correction may refer to correcting problematic certificate key field information according to the encoding rules of the certificate information. Illustratively, for the key field information of an identity card, the last digit of the identity card number can be calculated from the preceding digits according to the encoding rule, and the recognized last digit corrected accordingly. The gender can likewise be determined from the 17th digit of the identity card number: if the recognized content of the gender field is a character that merely resembles "male", the gender field can be corrected.
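
The identity card rule and the correction step can be sketched as follows. The check character is computed with ISO 7064 MOD 11-2, where the weight of position i (counted from the left, starting at 1) is 2^(18-i) mod 11 and the remainder selects a character from the string "10X98765432". The function names and the "male"/"female" labels are assumptions for the example, and the sketch assumes that the first 17 recognized digits are correct.

```python
CHECK_CHARS = "10X98765432"  # indexed by (weighted sum mod 11)


def id_check_char(body17):
    """Compute the 18th character from the 17-digit body code."""
    if len(body17) != 17 or not (body17.isascii() and body17.isdigit()):
        raise ValueError("body code must be 17 digits")
    # Weights 7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2
    total = sum(int(d) * pow(2, 17 - i, 11) for i, d in enumerate(body17))
    return CHECK_CHARS[total % 11]


def correct_id_fields(id_number):
    """Return (corrected ID number, gender derived from the 17th digit)."""
    body = id_number[:17]
    corrected = body + id_check_char(body)
    gender = "male" if int(body[16]) % 2 == 1 else "female"
    return corrected, gender


print(correct_id_fields("110105194912310021"))
```

Here a misread final character is replaced by the computed check character, and the gender field can be overwritten with the value implied by the sequence code.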

Therefore, the unique identification information of the key field information can be verified and modified according to the encoding rule, and the accuracy of the key field information output is improved.

FIG. 2a is a flowchart of a certificate information extraction method according to an embodiment of the present invention. In this embodiment, optionally, performing certificate segmentation and certificate image preprocessing on the input certificate image includes: if the segmented certificate image is tilted, deformed or inverted, performing certificate correction, interpolation to a fixed size, and certificate image masking on the segmented certificate image.

As shown in FIG. 2a, a certificate information extraction method provided by an embodiment of the present invention includes the following specific steps:

S210: If the shooting scene of the input certificate image is the designated scene, performing text line detection and tilt correction on the input certificate image.

Depending on the shooting device, the scene and the person shooting, the input certificate image to be recognized may be at any angle within 360 degrees. A tilted certificate image degrades text detection and also affects the subsequent extraction of certificate key field information, so tilt correction of the input certificate image is necessary. In an optional implementation, if the shooting scene of the input certificate image is the designated scene, performing text line detection and tilt correction on the input certificate image includes: performing text detection on the input certificate image with a deep learning algorithm to obtain the position coordinates of the text boxes in the input certificate image; traversing all text boxes in the input certificate image and taking the longest text box line segment as the target text box line segment; calculating the tilt angle of the target text box line segment and judging whether the tilt angle is smaller than a set threshold angle; if not, performing tilt correction on the input certificate image; if so, classifying the text direction in the input certificate image as 0 degrees or 180 degrees with a text direction classification model, and, when the text direction is 180 degrees, rotating the input certificate image clockwise by 180 degrees.

Text detection on the certificate image may be performed with the Differentiable Binarization (DB) text detection algorithm from deep learning, which yields a binary map of the certificate image. DB inserts the binarization operation into the segmentation network so that the two are optimized jointly, which makes the threshold adaptive at every position of the heat map. The set threshold angle may refer to a value set in advance for evaluating the tilt angle, and may be set to 10 degrees in the embodiment of the present invention. Tilt correction may refer to correcting the angle of the input certificate image according to the difference between the tilt angle and the set threshold angle.

Specifically, FIG. 2b is a schematic flowchart of a text line detection and tilt correction method according to an embodiment of the present invention. The steps in FIG. 2b are as follows.

Step 1: Text line detection is performed on the input certificate image with the DB text detection algorithm to obtain the position coordinates bbox of each text box.

Step 2: All text boxes in the currently input certificate image are traversed, and the longest text box line segment $line_{max}$ is obtained according to

$$line_{max} = \max(line_{11}, line_{12}, line_{21}, line_{22}, \ldots, line_{n1}, line_{n2})$$

where $line_{11}$ denotes the line segment determined by the upper-left and upper-right corner coordinates of the first text box. The corners of each text box are specified clockwise: the upper-left corner is $(x_1, y_1)$, the upper-right corner is $(x_2, y_2)$, the lower-right corner is $(x_3, y_3)$ and the lower-left corner is $(x_4, y_4)$. The length of a segment then follows from the distance formula for two points; for the segment between the upper-left and upper-right corners of the i-th text box,

$$line_{i1} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

and $line_{n1}$, $line_{n2}$ denote the lengths of the line segments of the n-th text box.

Step 3: After $line_{max}$ is obtained, the tilt angle, in degrees, is calculated with arctan2 from the coordinate differences $\Delta x$ and $\Delta y$ between the two endpoints of $line_{max}$:

$$angle = \operatorname{arctan2}(\Delta y, \Delta x) \cdot \frac{180}{\pi}$$

Step 4: Whether the current angle is larger than the set threshold angle angle_t (for example 10 degrees) is judged. If angle is smaller than or equal to angle_t, the tilt of the currently input certificate image meets the requirement, and the flow proceeds to step 7. If angle is larger than angle_t, the tilt of the currently input certificate image does not meet the requirement; the image is tilt-corrected according to the current angle and the flow returns to step 2.

Step 5: A certificate image corrected to within 10 degrees of 0 degrees or 180 degrees is obtained.

Step 6: A text direction classification model based on deep learning judges whether the text direction in the certificate image is 0 degrees or 180 degrees. If the text direction is 0 degrees, the corrected certificate image is output; if the text direction is 180 degrees, the input certificate image is rotated clockwise by 180 degrees.

In the embodiment of the present invention, arctan2 is the function that calculates the azimuth angle between two points, and can be expressed as:

$$\operatorname{arctan2}(y, x) = \begin{cases} \arctan\left(\frac{y}{x}\right) & x > 0 \\ \arctan\left(\frac{y}{x}\right) + \pi & x < 0,\ y \ge 0 \\ \arctan\left(\frac{y}{x}\right) - \pi & x < 0,\ y < 0 \\ +\frac{\pi}{2} & x = 0,\ y > 0 \\ -\frac{\pi}{2} & x = 0,\ y < 0 \\ \text{undefined} & x = 0,\ y = 0 \end{cases}$$

Tilt correction of the input certificate image is thus completed using the tilt angle of the target text box line segment. Compared with a method that first detects the minimum bounding rectangle of the certificate with a traditional image processing algorithm and then corrects the image according to the tilt angle of that rectangle, this method is more robust. Compared with a method that introduces an additional model to detect and correct the certificate outline, the method provided by the embodiment of the invention reuses the text line detection model that subsequent operations require anyway, and therefore completes tilt correction without adding extra algorithmic overhead.
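
The longest-segment search and the threshold test can be sketched as follows. The text only defines the first segment of each box (upper-left to upper-right corner); treating the second segment as the edge from the upper-right to the lower-right corner is an assumption, as are the function names and the folding of the angle toward the nearer of 0 and 180 degrees before it is compared with the threshold.

```python
import math


def longest_segment(bboxes):
    """Return (segment, length) for the longest measured edge of any box.

    Each bbox is [(x1, y1), (x2, y2), (x3, y3), (x4, y4)], clockwise
    from the upper-left corner.
    """
    best, best_len = None, -1.0
    for p1, p2, p3, _p4 in bboxes:
        # Assumed pair of edges per box: top edge and right edge.
        for a, b in ((p1, p2), (p2, p3)):
            length = math.dist(a, b)
            if length > best_len:
                best, best_len = (a, b), length
    return best, best_len


def tilt_angle_deg(segment):
    """Angle of the segment in degrees, via the two-argument arctangent."""
    (xa, ya), (xb, yb) = segment
    return math.degrees(math.atan2(yb - ya, xb - xa))


def needs_correction(angle_deg, angle_t=10.0):
    """True when the deviation from 0 or 180 degrees exceeds angle_t."""
    dev = abs(angle_deg) % 180.0
    return min(dev, 180.0 - dev) > angle_t


boxes = [
    [(0, 0), (100, 0), (100, 20), (0, 20)],
    [(0, 30), (50, 30), (50, 50), (0, 50)],
]
segment, length = longest_segment(boxes)
print(segment, length, tilt_angle_deg(segment))
```

Images that pass the threshold test may still be upside down, which is why the text direction classifier is applied afterwards.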

S220: If the shooting scene of the input certificate image is a natural scene, performing certificate segmentation on the input certificate image, and if the segmented certificate image is tilted, deformed or inverted, performing certificate correction, interpolation to a fixed size, and certificate image masking on the segmented certificate image.

Certificate segmentation may be implemented with an instance segmentation algorithm. Compared with an object detection algorithm, an instance segmentation algorithm directly yields the precise outline of the certificate image, which effectively improves the accuracy of subsequent information extraction.

FIG. 2c is a schematic flowchart of a certificate correction, fixed-size interpolation and certificate image masking method according to an embodiment of the present invention. In the embodiment of the invention, the tilt angle or deformation of the certificate image can be obtained from the 4 certificate corner coordinates output by certificate segmentation, and the certificate image is then corrected with an affine transformation algorithm. Interpolation to a fixed size shrinks or enlarges the certificate image to a fixed size; notably, the overall recognition rate of the certificate image is highest when its length is interpolated to 640. Taking the identity card as an example, the length-to-width ratio of a standard second-generation identity card is 1.58:1, so the identity card image needs to be interpolated to 640 x 410. For the driver's license and the vehicle license, the length-to-width ratio is 1.47:1, so these images need to be interpolated to 640 x 435. Certificate image masking masks the corrected certificate image according to the coordinate positions of the key fields in a blank template picture, which removes interference from the many regions that do not need to be recognized, improving efficiency as well as the accuracy of subsequent text detection and character recognition.
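
The masking step can be illustrated with a minimal sketch that keeps only the pixels inside the key field regions of a template and overwrites everything else. The image is represented as a plain list of rows so that the example has no dependencies; the function name `mask_outside_fields`, the `(x0, y0, x1, y1)` box format with exclusive upper bounds, and the fill value of 255 are assumptions for the example, and the field boxes shown are hypothetical rather than real template coordinates.

```python
def mask_outside_fields(image, field_boxes, fill=255):
    """Return a copy of image with pixels outside all field boxes filled.

    image: list of rows, each a list of pixel values.
    field_boxes: iterable of (x0, y0, x1, y1), upper bounds exclusive.
    """
    height = len(image)
    width = len(image[0]) if image else 0
    keep = [[False] * width for _ in range(height)]
    for x0, y0, x1, y1 in field_boxes:
        # Clip each box to the image so out-of-range boxes are harmless.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                keep[y][x] = True
    return [
        [px if keep[y][x] else fill for x, px in enumerate(row)]
        for y, row in enumerate(image)
    ]


tiny = [[7] * 4 for _ in range(3)]
print(mask_outside_fields(tiny, [(1, 0, 3, 2)]))
```

In practice the boxes would come from the blank template of the detected certificate category, scaled to the fixed interpolation size.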

The output of the instance segmentation algorithm is a mask of the certificate area, whereas subsequent image preprocessing requires the coordinates of the certificate image. Post-processing logic therefore needs to be added to the mask output by the instance segmentation algorithm to meet the requirements of the image preprocessing operations. In an optional embodiment of the present invention, before certificate segmentation and certificate image preprocessing are performed on an input certificate image whose shooting scene is a natural scene, the method further includes: scaling the input certificate image and padding it with a border to obtain a certificate image that meets the requirements of the certificate segmentation model. After certificate segmentation and certificate image preprocessing are performed on an input certificate image whose shooting scene is a natural scene, the method further includes: removing falsely detected rectangles in the certificate image and obtaining the contour polygon coordinate set of the certificate image; clustering the contour polygon coordinate set with a clustering algorithm to obtain the corner point sets corresponding to the four corner points; performing rotation fitting on the corner point sets of the four corner points according to a set rule to obtain the four corner point coordinates; and mapping the four corner point coordinates to the corresponding four corner coordinates in the original certificate image, completing the contour detection of the certificate image.

Rotation fitting according to the set rule may mean rotating the corner point sets by a set angle and then fitting the coordinates in them. Illustratively, if the result of instance segmentation is tilted by 45 degrees, the set angle can be 15 degrees: the corner point sets are rotated and the coordinates in them are then fitted. This addresses the problem that corner points cluster poorly when the instance segmentation result of the certificate image is tilted.
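
The final coordinate mapping undoes the scaling and border padding applied before segmentation. The following is a minimal sketch under stated assumptions: the image is scaled independently in x and y to a fixed model size and then padded with a border of uniform width `pad` on every side. The function names and the concrete sizes are hypothetical.

```python
def scale_factors(orig_w, orig_h, fixed_w, fixed_h):
    """Scale factors used when resizing the original image to the fixed size."""
    return fixed_w / orig_w, fixed_h / orig_h


def to_model_coords(point, sx, sy, pad):
    """Original-image point -> point in the scaled, padded model input."""
    x, y = point
    return x * sx + pad, y * sy + pad


def to_original_coords(point, sx, sy, pad):
    """Corner found in the model input -> point in the original image."""
    x, y = point
    return (x - pad) / sx, (y - pad) / sy


# Hypothetical sizes: a 1280 x 960 photo scaled to 640 x 480, 16 px border.
sx, sy = scale_factors(1280, 960, 640, 480)
corner_in_model = (116, 66)
print(to_original_coords(corner_in_model, sx, sy, pad=16))
```

Mapping back to the original image lets the later correction step work on full-resolution pixels rather than on the downscaled model input.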

Specifically, FIG. 2d is a schematic flowchart of an instance segmentation algorithm provided in the embodiment of the present invention. The steps in FIG. 2d are as follows.

Step 1: The input certificate image is scaled to a fixed size to meet the model requirement of the instance segmentation algorithm.

Step 2: A border is padded around the image, which prevents inaccurate segmentation when the certificate lies at the edge of the input certificate image.

Step 3: The processed input certificate image is fed into the trained instance segmentation algorithm to segment the certificate image.

Step 4: The segmentation result is layered to separate the certificate image from the background; illustratively, pixels of the certificate image are set to 0 and pixels of the non-certificate area are set to 1.

Step 5: The instance area of the obtained certificate image is checked; illustratively, falsely detected rectangles can be filtered out by calculating the ratio of the contour area to the rectangle area and discarding those with a small ratio.

Step 6: The contour coordinates of the certificate image are extracted to form the contour polygon coordinate set poly of the certificate image.

Step 7: A clustering algorithm such as K-means clusters all coordinate points of poly into 4 classes, which correspond to the 4 corner points respectively.

Step 8: The corner point sets of the four corner points are rotated and levelled according to the set rule.

Step 9: Corner coordinate fitting is performed on each of the four corner point sets to obtain the four corner point coordinates.

Step 10: The four corner point coordinates are mapped to the corresponding four corner coordinates in the original certificate image, completing the contour detection of the certificate image.
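
Steps 6 to 9 can be illustrated with a small, dependency-free sketch that clusters contour points into four classes with K-means and then picks one corner per class. Several choices here are assumptions made for the example and are not taken from the embodiment: the cluster centres are initialised at the corners of the contour's bounding box, the rotation step is omitted, and the "fitting" is replaced by simply taking the point in each class that lies farthest from the contour centroid.

```python
import math


def kmeans_corners(poly, iterations=20):
    """Cluster contour points into 4 classes and return one corner per class."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    # Assumed initialisation: the four corners of the bounding box.
    centers = [(min(xs), min(ys)), (max(xs), min(ys)),
               (max(xs), max(ys)), (min(xs), max(ys))]
    clusters = [[] for _ in range(4)]
    for _ in range(iterations):
        clusters = [[] for _ in range(4)]
        for p in poly:
            nearest = min(range(4), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = [
            (sum(q[0] for q in c) / len(c), sum(q[1] for q in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    centroid = (sum(xs) / len(poly), sum(ys) / len(poly))
    # Stand-in for corner fitting: farthest point from the contour centroid.
    return [max(c, key=lambda q: math.dist(q, centroid)) for c in clusters if c]


# Contour of an axis-aligned 100 x 60 rectangle sampled every 10 pixels.
rect = ([(x, 0) for x in range(0, 101, 10)]
        + [(100, y) for y in range(10, 61, 10)]
        + [(x, 60) for x in range(90, -1, -10)]
        + [(0, y) for y in range(50, 0, -10)])
print(sorted(kmeans_corners(rect)))
```

For a strongly tilted contour the bounding-box initialisation separates the corners less cleanly, which is the situation the rotation step in the text is meant to handle.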

By adding post-processing logic to the mask output by the instance segmentation algorithm, the requirements of subsequent image preprocessing can thus be met, which makes the certificate information extraction result more accurate and reliable.

S230: Performing character recognition on the processed certificate image to obtain the characters in the certificate image.

S240: Parsing key field information from the characters in the certificate image according to the certificate category, to obtain the certificate key field information corresponding to the certificate category.

Fig. 2e is a schematic flow chart of a method for parsing key field information according to an embodiment of the present invention. In this embodiment of the present invention, optionally, the analyzing key field information of the text in the certificate image according to the certificate category includes: identifying each text box containing characters in the certificate image; storing each text box into a text box list; selecting one text box from the text box list as a target text box, taking the coordinate of the target text box as the reference coordinate of the current line, and traversing the remaining text boxes in the text box list; when a text box which intersects the Y-axis coordinate of the target text box exists, adding the text box to the current text line according to the order of the X-axis coordinate; when no text box which intersects the Y-axis coordinate of the target text box exists in the text box list, returning to the operation of selecting one text box from the text box list as the target text box, until there is no text box left in the text box list; traversing all text lines, sorting all the text lines according to the Y-axis coordinate, and outputting a text line list corresponding to the sorted text lines; and analyzing the text content of the text lines in the text line list based on the key field and the position information corresponding to the certificate category to obtain the key field information of the certificate. Storing each text box in the text box list may mean storing the text box position coordinates bbox output by the DB text detection algorithm in the text box list list_all in array form, where, for example, list_all = [bbox₁, bbox₂, ..., bbox_n], n denotes that the current certificate image has n text boxes, and each text box bbox may take the following form: bbox_i = [(x₁, y₁), (x₂, y₂), (x₃, y₃), (x₄, y₄)], 1 ≤ i ≤ n.
A text box that intersects the Y-axis coordinate of the target text box may refer to a text box that is in the same line of text as the target text box; adding the text boxes to the current text line according to the order of the X-axis coordinate can mean adding the text boxes to the current text line in order of X-axis coordinate from small to large; sorting all text lines by Y-axis coordinate may refer to sorting all text lines in order of decreasing Y-axis coordinate. The key fields corresponding to the certificate categories can refer to the key fields in different certificate categories, such as name, age or nationality in the identity card. Analyzing the text content of the text lines in the text line list based on the key field and the position information corresponding to the certificate category may mean that the position of the certificate key field information corresponding to the certificate category is determined from the known key fields and their position information in the certificate image, so as to obtain the certificate key field information.
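The line-grouping procedure described above can be sketched as follows. This is an illustrative reading, not the embodiment's code: any overlap of the vertical extents is treated as "intersecting the Y-axis coordinate", the helper names are hypothetical, and the `descending_y` flag is an added assumption so that either sort direction can be chosen (with an image origin at the top-left, ascending Y yields top-to-bottom reading order).

```python
def y_range(bbox):
    """Vertical extent (min y, max y) of a four-point text box."""
    ys = [p[1] for p in bbox]
    return min(ys), max(ys)

def x_min(bbox):
    """Left-most x coordinate of a text box."""
    return min(p[0] for p in bbox)

def y_overlaps(a, b):
    """True when the vertical extents of two boxes intersect."""
    a_top, a_bottom = y_range(a)
    b_top, b_bottom = y_range(b)
    return a_top <= b_bottom and b_top <= a_bottom

def group_text_lines(list_all, descending_y=False):
    """Group text boxes into text lines and sort the lines by Y coordinate."""
    remaining = list(list_all)
    lines = []
    while remaining:
        target = remaining.pop(0)      # reference box of the current line
        line, rest = [target], []
        for box in remaining:
            (line if y_overlaps(target, box) else rest).append(box)
        remaining = rest
        line.sort(key=x_min)           # X coordinate from small to large
        lines.append(line)
    lines.sort(key=lambda ln: y_range(ln[0])[0], reverse=descending_y)
    return lines
```

In practice a minimum overlap ratio is often required instead of any overlap, so that slightly tilted neighbouring lines are not merged; that refinement is omitted here.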

Therefore, by reordering the text lines in the certificate image, the position information of the key fields becomes more accurate, which supports the analysis of the text content of the text lines in the text line list based on the key field and the position information corresponding to the certificate category, and improves the accuracy of the obtained certificate key field information.

S250: and carrying out information verification and correction on the key field information of the certificate, and outputting accurate certificate information.
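The embodiment does not name the coding rules used for verification. As one plausible example for the identity card, the sketch below checks the check character of an 18-digit resident identity number (weights and check characters as defined in GB 11643-1999) and attempts a simple correction. The OCR confusion table and the function names are hypothetical and chosen purely for illustration.

```python
# Position weights and check characters of the 18-digit resident identity number.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"

# Hypothetical table of letters that OCR commonly confuses with digits.
OCR_CONFUSIONS = {"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"}

def expected_check_char(first17):
    """Check character implied by the first 17 digits."""
    total = sum(int(d) * w for d, w in zip(first17, WEIGHTS))
    return CHECK_CHARS[total % 11]

def verify_id_number(number):
    """Information verification: does the number satisfy the coding rule?"""
    number = number.strip().upper()
    if len(number) != 18 or not number[:17].isdigit():
        return False
    return number[17] == expected_check_char(number[:17])

def correct_id_number(number):
    """Information correction: replace confusable letters, then re-verify.

    Returns the corrected number, or None when it still fails verification.
    """
    number = number.strip()
    head = "".join(OCR_CONFUSIONS.get(c, c) for c in number[:17])
    candidate = head + number[17:].upper()
    return candidate if verify_id_number(candidate) else None
```

Other fields, such as dates or license numbers, would need their own rules; only the verify-then-correct pattern is intended to carry over.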

In the technical scheme of the embodiment of the invention, when the shooting scene of the input certificate image is a designated scene, text line detection and inclination correction are performed on the input certificate image; when the shooting scene of the input certificate image is a natural scene, certificate segmentation is performed on the input certificate image, and when the segmented certificate image is inclined, deformed or inverted, certificate correction, interpolation to a fixed size and certificate image masking are performed on the segmented certificate image. Character recognition is then performed on the processed certificate image, key field information analysis is performed on the characters in the certificate image according to the certificate category to obtain the certificate key field information corresponding to the certificate category, and finally information verification and correction are performed on the obtained certificate key field information to output accurate certificate information. This solves the problem in the prior art that, because of the requirements placed on the original image of the certificate to be recognized, the user cannot extract certificate information conveniently and quickly. Certificate information can thus be extracted conveniently and quickly in multiple scenes with a guaranteed recognition result, which improves the working efficiency of certificate information extraction and the user experience.

On the basis of the above embodiment, the embodiment of the invention also includes the operations of training data collection, training data production, training data labeling, model building, model training, model testing and model output conversion. Because the three kinds of certificate images in the embodiment of the invention all belong to sensitive data and no related public data set exists at present, the required training data needs to be collected and produced in-house. Firstly, a small number of acquired identity card, driver's license and vehicle license pictures are stored on a local computer, and data in which the resolution of the certificate picture is greater than 355 × 288 and the quality meets the training requirements are selected. Then, originals of the three kinds of certificates are collected, and real data of the three kinds of certificates are captured under different conditions (different backgrounds, illumination, distances, angles and the like) by using different terminal photographing devices (different capture devices, cameras and the like). Next, all original data are labeled manually. The labels are divided into 6 classes, namely the portrait side and the national emblem side of the identity card, the main page and sub-page of the driver's license, and the main page and sub-page of the vehicle license; each class corresponds to one label, represented by idcard_0, idcard_1, dlcard_0, dlcard_1, vlcard_0 and vlcard_1 respectively. Specifically, labeling software can be used to perform polygon labeling on the outline of the 6 classes of certificate data, and a label file corresponding to the name of each labeled picture is generated after the manual labeling is completed.
However, because the amount of training data is currently insufficient and the labeled polygon samples of the various classes are unbalanced, the embodiment of the invention cuts out the manually labeled certificate data in a CutMix manner and obtains the specified amount of certificate data by randomly pasting the certificates into other background pictures that contain no certificate, which addresses the problems of insufficient training and unbalanced training sample classes. Finally, the prepared picture data and label files are input into the convolutional neural network, and the parameters of the whole convolutional neural network model are initialized at the same time. The convolutional neural network has a hierarchical structure formed by arranging and combining a series of convolutional layers, activation layers, pooling layers and normalization layers, which are finally connected to a fully connected layer and a loss layer. The loss layer calculates the difference between the predicted value and the true value, and in order to minimize this difference, the parameters of the whole convolutional neural network model are updated through the back propagation algorithm of the convolutional neural network. The forward propagation and back propagation of the convolutional neural network are repeated N times to obtain the optimal parameters, finally yielding the convolutional neural network algorithm model. With the trained convolutional neural network algorithm model, when the shooting scene of the input certificate image is a natural scene, the processed input certificate image can be input into the trained instance segmentation algorithm to complete the segmentation of the certificate image.
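The cut-and-paste augmentation described above can be sketched as follows, assuming the labeled certificate has already been cut out as a rectangular crop with a binary mask and a polygon expressed in crop coordinates. The function name `paste_certificate` and its parameters are hypothetical; scaling, rotation and blending, which a production pipeline might add, are left out.

```python
import numpy as np

def paste_certificate(background, crop, mask, polygon, rng):
    """Paste a cut-out certificate at a random position in a background image.

    background: H x W x C array without any certificate
    crop:       h x w x C array containing the certificate
    mask:       h x w array, non-zero where the crop belongs to the certificate
    polygon:    list of (x, y) label points in crop coordinates
    rng:        numpy random Generator, e.g. np.random.default_rng()
    Returns the synthetic image and the polygon shifted to image coordinates.
    """
    bh, bw = background.shape[:2]
    ch, cw = crop.shape[:2]
    if ch > bh or cw > bw:
        raise ValueError("crop must fit inside the background")
    top = int(rng.integers(0, bh - ch + 1))
    left = int(rng.integers(0, bw - cw + 1))
    out = background.copy()
    region = out[top:top + ch, left:left + cw]   # view into the output image
    keep = mask.astype(bool)
    region[keep] = crop[keep]                    # copy certificate pixels only
    shifted = [(x + left, y + top) for x, y in polygon]
    return out, shifted
```

Repeating the call with different backgrounds and with more samples of the under-represented classes is one way to reach a specified amount of data per class.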

Fig. 3 is a schematic flowchart of a certificate information extraction method in a designated scene according to an embodiment of the present invention. In fig. 3, the shooting scene of the input certificate image is a designated scene. First, text line detection and inclination correction are performed on the input certificate image; then, text line positioning (namely text line detection) and character recognition are performed on the processed certificate image to obtain the text boxes containing text; next, according to the certificate category, namely identity card, driver's license or vehicle license, key field information analysis and information verification are performed on the characters recognized in the certificate image. If the information verification finds that the key field information does not conform to the corresponding coding rule, the key field information is corrected according to that coding rule and the process returns to the key field information analysis step, until the result of the information verification is accurate; the certificate information is then extracted and the structured recognition result is output. Therefore, certificate information can be extracted conveniently and quickly in the designated scene with a guaranteed recognition result, the extraction of key field information is completed and the structured recognition result is output, which greatly reduces the time cost of manual entry, improves the working efficiency of certificate information extraction and improves the user experience.

Fig. 4 is a schematic flow chart of a certificate information extraction method in a natural scene according to an embodiment of the present invention. In fig. 4, the shooting scene of the input certificate image is a natural scene, and the input certificate image is first input into a certificate segmentation network to perform certificate segmentation. The certificate segmentation network can determine whether a certificate image exists and, if so, its category: for example, if the certificate segmentation network outputs the coordinates of the certificate image, a certificate is present, and if the certificate segmentation network outputs 0 or an empty value, no certificate image is present. When a certificate image exists, certificate image preprocessing, text positioning (text line detection) and character recognition are performed in sequence on the segmented certificate image, and key field information analysis and information verification are performed on the result of the character recognition. If the information verification finds that the key field information does not conform to the corresponding coding rule, the key field information is corrected according to that coding rule and the process returns to the key field information analysis step, until the result of the information verification is accurate; the certificate information is then extracted and the structured recognition result is output.
When no certificate image exists, the input certificate image may not belong to any predefined certificate category, or its quality may be too poor for the certificate segmentation network to detect it. If certificate information were extracted from such an image, the extracted information would have low accuracy and no practical value; therefore, an empty recognition result is output directly, that is, certificate images in this situation are not recognized for the time being. Therefore, certificate information can be extracted conveniently and quickly in a natural scene with a guaranteed recognition result, the extraction of key field information is completed and the structured recognition result is output, which greatly reduces the time cost of manual entry, improves the working efficiency of certificate information extraction and improves the user experience.
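The control flow of fig. 4 can be summarised in the following sketch. Every callable passed in (`segment`, `preprocess`, `recognize`, `parse`, `verify`, `correct`) is a hypothetical placeholder for the corresponding stage, and the `max_rounds` bound on the verify-and-correct loop is an added assumption so that the loop always terminates; the embodiment itself only states that the loop continues until verification succeeds.

```python
def extract_natural_scene(image, segment, preprocess, recognize,
                          parse, verify, correct, max_rounds=3):
    """Natural-scene flow: segment, preprocess, recognise, parse, verify/correct.

    segment(image) is assumed to return None (or another falsy value) when no
    certificate is found, otherwise a (category, corners) pair.
    """
    found = segment(image)
    if not found:
        return {}                      # empty recognition result
    category, corners = found
    text_lines = recognize(preprocess(image, corners))
    fields = parse(text_lines, category)
    for _ in range(max_rounds):
        if verify(fields, category):
            return {"category": category, "fields": fields, "verified": True}
        fields = correct(fields, category)
    return {"category": category, "fields": fields,
            "verified": verify(fields, category)}
```

Returning the fields together with a `verified` flag, rather than discarding them when the rounds are exhausted, is a design choice of this sketch and lets a caller decide whether to fall back to manual review.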

Fig. 5 is a schematic structural diagram of a certificate information extraction device according to an embodiment of the present invention, which can execute the certificate information extraction methods described in the above embodiments. The device can be implemented in software and/or hardware, and as shown in fig. 5, the certificate information extraction device specifically includes: certificate area preprocessing module 510, text detection and recognition module 520, key field information extraction module 530, and information verification and correction module 540.

The certificate area preprocessing module 510 is configured to perform corresponding processing operations on an input certificate image according to a shooting scene of the input certificate image, so as to obtain a processed certificate image;

the text detection and identification module 520 is used for performing character identification on the processed certificate image to obtain characters in the certificate image;

the key field information extraction module 530 is configured to perform key field information analysis on the characters in the certificate image according to the certificate category to obtain certificate key field information corresponding to the certificate category;

and the information checking and correcting module 540 is configured to check and correct the information of the certificate key field information and output accurate certificate information.

Optionally, the credential area preprocessing module 510 may include a first credential area preprocessing unit and a second credential area preprocessing unit;

the first certificate area preprocessing unit is used for performing text line detection and inclination correction operation on the input certificate image if the shooting scene of the input certificate image is a designated scene;

and the second certificate area preprocessing unit is used for carrying out certificate segmentation and certificate image preprocessing operation on the input certificate image if the shooting scene of the input certificate image is a natural scene.

Optionally, the key field information extraction module 530 may be specifically configured to identify each text box containing characters in the certificate image; storing each text box into a text box list; selecting one text box from the text box list as a target text box, taking the coordinate of the target text box as the reference coordinate of the current line, and traversing the rest text boxes in the text box list; when a text box which is crossed with the Y-axis coordinate of the target text box exists, adding the text box to the current text line according to the sequence of the X-axis coordinate; when no text box which is crossed with the Y-axis coordinate of the target text box exists in the text box list, returning to select one text box from the text box list as the operation of the target text box until no text box exists in the text box list; traversing all text lines, sequencing all the text lines according to the Y-axis coordinate, and outputting a text line list corresponding to the sequenced text lines; and analyzing the text content of the text line in the text line list based on the key field and the position information corresponding to the certificate category to obtain the key field information of the certificate.

Optionally, the second certificate area preprocessing unit may be specifically configured to, if the segmented certificate image is inclined, deformed, or inverted, perform certificate correction, interpolation to a fixed size, and certificate image masking on the segmented certificate image.

Optionally, the certificate information extraction device further includes a preprocessing module, configured to perform size scaling and frame filling on the input certificate image before certificate segmentation and certificate image preprocessing operations are performed on the input certificate image if a shooting scene of the input certificate image is a natural scene, so as to obtain a certificate image meeting requirements of a certificate segmentation model;

correspondingly, the certificate information extraction device also comprises a post-processing module, wherein the post-processing module is used for removing the false-detected rectangle in the certificate image and acquiring the outline polygon coordinate set of the certificate image after certificate segmentation and certificate image preprocessing operations are carried out on the input certificate image if the shooting scene of the input certificate image is a natural scene; clustering the outline polygon coordinate set of the certificate image by using a clustering algorithm to obtain an angular point set corresponding to four angular points; performing rotation fitting on the corner point set of the four corner points according to a set rule to obtain coordinates of the four corner points; and performing coordinate mapping on the four corner point coordinates, acquiring four corner coordinates corresponding to the four corner point coordinates in the certificate original image, and completing the contour detection of the certificate image.

Optionally, the first certificate area preprocessing unit may be specifically configured to perform text detection on the input certificate image by using a deep learning algorithm, and obtain a position coordinate of a text box in the input certificate image; traversing all text boxes in the input certificate image, and acquiring a text box line segment with the longest length as a target text box line segment; calculating the inclination angle of the target text box line segment, and judging whether the inclination angle is smaller than a set threshold angle; if not, performing inclination correction on the input certificate image; if yes, classifying the direction of the text in the input certificate image into 0 degree or 180 degrees through a text direction classification model, and when the direction of the text in the input certificate image is 180 degrees, rotating the input certificate image by 180 degrees in a clockwise direction.
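The decision logic of the first certificate area preprocessing unit can be sketched as below. The threshold value, the folding of the angle into the range (-90, 90] and the returned action labels are assumptions for illustration; the text detection model, the actual rotation of the image and the text direction classification model are outside the sketch, and at least one text box is assumed to be present.

```python
import math

def longest_edge(bboxes):
    """Longest edge among all four-point text boxes, as a pair of points."""
    best, best_len = None, -1.0
    for box in bboxes:
        n = len(box)
        for i in range(n):
            p, q = box[i], box[(i + 1) % n]
            length = math.hypot(q[0] - p[0], q[1] - p[1])
            if length > best_len:
                best, best_len = (p, q), length
    return best

def tilt_angle_degrees(segment):
    """Inclination of a segment relative to the X axis, folded into (-90, 90]."""
    (x1, y1), (x2, y2) = segment
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle

def plan_correction(bboxes, threshold_deg=5.0):
    """Choose the next action from the tilt of the longest text box edge.

    Returns ("deskew", angle) when the tilt reaches the threshold, otherwise
    ("classify_direction", angle), after which a 0/180 degree text direction
    classifier would decide whether to rotate the image by 180 degrees.
    """
    angle = tilt_angle_degrees(longest_edge(bboxes))
    if abs(angle) >= threshold_deg:
        return ("deskew", angle)
    return ("classify_direction", angle)
```

Folding the angle makes the result independent of the order in which the two end points of the edge are listed.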

Optionally, the certificate key field information includes the key field information supported for extraction from an identity card, the key field information supported for extraction from a driver's license, and the key field information supported for extraction from a vehicle license.

The certificate information extraction device provided by the embodiment of the invention can execute the certificate information extraction method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

Fig. 6 is a schematic structural diagram of a computer apparatus according to an embodiment of the present invention, as shown in fig. 6, the computer apparatus includes a processor 610, a memory 620, an input device 630, and an output device 640; the number of processors 610 in the computer device may be one or more, and one processor 610 is taken as an example in fig. 6; the processor 610, the memory 620, the input device 630 and the output device 640 in the computer apparatus may be connected by a bus or other means, and the connection by the bus is exemplified in fig. 6.

The memory 620, as a computer-readable storage medium, can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the certificate information extraction method in the embodiment of the present invention (e.g., the certificate region preprocessing module 510, the text detection and recognition module 520, the key field information extraction module 530, and the information verification and correction module 540 in the certificate information extraction apparatus). The processor 610 executes various functional applications and data processing of the computer device by executing software programs, instructions and modules stored in the memory 620, that is, implements the certificate information extraction method described above.

The memory 620 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 620 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 620 may further include memory located remotely from the processor 610, which may be connected to a computer device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input means 630 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the computer apparatus. The output device 640 may include a display device such as a display screen.

Embodiments of the present invention also provide a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a method of credential information extraction, the method comprising:

Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the certificate information extraction method provided by any embodiment of the present invention.

From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.

It should be noted that, in the embodiment of the credential information extraction device, the included units and modules are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A certificate information extraction method is characterized by comprising the following steps:

2. The method of claim 1, wherein performing corresponding processing operations on the input document image according to a shooting scene of the input document image comprises:

if the shooting scene of the input certificate image is the designated scene, performing text line detection and inclination correction operation on the input certificate image;

and if the shooting scene of the input certificate image is a natural scene, carrying out certificate segmentation and certificate image preprocessing on the input certificate image.

3. The method of claim 1, wherein the parsing key field information for the text in the document image according to the document category comprises:

identifying each text box containing characters in the certificate image;

storing each text box into a text box list;

selecting one text box from the text box list as a target text box, taking the coordinate of the target text box as the reference coordinate of the current line, and traversing the rest text boxes in the text box list;

when a text box which is crossed with the Y-axis coordinate of the target text box exists, adding the text box to the current text line according to the sequence of the X-axis coordinate;

when no text box which is crossed with the Y-axis coordinate of the target text box exists in the text box list, returning to select one text box from the text box list as the operation of the target text box until no text box exists in the text box list;

traversing all text lines, sequencing all the text lines according to the Y-axis coordinate, and outputting a text line list corresponding to the sequenced text lines;

and analyzing the text content of the text line in the text line list based on the key field and the position information corresponding to the certificate category to obtain the key field information of the certificate.

4. The method of claim 2, wherein performing the document segmentation and document image pre-processing operations on the input document image comprises:

and if the segmented certificate image is inclined, deformed or inverted, carrying out certificate correction, interpolation to a fixed size and certificate image masking on the segmented certificate image.

5. The method of claim 2, wherein before performing the certificate segmentation and certificate image preprocessing operations on the input certificate image if the shooting scene of the input certificate image is a natural scene, the method further comprises:

carrying out size scaling and frame filling on the input certificate image to obtain a certificate image meeting the requirement of a certificate segmentation model;

if the shooting scene of the input certificate image is a natural scene, after certificate segmentation and certificate image preprocessing operations are performed on the input certificate image, the method further comprises the following steps:

removing the false-detected rectangle in the certificate image, and acquiring a contour polygon coordinate set of the certificate image;

clustering the outline polygon coordinate set of the certificate image by using a clustering algorithm to obtain an angular point set corresponding to four angular points;

performing rotation fitting on the corner point set of the four corner points according to a set rule to obtain coordinates of the four corner points;

and performing coordinate mapping on the four corner point coordinates, acquiring four corner coordinates corresponding to the four corner point coordinates in the certificate original image, and completing the contour detection of the certificate image.

6. The method of claim 2, wherein if the capturing scene of the input document image is a designated scene, performing the text line detection and the tilt correction operation on the input document image comprises:

performing text detection on the input certificate image by using a deep learning algorithm to obtain position coordinates of a text box in the input certificate image;

traversing all text boxes in the input certificate image, and acquiring a text box line segment with the longest length as a target text box line segment;

calculating the inclination angle of the target text box line segment, and judging whether the inclination angle is smaller than a set threshold angle;

if not, performing inclination correction on the input certificate image;

if yes, classifying the direction of the text in the input certificate image into 0 degree or 180 degrees through a text direction classification model, and when the direction of the text in the input certificate image is 180 degrees, rotating the input certificate image by 180 degrees in a clockwise direction.

7. The method of claim 1, wherein the certificate key field information comprises key field information supported for extraction from identity cards, key field information supported for extraction from driver's licenses, and key field information supported for extraction from vehicle licenses.

8. A certificate information extraction device, characterized by comprising:

9. A computer device, characterized in that the computer device comprises:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the certificate information extraction method of any one of claims 1-7.

10. A computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the certificate information extraction method as claimed in any one of claims 1 to 7.