CN112464941A - Invoice identification method and system based on neural network - Google Patents


Info

Publication number
CN112464941A
CN112464941A
Authority
CN
China
Prior art keywords
invoice
content
neural network
cutting
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011148662.5A
Other languages
Chinese (zh)
Other versions
CN112464941B (en)
Inventor
漆孟冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Si Tech Information Technology Co Ltd
Original Assignee
Beijing Si Tech Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Si Tech Information Technology Co Ltd
Priority to CN202011148662.5A
Publication of CN112464941A
Application granted
Publication of CN112464941B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Input (AREA)

Abstract

The invention discloses an invoice identification method and system based on a neural network, relating to the technical field of computers. The method cuts the invoice into segmentation images according to the invoice content, identifies the text boxes in each segmentation image through a first neural network model, and further cuts the segmentation image into text-block images based on the position areas of the text boxes, deleting redundant blank areas. On the one hand this reduces the amount of computation and improves recognition efficiency; on the other hand it removes the ruled lines on the invoice, avoiding their interference with text recognition and improving the accuracy of text positioning. The text in each text-block image is then recognized based on a second neural network model, and the recognized text is stitched according to the position areas of the text-block images to obtain the text content of each segmentation image and thereby the recognition result of the invoice.

Description

Invoice identification method and system based on neural network
Technical Field
The invention relates to the technical field of computers, in particular to an invoice identification method and system based on a neural network.
Background
Invoice management is an important part of financial management and requires a large investment of manpower and material resources to collect original bills and enter their information; this tedious entry and management work consumes labor and time and reduces office efficiency.
Existing invoice recognition relies mainly on image processing and uses a Tesseract-based OCR (Optical Character Recognition) engine to recognize characters. With image processing alone, however, the ruled lines on the invoice cause interference and limit the accuracy of locating the characters; moreover, Tesseract's text recognition is slow and its accuracy is hard to improve.
Disclosure of the Invention
Aiming at the above technical problems in the prior art, the invention provides an invoice identification method and system based on a neural network, which accurately recognize and locate the text on an invoice.
The invention discloses an invoice identification method based on a neural network, which comprises the following steps: cutting the invoice according to the position areas of the invoice content to obtain segmentation images; identifying text boxes in each segmentation image based on a first neural network model; cutting the segmentation image into text-block images based on the position areas of the text boxes; recognizing the text in each text-block image based on a second neural network model; acquiring the stitching result of each segmentation image according to the position areas of the text boxes and the recognized text; and acquiring the identification result of the invoice according to the stitching results of the segmentation images.
Preferably, the method further comprises a method for invoice preprocessing: converting the invoice into an invoice picture; and correcting the invoice picture.
Preferably, the invoice picture correction method comprises: performing skew correction on the invoice picture based on the Hough transform; acquiring the position of the two-dimensional code or the seal in the invoice picture and deriving the orientation of the invoice picture from that position; and correcting the invoice picture according to its orientation.
Preferably, the method for cutting the invoice and obtaining the segmentation images comprises: acquiring the position area of the invoice content according to its position relative to the two-dimensional code or the seal; and cutting the invoice picture into segmentation images according to the position areas.
Preferably, the first neural network model includes a CTPN model, and the method for obtaining the CTPN model comprises: acquiring a preset number of invoice picture samples; cutting each invoice picture sample according to the invoice content to obtain cut samples; setting labels for the cut samples to obtain a training set, wherein each label is the text-box coordinates in the cut sample; and training on the training set based on the CTPN neural network to obtain the first neural network model.
Preferably, the method for obtaining the second neural network model comprises: establishing training samples and a second sample set according to the character features in the invoice; using the characters in each training sample as the label of that training sample; and training on the second sample set based on the DenseNet + CTC neural network to obtain the second neural network model.
Preferably, the method for obtaining the stitching result of a segmentation image according to the text boxes and the recognized text comprises: matching the content of each text box with the content of the segmentation image or the invoice content; and checking the format of the text-box content against the values of the invoice content, thereby acquiring the value of the segmentation image content or of the invoice content.
Preferably, the method for obtaining the stitching result of a segmentation image according to the position areas of the text boxes and the text comprises: acquiring the coordinate area of each text box; matching the content of a text box with the content of the segmentation image or the invoice based on its abscissa or ordinate interval to obtain first content; and acquiring the value matching the first content as the segmentation image content or the invoice content, again based on the abscissa or ordinate interval.
Preferably, the method of the present invention further comprises a format check: verifying the matched value of the first content against the format features of that value.
The invention also provides a system for realizing the method, comprising a cutting module, a first neural network module, a text-block cutting module, a second neural network module, a text-box stitching module and a segmentation image stitching module, wherein the cutting module cuts the invoice according to the position areas of the invoice content to obtain segmentation images; the first neural network module identifies the text boxes in each segmentation image based on a first neural network model; the text-block cutting module cuts the segmentation image into text-block images based on the position areas of the text boxes; the second neural network module recognizes the text in each text-block image based on a second neural network model; the text-box stitching module acquires the stitching result of each segmentation image according to the text boxes and the recognized text; and the segmentation image stitching module acquires the identification result of the invoice according to the stitching results of the segmentation images.
Compared with the prior art, the invention has the following beneficial effects: the invoice is cut according to the invoice content, the text boxes in each segmentation image are identified through the first neural network model, and the segmentation image is further cut based on the position areas of the text boxes, deleting redundant blank areas; on the one hand this reduces the amount of computation and improves recognition efficiency, and on the other hand it removes the ruled lines on the invoice, avoiding their interference with text recognition and improving the accuracy of text positioning. The text in each text-block image is then recognized based on the second neural network model, and the recognized text is stitched according to the position areas of the text-block images to obtain the text content of each segmentation image and thereby the recognition result of the invoice.
Drawings
FIG. 1 is a flow chart of an invoice identification method of the present invention;
FIG. 2 is a flow chart of a method of invoice preprocessing;
FIG. 3 is a flow chart of a method of correcting the orientation of the invoice picture;
FIG. 4 is a flow chart of a method of cutting the invoice and obtaining segmentation images;
FIG. 5 is a segmentation image of the invoice code;
FIG. 6 is a flow chart of a method for obtaining the stitching result of a segmentation image according to the text boxes and the text;
FIG. 7 is a flow chart of a method for obtaining the stitching result of a segmentation image according to the position areas of the text boxes and the text;
FIG. 8 is a schematic view of text-box identification for the invoice amount;
FIG. 9 is a flow chart of a method of obtaining the CTPN model;
FIG. 10 is a flow chart of a method of obtaining the second neural network model;
FIG. 11 is a logical block diagram of the system of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
The invention is described in further detail below with reference to the attached drawing figures:
An invoice identification method based on a neural network is shown in fig. 1 and comprises the following steps:
step 101: and cutting the invoice according to the position area of the invoice content to obtain a score cutting chart. The invoice has fixed format and distribution, such as invoice head-up, machine code, invoice code, money tax rate and seller information, etc. has specific position areas, and the specific position areas are cut into different cutting graphs, wherein, the content of each cutting graph, such as the cutting graph of the position area of the machine code, the content of which is the mark of the machine code and the specific value thereof, can also be obtained.
Step 102: identify the text boxes in each segmentation image based on a first neural network model. The first neural network model may employ a CTPN model, but is not limited thereto; for example, the MSER (Maximally Stable Extremal Regions) algorithm may also be used.
The CTPN model is based on the CTPN text detection algorithm, which combines a CNN with an LSTM deep network and can effectively detect horizontally distributed text and text boxes in complex scenes.
Step 103: and cutting the cutting chart into a text chart based on the position area of the text box. And a large number of blank areas are removed through the text block diagram, so that the calculation amount is reduced.
Step 104: recognize the text in each text-block image based on a second neural network model. The second neural network model may employ an OCR model, such as one based on Tesseract, on DenseNet + CTC, or on CRNN + CTC.
Here DenseNet (Densely Connected Convolutional Network) extracts the image's convolutional features, and CTC (Connectionist Temporal Classification) solves the problem that training characters cannot be aligned; Tesseract is an open-source OCR engine that can recognize image files in multiple formats and convert them into text. The model is not limited thereto; for example, a CNN + LSTM + CTC combination of the three methods may also be adopted, where LSTM (Long Short-Term Memory) is a recurrent neural network specially designed to solve the long-term dependence problem of an ordinary RNN (recurrent neural network).
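The role CTC plays at inference time can be illustrated with a minimal greedy decoder (a sketch only, not the patent's implementation): consecutive repeated labels are collapsed and blank labels are dropped, which is how a per-frame network output becomes an unaligned character string.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Greedy CTC decoding: collapse consecutive repeats, then drop blanks."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out

# hypothetical per-frame argmax labels; label k stands for digit k-1, 0 is the blank
frames = [3, 3, 0, 1, 1, 0, 0, 1, 1]
decoded = ctc_greedy_decode(frames)              # [3, 1, 1]
text = "".join(str(lab - 1) for lab in decoded)  # "200"
```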
Step 105: and acquiring a splicing result of the segmentation chart according to the position area of the text box and the recognized text. That is, the characters identified in the character block diagram are spliced to obtain the splicing result of the segmentation chart, for example, the segmentation chart of the machine coding comprises two character block diagrams: machine code and its value.
Step 106: and acquiring an identification result of the invoice according to the splicing result of the segmentation graph. The invoice contents comprise invoice head-up, machine codes, invoice codes, money tax rate, seller information and the like, and the invoice structured recognition result is obtained after the contents are combined.
In this method, the invoice is cut according to the invoice content, the text boxes in each segmentation image are identified through the first neural network model, and the segmentation image is further cut based on the position areas of the text boxes, deleting redundant blank areas; this reduces the amount of computation, improves recognition efficiency, removes the ruled lines on the invoice, avoids their interference with text recognition, and improves the accuracy of text positioning. The text in each text-block image is recognized based on the second neural network model, and the recognized text is stitched according to the position areas of the text-block images to obtain the text content of each segmentation image and thereby the recognition result of the invoice.
Example 1
As shown in fig. 2, the present embodiment provides a method for invoice preprocessing:
step 201: and converting the invoice into an invoice picture. The invoice picture can be obtained by taking a picture and scanning, or the electronic file of the invoice can be converted into a picture format, such as a jpeg format. The format of the invoice picture may also be sized, such as the height and width of the picture being uniform to 32 and 200 pixels, respectively, so that the size of the invoice picture is not limited thereto.
Step 202: and correcting the invoice picture. The invoice picture may have a slant condition, and after correction, reading is facilitated, and meanwhile recognition of the first neural network model and the second neural network model is facilitated.
As shown in fig. 3, the method for correcting the invoice picture may include:
step 301: and performing slope correction on the invoice picture based on Hough transform. Hough transformation is used for correcting the inclined chiffon of a text picture, and mainly utilizes the transformation between the space where the picture is located and the Hough space to map a curve or a straight line with a shape in a rectangular coordinate system where the picture is located to one point of the Hough space to form a peak value, so that the problem of detecting any shape is converted into the problem of calculating the peak value.
Step 302: and acquiring the position of the two-dimensional code or the seal in the invoice picture, and acquiring the orientation of the invoice picture according to the position relation of the two-dimensional code or the seal. The positions of the two-dimensional codes and/or the stamps can be identified through the colors and the shapes of the two-dimensional codes and/or the stamps in the invoice picture, but the position is not limited to the above, and for example, the area positions of the two-dimensional codes can be obtained through identifying the two-dimensional codes.
Step 303: and correcting the invoice picture according to the orientation of the invoice picture. If the invoice picture is inverted, the two-dimensional code is on the lower right corner, the invoice picture is rotated by 180 degrees, so that the invoice picture is rotated to be correct, namely the two-dimensional code of the invoice picture is on the upper left corner usually, and the stamp is on the lower right corner or in the middle of the upper side to facilitate reading, and meanwhile, the identification efficiency of the first neural network model and the second neural network model is improved by unifying the orientation or format of the invoice picture.
In one embodiment, the position area of the invoice seal is found with Python's image processing module cv2:
import cv2
import numpy as np
lower_red = np.array([0, 148, 148])
upper_red = np.array([10, 255, 220])
hsv = cv2.cvtColor(im, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, lower_red, upper_red)  # keep only the red parts of the original image
red = cv2.bitwise_and(im, im, mask=mask)
Similarly, the blue or black parts of the original image can be kept, after which the square with the largest area is found to locate the two-dimensional code.
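The final step, finding the square with the largest area, can be sketched independently of cv2 (the squareness threshold is an assumption): among candidate bounding rectangles, such as those cv2.boundingRect would return for each contour, keep the largest one whose aspect ratio is close to 1.

```python
def largest_square(rects, squareness=0.8):
    """rects: list of (x, y, w, h). Return the largest rect with near-square aspect."""
    candidates = [r for r in rects if min(r[2], r[3]) / max(r[2], r[3]) >= squareness]
    return max(candidates, key=lambda r: r[2] * r[3], default=None)

rects = [(0, 0, 40, 10),    # elongated: rejected by the aspect-ratio filter
         (5, 5, 30, 28),    # nearly square, area 840
         (50, 50, 20, 20)]  # square, area 400
qr = largest_square(rects)  # (5, 5, 30, 28)
```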
Example 2
As shown in fig. 4, the method for cutting the invoice and obtaining the cutting map includes:
step 401: and acquiring the position area of the invoice content according to the position relation between the invoice content and the two-dimensional code or the seal. For example, the machine code of the invoice is positioned below the two-dimensional code difference, and the position area of the machine code can be obtained through the area position of the two-dimensional code.
Step 402: and cutting the invoice picture into a cutting map according to the position area of the invoice content.
Taking the invoice code as an example, the implementation with Python's image processing module cv2 is as follows:
x, y, w, h = find_code(img)  # find the two-dimensional code; (x, y) is the upper-left corner, w the width and h the height of its circumscribed rectangle
zx, zy, zw, zh = find_seal(img)  # find the seal; (zx, zy) is the upper-left corner, zw the width and zh the height of its circumscribed rectangle
The invoice code area (x3, y3, w3, h3) is then determined as follows:
x3 = int(x + w)  # the x3 coordinate of the invoice code area is the right side of the two-dimensional code rectangle
w3 = int(zw * 1.9)  # the width of the invoice code area is 1.9 times the seal width
y3 = int(zy)  # the y3 coordinate of the invoice code area is the seal's zy coordinate
h3 = int(zh * 0.9)  # the height of the invoice code area is 0.9 times the seal height zh
Here the origin of coordinates is the upper-left corner of the area. As shown in fig. 5, after the segmentation image of the invoice code is obtained, the text boxes in it are identified by the first neural network model, and the content of each text box is recognized by the second neural network model.
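The coordinate arithmetic above can be packaged as a small function and exercised with stub rectangles (the input values here are illustrative, not from the patent):

```python
def invoice_code_area(code_rect, seal_rect):
    """code_rect, seal_rect: (x, y, w, h) bounding boxes of the QR code and the seal."""
    x, y, w, h = code_rect
    zx, zy, zw, zh = seal_rect
    x3 = int(x + w)     # left edge: right side of the QR code rectangle
    y3 = int(zy)        # top edge: top of the seal
    w3 = int(zw * 1.9)  # width: 1.9 times the seal width
    h3 = int(zh * 0.9)  # height: 0.9 times the seal height
    return x3, y3, w3, h3

x3, y3, w3, h3 = invoice_code_area((10, 5, 30, 30), (60, 8, 40, 20))  # (40, 8, 76, 18)
# in the cv2/numpy version the crop would then be img[y3:y3 + h3, x3:x3 + w3]
```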
Example 3
This embodiment provides a method for processing the results after text recognition.
As shown in fig. 6, the method for obtaining the stitching result of a segmentation image according to the text boxes and the recognized text comprises the following steps:
Step 601: match the content of each text box with the content of the segmentation image or the invoice content.
Step 602: check the format of the text-box content and acquire the value of the segmentation image content or the invoice content. For example, the invoice code ID is a continuous number with a fixed number of digits, such as 10 or 12, so the recognition result can be checked by its digit count.
As shown in fig. 5, the recognized text boxes include the invoice code ID and the machine number. The machine number cannot be matched with the content of the segmentation image, and according to the invoice's fixed layout, the invoice code ID above the machine number is determined as the matched recognition result.
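The digit-count check described above can be sketched with a regular expression (the sample strings are illustrative):

```python
import re

def looks_like_invoice_code(text):
    """Accept only strings of exactly 10 or 12 digits."""
    return re.fullmatch(r"\d{10}|\d{12}", text) is not None

ok = looks_like_invoice_code("144031809110")  # 12 digits: accepted
bad = looks_like_invoice_code("14403A8091")   # contains a letter: rejected
```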
As shown in fig. 7, the method for obtaining the stitching result of a segmentation image according to the position areas of the text boxes and the text comprises the following steps:
Step 701: acquire the coordinate area of each text box.
Step 702: match the content of a text box with the content of the segmentation image or the invoice based on its abscissa or ordinate interval, obtaining first content.
Step 703: acquire the value matching the first content as the segmentation image content or the invoice content, again based on the abscissa or ordinate interval.
In a specific embodiment, as shown in fig. 8, after the text boxes "金" and "额" (together "amount") are matched with the amount item of the invoice content, the ordinate interval of the amount is obtained, and the numerical value below it whose ordinate interval matches is taken as the value of the amount; text boxes matched with the other invoice items are likewise matched with their values on the basis of the coordinate intervals. In fig. 8, text boxes are indicated by rectangles.
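The interval matching can be sketched as follows (function names and box coordinates are illustrative): a value box is assigned to the invoice item whose coordinate interval overlaps it most; the vertical case is symmetric.

```python
def overlap(a, b):
    """Length of the overlap of two intervals a=(lo, hi) and b=(lo, hi)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def match_value_to_header(value_box, header_boxes):
    """Boxes are (x0, x1) horizontal intervals; pick the most-overlapping header."""
    return max(header_boxes, key=lambda name: overlap(value_box, header_boxes[name]))

headers = {"amount": (100, 160), "tax rate": (200, 240)}
column = match_value_to_header((110, 150), headers)  # falls under "amount"
```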
The invention also comprises a format check method: the matched value of the first content is verified against the format features of that value, such as the invoice code format described above.
Example 4
As shown in fig. 9, the first neural network model includes a CTPN model, and the method for obtaining the CTPN model includes:
step 901: and acquiring a preset number of invoice picture samples. The invoice picture can be set to a specified size.
Step 902: and cutting the invoice picture sample according to the invoice content to obtain a cut sample.
Step 903: and setting a label for the slitting sample to obtain a training set, wherein the label is a character frame coordinate in the slitting sample. The text box coordinates may include coordinate values for corners of the text box, such as upper left and lower right corners. In one particular embodiment, a training set with 200 ten thousand invoice pictures was constructed.
Step 904: and training the training set based on the CTPN neural network to obtain a first neural network model.
As shown in fig. 10, the method of obtaining the second neural network model includes:
step 1001: and establishing a training sample and a second sample set according to the character characteristics in the invoice. The training sample may be a block diagram of text, and the second set of samples should cover various components of the invoice content.
Step 1002: and taking the characters of the characters in the training sample as the labels of the training sample. In the receipt, the characters commonly used include numbers, English letters, punctuations and Chinese characters, so the codes of the characters can be used as labels. In one embodiment, the character length of the training sample can be set to 10 or 12, which is adapted to the longest value of the invoice content value to improve the training efficiency.
Step 1003: and training the second sample set based on the DenseNet + CTC neural network to obtain a second neural network model. And the DenseNet neural network is connected with the CTC neural network, and the second sample set is trained as a whole.
The CTPN and DenseNet + CTC neural networks are prior art and are not described here in detail.
The invention also provides a system for realizing the method, as shown in fig. 11, comprising a cutting module 1, a first neural network module 2, a text-block cutting module 3, a second neural network module 4, a text-box stitching module 5 and a segmentation image stitching module 6, wherein
the cutting module 1 is used for cutting the invoice according to the position areas of the invoice content to obtain segmentation images;
the first neural network module 2 is used for identifying the text boxes in each segmentation image based on a first neural network model;
the text-block cutting module 3 is used for cutting the segmentation image into text-block images based on the position areas of the text boxes;
the second neural network module 4 is used for recognizing the text in each text-block image based on a second neural network model;
the text-box stitching module 5 is used for acquiring the stitching result of each segmentation image according to the text boxes and the recognized text;
and the segmentation image stitching module 6 is used for acquiring the identification result of the invoice according to the stitching results of the segmentation images.
It should be noted that invoices come in different types and versions, in which the layout and distribution areas of the invoice content differ, so the specific algorithms differ accordingly.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An invoice identification method based on a neural network, characterized by comprising the following steps:
cutting the invoice according to the position areas of the invoice content to obtain segmentation images;
identifying text boxes in each segmentation image based on a first neural network model;
cutting the segmentation image into text-block images based on the position areas of the text boxes;
recognizing the text in each text-block image based on a second neural network model;
acquiring the stitching result of each segmentation image according to the position areas of the text boxes and the recognized text;
and acquiring the identification result of the invoice according to the stitching results of the segmentation images.
2. The invoice identification method according to claim 1, characterized in that the method further comprises a method of invoice preprocessing:
converting the invoice into an invoice picture;
and correcting the invoice picture.
3. The invoice identification method according to claim 2, wherein the invoice picture correcting method comprises the following steps:
performing skew correction on the invoice picture based on Hough transform;
acquiring the position of the two-dimensional code or the seal in the invoice picture, and acquiring the orientation of the invoice picture according to the position relation of the two-dimensional code or the seal;
and correcting the invoice picture according to the orientation of the invoice picture.
4. The invoice recognition method of claim 3, wherein the method for cutting the invoice and obtaining the segmentation images comprises:
acquiring the position area of the invoice content according to the position relation between the invoice content and the two-dimensional code or the seal;
and cutting the invoice picture into segmentation images according to the position areas.
5. The invoice recognition method according to claim 1 or 4, wherein the first neural network model comprises a CTPN model, and the method for obtaining the CTPN model comprises the following steps:
acquiring a preset number of invoice picture samples;
cutting the invoice picture samples according to the invoice content to obtain cut samples;
setting labels for the cut samples to obtain a training set, wherein each label is the text-box coordinates in the cut sample;
and training on the training set based on the CTPN neural network to obtain the first neural network model.
6. The invoice identification method according to claim 1, wherein the second neural network model is obtained by the following steps:
establishing training samples according to the character features in the invoice to form a second sample set;
taking the characters in each training sample as the label of that training sample;
and training on the second sample set based on the DenseNet + CTC neural network to obtain the second neural network model.
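In a DenseNet + CTC recognizer, the CTC head is what turns per-frame character probabilities into a text string. A minimal greedy decode (the charset and probabilities below are made up) takes the argmax per frame, collapses consecutive repeats, and drops the CTC blank:

```python
def ctc_greedy_decode(frame_probs, charset, blank=0):
    """Greedy CTC decoding: argmax character per frame, collapse
    consecutive repeats, then remove the blank symbol."""
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    out, prev = [], blank
    for idx in best:
        if idx != blank and idx != prev:
            out.append(charset[idx])
        prev = idx
    return "".join(out)

# Index 0 is the blank; the frames spell "发票" with a repeat and a blank.
charset = ["-", "发", "票"]
probs = [
    [0.10, 0.80, 0.10],  # 发
    [0.10, 0.80, 0.10],  # 发 (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.80],  # 票
]
text = ctc_greedy_decode(probs, charset)
# text == "发票"
```

Training with the CTC loss is what lets the labels in claim 6 be plain character strings without per-frame alignment.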
7. The invoice identification method according to claim 1, wherein acquiring the splicing result of the segmentation map according to the text box and the recognized text comprises the following steps:
matching the content of the text box with the content of the segmentation map or the invoice content;
and checking the format of the text box content against the value of the invoice content, and acquiring the value of the segmentation map content or the value of the invoice content.
8. The invoice identification method according to claim 1, wherein acquiring the splicing result of the segmentation map according to the position area and the text of the text box comprises the following steps:
acquiring the coordinate area of the text box;
matching the content of the text box with the content of the segmentation map or the invoice based on the abscissa or ordinate area to obtain first content;
and acquiring the value matched to the first content, namely the content of the segmentation map or the content of the invoice, based on the abscissa or ordinate area.
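The abscissa/ordinate matching of claim 8 can be sketched as a row-pairing rule: a label box and a value box whose vertical centres are close are treated as one row, and the value is the nearest box to the right of the label. The box coordinates, field names, and alignment threshold below are all illustrative assumptions.

```python
def match_by_row(label_boxes, value_boxes, max_dy=10):
    """Pair each label box (name, x, y_center) with the nearest value
    box (text, x, y_center) that lies on the same row to its right."""
    pairs = {}
    for label, lx, ly in label_boxes:
        row = [(vx, text) for text, vx, vy in value_boxes
               if abs(vy - ly) <= max_dy and vx > lx]
        if row:
            pairs[label] = min(row)[1]  # nearest box to the right
    return pairs

labels = [("invoice_code", 10, 20), ("amount", 10, 60)]
values = [("044031900111", 80, 22), ("98.50", 70, 58), ("remark", 200, 21)]
fields = match_by_row(labels, values)
# fields == {"invoice_code": "044031900111", "amount": "98.50"}
```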
9. The invoice identification method according to claim 8, further comprising a format checking step:
checking the value matched to the first content according to the format feature of that value.
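The format check of claim 9 reduces to validating the matched value against a per-field pattern. The field names and regular expressions below are illustrative assumptions (e.g. treating an invoice number as 8 digits), not patterns given in the patent.

```python
import re

# Hypothetical per-field format features, in the spirit of claim 9.
FORMATS = {
    "invoice_number": re.compile(r"\d{8}"),
    "invoice_code": re.compile(r"\d{10}|\d{12}"),
    "amount": re.compile(r"¥?\d+(\.\d{2})?"),
    "date": re.compile(r"\d{4}年\d{1,2}月\d{1,2}日"),
}

def check_format(field, value):
    """Accept the recognized value only if it fully matches the field's
    format feature; fields without a known pattern pass unchecked."""
    pattern = FORMATS.get(field)
    return True if pattern is None else bool(pattern.fullmatch(value))
```

Rejected values would be sent back for re-recognition or manual review rather than spliced into the final result.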
10. A system for implementing the invoice identification method of any one of claims 1-8, characterized by comprising a cutting module, a first neural network module, a text block map cutting module, a second neural network module, a text box splicing module and a segmentation map splicing module, wherein:
the cutting module is used for cutting the invoice according to the position area of the invoice content to obtain segmentation maps;
the first neural network module is used for identifying text boxes in the segmentation map based on the first neural network model;
the text block map cutting module is used for cutting the segmentation map into text block maps based on the position areas of the text boxes;
the second neural network module is used for recognizing the text in each text block map based on the second neural network model;
the text box splicing module is used for acquiring the splicing result of the segmentation map according to the text boxes and the recognized text;
and the segmentation map splicing module is used for acquiring the identification result of the invoice according to the splicing results of the segmentation maps.
CN202011148662.5A 2020-10-23 2020-10-23 Invoice identification method and system based on neural network Active CN112464941B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011148662.5A CN112464941B (en) 2020-10-23 2020-10-23 Invoice identification method and system based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011148662.5A CN112464941B (en) 2020-10-23 2020-10-23 Invoice identification method and system based on neural network

Publications (2)

Publication Number Publication Date
CN112464941A true CN112464941A (en) 2021-03-09
CN112464941B CN112464941B (en) 2024-05-24

Family

ID=74835307

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011148662.5A Active CN112464941B (en) 2020-10-23 2020-10-23 Invoice identification method and system based on neural network

Country Status (1)

Country Link
CN (1) CN112464941B (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003178081A (en) * 2001-12-04 2003-06-27 Matsushita Electric Ind Co Ltd Document classification and labeling method using layout graph matching
CN109214382A (en) * 2018-07-16 2019-01-15 顺丰科技有限公司 A kind of billing information recognizer, equipment and storage medium based on CRNN
CN109344838A (en) * 2018-11-02 2019-02-15 长江大学 The automatic method for quickly identifying of invoice information, system and device
WO2019071662A1 (en) * 2017-10-09 2019-04-18 平安科技(深圳)有限公司 Electronic device, bill information identification method, and computer readable storage medium
CN109977957A (en) * 2019-03-04 2019-07-05 苏宁易购集团股份有限公司 A kind of invoice recognition methods and system based on deep learning
WO2019174130A1 (en) * 2018-03-14 2019-09-19 平安科技(深圳)有限公司 Bill recognition method, server, and computer readable storage medium
CN110598686A (en) * 2019-09-17 2019-12-20 携程计算机技术(上海)有限公司 Invoice identification method, system, electronic equipment and medium
US20200026914A1 (en) * 2018-07-18 2020-01-23 Kyocera Document Solutions Inc. Information processing device, information processing method, and information processing system for extracting information on electronic payment from bill image

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MING, D., et al.: "Research on Chinese financial invoice recognition technology", PATTERN RECOGNITION LETTERS, vol. 24, no. 1, pages 489-497 *
SHEN, Mingjun, et al.: "Segmentation and recognition of electronic logistics bill information based on deep learning", 甘肃科技纵横, vol. 49, no. 05, pages 1-5 *
JIANG, Chongyu, et al.: "Text detection and recognition method for *** based on neural networks", 武汉工程大学学报, no. 06, pages 586-590 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113869311A (en) * 2021-09-28 2021-12-31 中通服创立信息科技有限责任公司 Optical character recognition method with high recognition rate
TWI772199B (en) * 2021-10-13 2022-07-21 元赫數位雲股份有限公司 Accounting management system for recognizes accounting voucher image to automatically obtain accounting related information
CN115273123A (en) * 2022-09-26 2022-11-01 山东豸信认证服务有限公司 Bill identification method, device and equipment and computer storage medium
CN118134576A (en) * 2024-05-08 2024-06-04 山东工程职业技术大学 Digital electronic invoice management method and system based on artificial intelligence
CN118134576B (en) * 2024-05-08 2024-08-02 山东工程职业技术大学 Digital electronic invoice management method and system based on artificial intelligence

Also Published As

Publication number Publication date
CN112464941B (en) 2024-05-24

Similar Documents

Publication Publication Date Title
CN112464941B (en) Invoice identification method and system based on neural network
CN109800761B (en) Method and terminal for creating paper document structured data based on deep learning model
CN109948510B (en) Document image instance segmentation method and device
CN110766014A (en) Bill information positioning method, system and computer readable storage medium
Aradhye A generic method for determining up/down orientation of text in roman and non-roman scripts
CN101957919B (en) Character recognition method based on image local feature retrieval
CN112508011A (en) OCR (optical character recognition) method and device based on neural network
CN105631447A (en) Method of recognizing characters in round seal
CN111178290A (en) Signature verification method and device
CN112395996A (en) Financial bill OCR recognition and image processing method, system and readable storage medium
CN111738979B (en) Certificate image quality automatic checking method and system
CN112149548B (en) CAD drawing intelligent input and identification method and device suitable for terminal row
CN112016481B (en) OCR-based financial statement information detection and recognition method
CN110288612B (en) Nameplate positioning and correcting method and device
CN110647956A (en) Invoice information extraction method combined with two-dimensional code recognition
CN112364834A (en) Form identification restoration method based on deep learning and image processing
CN112949455A (en) Value-added tax invoice identification system and method
CN112418180A (en) Table data extraction method, device, equipment and computer storage medium
CN111858977B (en) Bill information acquisition method, device, computer equipment and storage medium
CN112149401A (en) Document comparison identification method and system based on ocr
CN114998905A (en) Method, device and equipment for verifying complex structured document content
WO2019071476A1 (en) Express information input method and system based on intelligent terminal
CN111126266A (en) Text processing method, text processing system, device, and medium
CN116994282B (en) Reinforcing steel bar quantity identification and collection method for bridge design drawing
CN116403233A (en) Image positioning and identifying method based on digitized archives

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant