CN112464941B - Invoice identification method and system based on neural network - Google Patents


Info

Publication number
CN112464941B
CN112464941B (application CN202011148662.5A)
Authority
CN
China
Prior art keywords
invoice
text
content
slitting
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011148662.5A
Other languages
Chinese (zh)
Other versions
CN112464941A (en)
Inventor
漆孟冬
Current Assignee
Beijing Si Tech Information Technology Co Ltd
Original Assignee
Beijing Si Tech Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Si Tech Information Technology Co Ltd filed Critical Beijing Si Tech Information Technology Co Ltd
Priority to CN202011148662.5A priority Critical patent/CN112464941B/en
Publication of CN112464941A publication Critical patent/CN112464941A/en
Application granted granted Critical
Publication of CN112464941B publication Critical patent/CN112464941B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/14: Image acquisition
    • G06V30/148: Segmentation of character regions
    • G06V30/153: Segmentation of character regions using recognition of characters or words
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Character Input (AREA)

Abstract

The invention discloses an invoice recognition method and system based on a neural network, relating to the technical field of computers. In the method, the invoice is cut into slitting maps according to the invoice content; text boxes in each slitting map are recognized by a first neural network model, and the slitting map is further cut into text block diagrams based on the position areas of the text boxes, deleting redundant blank areas. On the one hand this reduces the amount of computation and improves recognition efficiency; on the other hand it removes the grid lines on the invoice, avoiding their interference with character recognition and improving the accuracy of text positioning. The text of the text block diagrams is then recognized by a second neural network model, and the recognized text is spliced based on the position areas of the text block diagrams to obtain the text content of each slitting map and, from that, the recognition result of the invoice.

Description

Invoice identification method and system based on neural network
Technical Field
The invention relates to the technical field of computers, in particular to an invoice recognition method and system based on a neural network.
Background
Invoice management is an important part of financial management. Collecting original bills and entering their information requires a large amount of manpower and material resources; the heavy work of bill entry and management consumes labor and time, and reduces office efficiency.
At present, invoice recognition mainly relies on image processing, using an OCR (Optical Character Recognition) engine based on TESSERACT to recognize characters. However, image processing alone is disturbed by the grid lines on the invoice, which limits the accuracy of locating text on the invoice; moreover, TESSERACT's character recognition is slow, and its recognition accuracy cannot be improved.
Disclosure of Invention
Aiming at the above technical problems in the prior art, the invention provides an invoice recognition method and system based on a neural network, which make it convenient to accurately recognize and locate the characters on an invoice.
The invention discloses an invoice recognition method based on a neural network, which comprises the following steps: cutting the invoice according to the position area of the invoice content to obtain a cutting diagram; identifying a text box in the slitting map based on a first neural network model; dividing the dividing and cutting diagram into character block diagrams based on the position area of the character frame; identifying text in the text block based on a second neural network model; acquiring a splicing result of the splitting graph according to the position area of the text frame and the recognized text; and acquiring an identification result of the invoice according to the splicing result of the slitting diagram.
Preferably, the method further comprises a method of invoice preprocessing: converting the invoice into an invoice picture; and turning the invoice picture right.
Preferably, the invoice picture correcting method comprises the following steps: performing tilt correction on the invoice picture based on the Hough transform; acquiring the position of a two-dimensional code or a seal in the invoice picture, and obtaining the orientation of the invoice picture from that positional relation; and turning the invoice picture upright according to its orientation.
Preferably, the method for slitting the invoice and obtaining the slitting map comprises the following steps: acquiring a position area of the invoice content according to the position relation between the invoice content and the two-dimensional code or the seal; and dividing the invoice picture into a cutting picture according to the position area.
Preferably, the first neural network model includes a CTPN model, and the method for obtaining the CTPN model includes: acquiring invoice picture samples of a preset number; dividing the invoice picture sample according to invoice content to obtain divided samples; setting a label for the cut sample to obtain a training set, wherein the label is a text frame coordinate in the cut sample; and training the training set based on CTPN neural networks to obtain the first neural network model.
Preferably, the method for acquiring the second neural network model includes: according to character features in the invoice, a training sample and a second sample set are established; taking characters of characters in the training sample as labels of the training sample; and training the second sample set based on DenseNet +CTC neural network to obtain a second neural network model.
Preferably, the method for obtaining the splitting graph splicing result according to the text frame and the identified text comprises the following steps: matching the content of the text frame with the content of the slitting map or the invoice content; and checking the formats of the text box content and the invoice content value, and obtaining the value of the slitting diagram content or the invoice content.
Preferably, the method for obtaining the splitting graph splicing result according to the position area of the text frame and the text comprises the following steps: acquiring a coordinate area of a text frame; matching text frame content with slitting map or invoice content based on an abscissa or ordinate area to obtain first content; and acquiring the content of the slitting map or the invoice for the first content matching value based on the abscissa or the ordinate area.
Preferably, the method of the present invention further comprises a method of format verification: and verifying the matching value of the first content according to the format characteristics of the first content value.
The invention also provides a system for realizing the method, which comprises a slitting module, a first neural network module, a text block diagram slitting module, a second neural network module, a text block splicing module and a slitting diagram splicing module, wherein the slitting module is used for slitting the invoice according to the position area of the invoice content to obtain a slitting diagram; the first neural network module is used for identifying a text box in the slitting map based on a first neural network model; the text block diagram dividing and cutting module is used for dividing and cutting the dividing and cutting diagram into text block diagrams based on the position area of the text block; the second neural network module is used for identifying characters in the character block diagram based on a second neural network model; the text frame splicing module is used for acquiring a splicing result of the splitting graph according to the text frame and the recognized text; and the splitting graph splicing module is used for acquiring the identification result of the invoice according to the splicing result of the splitting graph.
Compared with the prior art, the invention has the following beneficial effects: the invoice is cut according to the invoice content, text boxes in each slitting map are identified by the first neural network model, and the slitting map is further cut based on the position areas of the text boxes, deleting redundant blank areas. On the one hand this reduces the amount of computation and improves recognition efficiency; on the other hand it removes the grid lines on the invoice, avoiding their interference with character recognition and improving the accuracy of text positioning. The text of the text block diagrams is recognized by the second neural network model, and the recognized text is spliced based on the position areas of the text block diagrams to obtain the text content of the slitting maps and thus the recognition result of the invoice.
Drawings
FIG. 1 is a flow chart of an invoice recognition method of the present invention;
FIG. 2 is a flow chart of a method of invoice preprocessing;
FIG. 3 is a flow chart of a method of turning the invoice picture upright;
FIG. 4 is a flow chart of a method for slitting an invoice and obtaining a slitting map;
FIG. 5 is a slitting map of an invoice code;
FIG. 6 is a flow chart of a method for obtaining a split map splice result based on the text box and text;
FIG. 7 is a flow chart of a method for obtaining a splice result of a slit map based on a position area of a text box and text;
FIG. 8 is a schematic diagram of recognition of text boxes for invoice amounts;
FIG. 9 is a flow chart of a method of obtaining the CTPN model;
FIG. 10 is a flow chart of a method of acquiring the second neural network model;
fig. 11 is a system logic block diagram of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention is described in further detail below with reference to the attached drawing figures:
an invoice recognition method based on a neural network, as shown in fig. 1, the method comprising:
Step 101: cut the invoice according to the position areas of the invoice content to obtain slitting maps. An invoice has a fixed format and layout; the invoice title, machine code, invoice code, amount and tax rate, seller information and so on each occupy specific position areas. Cutting along those areas yields slitting maps whose content is known in advance; for example, the slitting map of the machine-code area contains the machine-code label and the specific value of the machine code.
Step 102: identify the text boxes in the slitting map based on the first neural network model. The first neural network model may employ a CTPN model, but is not limited thereto; for example, the MSER (Maximally Stable Extremal Regions) algorithm may also be used.
The CTPN model is based on the CTPN text-detection algorithm, which combines CNN and LSTM deep networks and can effectively detect horizontally distributed text and text boxes in complex scenes.
Step 103: cut the slitting map into text block diagrams based on the position areas of the text boxes. The text block diagrams remove large blank areas, reducing the amount of computation.
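The cutting in Step 103 amounts to cropping each detected box out of the slitting map. A minimal sketch, assuming boxes come as (x1, y1, x2, y2) tuples (the function name and box format are illustrative, not the patent's interface):

```python
import numpy as np

def cut_blocks(img, boxes):
    """Cut a slitting map into text block images from detected text-box
    coordinates. Everything outside the boxes -- blank areas and the
    invoice's grid lines -- is simply not copied."""
    return [img[y1:y2, x1:x2] for (x1, y1, x2, y2) in boxes]

# hypothetical slitting map with two detected text boxes
img = np.zeros((50, 200), dtype=np.uint8)
blocks = cut_blocks(img, [(10, 5, 90, 25), (100, 5, 190, 25)])
print([b.shape for b in blocks])  # [(20, 80), (20, 90)]
```

Each returned block is then fed independently to the second neural network model, which is where the reduction in computation comes from.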
Step 104: words in the word block diagram are identified based on a second neural network model. The second neural network model may employ an OCR model, such as TESSERACT-based model, denseNet +ctc-based model, or crnn+ctc-based model.
DenseNet (Dense Convolutional Network) is a densely connected convolutional network used to extract image convolution features; CTC (Connectionist Temporal Classification) solves the problem that training characters cannot be aligned; TESSERACT is an open-source OCR engine that can recognize image files in multiple formats and convert them into text. The model is not limited to these: a model combining the three methods, such as the CNN+LSTM+CTC algorithm, may also be used, where LSTM (Long Short-Term Memory) is a recurrent neural network designed to solve the long-term dependency problem of general RNNs (recurrent neural networks).
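The CTC post-processing mentioned above can be illustrated independently of any particular network: per-frame label predictions are collapsed by first removing consecutive repeats and then dropping blanks. A sketch (using blank index 0 is an assumption; real systems map indices back to characters afterwards):

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Greedy CTC decoding: collapse consecutive repeats, then drop
    blanks, turning per-frame label predictions into a label sequence."""
    out, prev = [], None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out

# frames predicting "1", "1", blank, "1", "2" decode to [1, 1, 2]:
# the blank separates the two genuine "1" characters
print(ctc_greedy_decode([1, 1, 0, 1, 2]))  # [1, 1, 2]
```

This collapsing rule is exactly what lets CTC train without character-level alignment between the image and its label string.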
Step 105: obtain the splicing result of the slitting map according to the position areas of the text boxes and the recognized text; that is, the text recognized from the text block diagrams is spliced back together to form the splicing result of the slitting map. For example, the slitting map of the machine code contains two text block diagrams: the machine-code label and its value.
Step 106: obtain the recognition result of the invoice according to the splicing results of the slitting maps. The invoice content includes the invoice title, machine code, invoice code, amount and tax rate, seller information and so on; combining their splicing results yields the structured recognition result of the invoice.
In this method, the invoice is cut according to the invoice content, text boxes in each slitting map are identified by the first neural network model, and the slitting map is further cut based on the position areas of the text boxes, deleting redundant blank areas. On the one hand this reduces the amount of computation and improves recognition efficiency; on the other hand it removes the grid lines on the invoice, avoiding their interference with character recognition and improving the accuracy of text positioning. The text of the text block diagrams is recognized by the second neural network model, and the recognized text is spliced based on the position areas of the text block diagrams to obtain the text content of the slitting maps and thus the recognition result of the invoice.
Example 1
As shown in fig. 2, the present embodiment provides a method for preprocessing an invoice:
Step 201: convert the invoice into an invoice picture. The invoice picture can be obtained by photographing or scanning, and an electronic invoice file can be converted into a picture format such as JPEG. The format and size of the invoice picture may also be specified, for example unifying the height and width of the picture to 32 and 200 pixels respectively, but the size of the invoice picture is not limited thereto.
Step 202: turn the invoice picture upright. The invoice picture may be tilted; after it is turned upright, it is easier to read and easier for the first and second neural network models to recognize.
As shown in fig. 3, the method for turning the invoice picture upright may include:
Step 301: perform tilt correction on the invoice picture based on the Hough transform. The Hough transform corrects a text picture using the mapping between the picture's own space and the Hough space: a curve or straight line in the rectangular coordinate system of the picture is mapped to a point in Hough space, forming a peak, so that the problem of detecting an arbitrary shape is converted into the problem of finding a peak.
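In practice Step 301 could use, e.g., cv2.HoughLines; the peak-finding idea itself can be sketched in plain NumPy. This is an illustrative stand-in, not the patent's implementation -- the angle range, step size and demo image are all assumptions:

```python
import numpy as np

def estimate_skew(binary):
    """Estimate the skew (in degrees) of near-horizontal text lines by
    voting in Hough space: for each candidate normal angle, project all
    foreground points onto rho and take the strongest collinear vote."""
    ys, xs = np.nonzero(binary)
    best_angle, best_peak = 90.0, -1
    for deg in np.arange(45.0, 136.0, 0.5):  # normals of near-horizontal lines
        t = np.deg2rad(deg)
        rho = np.round(xs * np.cos(t) + ys * np.sin(t)).astype(int)
        peak = np.bincount(rho - rho.min()).max()  # height of the Hough peak
        if peak > best_peak:
            best_peak, best_angle = peak, deg
    return best_angle - 90.0  # skew of the text line itself

# demo: a synthetic line tilted by about 5 degrees
img = np.zeros((100, 100), dtype=np.uint8)
for x in range(100):
    img[int(20 + x * np.tan(np.deg2rad(5))), x] = 1
print("estimated skew:", estimate_skew(img))
```

The estimated angle would then be fed to a rotation (for example cv2.warpAffine with cv2.getRotationMatrix2D) to straighten the picture.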
Step 302: acquire the position of the two-dimensional code or the seal in the invoice picture, and obtain the orientation of the invoice picture from their positional relation. The positions of the two-dimensional code and/or the seal can be identified by their color and shape in the invoice picture, but not only in this way; the region of the two-dimensional code can also be obtained by decoding the two-dimensional code itself.
Step 303: turn the invoice picture upright according to its orientation. For example, if the invoice picture is upside down, the two-dimensional code lies in the lower right corner, and rotating the picture by 180 degrees turns it upright; normally the two-dimensional code of an invoice picture lies in the upper left corner and the seal in the lower right corner or the middle of the upper side. This makes the picture convenient to read, and unifying the orientation and format of invoice pictures also improves the recognition efficiency of the first and second neural network models.
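Step 303 reduces to choosing among four 90-degree rotations based on the quadrant the two-dimensional code is found in. A minimal sketch, assuming the code belongs in the top-left quadrant (function and argument names are illustrative):

```python
import numpy as np

def upright(img, qr_center):
    """Rotate the invoice picture so the quadrant holding the QR code
    becomes the top-left one; qr_center is the detected (x, y) centre."""
    h, w = img.shape[:2]
    cx, cy = qr_center
    left, top = cx < w / 2, cy < h / 2
    if left and top:
        return img               # already upright
    if not left and not top:
        return np.rot90(img, 2)  # upside down: rotate 180 degrees
    if not left and top:
        return np.rot90(img, 1)  # code at top-right: rotate 90 degrees CCW
    return np.rot90(img, -1)     # code at bottom-left: rotate 90 degrees CW

# demo: a marker in the bottom-right corner ends up at the top-left
img = np.zeros((4, 6), dtype=np.uint8)
img[3, 5] = 1
print(np.argwhere(upright(img, (5, 3)) == 1)[0])  # [0 0]
```

After a 90-degree rotation the picture's width and height swap, so any coordinates computed earlier would be re-detected on the rotated picture.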
In one specific embodiment, the location area of the invoice seal is found with Python's image processing module cv2:
import cv2
import numpy as np

lower_red = np.array([0, 148, 148])
upper_red = np.array([10, 255, 220])
hsv = cv2.cvtColor(im, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, lower_red, upper_red)  # keep only the red part of the original image
red = cv2.bitwise_and(im, im, mask=mask)
Similarly, the blue or black part of the original image is retained, and then the square with the largest area is found to determine the two-dimensional code.
Example 2
As shown in fig. 4, the method for slitting the invoice and obtaining the slitting map includes:
Step 401: obtain the position area of the invoice content from the positional relation between the invoice content and the two-dimensional code or the seal. For example, the machine code of the invoice lies below the two-dimensional code region, so the position area of the machine code can be obtained from the region of the two-dimensional code.
Step 402: cut the invoice picture into slitting maps according to the position areas of the invoice content.
Taking invoice code as an example, the implementation code using python image processing module cv2 is as follows:
x, y, w, h = find_code(img)      # find the two-dimensional code; the return values are the upper left corner coordinates (x, y), width w and height h of its circumscribed rectangle
zx, zy, zw, zh = find_seal(img)  # find the seal; (zx, zy) is the upper left corner of its circumscribed rectangle, zw its width and zh its height
The invoice code region (x3, y3, w3, h3) is determined as follows:
x3 = int(x + w)     # the region starts at the right side of the two-dimensional code rectangle
w3 = int(zw * 1.9)  # the region width is 1.9 times the seal width
y3 = int(zy)        # the region's y coordinate is the seal's zy coordinate
h3 = int(zh * 0.9)  # the region height is 0.9 times the seal height zh
wherein the origin of coordinates refers to the upper left corner of the region. As shown in fig. 5, after the slitting map of the invoice code is obtained, the text boxes in the slitting map are identified by the first neural network model, and the content in the text boxes is recognized by the second neural network model.
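Putting the computed region to use, the slitting map itself is obtained by array slicing. A sketch with hypothetical rectangles (the ratios 1.9 and 0.9 come from the embodiment above; the concrete coordinates and the function name are made up for illustration):

```python
import numpy as np

def crop_invoice_code(img, qr, seal):
    """Crop the invoice-code slitting map given the circumscribed
    rectangles (x, y, w, h) of the two-dimensional code and the seal,
    using the positional rules of the embodiment above."""
    x, y, w, h = qr
    zx, zy, zw, zh = seal
    x3, y3 = int(x + w), int(zy)
    w3, h3 = int(zw * 1.9), int(zh * 0.9)
    return img[y3:y3 + h3, x3:x3 + w3]

img = np.zeros((100, 100), dtype=np.uint8)
patch = crop_invoice_code(img, qr=(5, 5, 20, 20), seal=(40, 8, 20, 20))
print(patch.shape)  # (18, 38)
```

The same slicing pattern, with different offsets and ratios, yields the slitting maps of the other invoice content areas.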
Example 3
The embodiment provides a processing method after character recognition.
As shown in fig. 6, the method for obtaining the splitting graph splicing result according to the text box and the recognized text includes:
Step 601: and matching the content of the text box with the content of the slitting map or the invoice content.
Step 602: check the format of the text-box content to obtain the value of the slitting-map content or the invoice content. For example, the invoice code ID is a continuous number with a fixed number of digits, such as 10 or 12 digits, so the recognition result can be checked against this digit count.
As shown in fig. 5, the recognized text boxes include the invoice code ID and the machine number. The machine code cannot be matched with the slitting-map content directly, but according to the inherent format of the invoice, the invoice code ID above the machine code can be determined as the matched recognition result.
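The digit-count check described in Step 602 can be written as a simple pattern match. A sketch (the sample strings are invented, and real invoice codes may need extra rules beyond the digit count):

```python
import re

def check_invoice_code(text):
    """Format check for a recognised invoice code: exactly 10 or 12
    consecutive digits, per the digit-count rule above."""
    return re.fullmatch(r"\d{10}|\d{12}", text) is not None

print(check_invoice_code("1440318091"))    # True: 10 digits
print(check_invoice_code("1440318O9110"))  # False: OCR read the digit 0 as the letter O
```

A failed check can either reject the recognition result or trigger the positional fallback described above (taking the text box above the machine code instead).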
As shown in fig. 7, the method for obtaining the splicing result of the splitting graph according to the position area of the text frame and the text includes:
Step 701: and acquiring a coordinate area of the text frame.
Step 702: and matching the text frame content with the slitting diagram or invoice content based on the abscissa or ordinate area to obtain first content.
Step 703: and acquiring the content of the slitting map or the invoice content for the first content matching value based on the abscissa or the ordinate area.
In a specific embodiment, as shown in fig. 8, after the two text boxes holding the characters of "amount" are matched with the amount item of the invoice content, the ordinate interval of "amount" is obtained, and the numeric text box below it is matched against that ordinate interval to obtain the value of the amount; likewise, the text boxes holding the characters of the copy name are matched with the invoice content, and the text boxes of the deduction copy are matched with the value of the second copy based on the ordinate interval. In fig. 8, text boxes are represented by boxes.
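The coordinate-interval matching of Steps 701 to 703 can be sketched as follows: find the value box closest below a matched label box whose horizontal interval overlaps it. The box layout and sample values below are hypothetical:

```python
def match_value(label_box, candidates):
    """Pick the text box closest below the label whose x-interval
    overlaps the label's x-interval; boxes are (x, y, w, h, text)."""
    lx, ly, lw, lh, _ = label_box
    below = [c for c in candidates
             if c[1] >= ly + lh                        # strictly below the label
             and c[0] < lx + lw and c[0] + c[2] > lx]  # x-intervals overlap
    return min(below, key=lambda c: c[1], default=None)

label = (10, 5, 40, 12, "amount")
boxes = [(12, 20, 50, 12, "199.00"),   # the amount's value, directly below
         (12, 40, 50, 12, "6%"),       # further down: a different field
         (80, 20, 30, 12, "2019-10")]  # below, but in another column
print(match_value(label, boxes)[4])    # 199.00
```

Swapping the roles of x and y in the same function gives matching along the abscissa, for labels whose values sit to their right rather than below.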
The invention can also comprise a method for checking the format: and verifying the matching value of the first content according to the format characteristics of the first content value. Such as the invoice code format described above.
Example 4
As shown in fig. 9, the first neural network model includes a CTPN model, and the method for obtaining the CTPN model includes:
Step 901: acquire a preset number of invoice picture samples. The invoice pictures may be set to a specified size.
Step 902: and cutting the invoice picture sample according to the invoice content to obtain a cut sample.
Step 903: set labels for the cut samples to obtain a training set, where each label is the text-box coordinates in the cut sample. The text-box coordinates may include the coordinate values of opposite corners of the text box, such as the upper left and lower right corners. In one specific embodiment, a training set of 2 million invoice pictures is constructed.
Step 904: and training the training set based on CTPN neural networks to obtain a first neural network model.
As shown in fig. 10, the method for acquiring the second neural network model includes:
Step 1001: establish training samples and a second sample set according to the character features in the invoice. The training samples may be text block diagrams, and the second sample set should cover every component of the invoice content.
Step 1002: take the character codes of the characters in a training sample as the labels of that sample. Common characters on invoices include digits, English letters, punctuation marks and Chinese characters, so the codes of these characters can be used as labels. In one embodiment, the character length of a training sample may be set to 10 or 12, matching the longest invoice content value, so as to improve training efficiency.
Step 1003: and training the second sample set based on DenseNet +CTC neural network to obtain a second neural network model. Wherein DenseNet neural network is connected to CTC neural network, and trains the second sample set as a whole.
The CTPN neural network and the DenseNet+CTC neural network are prior art, so they are not described again here.
The invention also provides a system for realizing the method, as shown in figure 11, which comprises a slitting module 1, a first neural network module 2, a text block slitting module 3, a second neural network module 4, a text block splicing module 5 and a slitting diagram splicing module 6,
The slitting module 1 is used for slitting the invoice according to the position area of the invoice content to obtain a slitting diagram;
The first neural network module 2 is used for identifying text boxes in the slitting map based on a first neural network model;
The text block diagram dividing and cutting module 3 is used for dividing and cutting the dividing and cutting diagram into text block diagrams based on the position area of the text block;
The second neural network module 4 is used for identifying the characters in the character block diagram based on the second neural network model;
The text frame splicing module 5 is used for acquiring the splicing result of the splitting graph according to the text frame and the recognized text;
the splitting diagram splicing module 6 is used for acquiring the identification result of the invoice according to the splicing result of the splitting diagram.
It should be noted that for different invoice types or versions, the arrangement, placement and distribution areas of the invoice content differ, and the specific algorithm parameters differ accordingly.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An invoice recognition method based on a neural network, which is characterized by comprising the following steps:
cutting the invoice according to the position area of the invoice content to obtain a cutting diagram;
identifying a text box in the slitting map based on a first neural network model;
Dividing the dividing and cutting diagram into character block diagrams based on the position area of the character frame;
identifying text in the text block based on a second neural network model;
Acquiring a splicing result of the splitting graph according to the position area of the text frame and the recognized text;
Acquiring an identification result of the invoice according to the splicing result of the slitting map;
The method for dividing the dividing and cutting diagram into the text block diagram comprises the following steps: and deleting the grid lines of the slitting map to obtain a text block diagram.
2. The invoice recognition method according to claim 1, wherein the method further comprises a method of invoice preprocessing:
Converting the invoice into an invoice picture;
turning the invoice picture right;
Wherein, blank areas are also removed in the text block diagram.
3. The invoice recognition method according to claim 2, wherein the invoice picture forwarding method comprises:
performing inclination correction on the invoice picture based on Hough transformation;
Acquiring the position of a two-dimensional code or a seal in the invoice picture, and acquiring the orientation of the invoice picture according to the position relation of the two-dimensional code or the seal;
And turning the invoice picture forward according to the orientation of the invoice picture.
4. The invoice recognition method as claimed in claim 3, wherein the method of slitting the invoice and obtaining a slitting map comprises:
Acquiring a position area of the invoice content according to the position relation between the invoice content and the two-dimensional code or the seal;
And dividing the invoice picture into a cutting picture according to the position area.
5. The invoice recognition method according to claim 1 or 4, wherein the first neural network model comprises CTPN model, and the method of obtaining the CTPN model comprises:
acquiring invoice picture samples of a preset number;
Dividing the invoice picture sample according to invoice content to obtain divided samples;
Setting a label for the cut sample to obtain a training set; the labels are text frame coordinates in the slitting samples;
and training the training set based on CTPN neural networks to obtain the first neural network model.
6. The invoice recognition method according to claim 1, wherein the method of obtaining the second neural network model comprises:
building training samples and a second sample set according to the character features in the invoice;
taking the characters in the training samples as the labels of the training samples;
training a DenseNet+CTC neural network on the second sample set to obtain the second neural network model.
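A DenseNet+CTC recognizer emits a per-frame distribution over characters; turning that into a character string uses the standard greedy CTC decoding step (merge repeated symbols, then drop the blank). The blank index 0 is an assumption; the patent does not describe the decoder:

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse a per-frame argmax sequence: merge adjacent repeats,
    then drop blank symbols (CTC best-path decoding)."""
    out, prev = [], None
    for i in frame_ids:
        if i != prev and i != blank:
            out.append(i)
        prev = i
    return out
```

Note that a blank between two identical symbols keeps them distinct, which is exactly what CTC's blank is for.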
7. The invoice recognition method according to claim 1, wherein the method of obtaining the splicing result of the slitting map from the text boxes and the recognized text comprises:
matching the text box content with the slitting map content or the invoice content;
checking the format of the text box content against the invoice content value, and obtaining the value of the slitting map content or the invoice content.
8. The invoice recognition method according to claim 1, wherein the method of obtaining the splicing result of the slitting map from the position areas of the text boxes and the text comprises:
acquiring the coordinate area of each text box;
matching the text box content with the slitting map content or the invoice content based on the abscissa or ordinate area to obtain first content;
acquiring the matching value of the first content from the slitting map or invoice content based on the abscissa or ordinate area.
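Matching text boxes to invoice fields by ordinate area can be sketched as assigning each field the recognized box whose vertical interval overlaps the field's layout interval most; the field intervals here are hypothetical layout values, not figures from the patent:

```python
def overlap(a, b):
    """Length of overlap between two 1-D intervals (lo, hi)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def match_fields(boxes, fields):
    """Assign each field the text of the box whose y-interval overlaps
    the field's assumed layout interval most.
    boxes: list of (x0, y0, x1, y1, text); fields: name -> (y_lo, y_hi)."""
    result = {}
    for name, yspan in fields.items():
        best, best_ov = None, 0
        for x0, y0, x1, y1, text in boxes:
            ov = overlap((y0, y1), yspan)
            if ov > best_ov:
                best, best_ov = text, ov
        result[name] = best
    return result
```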
9. The invoice recognition method according to claim 8, further comprising format verification:
verifying the matching value of the first content according to the format characteristics of the first content value.
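Format verification of a matched value can be sketched with regular expressions; the field patterns below are hypothetical examples for illustration, not formats specified in the patent:

```python
import re

# Hypothetical format rules; real invoice field formats vary by region.
FIELD_PATTERNS = {
    "invoice_no": re.compile(r"\d{8}"),          # e.g. an 8-digit number
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),    # e.g. YYYY-MM-DD
    "amount": re.compile(r"\d+\.\d{2}"),         # e.g. 123.45
}

def verify_format(field, value):
    """Return True when the matched value fits the field's expected format."""
    pattern = FIELD_PATTERNS.get(field)
    return bool(pattern and pattern.fullmatch(value))
```

A value that fails its format check would be rejected or re-matched rather than spliced into the recognition result.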
10. A system for implementing the invoice recognition method according to any one of claims 1 to 8, comprising a slitting module, a first neural network module, a text block slitting module, a second neural network module, a text box splicing module and a slitting map splicing module, wherein:
the slitting module is used for slitting the invoice according to the position areas of the invoice content to obtain slitting maps;
the first neural network module is used for recognizing text boxes in the slitting maps based on the first neural network model;
the text block slitting module is used for slitting the slitting maps into text block diagrams based on the position areas of the text blocks;
the second neural network module is used for recognizing the characters in the text block diagrams based on the second neural network model;
the text box splicing module is used for acquiring the splicing result of each slitting map according to the text boxes and the recognized text;
the slitting map splicing module is used for acquiring the recognition result of the invoice according to the splicing results of the slitting maps.
CN202011148662.5A 2020-10-23 2020-10-23 Invoice identification method and system based on neural network Active CN112464941B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011148662.5A CN112464941B (en) 2020-10-23 2020-10-23 Invoice identification method and system based on neural network

Publications (2)

Publication Number Publication Date
CN112464941A CN112464941A (en) 2021-03-09
CN112464941B true CN112464941B (en) 2024-05-24

Family

ID=74835307

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011148662.5A Active CN112464941B (en) 2020-10-23 2020-10-23 Invoice identification method and system based on neural network

Country Status (1)

Country Link
CN (1) CN112464941B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113869311A (en) * 2021-09-28 2021-12-31 中通服创立信息科技有限责任公司 Optical character recognition method with high recognition rate
TWI772199B (en) * 2021-10-13 2022-07-21 元赫數位雲股份有限公司 Accounting management system for recognizes accounting voucher image to automatically obtain accounting related information
CN115273123B (en) * 2022-09-26 2023-02-10 山东豸信认证服务有限公司 Bill identification method, device and equipment and computer storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003178081A (en) * 2001-12-04 2003-06-27 Matsushita Electric Ind Co Ltd Document classification and labeling method using layout graph matching
CN109214382A (en) * 2018-07-16 2019-01-15 顺丰科技有限公司 A kind of billing information recognizer, equipment and storage medium based on CRNN
CN109344838A (en) * 2018-11-02 2019-02-15 长江大学 The automatic method for quickly identifying of invoice information, system and device
WO2019071662A1 (en) * 2017-10-09 2019-04-18 平安科技(深圳)有限公司 Electronic device, bill information identification method, and computer readable storage medium
CN109977957A (en) * 2019-03-04 2019-07-05 苏宁易购集团股份有限公司 A kind of invoice recognition methods and system based on deep learning
WO2019174130A1 (en) * 2018-03-14 2019-09-19 平安科技(深圳)有限公司 Bill recognition method, server, and computer readable storage medium
CN110598686A (en) * 2019-09-17 2019-12-20 携程计算机技术(上海)有限公司 Invoice identification method, system, electronic equipment and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200026914A1 (en) * 2018-07-18 2020-01-23 Kyocera Document Solutions Inc. Information processing device, information processing method, and information processing system for extracting information on electronic payment from bill image

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Research on Chinese financial invoice recognition technology; Ming, D., et al.; Pattern Recognition Letters; Vol. 24, No. 1-3; pp. 489-497 *
Segmentation and recognition of electronic logistics bill information based on deep learning; Shen Mingjun, et al.; Gansu Science and Technology (甘肃科技纵横); Vol. 49, No. 05; pp. 1-5 *
Neural-network-based text detection and recognition method for ***; Jiang Chongyu, et al.; Journal of Wuhan Institute of Technology; No. 06; pp. 586-590 *

Also Published As

Publication number Publication date
CN112464941A (en) 2021-03-09

Similar Documents

Publication Publication Date Title
CN112464941B (en) Invoice identification method and system based on neural network
CN109308476B (en) Billing information processing method, system and computer readable storage medium
CN109948510B (en) Document image instance segmentation method and device
CN101957919B (en) Character recognition method based on image local feature retrieval
CN106960208A (en) A kind of instrument liquid crystal digital automatic segmentation and the method and system of identification
CN111626146A (en) Merging cell table segmentation and identification method based on template matching
CN112508011A (en) OCR (optical character recognition) method and device based on neural network
CN102184383B (en) Automatic generation method of image sample of printed character
CN112699775A (en) Certificate identification method, device and equipment based on deep learning and storage medium
CN112149548B (en) CAD drawing intelligent input and identification method and device suitable for terminal row
CN109829458B (en) Method for automatically generating log file for recording system operation behavior in real time
CN112016481B (en) OCR-based financial statement information detection and recognition method
CN110647956A (en) Invoice information extraction method combined with two-dimensional code recognition
CN105740857A (en) OCR based automatic acquisition and recognition system for fast pencil-and-paper voting result
CN111476210A (en) Image-based text recognition method, system, device and storage medium
CN112949455A (en) Value-added tax invoice identification system and method
CN112364834A (en) Form identification restoration method based on deep learning and image processing
CN113780276A (en) Text detection and identification method and system combined with text classification
CN114998905A (en) Method, device and equipment for verifying complex structured document content
CN111126266A (en) Text processing method, text processing system, device, and medium
CN116994282B (en) Reinforcing steel bar quantity identification and collection method for bridge design drawing
CN112861861B (en) Method and device for recognizing nixie tube text and electronic equipment
CN113780116A (en) Invoice classification method and device, computer equipment and storage medium
CN116403233A (en) Image positioning and identifying method based on digitized archives
CN110175563B (en) Metal cutting tool drawing mark identification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant