CN110889311A - Financial electronic facsimile document identification system and method

Info

Publication number: CN110889311A
Application number: CN201811046027.9A
Authority: CN
Inventors: 白石; 郭庆河; 宋嘉琪; 宫路; 张怀朋; 高海慧; 石珍珍; 王子芃
Original assignee: Shanghai Huairuo Intelligent Technology Co Ltd
Current assignee: Shenzhen Yingshisheng Information Technology Co ltd
Priority date: 2018-09-07
Filing date: 2018-09-07
Publication date: 2020-03-17

Abstract

The invention relates to a financial electronic fax document identification system which comprises a server, wherein the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, and the image preprocessing module comprises an image correction module and an image noise reduction module. The invention also discloses a financial electronic fax document identification method, which comprises the processes of image preprocessing, image detection, image character identification, data combination and the like. The invention adopts artificial intelligence to carry out intelligent processing and identification on the pictures with noise in financial transactions, and has the characteristics of high identification efficiency and high quality.

Description

Financial electronic facsimile document identification system and method

Technical Field

The invention relates to a document identification system and a method, in particular to a financial electronic fax document identification system and a financial electronic fax document identification method, and belongs to the field of financial management.

Background

At present, fax documents in the financial field are entered manually, so they cannot be entered efficiently in large batches during peak trading periods; this lengthens the overall transaction time and delays trades, and the financial field therefore urgently needs to improve efficiency. Character recognition systems currently on the market can only recognize characters in natural scenes and cannot process images produced by equipment such as fax machines and scanners. Images produced by fax machines and scanners contain noise that differs from that of natural scenes, and most transaction documents in the financial industry are images of tabular forms, whose segmentation and detection the existing character recognition systems on the market cannot handle.

Disclosure of Invention

The invention discloses a financial electronic fax document identification system and method. The new scheme uses artificial intelligence to process and recognize noisy images in financial transactions, and solves the problems of low recognition efficiency and low quality that arise when manual entry or a general-purpose character recognition system is used in existing schemes.

The invention relates to a financial electronic fax document identification system which comprises a server, wherein the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, the image preprocessing module comprises an image correction module and an image denoising module, the image correction module is used for correcting an input image, the image denoising module is used for denoising the input image, the preprocessed image database is used for storing and managing the preprocessed image, the target detection module is used for detecting the regional content type and the coordinates of the preprocessed image, the character recognition module is used for carrying out character recognition on the regional content of the preprocessed image, and the data merging module is used for processing and integrating the preprocessed and recognized data.

The invention also discloses a financial electronic fax document identification method, which is based on a financial electronic fax document identification system. The financial electronic fax document identification system comprises a server, the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, and the image preprocessing module comprises an image correction module and an image noise reduction module. The method comprises the following steps:

⑴ the image correction module corrects the angle of the received image, the image noise reduction module denoises the angle-corrected image, and the image correction module adjusts the size and the channel attribute of the denoised image and stores it in the preprocessed image database;

⑵ the target detection module detects the preprocessed image by using a table detection neural network to obtain the region data types and the coordinates of the image, and sends them to the character recognition module;

⑶ the character recognition module divides the preprocessed image into a plurality of small images according to the received data types and coordinates, then performs character recognition by using a character recognition neural network to obtain the text content and the coordinates of the image, and sends them to the data merging module;

⑷ the data merging module integrates the received text content and coordinates according to the region data type, and serializes them into the format specified by the communication protocol.

Furthermore, the preprocessed image database of the method is a memory database, the image preprocessing module stores the preprocessed image into a memory, and the target detection module and the character recognition module read the data information of the image from the memory.

Further, in step ⑴ of the method, the image correction module determines the orientation of the received image by using a VGG16 convolutional neural network and rotates the image to the upright orientation, the image noise reduction module uses an image averaging method to change the value of each pixel into the average value of the region formed by that pixel and its neighboring pixels, and the image correction module converts the noise-reduced image to a single channel and stores it in the preprocessed image database for subsequent use.

Further, in step ⑵ of the method, the target detection module loads the detection model to perform target detection on the image, outputs the detected image region type and coordinates, the target detection module sorts the table coordinates, deletes the text coordinates coinciding with the table coordinates, avoids repeated cutting, divides the current image into a plurality of small images according to the remaining coordinates, and sends the data of the json structure formed by the plurality of small images, the coordinates and the types to the character recognition module.

Further, in step ⑶ of the method, the character recognition module uses a VGG16 convolutional neural network together with a bidirectional LSTM neural network and performs sequence decoding with CTC (connectionist temporal classification), so that character recognition is performed on the image according to context information. The character recognition module first loads a recognition model and parses the JSON file sent by the target detection module, then performs character recognition on the images specified in the JSON file; after recognition is completed, the coordinates are paired with the text content to generate JSON-format data, which is sent to the data merging module.

Further, in step ⑷ of the method, the data merging module filters and deduplicates the data sent by the character recognition module: it first deletes coordinates containing negative values together with their text to filter the data, then sorts the data in ascending order of the ordinate and the abscissa, and finally integrates the data and restores the coordinates corresponding to each text according to its position in the original image.

The system and the method for identifying the financial electronic fax document adopt artificial intelligence to carry out intelligent processing and identification on the financial transaction type pictures with noise, and have the characteristics of high identification efficiency and high quality.

Drawings

FIG. 1 is a flow chart of a method of financial electronic facsimile document identification.

Fig. 2 is a flow chart of image preprocessing.

Fig. 3 is a flow chart of object detection.

Fig. 4 is a flow chart of text recognition.

FIG. 5 is a flow chart of data consolidation.

Detailed Description

As shown in FIG. 1, the financial electronic fax document identification system of the invention comprises a server, the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, the image preprocessing module comprises an image correction module and an image denoising module, the image correction module is used for correcting an input image, the image denoising module is used for denoising the input image, the preprocessed image database is used for storing and managing the preprocessed image, the target detection module is used for detecting the regional content type and the coordinates of the preprocessed image, the character recognition module is used for carrying out character recognition on the regional content of the preprocessed image, and the data merging module is used for processing and integrating the preprocessed and recognized data. The scheme uses artificial intelligence to process and recognize noisy images in financial transactions, and greatly improves the recognition efficiency and quality for financial electronic fax documents.

The invention also discloses a financial electronic fax document identification method, which is based on a financial electronic fax document identification system. The system comprises a server, the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, and the image preprocessing module comprises an image correction module and an image noise reduction module. The method comprises the following steps:

⑴ the image correction module corrects the angle of the received image, the image noise reduction module denoises the angle-corrected image, and the image correction module adjusts the size and the channel attribute of the denoised image and stores it in the preprocessed image database;

⑵ the target detection module detects the preprocessed image by using a table detection neural network to obtain the region data types and the coordinates of the image, and sends them to the character recognition module;

⑶ the character recognition module divides the preprocessed image into a plurality of small images according to the received data types and coordinates, then performs character recognition by using a character recognition neural network to obtain the text content and the coordinates of the image, and sends them to the data merging module;

⑷ the data merging module integrates the received text content and coordinates according to the region data type, and serializes them into the format specified by the communication protocol.

The scheme uses artificial intelligence to process and recognize noisy images in financial transactions, and improves the recognition efficiency and quality for financial electronic fax documents.

In order to acquire the file generated by preprocessing more quickly, the preprocessing image database of the method is a memory database, the image preprocessing module stores the preprocessed image into a memory, and the target detection module and the character recognition module read the data information of the image from the memory.
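By way of illustration only, the in-memory database described above can be sketched as a thread-safe dictionary keyed by an image identifier. The class name `InMemoryImageStore`, its methods, and the identifier scheme are assumptions made for this sketch; the source does not name a particular in-memory database product or interface.

```python
import threading
from typing import Any, Dict, Optional


class InMemoryImageStore:
    """Hypothetical stand-in for the in-memory preprocessed image database."""

    def __init__(self) -> None:
        self._images: Dict[str, Any] = {}
        self._lock = threading.Lock()  # modules may access the store concurrently

    def put(self, image_id: str, image: Any) -> None:
        """Store a preprocessed image under the given identifier."""
        with self._lock:
            self._images[image_id] = image

    def get(self, image_id: str) -> Optional[Any]:
        """Return the stored image, or None if the identifier is unknown."""
        with self._lock:
            return self._images.get(image_id)

    def delete(self, image_id: str) -> None:
        """Remove an image once downstream modules no longer need it."""
        with self._lock:
            self._images.pop(image_id, None)
```

In such an arrangement the image preprocessing module would call `put`, while the target detection module and the character recognition module would call `get`, so that no disk access is needed between stages.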

In order to implement the image preprocessing process, as shown in FIG. 2, in step ⑴ of the method according to this embodiment, the image correction module determines the orientation of the received image by using a VGG16 convolutional neural network and rotates the image to the upright orientation, the image noise reduction module uses an image averaging method to change the value of each pixel into the average value of the region formed by that pixel and its neighboring pixels, and the image correction module converts the noise-reduced image to a single channel and stores it in the preprocessed image database for subsequent use.
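The image averaging method can be illustrated with a minimal mean-filter sketch. The neighborhood radius and the handling of border pixels (averaging only over neighbors that lie inside the image) are assumptions, since the source does not specify them; plain nested lists are used so that the example needs only the standard library.

```python
from typing import List

Image = List[List[float]]  # single-channel image as rows of pixel values


def mean_filter(image: Image, radius: int = 1) -> Image:
    """Replace each pixel with the mean of its (2*radius+1)^2 neighborhood.

    Border pixels are averaged over the neighbors that exist (assumption).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result: Image = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0.0
            count = 0
            # Visit the pixel itself and its neighbors, clipped to the image.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            row.append(total / count)
        result.append(row)
    return result
```

For example, an isolated bright pixel of value 9 at the center of a 3x3 block of zeros is spread out to a value of 1.0, which is how isolated speckle noise of the kind produced by fax transmission is attenuated.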

In order to implement the target detection process of the image, as shown in fig. 3, in step ⑵ of the method according to this embodiment, the target detection module loads the detection model to perform target detection on the image, outputs the type and coordinates of the detected image region, the target detection module sorts the table coordinates, deletes the text coordinates coinciding with the table coordinates, avoids repeated cutting, divides the current image into a plurality of small images according to the remaining coordinates, and sends data of a json structure formed by the plurality of small images, the coordinates, and the type to the character recognition module.
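The post-processing of the detection output can be sketched as follows. The region layout (`type` plus a `box` of `[x1, y1, x2, y2]`), the JSON field names, and the rule that a text box "coincides" with a table when more than half of its area lies inside the table are all assumptions for illustration; the source only states that coinciding text coordinates are deleted.

```python
import json
from typing import Dict, List

Box = List[int]  # hypothetical layout: [x1, y1, x2, y2]


def overlap_ratio(inner: Box, outer: Box) -> float:
    """Fraction of `inner`'s area that lies inside `outer`."""
    w = min(inner[2], outer[2]) - max(inner[0], outer[0])
    h = min(inner[3], outer[3]) - max(inner[1], outer[1])
    if w <= 0 or h <= 0:
        return 0.0
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return (w * h) / area if area > 0 else 0.0


def select_regions(regions: List[Dict], threshold: float = 0.5) -> List[Dict]:
    """Sort table boxes and drop text boxes that coincide with a table."""
    tables = sorted(
        (r for r in regions if r["type"] == "table"),
        key=lambda r: (r["box"][1], r["box"][0]),  # top-to-bottom, left-to-right
    )
    texts = [
        r for r in regions
        if r["type"] == "text"
        and all(overlap_ratio(r["box"], t["box"]) <= threshold for t in tables)
    ]
    return tables + texts


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the small image for one region out of the full image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def build_payload(image: List[List[int]], regions: List[Dict]) -> str:
    """Serialize the remaining regions, their coordinates and small images."""
    items = [
        {"type": r["type"], "box": r["box"], "image": crop(image, r["box"])}
        for r in select_regions(regions)
    ]
    return json.dumps({"regions": items})
```

Dropping text boxes that fall inside a table means each part of the page is cut out only once, which is the "avoids repeated cutting" behavior described above.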

In order to implement the character recognition process of the image, as shown in FIG. 4, in step ⑶ of the method, the character recognition module uses a VGG16 convolutional neural network together with a bidirectional LSTM neural network and performs sequence decoding with CTC (connectionist temporal classification), so that character recognition is performed on the image according to context information. The character recognition module first loads a recognition model and parses the JSON file sent by the target detection module, then performs character recognition on the images specified in the JSON file; after recognition is completed, the coordinates are paired with the text content to generate JSON-format data, which is sent to the data merging module.
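The CTC sequence decoding step can be illustrated with a greedy (best-path) decoder. The source does not state which decoding strategy is used, so greedy decoding is an assumption here, as are the function name and the convention that class index 0 is the CTC blank and index `i` maps to `alphabet[i - 1]`.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(probs: Sequence[Sequence[float]], alphabet: str) -> str:
    """Decode per-time-step class probabilities into a string.

    Takes the most likely class at each time step, merges consecutive
    repeats, then removes blanks.
    """
    best_path: List[int] = [
        max(range(len(step)), key=step.__getitem__) for step in probs
    ]
    chars: List[str] = []
    previous = BLANK
    for index in best_path:
        # Emit a character only when it is not blank and not a repeat.
        if index != BLANK and index != previous:
            chars.append(alphabet[index - 1])
        previous = index
    return "".join(chars)
```

The blank symbol is what allows a genuinely doubled character to be recognized: the path `a a blank a` decodes to `aa`, whereas `a a a` decodes to a single `a`.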

In order to implement the merged output of the data, as shown in FIG. 5, in step ⑷ of the method according to this embodiment, the data merging module filters and deduplicates the data sent by the character recognition module: it first deletes coordinates containing negative values together with their text to filter the data, then sorts the data in ascending order of the ordinate and the abscissa, and finally integrates the data and restores the coordinates corresponding to each text according to its position in the original image.
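A minimal sketch of the merging step is given below. It assumes that each recognized item carries its text, a `box` of `[x1, y1, x2, y2]` relative to the small image, and an `origin` giving the small image's top-left corner in the original image; these field names, and the interpretation of coordinate restoration as adding that origin, are assumptions rather than details given in the source.

```python
from typing import Dict, List


def merge_results(items: List[Dict]) -> List[Dict]:
    """Filter, deduplicate, restore and sort recognized text items."""
    seen = set()
    merged: List[Dict] = []
    for item in items:
        box = item["box"]
        # Filtering: discard entries whose coordinates contain negative values.
        if any(value < 0 for value in box):
            continue
        # Restoration: shift the box back to its position in the original image.
        ox, oy = item["origin"]
        restored = (box[0] + ox, box[1] + oy, box[2] + ox, box[3] + oy)
        # Deduplication: keep one entry per (text, restored box) pair.
        key = (item["text"], restored)
        if key in seen:
            continue
        seen.add(key)
        merged.append({"text": item["text"], "box": list(restored)})
    # Sorting: ascending ordinate first, then ascending abscissa.
    merged.sort(key=lambda entry: (entry["box"][1], entry["box"][0]))
    return merged
```

Sorting by the ordinate and then the abscissa yields the text in reading order, top to bottom and left to right, before it is serialized into the agreed communication format.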

The scheme uses the mature TensorFlow as its neural network framework, which allows fast weight computation over large amounts of data, to implement character recognition for electronic faxes: the noise that fax machines and scanners introduce into an image is reduced, and the tables in the image are then located and segmented for the subsequent character recognition. As shown in FIG. 1, the overall flow of the scheme is as follows. ⑴ After an image has been completely received, it is first sent to the image preprocessing module, which preprocesses it, including correcting the image angle and removing image noise, and stores the processed file in memory; the image preprocessing module is connected with the target detection module. ⑵ The target detection module reads the image from memory, runs the network to detect the tables, and passes the detection results on. ⑶ The character recognition module performs character recognition on the small images obtained from the detection results. ⑷ The data merging module merges the recognized data into the agreed data format.

The image preprocessing module is the processing entry point of the scheme. After receiving an image, it first uses the VGG16 network to determine the orientation of the image and rotates the image to the upright orientation. It then denoises the image, mainly by an image averaging method in which the value of each pixel is changed into the average value of the region formed by that pixel and its neighboring pixels. Finally, the denoised image is converted to a single channel and placed in memory for subsequent use. The module is connected with the target detection module.
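The rotation and single-channel steps can be sketched as follows, leaving the VGG16 orientation classifier itself out of scope. The sketch assumes that the classifier reports how many degrees clockwise (0, 90, 180 or 270) the received image is rotated, and that the single-channel conversion uses the common BT.601 luma weights; neither detail is stated in the source.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)


def rotate_ccw(image: Sequence[Sequence]) -> List[List]:
    """Rotate an image given as rows of pixels by 90 degrees counterclockwise."""
    return [list(column) for column in zip(*image)][::-1]


def rotate_upright(image: Sequence[Sequence], detected_angle: int) -> List[List]:
    """Undo a clockwise rotation of `detected_angle` degrees (assumed input)."""
    if detected_angle not in (0, 90, 180, 270):
        raise ValueError("detected_angle must be 0, 90, 180 or 270")
    result = [list(row) for row in image]
    for _ in range(detected_angle // 90):
        result = rotate_ccw(result)
    return result


def to_single_channel(image: Sequence[Sequence[Pixel]]) -> List[List[int]]:
    """Convert an RGB image to one channel using BT.601 luma weights (assumed)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]
```

Because the image is brought upright before detection, later stages only have to handle horizontally oriented text and tables.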

The target detection module is the core module of the scheme. It uses an improved CTPN network with stronger noise resistance, an added target classification capability, and faster network training. For training, five anchors are removed and the sizes of the remaining anchors are increased; a softmax layer is added at the end of the network structure to classify the image regions obtained after regression, distinguishing whether a region is text content or table content. Finally, the images are converted to a single channel for training, which speeds up training. The anchors are reduced because the image has already been corrected, so multi-directional detection of text and tables is not needed; region types are added so that duplicated text and tables can be removed. In operation, the model is first loaded and target detection is performed on the image, and the model outputs the detected region types and coordinates. The module sorts the table coordinates, then deletes the text coordinates that coincide with the table coordinates to avoid repeated cutting, and finally divides the current image into a plurality of small images according to the remaining coordinates; data in a JSON structure formed by the coordinates, the small images and the types is sent to the character recognition module.
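The role of the added softmax layer can be illustrated in isolation: it turns the raw scores for a detected region into probabilities over region types, and the most probable type is taken as the region's class. The label set and its ordering (background, table, text) and the function names are assumptions for this sketch; the network that produces the scores is not shown.

```python
import math
from typing import List, Sequence, Tuple

REGION_TYPES = ("background", "table", "text")  # assumed label order


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def classify_region(logits: Sequence[float]) -> Tuple[str, float]:
    """Return the most probable region type and its probability."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return REGION_TYPES[index], probs[index]
```

With a type attached to every region, a text box that duplicates content already covered by a table box can be recognized as such and removed.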

The character recognition module performs the character recognition function, converting the character content in an image into computer character strings. This module uses VGG16 plus a bidirectional LSTM network, with sequence decoding by CTC at the end. The model is characterized by its use of bidirectional LSTM and CTC, which allows characters in the image to be recognized effectively according to context information. The module first loads the network model and parses the JSON file sent by the target detection module, and then performs character recognition on the images designated in the JSON file: the model reads the image at the specified path and performs character recognition on it. After recognition is finished, the coordinates are paired with the text content to generate JSON-format data, which is sent to the data merging module.
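The JSON hand-off around the recognition model can be sketched as follows. The field names (`regions`, `path`, `box`, `type`, `results`, `text`) are hypothetical, and the recognition model is represented by a caller-supplied function that maps an image path to text, since the trained network itself is outside the scope of the sketch.

```python
import json
from typing import Callable


def recognize_regions(payload: str, recognize: Callable[[str], str]) -> str:
    """Parse the detection JSON, recognize each image, and emit result JSON.

    `recognize` stands in for the loaded recognition model (assumption):
    it receives the path of a small image and returns the recognized text.
    """
    regions = json.loads(payload)["regions"]
    results = []
    for region in regions:
        results.append(
            {
                "type": region["type"],
                "box": region["box"],  # coordinates stay paired with the text
                "text": recognize(region["path"]),
            }
        )
    # ensure_ascii=False keeps non-ASCII characters readable in the output.
    return json.dumps({"results": results}, ensure_ascii=False)
```

Passing the model in as a function keeps the parsing and serialization logic testable without loading any network weights.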

The data merging module can filter and deduplicate the data sent by the character recognition module. Firstly, deleting coordinates and texts containing negative values, and filtering data. And then sorting the data according to the vertical coordinate and the horizontal coordinate from small to large. And finally, integrating data, and restoring the coordinate-text pair according to the position of the original image.

The scheme can quickly and accurately recognize the characters of various financial fax files. The system supports large-scale concurrent processing and multi-node deployment, can reduce the noise of input images, and correctly identifies table regions, text regions and text content. The system uses mature GPU computation, so character recognition on large images is supported and processing is accelerated. After the system finishes processing, the recognized type (background, table or text), the coordinates and the character content can be sent to a designated system. Based on these characteristics, compared with existing similar schemes, the financial electronic fax document identification system and method have prominent substantive features and represent notable progress. The financial electronic fax document identification system and method are not limited to the disclosure in the specific embodiment; the technical solutions presented in the examples can be extended based on the understanding of those skilled in the art, and simple alternatives made by those skilled in the art according to the present solution in combination with common general knowledge also fall within the scope of the present solution.

Claims

1. The financial electronic fax document identification system is characterized by comprising a server, wherein the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, the image preprocessing module comprises an image correction module and an image denoising module, the image correction module is used for correcting an input image, the image denoising module is used for denoising the input image, the preprocessed image database is used for storing and managing the preprocessed image, the target detection module is used for detecting the region content type and the coordinates of the preprocessed image, the character recognition module is used for carrying out character recognition on the region content of the preprocessed image, and the data merging module is used for processing and integrating the preprocessed and recognized data.

2. The financial electronic fax document identification method is based on a financial electronic fax document identification system, the financial electronic fax document identification system comprises a server, the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, the image preprocessing module comprises an image correction module and an image noise reduction module, and the method is characterized by comprising the following steps of:

⑴ the image correction module corrects the angle of the received image, the image noise reduction module denoises the angle-corrected image, and the image correction module adjusts the size and channel attribute of the denoised image and stores it in the preprocessed image database;

⑵ the target detection module adopts the table detection neural network to detect the preprocessed image to obtain the area data type and the coordinate of the image, and sends the area data type and the coordinate to the character recognition module;

⑶ the character recognition module divides the preprocessed image into a plurality of small images according to the received data types and coordinates, then performs character recognition by using a character recognition neural network to obtain the text content and coordinates of the image, and sends the text content and coordinates to the data merging module;

⑷ the data merging module integrates the received text content and coordinates according to the region data type, and serializes them into the designated communication protocol format.

3. The method as claimed in claim 2, wherein the preprocessed image database is an in-memory database, the image preprocessing module stores the preprocessed image in memory, and the target detection module and the character recognition module read data information of the image from memory.

4. The method of claim 2, wherein in step ⑴, the image correction module determines the orientation of the received image by using a VGG16 convolutional neural network and rotates the image to the upright orientation, the image noise reduction module uses image averaging to change the value of each pixel to the average of the region formed by that pixel and its neighboring pixels, and the image correction module converts the denoised image to a single channel and stores it in the preprocessed image database for subsequent use.

5. The method of claim 2, wherein in step ⑵, the target detection module loads a detection model to perform target detection on the image, outputs the types and coordinates of the detected image regions, sorts the table coordinates, deletes the text coordinates coinciding with the table coordinates to avoid repeated cutting, divides the current image into a plurality of small images according to the remaining coordinates, and sends data in a JSON structure composed of the plurality of small images, the coordinates and the types to the character recognition module.

6. The method as claimed in claim 2, wherein in step ⑶, the character recognition module uses a VGG16 convolutional neural network in cooperation with a bidirectional LSTM neural network, sequence decoding is performed by CTC (connectionist temporal classification), character recognition is performed on the image according to context information, the character recognition module first loads a recognition model, parses the JSON file sent by the target detection module, then performs character recognition on the images specified in the JSON file, and after recognition pairs the coordinates with the text content to generate JSON-format data and sends the JSON-format data to the data merging module.

7. The method as claimed in claim 2, wherein in step ⑷, the data merging module filters and deduplicates the data sent by the character recognition module, first deletes coordinates containing negative values together with their text to filter the data, then sorts the data in ascending order of the ordinate and the abscissa, integrates the data, and restores the coordinates corresponding to each text according to its position in the original image.