CN110889311A - Financial electronic facsimile document identification system and method - Google Patents
Financial electronic facsimile document identification system and method
- Publication number
- CN110889311A
- Authority
- CN
- China
- Prior art keywords
- image
- module
- data
- coordinates
- character recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/243—Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/30—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Business, Economics & Management (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Development Economics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Accounting & Taxation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Economics (AREA)
- Finance (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- Technology Law (AREA)
- General Business, Economics & Management (AREA)
- Character Input (AREA)
- Character Discrimination (AREA)
Abstract
The invention relates to a financial electronic fax document identification system comprising a server. The server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, and the image preprocessing module comprises an image correction module and an image denoising module. The invention also discloses a financial electronic fax document identification method comprising image preprocessing, target detection, character recognition and data merging. The invention uses artificial intelligence to process and recognize noisy images of financial transaction documents, and is characterized by high recognition efficiency and high recognition quality.
Description
Technical Field
The invention relates to a document identification system and method, in particular to a system and a method for identifying financial electronic fax documents, and belongs to the field of financial management.
Background
At present, fax documents in the financial field are entered manually. During transaction peaks they cannot be entered efficiently in large batches, which prolongs and delays the whole transaction, so the financial field urgently needs to improve efficiency. The character recognition systems currently on the market can only recognize characters in natural scenes and cannot process images produced by devices such as fax machines and scanners. Such images contain noise that differs from the noise found in natural scenes, and most transaction documents in the financial industry are images of forms, so the need to segment and detect tables also leaves the existing character recognition systems unable to handle them.
Disclosure of Invention
The invention discloses a system and a method for identifying financial electronic fax documents. The new scheme uses artificial intelligence to process and recognize noisy images of financial transaction documents, and solves the low recognition efficiency and low quality that result from manual entry or general-purpose character recognition systems in existing schemes.
The financial electronic fax document identification system of the invention comprises a server. The server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, and the image preprocessing module comprises an image correction module and an image denoising module. The image correction module corrects the input image, and the image denoising module denoises the input image. The preprocessed image database stores and manages the preprocessed images. The target detection module detects the content type and coordinates of each region of the preprocessed image. The character recognition module performs character recognition on the region content of the preprocessed image. The data merging module processes and integrates the preprocessed and recognized data.
The invention also discloses a financial electronic fax document identification method based on a financial electronic fax document identification system. The system comprises a server; the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, and the image preprocessing module comprises an image correction module and an image denoising module. The method comprises the following steps:
⑴ the image correction module corrects the angle of the received image, the image denoising module denoises the angle-corrected image, and the image correction module adjusts the size and channel attributes of the denoised image and stores it in the preprocessed image database;
⑵ the target detection module detects the preprocessed image with a table detection neural network to obtain the region data types and coordinates of the image, and sends them to the character recognition module;
⑶ the character recognition module divides the preprocessed image into a plurality of small images according to the received data types and coordinates, then performs character recognition with a character recognition neural network to obtain the text content and coordinates of the image, and sends them to the data merging module;
⑷ the data merging module integrates the received text content and coordinates according to the region data type, and serializes them into the format specified by the communication protocol.
Further, the preprocessed image database of the method is an in-memory database: the image preprocessing module stores the preprocessed image in memory, and the target detection module and the character recognition module read the image data from memory.
Further, in step ⑴ of the method, the image correction module determines the orientation of the received image with a VGG16 convolutional neural network and rotates the image to the upright orientation; the image denoising module uses image averaging to change the value of each pixel to the mean of the region formed by that pixel and its neighboring pixels; and the image correction module converts the denoised image to a single channel and stores it in the preprocessed image database for subsequent use.
Further, in step ⑵ of the method, the target detection module loads the detection model, performs target detection on the image, and outputs the types and coordinates of the detected image regions. The target detection module sorts the table coordinates, deletes the text coordinates that coincide with table coordinates to avoid repeated cutting, divides the current image into a plurality of small images according to the remaining coordinates, and sends data in a JSON structure formed by the small images, the coordinates and the types to the character recognition module.
Further, in step ⑶ of the method, the character recognition module uses a VGG16 convolutional neural network together with a bidirectional LSTM neural network, performs sequence decoding with CTC (connectionist temporal classification), and recognizes the characters in the image according to context information. The character recognition module first loads a recognition model and parses the JSON file sent by the target detection module, then performs character recognition on the images specified in the JSON file. After recognition, it pairs the coordinates with the text content to generate JSON-format data and sends the data to the data merging module.
Further, in step ⑷ of the method, the data merging module filters and deduplicates the data sent by the character recognition module. It first deletes coordinates containing negative values together with their text, then sorts the data, integrates the data in ascending order of ordinate and abscissa, and restores the coordinates corresponding to each text according to its position in the original image.
The system and method for identifying financial electronic fax documents use artificial intelligence to process and recognize noisy images of financial transaction documents, and are characterized by high recognition efficiency and high recognition quality.
Drawings
FIG. 1 is a flow chart of the financial electronic fax document identification method.
FIG. 2 is a flow chart of image preprocessing.
FIG. 3 is a flow chart of target detection.
FIG. 4 is a flow chart of character recognition.
FIG. 5 is a flow chart of data merging.
Detailed Description
As shown in FIG. 1, the financial electronic fax document identification system of the invention comprises a server. The server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, and the image preprocessing module comprises an image correction module and an image denoising module. The image correction module corrects the input image, and the image denoising module denoises the input image. The preprocessed image database stores and manages the preprocessed images. The target detection module detects the content type and coordinates of each region of the preprocessed image. The character recognition module performs character recognition on the region content of the preprocessed image. The data merging module processes and integrates the preprocessed and recognized data. By using artificial intelligence to process and recognize noisy images of financial transaction documents, the scheme greatly improves the recognition efficiency and quality for financial electronic fax documents.
The invention also discloses a financial electronic fax document identification method based on a financial electronic fax document identification system. The system comprises a server; the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, and the image preprocessing module comprises an image correction module and an image denoising module. The method comprises the following steps:
⑴ the image correction module corrects the angle of the received image, the image denoising module denoises the angle-corrected image, and the image correction module adjusts the size and channel attributes of the denoised image and stores it in the preprocessed image database;
⑵ the target detection module detects the preprocessed image with a table detection neural network to obtain the region data types and coordinates of the image, and sends them to the character recognition module;
⑶ the character recognition module divides the preprocessed image into a plurality of small images according to the received data types and coordinates, then performs character recognition with a character recognition neural network to obtain the text content and coordinates of the image, and sends them to the data merging module;
⑷ the data merging module integrates the received text content and coordinates according to the region data type, and serializes them into the format specified by the communication protocol.
By using artificial intelligence to process and recognize noisy images of financial transaction documents, the scheme greatly improves the recognition efficiency and quality for financial electronic fax documents.
So that the files produced by preprocessing can be retrieved more quickly, the preprocessed image database of the method is an in-memory database: the image preprocessing module stores the preprocessed image in memory, and the target detection module and the character recognition module read the image data from memory.
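The in-memory hand-off between modules can be illustrated with a minimal sketch. The patent does not name a specific in-memory database, so the `InMemoryImageStore` class, its method names and the use of random keys below are all hypothetical; a real deployment across several nodes would more plausibly rely on a shared in-memory service.

```python
import threading
import uuid


class InMemoryImageStore:
    """Hypothetical in-process stand-in for the 'in-memory database'."""

    def __init__(self):
        self._images = {}
        self._lock = threading.Lock()  # modules may run in separate threads

    def put(self, image):
        """Store a preprocessed image and return the key used to read it back."""
        key = uuid.uuid4().hex
        with self._lock:
            self._images[key] = image
        return key

    def get(self, key):
        """Return the image stored under `key` (raises KeyError if absent)."""
        with self._lock:
            return self._images[key]

    def delete(self, key):
        """Drop an image once every downstream module has finished with it."""
        with self._lock:
            self._images.pop(key, None)


store = InMemoryImageStore()
key = store.put([[0, 255], [255, 0]])  # written by the preprocessing module
image = store.get(key)                 # read by detection and recognition
```

Passing only the key between modules keeps the messages small, since the pixel data itself never leaves memory.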
To implement image preprocessing, as shown in FIG. 2, in step ⑴ of the method of this embodiment the image correction module determines the orientation of the received image with a VGG16 convolutional neural network and rotates the image to the upright orientation; the image denoising module uses image averaging to change the value of each pixel to the mean of the region formed by that pixel and its neighboring pixels; and the image correction module converts the denoised image to a single channel and stores it in the preprocessed image database for subsequent use.
To implement target detection on the image, as shown in FIG. 3, in step ⑵ of the method of this embodiment the target detection module loads the detection model, performs target detection on the image, and outputs the types and coordinates of the detected image regions. The target detection module sorts the table coordinates, deletes the text coordinates that coincide with table coordinates to avoid repeated cutting, divides the current image into a plurality of small images according to the remaining coordinates, and sends data in a JSON structure formed by the small images, the coordinates and the types to the character recognition module.
To implement character recognition on the image, as shown in FIG. 4, in step ⑶ of the method the character recognition module uses a VGG16 convolutional neural network together with a bidirectional LSTM neural network, performs sequence decoding with CTC (connectionist temporal classification), and recognizes the characters in the image according to context information. The character recognition module first loads a recognition model and parses the JSON file sent by the target detection module, then performs character recognition on the images specified in the JSON file. After recognition, it pairs the coordinates with the text content to generate JSON-format data and sends the data to the data merging module.
To implement the merged output of the data, as shown in FIG. 5, in step ⑷ of the method of this embodiment the data merging module filters and deduplicates the data sent by the character recognition module. It first deletes coordinates containing negative values together with their text, then sorts the data, integrates the data in ascending order of ordinate and abscissa, and restores the coordinates corresponding to each text according to its position in the original image.
The scheme uses the mature TensorFlow as its neural network framework, which allows weight calculations on massive data to be carried out quickly in order to recognize the characters of electronic faxes: the image noise produced by fax machines and scanners is first reduced, and the tables in the image are then located and segmented for subsequent character recognition. As shown in FIG. 1, the overall flow of the scheme is as follows. ⑴ After an image has been completely received, it is first sent to the image preprocessing module, which corrects the image angle, removes image noise and stores the processed file in memory; the image preprocessing module is connected to the target detection module. ⑵ The target detection module reads the image from memory and runs the detection network to detect the tables. ⑶ The image is divided into small images according to the detection results, and the character recognition module recognizes the characters in each small image. ⑷ The data merging module merges the recognition data into the designated data format.
The image preprocessing module is the entry point of the scheme. After receiving an image, it first uses the VGG16 network to determine the orientation of the image and rotates the image to the upright orientation. It then denoises the image, mainly by image averaging: the value of each pixel is changed to the mean of the region formed by that pixel and its neighboring pixels. Finally, the denoised image is converted to a single channel and placed in memory for subsequent use. This module is connected to the target detection module.
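The preprocessing steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the patented implementation: the orientation classifier (VGG16 in the patent) is replaced by a `quarter_turns` argument supplied by the caller, the neighborhood is assumed to be a square window of radius 1, and the function names are hypothetical. The sketch converts to a single channel before averaging; because both operations are plain averages, this ordering gives the same values as averaging first.

```python
def rotate90(img, quarter_turns):
    """Rotate an image (list of rows) counter-clockwise in 90-degree steps."""
    for _ in range(quarter_turns % 4):
        # Transpose, then reverse the row order: one counter-clockwise turn.
        img = [list(row) for row in zip(*img)][::-1]
    return [list(row) for row in img]


def to_single_channel(rgb):
    """Collapse each multi-channel pixel to the mean of its channels."""
    return [[sum(px) / len(px) for px in row] for row in rgb]


def mean_filter(gray, radius=1):
    """Replace each pixel by the mean of itself and its neighbors."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [gray[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) / len(window)  # window is clipped at borders
    return out


def preprocess(rgb, quarter_turns):
    """Rotate upright, convert to one channel, then denoise by averaging.

    `quarter_turns` stands in for the output of the orientation classifier.
    """
    upright = rotate90(rgb, quarter_turns)
    return mean_filter(to_single_channel(upright))
```

With a 3 x 3 image that is black except for a center pixel of value 9, the center becomes 9 / 9 = 1.0 and each corner becomes 9 / 4 = 2.25, because the corner windows are clipped to four pixels.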
The target detection module is the core module of the scheme. It uses an improved CTPN network with enhanced noise resistance, an added target classification capability and a faster training speed. For training, the number of anchors is reduced by five and the sizes of the remaining anchors are increased, and a softmax layer is added at the end of the network structure to classify the regressed image regions, distinguishing whether a region contains text content or table content. Finally, the image is converted to a single channel for training, which speeds up training. The anchors can be reduced because the image has already been corrected, so multi-directional detection of text and tables is unnecessary; the region type is added so that duplicated text and tables can be removed. In operation, the model is first loaded and target detection is performed on the image, and the model outputs the detected region types and coordinates. The module sorts the table coordinates, then deletes the text coordinates that coincide with table coordinates to avoid repeated cutting. Finally, the current image is divided into a plurality of small images according to the remaining coordinates, and data in a JSON structure formed by the coordinates, the small images and the types is sent to the character recognition module.
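The post-processing that follows detection (sorting tables, dropping text boxes that coincide with tables, cutting, and emitting JSON) might look like the sketch below. The patent does not define "coincide" precisely, so the 0.5 overlap threshold, the `[x1, y1, x2, y2]` box layout, the field names and the way small images are referenced by an identifier are all assumptions made for illustration.

```python
import json


def overlap_fraction(box, other):
    """Fraction of the area of `box` covered by `other`; boxes are [x1, y1, x2, y2]."""
    ix = max(0, min(box[2], other[2]) - max(box[0], other[0]))
    iy = max(0, min(box[3], other[3]) - max(box[1], other[1]))
    area = (box[2] - box[0]) * (box[3] - box[1])
    return (ix * iy) / area if area > 0 else 0.0


def postprocess_regions(regions, threshold=0.5):
    """Sort tables top-to-bottom and drop text boxes that coincide with a table."""
    tables = sorted(
        (r for r in regions if r["type"] == "table"),
        key=lambda r: (r["box"][1], r["box"][0]),
    )
    texts = [
        r for r in regions
        if r["type"] == "text"
        and all(overlap_fraction(r["box"], t["box"]) < threshold for t in tables)
    ]
    return tables + texts


def crop(image, box):
    """Cut the small image for one region out of a list-of-rows image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def to_json(image_id, regions):
    """Serialize region records; small images are referenced by a hypothetical id."""
    return json.dumps([
        {"image": f"{image_id}_{i}", "type": r["type"], "box": r["box"]}
        for i, r in enumerate(regions)
    ])


regions = [
    {"type": "table", "box": [0, 50, 100, 100]},
    {"type": "text", "box": [10, 60, 40, 80]},   # inside the first table: dropped
    {"type": "text", "box": [0, 0, 100, 20]},
    {"type": "table", "box": [0, 30, 100, 45]},
]
kept = postprocess_regions(regions)
payload = to_json("fax001", kept)
```

Dropping the text box that lies inside a table means the table is cut once as a whole, which is the "avoid repeated cutting" behavior described above.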
The character recognition module performs character recognition, converting the character content of an image into computer character strings. This module uses VGG16 plus a bidirectional LSTM network, followed by sequence decoding with CTC. Because the model uses a bidirectional LSTM and CTC, it can recognize the characters in an image effectively according to context information. The module first loads the network model and parses the JSON file sent by the target detection module, then performs character recognition on the images specified in the JSON file: the model reads the image at the specified path and recognizes its characters. After recognition, the coordinates are paired with the text content to generate JSON-format data, which is sent to the data merging module.
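The CTC decoding step at the end of this pipeline can be illustrated with greedy (best-path) decoding, which is one common way to decode CTC outputs; the patent does not state which decoding strategy is used, so this choice is an assumption. The per-frame scores that the VGG16 and bidirectional LSTM layers would produce are replaced here by a hand-written table, and the function name and label set are hypothetical.

```python
def ctc_greedy_decode(frame_scores, labels, blank=0):
    """Best-path CTC decoding: take the top class per frame, merge
    consecutive repeats, then drop blanks."""
    best = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    chars = []
    prev = None
    for idx in best:
        if idx != blank and idx != prev:
            chars.append(labels[idx])
        prev = idx  # a blank resets `prev`, so a repeated character can follow
    return "".join(chars)


# Index 0 is the CTC blank; the remaining labels are an illustrative alphabet.
labels = ["", "1", "2", "."]
frames = [
    [0.10, 0.80, 0.05, 0.05],  # "1"
    [0.10, 0.70, 0.10, 0.10],  # "1" again, merged with the previous frame
    [0.90, 0.05, 0.03, 0.02],  # blank separates two genuine "1" characters
    [0.20, 0.60, 0.10, 0.10],  # "1"
    [0.10, 0.10, 0.10, 0.70],  # "."
    [0.10, 0.10, 0.70, 0.10],  # "2"
]
text = ctc_greedy_decode(frames, labels)  # "11.2"
```

The blank class is what allows doubled characters, common in amounts and account numbers, to survive the merging of repeated frames.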
The data merging module can filter and deduplicate the data sent by the character recognition module. Firstly, deleting coordinates and texts containing negative values, and filtering data. And then sorting the data according to the vertical coordinate and the horizontal coordinate from small to large. And finally, integrating data, and restoring the coordinate-text pair according to the position of the original image.
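A minimal sketch of this merging step is shown below. The record layout is an assumption: each record is taken to carry a box in small-image coordinates plus an `origin` giving the position of that small image in the original image, so that restoring coordinates is a simple offset. The field names and the duplicate criterion (same restored box and same text) are likewise hypothetical.

```python
def merge_results(records):
    """Filter, deduplicate, restore and sort recognition records."""
    seen = set()
    merged = []
    for rec in records:
        x1, y1, x2, y2 = rec["box"]
        if min(x1, y1, x2, y2) < 0:
            continue  # delete coordinates containing negative values, with their text
        ox, oy = rec["origin"]
        box = (x1 + ox, y1 + oy, x2 + ox, y2 + oy)  # back to original-image position
        key = (box, rec["text"])
        if key in seen:
            continue  # deduplicate identical coordinate-text pairs
        seen.add(key)
        merged.append({"text": rec["text"], "box": list(box)})
    # Ascending by ordinate (y), then abscissa (x): reading order on the page.
    merged.sort(key=lambda r: (r["box"][1], r["box"][0]))
    return merged


records = [
    {"text": "B", "box": [5, 0, 9, 4], "origin": [50, 10]},
    {"text": "A", "box": [0, 0, 4, 4], "origin": [10, 10]},
    {"text": "A", "box": [0, 0, 4, 4], "origin": [10, 10]},   # duplicate
    {"text": "X", "box": [-1, 0, 3, 4], "origin": [0, 0]},    # negative coordinate
    {"text": "C", "box": [0, 0, 4, 4], "origin": [0, 40]},
]
result = merge_results(records)
```

Here the restored boxes sort as A, B, C: A and B share the ordinate 10 and are ordered by abscissa, and C follows with ordinate 40.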
The scheme can quickly and accurately recognize the characters of various financial fax files. The system supports large-scale concurrent processing and multi-node deployment, can reduce the noise of input images, and correctly identifies table regions, text regions and text content. The system uses mature GPU computation, so it can support character recognition on large images and accelerate processing. After processing, the system can send the recognized type (background, table or text), the coordinates and the character content to a designated system. Based on these characteristics, the financial electronic fax document identification system and method have prominent substantive features and represent notable progress compared with existing similar schemes. The system and method are not limited to what is disclosed in the specific embodiment; the technical solutions presented in the examples can be extended based on the understanding of those skilled in the art, and simple alternatives made by those skilled in the art by combining the present solution with common general knowledge also fall within the scope of the present solution.
Claims (7)
1. The financial electronic fax document identification system is characterized by comprising a server, wherein the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, the image preprocessing module comprises an image correction module and an image denoising module, the image correction module is used for correcting an input image, the image denoising module is used for denoising the input image, the preprocessed image database is used for storing and managing the preprocessed image, the target detection module is used for detecting the region content type and the coordinates of the preprocessed image, the character recognition module is used for carrying out character recognition on the region content of the preprocessed image, and the data merging module is used for processing and integrating the preprocessed and recognized data.
2. The financial electronic fax document identification method is based on a financial electronic fax document identification system, the financial electronic fax document identification system comprises a server, the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, the image preprocessing module comprises an image correction module and an image noise reduction module, and the method is characterized by comprising the following steps of:
⑴ the image correction module corrects the angle of the received image, the image denoising module denoises the angle-corrected image, and the image correction module adjusts the size and channel attributes of the denoised image and stores it in the preprocessed image database;
⑵ the target detection module adopts the table detection neural network to detect the preprocessed image to obtain the area data type and the coordinate of the image, and sends the area data type and the coordinate to the character recognition module;
⑶ the character recognition module divides the preprocessed image into a plurality of small images according to the received data types and coordinates, then performs character recognition with a character recognition neural network to obtain the text content and coordinates of the image, and sends them to the data merging module;
⑷ the data merging module integrates the received text content and coordinates according to the region data type, and serializes them into the designated communication protocol format.
3. The method as claimed in claim 2, wherein the preprocessed image database is an in-memory database, the image preprocessing module stores the preprocessed image in memory, and the target detection module and the character recognition module read the data information of the image from memory.
4. The method of claim 2, wherein in step ⑴, the image correction module determines the orientation of the received image by using a VGG16 convolutional neural network and rotates the image to the upright orientation, the image denoising module uses image averaging to change the value of each pixel to the mean of the region formed by the pixel and its neighboring pixels, and the image correction module converts the denoised image to a single channel and stores it in the preprocessed image database for subsequent use.
5. The method of claim 2, wherein in step ⑵, the target detection module loads a detection model to perform target detection on the image, outputs the types and coordinates of the detected image regions, sorts the table coordinates, deletes the text coordinates coinciding with the table coordinates to avoid repeated cutting, divides the current image into a plurality of small images according to the remaining coordinates, and sends data in a JSON structure composed of the plurality of small images, the coordinates and the types to the character recognition module.
6. The method as claimed in claim 2, wherein in step ⑶, the character recognition module uses a VGG16 convolutional neural network together with a bidirectional LSTM neural network, performs sequence decoding with CTC (connectionist temporal classification), and performs character recognition on the image according to context information; the character recognition module first loads a recognition model, parses the JSON file sent by the target detection module, then performs character recognition on the images specified in the JSON file, and after recognition pairs the coordinates with the text content to generate JSON-format data, which is sent to the data merging module.
7. The method as claimed in claim 2, wherein in step ⑷, the data merging module filters and deduplicates the data sent by the character recognition module: it first deletes coordinates containing negative values together with their text, then sorts the data, integrates the data in ascending order of ordinate and abscissa, and restores the coordinates corresponding to each text according to its position in the original image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811046027.9A CN110889311A (en) | 2018-09-07 | 2018-09-07 | Financial electronic facsimile document identification system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811046027.9A CN110889311A (en) | 2018-09-07 | 2018-09-07 | Financial electronic facsimile document identification system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110889311A (en) | 2020-03-17 |
Family ID: 69744700
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811046027.9A Pending CN110889311A (en) | 2018-09-07 | 2018-09-07 | Financial electronic facsimile document identification system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110889311A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0785215A (en) * | 1993-09-14 | 1995-03-31 | Nippon Digital Kenkyusho:Kk | Character recognizing device |
JPH1166196A (en) * | 1997-08-15 | 1999-03-09 | Ricoh Co Ltd | Document image recognition device and computer-readable recording medium where program allowing computer to function as same device is recorded |
US9471833B1 (en) * | 2012-04-03 | 2016-10-18 | Intuit Inc. | Character recognition using images at different angles |
CN108416279A (en) * | 2018-02-26 | 2018-08-17 | 阿博茨德(北京)科技有限公司 | Form analysis method and device in file and picture |
Non-Patent Citations (2)
Title |
---|
JIANGXILUNING: "chinese-ocr", 《GITHUB》 * |
WENYUAN XUE, QINGYONG LI, ZHEN ZHANG, YULEI ZHAO, HAO WANG: "Table Analysis and Information Extraction", 《2018 IEEE 16TH INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, 16TH INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, 4TH INTL CONF ON BIG DATA INTELLIGENCE AND COMPUTING AND CYBER SCIENCE AND TECHNOLOGY CONGRESS》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111652157A (en) * | 2020-06-04 | 2020-09-11 | 广东外语外贸大学 | Dictionary entry extraction and identification method for low-resource languages and general languages |
CN111860487A (en) * | 2020-07-28 | 2020-10-30 | 天津恒达文博科技股份有限公司 | Inscription marking detection and recognition system based on deep neural network |
CN111860487B (en) * | 2020-07-28 | 2022-08-19 | 天津恒达文博科技股份有限公司 | Inscription marking detection and recognition system based on deep neural network |
CN112215159A (en) * | 2020-10-13 | 2021-01-12 | 苏州工业园区报关有限公司 | International trade document splitting system based on OCR and artificial intelligence technology |
CN112418204A (en) * | 2020-11-18 | 2021-02-26 | 杭州未名信科科技有限公司 | Text recognition method, system and computer medium based on paper document |
CN112990110A (en) * | 2021-04-20 | 2021-06-18 | 数库(上海)科技有限公司 | Method for extracting key information from research report and related equipment |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110889311A (en) | Financial electronic facsimile document identification system and method | |
US9898808B1 (en) | Systems and methods for removing defects from images | |
CN108171104B (en) | Character detection method and device | |
WO2020107866A1 (en) | Text region obtaining method and apparatus, storage medium and terminal device | |
CN104751142B (en) | A kind of natural scene Method for text detection based on stroke feature | |
CN110276279B (en) | Method for detecting arbitrary-shape scene text based on image segmentation | |
CN106355174B (en) | Dynamic extraction method and system for key information of express bill | |
US11704925B2 (en) | Systems and methods for digitized document image data spillage recovery | |
CN113901952A (en) | Print form and handwritten form separated character recognition method based on deep learning | |
CN113139535A (en) | OCR document recognition method | |
CN110490185A (en) | One kind identifying improved method based on repeatedly comparison correction OCR card information | |
CN113642380A (en) | Identification technology for wireless form | |
Lu et al. | A shadow removal method for tesseract text recognition | |
Zhao et al. | An effective binarization method for disturbed camera-captured document images | |
CN114399670A (en) | Control method for extracting characters in pictures in 5G messages in real time | |
CN104715248B (en) | A kind of recognition methods to email advertisement picture | |
CN114267035A (en) | Document image processing method and system, electronic device and readable medium | |
CN113177556A (en) | Text image enhancement model, training method, enhancement method and electronic equipment | |
CN112381088A (en) | License plate recognition method and system for oil tank truck | |
CN111797838A (en) | Blind denoising system, method and device for picture documents | |
CN111445433A (en) | Method and device for detecting blank page and fuzzy page of electronic file | |
Saluja et al. | Table Detection and Extraction using OpenCV and Novel Optimization Methods | |
Liu | Degraded character recognition by image quality evaluation | |
CN111553317B (en) | Anti-fake code acquisition method and device, computer equipment and storage medium | |
CN106056031A (en) | Image segmentation algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | TA01 | Transfer of patent application right | Effective date of registration: 20230412; Address after: Room 3701, Building T2, Shenye Shangcheng (South District), No. 5001 Huanggang Road, Lianhua Yicun Community, Huafu Street, Futian District, Shenzhen City, Guangdong Province, 518035; Applicant after: Shenzhen yingshisheng Information Technology Co.,Ltd.; Address before: Room 823, 2 / F, 148 Lane 999, XINER Road, Baoshan District, Shanghai; Applicant before: Shanghai Huairuo Intelligent Technology Co.,Ltd. |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20200317 |