CN110889311A - Financial electronic facsimile document identification system and method - Google Patents

Info

Publication number
CN110889311A
Authority
CN
China
Prior art keywords
image
module
data
coordinates
character recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811046027.9A
Other languages
Chinese (zh)
Inventor
白石
郭庆河
宋嘉琪
宫路
张怀朋
高海慧
石珍珍
王子芃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yingshisheng Information Technology Co ltd
Original Assignee
Shanghai Huairuo Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Huairuo Intelligent Technology Co Ltd
Priority to CN201811046027.9A
Publication of CN110889311A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroïds
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/243 Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Accounting & Taxation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to a financial electronic fax document identification system which comprises a server, wherein the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, and the image preprocessing module comprises an image correction module and an image noise reduction module. The invention also discloses a financial electronic fax document identification method, which comprises the processes of image preprocessing, image detection, image character identification, data combination and the like. The invention adopts artificial intelligence to carry out intelligent processing and identification on the pictures with noise in financial transactions, and has the characteristics of high identification efficiency and high quality.

Description

Financial electronic facsimile document identification system and method
Technical Field
The invention relates to a document identification system and a method, in particular to a financial electronic fax document identification system and a financial electronic fax document identification method, and belongs to the field of financial management.
Background
At present, fax documents in the financial field are entered manually, so they cannot be entered effectively in large batches during peak trading periods. This prolongs the overall transaction time, and the financial field therefore urgently needs a more efficient approach. The character recognition systems currently on the market can only recognize characters in natural scenes and cannot process pictures generated by equipment such as fax machines and scanners. Pictures generated by fax machines and scanners contain noise that differs from that of natural scenes, and most transaction documents in the financial industry are pictures of forms, whose tables must be segmented and detected; existing character recognition systems on the market cannot handle either requirement.
Disclosure of Invention
The invention discloses a financial electronic fax document identification system and method. This new scheme uses artificial intelligence to process and recognize noisy images from financial transactions, and solves the problems of low recognition efficiency and low quality that arise when existing schemes rely on manual entry or general-purpose character recognition systems.
The invention relates to a financial electronic fax document identification system which comprises a server, wherein the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, the image preprocessing module comprises an image correction module and an image denoising module, the image correction module is used for correcting an input image, the image denoising module is used for denoising the input image, the preprocessed image database is used for storing and managing the preprocessed image, the target detection module is used for detecting the regional content type and the coordinates of the preprocessed image, the character recognition module is used for carrying out character recognition on the regional content of the preprocessed image, and the data merging module is used for processing and integrating the preprocessed and recognized data.
The invention also discloses a financial electronic fax document identification method based on a financial electronic fax document identification system. The system comprises a server; the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module; and the image preprocessing module comprises an image correction module and an image denoising module. The method comprises the following steps:
⑴ the image correction module corrects the angle of the received image, the image denoising module denoises the angle-corrected image, and the image correction module adjusts the size and channel attributes of the denoised image and stores it in the preprocessed image database;
⑵ the target detection module detects the preprocessed image using a table detection neural network to obtain the region data types and coordinates of the image, and sends them to the character recognition module;
⑶ the character recognition module divides the preprocessed image into a plurality of small images according to the received data types and coordinates, performs character recognition using a character recognition neural network to obtain the text content and coordinates of the image, and sends them to the data merging module;
⑷ the data merging module integrates the received text content and coordinates according to the region data type and serializes them into the format specified by the communication protocol.
Furthermore, the preprocessed image database of the method is a memory database, the image preprocessing module stores the preprocessed image into a memory, and the target detection module and the character recognition module read the data information of the image from the memory.
Further, in step ⑴ of the method, the image correction module determines the orientation of the received image using a VGG16 convolutional neural network and rotates the image upright; the image noise reduction module uses an image averaging method to change the value of each pixel to the average value of the region formed by that pixel and its neighboring pixels; and the image correction module converts the denoised image to a single channel and stores it in the preprocessed image database for subsequent use.
Further, in step ⑵ of the method, the target detection module loads the detection model, performs target detection on the image, and outputs the detected region types and coordinates. The target detection module sorts the table coordinates, deletes the text coordinates that coincide with table coordinates to avoid repeated cutting, divides the current image into a plurality of small images according to the remaining coordinates, and sends JSON-structured data composed of the small images, coordinates and types to the character recognition module.
Further, in step ⑶ of the method, the character recognition module uses a VGG16 convolutional neural network together with a bidirectional LSTM neural network, decodes the output sequence using CTC (connectionist temporal classification), and recognizes the characters in the image according to context information. The character recognition module first loads the recognition model and parses the JSON file sent by the target detection module, then performs character recognition on the images specified in the JSON file; after recognition is complete, it pairs the coordinates with the text content to generate JSON-format data and sends the data to the data merging module.
Further, in step ⑷ of the method, the data merging module filters and deduplicates the data sent by the character recognition module: it first deletes the coordinates containing negative values together with their text, then sorts the data in ascending order of the ordinate and the abscissa, and finally integrates the data and restores the coordinates corresponding to each text according to its position in the original image.
The system and the method for identifying the financial electronic fax document adopt artificial intelligence to carry out intelligent processing and identification on the financial transaction type pictures with noise, and have the characteristics of high identification efficiency and high quality.
Drawings
FIG. 1 is a flow chart of the financial electronic fax document identification method.
FIG. 2 is a flow chart of image preprocessing.
FIG. 3 is a flow chart of target detection.
FIG. 4 is a flow chart of character recognition.
FIG. 5 is a flow chart of data merging.
Detailed Description
As shown in FIG. 1, the financial electronic fax document identification system of the invention comprises a server, the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, the image preprocessing module comprises an image correction module and an image denoising module, the image correction module is used for correcting an input image, the image denoising module is used for denoising the input image, the preprocessed image database is used for storing and managing the preprocessed image, the target detection module is used for detecting the region content types and coordinates of the preprocessed image, the character recognition module is used for performing character recognition on the region content of the preprocessed image, and the data merging module is used for processing and integrating the preprocessed and recognized data. By using artificial intelligence to process and recognize noisy financial transaction pictures, the scheme greatly improves the recognition efficiency and quality for financial electronic fax documents.
The invention also discloses a financial electronic fax document identification method based on a financial electronic fax document identification system. The system comprises a server; the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module; and the image preprocessing module comprises an image correction module and an image noise reduction module. The method comprises the following steps: ⑴ the image correction module corrects the angle of the received image, the image noise reduction module denoises the angle-corrected image, and the image correction module adjusts the size and channel attributes of the denoised image and stores it in the preprocessed image database; ⑵ the target detection module detects the preprocessed image using a table detection neural network to obtain the region data types and coordinates of the image, and sends them to the character recognition module; ⑶ the character recognition module divides the preprocessed image into a plurality of small images according to the received data types and coordinates, performs character recognition using a character recognition neural network to obtain the text content and coordinates of the image, and sends them to the data merging module; ⑷ the data merging module integrates the received text content and coordinates according to the region data type and serializes them into the format specified by the communication protocol. By using artificial intelligence to process and recognize noisy financial transaction pictures, the scheme greatly improves the recognition efficiency and quality for financial electronic fax documents.
In order to acquire the file generated by preprocessing more quickly, the preprocessing image database of the method is a memory database, the image preprocessing module stores the preprocessed image into a memory, and the target detection module and the character recognition module read the data information of the image from the memory.
In order to implement the image preprocessing process, as shown in FIG. 2, in step ⑴ of the method of this embodiment, the image correction module determines the orientation of the received image using a VGG16 convolutional neural network and rotates the image upright; the image noise reduction module uses an image averaging method to change the value of each pixel to the average value of the region formed by that pixel and its neighboring pixels; and the image correction module converts the denoised image to a single channel and stores it in the preprocessed image database for subsequent use.
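The rotation and channel-reduction parts of step ⑴ can be illustrated with a minimal sketch. It assumes that the orientation classifier (VGG16 in the text, not reproduced here) reports the page as rotated clockwise by 0, 90, 180 or 270 degrees, and that the image is a nested list of RGB tuples. The function names and the BT.601 luminance weights are illustrative assumptions, not details given in the patent.

```python
def rotate_90_clockwise(image):
    """Rotate a 2D pixel grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def to_upright(image, detected_rotation):
    """Undo a detected clockwise rotation of 0, 90, 180 or 270 degrees.

    `detected_rotation` stands in for the orientation classifier's output
    (an assumption about how that output is encoded).
    """
    if detected_rotation not in (0, 90, 180, 270):
        raise ValueError("rotation must be 0, 90, 180 or 270")
    # Rotating clockwise by the remaining angle completes a full turn.
    turns = ((360 - detected_rotation) % 360) // 90
    for _ in range(turns):
        image = rotate_90_clockwise(image)
    return [list(row) for row in image]


def to_single_channel(rgb_image):
    """Collapse RGB pixels to one luminance channel (BT.601 weights, assumed)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]
```

For example, a page that arrived rotated by 90 degrees clockwise is restored by `to_upright(page, 90)`, after which `to_single_channel` yields the one-channel image that the later modules read.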
In order to implement the target detection process of the image, as shown in fig. 3, in step ⑵ of the method according to this embodiment, the target detection module loads the detection model to perform target detection on the image, outputs the type and coordinates of the detected image region, the target detection module sorts the table coordinates, deletes the text coordinates coinciding with the table coordinates, avoids repeated cutting, divides the current image into a plurality of small images according to the remaining coordinates, and sends data of a json structure formed by the plurality of small images, the coordinates, and the type to the character recognition module.
In order to implement the character recognition process of the image, as shown in FIG. 4, in step ⑶ of the method of this embodiment, the character recognition module uses a VGG16 convolutional neural network together with a bidirectional LSTM neural network, decodes the output sequence using CTC (connectionist temporal classification), and recognizes the characters in the image according to context information. The character recognition module first loads the recognition model and parses the JSON file sent by the target detection module, then performs character recognition on the images specified in the JSON file; after recognition is complete, it pairs the coordinates with the text content to generate JSON-format data and sends the data to the data merging module.
In order to implement the merged output of the data, as shown in FIG. 5, in step ⑷ of the method of this embodiment, the data merging module filters and deduplicates the data sent by the character recognition module: it first deletes the coordinates containing negative values together with their text, then sorts the data in ascending order of the ordinate and the abscissa, and finally integrates the data and restores the coordinates corresponding to each text according to its position in the original image.
The scheme uses the mature TensorFlow library as its neural network framework, which allows weights to be computed quickly over large volumes of data, and implements character recognition for electronic faxes: the noise produced by fax machines and scanners is first reduced, and the tables in the picture are then located and segmented in preparation for character recognition. As shown in FIG. 1, the overall flow of the scheme is as follows: ⑴ after the picture has been completely received, it is sent to the image preprocessing module, which corrects the picture angle, removes image noise and stores the processed file in memory, the image preprocessing module being connected to the target detection module; ⑵ the target detection module reads the picture from memory, runs the detection network to detect tables and text, cuts the detected regions into small pictures and passes them to the character recognition module; ⑶ the character recognition module performs character recognition on the small pictures and sends the results to the data merging module; ⑷ the data merging module merges the recognition data into the designated data format.
The image preprocessing module is the entry point of the scheme. After receiving a picture, it first uses the VGG16 network to determine the orientation of the picture and rotates the picture upright. It then denoises the image, mainly by an image averaging method in which the value of each pixel is changed to the average value of the region formed by that pixel and its neighborhood pixels. Finally, the denoised image is converted to a single channel and placed in memory for subsequent use. This module is connected to the target detection module.
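The image averaging step described here corresponds to what is commonly called a mean (box) filter. The sketch below is one possible reading, operating on a single-channel image stored as a nested list. The neighbourhood radius and the handling of border pixels are assumptions, since the text does not specify either.

```python
def mean_filter(image, radius=1):
    """Replace each pixel by the mean of its (2*radius+1)^2 neighbourhood.

    Border pixels are averaged over the neighbours that actually exist
    (an assumed border policy; the source does not state one).
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            count = 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            result[y][x] = round(total / count)
    return result
```

A single dark speck on a white background, such as a fax transmission artifact, is spread over its neighbours and lightened rather than left as an isolated black pixel. A larger radius smooths more aggressively but also blurs thin strokes.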
The target detection module is the core module of the scheme. It uses an improved CTPN network with enhanced noise resistance, an added target classification capability and faster training. For training, five anchors are removed and the sizes of the remaining anchors are increased; a softmax layer is added at the end of the network structure to classify the regressed picture regions, distinguishing whether a region is text content or table content. Finally, pictures are converted to a single channel for training, which speeds up training. The anchors can be reduced because the image has already been corrected, so multi-directional detection of text and tables is unnecessary; the region types are added so that duplicated text and tables can be removed. In operation, the module first loads the model and performs target detection on the image; the model outputs the detected region types and coordinates. The module sorts the table coordinates, then deletes the text coordinates that coincide with table coordinates to avoid repeated cutting, and finally divides the current image into a plurality of small images according to the remaining coordinates and sends JSON-structured data composed of the coordinates, the small images and the types to the character recognition module.
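The post-detection bookkeeping (sorting table boxes, dropping text boxes that coincide with tables, and emitting a JSON structure) might look like the following sketch. It assumes boxes are given as `[x1, y1, x2, y2]`, interprets "coincide" as any area overlap, and uses hypothetical field names and crop file names. None of these details are fixed by the text, and the detection network itself is not shown.

```python
import json


def overlaps(a, b):
    """True if boxes [x1, y1, x2, y2] share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def build_crop_manifest(regions):
    """Build the JSON payload handed to the character recognition step.

    regions: list of {"type": "table" | "text", "box": [x1, y1, x2, y2]}.
    """
    # Sort table boxes top-to-bottom, then left-to-right.
    tables = sorted((r for r in regions if r["type"] == "table"),
                    key=lambda r: (r["box"][1], r["box"][0]))
    # Keep only text boxes that do not overlap any table, so the same
    # pixels are not cut out twice.
    texts = [r for r in regions
             if r["type"] == "text"
             and not any(overlaps(r["box"], t["box"]) for t in tables)]
    manifest = [{"crop": f"crop_{i}.png", "type": r["type"], "box": r["box"]}
                for i, r in enumerate(tables + texts)]
    return json.dumps(manifest)
```

Treating any overlap as a duplicate is the strictest choice. A thresholded intersection-over-union test would be a gentler alternative if text that merely touches a table border should be kept.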
The character recognition module performs character recognition, converting the character content of a picture into computer character strings. The module uses VGG16 plus a bidirectional LSTM network, with final sequence decoding by CTC. Because the model uses a bidirectional LSTM and CTC, it can recognize the characters in a picture effectively according to context information. The module first loads the network model and parses the JSON file sent by the target detection module; the model then reads each picture at the path specified in the JSON file and performs character recognition on it. After recognition is complete, the coordinates are paired with the text content to generate JSON-format data, which is sent to the data merging module.
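The final CTC decoding step can be illustrated with greedy (best-path) decoding, the simplest CTC decoder: take the highest-scoring class at each time step, collapse consecutive repeats, and drop blanks. The text does not say which decoder the scheme uses, so this is an illustrative choice. The per-frame scores stand in for the output of the VGG16 and bidirectional LSTM layers, and reserving class 0 for the blank is an assumption.

```python
BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(frame_scores, alphabet):
    """Greedy CTC decoding.

    frame_scores: one list of per-class scores per time step.
    Class 0 is the blank; class k > 0 maps to alphabet[k - 1].
    """
    # Best class at each time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    decoded = []
    previous = BLANK
    for label in best_path:
        # Emit a character only when it is not blank and not a repeat
        # of the immediately preceding frame.
        if label != BLANK and label != previous:
            decoded.append(alphabet[label - 1])
        previous = label
    return "".join(decoded)
```

The blank is what lets a genuinely doubled character survive: the path `a a blank a b b blank` decodes to `aab`, whereas without the intervening blank the two runs of `a` would collapse into one.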
The data merging module can filter and deduplicate the data sent by the character recognition module. Firstly, deleting coordinates and texts containing negative values, and filtering data. And then sorting the data according to the vertical coordinate and the horizontal coordinate from small to large. And finally, integrating data, and restoring the coordinate-text pair according to the position of the original image.
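The merging steps can be sketched as below. The sketch assumes each recognized item carries its text, a box relative to its crop, and the crop's top-left offset in the original image. The `offset` field and the other field names are hypothetical, introduced only so that coordinates can be restored.

```python
def merge_results(items):
    """Filter, deduplicate, restore and sort recognition results.

    items: list of {"text": str, "box": [x1, y1, x2, y2], "offset": [ox, oy]},
    where `box` is relative to the crop and `offset` is the crop's top-left
    corner in the original image (hypothetical field).
    """
    seen = set()
    merged = []
    for item in items:
        if any(value < 0 for value in item["box"]):
            continue  # drop entries whose coordinates contain negative values
        ox, oy = item["offset"]
        x1, y1, x2, y2 = item["box"]
        restored = (x1 + ox, y1 + oy, x2 + ox, y2 + oy)
        key = (item["text"], restored)
        if key in seen:
            continue  # drop exact duplicates
        seen.add(key)
        merged.append({"text": item["text"], "box": list(restored)})
    # Ascending by ordinate (y), then abscissa (x): reading order.
    merged.sort(key=lambda entry: (entry["box"][1], entry["box"][0]))
    return merged
```

Here the coordinates are restored before sorting, so the final order reflects positions on the original page. The text lists sorting before restoration, which gives the same order whenever the incoming coordinates are already page-relative.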
The scheme can quickly and accurately recognize the characters of various financial fax files. The system supports large-scale concurrent processing and multi-node deployment, can reduce the noise of input pictures, and correctly identifies table regions, text regions and text content. The system uses mature GPU computation, so it can support character recognition on large pictures and process them faster. After processing is finished, the system can send the recognized type (background, table or text), the coordinates and the character content to a designated system. Based on these characteristics, compared with existing similar schemes, the financial electronic fax document identification system and method have prominent substantive features and represent notable progress. The system and method are not limited to what is disclosed in the specific embodiments: the technical solutions presented in the examples can be extended based on the understanding of those skilled in the art, and simple alternatives made by those skilled in the art according to this scheme in combination with common general knowledge also fall within the scope of this scheme.

Claims (7)

1. The financial electronic fax document identification system is characterized by comprising a server, wherein the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, the image preprocessing module comprises an image correction module and an image denoising module, the image correction module is used for correcting an input image, the image denoising module is used for denoising the input image, the preprocessed image database is used for storing and managing the preprocessed image, the target detection module is used for detecting the region content type and the coordinates of the preprocessed image, the character recognition module is used for carrying out character recognition on the region content of the preprocessed image, and the data merging module is used for processing and integrating the preprocessed and recognized data.
2. The financial electronic fax document identification method is based on a financial electronic fax document identification system, the financial electronic fax document identification system comprises a server, the server comprises an image preprocessing module, a preprocessed image database, a target detection module, a character recognition module and a data merging module, the image preprocessing module comprises an image correction module and an image noise reduction module, and the method is characterized by comprising the following steps of:
⑴ the image correction module corrects the angle of the received image, the image noise reduction module denoises the angle-corrected image, and the image correction module adjusts the size and channel attributes of the denoised image and stores it in the preprocessed image database;
⑵ the target detection module detects the preprocessed image using a table detection neural network to obtain the region data types and coordinates of the image, and sends them to the character recognition module;
⑶ the character recognition module divides the preprocessed image into a plurality of small images according to the received data types and coordinates, then performs character recognition using a character recognition neural network to obtain the text content and coordinates of the image, and sends them to the data merging module;
⑷ the data merging module integrates the received text content and coordinates according to the region data type and serializes them into the designated communication protocol format.
3. The method as claimed in claim 2, wherein the preprocessed image database is an in-memory database, the image preprocessing module stores the preprocessed image in memory, and the target detection module and the character recognition module read the data information of the image from memory.
4. The method of claim 2, wherein in step ⑴, the image correction module determines the orientation of the received image using a VGG16 convolutional neural network and rotates the image upright, the image noise reduction module uses image averaging to change the value of each pixel to the average of the region formed by that pixel and its neighboring pixels, and the image correction module converts the denoised image to a single channel and stores it in the preprocessed image database for subsequent use.
5. The method of claim 2, wherein in step ⑵, the target detection module loads a detection model to perform target detection on the image, outputs the types and coordinates of the detected image regions, sorts the table coordinates, deletes the text coordinates coinciding with table coordinates to avoid repeated cutting, divides the current image into a plurality of small images according to the remaining coordinates, and sends JSON-structured data composed of the small images, coordinates and types to the character recognition module.
6. The method as claimed in claim 2, wherein in step ⑶, the character recognition module uses a VGG16 convolutional neural network in cooperation with a bidirectional LSTM neural network, decodes the output sequence by CTC (connectionist temporal classification), and performs character recognition on the image according to context information; the character recognition module first loads a recognition model, parses the JSON file sent by the target detection module, then performs character recognition on the images specified in the JSON file, and after recognition pairs the coordinates with the text content to generate JSON-format data and sends the data to the data merging module.
7. The method as claimed in claim 2, wherein in step ⑷, the data merging module filters and deduplicates the data sent by the character recognition module: it deletes the coordinates containing negative values together with their text, sorts the data in ascending order of the ordinate and the abscissa, integrates the data, and restores the coordinates corresponding to each text according to its position in the original image.
CN201811046027.9A 2018-09-07 2018-09-07 Financial electronic facsimile document identification system and method Pending CN110889311A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811046027.9A CN110889311A (en) 2018-09-07 2018-09-07 Financial electronic facsimile document identification system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811046027.9A CN110889311A (en) 2018-09-07 2018-09-07 Financial electronic facsimile document identification system and method

Publications (1)

Publication Number Publication Date
CN110889311A (en) 2020-03-17

Family

ID=69744700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811046027.9A Pending CN110889311A (en) 2018-09-07 2018-09-07 Financial electronic facsimile document identification system and method

Country Status (1)

Country Link
CN (1) CN110889311A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111652157A (en) * 2020-06-04 2020-09-11 广东外语外贸大学 Dictionary entry extraction and identification method for low-resource languages and general languages
CN111860487A (en) * 2020-07-28 2020-10-30 天津恒达文博科技股份有限公司 Inscription marking detection and recognition system based on deep neural network
CN112215159A (en) * 2020-10-13 2021-01-12 苏州工业园区报关有限公司 International trade document splitting system based on OCR and artificial intelligence technology
CN112418204A (en) * 2020-11-18 2021-02-26 杭州未名信科科技有限公司 Text recognition method, system and computer medium based on paper document
CN112990110A (en) * 2021-04-20 2021-06-18 数库(上海)科技有限公司 Method for extracting key information from research report and related equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0785215A (en) * 1993-09-14 1995-03-31 Nippon Digital Kenkyusho:Kk Character recognizing device
JPH1166196A (en) * 1997-08-15 1999-03-09 Ricoh Co Ltd Document image recognition device and computer-readable recording medium where program allowing computer to function as same device is recorded
US9471833B1 (en) * 2012-04-03 2016-10-18 Intuit Inc. Character recognition using images at different angles
CN108416279A (en) * 2018-02-26 2018-08-17 阿博茨德(北京)科技有限公司 Form analysis method and device in file and picture

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIANGXILUNING: "chinese-ocr", 《GITHUB》 *
WENYUAN XUE, QINGYONG LI, ZHEN ZHANG, YULEI ZHAO, HAO WANG: "Table Analysis and Information Extraction", 《2018 IEEE 16TH INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, 16TH INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, 4TH INTL CONF ON BIG DATA INTELLIGENCE AND COMPUTING AND CYBER SCIENCE AND TECHNOLOGY CONGRESS》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111652157A (en) * 2020-06-04 2020-09-11 广东外语外贸大学 Dictionary entry extraction and identification method for low-resource languages and general languages
CN111860487A (en) * 2020-07-28 2020-10-30 天津恒达文博科技股份有限公司 Inscription marking detection and recognition system based on deep neural network
CN111860487B (en) * 2020-07-28 2022-08-19 天津恒达文博科技股份有限公司 Inscription marking detection and recognition system based on deep neural network
CN112215159A (en) * 2020-10-13 2021-01-12 苏州工业园区报关有限公司 International trade document splitting system based on OCR and artificial intelligence technology
CN112418204A (en) * 2020-11-18 2021-02-26 杭州未名信科科技有限公司 Text recognition method, system and computer medium based on paper document
CN112990110A (en) * 2021-04-20 2021-06-18 数库(上海)科技有限公司 Method for extracting key information from research report and related equipment

Similar Documents

Publication Publication Date Title
CN110889311A (en) Financial electronic facsimile document identification system and method
US9898808B1 (en) Systems and methods for removing defects from images
CN108171104B (en) Character detection method and device
WO2020107866A1 (en) Text region obtaining method and apparatus, storage medium and terminal device
CN104751142B (en) A kind of natural scene Method for text detection based on stroke feature
CN110276279B (en) Method for detecting arbitrary-shape scene text based on image segmentation
CN106355174B (en) Dynamic extraction method and system for key information of express bill
US11704925B2 (en) Systems and methods for digitized document image data spillage recovery
CN113901952A (en) Print form and handwritten form separated character recognition method based on deep learning
CN113139535A (en) OCR document recognition method
CN110490185A (en) One kind identifying improved method based on repeatedly comparison correction OCR card information
CN113642380A (en) Identification technology for wireless form
Lu et al. A shadow removal method for tesseract text recognition
Zhao et al. An effective binarization method for disturbed camera-captured document images
CN114399670A (en) Control method for extracting characters in pictures in 5G messages in real time
CN104715248B (en) A kind of recognition methods to email advertisement picture
CN114267035A (en) Document image processing method and system, electronic device and readable medium
CN113177556A (en) Text image enhancement model, training method, enhancement method and electronic equipment
CN112381088A (en) License plate recognition method and system for oil tank truck
CN111797838A (en) Blind denoising system, method and device for picture documents
CN111445433A (en) Method and device for detecting blank page and fuzzy page of electronic file
Saluja et al. Table Detection and Extraction using OpenCV and Novel Optimization Methods
Liu Degraded character recognition by image quality evaluation
CN111553317B (en) Anti-fake code acquisition method and device, computer equipment and storage medium
CN106056031A (en) Image segmentation algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20230412
Address after: Room 3701, Building T2, Shenye Shangcheng (South District), No. 5001 Huanggang Road, Lianhua Yicun Community, Huafu Street, Futian District, Shenzhen City, Guangdong Province, 518035
Applicant after: Shenzhen yingshisheng Information Technology Co.,Ltd.
Address before: Room 823, 2 / F, 148 Lane 999, XINER Road, Baoshan District, Shanghai
Applicant before: Shanghai Huairuo Intelligent Technology Co.,Ltd.
RJ01 Rejection of invention patent application after publication
Application publication date: 20200317