CN110263694A - Bill recognition method and device - Google Patents

Publication number: CN110263694A
Application number: CN201910510860.2A
Authority: CN (China)
Prior art keywords: bill, identified, image, images, item
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Other languages: Chinese (zh)
Inventor: 马文伟
Current Assignee: Taikang Insurance Group Co Ltd; Taikang Online Property Insurance Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Taikang Insurance Group Co Ltd; Taikang Online Property Insurance Co Ltd
Application filed by: Taikang Insurance Group Co Ltd, Taikang Online Property Insurance Co Ltd
Priority: CN201910510860.2A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12 Accounting
    • G06Q40/125 Finance or payroll
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/418 Document matching, e.g. of document images

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

This application discloses a bill recognition method and device. The method first locates the position of a bill to be identified in an image to be recognized and identifies the bill type, thereby determining the image of the bill to be identified. It then corrects that image according to the preset bill template corresponding to the bill type to obtain a corrected bill image, extracts the items to be identified from the corrected bill image, and finally recognizes the content information of those items by character recognition. However many bill images a scanned image contains, the method and device can recognize each of them separately using deep learning algorithms together with preset bill templates and parameters. Recognition is efficient and accurate, which can greatly improve the efficiency of financial reimbursement work.

Description

Bill recognition method and device
Technical field
The present invention relates to image recognition technology and, more specifically, to a bill recognition method and device.
Background art
Financial reimbursement is a process that the finance department of every enterprise handles frequently. In this process, financial staff must audit the bills (for example invoices and receipts) submitted by the employees applying for reimbursement. Applicants typically paste the bills to be reimbursed irregularly onto blank paper, which is then scanned into an electronic image for archiving. Because the traditional manual auditing of bills is labor-intensive and inefficient, techniques for electronically recognizing invoices have been developed in recent years.
Current electronic recognition algorithms for scanned bills generally follow a template-based approach: the bill image is binarized during preprocessing to exclude pixel interference from outside the bill paper and to facilitate contour detection, and text preprocessing and text recognition are performed last. However, such methods are usually applicable only to scanned images that contain a single bill, and are unsuitable for the common real-world case in which multiple bills are pasted on one scanned page.
Summary of the invention
In view of this, the present invention provides a bill recognition method and device, to overcome the problem that prior-art recognition methods are unsuitable for scanned images on which multiple bills are pasted.
To achieve the above object, the present invention provides the following technical solutions:
A bill recognition method, comprising:
locating the position of a bill to be identified in an image to be recognized and identifying the bill type of the bill to be identified, to determine the image of the bill to be identified;
correcting the image of the bill to be identified according to a preset bill template corresponding to the bill type, to obtain a corrected bill image;
extracting the items to be identified from the corrected bill image;
recognizing the content information of the items to be identified by character recognition.
Optionally, locating the position of the bill to be identified in the image to be recognized and identifying the bill type of the bill to be identified, to determine the image of the bill to be identified, comprises:
using a deep-learning-based document recognition technique together with a pre-trained bill recognition model to determine the position information and bill type of the bill to be identified in the image to be recognized, and determining the image of the bill to be identified from the position information.
Optionally, correcting the image of the bill to be identified according to the preset bill template corresponding to the bill type, to obtain the corrected bill image, comprises:
correcting the image of the bill to be identified by image matching and affine transformation according to the preset bill template corresponding to the bill type, to obtain the corrected bill image.
Optionally, correcting the image of the bill to be identified by image matching and affine transformation according to the preset bill template corresponding to the bill type, to obtain the corrected bill image, comprises:
using an image matching algorithm that combines SIFT and RANSAC to determine, in the image of the bill to be identified, the matching region most similar to a pre-stored bill template feature image;
calculating the affine transformation matrix of the matching region relative to the bill template feature image, and correcting the image of the bill to be identified by rotation, scaling and/or translation according to the affine transformation matrix, to obtain the corrected bill image.
Optionally, extracting the items to be identified from the corrected bill image comprises:
extracting item images from the image of the bill to be identified according to the predefined position coordinates of the items in the preset bill template, to obtain preliminary items;
locating the text lines within the preliminary items with a deep-learning-based text line detection algorithm, to obtain the target items to be identified.
Optionally, locating the position of the bill to be identified in the image to be recognized comprises:
feeding the image to be recognized into a fully convolutional neural network, using a deep-learning-based document recognition technique, and outputting a predicted probability map of the class of each pixel in the image to be recognized;
calculating the class of each pixel from the predicted probability map by the OTSU image segmentation algorithm, and finally extracting the contours of the foreground pixels to obtain the bill position information.
Optionally, after recognizing the content information of the items to be identified by character recognition, the method further comprises:
outputting the content of the bill to be identified as structured data according to the attribute names of the items to be identified and the recognized content information.
A bill recognition device, comprising:
a position and type determining module, configured to locate the position of a bill to be identified in an image to be recognized and identify the bill type of the bill to be identified, to determine the image of the bill to be identified;
an image correction module, configured to correct the image of the bill to be identified according to a preset bill template corresponding to the bill type, to obtain a corrected bill image;
an item extraction module, configured to extract the items to be identified from the corrected bill image;
a content recognition module, configured to recognize the content information of the items to be identified by character recognition.
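The four modules listed above form a linear pipeline that runs once per bill found in the image. The following Python sketch illustrates that composition only. The class name `BillRecognizer`, its field names and the callable signatures are hypothetical and do not appear in the patent; each stage is passed in as a plain function so that any concrete detector, corrector or recognizer could be plugged in.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class BillRecognizer:
    # Hypothetical stage signatures, one per module of the device.
    locate: Callable[[Any], List[Tuple[Any, str]]]        # image -> [(bill image, bill type)]
    correct: Callable[[Any, str], Any]                     # (bill image, bill type) -> corrected image
    extract_items: Callable[[Any, str], Dict[str, Any]]    # corrected image -> {item name: item image}
    read_text: Callable[[Any], str]                        # item image -> recognized text

    def recognize(self, image: Any) -> List[Dict[str, Any]]:
        """Run every stage once for each bill located in the input image."""
        results = []
        for bill_image, bill_type in self.locate(image):
            corrected = self.correct(bill_image, bill_type)
            items = self.extract_items(corrected, bill_type)
            results.append({
                "type": bill_type,
                "items": {name: self.read_text(img) for name, img in items.items()},
            })
        return results
```

Because `locate` returns a list, a scanned page with several pasted bills yields one result per bill, which is the multi-bill behavior the method aims for.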
A computer-readable storage medium storing a computer program which, when executed by a processor, implements any of the bill recognition methods described above.
An electronic device, comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform any of the bill recognition methods described above by executing the executable instructions.
As the above technical solutions show, compared with the prior art, the embodiments of the present invention disclose a bill recognition method and device. The method first locates the position of a bill to be identified in an image to be recognized and identifies the bill type, thereby determining the image of the bill to be identified. It then corrects that image according to the preset bill template corresponding to the bill type to obtain a corrected bill image, extracts the items to be identified from the corrected bill image, and finally recognizes the content information of those items by character recognition. However many bill images a scanned image contains, the method and device can recognize each of them separately using deep learning algorithms together with preset bill templates and parameters. Recognition is efficient and accurate, which can greatly improve the efficiency of financial reimbursement work.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a bill recognition method disclosed by an embodiment of the present invention;
Fig. 2 is a flowchart of image correction disclosed by an embodiment of the present invention;
Fig. 3 is a flowchart of extracting the items to be identified from a bill image, disclosed by an embodiment of the present invention;
Fig. 4 is a flowchart of another bill recognition method disclosed by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a bill recognition device disclosed by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the image correction module disclosed by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of the item extraction module disclosed by an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another bill recognition device disclosed by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of those embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a bill recognition method disclosed by an embodiment of the present invention. Referring to Fig. 1, the bill recognition method may include:
Step 101: locate the position of the bill to be identified in the image to be recognized and identify the bill type of the bill to be identified, to determine the image of the bill to be identified.
The bill type is, for example, a VAT invoice, a fixed-amount invoice or an official receipt.
Specifically, the deep-learning-based document recognition technique may use a fully convolutional neural network: the scanned image is input and a predicted probability map of the class of each pixel is output. The OTSU image segmentation algorithm (an efficient image binarization algorithm proposed by the Japanese scholar Otsu in 1979) then computes the class of each pixel, and finally the contours of the foreground pixels are extracted to obtain the bill position information, thereby locating the bills in the scanned image. This method replaces traditional image-processing-based bill detection algorithms such as edge detection and line extraction, and can shorten bill recognition time. Because the deep-learning-based document recognition technique recognizes different types of bills, it can locate the position coordinates of a bill to be identified even in a scanned image with a complex arrangement of pasted bills.
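The thresholding and localization steps described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the patent's implementation: it assumes the fully convolutional network has already produced a per-pixel foreground probability map (a list of rows with values in [0, 1]), and it substitutes bounding boxes of 4-connected foreground regions for contour extraction. The names `to_bin`, `otsu_bin` and `bill_boxes` are hypothetical.

```python
from collections import deque

BINS = 256


def to_bin(p):
    """Quantize a probability in [0, 1] to a histogram bin index."""
    return min(int(p * BINS), BINS - 1)


def otsu_bin(prob_map):
    """Return the bin t that maximizes between-class variance (Otsu's criterion)."""
    hist = [0] * BINS
    for row in prob_map:
        for p in row:
            hist[to_bin(p)] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0  # pixel count and weighted bin sum of the class at or below t
    for t in range(BINS - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        var_between = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def bill_boxes(prob_map):
    """Return one inclusive (x0, y0, x1, y1) box per connected foreground region."""
    t = otsu_bin(prob_map)
    mask = [[to_bin(p) > t for p in row] for row in prob_map]
    h, w = len(mask), len(mask[0])
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            # Flood-fill one region, clearing mask cells as they are visited.
            mask[y][x] = False
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx]:
                        mask[ny][nx] = False
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes
```

Each returned box corresponds to one candidate bill, so a page with several pasted bills produces several boxes. A production system would more likely use an image library's thresholding and contour routines than hand-written loops.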
Meanwhile, the deep-learning-based document recognition technique draws on transfer learning: the model needs only a small number of bill samples for training, and the trained model can then reliably identify the bill types in scanned images, avoiding dependence on a large number of training samples of the same bill type.
Based on the above, in one specific implementation, locating the position of the bill to be identified in the image to be recognized and identifying the bill type of the bill to be identified, to determine the image of the bill to be identified, may include: using a deep-learning-based document recognition technique together with a pre-trained bill recognition model to determine the position information and bill type of the bill to be identified in the image to be recognized, and determining the image of the bill to be identified from the position information.
The bill recognition model may be obtained in advance by training with an existing model trainer. During training, images of target bills are first cropped out and fed to the model trainer, which trains on the input bill image samples to obtain the bill recognition model. To ensure the recognition accuracy of the trained model, images of many target bills can be fed to the model trainer to enlarge the training set, and training can be repeated to raise accuracy until it meets the requirements of use.
Note that when a scanned image contains multiple bill images, the bill recognition method disclosed in this embodiment can recognize the different bill images in the scanned image separately and process each one individually. Specifically, in practical applications each kind of bill has its own bill recognition model. When a scanned image contains multiple bill images of the same or different types, the deep learning algorithm can apply the various pre-trained bill recognition models to all or part of the scanned image, so each individual bill image in the scanned image can be identified directly and processed separately.
In addition, this embodiment uses a scanned image as the example of the image to be recognized, but the image to be recognized is not limited to scanned images; photographs, color photocopies and the like can also be used in this embodiment. The example does not limit the protection scope of this application.
The above process of identifying the bill type and locating the bill is based on deep learning as a whole: by training a bill recognition model, the bill position is located directly in the scanned image, so that bill type and bill position are determined in a single step. The process does not require combining many algorithms and is simple and efficient.
After step 101, the method proceeds to step 102.
Step 102: correct the image of the bill to be identified according to the preset bill template corresponding to the bill type, to obtain a corrected bill image.
In this embodiment, image matching can detect the feature-rich region of the bill image, and the bill image is then corrected. This avoids the problem of bills being pasted at an angle or otherwise irregularly during manual pasting, and reduces the difficulty of recognizing the bill content. Here, "feature-rich" refers to regions of the image that are conspicuous, distinctive or easily identifiable, including color features, gradient features and so on.
Specifically, correcting the image of the bill to be identified according to the preset bill template corresponding to the bill type, to obtain the corrected bill image, may include: correcting the image of the bill to be identified by image matching and affine transformation according to the preset bill template corresponding to the bill type, to obtain the corrected bill image.
The detailed process of this correction is shown in Fig. 2, which is a flowchart of image correction disclosed by an embodiment of the present invention. As shown in Fig. 2, the image correction process may include:
Step 201: use an image matching algorithm that combines SIFT (Scale-Invariant Feature Transform) and RANSAC (Random Sample Consensus, an algorithm that estimates the parameters of a mathematical model from a sample data set containing outliers and thereby obtains the valid samples) to determine, in the image of the bill to be identified, the matching region most similar to the pre-stored bill template feature image.
The bill template feature image is a region of the bill template image that is feature-rich, is part of the bill itself, is unaffected by printed characters and is highly identifiable.
Step 202: calculate the affine transformation matrix of the matching region relative to the bill template feature image, and correct the image of the bill to be identified by rotation, scaling and/or translation according to the affine transformation matrix, to obtain the corrected bill image.
The corrected bill image is substantially consistent with the bill template image.
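Steps 201 and 202 can be illustrated with a small pure-Python RANSAC loop that estimates an affine matrix from point matches. This is a sketch under stated assumptions rather than the patent's implementation: the keypoint matches are assumed to be given already (SIFT extraction and matching are not shown), the transform is assumed to map bill-image coordinates to template coordinates, and the iteration count, tolerance and the function names `solve3`, `affine_from_3`, `apply_affine` and `ransac_affine` are all hypothetical.

```python
import math
import random


def solve3(m, r):
    """Solve the 3x3 linear system m * p = r by Cramer's rule; None if singular."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    if abs(d) < 1e-9:
        return None
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for i in range(3):
            mc[i][col] = r[i]
        solution.append(det(mc) / d)
    return solution


def affine_from_3(src, dst):
    """Fit the 2x3 affine matrix that maps three src points onto three dst points."""
    m = [[x, y, 1.0] for x, y in src]
    row_u = solve3(m, [u for u, _ in dst])
    row_v = solve3(m, [v for _, v in dst])
    if row_u is None or row_v is None:
        return None  # the three source points are collinear
    return [row_u, row_v]


def apply_affine(a, pt):
    """Apply a 2x3 affine matrix to one (x, y) point."""
    x, y = pt
    return (a[0][0] * x + a[0][1] * y + a[0][2],
            a[1][0] * x + a[1][1] * y + a[1][2])


def ransac_affine(src, dst, iters=200, tol=2.0, seed=0):
    """Return the affine matrix with the most inliers, plus the inlier indices."""
    rng = random.Random(seed)
    idx = range(len(src))
    best_a, best_inliers = None, []
    for _ in range(iters):
        sample = rng.sample(idx, 3)
        a = affine_from_3([src[i] for i in sample], [dst[i] for i in sample])
        if a is None:
            continue
        inliers = []
        for i in idx:
            u, v = apply_affine(a, src[i])
            if math.hypot(u - dst[i][0], v - dst[i][1]) <= tol:
                inliers.append(i)
        if len(inliers) > len(best_inliers):
            best_a, best_inliers = a, inliers
    return best_a, best_inliers
```

Applying the returned matrix to every pixel coordinate of the bill image (with interpolation) would produce the rotated, scaled and translated image that lines up with the template. Mismatched keypoints are simply left out of the inlier set, which is the reason for pairing feature matching with RANSAC.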
After step 102, the method proceeds to step 103.
Step 103: extract the items to be identified from the corrected bill image.
In this embodiment, coarse extraction of items based on predefined coordinates can be combined with fine extraction based on deep-learning text detection to locate the items to be identified precisely. The text detection may specifically be text line detection.
In a specific implementation, the regions to be recognized can be set in advance according to the actual reimbursement situation and needs. According to the positions of these regions in the preset bill template, the corresponding regions of the bill to be identified are cropped and then passed on for recognition.
Fig. 3 is a flowchart of extracting the items to be identified from a bill image, disclosed by an embodiment of the present invention. As shown in Fig. 3, extracting the items to be identified from the bill image may include:
Step 301: extract item images from the image of the bill to be identified according to the predefined position coordinates of the items in the preset bill template, to obtain preliminary items.
Step 302: locate the text lines within the preliminary items with a deep-learning-based text line detection algorithm, to obtain the target items to be identified.
In the above process, item images are coarsely extracted from the image of the bill to be identified according to the predefined position coordinates of the items in the preset bill template, and the deep-learning-based text line detection algorithm CTPN (Connectionist Text Proposal Network) then locates each text line precisely. CTPN first extracts deep features with a CNN (convolutional neural network), then detects text boxes with fixed-width anchors, links the features of the anchors in the same row into a sequence and feeds it into an RNN (recurrent neural network), and finally performs classification and regression with a fully connected layer and merges the correct text boxes into text lines. This seamless combination of CNN and RNN improves detection accuracy.
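The coarse-to-fine extraction above can be sketched as two small functions: a crop by template coordinates, and a merge of fixed-width text proposals into text lines. This sketch does not implement CTPN's neural network. It assumes the proposals have already been produced as `(x0, y0, x1, y1)` boxes, and it uses a simplified greedy merge; the gap and overlap thresholds and the names `crop_item`, `vertical_overlap` and `merge_anchors` are hypothetical.

```python
def crop_item(image, box):
    """Coarse extraction: cut box (x0, y0, x1, y1), end-exclusive, out of a row-major image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def vertical_overlap(a, b):
    """Shared vertical extent of two boxes divided by the smaller box height."""
    shared = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    return shared / min(a[3] - a[1], b[3] - b[1])


def merge_anchors(anchors, max_gap=50, min_overlap=0.7):
    """Fine extraction: greedily chain fixed-width proposals into text-line boxes."""
    lines = []
    for box in sorted(anchors):  # left to right, then top to bottom
        for line in lines:
            close = box[0] - line[2] <= max_gap
            if close and vertical_overlap(line, box) >= min_overlap:
                # Extend the existing line so that it covers the new proposal.
                line[1] = min(line[1], box[1])
                line[2] = max(line[2], box[2])
                line[3] = max(line[3], box[3])
                break
        else:
            lines.append(list(box))  # no compatible line found, start a new one
    return [tuple(line) for line in lines]
```

The crop narrows the search to the region where the template says an item should be, and the merge then tightens that region to the actual text, which tolerates small residual misalignment after correction.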
After step 103, the method proceeds to step 104.
Step 104: recognize the content information of the items to be identified by character recognition.
Specifically, recognizing the content information of the items to be identified by character recognition may include: recognizing the content information with a deep-learning-based end-to-end text recognition algorithm. The text recognition algorithm is the deep-learning-based end-to-end CRNN (Convolutional Recurrent Neural Network): convolutional layers extract a feature sequence from the input image, recurrent layers built on top of the convolutional network make a prediction for each frame of that feature sequence, and a transcription layer finally converts the per-frame predictions of the recurrent layers into a label sequence. CRNN combines the CNN with the RNN and trains them jointly with a single loss function. The algorithm can handle input images of arbitrary length and directly outputs the prediction for a whole line of text, which is better than traditional text recognition algorithms based on single-character recognition.
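The transcription step can be illustrated with greedy CTC decoding, the rule commonly paired with CRNN. The patent does not name the transcription rule, so this is an assumption, as are the function name `ctc_greedy_decode`, the convention that class index 0 is the blank, and the mapping of index k to `alphabet[k - 1]`.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Convert per-frame class scores into a label string.

    frame_scores: one list of class scores per frame, where index 0 is the
    blank and index k (k >= 1) stands for alphabet[k - 1].
    """
    # Pick the best-scoring class in every frame.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    chars = []
    prev = blank
    for k in best:
        # Emit a character only when it is not blank and not a repeat of the previous frame.
        if k != blank and k != prev:
            chars.append(alphabet[k - 1])
        prev = k
    return "".join(chars)
```

Collapsing repeats while treating the blank as a separator is what lets the network read a whole line without segmenting individual characters: a wide character may span several frames, yet it is emitted once.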
In this embodiment, the bill recognition algorithm is simple and intuitive, is little affected by noise on the face of the bill, and involves almost no preset empirical thresholds, so it is robust. Moreover, however many bill images a scanned image contains, each of them can be recognized separately using deep learning algorithms together with preset bill templates and parameters. Recognition is efficient and accurate, which can greatly improve the efficiency of financial reimbursement work.
On the basis of the embodiments disclosed above, Fig. 4 is a flowchart of another bill recognition method disclosed by an embodiment of the present invention. As shown in Fig. 4, the bill recognition method may include:
Step 401: locate the position of the bill to be identified in the image to be recognized and identify the bill type of the bill to be identified, to determine the image of the bill to be identified.
Step 402: correct the image of the bill to be identified according to the preset bill template corresponding to the bill type, to obtain a corrected bill image.
Step 403: extract the items to be identified from the corrected bill image.
Step 404: recognize the content information of the items to be identified by character recognition.
Step 405: output the content of the bill to be identified as structured data according to the attribute names of the items to be identified and the recognized content information.
Data structuring means combining the many text line recognition results of a bill according to their semantic information and outputting them by category as required for reimbursement. From the predefined offset coordinates of each item relative to the bill template image, the attribute corresponding to each item is known. Combining this with the semantic information of the recognized text lines allows the bill recognition results to be classified in detail, which achieves data structuring. Specifically, the recognition results can be output in structured form according to the attribute name of each item to be identified and the semantic information of the item. For example, if the attribute name of an item is the amount and the recognized content is "壹万陆仟元整" (sixteen thousand yuan exactly), the output is a reimbursement amount of 16000 yuan.
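The amount example can be made concrete with a short sketch that turns an uppercase Chinese amount into a number and wraps it in a structured record. The patent does not describe how the semantic conversion is done, so the parsing rule below is an assumption. It handles whole-yuan amounts only (角 and 分 are ignored), and the names `parse_amount` and `structure_item` and the output keys are hypothetical.

```python
DIGITS = {"零": 0, "壹": 1, "贰": 2, "叁": 3, "肆": 4,
          "伍": 5, "陆": 6, "柒": 7, "捌": 8, "玖": 9}
SMALL_UNITS = {"拾": 10, "佰": 100, "仟": 1000}
BIG_UNITS = {"万": 10_000, "亿": 100_000_000}


def parse_amount(text):
    """Convert an uppercase Chinese amount such as 壹万陆仟元整 to whole yuan."""
    total = 0    # value of the completed 万 / 亿 groups
    section = 0  # value accumulated inside the current group
    digit = 0    # last digit read, still waiting for its unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            section += (digit or 1) * SMALL_UNITS[ch]  # a bare 拾 counts as ten
            digit = 0
        elif ch in BIG_UNITS:
            total += (section + digit) * BIG_UNITS[ch]
            section = digit = 0
        elif ch in "元圆":
            break  # everything after the yuan marker is ignored in this sketch
    return total + section + digit


def structure_item(attribute, text):
    """Build a structured record for one recognized item."""
    if attribute == "amount":
        return {"attribute": attribute, "value": parse_amount(text), "unit": "yuan"}
    return {"attribute": attribute, "value": text}
```

For the example in the text, `structure_item("amount", "壹万陆仟元整")` yields a record whose value is 16000. Items with other attribute names pass through as plain text in this sketch, and a fuller implementation would add one converter per attribute type.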
The above bill recognition method can automatically organize the relevant information into structured output according to the attribute names of the items to be identified and the semantic information of the recognized content. This removes the need for manual tallying and greatly improves the efficiency of reimbursement work.
For brevity, each of the foregoing method embodiments is described as a series of combined actions, but those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
The method has been described in detail in the embodiments disclosed above. The method of the present invention can be implemented by devices of various forms, so the present invention also discloses a device, which is described in detail in the specific embodiments below.
Fig. 5 is a schematic structural diagram of a bill recognition device disclosed by an embodiment of the present invention. Referring to Fig. 5, the bill recognition device 50 may include:
A position and type determining module 501, configured to locate the position of the bill to be identified in the image to be recognized and identify the bill type of the bill to be identified, to determine the image of the bill to be identified.
Specifically, the deep-learning-based document recognition technique may use a fully convolutional neural network: the scanned image is input and a predicted probability map of the class of each pixel is output. The OTSU image segmentation algorithm then computes the class of each pixel, and finally the contours of the foreground pixels are extracted to obtain the bill position information, thereby locating the bills in the scanned image. This method replaces traditional image-processing-based bill detection algorithms such as edge detection and line extraction, and can shorten bill recognition time. Because the deep-learning-based document recognition technique recognizes different types of bills, it can locate the position coordinates of a bill to be identified even in a scanned image with a complex arrangement of pasted bills.
Meanwhile, the deep-learning-based document recognition technique draws on transfer learning: the model needs only a small number of bill samples for training, and the trained model can then reliably identify the bill types in scanned images, avoiding dependence on a large number of training samples of the same bill type.
Based on the above, in one specific implementation, the position and type determining module 501 may specifically be configured to: use a deep-learning-based document recognition technique together with a pre-trained bill recognition model to determine the position information and bill type of the bill to be identified in the image to be recognized, and determine the image of the bill to be identified from the position information.
Note that when a scanned image contains multiple bill images, the bill recognition method disclosed in this embodiment can recognize the different bill images in the scanned image separately and process each one individually.
The image correction module 502 is configured to correct the image of the bill to be identified according to a preset bill template corresponding to the bill type, to obtain a corrected bill image.
In this embodiment, feature-rich regions in the bill image can be detected by image matching technology and the bill image then corrected, which sidesteps the tilted, non-standard placement that arises when bills are pasted by hand and reduces the difficulty of recognizing the bill content.
Specifically, the image correction module 502 may be configured to: correct the image of the bill to be identified by image matching and affine transformation, according to the preset bill template corresponding to the bill type, to obtain a corrected bill image.
For the specific structure of the image correction module 502, refer to Fig. 6, which is a schematic structural diagram of the image correction module disclosed in an embodiment of the present invention. As shown in Fig. 6, the image correction module 502 may include:
An image matching module 601, configured to determine, using an image matching algorithm that combines SIFT and RANSAC, the matching region in the image of the bill to be identified that is most similar to a pre-stored bill template feature image.
Here, the bill template feature image refers to a feature-rich region of the bill template image that is part of the bill itself and is not affected by printed characters.
An affine transformation module 602, configured to calculate the affine transformation matrix of the matching region relative to the bill template feature image, and to correct the image of the bill to be identified by rotation, scaling and/or translation according to the affine transformation matrix, to obtain a corrected bill image.
The corrected bill image is substantially consistent with the bill template image.
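To make the matching and correction step concrete, the following sketch estimates an affine transform from keypoint correspondences with a basic RANSAC loop. It is only an illustrative sketch under assumptions: SIFT keypoint detection and descriptor matching are taken as already done, so the inputs are matched point pairs (bill point, template point); the function names and the parameters `iters` and `tol` are hypothetical; and each hypothesis is an exact three-point solution with no final least-squares refit.

```python
import math
import random


def solve_affine(src, dst):
    """Exact affine map sending three src points to three dst points.

    Returns (a, b, tx, c, d, ty) with x' = a*x + b*y + tx and
    y' = c*x + d*y + ty, or None when the src points are collinear.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-9:
        return None

    def solve(v1, v2, v3):
        # Cramer's rule for the system [xi, yi, 1] . [p, q, t] = vi
        p = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        q = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        t = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return p, q, t

    a, b, tx = solve(dst[0][0], dst[1][0], dst[2][0])
    c, d, ty = solve(dst[0][1], dst[1][1], dst[2][1])
    return (a, b, tx, c, d, ty)


def apply_affine(m, pt):
    """Map a point (x, y) through the affine parameters m."""
    a, b, tx, c, d, ty = m
    x, y = pt
    return (a * x + b * y + tx, c * x + d * y + ty)


def ransac_affine(src_pts, dst_pts, iters=200, tol=2.0, seed=0):
    """Return (best_transform, inlier_count) mapping src_pts onto dst_pts.

    Each iteration fits a transform to three random correspondences and
    counts how many correspondences it reproduces within `tol` pixels.
    """
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(iters):
        i, j, k = rng.sample(range(len(src_pts)), 3)
        m = solve_affine([src_pts[i], src_pts[j], src_pts[k]],
                         [dst_pts[i], dst_pts[j], dst_pts[k]])
        if m is None:
            continue
        inliers = 0
        for s, t in zip(src_pts, dst_pts):
            px, py = apply_affine(m, s)
            if math.hypot(px - t[0], py - t[1]) <= tol:
                inliers += 1
        if inliers > best_inliers:
            best, best_inliers = m, inliers
    return best, best_inliers
```

With the bill points as `src_pts` and the template points as `dst_pts`, the returned transform maps bill coordinates into template coordinates, so resampling the bill image through it yields the corrected image.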
The identification item extraction module 503 is configured to extract the items to be identified from the corrected bill image.
In this embodiment, coarse extraction of identification items based on predefined coordinates can be combined with fine extraction of identification items based on deep-learning text detection, so as to locate the items to be identified precisely. The text detection here may specifically be text line detection.
In a specific implementation, the regions to be identified can be set in advance according to the actual financial reimbursement situation and needs; the corresponding regions in the bill to be identified are then cropped according to the positions of those regions in the preset bill template, and subsequently recognized.
Fig. 7 is a schematic structural diagram of the identification item extraction module disclosed in an embodiment of the present invention. Referring to Fig. 7, in one example, the identification item extraction module 503 may include:
A coarse extraction module 701, configured to extract identification item images from the image of the bill to be identified according to the position coordinates of predefined identification items in the preset bill template, to obtain preliminary identification items.
A fine extraction module 702, configured to locate the text line positions within the preliminary identification items using a deep-learning-based text line detection algorithm, to obtain the target items to be identified.
In the above process, identification item images are first coarsely extracted from the image of the bill to be identified according to the position coordinates of the predefined identification items in the preset bill template; the position of each text line can then be located precisely using CTPN, a deep-learning-based text line detection algorithm.
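The coarse extraction step amounts to cropping fixed rectangles out of the corrected image. The sketch below shows one way this might look, assuming the image is a list of pixel rows and each template field is stored as a (left, top, right, bottom) box in template pixel coordinates; the field names, the box values and the optional `margin` parameter are hypothetical, and the CTPN-based fine extraction is not sketched.

```python
def crop_field(image, box, margin=0):
    """Crop one rectangular region from an image given as a list of rows.

    `box` is (left, top, right, bottom) with right and bottom exclusive.
    `margin` widens the box on every side to tolerate small residual
    misalignment after correction; the result is clipped to the image.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    left, top, right, bottom = box
    left = max(0, left - margin)
    top = max(0, top - margin)
    right = min(width, right + margin)
    bottom = min(height, bottom + margin)
    return [row[left:right] for row in image[top:bottom]]


def coarse_extract(image, template_fields, margin=0):
    """Return {field name: cropped region} for every predefined field."""
    return {name: crop_field(image, box, margin)
            for name, box in template_fields.items()}
```

A call such as `coarse_extract(corrected, {"amount": (2, 1, 5, 3)}, margin=1)` would return the preliminary identification item images that a text line detector then refines.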
The content identifier module 504 is configured to identify the content information of the items to be identified by character recognition technology.
Specifically, the content identifier module 504 may be configured to: identify the content information of the items to be identified using a deep-learning-based end-to-end text recognition algorithm.
In this embodiment, the algorithm of the bill recognition device is simple and intuitive, is little affected by noise on the bill face, relies on almost no preset empirical values, and is therefore robust. Moreover, no matter how many bill images are on a scanned copy, each bill image in the scanned copy can be recognized separately based on the deep learning algorithm and the preset bill templates and parameters. Recognition is efficient and accurate, and can greatly improve the efficiency of the financial reimbursement process.
On the basis of the embodiments disclosed above, Fig. 8 is a schematic structural diagram of another bill recognition device disclosed in an embodiment of the present invention. As shown in Fig. 8, the bill recognition device 80 may include:
A location type determining module 501, configured to locate the bill position of the bill to be identified in the image to be recognized and identify the bill type of the bill to be identified, so as to determine the image of the bill to be identified.
An image correction module 502, configured to correct the image of the bill to be identified according to a preset bill template corresponding to the bill type, to obtain a corrected bill image.
An identification item extraction module 503, configured to extract the items to be identified from the corrected bill image.
A content identifier module 504, configured to identify the content information of the items to be identified by character recognition technology.
A structured output module 801, configured to output the content of the bill to be identified as structured data according to the attribute name of each item to be identified and the identified content information.
Specifically, the recognition result can be output in structured form according to the attribute name of the item to be identified and the semantic information of that item. For example, if the attribute name of the item to be identified is the amount and the identified content is "壹万陆仟元整" (sixteen thousand yuan exactly), the output is a reimbursement amount of 16000 yuan.
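The amount example can be sketched as a small parser that turns a capital-form Chinese amount into an integer and wraps it in a structured record. This is a hedged illustration rather than the method of the embodiment: the function names, the record keys and the restriction to whole-yuan amounts (no 角 or 分) are assumptions made for the example.

```python
DIGITS = {"零": 0, "壹": 1, "贰": 2, "叁": 3, "肆": 4,
          "伍": 5, "陆": 6, "柒": 7, "捌": 8, "玖": 9}
SMALL_UNITS = {"拾": 10, "佰": 100, "仟": 1000}


def parse_amount(text):
    """Convert a whole-yuan capital-form amount such as 壹万陆仟元整 to an int."""
    total = 0    # value of the sections already closed by 万, 亿 or 元
    section = 0  # value accumulated since the last large unit
    digit = 0    # most recent digit not yet attached to a unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch == "万":
            total += (section + digit) * 10 ** 4
            section, digit = 0, 0
        elif ch == "亿":
            total = (total + section + digit) * 10 ** 8
            section, digit = 0, 0
        elif ch in ("元", "圆"):
            total += section + digit
            section, digit = 0, 0
        elif ch in ("整", "正"):
            continue
        else:
            raise ValueError("unsupported character in amount: " + ch)
    return total + section + digit


def structure_item(attribute_name, recognized_text):
    """Build one structured record for an amount-type item (hypothetical keys)."""
    return {"attribute": attribute_name,
            "raw_text": recognized_text,
            "value": parse_amount(recognized_text),
            "unit": "yuan"}
```

Other attribute types, such as dates or invoice numbers, would need their own normalization rules keyed on the attribute name.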
The above bill recognition device can automatically organize the relevant information into structured output according to the attribute name of each item to be identified and the semantic information of the specific identified content, which removes manual tallying and greatly improves the efficiency of reimbursement work.
Any of the bill recognition devices in the above embodiments includes a processor and a memory. The location type determining module, image correction module, identification item extraction module, content identifier module, image matching module, affine transformation module, coarse extraction module, fine extraction module, structured output module and so on in the above embodiments are stored in the memory as program modules, and the processor executes the program modules stored in the memory to implement the corresponding functions.
The processor contains a kernel, and the kernel retrieves the corresponding program module from the memory. One or more kernels may be provided, and the processing of return visit data is realized by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
An embodiment of the present invention provides a storage medium on which a program is stored; when executed by a processor, the program implements the bill recognition method described in the above embodiments.
An embodiment of the present invention provides a processor configured to run a program, wherein the program, when run, executes the bill recognition method described in the above embodiments.
Further, this embodiment provides an electronic device including a processor and a memory. The memory is configured to store executable instructions of the processor, and the processor is configured to execute the bill recognition method described in the above embodiments by executing the executable instructions.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
It should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A bill recognition method, characterized by comprising:
locating the bill position of a bill to be identified in an image to be recognized and identifying the bill type of the bill to be identified, to determine the image of the bill to be identified;
correcting the image of the bill to be identified according to a preset bill template corresponding to the bill type, to obtain a corrected bill image;
extracting the items to be identified from the corrected bill image;
identifying the content information of the items to be identified by character recognition technology.
2. The bill recognition method according to claim 1, characterized in that locating the bill position of the bill to be identified in the image to be recognized and identifying the bill type of the bill to be identified, to determine the image of the bill to be identified, comprises:
determining, based on deep-learning-based bill recognition technology in combination with a pre-trained bill recognition model, the location information and the bill type of the bill to be identified in the image to be recognized, and determining the image of the bill to be identified based on the location information.
3. The bill recognition method according to claim 1, characterized in that correcting the image of the bill to be identified according to the preset bill template corresponding to the bill type, to obtain a corrected bill image, comprises:
correcting the image of the bill to be identified by image matching and affine transformation, according to the preset bill template corresponding to the bill type, to obtain a corrected bill image.
4. The bill recognition method according to claim 3, characterized in that correcting the image of the bill to be identified by image matching and affine transformation, according to the preset bill template corresponding to the bill type, to obtain a corrected bill image, comprises:
determining, using an image matching algorithm that combines SIFT and RANSAC, the matching region in the image of the bill to be identified that is most similar to a pre-stored bill template feature image;
calculating the affine transformation matrix of the matching region relative to the bill template feature image, and correcting the image of the bill to be identified by rotation, scaling and/or translation according to the affine transformation matrix, to obtain a corrected bill image.
5. The bill recognition method according to claim 1, characterized in that extracting the items to be identified from the corrected bill image comprises:
extracting identification item images from the image of the bill to be identified according to the position coordinates of predefined identification items in the preset bill template, to obtain preliminary identification items;
locating the text line positions within the preliminary identification items using a deep-learning-based text line detection algorithm, to obtain the target items to be identified.
6. The bill recognition method according to claim 1, characterized in that locating the bill position of the bill to be identified in the image to be recognized comprises:
based on deep-learning-based bill recognition technology, inputting the image to be recognized into a fully convolutional neural network and outputting a predicted probability map of the class of each pixel in the image to be recognized;
calculating, based on the predicted probability map of the class of each pixel, the class attribute of each pixel by the OTSU image segmentation algorithm, and extracting the foreground pixel contours to obtain the bill position information.
7. The bill recognition method according to claim 1, characterized in that, after identifying the content information of the items to be identified by character recognition technology, the method further comprises:
outputting the content of the bill to be identified as structured data according to the attribute name of each item to be identified and the identified content information.
8. A bill recognition device, characterized by comprising:
a location type determining module, configured to locate the bill position of a bill to be identified in an image to be recognized and identify the bill type of the bill to be identified, to determine the image of the bill to be identified;
an image correction module, configured to correct the image of the bill to be identified according to a preset bill template corresponding to the bill type, to obtain a corrected bill image;
an identification item extraction module, configured to extract the items to be identified from the corrected bill image;
a content identifier module, configured to identify the content information of the items to be identified by character recognition technology.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the bill recognition method according to any one of claims 1-7.
10. An electronic device, characterized by comprising:
a processor; and
a memory, configured to store executable instructions of the processor;
wherein the processor is configured to execute the bill recognition method according to any one of claims 1-7 by executing the executable instructions.
CN201910510860.2A 2019-06-13 2019-06-13 A kind of bank slip recognition method and device Pending CN110263694A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910510860.2A CN110263694A (en) 2019-06-13 2019-06-13 A kind of bank slip recognition method and device

Publications (1)

Publication Number Publication Date
CN110263694A true CN110263694A (en) 2019-09-20

Family

ID=67918140

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910510860.2A Pending CN110263694A (en) 2019-06-13 2019-06-13 A kind of bank slip recognition method and device

Country Status (1)

Country Link
CN (1) CN110263694A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107977665A (en) * 2017-12-15 2018-05-01 北京科摩仕捷科技有限公司 The recognition methods of key message and computing device in a kind of invoice
CN108062756A (en) * 2018-01-29 2018-05-22 重庆理工大学 Image, semantic dividing method based on the full convolutional network of depth and condition random field
CN108664897A (en) * 2018-04-18 2018-10-16 平安科技(深圳)有限公司 Bank slip recognition method, apparatus and storage medium
CN108921166A (en) * 2018-06-22 2018-11-30 深源恒际科技有限公司 Medical bill class text detection recognition method and system based on deep neural network
CN109658584A (en) * 2018-12-14 2019-04-19 泰康保险集团股份有限公司 A kind of bill bank slip recognition method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ZHI TIAN ET AL.: "Detecting Text in Natural Image with Connectionist Text Proposal Network", 《COMPUTER VISION – ECCV 2016》 *
中国税务学会: "《2016年全国税收理论研讨文集》", 31 December 2017 *
韩九强,杨磊: "《数字图像处理 基于XAVIS组态软件》", 30 September 2018, 西安交通大学出版社 *
黄志文: "基于深度学习的***自动识别***的设计与实现", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Cited By (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674815A (en) * 2019-09-29 2020-01-10 四川长虹电器股份有限公司 Invoice image distortion correction method based on deep learning key point detection
CN110751136A (en) * 2019-11-04 2020-02-04 北京亿信华辰软件有限责任公司武汉分公司 Method for extracting value-added tax invoice information
CN111079531A (en) * 2019-11-12 2020-04-28 泰康保险集团股份有限公司 Data structured output method and device, electronic equipment and storage medium
CN111079571A (en) * 2019-11-29 2020-04-28 杭州数梦工场科技有限公司 Identification card information identification and edge detection model training method and device
CN110956166A (en) * 2019-12-02 2020-04-03 中国银行股份有限公司 Bill marking method and device
CN111178353A (en) * 2019-12-16 2020-05-19 中国建设银行股份有限公司 Image character positioning method and device
FR3105529A1 (en) * 2019-12-18 2021-06-25 Idemia Identity & Security France A method of segmenting an input image representing a document containing structured information
WO2021123209A1 (en) * 2019-12-18 2021-06-24 Carrus Gaming Method for segmenting an input image showing a document containing structured information
CN113033269B (en) * 2019-12-25 2023-08-25 华为技术服务有限公司 Data processing method and device
CN113033269A (en) * 2019-12-25 2021-06-25 华为技术服务有限公司 Data processing method and device
CN111209827A (en) * 2019-12-31 2020-05-29 中国南方电网有限责任公司 OCR (optical character recognition) bill problem recognition method and system based on feature detection
CN111209827B (en) * 2019-12-31 2023-07-14 中国南方电网有限责任公司 Method and system for OCR (optical character recognition) bill problem based on feature detection
CN111241955A (en) * 2020-01-03 2020-06-05 北京一览群智数据科技有限责任公司 Bill information extraction method and system
CN111241955B (en) * 2020-01-03 2023-05-16 北京一览群智数据科技有限责任公司 Bill information extraction method and system
CN111209856B (en) * 2020-01-06 2023-10-17 泰康保险集团股份有限公司 Invoice information identification method and device, electronic equipment and storage medium
CN111209856A (en) * 2020-01-06 2020-05-29 泰康保险集团股份有限公司 Invoice information identification method and device, electronic equipment and storage medium
CN111241974A (en) * 2020-01-07 2020-06-05 深圳追一科技有限公司 Bill information acquisition method and device, computer equipment and storage medium
CN111241974B (en) * 2020-01-07 2023-10-27 深圳追一科技有限公司 Bill information acquisition method, device, computer equipment and storage medium
CN111275037A (en) * 2020-01-09 2020-06-12 上海知达教育科技有限公司 Bill identification method and device
CN111275037B (en) * 2020-01-09 2021-06-08 上海知达教育科技有限公司 Bill identification method and device
CN111242124A (en) * 2020-01-13 2020-06-05 支付宝实验室(新加坡)有限公司 Certificate classification method, device and equipment
CN111242124B (en) * 2020-01-13 2023-10-31 支付宝实验室(新加坡)有限公司 Certificate classification method, device and equipment
CN111291752A (en) * 2020-01-22 2020-06-16 山东浪潮通软信息科技有限公司 Invoice identification method, equipment and medium
CN111428599B (en) * 2020-03-17 2023-10-20 北京子敬科技有限公司 Bill identification method, device and equipment
CN111428599A (en) * 2020-03-17 2020-07-17 北京公瑾科技有限公司 Bill identification method, device and equipment
CN111414917A (en) * 2020-03-18 2020-07-14 民生科技有限责任公司 Identification method of low-pixel-density text
CN111461099A (en) * 2020-03-27 2020-07-28 重庆农村商业银行股份有限公司 Bill identification method, system, equipment and readable storage medium
CN111461100A (en) * 2020-03-31 2020-07-28 重庆农村商业银行股份有限公司 Bill identification method and device, electronic equipment and storage medium
CN111546804B (en) * 2020-04-08 2021-03-23 远光软件股份有限公司 Automatic original bill pasting method and device
CN111546804A (en) * 2020-04-08 2020-08-18 远光软件股份有限公司 Automatic original bill pasting method and device
CN111553334A (en) * 2020-04-21 2020-08-18 招商局金融科技有限公司 Questionnaire image recognition method, electronic device, and storage medium
CN111639566A (en) * 2020-05-19 2020-09-08 浙江大华技术股份有限公司 Method and device for extracting form information
WO2021151274A1 (en) * 2020-05-20 2021-08-05 平安科技(深圳)有限公司 Image file processing method and apparatus, electronic device, and computer readable storage medium
CN111680679A (en) * 2020-06-03 2020-09-18 重庆数道科技有限公司 Automatic document identification method based on OCR
CN111832423A (en) * 2020-06-19 2020-10-27 北京邮电大学 Bill information identification method, device and system
CN111914835A (en) * 2020-07-04 2020-11-10 中信银行股份有限公司 Bill element extraction method and device, electronic equipment and readable storage medium
CN111967395A (en) * 2020-08-18 2020-11-20 中国银行股份有限公司 Bank bill identification method and device
CN111967391A (en) * 2020-08-18 2020-11-20 清华大学 Text recognition method and computer-readable storage medium for medical laboratory test reports
CN112800848A (en) * 2020-12-31 2021-05-14 中电金信软件有限公司 Structured extraction method, device and equipment of information after bill identification
CN112818951A (en) * 2021-03-11 2021-05-18 南京大学 Ticket identification method
CN112818951B (en) * 2021-03-11 2023-11-21 南京大学 Ticket identification method
CN113052161A (en) * 2021-04-30 2021-06-29 中国银行股份有限公司 Method, device and equipment for identifying bank bill text
CN113408516A (en) * 2021-06-25 2021-09-17 京东数科海益信息科技有限公司 Bill recognition device and method
CN113569650A (en) * 2021-06-29 2021-10-29 上海红檀智能科技有限公司 Unmanned aerial vehicle autonomous inspection positioning method based on electric power tower label identification

Similar Documents

Publication Publication Date Title
CN110263694A (en) A kind of bank slip recognition method and device
CN109492643B (en) Certificate identification method and device based on OCR, computer equipment and storage medium
Luo et al. Moran: A multi-object rectified attention network for scene text recognition
Yuliang et al. Detecting curve text in the wild: New dataset and new solution
WO2019238063A1 (en) Text detection and analysis method and apparatus, and device
RU2695489C1 (en) Identification of fields on an image using artificial intelligence
US8494273B2 (en) Adaptive optical character recognition on a document with distorted characters
CN108846385B (en) Image identification and correction method and device based on convolution-deconvolution neural network
CN110874618B (en) OCR template learning method and device based on small sample, electronic equipment and medium
Bulatovich et al. MIDV-2020: a comprehensive benchmark dataset for identity document analysis
CN111353491B (en) Text direction determining method, device, equipment and storage medium
CN109784342A (en) A kind of OCR recognition methods and terminal based on deep learning model
CN108681735A (en) Optical character recognition method based on convolutional neural networks deep learning model
Caldeira et al. Industrial optical character recognition system in printing quality control of hot-rolled coils identification
CN110119460A (en) Image search method, device and electronic equipment
CN109271980A (en) A kind of vehicle nameplate full information recognition methods, system, terminal and medium
WO2021232670A1 (en) Pcb component identification method and device
CN110598566A (en) Image processing method, device, terminal and computer readable storage medium
CN113158895A (en) Bill identification method and device, electronic equipment and storage medium
CN112464845A (en) Bill recognition method, equipment and computer storage medium
CN112241727A (en) Multi-ticket identification method and system and readable storage medium
CN112115907A (en) Method, device, equipment and medium for extracting structured information of fixed layout certificate
RU2633182C1 (en) Determination of text line orientation
CN111462388A (en) Bill inspection method and device, terminal equipment and storage medium
CN110796145A (en) Multi-certificate segmentation association method based on intelligent decision and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190920