CN109791641A - System and method for obtaining retransmission of electronic documents lacking necessary data - Google Patents

System and method for obtaining retransmission of electronic documents lacking necessary data

Info

Publication number
CN109791641A
Authority
CN
China
Prior art keywords
electronic document
data
template
repeating transmission
necessary data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201780060455.4A
Other languages
Chinese (zh)
Inventor
N·古兹曼
I·萨夫特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vatbox Ltd
Original Assignee
Vatbox Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/361,934 (US20170154385A1)
Application filed by Vatbox Ltd
Publication of CN109791641A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/04 Payment circuits
    • G06Q20/047 Payment circuits using payment protocols involving electronic receipts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F5/00 Methods or arrangements for data conversion without changing the order or content of the data handled
    • G06F5/01 Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/04 Billing or invoicing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12 Accounting
    • G06Q40/123 Tax preparation or submission

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A system and method for obtaining retransmission of an electronic document lacking necessary data. The method includes: creating a template for the electronic document, wherein the template is a structured dataset including at least one transaction parameter determined based on at least partially unstructured data; querying at least one data source for at least one requirement based on the template; determining, based on the template and the at least one requirement, whether the electronic document lacks at least a portion of the necessary data; retrieving completion data when it is determined that the electronic document lacks at least a portion of the necessary data; generating a retransmission request electronic document, wherein the retransmission request electronic document includes the electronic document and indicates a request to retransmit the electronic document with respect to the completion data; and sending the retransmission request electronic document to a repeater server.

Description

System and method for obtaining retransmission of electronic documents lacking necessary data
Cross reference to related applications
This application claims priority to U.S. Provisional Application No. 62/371,237, filed on August 5, 2016. This application is also a continuation-in-part (CIP) of pending U.S. Patent Application No. 15/361,934, filed on November 28, 2016. The contents of the above applications are incorporated herein by reference.
Technical field
The present invention relates generally to systems for monitoring expenses indicated in electronic documents, and more specifically to automatically retransmitting expense-evidencing electronic documents that lack necessary data.
Background technique
As enterprises increasingly rely on technology to manage business-related data such as invoice and purchase order data, suitable systems for properly managing and verifying that data have become key to success. For large enterprises in particular, the volume of data used daily can be very large, so manually reviewing and verifying it is impractical. However, discrepancies between records can cause significant problems for an enterprise, for example, failing to properly report income to the tax authorities.
Some solutions exist for automatically recognizing information in scanned documents (such as invoices and receipts) or other unstructured electronic documents (for example, unstructured text files). Such solutions often face challenges in accurately recognizing and identifying characters and other features of electronic documents. Moreover, degradation in the content quality of an input unstructured electronic document usually results in higher error rates. Thus, existing image recognition techniques are not completely accurate even under ideal conditions (i.e., very clear images), and their accuracy often drops sharply when the input image is less clear. In addition, missing or otherwise incomplete data may cause errors when the data is subsequently used. Many existing solutions cannot identify missing data unless, for example, a field in a structured dataset is incomplete.
In addition, existing image recognition solutions may be unable to accurately recognize some or all special characters (such as "!", "@", "#", "$", "%", "&", and so on). For example, some existing image recognition solutions may inaccurately identify a dash in a scanned receipt as the digit "1". In another example, some existing image recognition solutions cannot recognize special characters such as the dollar sign ($), the yen symbol, and the like.
Furthermore, such solutions may face challenges in preparing the recognized data for subsequent use. Specifically, many such solutions either produce output in an unstructured format, or can produce structured output only if the input electronic document is specifically formatted for recognition by the image recognition system. The resulting unstructured output typically cannot be processed efficiently. In particular, such unstructured output may include duplicates and may include data that requires further processing before use.
Enterprises all over the world typically spend large amounts of money on goods and services purchased by their employees in the course of business. In many cases, these transactions are eligible for refunds, because an enterprise may reclaim value-added tax (VAT) and may also deduct qualifying expenses from its corporate income tax (CIT). These expenses should be reported to the relevant tax authorities in order to recover at least part of the amounts paid.
In many cases, under the regulations of a particular jurisdiction, an enterprise must provide evidence of the expenses paid, such as receipts or invoices, along with a description of the type and amount of each expense. Expense reports and the corresponding evidence are provided to the tax authorities in order to reclaim VAT and/or to deduct the CIT related to the transactions.
If the evidence does not include one or more required items, for example, the vendor name, vendor address, vendor ID, date, or total amount, the enterprise may be unable to use the evidence for VAT reclaim and/or CIT deduction. To address this problem, enterprises must invest a great deal of time and resources to obtain full VAT reclaims and/or CIT deductions. One popular but expensive solution is to hire an accounting firm to handle this important financial matter.
Although some existing solutions provide techniques that allow enterprises to collect and analyze data related to expenses paid by employees, the use of such data remains limited. In particular, these solutions may face challenges in efficiently and accurately identifying documents that lack necessary data. Moreover, such solutions do not automatically retransmit documents identified as lacking necessary data.
It would therefore be advantageous to provide a solution that overcomes the limitations of the prior art by providing an efficient method for retransmitting expense evidence.
Summary of the invention
A summary of several example embodiments of the disclosure follows. This summary is provided for the convenience of the reader to give a basic understanding of such embodiments and does not wholly define the breadth of the disclosure. This summary is not an extensive overview of all contemplated embodiments, and is intended neither to identify key or critical elements of all embodiments nor to delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more embodiments in a simplified form as a prelude to the more detailed description presented later. For convenience, the term "some embodiments" may be used herein to refer to a single embodiment or to multiple embodiments of the disclosure.
Some embodiments disclosed herein include a method for obtaining retransmission of an electronic document lacking necessary data, the electronic document including at least partially unstructured data. The method includes: creating a template for the electronic document, wherein the template is a structured dataset including at least one transaction parameter determined based on the at least partially unstructured data; querying at least one data source for at least one requirement based on the template, wherein the at least one requirement defines the necessary data of the electronic document; determining, based on the template and the at least one requirement, whether the electronic document lacks at least a portion of the necessary data; retrieving completion data when it is determined that the electronic document lacks at least a portion of the necessary data, wherein the completion data completes the necessary data; generating a retransmission request electronic document, wherein the retransmission request electronic document includes the electronic document and indicates a request to retransmit the electronic document with respect to the completion data; and sending the retransmission request electronic document to a repeater server.
Some embodiments disclosed herein also include a non-transitory computer-readable medium having stored thereon instructions for causing a processing circuitry to execute a process for obtaining retransmission of an electronic document lacking necessary data, the electronic document including at least partially unstructured data. The process includes: creating a template for the electronic document, wherein the template is a structured dataset including at least one transaction parameter determined based on the at least partially unstructured data; querying at least one data source for at least one requirement based on the template, wherein the at least one requirement defines the necessary data of the electronic document; determining, based on the template and the at least one requirement, whether the electronic document lacks at least a portion of the necessary data; retrieving completion data when it is determined that the electronic document lacks at least a portion of the necessary data, wherein the completion data completes the necessary data; generating a retransmission request electronic document, wherein the retransmission request electronic document includes the electronic document and indicates a request to retransmit the electronic document with respect to the completion data; and sending the retransmission request electronic document to a repeater server.
Some embodiments disclosed herein also include a system for obtaining retransmission of an electronic document lacking necessary data, the electronic document including at least partially unstructured data. The system includes: a processing circuitry; and a memory containing instructions that, when executed by the processing circuitry, configure the system to: create a template for the electronic document, wherein the template is a structured dataset including at least one transaction parameter determined based on the at least partially unstructured data; query at least one data source for at least one requirement based on the template, wherein the at least one requirement defines the necessary data of the electronic document; determine, based on the template and the at least one requirement, whether the electronic document lacks at least a portion of the necessary data; retrieve completion data when it is determined that the electronic document lacks at least a portion of the necessary data, wherein the completion data completes the necessary data; generate a retransmission request electronic document, wherein the retransmission request electronic document includes the electronic document and indicates a request to retransmit the electronic document with respect to the completion data; and send the retransmission request electronic document to a repeater server.
Detailed description of the invention
The subject matter disclosed herein is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the disclosed embodiments will be apparent from the following detailed description taken in conjunction with the accompanying drawings.
Fig. 1 is a network diagram utilized to describe the various disclosed embodiments;
Fig. 2 is a schematic diagram of an electronic document repeater according to an embodiment;
Fig. 3 is a flowchart of a method for retransmitting an electronic document according to an embodiment;
Fig. 4 is a flowchart of a method for creating a dataset based on at least one electronic document according to an embodiment.
Specific embodiment
It is important to note that the embodiments disclosed herein are only examples of the many advantageous uses of the innovative teachings herein. In general, statements made in the specification of the present application do not necessarily limit any of the various claimed embodiments. Moreover, some statements may apply to some inventive features but not to others. In general, unless otherwise indicated, singular elements may be in plural and vice versa with no loss of generality. In the drawings, like numerals refer to like parts throughout the several views.
The various disclosed embodiments include a method and system for retransmitting electronic documents lacking necessary data. In an embodiment, a dataset is created based on an electronic document indicating information related to a transaction. A structured template of transaction attributes is created based on the electronic document dataset. Based on the template created for the electronic document, electronic document requirements are retrieved. Based on the template and the electronic document requirements, it is determined whether the electronic document lacks necessary data and, if so, a retransmission request electronic document is generated. The retransmission request electronic document may include a request form, the electronic document, and an indication of the necessary data. The retransmission request electronic document may also include the necessary data extracted from a database.
The disclosed embodiments allow for automatically identifying electronic documents that lack necessary data and for obtaining retransmitted versions of those electronic documents. More specifically, the disclosed embodiments include providing a structured dataset template for the electronic document, thereby allowing evidencing documents to be retrieved based on electronic evidence that is unstructured, semi-structured, or otherwise lacking a known structure. For example, the disclosed embodiments may be used to efficiently analyze scanned images of transaction receipts, allowing more accurate identification of the specific portions of evidence that lack necessary data.
Fig. 1 shows an example network diagram 100 utilized to describe the various disclosed embodiments. The network diagram 100 includes a retransmission request generator 120, a client device 130, a plurality of data sources 140-1 through 140-m (hereinafter referred to individually as a data source 140 and collectively as data sources 140, merely for simplicity), a database 150, and a repeater server 160, communicatively connected via a network 110. The network 110 may be, but is not limited to, a wireless, cellular, or wired network, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), the Internet, the World Wide Web (WWW), similar networks, and any combination thereof.
The client device 130 may be, for example, a smartphone, a mobile phone, a laptop computer, a personal computer, a tablet computer, a wearable computing device, and the like. In an example embodiment, the client device 130 is used by an employee of an enterprise and is configured to send electronic documents that may require retransmission to the retransmission request generator 120.
Each data source 140 stores data related to electronic document requirements. The data may include, but is not limited to, rules defining those requirements. In an example embodiment, the requirements include tax regulatory requirements indicating the information that an evidencing electronic document (such as a receipt or invoice) must contain for a reclaim or deduction. Tax regulatory requirements are rules and regulations set by the legislators of different jurisdictions and therefore typically differ from one jurisdiction to another.
The database 150 is a data repository containing data that can be used to retransmit complete versions of non-compliant electronic documents. Such data may include, for example, a company name, a company address, a company VAT ID number, and so on. When an electronic document lacks necessary data, the necessary data may be obtained from the database 150 and included in the retransmission request, thereby providing the repeater with the required information.
The repeater server 160 may be configured to receive a retransmission request electronic document, which includes a request form, a retransmission request, necessary data to be included in the retransmitted electronic document, and the like. In an example embodiment, the repeater server 160 may be a server of the merchant or other seller that created the original electronic document (i.e., the electronic document to be retransmitted).
In an embodiment, the retransmission request generator 120 is configured to receive an electronic document from the client device 130. The electronic document may be an evidencing electronic document such as, but not limited to, an electronic invoice, an electronic receipt, and the like. For example, the electronic document may be an image showing a scanned copy of a receipt issued by a vendor that provided goods or services to an employee of the company. In another embodiment, the retransmission request generator 120 may be configured to retrieve the electronic document from another source such as, but not limited to, an enterprise resource system (not shown), another data source, and the like.
In an embodiment, the retransmission request generator 120 is configured to create a template based on transaction parameters identified by applying machine vision to the received electronic document. To this end, the retransmission request generator 120 is configured to create a dataset based on the electronic document, which includes data at least partially lacking a known structure (for example, unstructured data, semi-structured data, or structured data with an unknown structure). The retransmission request generator 120 may be further configured to use optical character recognition (OCR) or other image processing to determine the data in the electronic document. Accordingly, the retransmission request generator 120 may include, or be communicatively connected to, a recognition processor (for example, the recognition processor 235 of Fig. 2).
In an embodiment, the retransmission request generator 120 is configured to analyze the created dataset to identify transaction parameters related to the transaction indicated in the electronic document. Each created template is a structured dataset including the transaction parameters identified for the transaction. The transaction parameters include at least a vendor identifier, and may further include, but are not limited to, a buyer name, a total transaction amount (e.g., a price), a VAT amount, a transaction date, a transaction location, and so on. The vendor identifier may be, for example, the vendor's name, a vendor identification number, a VAT ID, and the like.
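By way of illustration only, a template of this kind can be sketched as a small structured record. The class name `TransactionTemplate`, its field names, and the sample values below are assumptions made for the example and are not taken from the disclosure.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class TransactionTemplate:
    """Hypothetical structured template for one electronic document."""
    vendor_id: Optional[str] = None          # vendor name, vendor number, or VAT ID
    buyer_name: Optional[str] = None
    total_amount: Optional[float] = None
    vat_amount: Optional[float] = None
    transaction_date: Optional[str] = None
    location: Optional[str] = None

    def populated_fields(self) -> List[str]:
        """Return the names of the fields that hold a value."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]


# Illustrative values only.
template = TransactionTemplate(
    vendor_id="FR-VENDOR-001",
    total_amount=120.0,
    transaction_date="2016-08-05",
    location="France",
)
print(template.populated_fields())
```

Keeping every parameter in a named field is what later allows a requirement rule to be checked against one specific field rather than against the whole document.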
Compared with, for example, using unstructured data, using a structured template allows the electronic documents requiring retransmission to be determined more efficiently and accurately. Specifically, requirement rules are analyzed only against the relevant portions of the structured template for the electronic document (for example, the portions included in specific fields of the template), which reduces the number of times each rule is applied and reduces errors caused by applying rules to data unrelated to them. In addition, data extracted from an electronic document and organized into a template requires less memory than, for example, an image of the scanned document.
In an embodiment, the retransmission request generator 120 is configured to query one or more of the data sources 140 for electronic document requirements based on the template created for the received electronic document. To this end, the query may be based on data in one or more predetermined fields of the created template. Specifically, the data source 140 to which the query is sent may be selected according to the data in the predetermined fields. For example, if the data elements indicate that the vendor is a hotel located in France, the retransmission request generator 120 may query a website associated with the European Union (EU) for the applicable requirements.
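The selection of a data source from a predetermined template field might look like the following minimal sketch. The source names, the country list, and the `location` field are hypothetical placeholders chosen for the example.

```python
from typing import Optional

# Hypothetical mapping from a jurisdiction to the data source holding its rules.
DATA_SOURCES = {
    "EU": "eu-requirements-source",
    "US": "us-requirements-source",
}

# Illustrative subset only; a real deployment would need a complete mapping.
COUNTRY_TO_JURISDICTION = {
    "France": "EU",
    "Germany": "EU",
    "United States": "US",
}


def select_data_source(template: dict) -> Optional[str]:
    """Pick the data source to query from the template's 'location' field."""
    jurisdiction = COUNTRY_TO_JURISDICTION.get(template.get("location", ""))
    return DATA_SOURCES.get(jurisdiction)


print(select_data_source({"vendor_id": "hotel-1", "location": "France"}))
```

Returning `None` for an unknown location leaves the caller free to fall back to a default source or to flag the document for manual handling.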
For example, the requirements may specify the data that must be included in an electronic document in order to obtain a refund or tax deduction. The requirements may differ between jurisdictions and may include, but are not limited to, a buyer name, a vendor name, a total transaction amount, combinations thereof, and so on. The requirements may be stored in the form of rules that are applied to the data in the created template to determine whether the electronic document includes the necessary data and which necessary data it includes.
In an embodiment, the retransmission request generator 120 is configured to compare the data elements with the retrieved requirements. For example, the requirements may indicate that a buyer ID must be included in the electronic document, and the comparison with the template may show that the buyer ID is missing. In some embodiments, an electronic document may be determined to lack necessary data when the required data is included in the electronic document but is unclear or otherwise not legible. For example, part of an invoice may be missing some characters (such as letters or digits), leaving a data element in the expense evidence unclear.
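The comparison of template data against retrieved requirements can be sketched as follows. The per-field `confidence` score used to model "unclear" data and the 0.8 threshold are assumptions introduced for the example; the disclosure does not specify how legibility is measured.

```python
from typing import Dict, List


def find_missing_fields(template: Dict[str, dict],
                        required_fields: List[str],
                        min_confidence: float = 0.8) -> List[str]:
    """Return required fields that are absent, empty, or too unclear to trust.

    Each template entry is assumed to look like
    {"value": <str>, "confidence": <float between 0 and 1>}.
    """
    missing = []
    for name in required_fields:
        entry = template.get(name)
        if entry is None or not str(entry.get("value", "")).strip():
            missing.append(name)          # field absent or empty
        elif entry.get("confidence", 1.0) < min_confidence:
            missing.append(name)          # present but not legible enough
    return missing


template = {
    "vendor_name": {"value": "Hotel Paris", "confidence": 0.95},
    "total_amount": {"value": "12O.00", "confidence": 0.40},
}
required = ["vendor_name", "buyer_id", "total_amount"]
print(find_missing_fields(template, required))
```

Because each rule names the field it applies to, only that field is inspected, which matches the efficiency argument made for structured templates.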
In another embodiment, the retransmission request generator 120 may be configured to retrieve unstructured electronic documents indicating the requirements. To this end, the retransmission request generator 120 may be configured to perform OCR on these requirement electronic documents to extract the requirements.
In an embodiment, when it is determined that the electronic document lacks at least a portion of the necessary data, the retransmission request generator 120 is configured to extract, from the database 150, completion data that completes the necessary data. For example, if the buyer address is identified as necessary data missing from the electronic document, the retransmission request generator 120 may extract the buyer address from the database 150. According to another embodiment, the retransmission request generator 120 may be configured to extract the data from other sources (not shown), such as websites, data repositories, and the like. The completion data may be extracted based on one or more transaction parameters of the created template; for example, a transaction identifier may be used to identify data related to the transaction shown in the electronic document, and the appropriate completion data may be determined from the identified data.
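A minimal sketch of retrieving completion data is given below, with an in-memory dictionary standing in for the database 150. The record keys, field names, and values are hypothetical.

```python
from typing import Dict, List

# In-memory stand-in for the database 150; all values are illustrative.
COMPANY_RECORDS: Dict[str, Dict[str, str]] = {
    "buyer-001": {
        "buyer_name": "Example Ltd",
        "buyer_address": "1 Example Street",
        "buyer_vat_id": "VAT-000000",
    },
}


def fetch_completion_data(buyer_key: str,
                          missing_fields: List[str],
                          records: Dict[str, Dict[str, str]] = COMPANY_RECORDS) -> Dict[str, str]:
    """Return stored values for those missing fields the repository can supply."""
    record = records.get(buyer_key, {})
    return {name: record[name] for name in missing_fields if name in record}


print(fetch_completion_data("buyer-001", ["buyer_address", "vendor_name"]))
```

Fields that the repository cannot supply (here, `vendor_name`) are simply left out, so the caller can tell which gaps remain.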
In an embodiment, the retransmission request generator 120 is configured to generate a retransmission request electronic document. The retransmission request electronic document may be or may include, but is not limited to, a request form, the completion data, or both. The request form may include the electronic document and a request to retransmit an updated electronic document. In some embodiments, the request form may include or indicate the completion data. For example, based on completion data consisting of a buyer identification number required for an invoice, the retransmission request generator 120 generates a request form including the invoice and a request to retransmit the corresponding complete expense evidence.
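One possible shape for a retransmission request electronic document is sketched below as a JSON payload. The JSON layout, the key names, and the Base64 encoding of the original document are assumptions made for illustration; the disclosure does not prescribe a format.

```python
import base64
import json
from typing import Dict


def build_retransmission_request(document_bytes: bytes,
                                 completion_data: Dict[str, str]) -> str:
    """Bundle the original document, the request, and the completion data."""
    payload = {
        "request": "retransmit",
        # The original electronic document, encoded so it can travel as text.
        "original_document": base64.b64encode(document_bytes).decode("ascii"),
        # Data the retransmitted document should contain.
        "completion_data": completion_data,
    }
    return json.dumps(payload, sort_keys=True)


request_doc = build_retransmission_request(b"scanned-invoice-bytes",
                                           {"buyer_id": "B-1"})
print(request_doc)
```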
In an embodiment, the retransmission request generator 120 may be configured to send the retransmission request electronic document to the repeater server 160 and to receive the retransmitted electronic document from the repeater server 160. In some embodiments, the retransmission request generator 120 may be configured to select, from among a plurality of repeater servers (not shown), the repeater server 160 to which the retransmission request electronic document should be sent. The repeater server 160 may be selected based on, for example, a merchant identifier indicated in the electronic document to be retransmitted (for example, as indicated in a "merchant ID" field of the created template).
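Selecting a repeater server from the merchant identifier can be sketched as a simple lookup. The registry contents, the `merchant_id` field name, and the example addresses are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical registry of repeater servers keyed by merchant identifier.
REPEATER_SERVERS: Dict[str, str] = {
    "merchant-42": "https://repeater.merchant-42.example/requests",
    "merchant-77": "https://repeater.merchant-77.example/requests",
}


def select_repeater_server(template: dict,
                           registry: Dict[str, str] = REPEATER_SERVERS) -> Optional[str]:
    """Return the server address for the template's merchant, if one is known."""
    return registry.get(template.get("merchant_id", ""))


print(select_repeater_server({"merchant_id": "merchant-42"}))
```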
Fig. 2 is an example schematic diagram of the retransmission request generator 120 according to an embodiment. The retransmission request generator 120 includes a processing circuitry 210 coupled to a memory 215, a storage 220, and a network interface 240. In another embodiment, the components of the retransmission request generator 120 may be communicatively connected via a bus 250.
The processing circuitry 210 may be realized as one or more hardware logic components and circuits. For example, and without limitation, illustrative types of hardware logic components that can be used include field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-chip (SOCs), general-purpose microprocessors, microcontrollers, digital signal processors (DSPs), and the like, or any other hardware logic components that can perform calculations or other manipulations of information.
The memory 215 may be volatile (e.g., RAM), non-volatile (e.g., ROM, flash memory), or a combination thereof. In one configuration, computer-readable instructions to implement one or more embodiments disclosed herein may be stored in the storage 220.
In another embodiment, the memory 215 is configured to store software. Software shall be construed broadly to mean any type of instructions, whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise. Instructions may include code (e.g., in source code format, binary code format, executable code format, or any other suitable format of code). The instructions, when executed by the one or more processors, cause the processing circuitry 210 to perform the various processes described herein. Specifically, the instructions, when executed, cause the processing circuitry 210 to request retransmission of electronic documents lacking necessary data, as described herein.
The storage 220 may be magnetic storage, optical storage, and the like, and may be realized, for example, as flash memory or other memory technology, CD-ROM, digital versatile disks (DVDs), or any other medium that can be used to store the desired information.
The OCR processor 230 may include, but is not limited to, a feature and/or pattern recognition processor (RP) 235 configured to identify patterns, features, or both, in unstructured datasets. Specifically, in an embodiment, the OCR processor 230 is configured to identify at least characters in the unstructured data. The identified characters may be used to create a dataset including the data required for verifying the transaction.
Network interface 240 allow to retransmit request generator 120 and business system 130, database 150, the source Web 150 or its Combination is communicated, and for for example collecting metadata, fetches data, storing data etc..
It should be appreciated that embodiment described herein specific structure shown in Fig. 2 is not limited to, and without departing from disclosed Scope of embodiments in the case where other structures also can be used.
Fig. 3 is an example flowchart 300 illustrating a method for requesting retransmission of an electronic document lacking necessary data according to an embodiment. In an embodiment, the method may be performed by the retransmission request generator 120.
At S310, a data set is created based on an electronic document including data related to a transaction. The electronic document may include, but is not limited to, unstructured data, semi-structured data, structured data whose structure is unexpected or unpublished, or a combination thereof. In an embodiment, S310 may further include analyzing the electronic document using optical character recognition (OCR) to determine the data in the electronic document, identifying key fields in the data, identifying values in the data, or a combination thereof. Creating a data set based on an electronic document is described further below with respect to Fig. 4.
For example, the electronic document may be an image showing a scanned invoice. The electronic document may be received from a user device (e.g., a device used by an employee of an organization who submits the invoice as evidence of a transaction). Alternatively, the electronic document may be retrieved from a database or from an enterprise resource system.
At S320, the data set is analyzed. In an embodiment, analyzing the data set may include, but is not limited to, determining transaction parameters such as, but not limited to, at least one entity identifier (e.g., a consumer enterprise identifier, a merchant enterprise identifier, or both), information related to the transaction (e.g., a date, a time, a price, a type of goods or services sold, etc.), or both. In a further embodiment, analyzing the data set may also include identifying the transaction based on the data set. In an example implementation, the transaction parameters include at least a supplier identifier.
At S330, a template is created based on the data set. The template may be, but is not limited to, a data structure including a plurality of fields. The fields may include the identified transaction parameters. The fields may be predefined.
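The template of S330 can be pictured as a fixed record whose predefined fields are filled from the analyzed data set. The following is a minimal sketch under stated assumptions: the field names and the `create_template` helper are hypothetical and are not specified in the disclosure.

```python
from dataclasses import dataclass, fields, asdict
from typing import Optional


@dataclass
class Template:
    # Hypothetical predefined fields; the disclosure does not fix a field list.
    supplier_id: Optional[str] = None
    buyer_id: Optional[str] = None
    date: Optional[str] = None
    price: Optional[str] = None
    goods_type: Optional[str] = None


def create_template(data_set: dict) -> Template:
    """Copy identified transaction parameters into the predefined fields."""
    allowed = {f.name for f in fields(Template)}
    # Anything that is not a predefined field (e.g. a logo) is dropped.
    return Template(**{k: v for k, v in data_set.items() if k in allowed})


template = create_template(
    {"supplier_id": "S-001", "date": "12/12/2005", "logo": "ignored"}
)
```

Fields that the data set does not supply stay `None`, which makes gaps easy to detect in later steps.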
Creating the template from the electronic document allows for faster processing because of the structured nature of the created template. For example, queries and processing operations can be performed more efficiently on a structured data set than on a data set lacking such structure. Further, organizing the information from an electronic document into a structured data set can significantly reduce the amount of storage needed to retain the information contained in the electronic document. Electronic documents are often images, which require more storage space than data sets containing the same information. For example, a data set representing the data from 100,000 image electronic documents can be stored as data records in a text file. The size of such a text file would be significantly smaller than the size of the 100,000 images.
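To illustrate why structured records are easier to query than raw images, the sketch below runs a simple aggregate over a handful of records. The record layout and the `total_by_supplier` helper are assumptions made for illustration only.

```python
# Hypothetical structured records, one per electronic document.
records = [
    {"supplier_id": "S-001", "date": "2005-12-12", "price": 120.0},
    {"supplier_id": "S-002", "date": "2005-12-13", "price": 80.0},
    {"supplier_id": "S-001", "date": "2005-12-20", "price": 45.5},
]


def total_by_supplier(rows, supplier_id):
    """Sum the price field over all records of one supplier."""
    return sum(r["price"] for r in rows if r["supplier_id"] == supplier_id)


total = total_by_supplier(records, "S-001")
```

No comparable one-line query exists for a set of scanned images; each image would first have to be recognized and parsed.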
At S340, the retransmission request generator 120 queries one or more data sources to obtain electronic document requirements based on the created template. In an embodiment, the query may be based on data in one or more predetermined fields of the created template. For example, the requirements may be tax reporting requirements, and may be retrieved as one or more requirement rules that define the necessary data (e.g., data related to predetermined fields of the template).
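One way to picture the query of S340 is a lookup keyed on a predetermined template field. In this sketch the in-memory `REQUIREMENT_RULES` table, the `country` field, and the rule contents are all hypothetical stand-ins for whatever data source an implementation actually queries.

```python
# Hypothetical data source: requirement rules keyed on a template field.
REQUIREMENT_RULES = {
    "DE": ["supplier_id", "buyer_id", "date", "price", "vat_number"],
    "FR": ["supplier_id", "date", "price"],
}


def query_requirements(template: dict, source=REQUIREMENT_RULES) -> list:
    """Return the fields required for the template's country, if a rule exists."""
    return list(source.get(template.get("country"), []))


required = query_requirements({"country": "FR", "supplier_id": "S-001"})
```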
At S350, it is determined, based on the template and the obtained requirements, whether the electronic document lacks necessary data. If so, execution continues with S360; otherwise, execution terminates. In an embodiment, S350 may include applying the requirement rules to the data of the created template. For example, the electronic document may be determined to lack necessary data when the necessary data is missing, incomplete, unclear, or otherwise not indicated.
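Applying requirement rules to a template, as in S350, can be sketched as a scan over the required fields. The convention that a `?` marks an unclear OCR result is an assumption of this example, as is the `find_missing` helper itself.

```python
def find_missing(template: dict, required_fields: list) -> list:
    """List required fields that are absent, empty, or flagged as unclear."""
    missing = []
    for name in required_fields:
        value = template.get(name)
        text = "" if value is None else str(value).strip()
        # Assumed convention: '?' marks a character the OCR could not resolve.
        if not text or "?" in text:
            missing.append(name)
    return missing


template = {"supplier_id": "S-001", "date": "", "price": "1?0.00"}
missing = find_missing(template, ["supplier_id", "date", "price", "vat_number"])
```

An empty result would correspond to the branch in which execution terminates.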
At S360, when it is determined that the electronic document lacks necessary data, completion data for providing the necessary data is retrieved. The completion data may be retrieved based on the missing necessary data and one or more of the transaction parameters, where those transaction parameters uniquely identify the transaction of the electronic document, for example a transaction identifier or a combination of a time and a buyer.
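The retrieval in S360 depends on a key that uniquely identifies the transaction. The sketch below builds such a key from either a transaction identifier or a time and buyer pair, then looks up only the missing fields. The key layout, field names, and dictionary-backed source are hypothetical.

```python
def transaction_key(params: dict):
    """Build a unique key from a transaction ID, or from time plus buyer."""
    if params.get("transaction_id"):
        return ("id", params["transaction_id"])
    if params.get("time") and params.get("buyer_id"):
        return ("time_buyer", params["time"], params["buyer_id"])
    return None  # the transaction cannot be identified uniquely


def fetch_completion_data(params: dict, missing: list, source: dict) -> dict:
    """Return values for the missing fields from the matching source record."""
    record = source.get(transaction_key(params), {})
    return {name: record[name] for name in missing if name in record}


# Hypothetical source of transaction records.
source = {("id", "T-42"): {"date": "12/12/2005", "price": "120.00"}}
completion = fetch_completion_data(
    {"transaction_id": "T-42"}, ["date", "vat_number"], source
)
```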
At S370, a retransmission request electronic document is generated. In an embodiment, the retransmission request electronic document is, or includes, an electronic form generated based on the transaction parameters, the necessary data, the completion data, or a combination thereof. The electronic form may also include the electronic document lacking the necessary data. The electronic form may indicate the necessary data and the completion data to be used for creating the retransmitted electronic document, and may further indicate a request to retransmit the electronic document.
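As a rough illustration of S370, the electronic form can be modeled as a serializable record that bundles the transaction parameters, the missing data, the completion data, and a reference to the original document. The form layout and key names below are assumptions, not a format defined by the disclosure.

```python
import json


def build_retransmission_request(params, missing, completion, document_ref):
    """Assemble a hypothetical electronic form requesting retransmission."""
    return {
        "request": "retransmit_electronic_document",
        "transaction_parameters": params,
        "missing_necessary_data": sorted(missing),
        "completion_data": completion,
        "original_document": document_ref,
    }


form = build_retransmission_request(
    {"transaction_id": "T-42"},
    ["vat_number", "date"],
    {"date": "12/12/2005"},
    "invoice_0001.png",
)
payload = json.dumps(form, sort_keys=True)  # serialized form, ready to send
```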
At S380, the generated retransmission request electronic document may be sent to a retransmission server, for example a server of the supplier or merchant associated with the transaction for which the electronic document needs to be retransmitted.
Fig. 4 is an example flowchart of S310, illustrating a method for creating a data set based on an electronic document according to an embodiment.
At S410, the electronic document is obtained. Obtaining the electronic document may include, but is not limited to, receiving the electronic document (e.g., receiving a scanned image) or retrieving the electronic document (e.g., from a consumer enterprise system, a merchant enterprise system, or a database).
At S420, the electronic document is analyzed. The analysis may include, but is not limited to, using optical character recognition (OCR) to determine the characters in the electronic document.
At S430, key fields and values in the electronic document are determined based on the analysis. The key fields may include, but are not limited to, a merchant's name and address, a date, a currency, goods or services sold, a transaction identifier, an invoice number, and so on. An electronic document may include unnecessary details that are not considered key values. For example, a merchant's logo may not be required and is therefore not a key value. In an embodiment, a list of key fields may be predefined, and pieces of data matching the key fields are extracted. A cleaning process is then performed to ensure that the information is accurately presented. For example, if the OCR yields the data "1211212005", the cleaning process converts it to the date 12/12/2005. As another example, a name presented as "Mo$den" is changed to "Mosden". The cleaning process may be performed using external information resources, such as dictionaries, calendars, and the like.
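The two cleaning examples above can be sketched as follows. This is an illustrative heuristic only: it assumes the OCR read each "/" separator as "1" and that "$" stands in for "s" inside a name. The function names and the confusion table are hypothetical, and a fuller implementation would consult dictionaries or calendars as the text suggests.

```python
from datetime import datetime

# Assumed OCR confusions inside names; not an exhaustive table.
OCR_CONFUSIONS = {"$": "s"}


def clean_date(raw: str) -> str:
    """Repair a DD/MM/YYYY date whose '/' separators were read as '1'."""
    if len(raw) == 10 and raw.isdigit() and raw[2] == "1" and raw[5] == "1":
        candidate = f"{raw[:2]}/{raw[3:5]}/{raw[6:]}"
        try:
            datetime.strptime(candidate, "%d/%m/%Y")  # calendar validity check
            return candidate
        except ValueError:
            pass
    return raw  # leave anything that does not fit the pattern untouched


def clean_name(raw: str) -> str:
    """Replace known OCR confusions, but only in strings that contain letters."""
    if not any(ch.isalpha() for ch in raw):
        return raw
    return "".join(OCR_CONFUSIONS.get(ch, ch) for ch in raw)
```

For instance, `clean_date("1211212005")` yields `"12/12/2005"` and `clean_name("Mo$den")` yields `"Mosden"`, while a purely numeric string such as `"$100"` is left unchanged.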
In a further embodiment, it is checked whether the extracted pieces of data are complete. For example, if a merchant name is identified but the merchant address is missing, then the key field for the merchant address is incomplete. An attempt is then made to complete the missing key field values. This attempt may include querying external systems and databases, correlating with previously analyzed invoices, or a combination thereof. The external systems and databases may include business directories, Universal Product Code (UPC) databases, parcel delivery and tracking systems, and so on. In an embodiment, S430 results in a complete set of the predefined key fields and their respective values.
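The completion attempt can be sketched as a correlation against previously analyzed invoices: a missing key field is filled from an earlier invoice of the same merchant. Matching on `merchant_name` and the `complete_fields` helper are assumptions of this example; querying external systems is omitted.

```python
def complete_fields(extracted: dict, key_fields: list, prior_invoices: list):
    """Fill empty key fields from earlier invoices of the same merchant."""
    completed = dict(extracted)
    for name in key_fields:
        if completed.get(name):
            continue  # field is already present
        for invoice in prior_invoices:
            same_merchant = (
                invoice.get("merchant_name") == extracted.get("merchant_name")
            )
            if same_merchant and invoice.get(name):
                completed[name] = invoice[name]
                break
    still_missing = [n for n in key_fields if not completed.get(n)]
    return completed, still_missing


prior = [{"merchant_name": "Mosden", "merchant_address": "1 Main St"}]
completed, still_missing = complete_fields(
    {"merchant_name": "Mosden"},
    ["merchant_name", "merchant_address", "invoice_number"],
    prior,
)
```

Fields that cannot be completed this way remain in `still_missing` and would need another source.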
At S440, a structured data set is generated. The generated data set includes the identified key fields and values.
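Taken together, S420 through S440 turn recognized text into a structured data set. The sketch below assumes the OCR output is plain text with "Label: value" lines and uses a hypothetical predefined key field list; real invoices would need more robust matching.

```python
import re

# Hypothetical predefined key field list: OCR label -> data set field name.
KEY_FIELDS = {
    "merchant": "merchant_name",
    "date": "date",
    "invoice no": "invoice_number",
    "total": "price",
}
LINE = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def build_data_set(ocr_text: str) -> dict:
    """Keep only 'Label: value' lines whose label is a predefined key field."""
    data_set = {}
    for line in ocr_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # not a key/value line (e.g. a logo caption)
        label = match.group(1).lower()
        if label in KEY_FIELDS:
            data_set[KEY_FIELDS[label]] = match.group(2)
    return data_set


ocr_text = (
    "ACME LOGO\nMerchant: Mosden\nDate: 12/12/2005\n"
    "Invoice No: 881\nTotal: 120.00 EUR\nThank you!"
)
data_set = build_data_set(ocr_text)
```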
It should be understood that any reference herein to an element using a designation such as "first," "second," and so forth does not generally limit the quantity or order of those elements. Rather, these designations are generally used herein as a convenient way of distinguishing between two or more elements or instances of an element. Thus, a reference to first and second elements does not mean that only two elements may be employed, or that the first element must precede the second element in some manner. Also, unless stated otherwise, a set of elements comprises one or more elements.
As used herein, the phrase "at least one of" followed by a list of items means that any of the listed items can be used individually, or any combination of two or more of the listed items can be used. For example, if a system is described as including "at least one of A, B, and C," the system can include A alone; B alone; C alone; A and B in combination; B and C in combination; A and C in combination; or A, B, and C in combination.
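The enumeration in the preceding paragraph is simply the set of all non-empty combinations of the listed items. A short sketch, with a hypothetical helper name, makes the count explicit:

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of the listed items."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


options = at_least_one_of(["A", "B", "C"])
```

For three items this gives the seven alternatives listed above (2^3 - 1).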
The various embodiments disclosed herein can be implemented as hardware, firmware, software, or any combination thereof. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage unit or computer-readable medium consisting of parts, or of certain devices and/or a combination of devices. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units ("CPUs"), a memory, and input/output interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU, whether or not such a computer or processor is explicitly shown. In addition, various other peripheral units may be connected to the computer platform, such as an additional data storage unit and a printing unit. Furthermore, a non-transitory computer-readable medium is any computer-readable medium except for a transitory propagating signal.
All examples and conditional language recited herein are intended for pedagogical purposes, to aid the reader in understanding the principles of the disclosed embodiments and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the disclosed embodiments, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, such equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.

Claims (15)

1. A method for obtaining retransmission of an electronic document lacking necessary data, the electronic document including at least partially unstructured data, the method comprising:
creating a template for the electronic document, wherein the template is a structured data set including at least one transaction parameter determined based on the at least partially unstructured data;
querying at least one data source for at least one requirement based on the template, wherein the at least one requirement defines the necessary data for the electronic document;
determining, based on the template and the at least one requirement, whether the electronic document lacks at least a portion of the necessary data;
retrieving completion data when it is determined that the electronic document lacks at least a portion of the necessary data, wherein the completion data completes the necessary data;
generating a retransmission request electronic document, wherein the retransmission request electronic document includes the electronic document and indicates a request to retransmit the electronic document with respect to the completion data; and
sending the retransmission request electronic document to a retransmission server.
2. The method according to claim 1, further comprising:
identifying at least one key field and at least one value in the electronic document;
creating a data set based on the electronic document, wherein the created data set includes the at least one key field and the at least one value; and
analyzing the created data set, wherein the at least one transaction parameter is determined based on the analysis.
3. The method according to claim 2, wherein identifying the at least one key field and the at least one value further comprises:
analyzing the electronic document to determine the data in the electronic document; and
extracting, based on a predefined list of key fields, at least a portion of the determined data, wherein the at least a portion of the determined data matches at least one key field in the predefined list of key fields.
4. The method according to claim 3, wherein analyzing the electronic document further comprises:
performing optical character recognition on the electronic document.
5. The method according to claim 1, wherein the electronic document lacks at least a portion of the necessary data when at least a portion of the template corresponding to the at least one electronic document requirement is at least one of: missing, incomplete, and unclear.
6. The method according to claim 1, wherein the retransmission request electronic document is an electronic form.
7. The method according to claim 1, wherein the electronic document is an evidencing electronic document issued by a supplier, and wherein the retransmission server is a server of the supplier.
8. A non-transitory computer-readable medium having stored thereon instructions for causing a processing circuit to perform a process for obtaining retransmission of an electronic document lacking necessary data, the electronic document including at least partially unstructured data, the process comprising:
creating a template for the electronic document, wherein the template is a structured data set including at least one transaction parameter determined based on the at least partially unstructured data;
querying at least one data source for at least one requirement based on the template, wherein the at least one requirement defines the necessary data for the electronic document;
determining, based on the template and the at least one requirement, whether the electronic document lacks at least a portion of the necessary data;
retrieving completion data when it is determined that the electronic document lacks at least a portion of the necessary data, wherein the completion data completes the necessary data;
generating a retransmission request electronic document, wherein the retransmission request electronic document includes the electronic document and indicates a request to retransmit the electronic document with respect to the completion data; and
sending the retransmission request electronic document to a retransmission server.
9. A system for obtaining retransmission of an electronic document lacking necessary data, the electronic document including at least partially unstructured data, the system comprising:
a processing circuit; and
a memory, the memory containing instructions that, when executed by the processing circuit, configure the system to:
create a template for the electronic document, wherein the template is a structured data set including at least one transaction parameter determined based on the at least partially unstructured data;
query at least one data source for at least one requirement based on the template, wherein the at least one requirement defines the necessary data for the electronic document;
determine, based on the template and the at least one requirement, whether the electronic document lacks at least a portion of the necessary data;
retrieve completion data when it is determined that the electronic document lacks at least a portion of the necessary data, wherein the completion data completes the necessary data;
generate a retransmission request electronic document, wherein the retransmission request electronic document includes the electronic document and indicates a request to retransmit the electronic document with respect to the completion data; and
send the retransmission request electronic document to a retransmission server.
10. The system according to claim 9, wherein the system is further configured to:
identify at least one key field and at least one value in the electronic document;
create a data set based on the electronic document, wherein the created data set includes the at least one key field and the at least one value; and
analyze the created data set, wherein the at least one transaction parameter is determined based on the analysis.
11. The system according to claim 10, wherein the system is further configured to:
analyze the electronic document to determine the data in the electronic document; and
extract, based on a predefined list of key fields, at least a portion of the determined data, wherein the at least a portion of the determined data matches at least one key field in the predefined list of key fields.
12. The system according to claim 11, wherein the system is further configured to:
perform optical character recognition on the electronic document.
13. The system according to claim 9, wherein the electronic document lacks at least a portion of the necessary data when at least a portion of the template corresponding to the at least one electronic document requirement is at least one of: missing, incomplete, and unclear.
14. The system according to claim 9, wherein the retransmission request electronic document is an electronic form.
15. The system according to claim 9, wherein the electronic document is an evidencing electronic document issued by a supplier, and wherein the retransmission server is a server of the supplier.
CN201780060455.4A 2016-08-05 2017-08-04 Obtain the system and method for lacking the repeating transmission of electronic document of necessary data Pending CN109791641A (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201662371237P 2016-08-05 2016-08-05
US62/371,237 2016-08-05
US15/361,934 US20170154385A1 (en) 2015-11-29 2016-11-28 System and method for automatic validation
US15/361,934 2016-11-28
PCT/US2017/045497 WO2018027133A1 (en) 2016-08-05 2017-08-04 Obtaining reissues of electronic documents lacking required data

Publications (1)

Publication Number Publication Date
CN109791641A true CN109791641A (en) 2019-05-21

Family

ID=61074139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780060455.4A Pending CN109791641A (en) 2016-08-05 2017-08-04 Obtain the system and method for lacking the repeating transmission of electronic document of necessary data

Country Status (3)

Country Link
EP (1) EP3494530A4 (en)
CN (1) CN109791641A (en)
WO (1) WO2018027133A1 (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070050356A1 (en) * 2005-08-23 2007-03-01 Amadio William J Query construction for semantic topic indexes derived by non-negative matrix factorization
CN101118597A (en) * 2006-07-31 2008-02-06 富士通株式会社 Form processing method, form processing device, and computer product
US20080229187A1 (en) * 2002-08-12 2008-09-18 Mahoney John J Methods and systems for categorizing and indexing human-readable data
CN102124476A (en) * 2008-01-22 2011-07-13 碳流公司 Carbon credit workflow system
JP2013016097A (en) * 2011-07-06 2013-01-24 Daiwa Institute Of Research Business Innovation Ltd Account transfer processing system
CN104699714A (en) * 2013-12-09 2015-06-10 北大方正集团有限公司 Method and device for transferring files of book edition format into files of EPUB format
CN104965907A (en) * 2015-06-30 2015-10-07 小米科技有限责任公司 Structured object generation method and apparatus
US20150356174A1 (en) * 2014-06-06 2015-12-10 Wipro Limited System and methods for capturing and analyzing documents to identify ideas in the documents
CN106598930A (en) * 2016-12-29 2017-04-26 南威软件股份有限公司 Electronic certificate processing method based on layout file

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8774516B2 (en) * 2009-02-10 2014-07-08 Kofax, Inc. Systems, methods and computer program products for determining document validity

Also Published As

Publication number Publication date
EP3494530A1 (en) 2019-06-12
EP3494530A4 (en) 2020-04-15
WO2018027133A1 (en) 2018-02-08

Similar Documents

Publication Publication Date Title
US11062132B2 (en) System and method for identification of missing data elements in electronic documents
WO2017091829A1 (en) System and method for automatic generation of reports based on electronic documents
US11138372B2 (en) System and method for reporting based on electronic documents
US20170323006A1 (en) System and method for providing analytics in real-time based on unstructured electronic documents
US20190236127A1 (en) Generating a modified evidencing electronic document including missing elements
US20180011846A1 (en) System and method for matching transaction electronic documents to evidencing electronic documents
CN109791537A (en) Electronic document is supplemented into complete system and method
US20170323157A1 (en) System and method for determining an entity status based on unstructured electronic documents
US20180025225A1 (en) System and method for generating consolidated data for electronic documents
US20180046663A1 (en) System and method for completing electronic documents
WO2017201012A1 (en) Providing analytics in real-time based on unstructured electronic documents
CN109791641A (en) Obtain the system and method for lacking the repeating transmission of electronic document of necessary data
US10387561B2 (en) System and method for obtaining reissues of electronic documents lacking required data
US20180025438A1 (en) System and method for generating analytics based on electronic documents
CN109791540A (en) The system and method reported based on electronic document
EP3494531A1 (en) System and method for generating consolidated data for electronic documents
US20170169519A1 (en) System and method for automatically verifying transactions based on electronic documents
WO2018132655A2 (en) System and method for optimizing reissuance of electronic documents
EP3417383A1 (en) Automatic verification of requests based on electronic documents
WO2018034941A1 (en) System and method for generating analytics based on electronic documents
US20170193609A1 (en) System and method for automatically monitoring requests indicated in electronic documents
US20170323395A1 (en) System and method for creating historical records based on unstructured electronic documents
EP3430584A1 (en) System and method for automatically verifying transactions based on electronic documents
EP3491554A1 (en) Matching transaction electronic documents to evidencing electronic
EP3458971A1 (en) System and method for automatically monitoring requests indicated in electronic documents

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20201106