CN112036145A - Financial statement identification method and device, computer equipment and readable storage medium - Google Patents

Financial statement identification method and device, computer equipment and readable storage medium

Info

Publication number
CN112036145A
Authority
CN
China
Prior art keywords
subject
financial
matching
information
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010905770.6A
Other languages
Chinese (zh)
Inventor
崔子龙
徐晏君
周圆
胡玉琛
赵斌伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Financial Leasing Co Ltd
Original Assignee
Ping An International Financial Leasing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Financial Leasing Co Ltd
Priority to CN202010905770.6A
Publication of CN112036145A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/166Editing, e.g. inserting or deleting
    • G06F40/183Tabulation, i.e. one-dimensional positioning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/166Editing, e.g. inserting or deleting
    • G06F40/186Templates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/146Aligning or centring of the image pick-up or image-field
    • G06V30/1475Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a method and a device for identifying financial statements, computer equipment and a readable storage medium. The method comprises the following steps: receiving a to-be-processed financial report picture of a to-be-processed financial report; identifying the to-be-processed financial report picture to obtain a financial report document contained in the to-be-processed financial report picture, wherein the financial report document comprises format information, subject information and financial report data information; inputting the format information and the subject information into a subject matching model, and outputting a matching result, wherein the matching result comprises a matching subject of the to-be-processed financial report and a confidence level of the matching subject; and determining a target subject corresponding to the to-be-processed financial report from the matching subjects according to the confidence level, and generating a target financial report based on the target subject and the financial report data information. The recognition precision of financial statements is thereby improved, and financial statements are easier for a user to maintain. In addition, the invention also relates to blockchain technology, and the target financial report can be stored in a blockchain.

Description

Financial statement identification method and device, computer equipment and readable storage medium
Technical Field
Embodiments of the invention relate to the field of reports, and in particular to a financial statement identification method, a financial statement identification device, computer equipment and a readable storage medium.
Background
A financial statement is an accounting statement that reflects the funds and profit status of an enterprise or budget unit over a certain period. In China, the types, formats and reporting requirements of financial statements are specified by a uniform accounting system, and enterprises are required to report their financial statements regularly. At present, at the end of a reporting period, state-owned industrial enterprises need to compile fund reports such as the fund balance sheet, the special funds and special appropriations statement, and the capital construction loans and special loans statement, as well as profit reports such as the profit statement and the detailed statement of product sales profit; state-owned commercial enterprises need to submit the fund balance sheet, the operating conditions statement, the special funds statement and the like. Financial statements include the balance sheet, the income statement, the cash flow statement or statement of changes in financial position, supplementary schedules and notes.
The importance of financial reports to a financing lease company is self-evident throughout the entire life cycle of a financing lease. Common financial report platforms on the market only manage financial report templates in a unified way and perform OCR recognition on paper financial reports.
Disclosure of Invention
In view of this, an object of the embodiments of the present invention is to provide a method and an apparatus for identifying a financial statement, a computer device and a readable storage medium, which improve the identification accuracy and facilitate the financial maintenance for a user.
In order to achieve the above object, an embodiment of the present invention provides a method for identifying a financial statement, including:
receiving a to-be-processed financial report picture of a to-be-processed financial report;
identifying the to-be-processed financial report picture to obtain a financial report document contained in the to-be-processed financial report picture, wherein the financial report document comprises format information, subject information and financial report data information;
inputting the format information and the subject information into a subject matching model, and outputting a matching result, wherein the matching result comprises a matching subject of the to-be-processed financial report and a confidence level of the matching subject;
and determining a target subject corresponding to the to-be-processed financial report from the matched subjects according to the confidence level, and generating a target financial report based on the target subject and the financial report data information.
Further, identifying the to-be-processed financial report picture to obtain a financial report document contained in the to-be-processed financial report picture, where the financial report document includes format information, subject information and financial report data information, includes:
preprocessing the to-be-processed financial report picture to obtain a standard picture;
performing layout identification on the standard picture to obtain a corresponding financial report format, wherein the format information comprises the financial report format and a format name;
performing character recognition on the standard picture to obtain a plurality of pieces of field information, wherein the field information comprises the format name, the subject information and the financial report data information;
and performing layout recovery on the financial report format according to the field information, and checking the result to obtain the identified financial report document.
Further, the inputting the format information and the subject information into a subject matching model, and outputting a matching result, where the matching result includes a matching subject of the to-be-processed financial report and a confidence level of the matching subject includes:
inputting the format information and the subject information into a subject matching model so as to obtain a subject template library matched with the format information according to the format information through the subject matching model;
segmenting the subject information according to the granularity of the characters to obtain character information corresponding to each subject, wherein the subject information comprises a plurality of subjects;
matching the character information corresponding to each subject according to a preset inverted index table so as to obtain a subject candidate set corresponding to each subject from the subject template library;
and matching each subject with a standard subject in the subject candidate set corresponding to the subject, and outputting a matching result, wherein the matching result comprises the matching subject of the to-be-processed financial report and the confidence level of the matching subject.
Further, before matching the character information corresponding to each subject according to a preset inverted index to obtain a subject candidate set corresponding to each subject, the method includes:
acquiring a plurality of standard subjects;
and establishing a plurality of inverted index tables by taking the first characters of the plurality of standard subjects as key words, and storing the inverted index tables into the subject template library.
Further, matching each subject with a standard subject in the subject candidate set corresponding to the subject, and outputting a matching result, where the matching result includes the matching subject of the to-be-processed financial report and the confidence level of the matching subject, includes:
calculating, according to a similarity algorithm, a similarity value between each subject and each standard subject in the subject candidate set corresponding to the subject;
and taking the standard subjects whose similarity values are larger than a preset threshold value as matching subjects, and determining the confidence level of each matching subject according to its similarity value.
Further, the method further comprises:
acquiring a plurality of pieces of financial report data;
carrying out data analysis on the target financial report and the financial report data to obtain an analysis result;
and generating early warning information based on the analysis result.
Further, the method further comprises:
and storing the target financial report into a block chain.
In order to achieve the above object, an embodiment of the present invention further provides an apparatus for identifying a financial statement, including:
the receiving module is used for receiving the to-be-processed financial report picture of the to-be-processed financial report;
the identification module is used for identifying the to-be-processed financial report picture to obtain a financial report document contained in the to-be-processed financial report picture, wherein the financial report document comprises format information, subject information and financial report data information;
the matching module is used for inputting the format information and the subject information into a subject matching model and outputting a matching result, wherein the matching result comprises a matching subject of the to-be-processed financial report and a confidence level of the matching subject;
and the determining and generating module is used for determining a target subject corresponding to the to-be-processed financial report from the matched subjects according to the confidence level and generating the target financial report based on the target subject and the financial report data information.
In order to achieve the above object, an embodiment of the present invention further provides a computer device, where the computer device includes a memory and a processor, where the memory stores a computer program that is executable on the processor, and the computer program, when executed by the processor, implements the steps of the identification method for financial statements as described above.
To achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, where the computer program is executable by at least one processor to cause the at least one processor to execute the steps of the identification method of financial statements as described above.
According to the financial statement identification method and device, the computer equipment and the readable storage medium, the format information, the subject information and the financial report data information are obtained by identifying the to-be-processed financial report picture, and the format information and the subject information are then input into the subject matching model, which performs subject matching so that the subjects are further corrected. The identification precision is thereby improved, and the financial reports are easier for a user to maintain.
Drawings
FIG. 1 is a flowchart of a first embodiment of a financial statement identification method according to the present invention.
Fig. 2 is a flowchart of step S120 according to an embodiment of the present invention.
Fig. 3 is a flowchart of step S140 according to an embodiment of the present invention.
Fig. 4 is a flowchart of step S147 in the first embodiment of the present invention.
Fig. 5 is a flowchart of step S144 according to an embodiment of the present invention.
Fig. 6 is a flowchart of step S170 according to an embodiment of the present invention.
FIG. 7 is a schematic diagram of program modules of a second embodiment of the apparatus for identifying financial statements of the present invention.
Fig. 8 is a schematic diagram of a hardware structure of a third embodiment of the computer apparatus according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
Referring to fig. 1, a flowchart illustrating steps of a method for identifying a financial statement according to a first embodiment of the present invention is shown. It is to be understood that the flow charts in the embodiments of the present method are not intended to limit the order in which the steps are performed. The following description is made by way of example with the computer device 2 as the execution subject. The details are as follows.
Step S100, receiving a to-be-processed financial report picture of a to-be-processed financial report.
Specifically, the to-be-processed financial report is the financial report that needs to be identified. The to-be-processed financial report picture may be a scan or a photo containing the to-be-processed financial report, and the report in the picture should be captured as neatly and squarely as possible.
Step S120, identifying the to-be-processed financial report picture to obtain a financial report document contained in the to-be-processed financial report picture, wherein the financial report document comprises format information, subject information and financial report data information.
Specifically, the to-be-processed financial report is recognized using OCR technology to obtain a plurality of pieces of field information, and the field information is analyzed to obtain the type, name, subjects and data of the financial report. The type and format of the to-be-processed financial report can be determined in several ways: the report can be matched against a built-in financial report template library, since financial reports of different types (that is, different standard financial report templates) have different formats; the type information contained in the recognized fields can be matched against the fields of the various standard financial report types; or the user can identify the type manually. The subject information and the financial report data information can be recognized as key-value pairs, and an association between them is established. The main title is recognized to obtain the corresponding financial report name. The format information comprises the financial report format, the financial report type and the financial report name.
Exemplarily, referring to fig. 2, the step S120 includes:
Step S121, preprocessing the to-be-processed financial report picture to obtain a standard picture.
Specifically, the picture is first read in; different image formats have different storage formats and compression modes. Preprocessing then mainly comprises binarization, noise removal and skew correction.
Binarization: most pictures taken by a camera are color images, which carry a large amount of information. The content of a picture can be divided simply into foreground and background. So that the computer can recognize characters faster and better, the color image is first reduced to foreground and background information only; the foreground can simply be defined as black and the background as white, which gives the binary image.
Noise removal: what counts as noise differs from document to document, so denoising is carried out according to the characteristics of the noise.
Skew correction: because users generally photograph documents freehand, the resulting picture is inevitably skewed, and the character recognition software has to correct it.
The standard picture is obtained after preprocessing.
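The binarization step can be illustrated with a small sketch. The patent does not say how the black/white threshold is chosen; the example below assumes Otsu's method (a common choice, swapped in here purely for illustration) and works on a grayscale picture given as a list of rows of 0-255 values. The function names `otsu_threshold` and `binarize` are hypothetical.

```python
def otsu_threshold(gray):
    """Pick the threshold that maximizes between-class variance (Otsu's method).

    `gray` is a 2-D list of integer grayscale values in 0..255.
    Assumption: Otsu's method is used only as an example; the source
    does not name a thresholding technique.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]              # pixels with value <= t
        if w_dark == 0:
            continue
        w_light = total - w_dark       # pixels with value > t
        if w_light == 0:
            break
        sum_dark += t * hist[t]
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray):
    """Map dark pixels to black foreground (0) and the rest to white background (255)."""
    t = otsu_threshold(gray)
    return [[0 if v <= t else 255 for v in row] for row in gray]


# Tiny 3x3 "picture": three dark text pixels on a light page.
sample = [
    [250, 245, 30],
    [240, 20, 235],
    [25, 248, 252],
]
binary = binarize(sample)
```

Noise removal and skew correction would follow on the binary image; they are omitted here because the source gives no detail about them.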
Step S122, performing layout identification on the standard picture to obtain a corresponding financial report format, wherein the format information comprises the financial report format and a format name.
Specifically, the document picture is segmented into regions and lines; this process is called layout analysis. After layout analysis, the result can be matched against built-in templates. Because financial report templates of different types differ, the financial report format, such as the general enterprise format, is obtained from the match.
Step S123, performing character recognition on the standard picture to obtain a plurality of pieces of field information, wherein the field information comprises the format name, the subject information and the financial report data information.
Specifically, character recognition can be performed through template matching and feature extraction. Factors such as character displacement, stroke thickness, broken strokes, touching characters and rotation make feature extraction considerably harder. Character recognition is performed on the standard picture by a character recognition model. Because photographing conditions are limited, touching characters and broken strokes occur often and greatly limit the performance of the recognition system, so the character segmentation function of the character recognition software is needed. The extracted features are passed to a classifier, which classifies them and outputs the character they are recognized as. The segmented characters may be recognized using a spectral clustering algorithm, a K-nearest-neighbor algorithm and an automatic search over the K-value parameter space. After recognition, the format name, the subject information and the financial report data information are obtained, and the recognition result is corrected according to the linguistic context.
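As a rough illustration of the classifier stage, the sketch below implements a plain K-nearest-neighbor vote over feature vectors. It is only a minimal example under stated assumptions: the two-dimensional feature vectors, the labels and the function name `knn_classify` are invented for illustration, and the spectral clustering and automatic K-value search mentioned above are not shown.

```python
import math
from collections import Counter


def knn_classify(sample, training, k=3):
    """Return the majority label among the k training items nearest to `sample`.

    `training` is a list of (feature_vector, label) pairs. Euclidean
    distance is an assumption; the source does not name a metric.
    """
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features extracted from segmented character images.
training_set = [
    ((0.0, 0.0), "1"), ((0.1, 0.2), "1"), ((0.2, 0.1), "1"),
    ((1.0, 1.0), "7"), ((0.9, 1.1), "7"), ((1.1, 0.9), "7"),
]
predicted = knn_classify((0.15, 0.1), training_set, k=3)
```

In a real system the feature vectors would come from the feature-extraction step, and `k` would be chosen by some validation procedure rather than fixed by hand.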
Step S124, performing layout recovery on the financial report format according to the field information, and checking the result to obtain the identified financial report document.
Specifically, the recognized characters are arranged as in the original document picture, with paragraphs, positions and order unchanged, and are output to a Word document, a PDF document or the like. This process is called layout recovery, and the financial report document is obtained once recovery is complete. The subject fields can be filled in first, and the financial report data information is then filled directly into the target template.
Illustratively, when the identified field content is not in the system financial report template library, the template can be automatically refined and supplemented.
Step S140, inputting the format information and the subject information into a subject matching model, and outputting a matching result, wherein the matching result comprises the matching subject of the to-be-processed financial report and the confidence level of the matching subject.
Specifically, the format information comprises a financial report format and a financial report name. The financial report format, the financial report name and the field information of the subject information are passed to the subject matching model as input parameters. After receiving these three parameters, the subject matching model obtains the subject template set of the financial report template corresponding to the format information and performs a similarity operation on the corresponding subjects to generate a subject matching result. The subject matching model outputs the matching result in dictionary form; the output comprises the matched subject, the confidence level and the subjects to be selected. The financial report template is the financial report format obtained by analysis after OCR recognition, such as the general enterprise format; the financial report name is the name of the to-be-processed financial report, such as cash flow statement, balance sheet or income statement; the subjects are those recognized by the OCR technology, such as notes receivable. This step performs a rough identification of the financial report subjects, that is, it organizes the characters recognized by OCR.
Exemplarily, referring to fig. 3, the step S140 includes:
Step S141, inputting the format information and the subject information into a subject matching model, and acquiring, through the subject matching model, a subject template library matched with the format information.
Specifically, the format information comprises a financial report format and a financial report type. The financial report template, the financial report name and the field information of the subject information are passed to the subject matching model as input parameters. After receiving the three input parameters, the subject matching model obtains the subject template set of the financial report template corresponding to the format information; the subject template set comprises a plurality of standard subjects.
Step S143, segmenting the subject information at character granularity to obtain character information corresponding to each subject, wherein the subject information comprises a plurality of subjects.
Specifically, each subject in the subject information is segmented into at least two characters.
Step S145, matching the character information corresponding to each subject according to a preset inverted index table so as to obtain a subject candidate set corresponding to each subject from the subject template library.
Specifically, the character information of each subject is matched against the subject template library according to the inverted index table. The inverted index table prestores the standard characters of the standard subjects obtained from the standard template, and each standard subject is numbered. If the standard characters of a standard subject match the characters of a subject, that standard subject is recalled. Similarity matching can be based on NLP character-similarity algorithms such as the longest common substring, the edit distance and cosine similarity. For example, with the edit distance algorithm, the number of edit operations needed to turn the characters of the subject into the standard characters of the standard subject is calculated; this number serves as the similarity measure, and the fewer the edits, the more similar the two are. The edit operations are replacing one character with another, inserting a character and deleting a character. As another example, with the longest common substring, the longest run of consecutive characters shared by the subject and the standard characters of the standard subject is found; the longer it is, the greater the similarity. The number of the matched standard subject is then looked up in the inverted index table and associated with the subject. The templates corresponding to the numbers can also be queried in the inverted index table in numerical order; if a template's subject character information contains the characters of the subject, the standard subjects corresponding to those numbers are recalled from the subject template library to obtain the subject candidate set.
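The two character-similarity measures named above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names and the sample subject strings (an OCR misreading of "notes receivable") are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    replacements, insertions and deletions that turn `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # replace (or keep) ca
        prev = curr
    return prev[-1]


def longest_common_substring(a, b):
    """Length of the longest run of consecutive characters shared by `a` and `b`."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0] * (len(b) + 1)
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical OCR output compared with a standard subject.
ocr_subject = "notes recievable"
standard_subject = "notes receivable"
distance = edit_distance(ocr_subject, standard_subject)
shared = longest_common_substring("notes receivable", "accounts receivable")
```

A smaller edit distance and a longer common substring both indicate a closer match, so either can be used to rank the recalled standard subjects.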
Step S147, matching each subject with a standard subject in the subject candidate set corresponding to the subject, and outputting a matching result, wherein the matching result comprises the matching subject of the to-be-processed financial report and the confidence level of the matching subject.
Specifically, the similarity between each subject and each standard subject in its candidate set is calculated according to a similarity algorithm, and the confidence level is determined from the similarity. The matching result comprises the matching subject, the confidence level of the matching subject and the subjects to be selected. The matching subject is the financial report subject corresponding to the subject recognized by the OCR technology. The confidence level expresses the matching accuracy, for example exact matching, algorithm matching, fuzzy matching or manual intervention. The subjects to be selected are output when neither exact matching nor algorithm matching succeeds, and are used during manual intervention.
Matching result example:
{
  'A': ['Jia', 1, []],
  'B': ['Yi', 2, []],
  'C': ['Bing', 3, ['big', 'small', 'many', 'few']],
  'D': ['Ding', 4, ['big', 'small', 'many', 'few']]
}
Description: A, B, C and D are the financial report subjects recognized by OCR; 'Jia', 'Yi', 'Bing' and 'Ding' represent the matching subjects; 1, 2, 3 and 4 represent the confidence levels of the recognition. When recognition is not accurate (levels 3 and 4), 'Bing' and 'Ding' represent the corresponding subjects with the largest similarity, and the contents of ['big', 'small', 'many', 'few'] represent the subjects to be selected, which are to be confirmed manually.
Illustratively, referring to FIG. 4, step S147 includes:
Step S147A, calculating, according to a similarity algorithm, a similarity value between each subject and each standard subject in the subject candidate set corresponding to that subject.
Specifically, the similarity between each subject and each standard subject in its subject candidate set is calculated with a similarity algorithm; the algorithm may be, but is not limited to, Euclidean distance, the Pearson correlation coefficient, or cosine similarity.
Step S147B, taking each standard subject whose similarity value is greater than a preset threshold as a matching subject, and determining the confidence level of the matching subject according to its similarity value.
Specifically, a threshold for the similarity value is set in advance; similarity values greater than the preset threshold are divided into confidence levels, so that the confidence level of a matching subject is determined by its similarity value.
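Steps S147A and S147B can be sketched as follows. This is only an illustration under stated assumptions: cosine similarity over character counts is one of the algorithms the text names, but the threshold of 0.5, the level boundaries of 0.9 and 0.7, and the function names are hypothetical and do not come from the source.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the character-count vectors of two strings."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[ch] * cb[ch] for ch in ca)
    norm = sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def match_subject(subject, candidates, threshold=0.5):
    """Return [matching subject, confidence level, subjects to be selected].

    Levels (assumed boundaries): 1 exact match, 2 similarity >= 0.9,
    3 similarity >= 0.7, 4 any other similarity above the threshold.
    """
    if subject in candidates:
        return [subject, 1, []]
    scored = sorted(
        ((cosine_similarity(subject, c), c) for c in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    kept = [(s, c) for s, c in scored if s > threshold]
    if not kept:
        return [None, None, []]
    best_similarity, best = kept[0]
    if best_similarity >= 0.9:
        level = 2
    elif best_similarity >= 0.7:
        level = 3
    else:
        level = 4
    # Only low-confidence matches carry a candidate list for manual review.
    to_select = [c for _, c in kept] if level >= 3 else []
    return [best, level, to_select]
```

Note that character-count cosine similarity ignores character order, so a transposition error in OCR output still scores 1.0; an order-sensitive measure may be preferable in practice.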
Illustratively, referring to FIG. 5, step S144 is included before step S145:
Step S144A, acquiring a plurality of standard subjects.
Specifically, each standard enterprise financial report template is obtained according to the standard financial report type (enterprise type), and all subjects of each standard template are extracted.
Step S144B, establishing a plurality of inverted index tables for the plurality of standard subjects with the first character as the keyword, and storing the inverted index tables in the subject template library.
Specifically, inverted indexes are built over all subjects of a standard template at the character granularity of the standard subjects, using NLP and search engine techniques, so that an inverted index table is generated for each standard template in turn.
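A character-granularity inverted index of the kind described in step S144 might look like the sketch below. It assumes that every character of a standard subject is indexed (the text also mentions keying on the first character), that standard subjects are numbered by list position, and that candidates are ranked by the number of shared characters; the function names and the sample subjects are illustrative only.

```python
from collections import Counter, defaultdict


def build_inverted_index(standard_subjects):
    """Map each character to the numbers of the standard subjects containing it."""
    index = defaultdict(set)
    for number, subject in enumerate(standard_subjects):
        for ch in subject:
            index[ch].add(number)
    return index


def recall_candidates(subject, index, standard_subjects):
    """Recall standard subjects sharing at least one character with `subject`,
    ordered by the number of distinct shared characters (most first)."""
    hits = Counter()
    for ch in set(subject):
        for number in index.get(ch, ()):
            hits[number] += 1
    ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
    return [standard_subjects[number] for number, _ in ranked]


# Sample standard subjects: monetary funds, notes receivable, accounts receivable.
STANDARD_SUBJECTS = ["货币资金", "应收票据", "应收账款"]
INDEX = build_inverted_index(STANDARD_SUBJECTS)
```

For an OCR result with one wrong character, such as "应收票锯", `recall_candidates` returns the notes receivable subject first and the accounts receivable subject second. The recall stage is deliberately loose; the later similarity calculation is what narrows the candidate set.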
Step S160, determining a target subject corresponding to the to-be-processed financial report from the matching subjects according to the confidence level, and generating a target financial report based on the target subject and the financial report data information.
Specifically, if the confidence level is high or medium-high (levels 1 and 2 in the above example), the matched standard subject is taken as the target subject. If the confidence level is medium-low or low (levels 3 and 4 in the above example), the subject does not completely match any standard subject; recommended standard subjects are then given according to the confidence levels between the subject and the standard subjects, yielding a plurality of fuzzy-matched standard subjects as the subjects to be selected (such as ['big', 'small', 'many', 'few']), and the target subject is selected from them by manual verification. After the target subject is selected, it is filled back into the financial report document to obtain the target financial report.
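The decision in step S160 can be illustrated with a short sketch. It assumes the dictionary-form matching result described earlier (OCR subject mapped to matching subject, confidence level, and subjects to be selected) and models manual verification as a callback; `resolve_targets`, `refill`, and the sample data are hypothetical names introduced for illustration.

```python
def resolve_targets(match_result, choose_manually):
    """Pick the target subject for every OCR-recognized subject.

    Levels 1 and 2 are accepted automatically; levels 3 and 4 are routed
    to `choose_manually(ocr_subject, best_match, to_select)`.
    """
    targets = {}
    for ocr_subject, (matched, level, to_select) in match_result.items():
        if level in (1, 2):
            targets[ocr_subject] = matched
        else:
            targets[ocr_subject] = choose_manually(ocr_subject, matched, to_select)
    return targets


def refill(report_rows, targets):
    """Replace each recognized subject in (subject, value) rows with its target."""
    return [(targets.get(subject, subject), value) for subject, value in report_rows]


# Illustrative data only.
match_result = {
    "Cash": ["Cash", 1, []],
    "Notes recievable": ["Notes receivable", 2, []],
    "Recvbls": ["Accounts receivable", 3, ["Accounts receivable", "Notes receivable"]],
}
# Stand-in for a human reviewer who accepts the first suggestion.
targets = resolve_targets(match_result, lambda subject, best, options: options[0])
target_report = refill([("Cash", 1000), ("Recvbls", 250)], targets)
```

Keeping the manual step behind a callback lets the same routine serve an interactive review screen or a batch queue.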
Illustratively, referring to FIG. 6, the method further includes step S170:
Step S171, acquiring a plurality of pieces of financial report data.
Specifically, the financial report data of other listed companies in the same industry, that is, other financial report data, is pulled.
Step S172, performing data analysis on the target financial report and the financial report data to obtain an analysis result.
Specifically, the target financial report is compared with the financial report data. Customized financial report analysis rules and early warning trigger conditions are supported along dimensions such as the industry characteristics of each company, the current economic environment, and the business data of the enterprise. The financial report data of a company is analyzed automatically and in real time at each stage of a financing lease transaction with that company, including the pre-lease, in-lease, and post-lease stages, to obtain the analysis result.
Step S173, generating early warning information based on the analysis result.
Specifically, an early warning signal is issued based on the result of the data analysis to guide business personnel in carrying out risk identification and credit limit evaluation in time, so that enterprise financial reports are intelligently analyzed and early warnings are given throughout the full life cycle of the financing lease.
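One way to picture steps S171 to S173 is a rule table that compares a metric of the target financial report with the same metric across peer companies. The sketch below is an assumption-laden illustration: the debt ratio rule, the field names `total_liabilities` and `total_assets`, and the tolerance of 1.5 times the peer median are invented for the example and are not specified in the source.

```python
from statistics import median


def debt_ratio(report):
    """Total liabilities divided by total assets (hypothetical field names)."""
    return report["total_liabilities"] / report["total_assets"]


# Each rule: (name, metric function, allowed multiple of the peer median).
RULES = [("debt ratio", debt_ratio, 1.5)]


def early_warnings(target, peers, rules=RULES):
    """Return a warning message for every rule the target report violates."""
    warnings = []
    for name, metric, tolerance in rules:
        value = metric(target)
        peer_median = median(metric(peer) for peer in peers)
        if value > peer_median * tolerance:
            warnings.append(
                f"{name} {value:.2f} exceeds {tolerance}x peer median {peer_median:.2f}"
            )
    return warnings


peers = [
    {"total_liabilities": 50, "total_assets": 100},
    {"total_liabilities": 60, "total_assets": 100},
    {"total_liabilities": 40, "total_assets": 100},
]
```

Because rules are plain tuples, industry-specific rules or stage-specific trigger conditions could be supplied as a different list without changing the analysis routine.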
Illustratively, the method further comprises:
storing the target financial report in a blockchain.
Specifically, uploading the target financial report to the blockchain ensures its security and its fairness and transparency to the user. The user device may download the summary (digest) information from the blockchain to verify whether the target financial report has been tampered with. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, each of which contains the information of a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
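The tamper check described above rests on comparing digests. The following sketch shows the idea with SHA-256 and an in-memory list standing in for the chain; it is not a real blockchain client, and the block layout, the canonical JSON serialization, and the function names are assumptions made for illustration.

```python
import hashlib
import json


def report_digest(report: dict) -> str:
    """SHA-256 digest of a canonical JSON serialization of the report."""
    payload = json.dumps(report, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, digest: str) -> None:
    """Append a block whose hash covers the previous block's hash and the digest."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block_hash = hashlib.sha256((prev_hash + digest).encode("utf-8")).hexdigest()
    chain.append({"prev": prev_hash, "digest": digest, "hash": block_hash})


def is_untampered(report: dict, stored_digest: str) -> bool:
    """True when the report still hashes to the digest stored on the chain."""
    return report_digest(report) == stored_digest
```

A user device would recompute the digest of its copy of the target financial report and compare it with the digest downloaded from the chain; any change to a subject or value produces a different digest.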
Example two
Referring to FIG. 7, a schematic diagram of the program modules of the second embodiment, the device for identifying financial statements of the present invention, is shown. In this embodiment, the identification device 20 of financial statements may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to carry out the present invention and implement the identification method of financial statements. A program module in the embodiments of the present invention refers to a series of computer program instruction segments capable of performing a specific function, and is better suited than the program itself to describing the execution process of the identification device 20 of financial statements in the computer device 2. The functions of the program modules of this embodiment are described below:
The receiving module 200 is configured to receive a to-be-processed financial report picture of a to-be-processed financial report.
Specifically, the to-be-processed financial report is a financial report that needs to be identified. The to-be-processed financial report picture may be a scan or a photo containing the to-be-processed financial report, and the financial report in the picture should be captured as neatly and squarely as possible.
The identification module 202 is configured to identify the to-be-processed financial report picture to obtain a financial report document included in the to-be-processed financial report picture, where the financial report document includes format information, subject information, and financial report data information.
Specifically, the to-be-processed financial report is recognized by OCR to obtain a plurality of pieces of field information, which are parsed to obtain the type, name, subjects, and data of the financial report. The financial report type can be determined in several ways: the format of the to-be-processed financial report can be identified and matched against a built-in financial report template library, since different types of financial reports (standard financial report templates) have different formats; or, where the financial report itself carries type information, the field information obtained by character recognition can be matched against the fields of the various standard financial report types; or the user can identify the type manually. The subject information and the financial report data information can be recognized as key-value pairs, with an association established between them. The main title is recognized to obtain the corresponding financial report name. The format information comprises the financial report format, the financial report type, and the financial report name.
Illustratively, the identification module 202 is further configured to:
preprocessing the to-be-processed financial report to obtain a standard picture.
Specifically, image input: different image formats have different storage formats and compression schemes. Preprocessing mainly comprises binarization, noise removal, skew correction, and the like. Binarization: most pictures taken by a camera are color images, which carry a large amount of information. The content of a picture can be divided simply into foreground and background; so that the computer can recognize characters faster and better, the color image is first processed so that only foreground and background information remain, with the foreground defined as black and the background as white. The result is a binary image. Noise removal: what counts as noise differs from document to document, and denoising is performed according to the characteristics of the noise. Skew correction: because users generally photograph documents freehand, the photographed picture is inevitably skewed and must be corrected by the character recognition software. The standard picture is obtained after preprocessing.
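The binarization step can be sketched in a few lines. The example below is a simplification under stated assumptions: the image is a nested list of RGB tuples, grayscale conversion uses the common luminance weights, and the fixed threshold of 128 is arbitrary. Production systems usually rely on an image library and an adaptive threshold instead.

```python
def to_gray(pixel):
    """Luminance of an (R, G, B) pixel using the common 0.299/0.587/0.114 weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(image, threshold=128):
    """Map dark pixels to 0 (black foreground) and light pixels to 255 (white background)."""
    return [
        [0 if to_gray(pixel) < threshold else 255 for pixel in row]
        for row in image
    ]


# A 2x2 illustrative image: dark text pixels on a light page.
sample = [
    [(0, 0, 0), (255, 255, 255)],
    [(200, 200, 200), (30, 30, 30)],
]
binary = binarize(sample)
```

Noise removal and skew correction would follow on the binary image; they are omitted here because the source does not describe a specific algorithm for either.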
performing layout identification on the standard picture to obtain the corresponding financial report format, wherein the format information comprises the financial report format and a format name.
Specifically, the document picture is divided into segments and lines, a process called layout analysis. After layout analysis, the result can be matched against built-in templates; because different types of financial report templates differ, the financial report format, such as that of a general enterprise, is obtained.
performing character recognition on the standard picture to obtain a plurality of pieces of field information, wherein the field information comprises the format name, the subject information, and the financial report data information.
Specifically, character recognition can be performed through template matching and feature extraction. Factors such as character displacement, stroke thickness, broken strokes, touching characters, and rotation greatly increase the difficulty of feature extraction, so character recognition is performed on the standard picture through a character recognition model. Because of the limitations of photographing conditions, characters are often stuck together or have broken strokes, which greatly limits the performance of the recognition system; the character segmentation function of the character recognition software is therefore needed. The extracted features are passed to a classifier, which classifies them and outputs the character they are recognized as. The segmented characters are recognized using a spectral clustering algorithm, a K-nearest-neighbor algorithm, and an automatic search over the K-value parameter space. After recognition, the format name, the subject information, and the financial report data information are obtained, and the recognition result is corrected according to the context of the specific language.
performing layout recovery on the financial report format according to the field information, and checking the result to obtain the recognized financial report document.
Specifically, the recognized characters are arranged as in the original document picture, with paragraphs, positions, and order unchanged, and are output to a Word document, a PDF document, or the like; this process is called layout recovery, and the financial report document is obtained once recovery is complete. The subject fields can be filled in first, and the financial report data information is then filled directly into the target template.
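Layout recovery can be approximated by regrouping recognized text boxes into rows and ordering them left to right. The sketch below assumes each OCR result is a `(text, x, y)` tuple giving the top-left corner of the box and that boxes within `row_tolerance` pixels vertically belong to the same row; both the input shape and the tolerance are illustrative assumptions rather than details from the source.

```python
def recover_layout(boxes, row_tolerance=10):
    """Group (text, x, y) boxes into rows by y, then order each row by x."""
    rows = []  # each entry: (y of the row's first box, [(x, text), ...])
    for text, x, y in sorted(boxes, key=lambda box: (box[2], box[1])):
        if rows and abs(rows[-1][0] - y) <= row_tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append((y, [(x, text)]))
    return [[text for _, text in sorted(cells)] for _, cells in rows]


# Illustrative OCR output, deliberately out of order.
boxes = [
    ("1000", 200, 52),
    ("Cash", 10, 50),
    ("Notes receivable", 10, 80),
    ("250", 200, 81),
]
table = recover_layout(boxes)
```

Each recovered row pairs a subject with its value, which is the key-value association the recognition step relies on when filling the target template.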
Illustratively, when the identified field content is not in the system financial report template library, the template can be automatically refined and supplemented.
The matching module 204 is configured to input the format information and the subject information into a subject matching model, and output a matching result, where the matching result includes a matching subject of the to-be-processed financial report and a confidence level of the matching subject.
Specifically, the format information comprises the financial report format and the financial report name. The financial report format, the financial report name, and the field information of the subject information are input into the subject matching model as input parameters. After receiving these three parameters, the subject matching model acquires the subject template set of the financial report template corresponding to the format information and performs a similarity operation on the corresponding subjects to generate a subject matching result. The subject matching model outputs the matching result in dictionary form; the output comprises the matching subjects, the confidence levels, and the subjects to be selected. The financial report template is the financial report format obtained by analysis after OCR recognition, such as general enterprise; the financial report name is the name of the to-be-processed financial report, such as cash flow statement, balance sheet, or income statement; the subjects are the subjects recognized by OCR, such as notes receivable. This step performs a rough identification of the financial report subjects, that is, it organizes the characters recognized by OCR.
Illustratively, the matching module 204 is further configured to:
inputting the format information and the subject information into a subject matching model, so as to obtain, through the subject matching model, a subject template library matched with the format information.
Specifically, the format information comprises the financial report format and the financial report type. The financial report template, the financial report name, and the field information of the subject information are input into the subject matching model as input parameters. After receiving the three input parameters, the subject matching model acquires the subject template set of the financial report template corresponding to the format information; the subject template set comprises a plurality of standard subjects.
segmenting the subject information at character granularity to obtain character information corresponding to each subject, wherein the subject information comprises a plurality of subjects.
Specifically, each subject in the subject information is segmented into at least two characters.
matching the character information corresponding to each subject according to a preset inverted index table, so as to obtain a subject candidate set corresponding to each subject from the subject template library.
Specifically, the character information of each subject is matched against the subject template library according to the inverted index table. The inverted index table prestores the standard characters of the standard subjects obtained from the standard template, and each standard subject is numbered. If a standard character corresponding to a standard template matches a character of the subject, the standard subject corresponding to that standard character is recalled. Similarity matching can be based on character similarity algorithms from NLP, such as the longest common substring, the edit distance algorithm, and the cosine similarity algorithm. For example, with the edit distance algorithm, the number of edit operations needed to turn the characters of the subject into the standard characters of the standard subject is calculated, and this number serves as the similarity measure: the fewer the edits, the more similar the two are. The edit operations are replacing one character with another, inserting a character, and deleting a character. As another example, with the longest common substring, the longest run of consecutive characters shared by the subject and the standard characters of the standard subject is found, and the longer it is, the greater the similarity. The number corresponding to the standard subject is then looked up in the inverted index table and associated with the matched subject. The templates corresponding to the numbers can be queried in the inverted index table in numerical order; each template contains subject character information, and if the template contains the subject, the standard subject corresponding to the number is recalled from the subject template library to obtain the subject candidate set.
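The two character similarity measures named above can be written as short dynamic programs. The sketch below shows a standard Levenshtein edit distance (substitution, insertion, deletion, each costing 1) and the length of the longest common substring; the function names are illustrative, and how the raw values are turned into a similarity score is left open, as in the text.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions, and deletions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest run of consecutive characters shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, curr[-1])
        prev = curr
    return best
```

A smaller edit distance and a longer common substring both indicate a closer match, so either can be used to rank the standard subjects in a candidate set.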
matching each subject with the standard subjects in the subject candidate set corresponding to the subject, and outputting a matching result, wherein the matching result comprises the matching subject of the to-be-processed financial report and the confidence level of the matching subject.
Specifically, the similarity between each subject and each of its corresponding standard subjects is calculated with a similarity algorithm, and the confidence level is determined from the similarity. The matching result comprises the matching subject, the confidence level of the matching subject, and the subjects to be selected. The matching subject is the standard financial report subject that corresponds to the subject recognized by OCR. The confidence level indicates the matching accuracy, for example exact matching, algorithm matching, fuzzy matching, or manual intervention. The subjects to be selected are the candidate matching subjects that are output when neither exact matching nor algorithm matching succeeds, and from which a subject is chosen during manual intervention.
Matching subject example: {
'A': ['Jia', 1, []],
'B': ['Yi', 2, []],
'C': ['Bing', 3, ['big', 'small', 'many', 'few']],
'D': ['Ding', 4, ['big', 'small', 'many', 'few']]
}
Description: A, B, C, and D are the financial report subjects recognized by OCR; Jia, Yi, Bing, and Ding represent the matching subjects; 1, 2, 3, and 4 represent the confidence levels of the recognition. When the recognition is not accurate (levels 3 and 4), Bing and Ding represent the corresponding subjects with the greatest similarity, and the contents of ['big', 'small', 'many', 'few'] represent the subjects to be selected, which must be confirmed manually.
The determining and generating module 206 is configured to determine a target subject corresponding to the to-be-processed financial report from the matching subjects according to the confidence level, and generate a target financial report based on the target subject and the financial report data information.
Specifically, if the confidence level is high or medium-high (levels 1 and 2 in the above example), the matched standard subject is taken as the target subject. If the confidence level is medium-low or low (levels 3 and 4 in the above example), the subject does not completely match any standard subject; recommended standard subjects are then given according to the confidence levels between the subject and the standard subjects, yielding a plurality of fuzzy-matched standard subjects as the subjects to be selected (such as ['big', 'small', 'many', 'few']), and the target subject is selected from them by manual verification. After the target subject is selected, it is filled back into the financial report document to obtain the target financial report.
Example three
FIG. 8 is a schematic diagram of the hardware architecture of a computer device according to the third embodiment of the present invention. In this embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions. The computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of a plurality of servers), and the like. As shown in FIG. 8, the computer device 2 includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and the identification device 20 of financial statements, which are communicatively connected to each other through a system bus. Wherein:
In this embodiment, the memory 21 includes at least one type of computer-readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or internal memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), or the like provided on the computer device 2. Of course, the memory 21 may also comprise both internal and external storage units of the computer device 2. In this embodiment, the memory 21 is generally used for storing the operating system and the various application software installed on the computer device 2, such as the program code of the identification device 20 of financial statements of the second embodiment. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 22 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is configured to run the program code stored in the memory 21 or process data, for example, run the identification device 20 of the financial statement, so as to implement the identification method of the financial statement according to the first embodiment.
The network interface 23 may comprise a wireless network interface or a wired network interface, and is generally used for establishing communication connections between the computer device 2 and other electronic devices. For example, the network interface 23 is used to connect the computer device 2 to an external terminal via a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, and the like. It is noted that FIG. 8 only shows the computer device 2 with components 20-23, but it is to be understood that not all of the shown components are required, and that more or fewer components may be implemented instead.
In this embodiment, the identification device 20 of financial statements stored in the memory 21 can be further divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (in this embodiment, the processor 22) to implement the identification method of financial statements of the present invention. The specific functions of the program modules 200-206 have been described in detail in the second embodiment and are not repeated here. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer device may be a touch layer covering the display screen, a key, a trackball, or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad, or mouse. The user's operation instructions are received on the display screen of the computer device, and the processor 22 executes the computer program stored in the memory 21 according to those instructions to carry out the steps of the identification method of financial statements. The steps of the identification method of financial statements here may be the steps of the identification method of financial statements in the first embodiment.
Example four
This embodiment also provides a computer-readable storage medium, which may be non-volatile or volatile, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an app store, etc., on which a computer program is stored that implements the corresponding functions when executed by a processor. The computer-readable storage medium of this embodiment stores a computer program which, when executed by a processor, causes the processor to execute the steps of the method for identifying financial statements of the first embodiment.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method for identifying financial statements is characterized by comprising the following steps:
receiving a to-be-processed financial report picture of a to-be-processed financial report;
identifying the to-be-processed financial report picture to obtain a financial report document contained in the to-be-processed financial report picture, wherein the financial report document comprises format information, subject information and financial report data information;
inputting the format information and the subject information into a subject matching model, and outputting a matching result, wherein the matching result comprises a matching subject of the to-be-processed financial report and a confidence level of the matching subject;
and determining a target subject corresponding to the to-be-processed financial report from the matched subjects according to the confidence level, and generating a target financial report based on the target subject and the financial report data information.
2. The method for identifying financial statements according to claim 1, wherein said identifying the to-be-processed financial report picture to obtain a financial report document contained in the to-be-processed financial report picture, the financial report document including format information, subject information and financial report data information, comprises:
preprocessing the to-be-processed financial report to obtain a standard picture;
performing layout identification on the standard picture to obtain a corresponding financial report format, wherein the format information comprises the financial report format and a format name;
performing character recognition on the standard picture to obtain a plurality of pieces of field information, wherein the field information comprises the format name, subject information and financial report data information;
and performing layout recovery on the financial report format according to the field information, and checking to obtain an identified financial report document.
3. The method for identifying financial statements as claimed in claim 1, wherein said inputting the format information and the subject information into a subject matching model, and outputting a matching result, wherein the matching result comprises the matching subject of the financial statement to be processed and the confidence level of the matching subject comprises:
inputting the format information and the subject information into a subject matching model so as to obtain a subject template library matched with the format information according to the format information through the subject matching model;
segmenting the subject information according to the granularity of the characters to obtain character information corresponding to each subject, wherein the subject information comprises a plurality of subjects;
matching the character information corresponding to each subject according to a preset inverted index table so as to obtain a subject candidate set corresponding to each subject from the subject template library;
and matching each subject with a standard subject in the subject candidate set corresponding to the subject, and outputting a matching result, wherein the matching result comprises the matching subject of the to-be-processed financial report and the confidence level of the matching subject.
4. The method for identifying financial statements according to claim 3, wherein before matching the character information corresponding to each subject according to a preset inverted index table to obtain a subject candidate set corresponding to each subject, the method comprises:
acquiring a plurality of standard subjects;
and establishing a plurality of inverted index tables by taking the first characters of the plurality of standard subjects as key words, and storing the inverted index tables into the subject template library.
5. The method for identifying financial statements according to claim 3, wherein said matching each subject with a standard subject in the subject candidate set corresponding to the subject and outputting a matching result, wherein the matching result including the matching subject of the to-be-processed financial report and the confidence level of the matching subject comprises:
calculating, according to a similarity algorithm, a similarity value between each subject and each standard subject in the subject candidate set corresponding to the subject;
and taking the standard subjects with the similarity values larger than a preset threshold value as matching subjects, and determining the confidence level of the matching subjects according to the similarity values of the matching subjects.
6. A method of identifying a financial statement according to claim 1, further comprising:
acquiring a plurality of pieces of financial report data;
carrying out data analysis on the target financial report and the financial report data to obtain an analysis result;
and generating early warning information based on the analysis result.
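Claim 6 leaves the form of the data analysis open. One possible reading, sketched below under stated assumptions, compares each value in the target financial report with the mean of the same subject across previously acquired reports and emits a warning when the relative change exceeds a limit. The dictionary representation of a report, the function name, and the 50% limit are all hypothetical.

```python
from statistics import mean


def generate_warnings(target_report, historical_reports, max_relative_change=0.5):
    """Flag subjects whose value deviates strongly from their historical mean."""
    warnings = []
    for subject, value in target_report.items():
        history = [r[subject] for r in historical_reports if subject in r]
        if not history:
            continue  # nothing to compare against
        baseline = mean(history)
        if baseline == 0:
            continue  # relative change is undefined for a zero baseline
        change = (value - baseline) / abs(baseline)
        if abs(change) > max_relative_change:
            warnings.append(f"{subject}: {change:+.0%} versus historical mean")
    return warnings


# Hypothetical data: inventory triples while cash stays flat.
history = [
    {"Inventory": 100.0, "Cash": 100.0},
    {"Inventory": 100.0, "Cash": 100.0},
]
alerts = generate_warnings({"Inventory": 300.0, "Cash": 100.0}, history)
```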
7. A method of identifying a financial statement according to claim 1, further comprising:
and storing the target financial report into a block chain.
8. An apparatus for identifying financial statements, comprising:
the receiving module is used for receiving the to-be-processed financial report picture of the to-be-processed financial report;
the identification module is used for identifying the to-be-processed financial report picture to obtain a financial report document contained in the to-be-processed financial report picture, wherein the financial report document comprises format information, subject information and financial report data information;
the matching module is used for inputting the format information and the subject information into a subject matching model and outputting a matching result, wherein the matching result comprises a matching subject of the to-be-processed financial report and a confidence level of the matching subject;
and the determining and generating module is used for determining a target subject corresponding to the to-be-processed financial report from the matched subjects according to the confidence level and generating the target financial report based on the target subject and the financial report data information.
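The four modules of claim 8 can be read as a simple pipeline. The sketch below wires them together with the recognition and matching steps passed in as callables, since the claim does not fix how either is implemented. All class, field, and function names are hypothetical, and selecting the highest-confidence match is only one possible way to determine the target subject "according to the confidence level".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FinancialDocument:
    format_info: str            # format information
    subjects: list[str]         # subject information as recognized
    values: dict[str, float]    # financial report data, keyed by recognized subject


@dataclass
class Match:
    subject: str       # standard subject proposed by the matching model
    confidence: float  # confidence level, assumed numeric here


def identify_statement(
    picture: bytes,
    recognize: Callable[[bytes], FinancialDocument],
    match: Callable[[str, str], list[Match]],
) -> dict[str, float]:
    """Receive a picture, recognize it, match subjects, and build the target report."""
    document = recognize(picture)                              # identification module
    target_report = {}
    for raw_subject in document.subjects:
        matches = match(document.format_info, raw_subject)     # matching module
        if not matches:
            continue  # no matching subject was found for this entry
        best = max(matches, key=lambda m: m.confidence)        # determining step
        target_report[best.subject] = document.values[raw_subject]  # generating step
    return target_report
```

A usage example with stub callables: a recognizer that returns one misspelled subject and a matcher that proposes two standard subjects would yield a report keyed by the higher-confidence proposal.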
9. A computer device, characterized in that it comprises a memory, a processor, said memory having stored thereon a computer program executable on said processor, said computer program, when executed by said processor, implementing the steps of the identification method of financial statements according to any one of claims 1-7.
10. A computer-readable storage medium, in which a computer program is stored which is executable by at least one processor to cause the at least one processor to perform the steps of the method for identification of a financial statement according to any one of claims 1-7.
CN202010905770.6A 2020-09-01 2020-09-01 Financial statement identification method and device, computer equipment and readable storage medium Pending CN112036145A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010905770.6A CN112036145A (en) 2020-09-01 2020-09-01 Financial statement identification method and device, computer equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010905770.6A CN112036145A (en) 2020-09-01 2020-09-01 Financial statement identification method and device, computer equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN112036145A true CN112036145A (en) 2020-12-04

Family

ID=73590870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010905770.6A Pending CN112036145A (en) 2020-09-01 2020-09-01 Financial statement identification method and device, computer equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN112036145A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113094446A (en) * 2021-03-22 2021-07-09 北京三行科技有限公司 Subject information extraction method oriented to financial statement image
CN113094447A (en) * 2021-03-22 2021-07-09 北京三行科技有限公司 Structured information extraction method oriented to financial statement image
CN113158988A (en) * 2021-05-19 2021-07-23 上海云从企业发展有限公司 Financial statement processing method and device and computer readable storage medium
CN113158988B (en) * 2021-05-19 2024-04-05 上海云从企业发展有限公司 Financial statement processing method, device and computer readable storage medium
CN113569549A (en) * 2021-07-26 2021-10-29 平安资产管理有限责任公司 Report conversion processing method and device, computer equipment and readable storage medium
CN113672739A (en) * 2021-07-28 2021-11-19 达而观智能(深圳)有限公司 Data extraction method for image format financial and newspaper document
CN113627351A (en) * 2021-08-12 2021-11-09 达而观信息科技(上海)有限公司 Method and device for matching financial and newspaper subjects, computer equipment and storage medium
CN113627351B (en) * 2021-08-12 2024-01-30 达观数据有限公司 Matching method, device, computer equipment and storage medium for financial accounting subjects
CN117235233A (en) * 2023-10-24 2023-12-15 之江实验室 Automatic financial report question-answering method and device based on large model
CN117235233B (en) * 2023-10-24 2024-06-11 之江实验室 Automatic financial report question-answering method and device based on large model

Similar Documents

Publication Publication Date Title
CN112036145A (en) Financial statement identification method and device, computer equipment and readable storage medium
WO2022134588A1 (en) Method for constructing information review classification model, and information review method
US9626555B2 (en) Content-based document image classification
CN110751041A (en) Certificate authenticity verification method, system, computer equipment and readable storage medium
CN112257613B (en) Physical examination report information structured extraction method and device and computer equipment
US9286526B1 (en) Cohort-based learning from user edits
CN112036295B (en) Bill image processing method and device, storage medium and electronic equipment
CN111858977B (en) Bill information acquisition method, device, computer equipment and storage medium
CN114998920B (en) Supply chain financial file management method and system based on NLP semantic recognition
CN111858942A (en) Text extraction method and device, storage medium and electronic equipment
US20230138491A1 (en) Continuous learning for document processing and analysis
CN113469005A (en) Recognition method of bank receipt, related device and storage medium
CN113408536A (en) Bill amount identification method and device, computer equipment and storage medium
US9378428B2 (en) Incomplete patterns
CN116798061A (en) Bill auditing and identifying method, device, terminal and storage medium
CN113408446B (en) Bill accounting method and device, electronic equipment and storage medium
US11335108B2 (en) System and method to recognise characters from an image
CN115880702A (en) Data processing method, device, equipment, program product and storage medium
CN113936130A (en) Document information intelligent acquisition and error correction method, system and equipment based on OCR technology
CN114818627A (en) Form information extraction method, device, equipment and medium
CN112132693A (en) Transaction verification method, transaction verification device, computer equipment and computer-readable storage medium
CN112287763A (en) Image processing method, apparatus, device and medium
CN117971819B (en) Management method and system for automatically collecting stream data
CN117421487B (en) Multiple network information screening management system based on artificial intelligence
CN117493645B (en) Big data-based electronic archive recommendation system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination