CN114201451A - Information auditing method and device - Google Patents


Info

Publication number
CN114201451A
Authority
CN
China
Prior art keywords
result
classification result
information
auditing
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111544121.9A
Other languages
Chinese (zh)
Inventor
杨博
张勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202111544121.9A
Publication of CN114201451A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/113 Details of archiving
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses an information auditing method and device. A file to be processed in a preset file form is acquired; the information to be classified, in picture format, contained in a secondary directory of the file to be processed is extracted; and that information is archived and classified to obtain a classification result, which indicates whether the file type of the information to be classified is complete. If the classification result is a classification result containing a complete file type, it is audited through a preset audit rule to obtain and display an audit result. With this scheme, files need not be audited manually: archiving and classification alone verify the completeness of the files, and the complete files are then audited in a targeted way through configurable preset audit rules, which improves the efficiency and accuracy of file auditing.

Description

Information auditing method and device
Technical Field
The present application relates to the field of information auditing technologies, and in particular, to an information auditing method and apparatus.
Background
With the development of information technology, information auditing is applied more and more widely in daily life. For example, during product selection testing, manufacturers generally submit their documents on paper, and these paper documents contain fixed clauses, information items that the manufacturer must fill in, and the like.
In the prior art, because many manufacturers take part in the testing and the submitted documents are of many kinds, the fixed clauses of the submitted documents, the information items the manufacturers must fill in, and other contents have to be checked one by one. The file data is checked manually, which makes checking inefficient and reduces its accuracy.
Disclosure of Invention
In view of this, the present application discloses an information auditing method and apparatus, aiming to improve the efficiency and accuracy of auditing files.
To achieve this purpose, the technical solution is as follows:
A first aspect of the present application discloses an information auditing method, which includes:
acquiring a file to be processed in a preset file form;
extracting information to be classified in a picture format contained in a secondary directory of the file to be processed;
filing and classifying the information to be classified in the picture format to obtain a classification result; the classification result is used for indicating whether the file type of the information to be classified is complete or not;
and if the classification result is a classification result containing a complete file type, auditing the classification result through a preset auditing rule to obtain and display an auditing result.
Preferably, archiving and classifying the information to be classified to obtain the classification result includes:
acquiring each file type corresponding to the information to be classified;
matching each file type against the preset classification file types;
if the preset classification file type is matched with each file type, determining that the classification result of the information to be classified is a classification result containing a complete file type;
and if the preset classification file type is not matched with any one of the file types, determining that the classification result of the information to be classified is a classification result of a missing file type.
Preferably, the method further includes:
if the classification result is a classification result of a missing file type, generating missing-file reminder information.
Preferably, auditing the classification result through a preset audit rule to obtain and display an audit result, if the classification result is a classification result containing a complete file type, includes:
if the classification result is a classification result containing a complete file type, extracting the text content in the classification result through a preset text extraction technology;
checking whether the text content in the classification result is consistent with the preset text content;
if consistent, generating and displaying a first audit result, where the first audit result indicates a correct audit result;
if inconsistent, generating and displaying a second audit result, where the second audit result indicates an incorrect audit result.
Preferably, before the classification result is audited through the preset audit rule, the method further includes:
configuring the preset audit rule.
A second aspect of the present application discloses an information auditing apparatus, which includes:
an acquisition unit, configured to acquire a file to be processed in a preset file form;
an extraction unit, configured to extract the information to be classified, in picture format, contained in the secondary directory of the file to be processed;
an archiving and classification unit, configured to archive and classify the information to be classified in picture format to obtain a classification result, where the classification result indicates whether the file type of the information to be classified is complete;
and an auditing unit, configured to audit the classification result through a preset audit rule, if the classification result is a classification result containing a complete file type, so as to obtain and display an audit result.
Preferably, the archiving and classification unit includes:
an acquisition module, configured to acquire each file type corresponding to the information to be classified;
a matching module, configured to match each file type against the preset classification file types;
a first determining module, configured to determine that the classification result of the information to be classified is a classification result containing a complete file type if the preset classification file type is matched with each file type;
and a second determining module, configured to determine that the classification result of the information to be classified is a classification result of a missing file type if the preset classification file type is not matched with any one of the file types.
Preferably, the apparatus further includes:
a generating unit, configured to generate missing-file reminder information if the classification result is a classification result of a missing file type.
Preferably, the auditing unit includes:
an extraction module, configured to extract the text content in the classification result through a preset text extraction technology if the classification result is a classification result containing a complete file type;
a checking module, configured to check whether the text content in the classification result is consistent with the preset text content;
a first generation module, configured to generate and display a first audit result if the two are consistent, where the first audit result indicates a correct audit result;
a second generation module, configured to generate and display a second audit result if the two are inconsistent, where the second audit result indicates an incorrect audit result.
Preferably, the apparatus further includes:
a configuration unit, configured to configure the preset audit rule.
According to the above technical solution, the application discloses an information auditing method and device. A file to be processed in a preset file form is acquired; the information to be classified, in picture format, contained in a secondary directory of the file to be processed is extracted; and that information is archived and classified to obtain a classification result, which indicates whether the file type of the information to be classified is complete. If the classification result is a classification result containing a complete file type, it is audited through a preset audit rule to obtain and display an audit result. With this scheme, files need not be audited manually: archiving and classification alone verify the completeness of the files, and the complete files are then audited in a targeted way through configurable preset audit rules, which improves the efficiency and accuracy of file auditing.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is an architecture diagram of an information auditing system disclosed in an embodiment of the present application;
Fig. 2 is a schematic flowchart of an information auditing method disclosed in an embodiment of the present application;
Fig. 3 is a schematic diagram of the classification result disclosed in an embodiment of the present application;
Fig. 4 is a schematic diagram of the verification result disclosed in an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an information auditing apparatus disclosed in an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an electronic device disclosed in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
As the background shows, because many manufacturers take part in document auditing and the submitted documents are of many kinds, the fixed clauses of the submitted documents and the information items the manufacturers must fill in have to be checked one by one by manual comparison. Verifying the file data manually is inefficient and also reduces the accuracy of verification.
To solve this problem, an embodiment of the application discloses an information auditing method and apparatus. Files need not be audited manually: archiving and classification alone verify the completeness of the files, and the complete files are then audited in a targeted way through configurable preset audit rules, which improves the efficiency and accuracy of file auditing. The specific implementation is illustrated by the following embodiments.
The following first introduces an architecture diagram of an information auditing system to which the information auditing method and apparatus of the present application are applied, and referring to fig. 1, a specific architecture diagram of the information auditing system includes a front end 11, an information management service module 12, and an Optical Character Recognition (OCR) service module 13. Wherein, the front end 11 is provided with a viewing layer.
The data interaction process of the front end 11, the information management service module 12 and the OCR service module 13 is as follows:
The user uploads the file to be processed, in the preset file form, to the information management service module 12.
The preset file form is a folder. The folder is divided into two directory levels (a primary directory and secondary directories). The primary directory is named after the project. The secondary directories are named after the manufacturers' company names, so there may be several parallel directories depending on the number of manufacturers. Besides the manufacturer directories, the secondary level also holds template files in txt format, named by material type, with one txt template file per type. The paper materials scanned by each manufacturer into picture format (the information to be classified) are stored under each secondary directory.
After the information management service module 12 obtains the file to be processed in the preset file form, the information to be classified in the picture format included in the secondary directory of the file to be processed is extracted.
After the user uploads the file to be processed, the information management service module 12 automatically scans the whole folder. The interface shows the project information and the manufacturers under the project, with the manufacturer information displayed in a table. A document classification button can be clicked for each manufacturer, upon which the information management service module 12 archives and classifies the information to be classified in picture format according to the file category to which it belongs.
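The folder scan described above can be sketched as follows. This is a minimal illustration and not the patented implementation: the function name `scan_project`, the set of picture suffixes, and the choice to skip non-directory entries at the secondary level (such as txt template files) are all assumptions.

```python
from pathlib import Path

# Assumed picture formats; the source only says "picture format".
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def scan_project(root):
    """Return (project_name, {manufacturer_name: [picture paths]}) for a two-level folder.

    The primary directory name is taken as the project name and each
    sub-directory as one manufacturer (hypothetical helper).
    """
    root = Path(root)
    manufacturers = {}
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            continue  # skip plain files such as txt templates beside the manufacturer directories
        pictures = sorted(
            p for p in entry.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
        manufacturers[entry.name] = pictures
    return root.name, manufacturers
```

The returned mapping has one row per manufacturer, which is the shape a front end would need to render the table described above.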
After the classification is completed, the information management service module 12 may check the integrity of the documents, i.e., check to see which type of documents are missing.
The information management service module 12 files and classifies the information to be classified in the picture format to obtain a classification result; the classification result is used for indicating whether the file type of the information to be classified is complete or not.
Specifically, the information management service module 12 archives and classifies the information to be classified in picture format to obtain the classification result as follows:
The information management service module 12 obtains each file type corresponding to the information to be classified.
The information management service module 12 matches each file type against the preset classification file types.
If the preset classification file type is matched with each file type, the information management service module 12 determines that the classification result of the information to be classified is a classification result containing a complete file type.
If the preset classification file type is not matched with any one of the file types, the information management service module 12 determines that the classification result of the information to be classified is the classification result of the missing file type.
If the classification result is the classification result of the missing file type, the information management service module 12 generates missing file reminding information.
If the classification result is a classification result containing a complete file type, the information management service module 12 examines the classification result according to the configured preset examination rule to obtain an examination result.
Before auditing, the information management service module 12 configures the pre-processing operations of the audit rule: whether or not to filter seals out of the picture, and which items to audit (fixed template checking, key information checking, identity recognition, seal recognition, and the like).
After configuration is completed, the user can click a check button; the information management service module 12 then audits the classification result, and the front end 11 displays the processing progress while it runs. The whole operation can be carried out in batches for multiple manufacturers or for a single manufacturer.
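One way to represent this configuration is sketched below. The class `AuditConfig`, the helper `plan_audit`, and the identifier spellings of the audit items are hypothetical names chosen for illustration; the source names the options but not a data structure.

```python
from dataclasses import dataclass

# Audit items named in the text; the identifier spellings are assumptions.
AUDIT_ITEMS = (
    "fixed_template_check",
    "key_info_check",
    "identity_recognition",
    "seal_recognition",
)


@dataclass
class AuditConfig:
    filter_seal: bool = False   # whether seals are filtered out of the picture first
    items: tuple = AUDIT_ITEMS  # which items to audit

    def __post_init__(self):
        unknown = [i for i in self.items if i not in AUDIT_ITEMS]
        if unknown:
            raise ValueError(f"unknown audit items: {unknown}")


def plan_audit(manufacturers, config):
    """Expand one configuration into (manufacturer, item) tasks, for one or many manufacturers."""
    return [(m, item) for m in manufacturers for item in config.items]
```

Passing a single-element list covers the single-manufacturer case, and a longer list covers the batch case.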
Specifically, the information management service module 12 audits the classification result according to the configured preset audit rule to obtain the audit result as follows:
If the classification result is a classification result containing a complete file type, the information management service module 12 extracts the text content in the classification result through the OCR service module 13.
The OCR service module 13 extracts text using commonly used OCR technology, such as PaddleHub and computer vision (CV), combined with models optimized through working practice. This function is packaged and started as a service.
The OCR service module 13 is provided with a paddlehub model library in which various models are stored. When the method is applied specifically, a model corresponding to an application scene is selected from a model library according to the application scene selected by a user (scenes such as fixed template checking, key information checking, identity recognition, seal recognition and the like), and parameters are adjusted.
To audit the classification result, the information to be classified in picture format is sent through a function call to the corresponding model in the PaddleHub model library, and the model converts it into text content. Once the text content is obtained, it is audited according to the user-defined audit rules.
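The scene-to-model dispatch can be sketched as a simple registry. Everything here is illustrative: `MODEL_LIBRARY`, `extract_text`, and the stub recognizers are hypothetical stand-ins, and a real deployment would register actual OCR models in place of the stubs.

```python
from typing import Callable, Dict, List

Recognizer = Callable[[bytes], str]


def _stub_recognizer(label: str) -> Recognizer:
    # Placeholder only: a real model would run OCR on the picture bytes.
    def recognize(image: bytes) -> str:
        return f"[{label}] {len(image)} bytes"
    return recognize


# Hypothetical model library keyed by the application scene the user selects.
MODEL_LIBRARY: Dict[str, Recognizer] = {
    "fixed_template_check": _stub_recognizer("template"),
    "key_info_check": _stub_recognizer("key-info"),
    "identity_recognition": _stub_recognizer("identity"),
    "seal_recognition": _stub_recognizer("seal"),
}


def extract_text(scene: str, images: List[bytes]) -> List[str]:
    """Send each picture to the model registered for the selected scene."""
    try:
        model = MODEL_LIBRARY[scene]
    except KeyError:
        raise ValueError(f"no model registered for scene {scene!r}") from None
    return [model(image) for image in images]
```

Keeping the registry separate from the calling code means a new scene only needs a new entry, which matches the idea of selecting a model per application scene.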
The information management service module 12 checks whether the text content in the classification result is consistent with the preset text content.
If the two are consistent, the information management service module 12 generates and displays a first audit result; the first audit result is used for indicating a correct audit result.
If the two are inconsistent, the information management service module 12 generates and displays a second audit result; the second audit result is used for indicating an incorrect audit result.
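The consistency check and the two possible audit results can be sketched as follows. The whitespace normalization and the result labels `"correct"` and `"incorrect"` are assumptions made for this example; the source does not say how strictly the texts are compared.

```python
import re


def normalize(text):
    """Drop all whitespace so OCR line breaks and spacing do not count as differences (assumed rule)."""
    return re.sub(r"\s+", "", text)


def audit_text(page_texts, preset_text):
    """Join the extracted page texts and compare them with the preset text content."""
    full_text = "".join(page_texts)
    consistent = normalize(full_text) == normalize(preset_text)
    # "correct" stands for the first audit result, "incorrect" for the second.
    return {"result": "correct" if consistent else "incorrect", "text": full_text}
```

For instance, `audit_text(["I promise ", "to comply.\n"], "I promise to comply.")` yields a `"correct"` result, because the two texts differ only in whitespace.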
The view layer starts the view service of the front end 11 through a Node.js/HTTP service, providing the browser with a web interface through which the audit result is displayed.
In the embodiment of the application, files need not be audited manually: archiving and classification alone verify the completeness of the files, and the complete files are then audited in a targeted way through configurable preset audit rules, which improves the efficiency and accuracy of file auditing.
Referring to Fig. 2, a schematic flowchart of an information auditing method disclosed in an embodiment of the present application is shown. The information auditing method mainly includes the following steps:
S201: Acquire a file to be processed in a preset file form.
In S201, the file to be processed, uploaded by a user in the preset file form, is acquired.
The file to be processed is an initial file that has not yet undergone document classification.
The preset file form is a folder. The folder is divided into two directory levels (a primary directory and secondary directories). The primary directory is named after the project. The secondary directories are named after the manufacturers' company names, so there may be several parallel directories depending on the number of manufacturers. Besides the manufacturer directories, the secondary level also holds template files in txt format, named by material type, with one txt template file per type. The paper materials scanned by each manufacturer into picture format are stored under each secondary directory.
According to business needs, the paper materials are converted into an information format that can be stored and processed, which includes project information, manufacturer information, paper-document information, labels, categories, and the like.
S202: Extract the information to be classified, in picture format, contained in the secondary directory of the file to be processed.
The folder of the file to be processed is scanned to obtain the project information from the primary directory and the information to be classified, in picture format, from the secondary directories. The scan result includes at least the project information and the corresponding manufacturer information in picture form.
The project information and the manufacturer information of the project are displayed to the user on the front-end interface. The manufacturer information under the project is presented in a table.
S203: Archive and classify the information to be classified in picture format to obtain a classification result, which indicates whether the file type of the information to be classified is complete. If the classification result is a classification result containing a complete file type, S204 is executed; if the classification result is a classification result of a missing file type, S205 is executed.
Document completeness is checked during archiving so that a user can see which types of documents are missing.
The process of obtaining the classification result by filing and classifying the information to be classified is shown as A1-A4.
A1: Obtain each file type corresponding to the information to be classified.
The file types include the product selection evaluation discipline type, the company commitment and declaration type, the company commitment letter type, the tested-product commitment letter type, the legal representative authorization letter type, the business license type, and the like.
A2: Match each file type against the preset classification file types.
The preset classification file types are file types set in advance, and include a preset evaluation discipline type, a preset company commitment and declaration type, a preset company commitment letter type, a preset tested-product commitment letter type, a preset legal representative authorization letter type, a preset business license type, and the like.
How the preset classification file types are determined is not specifically limited in this application.
A3: If the preset classification file type is matched with each file type, determine that the classification result of the information to be classified is a classification result containing a complete file type.
For ease of understanding, this process is illustrated with an example:
For example, if the preset classification file types are the preset evaluation discipline type, the preset company commitment and declaration type, the preset company commitment letter type, the preset tested-product commitment letter type, the preset legal representative authorization letter type, and the preset business license type, and the file types corresponding to the information to be classified are the product selection evaluation discipline type, the company commitment and declaration type, the company commitment letter type, the tested-product commitment letter type, the legal representative authorization letter type, and the business license type, it is determined that the preset classification file types match each file type, and the classification result of the information to be classified is a classification result containing a complete file type.
A4: If the preset classification file type is not matched with any one of the file types, determine that the classification result of the information to be classified is a classification result of a missing file type.
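Steps A1-A4 can be sketched as a membership check of the file types found against the preset types. The function name `check_completeness`, the result dictionary, and the reminder wording are assumptions for illustration; this sketch reads A4 as "some preset type has no matching file type".

```python
def check_completeness(found_types, preset_types):
    """Report whether every preset classification file type was found.

    Returns a dict with the completeness flag, the missing types, and a
    reminder message when anything is missing (hypothetical result shape).
    """
    found = set(found_types)
    missing = [t for t in preset_types if t not in found]
    if missing:
        return {
            "complete": False,
            "missing": missing,
            "reminder": "Missing file types: " + ", ".join(missing),
        }
    return {"complete": True, "missing": [], "reminder": None}
```

A complete result would proceed to auditing, while a result with missing types would trigger the missing-file reminder.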
For ease of understanding, the process of archiving and classifying the information to be classified in picture format to obtain the classification result is described with reference to Fig. 3, which is merely an example of a classification result.
For example, in Fig. 3, the classification result of the complete file type includes the product selection evaluation discipline, the company commitment and declaration, the company commitment letter, the tested-product commitment letter, the legal representative authorization letter, and the business license. Interactive as a primary directory. If archiving and classifying the information to be classified in picture format yields a classification result in which only the product selection evaluation discipline type is missing, the message "the file is missing, please check" is displayed at the position corresponding to that type. The company commitment and declaration, company commitment letter, tested-product commitment letter, legal representative authorization letter, and business license documents are all complete, so the message "initial selection and classification succeeded, please check" is displayed at the positions corresponding to those document types.
S204: Audit the classification result through a preset audit rule to obtain and display an audit result.
After the information to be classified has been archived and classified, the material audit button is enabled, and the user can click it to audit the classified information.
The preset audit rule is configured before the classification result is audited through it.
The process of configuring the preset audit rule is as follows:
Choose whether or not to filter seals out of the picture, and select which items to audit (fixed template checking, key information checking, identity recognition, seal recognition, and the like). After configuration is completed, a check button can be clicked; the background then performs the check, and the front end displays the processing progress while it runs. The whole operation can be executed in batches for multiple manufacturers or for a single manufacturer.
Because the red of a seal differs from the black of the text, the option to filter or not filter the seal in the picture is implemented with logic built on the open-source computer vision library OpenCV (opencv).
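The colour-based idea can be illustrated without any image library by treating a picture as rows of RGB tuples. This is a simplified stand-in and not the OpenCV logic the source refers to; the thresholds `min_red` and `max_other` are assumed values, and an OpenCV version would more likely threshold a colour range on the decoded image array.

```python
def is_seal_red(pixel, min_red=150, max_other=110):
    """Heuristic: strong red channel with weak green and blue (assumed thresholds)."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def filter_seal(image, background=(255, 255, 255)):
    """Replace seal-coloured pixels with the background colour; other pixels are kept."""
    return [
        [background if is_seal_red(px) else px for px in row]
        for row in image
    ]
```

Black text pixels fail the red test and survive, so the text under a faint seal remains available for OCR. Text printed directly beneath a dense seal would need more careful handling than this sketch provides.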
Configurable preset audit rules. Common document audit scenarios are summarized into five categories: fixed text template checking, key information checking (filled-in items), form information extraction, seal identification, and certificate information auditing. An audit scenario can be flexibly configured for each type of document. For example, key information checking, seal identification, and certificate information auditing can be configured for company personnel authorization letters: key information such as the company name and the identities of the legal representative and the authorized person can be extracted, as can the company seal, the legal representative's seal, the authorized person's signature, and the like, and the extracted items are cross-checked against one another. Which document types are audited, and with which audit methods, is flexibly configurable, so that business operations are driven by configuration. Each configuration item can serve as a service tool, which is also the core of how the information auditing system adapts to business expansion.
The audit result (check result) first provides an overview of which categories of documents contain errors and how many errors each contains. Each type of file has a detailed check report in which the differences are displayed word by word, and the original picture can be displayed for comparison.
The process of auditing the classification result through the preset auditing rule and obtaining and displaying the audit result is shown in steps B1 to B4.
B1: Extract the text content in the classification result through a preset text extraction technology.
The preset text extraction technology can be OCR technology or another text extraction technology; it is set by technicians according to actual conditions and is not specifically limited in this application.
Key information in the classification result, such as the company name, the legal representative, and the identity of the authorized person, can be extracted for key information checking, seal identification, and certificate information auditing; content such as the company seal, the legal representative's seal, and the authorized person's signature can also be extracted.
B2: Check whether the text content in the classification result is consistent with the preset text content.
First, document pictures of the same type are converted into text through a model. The texts are then spliced into a full text, and the text content of the document is compared with the preset text content of the template to determine whether the text content in the classification result is consistent with the preset text content.
The preset text content may be in the form of a preset file, i.e., a folder. According to the business, paper materials are converted into a preset file form that can be stored and processed; the preset file form includes project information, manufacturer information, paper information, label information, category information, and the like.
B3: If they are consistent, generate and display a first audit result; the first audit result is used for indicating a correct audit result.
For example, if the item information of the text content in the classification result is the item information a and the item information of the preset text content is the item information a, it is determined that the text content in the classification result is consistent with the preset text content.
And after the text content in the classification result is determined to be consistent with the preset text content, displaying the verification result to the user through the front end.
B4: If they are inconsistent, generate and display a second audit result; the second audit result is used for indicating an incorrect audit result.
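Steps B2 to B4 can be sketched as follows, assuming the text of each picture has already been extracted in step B1 (for example by OCR). The comparison uses Python's standard `difflib` module as a stand-in; the application does not name a comparison algorithm, and `AuditResult` and `audit_text` are hypothetical names.

```python
import difflib
from dataclasses import dataclass
from typing import List


@dataclass
class AuditResult:
    passed: bool            # True -> first (correct) result, False -> second (incorrect) result
    differences: List[str]  # word-level differences for the detailed report


def audit_text(page_texts: List[str], template_text: str) -> AuditResult:
    """Splice page texts into a full text and compare it with the template."""
    full_text = " ".join(page_texts)  # B2: splice the per-picture texts
    extracted_words = full_text.split()
    template_words = template_text.split()
    matcher = difflib.SequenceMatcher(a=template_words, b=extracted_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            expected = " ".join(template_words[i1:i2])
            found = " ".join(extracted_words[j1:j2])
            differences.append(f"expected '{expected}' but found '{found}'")
    # B3 / B4: consistent text yields a passing result, otherwise a failing one.
    return AuditResult(passed=not differences, differences=differences)


template = "The company promises to deliver on time"
ok = audit_text(["The company promises", "to deliver on time"], template)
bad = audit_text(["The company promises", "to deliver late"], template)
```

Here `ok.passed` is true, while `bad` carries one difference entry that a front end could render next to the original picture. Whitespace-separated words are used only to keep the sketch short; text without spaces between words would need character-level comparison instead.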
To make it easier to understand how the text content in the classification result is checked against the preset text content, an example is described with reference to fig. 4, which is a schematic diagram of the verification result. Fig. 4 is only an exemplary diagram.
For example, in fig. 4, 16 checks fail in the company commitments and statements type, and the information "16 checks failed, please check" is generated; 8 checks fail in the company commitment type, and the information "8 checks failed, please check" is generated; 14 checks fail in the product commitment type, and the information "14 checks failed, please check" is generated; 6 checks fail in the legal representative authorization letter type, and the information "6 checks failed, please check" is generated; since there is no matching template for the license type, the information "no match template, manual match" is generated.
All the check results in fig. 4 can be exported as a corresponding web page report via the web page report button. For the license type, the original can be viewed via the view original button.
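The overview messages described for fig. 4 could be produced by a small helper like the one below. This is only a sketch: the function names are hypothetical, `None` is an assumed convention for "no template matched", and the "all checks passed" wording is an assumption because the source gives no message for that case.

```python
from typing import Dict, Optional


def overview_message(failed_checks: Optional[int]) -> str:
    """Build the per-category overview line shown to the user."""
    if failed_checks is None:  # assumed convention: no template matched this type
        return "no match template, manual match"
    if failed_checks == 0:     # assumed wording, not given in the source
        return "all checks passed"
    return f"{failed_checks} checks failed, please check"


def build_overview(results: Dict[str, Optional[int]]) -> Dict[str, str]:
    """Map each document category to its overview message."""
    return {doc_type: overview_message(count) for doc_type, count in results.items()}


overview = build_overview({
    "company commitments and statements": 16,
    "legal representative authorization letter": 6,
    "license": None,
})
```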
S205: Generate missing file reminder information.
Specifically, for the generated missing file reminder information, refer to the exemplary diagram in fig. 4.
For example, if a file such as the product type selection evaluation discipline is missing, reminder information indicating that the file is missing is generated at the position corresponding to the product type selection evaluation discipline.
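The completeness check behind this reminder can be sketched as a set difference between the preset file types and the file types actually found. The function names, the dictionary layout, and the reminder wording are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Union


def missing_file_types(required: Iterable[str], found: Iterable[str]) -> List[str]:
    """Return the preset file types that have no matching file, sorted."""
    return sorted(set(required) - set(found))


def classify(required: Iterable[str], found: Iterable[str]) -> Dict[str, Union[bool, List[str]]]:
    """Produce a classification result plus reminders for any missing types."""
    missing = missing_file_types(required, found)
    return {
        "complete": not missing,  # True -> proceed to auditing
        "reminders": [f"missing file: {name}" for name in missing],
    }


required_types = ["license", "product commitment", "authorization letter"]
result = classify(required_types, ["license", "authorization letter"])
```

Only when `result["complete"]` is true would the audit step run; otherwise the reminders are shown at the positions of the missing file types.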
The method and the device are simple to operate: the operation flow is merged and simplified as much as possible according to the business, and files are scanned and managed automatically through file uploading rather than through the traditional mode of manually filling in and submitting forms, which greatly simplifies the operation flow.
Following the microservice system design approach, the background provides services through an application programming interface (API), so the system functions can be easily integrated with other existing systems. Because the front end and the back end are separated, the front-end view can also be easily integrated with other systems.
In the embodiment of the application, files do not need to be audited manually: the integrity of the files is verified simply through operations such as archiving and classification, and the complete files are then audited in a targeted manner through configurable preset auditing rules, which improves the efficiency and accuracy of file auditing.
Based on the information auditing method disclosed in the embodiment of fig. 2, an information auditing apparatus is also correspondingly disclosed in the embodiment of the present application. As shown in fig. 5, the information auditing apparatus mainly includes an acquiring unit 501, an extracting unit 502, an archiving and classifying unit 503, and an auditing unit 504.
The acquiring unit 501 is configured to acquire a to-be-processed file in a preset file form.
The extracting unit 502 is configured to extract information to be classified in a picture format included in a secondary directory of the file to be processed.
The filing and classifying unit 503 is configured to perform filing and classification on the information to be classified in the picture format to obtain a classification result; the classification result is used for indicating whether the file type of the information to be classified is complete or not.
The auditing unit 504 is configured to, if the classification result is a classification result containing a complete file type, audit the classification result according to a preset auditing rule to obtain an auditing result, and display the auditing result.
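The work of the acquiring and extracting units can be sketched as a walk over the second-level directories of an uploaded folder. The folder layout (one subdirectory per group of pictures), the suffix list, and the name `extract_pictures` are assumptions; the application only states that picture-format information sits in a secondary directory of the file to be processed.

```python
from pathlib import Path
from typing import Dict, List

# Assumed set of picture suffixes; the source does not list concrete formats.
PICTURE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def extract_pictures(root: Path) -> Dict[str, List[str]]:
    """Map each second-level directory name to the picture file names it contains."""
    pictures: Dict[str, List[str]] = {}
    for subdir in sorted(p for p in root.iterdir() if p.is_dir()):
        pictures[subdir.name] = sorted(
            f.name
            for f in subdir.iterdir()
            if f.is_file() and f.suffix.lower() in PICTURE_SUFFIXES
        )
    return pictures
```

The resulting mapping is what an archiving and classifying step would consume, and non-picture files are ignored rather than reported.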
Further, the archiving and classifying unit 503 includes an acquisition module, a matching module, a first determining module, and a second determining module.
The acquisition module is configured to acquire each file type corresponding to the information to be classified.
The matching module is configured to match each file type against the preset classification file types.
The first determining module is configured to determine that the classification result of the information to be classified is a classification result containing a complete file type if the preset classification file types match each file type.
The second determining module is configured to determine that the classification result of the information to be classified is a classification result of a missing file type if the preset classification file types do not match any one of the file types.
Further, the device also comprises a generating unit.
The generating unit is configured to generate missing file reminder information if the classification result is a classification result of a missing file type.
Further, the auditing unit 504 includes an extraction module, a checking module, a first generation module, and a second generation module.
The extraction module is configured to extract the text content in the classification result through a preset text extraction technology if the classification result is a classification result containing a complete file type.
The checking module is configured to check whether the text content in the classification result is consistent with the preset text content.
The first generation module is configured to generate and display a first audit result if they are consistent; the first audit result is used for indicating a correct audit result.
The second generation module is configured to generate and display a second audit result if they are inconsistent; the second audit result is used for indicating an incorrect audit result.
Further, the apparatus also comprises a configuration unit.
The configuration unit is configured to configure the preset auditing rule.
In the embodiment of the application, files do not need to be audited manually: the integrity of the files is verified simply through operations such as archiving and classification, and the complete files are then audited in a targeted manner through configurable preset auditing rules, which improves the efficiency and accuracy of file auditing.
The embodiment of the application also provides a storage medium, wherein the storage medium comprises stored instructions, and when the instructions are executed, the equipment where the storage medium is located is controlled to execute the information auditing method.
The embodiment of the present application further provides an electronic device, shown in fig. 6, which specifically includes a memory 601 and one or more instructions 602, where the one or more instructions 602 are stored in the memory 601 and are configured to be executed by one or more processors 603 to perform the above-mentioned information auditing method.
The embodiments in the present specification are described in a progressive manner; the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus or system embodiments are substantially similar to the method embodiments and are therefore described relatively briefly; for related points, reference may be made to the corresponding descriptions of the method embodiments. The apparatus and system embodiments described above are only illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing is only a preferred embodiment of the present application. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered to fall within the protection scope of the present application.

Claims (10)

1. An information auditing method, characterized in that the method comprises:
acquiring a file to be processed in a preset file form;
extracting information to be classified in a picture format contained in a secondary directory of the file to be processed;
filing and classifying the information to be classified in the picture format to obtain a classification result; the classification result is used for indicating whether the file type of the information to be classified is complete or not;
and if the classification result is a classification result containing a complete file type, auditing the classification result through a preset auditing rule to obtain and display an auditing result.
2. The method according to claim 1, wherein the archiving and classifying the information to be classified to obtain the classification result comprises:
acquiring each file type corresponding to the information to be classified;
matching each file type through a preset classified file type;
if the preset classification file type is matched with each file type, determining the classification result of the information to be classified as the classification result containing the complete file type;
and if the preset classification file type is not matched with any one of the file types, determining that the classification result of the information to be classified is the classification result of the missing file type.
3. The method of claim 2, further comprising:
and if the classification result is the classification result of the type of the missing file, generating missing file reminding information.
4. The method according to claim 1, wherein the auditing of the classification result through a preset auditing rule if the classification result is a classification result containing a complete file type, to obtain and display an auditing result, comprises:
if the classification result is a classification result containing a complete file type, extracting text content in the classification result by a preset text extraction technology;
checking whether the text content in the classification result is consistent with the preset text content;
if they are consistent, generating and displaying a first auditing result; the first auditing result is used for indicating a correct auditing result;
if they are inconsistent, generating and displaying a second auditing result; and the second auditing result is used for indicating an incorrect auditing result.
5. The method according to claim 1, wherein before the auditing of the classification result through a preset auditing rule if the classification result is a classification result containing a complete file type, to obtain and display an auditing result, the method further comprises:
and configuring a preset auditing rule.
6. An information auditing apparatus, characterized in that the apparatus comprises:
the acquisition unit is used for acquiring a file to be processed in a preset file form;
the extraction unit is used for extracting the information to be classified in the picture format contained in the secondary directory of the file to be processed;
the filing and classifying unit is used for filing and classifying the information to be classified in the picture format to obtain a classification result; the classification result is used for indicating whether the file type of the information to be classified is complete or not;
and the auditing unit is used for auditing the classification result through a preset auditing rule if the classification result is a classification result containing a complete file type, so as to obtain and display an auditing result.
7. The apparatus of claim 6, wherein the archive classification unit comprises:
the acquisition module is used for acquiring each file type corresponding to the information to be classified;
the matching module is used for matching each file type through a preset classified file type;
the first determining module is used for determining the classification result of the information to be classified as the classification result containing the complete file type if the preset classification file type is matched with each file type;
and the second determining module is used for determining that the classification result of the information to be classified is the classification result of the missing file type if the preset classification file type is not matched with any one of the file types.
8. The apparatus of claim 7, further comprising:
and the generating unit is used for generating missing file reminding information if the classification result is the classification result of the missing file type.
9. The apparatus of claim 6, wherein the auditing unit comprises:
the extraction module is used for extracting the text content in the classification result through a preset text extraction technology if the classification result is the classification result containing the complete file type;
the checking module is used for checking whether the text content in the classification result is consistent with the preset text content;
the first generation module is used for generating and displaying a first auditing result if they are consistent; the first auditing result is used for indicating a correct auditing result;
the second generation module is used for generating and displaying a second auditing result if they are inconsistent; and the second auditing result is used for indicating an incorrect auditing result.
10. The apparatus of claim 6, further comprising:
and the configuration unit is used for configuring the preset auditing rule.
CN202111544121.9A 2021-12-16 2021-12-16 Information auditing method and device Pending CN114201451A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111544121.9A CN114201451A (en) 2021-12-16 2021-12-16 Information auditing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111544121.9A CN114201451A (en) 2021-12-16 2021-12-16 Information auditing method and device

Publications (1)

Publication Number Publication Date
CN114201451A true CN114201451A (en) 2022-03-18

Family

ID=80654543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111544121.9A Pending CN114201451A (en) 2021-12-16 2021-12-16 Information auditing method and device

Country Status (1)

Country Link
CN (1) CN114201451A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115203506A (en) * 2022-06-27 2022-10-18 海南电网有限责任公司信息通信分公司 Archive filing similarity calculation method based on multi-mode verification algorithm
CN115794733A (en) * 2022-11-11 2023-03-14 南京维拓科技股份有限公司 Design document management method in industrial design
CN116756089A (en) * 2023-08-21 2023-09-15 湖南云档信息科技有限公司 File archiving scheme forming method, system and storage medium
CN116756089B (en) * 2023-08-21 2023-11-03 湖南云档信息科技有限公司 File archiving scheme forming method, system and storage medium

Similar Documents

Publication Publication Date Title
CN114201451A (en) Information auditing method and device
US10783367B2 (en) System and method for data extraction and searching
US8185503B2 (en) Document archival system
CN112085578A (en) Electronic invoice reimbursement system and electronic invoice holder device
CN110737630B (en) Method and device for processing electronic archive file, computer equipment and storage medium
CN110688349B (en) Document sorting method, device, terminal and computer readable storage medium
CN110888881A (en) Picture association method and device, computer equipment and storage medium
KR102039989B1 (en) Method and apparatus for extraciting text data from nonlinear text image
CN111126952A (en) Electronic file filing processing system and method
WO2016186137A1 (en) Accounting assistance system
CN109214362B (en) Document processing method and related equipment
CN110599319A (en) Automatic auditing method, device, terminal and storage medium
CN114445836A (en) Information auditing method and device combining RPA and AI and electronic equipment
CN109767239A (en) A kind of method and system for being verified to electronic invoice
CN117036073A (en) Invoice auditing and automatic reimbursement system based on Internet
CN116882380A (en) Document template generation method for text management system
CN116228265A (en) Invoice risk identification method, device and equipment
CN110991352A (en) File data examination method and device
KR102055920B1 (en) Method and system for providing online parts book service
CN116824604B (en) Financial data management method and system based on image processing
US11829706B1 (en) Document assembly with the help of training data
US8390836B2 (en) Automatic review of variable imaging jobs
KR101320630B1 (en) System and method of processing on internet for joining members of credit card chian via VAN agent
KR102418865B1 (en) A database building method to digitize for public literature files
CN111314314B (en) Method and system for verifying integrity of website download file

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination