CN117389960A - File parsing method, apparatus, device, storage medium and program product - Google Patents

File parsing method, apparatus, device, storage medium and program product

Info

Publication number
CN117389960A
Authority
CN
China
Prior art keywords
file
field
target file
target
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311120050.9A
Other languages
Chinese (zh)
Inventor
马嘉琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN202311120050.9A
Publication of CN117389960A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/16: File or folder operations, e.g. details of user interfaces specifically adapted to file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a file parsing method, apparatus, device, storage medium and program product, which relate to the field of file parsing and can be used in the field of financial technology or other related fields. The method comprises: obtaining, according to a target file to be parsed, a plurality of subfiles corresponding to the target file; acquiring a plurality of key parameter fields of each subfile; parsing the subfiles in parallel according to the plurality of key parameter fields of each subfile to obtain a parsing result of each subfile; and determining the file parsing result of the target file according to the parsing result of each subfile. Because the target file is divided into a plurality of subfiles and each subfile is parsed according to its key parameter fields, files in any format can be parsed in this way without writing different parsing code for files in different formats, which improves file parsing efficiency.

Description

File parsing method, apparatus, device, storage medium and program product
Technical Field
The present invention relates to the field of file parsing, and in particular, to a method, apparatus, device, storage medium, and program product for file parsing.
Background
With the rapid development of information technology, interaction between systems is increasing. Taking the financial industry as an example, it is increasingly common for large quantities of financial data to be transmitted between different business systems with files as the carrier.
In the related art, after various files are received, it is often necessary to parse them, either to verify the accuracy of the files or to perform operations such as comparison between different files on the basis of the parsed content.
However, the related art has a problem of low file parsing efficiency.
Disclosure of Invention
Accordingly, in view of the above technical problem, it is necessary to provide a file parsing method, apparatus, device, storage medium, and program product that divide a target file to be parsed into a plurality of subfiles and parse each subfile according to its key parameter fields to obtain a file parsing result of the target file, thereby improving file parsing efficiency.
In a first aspect, an embodiment of the present application provides a file parsing method. The method comprises the following steps:
obtaining, according to a target file to be parsed, a plurality of subfiles corresponding to the target file;
acquiring a plurality of key parameter fields of each sub-file;
parsing each sub-file in parallel according to the plurality of key parameter fields of each sub-file to obtain a parsing result of each sub-file;
and determining the file parsing result of the target file according to the parsing result of each sub-file.
In one embodiment, according to a target file to be parsed, a plurality of subfiles corresponding to the target file are obtained, including:
acquiring the file size of a target file according to the target file to be analyzed;
and under the condition that the file size of the target file is detected to be larger than a preset threshold value, carrying out file segmentation processing on the target file to obtain a plurality of subfiles corresponding to the target file.
In one embodiment, obtaining a plurality of key parameter fields for each subfile includes:
displaying a file parameter field configuration interface;
receiving a plurality of parameter fields input by a user in a file parameter field configuration interface;
each parameter field is determined as each key parameter field.
In one embodiment, obtaining a plurality of key parameter fields for each subfile includes:
extracting keywords of each subfile from each subfile;
according to each keyword, obtaining a target parameter corresponding to each keyword from a parameter storage library;
each target parameter is determined as each key parameter field.
In one embodiment, according to a plurality of key parameter fields of each sub-file, parsing each sub-file in parallel to obtain a parsing result of each sub-file includes:
according to each key parameter field, analyzing each sub-file in parallel to obtain a field value corresponding to each key parameter field;
and determining each field value as the analysis result of each subfile.
In one embodiment, determining the file analysis result of the target file according to the analysis result of each sub-file includes:
and carrying out fusion processing on the analysis results of the subfiles according to the file sequence of the subfiles to obtain the file analysis result of the target file.
In one embodiment, the method further comprises:
under the condition that the completion of analysis of the target file is detected, acquiring a field mapping relation between the target file and the file to be compared; the field mapping relation represents the corresponding relation between a first parameter field in the target file and a second parameter field in the file to be compared;
and determining the comparison result of the target file and the file to be compared according to the field mapping relation.
In one embodiment, determining a comparison result of the target file and the file to be compared according to the field mapping relation includes:
acquiring a plurality of first parameter fields and a plurality of corresponding second parameter fields according to the field mapping relation;
according to each first parameter field and each corresponding second parameter field, a first field value corresponding to each first parameter field and a second field value corresponding to each second parameter field are obtained;
and comparing each first field value with each corresponding second field value to obtain a comparison result of the target file and the file to be compared.
In a second aspect, an embodiment of the present application further provides a file parsing apparatus. The device comprises:
the file acquisition module is used for acquiring a plurality of subfiles corresponding to the target file according to the target file to be analyzed;
the field acquisition module is used for acquiring a plurality of key parameter fields of each sub-file;
the first determining module is used for analyzing each sub-file in parallel according to the key parameter fields of each sub-file to obtain an analysis result of each sub-file;
and the second determining module is used for determining the file analysis result of the target file according to the analysis result of each sub-file.
In a third aspect, embodiments of the present application further provide a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of any of the embodiments of the first aspect described above when the processor executes the computer program.
In a fourth aspect, embodiments of the present application also provide a computer-readable storage medium. A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of any of the embodiments of the first aspect described above.
In a fifth aspect, embodiments of the present application also provide a computer program product. A computer program product comprising a computer program which when executed by a processor performs the steps of any of the embodiments of the first aspect described above.
According to the above file parsing method, apparatus, device, storage medium and program product, a plurality of subfiles corresponding to the target file are obtained according to the target file to be parsed, a plurality of key parameter fields of each subfile are acquired, the subfiles are parsed in parallel according to the plurality of key parameter fields of each subfile to obtain a parsing result of each subfile, and finally the file parsing result of the target file is determined according to the parsing result of each subfile. Because the target file is divided into a plurality of subfiles and each subfile is parsed according to its key parameter fields, files in any format can be parsed in this way without writing different parsing code for files in different formats, which improves file parsing efficiency.
Drawings
FIG. 1 is a diagram of an application environment for a file parsing method in one embodiment;
FIG. 2 is a flow chart of a file parsing method in one embodiment;
FIG. 3 is a flow diagram of a process for obtaining subfiles in one embodiment;
FIG. 4 is a flow chart illustrating determining the parsing result of each subfile in one embodiment;
FIG. 5 is a flow chart of comparing files in one embodiment;
FIG. 6 is a flow chart of comparing files according to another embodiment;
FIG. 7 is a flow chart of a file parsing method according to another embodiment;
FIG. 8 is a schematic diagram of a file parsing apparatus according to an embodiment;
FIG. 9 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The file parsing method provided by the embodiments of the present application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process. The data storage system may be integrated on the server 104 or may be located on a cloud or other network server. Optionally, a visual interface may be integrated in the terminal 102 for displaying the file parsing result of the target file. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
In one embodiment, as shown in FIG. 2, a file parsing method is provided. The method is described as applied to the server 104 in FIG. 1 by way of example, and includes the following steps:
S201, obtaining, according to a target file to be parsed, a plurality of subfiles corresponding to the target file.
In one implementation, the target file may be input into a pre-trained model, and the model outputs a plurality of subfiles corresponding to the target file. Alternatively, a file splitting tool, for example an online text file cutting tool, may be used to split the target file into a plurality of subfiles corresponding to the target file.
In another implementation, the subfiles obtained by splitting the target file may be stored in a database in advance and read directly from the database when needed; for example, a lookup may be performed in the database according to the file identifier of the target file to obtain the plurality of subfiles corresponding to the target file.
S202, acquiring a plurality of key parameter fields of each sub-file.
One implementation way is to display a file parameter field configuration interface, receive a plurality of parameter fields input by a user in the file parameter field configuration interface, and determine each parameter field as each key parameter field. Optionally, when the key parameter fields of each sub-file need to be acquired, a file parameter field configuration interface is displayed for the user, so that the user inputs a plurality of fields in the file parameter field configuration interface, after the user finishes inputting, the server can acquire the plurality of parameter fields input by the user, and then each parameter field is used as each key parameter field.
Another implementation manner is to extract keywords of each sub-file from each sub-file, obtain target parameters corresponding to each keyword from a parameter storage library according to each keyword, and determine each target parameter as each key parameter field.
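The keyword-based implementation can be sketched as follows. This is a minimal illustration, not the method as implemented in the application: the dictionary `PARAMETER_REPOSITORY` stands in for the parameter storage library, the rule that keywords are the tokens of a comma-separated first line is an assumption, and all function and field names are hypothetical.

```python
# Hypothetical in-memory stand-in for the "parameter storage library".
PARAMETER_REPOSITORY = {
    "phone": "phone_number",
    "amount": "transaction_amount",
    "name": "customer_name",
}


def extract_keywords(subfile_text):
    """Treat the first line as a comma-separated header and return its tokens.

    This extraction rule is an assumption made for the sketch only.
    """
    lines = subfile_text.splitlines()
    header = lines[0] if lines else ""
    return [token.strip().lower() for token in header.split(",") if token.strip()]


def key_parameter_fields(subfile_text, repository=PARAMETER_REPOSITORY):
    """Map each extracted keyword to its target parameter, skipping unknown keywords."""
    keywords = extract_keywords(subfile_text)
    return [repository[keyword] for keyword in keywords if keyword in repository]
```

For a subfile whose first line is `Name, Amount, Memo`, the sketch keeps the two keywords found in the repository and ignores `Memo`.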
S203, analyzing each sub-file in parallel according to the key parameter fields of each sub-file to obtain an analysis result of each sub-file.
Optionally, any one sub-file is parsed according to its plurality of key parameter fields to obtain the parsing result of that sub-file, and the other sub-files are parsed in the same way, so that the parsing result of each sub-file is obtained.
S204, determining the file analysis result of the target file according to the analysis result of each sub-file.
Optionally, the analysis results of the subfiles can be fused according to the file sequence of each subfile, so as to obtain the file analysis result of the target file.
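Parsing the subfiles in parallel and then fusing the results in file order could look like the sketch below. It assumes a thread pool and a caller-supplied `parse` function that returns a list of records for one subfile; both choices, and the name `parse_and_merge`, are illustrative rather than taken from the application.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def parse_and_merge(subfiles, parse):
    """Parse subfiles concurrently, then concatenate the results in original file order."""
    results = {}
    with ThreadPoolExecutor() as pool:
        # Remember the position of each subfile so results can be reordered later.
        futures = {pool.submit(parse, sub): index for index, sub in enumerate(subfiles)}
        for future in as_completed(futures):
            # Tasks may finish in any order; store each result under its subfile index.
            results[futures[future]] = future.result()
    merged = []
    for index in range(len(subfiles)):
        merged.extend(results[index])
    return merged
```

Keeping the subfile index next to each task means the merged result does not depend on which subfile happens to finish first. For CPU-bound parsing in CPython, a process pool may be a better fit than threads.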
It should be noted that file parsing scenarios are increasingly common in the financial field. For example, when a user needs to obtain transaction details, the client sends a transaction detail acquisition request to the server, and the server obtains the user's transaction detail data from the database, assembles the data into a transaction detail file, and sends the file to the user. To ensure that the data in the transaction detail file is accurate and that no data has been lost, the transaction detail file can be parsed in the above manner, and whether the file is accurate is determined according to the parsing result.
According to the file analysis method provided by the embodiment of the application, a plurality of subfiles corresponding to the target file are obtained according to the target file to be analyzed, a plurality of key parameter fields of each subfile are further obtained, then the subfiles are analyzed in parallel according to the plurality of key parameter fields of each subfile, analysis results of the subfiles are obtained, and finally the file analysis results of the target file are determined according to the analysis results of the subfiles. According to the method, the target file to be analyzed is divided into the plurality of subfiles, and then each subfile is analyzed according to the key parameter field of each subfile, so that the file analysis result of the target file is obtained, namely, files in any format can be analyzed by adopting the mode without writing different analysis codes for files in different formats, and the analysis efficiency of the files is improved.
In general, directly parsing a larger file takes a long time. To improve file parsing efficiency, the file may first be split into a plurality of relatively small subfiles. Based on this, in one embodiment, an optional way of obtaining a plurality of subfiles corresponding to a target file is provided. As shown in FIG. 3, the following steps may be included:
S301, obtaining the file size of the target file according to the target file to be parsed.
S302, when the file size of the target file is detected to be larger than a preset threshold, performing file splitting on the target file to obtain a plurality of subfiles corresponding to the target file.
Optionally, when the file size of the target file is larger than the preset threshold, a file splitting tool may be used to split the target file into a plurality of subfiles corresponding to the target file.
In the embodiment of the application, the target file is segmented to obtain the plurality of subfiles corresponding to the target file, and then the plurality of subfiles are analyzed to determine the file analysis result of the target file, so that the file analysis efficiency is improved.
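Steps S301 and S302 can be sketched as below. The sketch works on file content already held in memory as bytes and splits on line boundaries so that no record is cut in half; the parameters `threshold` and `chunk_bytes` and the line-aligned strategy are assumptions, since the application does not specify how the split points are chosen.

```python
def split_if_large(data: bytes, threshold: int, chunk_bytes: int) -> list:
    """Return [data] if it is within the threshold, else line-aligned subfiles."""
    # S301: the file size is the length of the content in bytes.
    if len(data) <= threshold:
        return [data]
    # S302: the file exceeds the threshold, so split it into subfiles.
    subfiles, current, current_size = [], [], 0
    for line in data.splitlines(keepends=True):
        # Start a new subfile when adding this line would exceed the chunk size.
        if current and current_size + len(line) > chunk_bytes:
            subfiles.append(b"".join(current))
            current, current_size = [], 0
        current.append(line)
        current_size += len(line)
    if current:
        subfiles.append(b"".join(current))
    return subfiles
```

Concatenating the returned subfiles in order reproduces the original content, which is what allows the parsing results to be fused by file order afterwards.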
Each sub-file is parsed according to the key parameter fields: the content corresponding to each key parameter field is obtained from each sub-file, and this content forms the parsing result of that sub-file. Based on this, in one embodiment, an optional way of determining the parsing result of each sub-file is provided. As shown in FIG. 4, the following steps may be included:
S401, parsing each sub-file in parallel according to each key parameter field to obtain a field value corresponding to each key parameter field.
S402, determining each field value as the analysis result of each subfile.
Optionally, each sub-file is parsed according to each key parameter field, and the content corresponding to each key parameter field, that is, the field value corresponding to each key parameter field, is obtained from each sub-file, so that each field value is determined as the parsing result of each sub-file. For example, the key parameter fields include age, transaction amount, etc., and after each subfile is parsed according to the age and transaction amount of the key parameter fields, the obtained field values corresponding to each key parameter field may be 21 years old and 200 yuan, etc.
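The age and transaction amount example can be illustrated with a short sketch. It assumes that each subfile is comma-separated text that carries its own header row, which the application does not state; the function names are hypothetical.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def field_values(subfile_text, key_fields):
    """Return, for one subfile, the value of each key parameter field per record."""
    reader = csv.DictReader(io.StringIO(subfile_text))
    return [{field: row.get(field) for field in key_fields} for row in reader]


def parse_subfiles(subfiles, key_fields):
    """Extract field values from every subfile concurrently, one task per subfile."""
    with ThreadPoolExecutor() as pool:
        # Executor.map returns results in the order of its inputs.
        return list(pool.map(lambda text: field_values(text, key_fields), subfiles))
```

Columns that are not key parameter fields, such as a name column, are simply ignored, and a key parameter field missing from a subfile yields `None` for that record.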
In the embodiment of the application, an optional way of determining the parsing result of each subfile is provided.
After the file parsing result of each file has been obtained in the above manner, it may be stored in a database, so that the data can be read directly from the database when needed, for example for data analysis or file comparison. Based on this, in one embodiment, an optional way of comparing files is provided. As shown in FIG. 5, the following steps may be included:
S501, when it is detected that parsing of the target file is complete, acquiring a field mapping relation between the target file and the file to be compared.
S502, determining a comparison result of the target file and the file to be compared according to the field mapping relation.
In the embodiment of the application, the field mapping relationship represents a correspondence relationship between a first parameter field in the target file and a second parameter field in the file to be compared. For example, the first parameter field may include a phone number, a name, a transaction amount, etc., and the corresponding second parameter field may be a cell phone number, a name, a transaction amount, etc.
Optionally, when it is detected that parsing of the target file is complete, the field mapping relation between the target file and the file to be compared is obtained. The field mapping relation, the file parsing result of the target file stored in the database, and the file parsing result of the file to be compared may then be input into a pre-trained model, which outputs the comparison result of the target file and the file to be compared.
In the embodiment of the application, the target file is compared with the file to be compared by introducing the field mapping relation, so that the efficiency of file comparison is improved.
The field mapping relation indicates how the fields of the target file correspond to the fields of the file to be compared. When the two files are compared, the field values of corresponding fields can be located according to the field mapping relation and compared, which yields the comparison result of the target file and the file to be compared. Based on this, in one embodiment, an optional way of comparing files is provided. As shown in FIG. 6, the following steps may be included:
S601, acquiring a plurality of first parameter fields and a plurality of corresponding second parameter fields according to the field mapping relation.
S602, according to each first parameter field and each corresponding second parameter field, a first field value corresponding to each first parameter field and a second field value corresponding to each second parameter field are obtained.
S603, comparing the first field values with the corresponding second field values to obtain a comparison result of the target file and the file to be compared.
Optionally, according to the field mapping relation, a plurality of first parameter fields and a plurality of corresponding second parameter fields are obtained first, then according to each first parameter field, a first field value corresponding to each first parameter field is obtained from a file analysis result of a target file in a database, and according to each second parameter field, a second field value corresponding to each second parameter field is obtained from a file analysis result of a file to be compared in the database, and finally each first field value is compared with each corresponding second field value to obtain a comparison result of the target file and the file to be compared.
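Steps S601 to S603 amount to a lookup followed by a pairwise comparison. In the sketch below the two parsing results are plain dictionaries rather than database rows, and the field mapping relation is a dictionary from first parameter field to second parameter field; these representations and the field names are assumptions for illustration.

```python
def compare_files(target_result, other_result, field_mapping):
    """Compare mapped fields and report, per first field, whether the values match."""
    comparison = {}
    # S601: each mapping entry pairs a first parameter field with a second one.
    for first_field, second_field in field_mapping.items():
        # S602: fetch the field value on each side.
        first_value = target_result.get(first_field)
        second_value = other_result.get(second_field)
        # S603: compare the two values.
        comparison[first_field] = {
            "first_value": first_value,
            "second_value": second_value,
            "match": first_value == second_value,
        }
    return comparison
```

With a mapping such as `{"phone_number": "cell_phone_number"}`, the differently named fields are compared as one logical field, and any mismatch is reported together with both values.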
In the embodiment of the application, an optional mode for determining the comparison result of the target file and the file to be compared is provided.
In addition, in one embodiment, an optional example of the file parsing method is provided. As shown in FIG. 7, the example includes:
S701, obtaining the file size of the target file according to the target file to be parsed.
S702, under the condition that the file size of the target file is detected to be larger than a preset threshold value, performing file segmentation processing on the target file to obtain a plurality of subfiles corresponding to the target file.
S703, a file parameter field configuration interface is displayed.
S704, receiving a plurality of parameter fields input by a user in a file parameter field configuration interface.
S705, each parameter field is determined as each key parameter field of each subfile.
S706, according to each key parameter field, analyzing each sub-file in parallel to obtain a field value corresponding to each key parameter field.
S707, each field value is determined as the analysis result of each subfile.
S708, according to the file sequence of each sub-file, fusion processing is carried out on the analysis results of each sub-file, and the file analysis results of the target file are obtained.
The above processes of S701 to S708 may refer to the descriptions of the above method embodiments, and the implementation principle and technical effects are similar, and are not repeated herein.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, an embodiment of the present application also provides a file parsing apparatus for implementing the above file parsing method. The solution provided by the apparatus is implemented in a manner similar to that described for the method, so for the specific limitations in the apparatus embodiments below, reference may be made to the limitations of the file parsing method above; they are not repeated here.
In one embodiment, as shown in fig. 8, there is provided a file parsing apparatus 1 including: a file acquisition module 10, a field acquisition module 20, a first determination module 30, and a second determination module 40, wherein:
the file acquisition module 10 is used for acquiring a plurality of subfiles corresponding to the target file according to the target file to be analyzed;
a field obtaining module 20, configured to obtain a plurality of key parameter fields of each subfile;
the first determining module 30 is configured to parse each sub-file in parallel according to the plurality of key parameter fields of each sub-file, so as to obtain a parsing result of each sub-file;
the second determining module 40 is configured to determine a file analysis result of the target file according to the analysis result of each sub-file.
In one embodiment, the file acquisition module 10 may be configured to:
acquiring the file size of a target file according to the target file to be analyzed; and under the condition that the file size of the target file is detected to be larger than a preset threshold value, carrying out file segmentation processing on the target file to obtain a plurality of subfiles corresponding to the target file.
In one embodiment, the field acquisition module 20 may be configured to:
displaying a file parameter field configuration interface; receiving a plurality of parameter fields input by a user in a file parameter field configuration interface; each parameter field is determined as each key parameter field.
In one embodiment, the field obtaining module 20 is further configured to:
extracting keywords of each subfile from each subfile; according to each keyword, obtaining a target parameter corresponding to each keyword from a parameter storage library; each target parameter is determined as each key parameter field.
In one embodiment, the first determining module 30 may be configured to:
according to each key parameter field, analyzing each sub-file in parallel to obtain a field value corresponding to each key parameter field; and determining each field value as the analysis result of each subfile.
In one embodiment, the second determining module 40 may be configured to:
and carrying out fusion processing on the analysis results of the subfiles according to the file sequence of the subfiles to obtain the file analysis result of the target file.
In one embodiment, the file parsing apparatus 1 is further configured to:
under the condition that the completion of analysis of the target file is detected, acquiring a field mapping relation between the target file and the file to be compared; the field mapping relation represents the corresponding relation between a first parameter field in the target file and a second parameter field in the file to be compared; and determining the comparison result of the target file and the file to be compared according to the field mapping relation.
In one embodiment, the file parsing apparatus 1 is further configured to:
acquiring a plurality of first parameter fields and a plurality of corresponding second parameter fields according to the field mapping relation; according to each first parameter field and each corresponding second parameter field, a first field value corresponding to each first parameter field and a second field value corresponding to each second parameter field are obtained; and comparing each first field value with each corresponding second field value to obtain a comparison result of the target file and the file to be compared.
Each module in the file parsing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or be independent of, a processor in the computer device, or may be stored in software form in a memory of the computer device, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 9. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing file parsing data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a file parsing method.
It will be appreciated by those skilled in the art that the structure shown in fig. 9 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the computer device to which the present application applies, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of:
obtaining, according to a target file to be parsed, a plurality of subfiles corresponding to the target file;
acquiring a plurality of key parameter fields of each subfile;
parsing the subfiles in parallel according to the plurality of key parameter fields of each subfile to obtain a parsing result of each subfile;
and determining a file parsing result of the target file according to the parsing result of each subfile.
In one embodiment, when the processor executes the logic in the computer program for obtaining, according to the target file to be parsed, a plurality of subfiles corresponding to the target file, the following steps may be implemented:
acquiring the file size of the target file according to the target file to be parsed; and, when the file size of the target file is detected to be larger than a preset threshold, splitting the target file to obtain a plurality of subfiles corresponding to the target file.
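The size check and splitting step can be illustrated with a minimal Python sketch. It is not the implementation of the present application: the function name `split_if_large`, the byte-based `chunk_bytes` limit, splitting on line boundaries, and returning the whole file as a single subfile when the threshold is not exceeded are all assumptions made for illustration.

```python
import os
import tempfile


def split_if_large(path, threshold_bytes, chunk_bytes):
    """Return the file content as a list of subfile byte strings.

    The file is split only when its size exceeds threshold_bytes
    (treating a small file as one subfile is an assumption).
    """
    size = os.path.getsize(path)  # file size of the target file
    with open(path, "rb") as fh:
        if size <= threshold_bytes:
            return [fh.read()]
        chunks, current, current_size = [], [], 0
        for line in fh:  # split on line boundaries so no record is cut in half
            if current and current_size + len(line) > chunk_bytes:
                chunks.append(b"".join(current))
                current, current_size = [], 0
            current.append(line)
            current_size += len(line)
        if current:
            chunks.append(b"".join(current))
        return chunks


# Usage: a 40-byte file, a hypothetical 20-byte threshold, 16-byte subfiles.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "target.txt")
    with open(demo_path, "wb") as out:
        out.write(b"k=v\n" * 10)
    subfiles = split_if_large(demo_path, threshold_bytes=20, chunk_bytes=16)
    unsplit = split_if_large(demo_path, threshold_bytes=100, chunk_bytes=16)
```

Concatenating the subfiles in order reproduces the original file, which is what allows the parsing results to be fused in file order later.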
In one embodiment, when the processor executes logic in the computer program to obtain the plurality of key parameter fields for each subfile, the following steps may be implemented:
displaying a file parameter field configuration interface; receiving a plurality of parameter fields input by a user in the file parameter field configuration interface; and determining the parameter fields as the key parameter fields.
In one embodiment, when the processor executes logic in the computer program to obtain the plurality of key parameter fields for each subfile, the following steps may be implemented:
extracting keywords of each subfile from each subfile; obtaining, according to each keyword, a target parameter corresponding to that keyword from a parameter repository; and determining the target parameters as the key parameter fields.
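As an illustration of the keyword lookup, the following Python sketch assumes that the parameter repository is an in-memory dictionary and that keywords are simply the distinct words of a subfile. The repository contents, the field names, and the word-based extraction rule are hypothetical; the present application does not specify them.

```python
import re

# Hypothetical parameter repository: keyword -> target parameter.
PARAMETER_REPOSITORY = {
    "account": "account_no",
    "amount": "txn_amount",
    "currency": "currency_code",
}


def extract_keywords(subfile_text):
    """Collect the distinct lower-cased words of a subfile, in order of first appearance."""
    seen = []
    for word in re.findall(r"[A-Za-z_]+", subfile_text.lower()):
        if word not in seen:
            seen.append(word)
    return seen


def key_parameter_fields(subfile_text, repository=PARAMETER_REPOSITORY):
    """Map each keyword known to the repository to its target parameter."""
    return [repository[k] for k in extract_keywords(subfile_text) if k in repository]


fields = key_parameter_fields("Account: 6222 Amount: 100.00 remark: none")
```

Keywords that have no entry in the repository (here `remark` and `none`) are ignored in this sketch; a real system might instead report them.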
In one embodiment, when the processor executes the logic in the computer program for parsing the subfiles in parallel according to the plurality of key parameter fields of each subfile to obtain the parsing result of each subfile, the following steps may be implemented:
parsing the subfiles in parallel according to the key parameter fields to obtain a field value corresponding to each key parameter field; and determining the field values as the parsing results of the subfiles.
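One possible way to parse subfiles in parallel is sketched below in Python. The `field=value` line format, the function names, and the use of a thread pool are assumptions for illustration; the present application does not state the record format or whether threads or processes are used.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def parse_subfile(subfile_text, key_fields):
    """Return {field: [values...]} for the key parameter fields found in one subfile."""
    result = {field: [] for field in key_fields}
    for line in subfile_text.splitlines():
        # Assumed record format: one "field=value" pair per line.
        match = re.match(r"\s*([^=\s]+)\s*=\s*(.*?)\s*$", line)
        if match and match.group(1) in result:
            result[match.group(1)].append(match.group(2))
    return result


def parse_in_parallel(subfiles, key_fields, max_workers=4):
    """Parse every subfile concurrently; results come back in subfile order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of its inputs.
        return list(pool.map(lambda text: parse_subfile(text, key_fields), subfiles))


subfiles = ["amount=10\nnote=a\n", "amount=20\ncurrency=CNY\n"]
results = parse_in_parallel(subfiles, ["amount", "currency"])
```

In CPython a thread pool mainly helps when parsing is I/O-bound; for CPU-bound parsing a process pool would be the usual alternative.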
In one embodiment, when the processor executes the logic in the computer program for determining the file parsing result of the target file according to the parsing result of each subfile, the following steps may be implemented:
fusing the parsing results of the subfiles according to the file order of the subfiles to obtain the file parsing result of the target file.
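The fusion step can be sketched in Python as follows, under the assumption that each parsing result is a dictionary of field values tagged with the index of its subfile. The name `merge_results` and this result shape are hypothetical.

```python
def merge_results(indexed_results):
    """Fuse per-subfile results into one result, following subfile order."""
    merged = {}
    for _, result in sorted(indexed_results, key=lambda pair: pair[0]):
        for field, values in result.items():
            merged.setdefault(field, []).extend(values)
    return merged


# Parallel workers may finish out of order; the subfile index restores file order.
arrived = [
    (1, {"amount": ["20"]}),
    (0, {"amount": ["10"], "currency": ["CNY"]}),
]
merged = merge_results(arrived)
```

Sorting by subfile index before concatenating keeps the field values in the same order as they appear in the original target file.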
In one embodiment, the processor when executing the computer program further performs the steps of:
when it is detected that parsing of the target file is complete, acquiring a field mapping relation between the target file and a file to be compared, where the field mapping relation represents the correspondence between a first parameter field in the target file and a second parameter field in the file to be compared; and determining a comparison result of the target file and the file to be compared according to the field mapping relation.
In one embodiment, when the processor executes the logic in the computer program for determining the comparison result of the target file and the file to be compared according to the field mapping relation, the following steps may be implemented:
acquiring a plurality of first parameter fields and a plurality of corresponding second parameter fields according to the field mapping relation; obtaining, according to each first parameter field and each corresponding second parameter field, a first field value corresponding to each first parameter field and a second field value corresponding to each second parameter field; and comparing each first field value with each corresponding second field value to obtain the comparison result of the target file and the file to be compared.
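The comparison based on a field mapping relation can be illustrated with the Python sketch below. The field names, the dictionary form of the mapping, and the shape of the report are assumptions; the present application only requires that mapped field values be compared pairwise.

```python
def compare_files(target_values, other_values, field_mapping):
    """Compare mapped field values.

    Returns {first_field: (first_value, second_value, equal)}.
    """
    report = {}
    for first_field, second_field in field_mapping.items():
        first_value = target_values.get(first_field)
        second_value = other_values.get(second_field)
        report[first_field] = (first_value, second_value, first_value == second_value)
    return report


# Hypothetical mapping: first parameter field -> second parameter field.
field_mapping = {"txn_amount": "amt", "currency_code": "ccy"}
target = {"txn_amount": "100.00", "currency_code": "CNY"}
other = {"amt": "100.00", "ccy": "USD"}

report = compare_files(target, other, field_mapping)
mismatches = [field for field, (_, _, same) in report.items() if not same]
```

Because the mapping links differently named fields, the two files do not need to share a schema; only the mapped pairs are compared.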
For the implementation principles and processes of the computer device embodiments above, refer to the description of the file parsing method embodiments; they are not repeated here.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of:
obtaining, according to a target file to be parsed, a plurality of subfiles corresponding to the target file;
acquiring a plurality of key parameter fields of each subfile;
parsing the subfiles in parallel according to the plurality of key parameter fields of each subfile to obtain a parsing result of each subfile;
and determining a file parsing result of the target file according to the parsing result of each subfile.
In one embodiment, when the logic in the computer program for obtaining, according to the target file to be parsed, a plurality of subfiles corresponding to the target file is executed by the processor, the following steps may be implemented:
acquiring the file size of the target file according to the target file to be parsed; and, when the file size of the target file is detected to be larger than a preset threshold, splitting the target file to obtain a plurality of subfiles corresponding to the target file.
In one embodiment, when the logic in the computer program for acquiring the plurality of key parameter fields of each subfile is executed by the processor, the following steps may be implemented:
displaying a file parameter field configuration interface; receiving a plurality of parameter fields input by a user in the file parameter field configuration interface; and determining the parameter fields as the key parameter fields.
In one embodiment, when the logic in the computer program for acquiring the plurality of key parameter fields of each subfile is executed by the processor, the following steps may be implemented:
extracting keywords of each subfile from each subfile; obtaining, according to each keyword, a target parameter corresponding to that keyword from a parameter repository; and determining the target parameters as the key parameter fields.
In one embodiment, when the logic in the computer program for parsing the subfiles in parallel according to the plurality of key parameter fields of each subfile to obtain the parsing result of each subfile is executed by the processor, the following steps may be implemented:
parsing the subfiles in parallel according to the key parameter fields to obtain a field value corresponding to each key parameter field; and determining the field values as the parsing results of the subfiles.
In one embodiment, when the logic in the computer program for determining the file parsing result of the target file according to the parsing result of each subfile is executed by the processor, the following steps may be implemented:
fusing the parsing results of the subfiles according to the file order of the subfiles to obtain the file parsing result of the target file.
In one embodiment, the computer program when executed by the processor further performs the steps of:
when it is detected that parsing of the target file is complete, acquiring a field mapping relation between the target file and a file to be compared, where the field mapping relation represents the correspondence between a first parameter field in the target file and a second parameter field in the file to be compared; and determining a comparison result of the target file and the file to be compared according to the field mapping relation.
In one embodiment, when the logic in the computer program for determining the comparison result of the target file and the file to be compared according to the field mapping relation is executed by the processor, the following steps may be implemented:
acquiring a plurality of first parameter fields and a plurality of corresponding second parameter fields according to the field mapping relation; obtaining, according to each first parameter field and each corresponding second parameter field, a first field value corresponding to each first parameter field and a second field value corresponding to each second parameter field; and comparing each first field value with each corresponding second field value to obtain the comparison result of the target file and the file to be compared.
For the implementation principles and processes of the computer-readable storage medium embodiments above, refer to the description of the file parsing method embodiments; they are not repeated here.
In one embodiment, a computer program product is provided comprising a computer program which, when executed by a processor, performs the steps of:
obtaining, according to a target file to be parsed, a plurality of subfiles corresponding to the target file;
acquiring a plurality of key parameter fields of each subfile;
parsing the subfiles in parallel according to the plurality of key parameter fields of each subfile to obtain a parsing result of each subfile;
and determining a file parsing result of the target file according to the parsing result of each subfile.
In one embodiment, when the logic in the computer program for obtaining, according to the target file to be parsed, a plurality of subfiles corresponding to the target file is executed by the processor, the following steps may be implemented:
acquiring the file size of the target file according to the target file to be parsed; and, when the file size of the target file is detected to be larger than a preset threshold, splitting the target file to obtain a plurality of subfiles corresponding to the target file.
In one embodiment, when the logic in the computer program for acquiring the plurality of key parameter fields of each subfile is executed by the processor, the following steps may be implemented:
displaying a file parameter field configuration interface; receiving a plurality of parameter fields input by a user in the file parameter field configuration interface; and determining the parameter fields as the key parameter fields.
In one embodiment, when the logic in the computer program for acquiring the plurality of key parameter fields of each subfile is executed by the processor, the following steps may be implemented:
extracting keywords of each subfile from each subfile; obtaining, according to each keyword, a target parameter corresponding to that keyword from a parameter repository; and determining the target parameters as the key parameter fields.
In one embodiment, when the logic in the computer program for parsing the subfiles in parallel according to the plurality of key parameter fields of each subfile to obtain the parsing result of each subfile is executed by the processor, the following steps may be implemented:
parsing the subfiles in parallel according to the key parameter fields to obtain a field value corresponding to each key parameter field; and determining the field values as the parsing results of the subfiles.
In one embodiment, when the logic in the computer program for determining the file parsing result of the target file according to the parsing result of each subfile is executed by the processor, the following steps may be implemented:
fusing the parsing results of the subfiles according to the file order of the subfiles to obtain the file parsing result of the target file.
In one embodiment, the computer program when executed by the processor further performs the steps of:
when it is detected that parsing of the target file is complete, acquiring a field mapping relation between the target file and a file to be compared, where the field mapping relation represents the correspondence between a first parameter field in the target file and a second parameter field in the file to be compared; and determining a comparison result of the target file and the file to be compared according to the field mapping relation.
In one embodiment, when the logic in the computer program for determining the comparison result of the target file and the file to be compared according to the field mapping relation is executed by the processor, the following steps may be implemented:
acquiring a plurality of first parameter fields and a plurality of corresponding second parameter fields according to the field mapping relation; obtaining, according to each first parameter field and each corresponding second parameter field, a first field value corresponding to each first parameter field and a second field value corresponding to each second parameter field; and comparing each first field value with each corresponding second field value to obtain the comparison result of the target file and the file to be compared.
For the implementation principles and processes of the computer program product embodiments above, refer to the description of the file parsing method embodiments; they are not repeated here.
The data referred to in this application (including, but not limited to, data used for parsing, stored data, and displayed data) are information and data that have been authorized, or fully authorized, by the parties concerned.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer-readable storage medium; when executed, the computer program may perform the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational databases and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, or data processing logic units based on quantum computing.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The above embodiments represent only several implementations of the present application. They are described specifically and in detail, but are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art may make various modifications and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (12)

1. A method for parsing a file, the method comprising:
obtaining, according to a target file to be parsed, a plurality of subfiles corresponding to the target file;
acquiring a plurality of key parameter fields of each subfile;
parsing the subfiles in parallel according to the plurality of key parameter fields of each subfile to obtain a parsing result of each subfile;
and determining a file parsing result of the target file according to the parsing result of each subfile.
2. The method of claim 1, wherein the obtaining, according to the target file to be parsed, a plurality of subfiles corresponding to the target file includes:
acquiring the file size of the target file according to the target file to be parsed;
and, when the file size of the target file is detected to be larger than a preset threshold, performing file splitting on the target file to obtain the plurality of subfiles corresponding to the target file.
3. The method according to claim 1 or 2, wherein said obtaining a plurality of key parameter fields for each of said subfiles comprises:
displaying a file parameter field configuration interface;
receiving a plurality of parameter fields input by a user in the file parameter field configuration interface;
and determining each parameter field as each key parameter field.
4. The method according to claim 1 or 2, wherein said obtaining a plurality of key parameter fields for each of said subfiles comprises:
extracting keywords of each subfile from each subfile;
according to the keywords, obtaining target parameters corresponding to the keywords from a parameter storage library;
and determining each target parameter as each key parameter field.
5. The method according to claim 1 or 2, wherein the parsing each subfile in parallel according to the plurality of key parameter fields of each subfile to obtain the parsing result of each subfile includes:
parsing the subfiles in parallel according to the key parameter fields to obtain field values corresponding to the key parameter fields;
and determining the field values as the parsing results of the subfiles.
6. The method according to claim 1 or 2, wherein determining the file parsing result of the target file according to the parsing result of each sub-file comprises:
fusing the parsing results of the subfiles according to the file order of the subfiles to obtain the file parsing result of the target file.
7. The method according to claim 1 or 2, characterized in that the method further comprises:
when it is detected that parsing of the target file is complete, acquiring a field mapping relation between the target file and a file to be compared; the field mapping relation represents a correspondence between a first parameter field in the target file and a second parameter field in the file to be compared;
and determining a comparison result of the target file and the file to be compared according to the field mapping relation.
8. The method of claim 7, wherein determining a comparison result of the target file and the file to be compared according to the field mapping relationship comprises:
acquiring a plurality of first parameter fields and a plurality of corresponding second parameter fields according to the field mapping relation;
obtaining, according to each first parameter field and each corresponding second parameter field, a first field value corresponding to each first parameter field and a second field value corresponding to each second parameter field;
and comparing each first field value with each corresponding second field value to obtain a comparison result of the target file and the file to be compared.
9. A file parsing apparatus, the apparatus comprising:
a file acquisition module, configured to obtain, according to a target file to be parsed, a plurality of subfiles corresponding to the target file;
a field acquisition module, configured to acquire a plurality of key parameter fields of each subfile;
a first determining module, configured to parse the subfiles in parallel according to the plurality of key parameter fields of each subfile to obtain a parsing result of each subfile;
and a second determining module, configured to determine a file parsing result of the target file according to the parsing result of each subfile.
10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 8 when the computer program is executed.
11. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 8.
12. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 8.
CN202311120050.9A 2023-08-31 2023-08-31 File parsing method, apparatus, device, storage medium and program product Pending CN117389960A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311120050.9A CN117389960A (en) 2023-08-31 2023-08-31 File parsing method, apparatus, device, storage medium and program product

Publications (1)

Publication Number Publication Date
CN117389960A true CN117389960A (en) 2024-01-12

Family

ID=89469058

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311120050.9A Pending CN117389960A (en) 2023-08-31 2023-08-31 File parsing method, apparatus, device, storage medium and program product

Country Status (1)

Country Link
CN (1) CN117389960A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination