CN112818937B - Excel file identification method and device, electronic equipment and readable storage medium - Google Patents

Excel file identification method and device, electronic equipment and readable storage medium

Info

Publication number
CN112818937B
Authority
CN
China
Prior art keywords
column
cell data
excel file
row
identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110231358.5A
Other languages
Chinese (zh)
Other versions
CN112818937A (en)
Inventor
刘春生
吴森阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Glodon Co Ltd
Original Assignee
Glodon Co Ltd
Filing date
Publication date
Application filed by Glodon Co Ltd filed Critical Glodon Co Ltd
Priority to CN202110231358.5A priority Critical patent/CN112818937B/en
Publication of CN112818937A publication Critical patent/CN112818937A/en
Application granted granted Critical
Publication of CN112818937B publication Critical patent/CN112818937B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention relates to the technical field of list identification, and discloses an Excel file identification method, an Excel file identification device, electronic equipment and a readable storage medium. The method comprises the following steps: obtaining a target Excel file; analyzing the target Excel file to obtain cell data of the target Excel file; identifying the cell data, and determining a column name and/or a row name corresponding to the target Excel file; and determining column text data and/or row text data corresponding to the column name and/or the row name based on the column name and/or the row name. By implementing the invention, the target Excel file is identified automatically, and file import errors caused by a mismatch between the format of the target Excel file and the template format are avoided, so that an Excel file in any format can be imported successfully.

Description

Excel file identification method and device, electronic equipment and readable storage medium
Technical Field
The invention relates to the technical field of list identification, in particular to an identification method and device of Excel files, electronic equipment and a readable storage medium.
Background
Engineering cost work typically involves a tenderer and a bidder, and the bidder usually needs to import the Excel-format list provided by the tenderer into software for cost calculation. At present, when Excel file data is imported, the software can only identify an Excel file in a specific template format; if the format of the Excel file provided by the tenderer is exactly consistent with a template format that the software can identify, the Excel file can be imported directly. However, the software cannot automatically identify the rows and columns of the Excel file. When the format of the Excel file provided by the tenderer is inconsistent with the template format, the bidder has to manually adjust the tenderer's Excel file to match the template format before the Excel file data can be imported; otherwise, an import error occurs.
Disclosure of Invention
In view of this, the embodiments of the present invention provide a method, an apparatus, an electronic device, and a readable storage medium for identifying an Excel file, so as to solve the problem that a file import error is caused because rows and columns of the Excel file cannot be automatically identified.
According to a first aspect, an embodiment of the present invention provides a method for identifying an Excel file, including the following steps: obtaining a target Excel file; analyzing the target Excel file to obtain cell data of the target Excel file; identifying the cell data, and determining a column name and/or a row name corresponding to the target Excel file; and determining column text data and/or row text data corresponding to the column name and/or the row name based on the column name and/or the row name.
According to the identification method of the Excel file, through analyzing the obtained target Excel file, each cell data in the target Excel file is obtained, each cell data is identified, and the column name and/or the row name corresponding to the target Excel file and the column text data and/or the row text data corresponding to the column name and/or the row name are determined. According to the method, the target Excel file can be identified without being imported according to a certain template format, column text data and row text data contained in the target Excel file can be determined by identifying column names and/or row names for the target Excel file in any format, automatic identification of the target Excel file is achieved, and the problem that file import errors are caused by inconsistent formats of the target Excel file and the template format is avoided, so that successful import of the Excel file in any format is achieved.
With reference to the first aspect, in a first implementation manner of the first aspect, the identifying the cell data, determining a column name and/or a row name corresponding to the target Excel file includes: matching the cell data based on a preset identifier, and judging whether the cell data meets a matching condition or not; and when the cell data meet the matching condition, judging that the target Excel file is successfully identified, and obtaining a column name and/or a row name corresponding to the preset identifier.
According to the identification method of the Excel file, provided by the embodiment of the invention, each cell data is matched through the preset identifier, whether the cell data meets the matching condition is judged, and when the cell meets the matching condition, the target Excel file is judged to be successfully identified, so that the column name and/or the row name corresponding to the preset identifier is obtained. Wherein the preset identifier is an identifier corresponding to a column name or a row name. Therefore, the automatic identification of the target Excel file in any format is realized, the problem of file import errors caused by inconsistent formats of the target Excel file and the template format is avoided, and the successful import of the target Excel file in any format is ensured.
With reference to the first implementation manner of the first aspect, in a second implementation manner of the first aspect, the cell data includes column cell data and row cell data, and when the preset identifier is a column identifier, the matching the cell data based on the preset identifier, and judging whether the cell data meets a matching condition includes: acquiring an identification keyword and an exclusion keyword corresponding to the column identifier; judging whether the column cell data matches the identification keyword; when the column cell data matches the identification keyword, judging whether the column cell data matches the exclusion keyword; and when the column cell data does not match the exclusion keyword, judging that the column cell data meets the matching condition.
With reference to the second implementation manner of the first aspect, in a third implementation manner of the first aspect, when the preset identifier is a row identifier, the matching the cell data based on the preset identifier, and judging whether the cell data meets a matching condition includes: determining current row cell data corresponding to the current row identifier based on the column cell data satisfying the matching condition; acquiring a preset condition corresponding to the current row identifier; judging whether the current row cell data meets the preset condition; and when the current row cell data meets the preset condition, judging that the current row cell data meets the matching condition.
With reference to the third implementation manner of the first aspect, in a fourth implementation manner of the first aspect, when the preset identifier is a row identifier, the matching the cell data based on the preset identifier, and judging whether the cell data meets a matching condition further includes: when the current row cell data does not meet the preset condition, judging that the matching of the current row cell data fails, and jumping to the next row to continue identification.
According to the identification method of the Excel file, provided by the embodiment of the invention, each piece of cell data is matched through the column identifier, the column cell data meeting the matching condition is determined, then the row cell data of each row can be determined based on the column cell data meeting the matching condition, the row cell data of each row are matched in sequence, whether the row cell meets the preset condition is judged in sequence, and the row cell data meets the matching condition when the row cell data meets the preset condition is judged. Therefore, the automatic identification of the target Excel file row and column in any format is realized, and the importing of the target Excel file in any format is satisfied.
With reference to the first aspect, in a fifth implementation manner of the first aspect, the method further includes: and responding to a selection instruction of the target Excel file tab, and determining that the target Excel file corresponds to the tab to be imported.
According to the identification method of the Excel file, provided by the embodiment of the invention, the to-be-imported tab and the corresponding cell data thereof are determined based on the selection instruction in response to the selection instruction of the target Excel file tab, so that the defect that the target Excel file is difficult to import partial data is overcome, and the flexible importing of the target Excel file is realized.
With reference to the first implementation manner of the first aspect, in a sixth implementation manner of the first aspect, the method further includes: displaying the column names and/or the row names corresponding to the identified target Excel file, and the column text data and/or row text data corresponding to the column names and/or the row names; responding to an adjustment instruction for the column names and/or the row names; and adjusting the column names and/or the row names based on the adjustment instruction.
According to the method for identifying the Excel file, provided by the embodiment of the invention, the column names and/or the row names corresponding to the identified target Excel file and the column text data and/or the row text data corresponding to the column names and/or the row names are displayed, so that a user can determine whether the identification result of the target Excel file is correct or not, and the blind introduction of the target Excel file to damage the existing engineering file is avoided. When the identification result is unreasonable, the user can manually adjust, and the electronic device can respond to the adjustment instruction of the column names and/or the row names and adjust the column names and/or the row names based on the adjustment instruction. Therefore, the secondary adjustment of the identification result can be realized, the original target Excel file is not required to be modified, and the importing efficiency of the Excel file is improved.
According to a second aspect, an embodiment of the present invention provides an apparatus for identifying an Excel file, including: the acquisition module is used for acquiring the target Excel file; the analysis module is used for analyzing the target Excel file to obtain cell data of the target Excel file; the identification module is used for identifying the cell data and determining a column name and/or a row name corresponding to the target Excel file; and the determining module is used for determining column text data and/or row text data corresponding to the column names and/or the row names based on the column names and/or the row names.
According to the identification device for the Excel file, the obtained target Excel file is analyzed to obtain the cell data in the target Excel file, the cell data are identified, and the column names and/or the row names corresponding to the target Excel file and the column text data and/or the row text data corresponding to the column names and/or the row names are determined. The device can identify the target Excel file without requiring it to be imported according to a particular template format: for a target Excel file in any format, the column text data and row text data contained in the file can be determined by identifying its column names and/or row names. This achieves automatic identification of the target Excel file and avoids file import errors caused by a mismatch between the format of the target Excel file and the template format, so that an Excel file in any format can be imported successfully.
According to a third aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, where the memory stores computer instructions and the processor executes the computer instructions, thereby executing the method for identifying the Excel file according to the first aspect or any implementation manner of the first aspect.
According to a fourth aspect, an embodiment of the present invention provides a computer readable storage medium, where computer instructions are stored, where the computer instructions are configured to cause a computer to execute the method for identifying an Excel file according to the first aspect or any implementation manner of the first aspect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method of identifying Excel files in accordance with embodiments of the present invention;
FIG. 2 is another flow chart of a method of identifying Excel files in accordance with embodiments of the present invention;
FIG. 3 is another flow chart of a method of identifying Excel files in accordance with embodiments of the present invention;
FIG. 4 is a block diagram of the structure of an identification device of an Excel file according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
When Excel file data is imported, the software can only identify an Excel file in a specific template format; if the format of the Excel file provided by the tenderer is exactly consistent with a template format that the software can identify, the Excel file can be imported directly. However, the software cannot automatically identify the rows and columns of the Excel file. When the format of the Excel file provided by the tenderer is inconsistent with the template format, the bidder has to manually adjust the tenderer's Excel file to match the template format before the Excel file data can be imported; otherwise, an import error occurs.
Based on this, the technical solution of the invention automatically identifies the column names and/or row names of an Excel file by analyzing the data of each of its cells, without requiring the Excel file to follow a set format, so that the row and column data of an Excel file in any format can be automatically identified and successfully imported.
According to an embodiment of the present invention, there is provided an embodiment of an identification method of an Excel file, it should be noted that the steps shown in the flowchart of the drawings may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is shown in the flowchart, in some cases, the steps shown or described may be performed in an order different from that herein.
In this embodiment, an Excel file identification method is provided, which may be used in electronic devices, such as mobile phones, computers, tablet computers, etc., fig. 1 is a flowchart of an Excel file identification method according to an embodiment of the present invention, and as shown in fig. 1, the flowchart includes the following steps:
S11, acquiring a target Excel file.
The target Excel file is building bill of materials data imported into the electronic device for cost calculation. It is typically an externally provided Excel file, for example an Excel list provided by the tenderer. The user can import the target Excel file from an external source.
S12, analyzing the target Excel file to obtain cell data of the target Excel file.
The cell data is the specific content in the cell. The target Excel file contains text data of a plurality of rows and a plurality of columns, and the electronic device can analyze the target Excel file to obtain a plurality of cell data contained in the target Excel file. For example, if the target Excel file includes 3 rows and 3 columns, the electronic device may analyze the content in the 9 cells of the 3 rows and 3 columns respectively to determine the cell data corresponding to each cell, for example, specific cell contents such as "name", "unit", "cm", and the like.
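As an illustration of step S12, the following is a minimal sketch that assumes the worksheet has already been read into a list of rows (a real implementation would obtain this from a spreadsheet library). The helper names `column_letter` and `parse_cells` are hypothetical and do not come from the patent. The sketch maps each cell of a 3-row, 3-column sheet to its A1-style address, giving the 9 pieces of cell data described above.

```python
def column_letter(index: int) -> str:
    """Convert a 0-based column index to Excel letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def parse_cells(grid):
    """Return a dict mapping A1-style addresses to the cell content as text."""
    cells = {}
    for row_number, row in enumerate(grid, start=1):
        for col_index, value in enumerate(row):
            # Empty cells (None) are kept as empty strings.
            text = "" if value is None else str(value).strip()
            cells[f"{column_letter(col_index)}{row_number}"] = text
    return cells


# Hypothetical sheet content: 3 rows x 3 columns.
grid = [
    ["name", "unit", "quantity"],
    ["earthwork", "m3", 12],
    ["concrete", "m3", None],
]
cells = parse_cells(grid)
```

Keeping the address as the key makes it easy for later steps to refer to individual cells such as D3 or E3.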
S13, identifying cell data, and determining a column name and/or a row name corresponding to the target Excel file.
The electronic device traverses the obtained cell data in sequence, from top to bottom or from left to right, to determine the header information of the target Excel file, namely its column names and/or row names. A one-dimensional table has only column names or only row names; a two-dimensional table has both. For a two-dimensional table, the cell data can be traversed from left to right to determine the column names in the target Excel file, and from top to bottom to determine the row names. Typically, the target Excel file has a plurality of column names and/or row names, and a corresponding column identification algorithm or row identification algorithm can be defined for each column name and/or row name. At their core, both the column identification algorithm and the row identification algorithm identify by keyword matching with regular expressions.
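The keyword matching with regular expressions mentioned above might look like the following minimal sketch. The pattern table `COLUMN_PATTERNS` and the function `identify_header` are hypothetical names, and case-insensitive matching is an assumption rather than something the patent specifies.

```python
import re

# One pattern per column name; both the names and the patterns are illustrative.
COLUMN_PATTERNS = {
    "name": re.compile(r"name", re.IGNORECASE),
    "unit": re.compile(r"unit", re.IGNORECASE),
}


def identify_header(text):
    """Return the first column name whose pattern occurs in the cell text."""
    for column_name, pattern in COLUMN_PATTERNS.items():
        if pattern.search(text):
            return column_name
    return None


name_hit = identify_header("Item Name")
unit_hit = identify_header("unit of measure")
no_hit = identify_header("remarks")
```

A cell whose text matches no pattern yields `None`, which corresponds to a cell that is not header information.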
S14, determining column text data and/or row text data corresponding to the column names and/or row names based on the column names and/or row names.
Each column name in the target Excel file has its corresponding column text data, and each row name has its corresponding row text data. After the column names of the target Excel file are identified, the column text data corresponding to each column name is determined according to the identified column names; similarly, after the row names of the target Excel file are identified, the row text data corresponding to each row name is determined according to the identified row names.
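Step S14 can be sketched as follows, assuming the header row and the column index of each identified column name are already known and the sheet is rectangular. The function name `column_text_data` and the sample data are hypothetical.

```python
def column_text_data(grid, header_row, identified):
    """Collect the text below the header for each identified column.

    grid       -- list of rows, each a list of cell values
    header_row -- 0-based index of the row holding the column names
    identified -- dict mapping column name -> 0-based column index
    """
    result = {}
    for name, col in identified.items():
        result[name] = [str(row[col]) for row in grid[header_row + 1:]]
    return result


grid = [
    ["No.", "item name", "unit"],
    [1, "earthwork", "m3"],
    [2, "concrete", "m3"],
]
data = column_text_data(grid, header_row=0, identified={"name": 1, "unit": 2})
```

Columns that were not identified (here the "No." column) are simply left out of the result.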
According to the identification method of the Excel file, through analyzing the obtained target Excel file, each cell data in the target Excel file is obtained, each cell data is identified, and the column name and/or the row name corresponding to the target Excel file, and the column text data and/or the row text data corresponding to the column name and/or the row name are determined. According to the method, the target Excel file can be identified without being imported according to a certain template format, column text data and row text data contained in the target Excel file can be determined by identifying column names and/or row names for the target Excel file in any format, automatic identification of the target Excel file is achieved, and the problem that file import errors are caused by inconsistent formats of the target Excel file and the template format is avoided, so that successful import of the Excel file in any format is achieved.
In this embodiment, an Excel file identification method is provided, which may be used in electronic devices, such as mobile phones, computers, tablet computers, etc., fig. 2 is a flowchart of an Excel file identification method according to an embodiment of the present invention, and as shown in fig. 2, the flowchart includes the following steps:
S21, acquiring a target Excel file. The detailed description is referred to the related description of the step S11 corresponding to the above embodiment, and will not be repeated here.
S22, analyzing the target Excel file to obtain cell data of the target Excel file. The detailed description is referred to the related description of the step S12 corresponding to the above embodiment, and will not be repeated here.
S23, identifying cell data, and determining a column name and/or a row name corresponding to the target Excel file.
Specifically, the step S23 may include the steps of:
S231, matching the cell data based on the preset identifier, and judging whether the cell data meets the matching condition.
The cell data includes column cell data and row cell data, and the preset identifier is an algorithm identifier set to identify a row name or a column name, and may include a column identifier and a row identifier. Specifically, when identifying a column name, the column name may be used as a column identifier, i.e., an algorithm identifier of a column identification algorithm; when identifying a row name, the row name may be used as a row identifier, i.e. an algorithm identifier of a row identification algorithm. The cell data is matched based on the column identifier or the row identifier to determine the cell data corresponding to the column identifier or the row identifier, i.e., to determine whether the cell data satisfies the matching condition.
Specifically, when the preset identifier is a column identifier, the step S231 may include the steps of:
(1) The identification keyword and the exclusion keyword corresponding to the column identifier are acquired.
An identification keyword is a keyword that the cell data must contain, and an exclusion keyword is a keyword that the cell data must not contain. The electronic device may add the identification keywords and the exclusion keywords to a column identification algorithm to identify the column name. There may be one or more identification keywords, which is not particularly limited herein. It should be noted that the electronic device may automatically add the column name itself to the identification keyword list.
(2) Judging whether the column cell data matches the identification keyword.
Traversing the identification keyword list, comparing each column of cell data from top to bottom, judging whether the identification keywords exist in the column of cell data, if yes, judging that the column of cell data is matched with the identification keywords, and executing the step (3); if the identification keywords do not exist in the column cell data, the matching fails, and the column cell data is judged to be unidentified.
For example, define a column name "name" and register a column identification algorithm: contains the text "name". Define a column name "unit" and register a column identification algorithm: contains the text "unit" or "unit of measure". If the content of cell D3 of the Excel file is "item name", column D is identified as the "name" column. If the content of cell E3 is "unit", column E is identified as the "unit" column.
(3) Judging whether the column cell data matches the exclusion keyword.
When the column cell data matches an identification keyword, it is further judged whether that column cell data contains an exclusion keyword, that is, whether it matches the exclusion keyword. When it does not contain any exclusion keyword, the column cell data is judged not to match the exclusion keyword, and step (4) is executed; otherwise, the column cell data is judged to be unidentified.
(4) The column cell data is determined to satisfy the matching condition.
When the column cell data is not matched with the exclusion keyword, the column cell data does not contain the exclusion keyword, and it can be determined that the column cell data meets the matching condition, that is, the column identification of the target Excel file is successful.
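Steps (1) to (4) can be sketched as a small rule object. This is only an illustration: the class name `ColumnRule` is hypothetical, plain substring matching stands in for the matching logic, and the exclusion keyword "unit price" is an invented example (the source does not list any exclusion keywords).

```python
from dataclasses import dataclass, field


@dataclass
class ColumnRule:
    column_name: str
    include: list                                  # identification keywords
    exclude: list = field(default_factory=list)    # exclusion keywords

    def matches(self, cell_text: str) -> bool:
        # Step (2): the cell must contain at least one identification keyword.
        if not any(keyword in cell_text for keyword in self.include):
            return False
        # Steps (3)-(4): the cell must not contain any exclusion keyword.
        return not any(keyword in cell_text for keyword in self.exclude)


name_rule = ColumnRule("name", include=["name"])
# "unit price" as an exclusion keyword is an assumption made for this example.
unit_rule = ColumnRule("unit", include=["unit", "unit of measure"],
                       exclude=["unit price"])

d3_is_name = name_rule.matches("item name")
e3_is_unit = unit_rule.matches("unit")
price_is_unit = unit_rule.matches("unit price")
```

The exclusion list is what keeps a header such as "unit price" from being mistaken for the "unit" column even though it contains the identification keyword.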
Specifically, when the preset identifier is a row identifier, the step S231 may further include the steps of:
(5) Based on the column cell data satisfying the matching condition, current row cell data corresponding to the current row identifier is determined.
The row cell data is composed of the cell data of each column. After the column cell data meeting the matching condition is determined, all row cell data of the target Excel file can be determined. The row identifier serves as the algorithm identifier of a row identification algorithm, and different rows correspond to different row identifiers; based on the current row identifier, the current row cell data corresponding to it is determined from the row cell data of the target Excel file.
(6) Acquiring a preset condition corresponding to the current row identifier.
The preset condition is a row identification rule defined according to specific cost business requirements. The row identification rule is not particularly limited herein and can be determined by a person skilled in the art according to actual business requirements. For example, for a list row, the list business requires a name, a code and a unit, so the row identification rule (preset condition) may be: the contents of the name column, code column and unit column of the row cell data cannot be empty.
(7) Judging whether the current row cell data meets the preset condition.
The current row cell data is compared with the preset condition to determine whether it meets the preset condition. If the current row cell data meets the preset condition, step (8) is executed; otherwise, step (9) is executed.
(8) Judging that the current row cell data meets the matching condition.
If the current row cell data meets the preset condition, it meets the business requirement, and the current row cell data can be judged to meet the matching condition.
(9) Judging that the matching of the current row cell data fails, and jumping to the next row to continue identification.
If the current row cell data does not meet the preset condition, it does not meet the business requirement, and the matching of the current row cell data fails. Here, the electronic device does not stop row identification, but jumps to the row cell data of the next row to continue identification.
For example, define a row name "list" and register a row identification algorithm: the "name" column of the row is not empty and the "unit" column is not empty. For the first row of the Excel file, cells D1 and E1 are both empty, so the first row is identified as "unidentified". For the second row, the content of D2 is "earthwork" and the content of E2 is "m3"; neither is empty, so the second row is identified as "list".
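The row example above can be sketched as follows, with the row name written as "list". The function name `identify_row` is hypothetical, and the cell values are assumed to be strings or `None`; the name and unit columns are D and E, that is, 0-based indices 3 and 4.

```python
def identify_row(row, name_col, unit_col):
    """Apply the preset condition: name and unit must both be non-empty."""
    name = (row[name_col] or "").strip()
    unit = (row[unit_col] or "").strip()
    return "list" if name and unit else "unidentified"


rows = [
    [None, None, None, None, None],          # row 1: D1 and E1 are empty
    [None, None, None, "earthwork", "m3"],   # row 2: D2 and E2 are filled
]
labels = [identify_row(row, name_col=3, unit_col=4) for row in rows]
```

A row that fails the condition is labelled and skipped rather than aborting the run, which mirrors step (9).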
S232, judging that the target Excel file is successfully identified, and obtaining a column name and/or a row name corresponding to the preset identifier.
When the cell data meet the matching condition, the successful identification of the column cell data and/or the row cell data of the target Excel file can be determined, and at the moment, the column name and/or the row name corresponding to the preset identifier can be obtained.
With the above configuration, the electronic device already has the capability of automatic recognition. The specific identification process is as follows:
(1) Column identification. The target Excel file is traversed cell by cell, from top to bottom and from left to right. The data in each cell is read and tested with each column identification algorithm. If the data of a cell is identified by the column identification algorithm of some column name, the column containing that cell is identified as the defined column name, and column identification is no longer performed on the other cells of that column. Because multiple identical column names are not allowed, once a column identification algorithm has identified its column, that algorithm is not invoked again. Columns that no algorithm identifies are marked as unidentified. Traversal continues until every column identification algorithm has identified its corresponding column or the whole Excel file has been traversed.
(2) Row identification. The column identification described above serves as the basis for row identification. The target Excel file is traversed again, row by row from top to bottom. When a row is reached, because the columns have already been identified, there is no need to traverse the cell data of every column; the cell data of the relevant columns is fetched by column number and tested with the row identification algorithms. If the row is identified by some row identification algorithm, it is identified as the defined row name, identification of that row stops, and the process moves on to the next row. If a row is not identified by any row identification algorithm, it is marked as unidentified and the process likewise moves on to the next row. Unlike column identification, row identification allows identical row names, so all row identification algorithms are applied to the cell data of each row until all row cell data has been traversed.
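The two passes described above can be sketched together as follows. This is an illustration under stated assumptions, not the patented implementation: the function names `identify_columns` and `identify_rows` are hypothetical, rules are modelled as plain Python predicates, and the row pass is run only on the rows below the header because the source does not say how the header row itself is treated.

```python
def identify_columns(grid, column_rules):
    """First pass. column_rules maps column name -> predicate(text).

    Returns a dict mapping column name -> 0-based column index.
    """
    identified = {}
    for row in grid:                               # top to bottom
        for col, value in enumerate(row):          # left to right
            if col in identified.values():
                continue                           # column already named
            text = "" if value is None else str(value)
            for name, rule in column_rules.items():
                if name in identified:
                    continue                       # each name is used once
                if rule(text):
                    identified[name] = col
                    break
        if len(identified) == len(column_rules):
            break                                  # every rule found a column
    return identified


def identify_rows(grid, columns, row_rules):
    """Second pass. row_rules is a list of (row name, predicate(values))."""
    labels = []
    for row in grid:
        # Only the already identified columns are read, by column number.
        values = {name: ("" if row[col] is None else str(row[col]).strip())
                  for name, col in columns.items()}
        label = "unidentified"
        for row_name, rule in row_rules:
            if rule(values):
                label = row_name
                break                              # stop at the first match
        labels.append(label)
    return labels


grid = [
    [None, None, None],
    ["No.", "item name", "unit"],
    [1, "earthwork", "m3"],
    [2, "concrete", None],
]
column_rules = {"name": lambda t: "name" in t, "unit": lambda t: "unit" in t}
row_rules = [("list", lambda v: v["name"] != "" and v["unit"] != "")]

columns = identify_columns(grid, column_rules)
labels = identify_rows(grid[2:], columns, row_rules)
```

Because the row pass reads only the columns found in the first pass, its cost per row depends on the number of identified columns rather than on the width of the sheet.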
S24, determining column text data and/or row text data corresponding to the column names and/or row names based on the column names and/or row names. For details, refer to the description of step S14 in the above embodiment, which is not repeated here.
According to the Excel file identification method provided by this embodiment, each piece of cell data is matched against a preset identifier to judge whether it meets the matching condition. When the cell data meets the matching condition, the target Excel file is judged to be successfully identified, and the column name and/or row name corresponding to the preset identifier is obtained. The preset identifier is the identifier corresponding to a column name or a row name. In this way a target Excel file in any format can be identified automatically, file import errors caused by a mismatch between the format of the target Excel file and the template format are avoided, and the target Excel file can be imported successfully whatever its format.
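As an illustration of the matching condition for a column identifier, the sketch below combines an identification keyword check with an exclusion keyword check, as recited in the claims. The class name `ColumnIdentifier`, the helper `match_cell`, and the choice of case-sensitive substring matching are assumptions for this example; the source does not specify how a keyword is compared with cell data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ColumnIdentifier:
    """Hypothetical preset identifier for one column name."""
    column_name: str
    identification_keywords: List[str]
    exclusion_keywords: List[str] = field(default_factory=list)

    def matches(self, cell: str) -> bool:
        text = cell.strip()
        # Step 1: the cell must match at least one identification keyword.
        if not any(k in text for k in self.identification_keywords):
            return False
        # Step 2: the cell must not match any exclusion keyword.
        return not any(k in text for k in self.exclusion_keywords)


def match_cell(cell: str, identifiers: List[ColumnIdentifier]) -> Optional[str]:
    """Return the column name of the first identifier whose condition the cell meets."""
    for ident in identifiers:
        if ident.matches(cell):
            return ident.column_name
    return None
```

With an identifier whose identification keyword is `price` and whose exclusion keyword is `total`, a cell reading `unit price` meets the matching condition while `total price` does not.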
This embodiment provides an Excel file identification method that can be used in electronic devices such as mobile phones, computers, and tablet computers. Fig. 3 is a flowchart of an Excel file identification method according to an embodiment of the present invention. As shown in Fig. 3, the flow includes the following steps:
S31, acquiring a target Excel file.
S32, parsing the target Excel file to obtain cell data of the target Excel file. For details, refer to the description of step S22 in the above embodiment, which is not repeated here.
S33, identifying the cell data, and determining a column name and/or a row name corresponding to the target Excel file. For details, refer to the description of step S23 in the above embodiment, which is not repeated here.
S34, determining column text data and/or row text data corresponding to the column names and/or row names based on the column names and/or row names. For details, refer to the description of step S24 in the above embodiment, which is not repeated here.
S35, displaying the identified column names and/or row names of the target Excel file, and the column text data and/or row text data corresponding to the column names and/or row names.
The electronic device displays all data of the target Excel file (column text data and row text data), together with the row identification results and column identification results, in a preview interface. Specifically, the electronic device may display the column identification results at the top of the preview interface: a column that has been identified shows its defined column name, and a column that has not been identified shows "unidentified". The electronic device may display the row identification results at the far left of the preview interface: a row that has been identified shows its defined row name, and a row that has not been identified shows "unidentified". Each row is provided with a check box, and identified rows are checked by default.
S36, responding to the adjustment instruction of the column name and/or the row name.
By previewing the identification result of the target Excel file, the user can check whether it is correct and manually adjust any result that was identified incorrectly. The electronic device then responds to the adjustment instruction for the column names and/or row names entered by the user.
S37, adjusting the column names and/or the row names based on the adjustment instruction.
The electronic device adjusts the identified column names or row names of the target Excel file in response to the adjustment instruction entered by the user. Specifically, for a column, the user clicks the identification result at the top of the preview interface; a context menu pops up listing all column names defined by the electronic device, the user clicks the desired column name, and the electronic device redefines the column as the selected column name in response. Because duplicate column names are not allowed, any other column that already carries the selected column name is automatically reset to unidentified. For a row, the user clicks the identification result on the left; a context menu pops up listing all row names defined by the electronic device, the user clicks the desired row name, and the electronic device redefines the row as the selected row name in response.
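The column adjustment rule can be sketched as below. The function name `rename_column` and the use of `None` to stand for an unidentified column are assumptions for illustration only.

```python
from typing import Dict, Optional


def rename_column(columns: Dict[int, Optional[str]],
                  col: int,
                  new_name: str) -> Dict[int, Optional[str]]:
    """Return a copy of the column mapping with `col` redefined as `new_name`.

    Any other column that already carries `new_name` is reset to None
    (unidentified), because duplicate column names are not allowed.
    """
    updated = dict(columns)
    for other, name in list(updated.items()):
        if other != col and name == new_name:
            updated[other] = None
    updated[col] = new_name
    return updated
```

No such reset is needed for rows, since the same row name may be assigned to several rows.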
S38, in response to a selection instruction for a tab of the target Excel file, determining the tab to be imported of the target Excel file.
The selection instruction is a selection operation of the target Excel file tab input by a user, and the electronic device can respond to the selection instruction input by the user. For example, the selection instruction may be a tab check operation, and the electronic device may respond to the tab check operation of the user. Of course, the selection instruction may be other selection operations, which are not limited herein, and may be determined by those skilled in the art according to actual needs. The electronic device can determine the tab to be imported corresponding to the target Excel file according to the selection instruction.
Specifically, the preview interface displays the cell data and identification results of only one tab at a time; the user switches between tabs through the tab drop-down box of the target Excel file provided by the preview interface. On import, only the cell data of the currently selected tab is imported. If only part of the data of the currently selected tab needs to be imported, the check boxes of the rows that do not need to be imported are cleared.
After automatic identification, manual adjustment, and data screening, the preview interface presents the cell data that will actually be imported. When the target Excel file is imported, its rows are traversed from top to bottom and any unchecked row is skipped. For each checked row, a new record is added to the corresponding table of the database according to the identified row name, and the data of each field is read from the corresponding column of that row, until the whole target Excel file has been traversed.
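The import pass can be sketched as follows. An in-memory dictionary of record lists stands in for the database tables, and the function name `import_rows` is hypothetical. Skipping a checked row that has no identified row name is an assumption of this sketch; the source does not say how such a row is handled.

```python
from typing import Dict, List, Optional


def import_rows(grid: List[List[str]],
                columns: Dict[int, str],
                row_names: List[Optional[str]],
                checked: List[bool]) -> Dict[str, List[Dict[str, str]]]:
    """Build one record per checked row, grouped by the identified row name."""
    tables: Dict[str, List[Dict[str, str]]] = {}
    for row, name, is_checked in zip(grid, row_names, checked):  # top to bottom
        if not is_checked or name is None:
            continue  # unchecked rows are skipped (unnamed rows too: an assumption)
        # Read each field from the column identified for it.
        record = {field: (row[col] if col < len(row) else "")
                  for col, field in columns.items()}
        tables.setdefault(name, []).append(record)
    return tables
```

The row name selects the destination table and the column names select the fields, so the layout of the source sheet does not need to match any template.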
According to the Excel file identification method provided by this embodiment of the invention, in response to a selection instruction for a tab of the target Excel file, the tab to be imported and its corresponding cell data are determined based on the selection instruction, which overcomes the difficulty of importing only part of the data of the target Excel file and makes import flexible. By displaying the identified column names and/or row names of the target Excel file, together with the corresponding column text data and/or row text data, the user can check whether the identification result is correct, which prevents a blind import from damaging existing project files. When the identification result is unreasonable, the user can adjust it manually: the electronic device responds to the adjustment instruction for the column names and/or row names and adjusts them accordingly. The identification result can thus be adjusted a second time without modifying the original target Excel file, which improves the efficiency of importing Excel files.
This embodiment also provides an Excel file identification device, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
The present embodiment provides an Excel file identification device, as shown in fig. 4, including:
an obtaining module 41, configured to obtain a target Excel file. For details, refer to the corresponding description in the above method embodiments, which is not repeated here.
a parsing module 42, configured to parse the target Excel file to obtain cell data of the target Excel file. For details, refer to the corresponding description in the above method embodiments, which is not repeated here.
an identifying module 43, configured to identify the cell data and determine a column name and/or a row name corresponding to the target Excel file. For details, refer to the corresponding description in the above method embodiments, which is not repeated here.
a determining module 44, configured to determine column text data and/or row text data corresponding to the column name and/or row name based on the column name and/or row name. For details, refer to the corresponding description in the above method embodiments, which is not repeated here.
According to the Excel file identification device provided by this embodiment, the obtained target Excel file is parsed to obtain its cell data, and the cell data is identified to determine the column names and/or row names of the target Excel file and the column text data and/or row text data corresponding to them. The device does not require the target Excel file to be imported according to a particular template format: for a target Excel file in any format, the column text data and row text data it contains can be determined by identifying its column names and/or row names. The target Excel file is thus identified automatically, file import errors caused by a mismatch between the format of the target Excel file and the template format are avoided, and an Excel file in any format can be imported successfully.
The identification means of the Excel file in this embodiment is presented in the form of functional units, where the units refer to ASIC circuits, processors and memories executing one or more software or fixed programs, and/or other devices that can provide the above described functions.
Further functional descriptions of the above respective modules are the same as those of the above corresponding embodiments, and are not repeated here.
The embodiment of the invention also provides electronic equipment, which is provided with the identification device of the Excel file shown in the figure 4.
Referring to Fig. 5, Fig. 5 is a schematic structural diagram of an electronic device according to an alternative embodiment of the present invention. As shown in Fig. 5, the electronic device may include at least one processor 501, such as a CPU (Central Processing Unit), at least one communication interface 503, a memory 504, and at least one communication bus 502. The communication bus 502 is used to enable connection and communication between these components. The communication interface 503 may include a display screen (Display) and a keyboard (Keyboard), and may optionally further include a standard wired interface and a wireless interface. The memory 504 may be a high-speed RAM (Random Access Memory, a volatile memory) or a non-volatile memory, such as at least one disk memory. The memory 504 may optionally also be at least one storage device located remotely from the aforementioned processor 501. The memory 504 may store an application program of the device described in connection with Fig. 4, and the processor 501 invokes the program code stored in the memory 504 to perform any of the above method steps.
The communication bus 502 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 502 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in Fig. 5, but this does not mean that there is only one bus or one type of bus.
The memory 504 may include volatile memory, such as random-access memory (RAM); it may also include non-volatile memory, such as flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 504 may also include a combination of the above types of memory.
The processor 501 may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
The processor 501 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
Optionally, the memory 504 is also used for storing program instructions. The processor 501 may invoke program instructions to implement the method of identifying Excel files as shown in the embodiments of fig. 1-3 of the present application.
The embodiment of the invention also provides a non-transitory computer storage medium storing computer-executable instructions that can execute the Excel file identification method of any of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), a solid-state drive (SSD), or the like; the storage medium may also comprise a combination of the above kinds of memory.
Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.

Claims (8)

1. The identification method of the Excel file is characterized by comprising the following steps of:
obtaining a target Excel file;
analyzing the target Excel file to obtain cell data of the target Excel file;
Identifying the cell data, and determining a column name and/or a row name corresponding to the target Excel file, wherein the method comprises the following steps: matching the cell data based on a preset identifier, and judging whether the cell data meets a matching condition or not; when the cell data meet the matching condition, judging that the target Excel file is successfully identified, and obtaining a column name and/or a row name corresponding to the preset identifier;
Determining column text data and/or row text data corresponding to the column name and/or the row name based on the column name and/or the row name;
When the preset identifier is a column identifier, the cell data is matched based on the preset identifier, and whether the cell data meets a matching condition or not is judged, which comprises the following steps:
Acquiring an identification keyword and an exclusion keyword corresponding to the column identifier; wherein the identification keyword is that cell data containing the column identifier are all identified, and the exclusion keyword is that cell data containing the column identifier are not identified;
judging whether the column cell data is matched with the identification keyword or not;
when the column cell data is matched with the identification keyword, judging whether the column cell data is matched with the exclusion keyword or not;
and when the column cell data is not matched with the exclusion keyword, judging that the column cell data meets a matching condition.
2. The method according to claim 1, wherein when the preset identifier is a row identifier, the matching the cell data based on the preset identifier, and judging whether the cell data meets a matching condition, comprises:
Determining current row cell data corresponding to the current row identifier based on the column cell data satisfying the matching condition;
Acquiring a preset condition corresponding to the current row identifier;
judging whether the current row cell data meets the preset condition or not;
when the current row cell data meets the preset condition, judging that the current row cell data meets the matching condition.
3. The method according to claim 2, wherein when the preset identifier is a row identifier, the matching the cell data based on the preset identifier, and judging whether the cell data meets a matching condition, further comprises:
And when the current row cell data does not meet the preset condition, judging that the current row cell data fails to match, and jumping to the next row to continue identification.
4. The method according to claim 1, wherein the method further comprises:
And in response to a selection instruction for a tab of the target Excel file, determining the tab to be imported of the target Excel file.
5. The method according to claim 1, wherein the method further comprises:
displaying the column names and/or the row names corresponding to the identified target Excel file, and column text data and/or row text data corresponding to the column names and/or the row names;
responding to an adjustment instruction for the column name and/or the row name;
And adjusting the column names and/or the row names based on the adjustment instruction.
6. An Excel file identification device, comprising:
the acquisition module is used for acquiring the target Excel file;
The analysis module is used for analyzing the target Excel file to obtain cell data of the target Excel file;
The identification module is configured to identify the cell data, determine a column name and/or a row name corresponding to the target Excel file, and include: matching the cell data based on a preset identifier, and judging whether the cell data meets a matching condition or not; when the cell data meet the matching condition, judging that the target Excel file is successfully identified, and obtaining a column name and/or a row name corresponding to the preset identifier; when the preset identifier is a column identifier, the cell data is matched based on the preset identifier, and whether the cell data meets a matching condition or not is judged, which comprises the following steps: acquiring an identification keyword and an exclusion keyword corresponding to the column identifier; wherein the identification key is that cell data containing the column identifier are all identified, and the exclusion key is that cell data containing the column identifier are not identified; judging whether the column cell data are matched with the identification key or not; when the column cell data is matched with the identification keyword, judging whether the column cell data is matched with the exclusion keyword or not; when the column cell data is not matched with the exclusion keyword, judging that the column cell data meets a matching condition;
and the determining module is used for determining column text data and/or row text data corresponding to the column names and/or the row names based on the column names and/or the row names.
7. An electronic device, comprising:
the processor is in communication connection with the memory, the memory stores computer instructions, and the processor executes the computer instructions to perform the method for identifying an Excel file according to any one of claims 1-5.
8. A computer-readable storage medium storing computer instructions for causing a computer to perform the method of identifying an Excel file according to any one of claims 1-5.
CN202110231358.5A 2021-03-02 Excel file identification method and device, electronic equipment and readable storage medium Active CN112818937B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110231358.5A CN112818937B (en) 2021-03-02 Excel file identification method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110231358.5A CN112818937B (en) 2021-03-02 Excel file identification method and device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN112818937A CN112818937A (en) 2021-05-18
CN112818937B true CN112818937B (en) 2024-06-28

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111897884A (en) * 2020-07-20 2020-11-06 北京用友薪福社云科技有限公司 Data relation information display method and terminal equipment
CN112035412A (en) * 2020-08-31 2020-12-04 北京奇虎鸿腾科技有限公司 Data file importing method, device, storage medium and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant