CN109543154B - Type conversion method and device of table data, storage medium and electronic equipment

Info

Publication number
CN109543154B
CN109543154B CN201811183314.4A CN201811183314A CN109543154B CN 109543154 B CN109543154 B CN 109543154B CN 201811183314 A CN201811183314 A CN 201811183314A CN 109543154 B CN109543154 B CN 109543154B
Authority
CN
China
Prior art keywords
data
type
converted
probability
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811183314.4A
Other languages
Chinese (zh)
Other versions
CN109543154A (en
Inventor
蔡鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin ByteDance Technology Co Ltd
Original Assignee
Tianjin ByteDance Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin ByteDance Technology Co Ltd filed Critical Tianjin ByteDance Technology Co Ltd
Priority to CN201811183314.4A priority Critical patent/CN109543154B/en
Publication of CN109543154A publication Critical patent/CN109543154A/en
Application granted granted Critical
Publication of CN109543154B publication Critical patent/CN109543154B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/151Transformation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to a type conversion method and device for table data, a storage medium, and an electronic device. The method comprises: receiving an untyped spreadsheet and extracting its column data information and row header information; determining the data type of the column data information and the data type of the row header information; and converting the untyped spreadsheet into a typed database table according to the determination result. With this method and device, when data from an untyped spreadsheet is imported, the value type of each column is defined automatically from the characteristics of the existing data, enabling one-click import, reducing manual operation, and thereby improving efficiency.

Description

Type conversion method and device of table data, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of data storage, and more particularly, to a method and an apparatus for converting a type of table data, a storage medium, and an electronic device.
Background
There are two main types of online spreadsheets on the market: untyped spreadsheets, represented by Google Sheets, and typed database tables, represented by Airtable. The main differences between the two are as follows:
Untyped spreadsheet: there is no strict concept of a "field". A user may enter any value in any cell of any column; if the value type of a cell needs to be constrained, this must be done through "data validation" or "cell format".
Typed database table: when adding a field, the user must declare its type in advance, as when adding a column to a database, for example text, date, single-choice list, or multiple-choice list.
When a user creates a typed database table and the data is to be imported from an untyped spreadsheet, there are generally two methods:
Method 1: create the typed database table in advance, declare the field types, and then import or directly paste the data from Excel.
Method 2: directly import the untyped spreadsheet data (e.g., Excel or CSV), and then manually modify the value type of each column.
Both existing methods have an obvious drawback: they require manual operation by the user and are therefore inefficient.
Disclosure of Invention
The technical problem to be solved by the present disclosure is to provide a type conversion method and device for table data, a storage medium, and an electronic device that address the prior-art drawback that spreadsheets must be converted manually, which is inefficient.
The technical solution adopted by the present disclosure to solve this problem is to provide a type conversion method for table data, comprising:
receiving an untyped spreadsheet, and extracting column data information and row header information of the untyped spreadsheet;
determining the data type of the column data information and determining the data type of the row header information;
and converting the untyped spreadsheet into a typed database table according to the determination result.
Further, in the type conversion method for table data according to the present disclosure, the column data information is the data to be converted in each column of the untyped spreadsheet, excluding the first row;
determining the data type of the column data information comprises: converting the data to be converted using a preset data type;
and if the conversion succeeds, determining the data type of the data to be converted to be that preset data type.
Further, in the type conversion method for table data according to the present disclosure, the preset data type includes at least one of a date and time type, a date type, a link type, and a number type; the method further comprises:
setting a preset execution order for the preset data types;
converting the data to be converted using the preset data type comprises: converting the data to be converted using the preset data types in the preset execution order.
Further, the type conversion method for table data according to the present disclosure further comprises:
if the conversion fails,
judging, according to the probability of the data to be converted, whether the data to be converted is of a single-choice type or a multiple-choice type;
and if so, determining the data type of the data to be converted to be the single-choice type or the multiple-choice type.
Further, in the type conversion method for table data according to the present disclosure, judging whether the data to be converted is of a single-choice type or a multiple-choice type according to its probability comprises:
judging whether the first probability of the data to be converted is greater than a first preset probability;
if so, further judging whether the second probability of the data to be converted is greater than a second preset probability;
if so, the data type of the data to be converted is the multiple-choice type;
and if not, the data type of the data to be converted is the single-choice type.
Further, in the type conversion method for table data according to the present disclosure, the first probability of the data to be converted is obtained as follows:
scanning the full-column data of the untyped spreadsheet and removing duplicates, where there are m rows before deduplication and n rows after deduplication;
the first probability of the data to be converted is: p = 1 - n/m.
Further, in the type conversion method for table data according to the present disclosure, the second probability of the data to be converted is obtained as follows:
scanning the full-column data of the untyped spreadsheet and removing duplicates;
dividing the deduplicated data to be converted into a set A and a set B according to whether it contains a separator, where the entries in set A contain the separator and the entries in set B do not;
splitting each entry in set A by the separator to obtain a set C of the resulting strings;
calculating the intersection D of set B and set C;
the second probability of the data to be converted is:
(size(D)/size(B) + size(D)/size(A)) / 2.
Further, the type conversion method for table data according to the present disclosure further comprises:
if the first probability of the data to be converted is not greater than the first preset probability, the data type of the data to be converted is a text type;
judging whether the character length of the data to be converted is greater than a preset length;
if so, the data to be converted is of a long text type;
if not, the data to be converted is of a short text type.
Further, in the type conversion method for table data according to the present disclosure, determining the data type of the row header information comprises:
after the data type of the column data information is obtained, judging whether each column of the first row of data to be converted matches the data type of the corresponding column data information;
if so, the first row of data to be converted is a data row;
if not, the first row of data to be converted is a header row.
In addition, the present disclosure also provides a type conversion apparatus of table data, including:
a receiving unit, configured to receive an untyped spreadsheet and extract column data information and row header information of the untyped spreadsheet;
a first judging unit, configured to determine the data type of the column data information and the data type of the row header information;
and a conversion unit, configured to convert the untyped spreadsheet into a typed database table according to the determination result.
In addition, the present disclosure also provides a computer storage medium having a computer program stored thereon, which when executed by a processor implements the method of type conversion of table data as described above.
In addition, the present disclosure also provides an electronic device comprising a memory and a processor;
the memory is used for storing a computer program;
the processor is used for executing the computer program to realize the type conversion method of the table data.
Implementing the type conversion method and device for table data, the storage medium, and the electronic device of the present disclosure has the following beneficial effects. The method comprises: receiving an untyped spreadsheet, and extracting column data information and row header information of the untyped spreadsheet; determining the data type of the column data information and the data type of the row header information; and converting the untyped spreadsheet into a typed database table according to the determination result. When data from an untyped spreadsheet is imported, the value type of each column is defined automatically from the characteristics of the existing data, enabling one-click import and thereby improving efficiency.
Drawings
The disclosure will be further described with reference to the accompanying drawings and examples, in which:
fig. 1 is a flowchart of a type conversion method for table data according to an embodiment of the present disclosure;
fig. 2 is a flowchart for determining whether data to be converted is a preset data type according to another embodiment of the disclosure;
fig. 3 is a flowchart for determining whether data to be converted is of a single-selection type or a multiple-selection type according to another embodiment of the disclosure;
fig. 4 is a flowchart of determining whether data to be converted is of a single-selection type or a multiple-selection type according to another embodiment of the disclosure;
fig. 5 is a flowchart for determining whether data to be converted is of a long text type or a short text type according to another embodiment of the present disclosure;
FIG. 6 is a flowchart of determining a type of a first row of data to be converted according to another embodiment of the disclosure;
fig. 7 is a schematic structural diagram of a table data type conversion apparatus according to another embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of an electronic device according to another embodiment of the present disclosure.
Detailed Description
For a more clear understanding of the technical features, objects, and effects of the present disclosure, specific embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings.
Fig. 1 is a flowchart of a type conversion method of table data according to the present disclosure.
Specifically, the method for converting the type of the table data includes:
S11, receiving the untyped spreadsheet, and extracting column data information and row header information of the untyped spreadsheet. After the untyped spreadsheet is received, its data is scanned automatically to extract the column data information and row header information: the first row of the untyped spreadsheet is scanned as the row header information, and the data of each column other than the first row is scanned as the column data information.
S12, determining the data type of the column data information, and determining the data type of the row header information. The data types include, but are not limited to, a date and time type, a date type, a link type, and a number type, which are the types used in this embodiment. The data types may be set as needed; using more or fewer data types than in this embodiment is also contemplated by, and falls within the protection scope of, the present disclosure. The specific determination process is explained below.
S13, converting the untyped spreadsheet into a typed database table according to the determination result. Once the data type of the column data information and the data type of the row header information have been determined, the column data information is converted according to its data type, and the row header information is converted according to its data type.
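As a non-limiting illustration of step S11, the following Python sketch separates the first row (the candidate row header information) from the per-column data. The function name `split_sheet`, the use of CSV text as input, and the padding of short rows with empty strings are assumptions made for the example and are not specified in the disclosure.

```python
import csv
import io

def split_sheet(csv_text):
    """Return (first_row, columns) for an untyped sheet given as CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [], []
    first_row, body = rows[0], rows[1:]
    width = len(first_row)
    # Assumption: rows shorter than the first row are padded with "".
    columns = [
        [row[i] if i < len(row) else "" for row in body]
        for i in range(width)
    ]
    return first_row, columns

sample = (
    "name,joined,site\n"
    "Ada,2018-10-11,https://example.com\n"
    "Bob,2018-10-12,https://example.org\n"
)
first_row, columns = split_sheet(sample)
print(first_row)
print(columns[1])
```

Whether `first_row` is really a header row or an ordinary data row is decided only later, once the column types are known.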
Referring to fig. 2, the column data information is to-be-converted data in each column except for the first row in the untyped spreadsheet, and determining the data type of the column data information includes:
S21, converting the data to be converted using the preset data types. The preset data types include, but are not limited to, a date and time type, a date type, a link type, and a number type; this embodiment uses these four types for illustration, and using more or fewer preset data types is also contemplated by, and falls within the protection scope of, the present disclosure. When the preset data types are set, a preset execution order for them is set at the same time, and converting the data to be converted using the preset data types comprises converting the data to be converted using the preset data types in the preset execution order. For example, in this embodiment: first, the data to be converted is converted using the date and time type; if that conversion fails, it is converted using the time type; if that fails, using the date type; if that fails, using the link type; and if that fails, using the number type. If any conversion in this sequence succeeds, the conversions for the remaining preset data types are not executed.
S22, if the conversion is successful, determining the data type of the data to be converted. In the process of converting the data to be converted using the preset data types, if the data is successfully converted using a certain preset data type, the data type of the data to be converted is determined to be that preset data type. For example, in the above conversion process, if the conversion of the data to be converted using the date and time type succeeds, the data type of the data to be converted is determined to be the date and time type; if the conversion using the link type succeeds, the data type is determined to be the link type; and so on.
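Steps S21 and S22 can be sketched in Python as follows, as one possible reading rather than the definitive implementation. The concrete date and time formats, the rule that a link must use `http` or `https`, the rule that a column converts only when every non-empty cell converts, and all names (`PRESET_TYPES`, `detect_preset_type`) are assumptions introduced for the example.

```python
from datetime import datetime
from urllib.parse import urlparse

def _parses_as(fmt):
    """Build a checker that succeeds when a cell parses with the given format."""
    def check(cell):
        try:
            datetime.strptime(cell, fmt)
            return True
        except ValueError:
            return False
    return check

def _is_link(cell):
    try:
        parts = urlparse(cell)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)

def _is_number(cell):
    try:
        float(cell)
        return True
    except ValueError:
        return False

# Execution order taken from this embodiment; the formats are assumed.
PRESET_TYPES = [
    ("datetime", _parses_as("%Y-%m-%d %H:%M:%S")),
    ("time", _parses_as("%H:%M:%S")),
    ("date", _parses_as("%Y-%m-%d")),
    ("link", _is_link),
    ("number", _is_number),
]

def detect_preset_type(cells):
    """Return the first preset type that converts every non-empty cell."""
    values = [c.strip() for c in cells if c.strip()]
    if not values:
        return None
    for name, converts in PRESET_TYPES:
        if all(converts(v) for v in values):
            return name  # first success wins; later types are not tried
    return None  # all preset conversions failed

print(detect_preset_type(["2018-10-11", "2018-10-12"]))
print(detect_preset_type(["red", "blue"]))
```

A return value of `None` corresponds to the case in which every preset conversion fails and the probability-based judgment takes over.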
After the conversion in step S21, if all the conversion of the preset data types fails, it indicates that the data type of the data to be converted is not within the preset data types. Referring to fig. 3, the method performs the following steps:
S23, judging, according to the probability of the data to be converted, whether the data to be converted is of a single-choice type or a multiple-choice type.
S31, if the probability of the data to be converted meets the requirement of the single-choice type or the multiple-choice type, determining the data type of the data to be converted to be the single-choice type or the multiple-choice type.
S32, if the probability of the data to be converted does not meet the requirement of the single-choice type or the multiple-choice type, determining that the data type of the data to be converted is a text type.
Specifically, referring to fig. 4, determining whether the data to be converted is of the single-choice type or the multiple-choice type according to the probability of the data to be converted includes:
S231, judging whether the first probability of the data to be converted is greater than the first preset probability. The first probability of the data to be converted is obtained as follows:
scanning the full-column data of the untyped spreadsheet and removing duplicates, where there are m rows before deduplication and n rows after deduplication;
the first probability of the data to be converted is: p = 1 - n/m.
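The first probability can be illustrated with a short Python sketch. The function name `first_probability` and the handling of an empty column (returning 0.0) are assumptions for the example; the formula itself is p = 1 - n/m as stated above.

```python
def first_probability(cells):
    """Compute p = 1 - n/m for one column of cell values."""
    m = len(cells)          # rows before deduplication
    if m == 0:
        return 0.0          # assumption: an empty column yields 0
    n = len(set(cells))     # rows after deduplication
    return 1 - n / m

colors = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]
print(first_probability(colors))          # m = 8, n = 3
print(first_probability(["a", "b", "c"]))  # every value is distinct
```

A column that repeats a small set of values gives a value close to 1, whereas a column of distinct values gives 0, which is why a high first probability suggests a choice-type field.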
S232, if the first probability of the data to be converted is greater than the first preset probability, the data to be converted is of either the single-choice type or the multiple-choice type. Which of the two applies is decided by further judging whether the second probability of the data to be converted is greater than the second preset probability. The second probability of the data to be converted is obtained as follows:
scanning the full-column data of the untyped spreadsheet and removing duplicates;
dividing the deduplicated data to be converted into a set A and a set B according to whether it contains a separator, where the entries in set A contain the separator and the entries in set B do not;
splitting each entry in set A by the separator to obtain a set C of the resulting strings;
calculating the intersection D of set B and set C;
the second probability of the data to be converted is:
(size(D)/size(B) + size(D)/size(A)) / 2.
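The construction of sets A, B, C, and D can be sketched in Python as follows. The comma separator, the trimming of whitespace around split tokens, the value 0.0 returned when set A or set B is empty, and the function name `second_probability` are assumptions for the example; the formula is applied exactly as written above.

```python
def second_probability(cells, separator=","):
    """Compute (size(D)/size(B) + size(D)/size(A)) / 2 for one column."""
    unique = set(cells)                                  # deduplicate
    set_a = {v for v in unique if separator in v}        # contain separator
    set_b = unique - set_a                               # no separator
    if not set_a or not set_b:
        return 0.0  # assumption: undefined ratios are treated as 0
    # Set C: every token obtained by splitting the entries of set A.
    set_c = {tok.strip() for v in set_a for tok in v.split(separator)}
    set_d = set_b & set_c                                # intersection D
    return (len(set_d) / len(set_b) + len(set_d) / len(set_a)) / 2

tags = [
    "red", "blue", "green", "n/a",
    "red,blue", "blue,green", "red,green", "red,blue,green",
    "red", "blue",
]
print(second_probability(tags))
```

In this example set B has four entries, set A has four entries, and three of the single values also occur as tokens inside the separated entries, so both ratios are 3/4. Note that, because size(D) can be larger than size(A), the expression as written is not guaranteed to stay at or below 1.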
The first preset probability and the second preset probability are obtained by training:
The first preset probability: given a data set in which every column is of the single-choice type, with more than 50 rows and more than 1000 columns, the average of the probabilities is calculated.
The second preset probability: given a data set in which every column is of the multiple-choice type, with more than 50 rows and more than 1000 columns, the average of the probabilities is calculated.
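One way to read this training step is that each preset probability is the mean of the corresponding probability over a set of columns whose type is already known. The following Python sketch illustrates that reading for the first preset probability; the helper names and the tiny sample columns are assumptions for the example, and a real training set would be far larger, as described above.

```python
from statistics import mean

def first_probability(cells):
    """p = 1 - n/m, with n distinct values among m rows."""
    return 1 - len(set(cells)) / len(cells)

def train_threshold(labelled_columns, probability):
    """Average a probability function over columns of one known type."""
    return mean(probability(column) for column in labelled_columns)

# Hypothetical columns that are all known to be single-choice.
single_choice_columns = [
    ["a", "b", "a", "b"],                       # p = 0.5
    ["p", "q", "p", "p", "q", "p", "p", "q"],   # p = 0.75
]
first_preset = train_threshold(single_choice_columns, first_probability)
print(first_preset)
```

The second preset probability could be obtained in the same way by passing multiple-choice columns and a second-probability function to `train_threshold`.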
Alternatively, a machine learning algorithm (e.g., naive bayes, decision trees) can be introduced, a model is obtained through a set of training sets, and the model is used to predict the type of data.
S233, if the second probability of the data to be converted is greater than the second preset probability, the data type of the data to be converted is the multiple-choice type.
S234, if the second probability of the data to be converted is not greater than the second preset probability, the data type of the data to be converted is the single-choice type.
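The two-stage comparison in steps S231 to S234 can be summarized in a short Python sketch. The function name, the returned labels, and the threshold values used in the usage lines are hypothetical; only the order of the comparisons is taken from the description.

```python
def classify_choice(first_p, second_p, first_preset, second_preset):
    """Apply the two threshold comparisons of steps S231 to S234."""
    if not first_p > first_preset:
        return "text"            # not a choice type; falls through to text
    if second_p > second_preset:
        return "multiple_choice"  # S233
    return "single_choice"        # S234

# Hypothetical thresholds, chosen only for illustration.
print(classify_choice(0.9, 0.8, first_preset=0.6, second_preset=0.5))
print(classify_choice(0.9, 0.1, first_preset=0.6, second_preset=0.5))
print(classify_choice(0.2, 0.8, first_preset=0.6, second_preset=0.5))
```

Both comparisons are strict, so a probability exactly equal to its preset value counts as "not greater".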
Referring to fig. 5, if the first probability of the data to be converted is not greater than the first preset probability, the data type of the data to be converted is a text type, and the method further comprises:
S321, judging whether the character length of the data to be converted is greater than a preset length;
S322, if the character length of the data to be converted is greater than the preset length, the data to be converted is of a long text type;
S323, if the character length of the data to be converted is not greater than the preset length, the data to be converted is of a short text type.
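Steps S321 to S323 can be illustrated as follows. The description does not state how the length of a whole column is measured, so this Python sketch assumes the longest cell in the column is compared against the preset length; that choice, the default of 100 characters, and the function name are all assumptions.

```python
def classify_text(cells, preset_length=100):
    """Label a text column as long or short text by its longest cell."""
    # Assumption: the column's length is taken as its longest cell.
    longest = max((len(cell) for cell in cells), default=0)
    return "long_text" if longest > preset_length else "short_text"

print(classify_text(["ok", "needs review"]))
print(classify_text(["x" * 250, "short note"]))
```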
Referring to fig. 6, determining the data type of the row header information in step S12 comprises:
S121, after the data type of the column data information is obtained, judging whether each column of the first row of data to be converted matches the data type of the corresponding column data information;
S122, if each column of the first row of data to be converted matches the data type of the column data information, the first row of data to be converted is a data row, and the first row is converted according to the data row attribute.
S123, if the columns of the first row of data to be converted do not match the data types of the column data information, the first row of data to be converted is a header row, and the first row is converted according to the header row attribute.
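The header-row judgment of steps S121 to S123 can be sketched in Python as follows. The type labels, the `MATCHERS` table, and the rule that types without a matcher (such as text) accept any value are assumptions introduced for the example.

```python
def _is_number(cell):
    try:
        float(cell)
        return True
    except ValueError:
        return False

# Hypothetical matchers; types not listed here accept any cell.
MATCHERS = {"number": _is_number}

def is_header_row(first_row, column_types, matchers=MATCHERS):
    """Return True when some first-row cell does not fit its column's type."""
    for cell, type_name in zip(first_row, column_types):
        matches = matchers.get(type_name)
        if matches is not None and not matches(cell):
            return True   # S123: mismatch, so the first row is a header row
    return False          # S122: every cell matches, so it is a data row

column_types = ["short_text", "number"]
print(is_header_row(["name", "age"], column_types))
print(is_header_row(["Ada", "36"], column_types))
```

Under this reading, a sheet whose columns are all text cannot be distinguished by type alone, so its first row is treated as a data row.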
Referring to fig. 7, the present disclosure also provides a type conversion apparatus of table data, the type conversion apparatus 70 including:
a receiving unit 701, configured to receive an untyped spreadsheet and extract column data information and row header information of the untyped spreadsheet;
a first judging unit 702, configured to determine the data type of the column data information and the data type of the row header information;
a conversion unit 703, configured to convert the untyped spreadsheet into a typed database table according to the determination result.
Further, in the type conversion apparatus for table data of the present disclosure, the column data information is the data to be converted in each column of the untyped spreadsheet, excluding the first row;
the first judging unit includes: a second judging unit, configured to convert the data to be converted using a preset data type and, if the conversion succeeds, determine the data type of the data to be converted.
Further, in the type conversion apparatus for table data of the present disclosure, the preset data type includes at least one of a date and time type, a date type, a link type, and a number type, and a preset execution order is set for the preset data types;
the second judging unit includes: a third judging unit, configured to convert the data to be converted using the preset data types in the preset execution order.
Further, the type conversion apparatus for table data of the present disclosure further includes:
a fourth judging unit, configured to, if the conversion fails, judge according to the probability of the data to be converted whether the data to be converted is of a single-choice type or a multiple-choice type; and if so, determine the data type of the data to be converted to be the single-choice type or the multiple-choice type.
Further, in the type conversion apparatus for table data of the present disclosure, judging whether the data to be converted is of a single-choice type or a multiple-choice type according to its probability is performed by:
a fifth judging unit, configured to judge whether the first probability of the data to be converted is greater than the first preset probability;
a sixth judging unit, configured to, if the first probability of the data to be converted is greater than the first preset probability, judge whether the second probability of the data to be converted is greater than the second preset probability; if so, the data type of the data to be converted is the multiple-choice type; if not, the data type of the data to be converted is the single-choice type.
Further, in the type conversion apparatus for table data of the present disclosure, the first probability of the data to be converted is obtained as follows: scanning the full-column data of the untyped spreadsheet and removing duplicates, where there are m rows before deduplication and n rows after deduplication; the first probability of the data to be converted is: p = 1 - n/m.
Further, in the type conversion apparatus for table data of the present disclosure, the second probability of the data to be converted is obtained as follows: scanning the full-column data of the untyped spreadsheet and removing duplicates; dividing the deduplicated data to be converted into a set A and a set B according to whether it contains a separator, where the entries in set A contain the separator and the entries in set B do not; splitting each entry in set A by the separator to obtain a set C of the resulting strings; calculating the intersection D of set B and set C;
the second probability of the data to be converted is:
(size(D)/size(B) + size(D)/size(A)) / 2.
Further, in the type conversion apparatus for table data of the present disclosure, if the first probability of the data to be converted is not greater than the first preset probability, the data type of the data to be converted is a text type, and the apparatus further includes:
a seventh judging unit, configured to judge whether the character length of the data to be converted is greater than a preset length; if so, the data to be converted is of a long text type; if not, the data to be converted is of a short text type.
Further, in the type conversion apparatus for table data of the present disclosure, the first judging unit includes:
an eighth judging unit, configured to judge, after the data type of the column data information is obtained, whether each column of the first row of data to be converted matches the data type of the corresponding column data information; if so, the first row of data to be converted is a data row; if not, the first row of data to be converted is a header row.
In addition, the present disclosure also provides a computer storage medium having a computer program stored thereon, the computer program, when executed by a processor, implementing the method for type conversion of table data as described above.
Referring to fig. 8, the present disclosure also provides an electronic device comprising a memory and a processor;
the memory is used for storing a computer program;
the processor is used for executing the computer program to realize the type conversion method of the table data.
Referring now to FIG. 8, shown is a schematic diagram of an electronic device 800 suitable for use in implementing embodiments of the present disclosure. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 8, an electronic device 800 may include a processing means (e.g., central processing unit, graphics processor, etc.) 801 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)802 or a program loaded from a storage means 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the electronic apparatus 800 are also stored. The processing apparatus 801, the ROM802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to bus 804.
Generally, the following devices may be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 807 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage 808 including, for example, magnetic tape, hard disk, etc.; and a communication device 809. The communication means 809 may allow the electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. While fig. 8 illustrates an electronic device 800 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication means 809, or installed from the storage means 808, or installed from the ROM 802. The computer program, when executed by the processing apparatus 801, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure may be a computer readable signal medium or a computer storage medium or any combination of the two. A computer storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of computer storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire at least two internet protocol addresses; send a node evaluation request comprising the at least two internet protocol addresses to node evaluation equipment, wherein the node evaluation equipment selects an internet protocol address from the at least two internet protocol addresses and returns it; and receive the internet protocol address returned by the node evaluation equipment; wherein the obtained internet protocol address indicates an edge node in the content distribution network.
Alternatively, the computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receive a node evaluation request comprising at least two internet protocol addresses; select an internet protocol address from the at least two internet protocol addresses; and return the selected internet protocol address; wherein the received internet protocol address indicates an edge node in the content distribution network.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The name of a unit does not, in some cases, constitute a limitation of the unit itself; for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".
By implementing the method and the device, when data from an untyped spreadsheet is imported, the value type of each relevant column is defined automatically according to the characteristics of the existing data. This enables one-click import and effectively reduces manual operation, thereby improving efficiency.
The above embodiments are merely illustrative of the technical concepts and features of the present disclosure, and are intended to enable those skilled in the art to understand the present disclosure and implement the present disclosure, and not to limit the scope of the present disclosure. All equivalent changes and modifications made within the scope of the claims of the present disclosure are intended to be covered by the claims of the present disclosure.

Claims (14)

1. A method for converting the type of tabular data, comprising:
receiving an untyped spreadsheet, and extracting column data information and row header information of the untyped spreadsheet;
judging the data type of the column data information and judging the data type of the row header information;
the judging the data type of the column data information comprises: converting the data to be converted by using a preset data type;
if the conversion is successful, determining the data type of the data to be converted;
if the conversion fails, judging whether the data to be converted is of a single-selection type or a multi-selection type according to the probability of the data to be converted, wherein the probability of the data to be converted is the probability that the data to be converted belongs to a specific data type;
if so, determining the data type of the data to be converted to be a single-selection type or a multi-selection type;
converting the untyped spreadsheet into a typed database table according to a judgment result;
the judging whether the data to be converted is of a single-selection type or a multi-selection type according to the probability of the data to be converted comprises: judging whether the first probability of the data to be converted is greater than a first preset probability; if so, further judging whether a second probability of the data to be converted is greater than a second preset probability; if so, the data type of the data to be converted is a multi-selection type; and if not, the data type of the data to be converted is a single-selection type.
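The two-threshold decision at the end of claim 1 can be illustrated with a minimal Python sketch. The function name, the label strings, and the default thresholds of 0.5 are assumptions made for illustration only; the claim does not fix any of them.

```python
def classify_choice(first_prob, second_prob,
                    first_threshold=0.5, second_threshold=0.5):
    """Return a choice-type label, or None if the column is not a choice type.

    The thresholds stand in for the "first preset probability" and
    "second preset probability"; their values here are hypothetical.
    """
    # The first probability must be strictly greater than the first preset value.
    if first_prob <= first_threshold:
        return None
    # The second probability then separates multi-selection from single-selection.
    if second_prob > second_threshold:
        return "multi_select"
    return "single_select"
```

For example, under these assumed thresholds, `classify_choice(0.9, 0.2)` yields a single-selection label, while a first probability at or below the threshold yields no choice type at all.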
2. The method for converting the type of tabular data according to claim 1, wherein the preset data type includes at least one of a date and time type, a date type, a link type, and a number type; the method further comprises:
setting a preset execution sequence of the preset data types;
the converting the data to be converted by using the preset data type comprises: converting the data to be converted by using a preset data type according to the preset execution sequence.
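As an illustration of trying preset data types in a preset execution sequence, the following Python sketch attempts each converter in order and reports the first one that succeeds. The converter functions, the date formats, the order shown, and the rule that every value in the column must convert are all assumptions; the claim only requires that the preset types be tried in a preset order.

```python
from datetime import datetime
from urllib.parse import urlparse

def to_datetime(text):
    # Assumed date-and-time format; raises ValueError on mismatch.
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

def to_date(text):
    # Assumed date format; raises ValueError on mismatch.
    return datetime.strptime(text, "%Y-%m-%d").date()

def to_link(text):
    parts = urlparse(text)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("not a link")
    return text

def to_number(text):
    return float(text)

# Hypothetical preset execution sequence.
PRESET_ORDER = [
    ("datetime", to_datetime),
    ("date", to_date),
    ("link", to_link),
    ("number", to_number),
]

def detect_preset_type(values):
    """Return the first preset type that converts every value, else None."""
    if not values:
        return None
    for name, convert in PRESET_ORDER:
        try:
            for value in values:
                convert(value.strip())
        except ValueError:
            continue  # conversion failed; try the next preset type
        return name
    return None
```

A result of `None` corresponds to the "conversion fails" branch of claim 1, after which the probability-based judgment applies.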
3. The method for converting the type of tabular data according to claim 1, wherein said first probability obtaining process of the data to be converted is:
scanning the full column of data of the untyped spreadsheet and removing duplicates, wherein there are m rows before deduplication and n rows after deduplication;
the first probability of the data to be converted is: p = 1 - n/m.
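The first probability of claim 3 can be sketched in a few lines of Python. The function name and the handling of an empty column (returning 0.0) are assumptions; the claim only defines p in terms of the row counts m and n.

```python
def first_probability(column):
    """Compute p = 1 - n/m for one column of cell strings."""
    m = len(column)        # rows before deduplication
    if m == 0:
        return 0.0         # assumed convention for an empty column
    n = len(set(column))   # rows after deduplication
    return 1 - n / m
```

A column in which every value is distinct gives p = 0, while a column that repeats a small set of values gives p close to 1, which is what makes a choice type plausible.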
4. The method according to claim 1, wherein the second probability obtaining process of the data to be converted is:
scanning the full column of data of the untyped spreadsheet and removing duplicates;
dividing the deduplicated data to be converted into a set A and a set B according to whether the data to be converted contains a separator, wherein the entries in set A contain the separator and the entries in set B do not contain the separator;
splitting each entry in set A at the separators to obtain a set C consisting of the resulting character strings;
calculating the intersection D of the set B and the set C;
the second probability of the data to be converted is:
(size(D)/size(B)+size(D)/size(A))/2.
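The set construction of claim 4 can be sketched in Python as follows. The comma separator, the function name, and the choice to return 0.0 when set A or set B is empty are assumptions made so that the example runs; the claim does not specify them.

```python
def second_probability(column, separator=","):
    """Compute (size(D)/size(B) + size(D)/size(A)) / 2 for one column."""
    unique = set(column)                                  # deduplicated data
    set_a = {v for v in unique if separator in v}         # entries with a separator
    set_b = unique - set_a                                # entries without one
    if not set_a or not set_b:
        return 0.0                                        # assumed convention
    # Set C: the pieces obtained by splitting every entry of A.
    set_c = {piece for v in set_a for piece in v.split(separator)}
    set_d = set_b & set_c                                 # intersection D
    return (len(set_d) / len(set_b) + len(set_d) / len(set_a)) / 2
```

For instance, with the column `["red", "blue", "yellow", "white", "red,blue", "green,black"]`, set A has 2 entries, set B has 4, and D is {red, blue}, so the expression evaluates to (2/4 + 2/2)/2 = 0.75.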
5. The method for converting the type of tabular data according to claim 1, further comprising:
if the first probability of the data to be converted is not greater than a first preset probability, the data type of the data to be converted is a text type;
judging whether the number of characters of the data to be converted is greater than a preset length;
if yes, the data to be converted is of a long text type;
if not, the data to be converted is of a short text type.
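The long-text versus short-text decision of claim 5 can be sketched as below. The default preset length of 255 and the rule of testing the longest cell in the column are assumptions; the claim only states that the character count is compared with a preset length.

```python
def classify_text(column, preset_length=255):
    """Label a text column as long or short text by its longest cell."""
    # Assumed rule: one cell longer than the preset length makes the column long text.
    longest = max((len(value) for value in column), default=0)
    return "long_text" if longest > preset_length else "short_text"
```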
6. The method of converting a type of tabular data according to claim 1, wherein said judging a data type of said row header information comprises:
after the data type of the column data information is obtained, judging whether each column of the first row of data to be converted matches the data type of the column data information;
if so, the first row of data to be converted is a data row;
if not, the first row of data to be converted is a header row.
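The first-row check of claim 6 can be illustrated with a short Python sketch. It assumes that each column's detected type is available as a predicate function; the names `is_number`, `is_text`, and `is_header_row` are hypothetical.

```python
def is_number(text):
    """Predicate for a number-typed column."""
    try:
        float(text)
    except ValueError:
        return False
    return True

def is_text(text):
    """Predicate for a text-typed column: any string matches."""
    return True

def is_header_row(first_row, column_checks):
    """Return True if any cell of the first row fails its column's type check."""
    for cell, matches in zip(first_row, column_checks):
        if not matches(cell):
            return True   # a mismatch means the first row is a header row
    return False          # every cell matches, so the first row is a data row
```

With columns typed as text and number, a first row of `["Name", "Age"]` is treated as a header row because "Age" does not convert to a number, whereas `["Alice", "30"]` is treated as a data row.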
7. A type conversion apparatus of table data, comprising:
the receiving unit is used for receiving the untyped spreadsheet and extracting column data information and row header information of the untyped spreadsheet;
the first judging unit is used for judging the data type of the column data information and judging the data type of the row header information;
a second judging unit for converting the data to be converted using a preset data type; if the conversion is successful, determining the data type of the data to be converted;
a fourth judging unit, configured to judge whether the data to be converted is of a single-selection type or a multi-selection type according to a probability of the data to be converted if the conversion fails, where the probability of the data to be converted is a probability that the data to be converted belongs to a specific data type; if so, determining the data type of the data to be converted to be a single-selection type or a multi-selection type;
the conversion unit is used for converting the untyped spreadsheet into a typed database table according to a judgment result;
a fifth judging unit, configured to judge whether the first probability of the data to be converted is greater than a first preset probability; if yes, executing a sixth judging unit: judging whether a second probability of the data to be converted is greater than a second preset probability; if so, the data type of the data to be converted is a multi-selection type; and if not, the data type of the data to be converted is a single selection type.
8. The apparatus for converting the type of tabular data according to claim 7, wherein the preset data type includes at least one of a date and time type, a date type, a link type, and a number type; a preset execution sequence of the preset data types is set;
the second judging unit includes: a third judging unit, configured to convert the data to be converted by using a preset data type according to the preset execution sequence.
9. The apparatus for converting a type of tabular data according to claim 7, wherein said first probability obtaining process of the data to be converted is:
scanning the full column of data of the untyped spreadsheet and removing duplicates, wherein there are m rows before deduplication and n rows after deduplication;
the first probability of the data to be converted is: p = 1 - n/m.
10. The apparatus for converting a type of tabular data according to claim 7, wherein said second probability obtaining process for the data to be converted is:
scanning the full column of data of the untyped spreadsheet and removing duplicates;
dividing the deduplicated data to be converted into a set A and a set B according to whether the data to be converted contains a separator, wherein the entries in set A contain the separator and the entries in set B do not contain the separator;
splitting each entry in set A at the separators to obtain a set C consisting of the resulting character strings;
calculating the intersection D of the set B and the set C;
the second probability of the data to be converted is:
(size(D)/size(B)+size(D)/size(A))/2.
11. The apparatus for converting the type of tabular data according to claim 7, wherein if the first probability of the data to be converted is not greater than a first preset probability, the data type of the data to be converted is a text type; the apparatus further comprises:
a seventh judging unit, configured to judge whether the number of characters of the data to be converted is greater than a preset length; if yes, the data to be converted is of a long text type; if not, the data to be converted is of a short text type.
12. The apparatus for converting a type of tabular data according to claim 7, wherein said first judging unit includes:
the eighth judging unit is configured to judge whether each column of the first row of data to be converted matches the data type of the column data information after the data type of the column data information is obtained; if so, the first row of data to be converted is a data row; if not, the first row of data to be converted is a header row.
13. A computer storage medium on which a computer program is stored, the computer program, when being executed by a processor, implementing a method of type conversion of tabular data according to any one of claims 1 to 6.
14. An electronic device comprising a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to execute the computer program to implement the method of converting a type of tabular data according to any one of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811183314.4A CN109543154B (en) 2018-10-11 2018-10-11 Type conversion method and device of table data, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN109543154A CN109543154A (en) 2019-03-29
CN109543154B CN109543154B (en) 2021-07-23

Family

ID=65843876

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811183314.4A Active CN109543154B (en) 2018-10-11 2018-10-11 Type conversion method and device of table data, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN109543154B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113051321B (en) * 2019-12-26 2024-05-28 航天信息股份有限公司 Data importing method, device, equipment and storage medium
CN111832268A (en) * 2020-06-30 2020-10-27 北京印象笔记科技有限公司 Information interaction method, readable storage medium and electronic device
CN113204555B (en) * 2021-05-21 2023-10-31 北京字跳网络技术有限公司 Data table processing method, device, electronic equipment and storage medium
CN115168478B (en) * 2022-09-06 2022-11-29 深圳市明源云科技有限公司 Data type conversion method, electronic device and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020089A (en) * 2011-09-27 2013-04-03 深圳市金蝶友商电子商务服务有限公司 Method and device for importing data in EXCEL file to database
CN103970736A (en) * 2013-01-25 2014-08-06 苏州精易会信息技术有限公司 Method for converting Excel sheet to database table
CN104182382A (en) * 2013-05-21 2014-12-03 北大方正集团有限公司 Method and system for achieving table standardization
CN107291674A (en) * 2017-06-12 2017-10-24 广东川田卫生用品有限公司 A kind of method that Excel list datas are converted to database format
CN108491510A (en) * 2018-03-22 2018-09-04 平安科技(深圳)有限公司 Excel data import method, apparatus, computer equipment and the storage medium of database

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010015554A (en) * 2008-06-03 2010-01-21 Just Syst Corp Table structure analysis device, table structure analysis method, and table structure analysis program

Also Published As

Publication number Publication date
CN109543154A (en) 2019-03-29

Similar Documents

Publication Publication Date Title
CN109543154B (en) Type conversion method and device of table data, storage medium and electronic equipment
CN110969012B (en) Text error correction method and device, storage medium and electronic equipment
CN109656923B (en) Data processing method and device, electronic equipment and storage medium
CN113204555B (en) Data table processing method, device, electronic equipment and storage medium
CN110413413A (en) A kind of method for writing data, device, equipment and storage medium
CN111597107B (en) Information output method and device and electronic equipment
CN111680761B (en) Information feedback method and device and electronic equipment
CN112784112A (en) Message checking method and device
CN116894188A (en) Service tag set updating method and device, medium and electronic equipment
CN111680799A (en) Method and apparatus for processing model parameters
CN111641690B (en) Session message processing method and device and electronic equipment
CN110852042A (en) Character type conversion method and device
CN113807056B (en) Document name sequence error correction method, device and equipment
CN113190460B (en) Automatic test case generation method and device
CN112541548B (en) Method, device, computer equipment and storage medium for generating relational network
CN111737572B (en) Search statement generation method and device and electronic equipment
CN105991400B (en) Group searching method and device
CN110928428B (en) Method, device, medium and electronic equipment for inputting electronic mail information
CN113868400A (en) Method and device for responding to digital human questions, electronic equipment and storage medium
CN110555070B (en) Method and apparatus for outputting information
CN113393288A (en) Order processing information generation method, device, equipment and computer readable medium
CN110990528A (en) Question answering method and device and electronic equipment
CN111339770A (en) Method and apparatus for outputting information
CN112131484A (en) Multi-person session establishing method, device, equipment and storage medium
CN110765764B (en) Text error correction method, electronic device, and computer-readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant