CN112860780B - Data export method and device and terminal equipment - Google Patents


Info

Publication number
CN112860780B
Authority
CN
China
Prior art keywords
data
server
target data
target
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110341385.8A
Other languages
Chinese (zh)
Other versions
CN112860780A (en)
Inventor
雷经纬
罗响
熊辉
杨丽萦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110341385.8A
Publication of CN112860780A
Application granted
Publication of CN112860780B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The specification provides a data export method, a data export device, and terminal equipment. With this method, when a user needs to export target data from a distributed database in the order given by a preset ordering rule, a terminal device configured with a data export device generates a data export request and sends it to a first server responsible for managing the distributed database. The request carries a data identifier of the target data to be exported, a table identifier of the target source table where the target data is located, and an ordering indication parameter. The terminal device then performs multiple data interactions, according to preset interaction rules, with the first server and with a second server responsible for providing the data transfer service. In this way the user can obtain, more efficiently and conveniently, target data that meets the user's requirements, is exported from the distributed database, and is ordered according to the preset ordering rule, which simplifies user operation and improves user experience.

Description

Data export method and device and terminal equipment
Technical Field
The specification belongs to the technical field of big data processing, and particularly relates to a data export method, a device and terminal equipment.
Background
In many data processing scenarios (e.g., statistical analysis of users' historical transaction data), a user often first needs to export target data that meets certain requirements from a distributed database storing a large amount of data. With existing methods, however, the target data obtained directly by the user is usually unordered, so the user must sort it manually before ordered target data meeting the requirements is available. Only then can the user carry out the subsequent data processing on the ordered target data.
Therefore, with existing methods, the user cannot directly and efficiently export from the distributed database target data that meets the user's requirements and is ordered according to a preset ordering rule. This results in cumbersome user operation, low processing efficiency, and poor user experience.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The specification provides a data export method, a data export device and terminal equipment, so that a user can obtain target data which meets the user requirements and is exported from a distributed database more efficiently and conveniently, and the target data is ordered according to a preset ordering rule, thereby improving the use experience of the user.
The specification provides a data export method, comprising:
Transmitting a data export request to a first server; the data export request at least carries a data identifier of target data to be exported, a table identifier of a target source table where the target data is located, and an ordering indication parameter;
Under the condition that the ordering indication parameter indicates that the target data is to be exported in order according to a preset ordering rule, querying a target source table in a distributed database by performing a first data interaction with the first server, and establishing a first temporary table in the distributed database; the first temporary table comprises the queried target data and the sequence numbers corresponding to the target data, the sequence numbers being obtained by sorting the target data based on the preset ordering rule;
Establishing a first external table in the distributed database by performing a second data interaction with the first server; wherein, according to the first external table, the first server coordinates a plurality of node servers in a distributed cluster to write, in parallel and based on the first temporary table, the target data carrying the sequence numbers into a preset directory of a second server;
Sorting the target data in the preset directory according to the sequence numbers by performing a third data interaction with the second server, to obtain sorted target data;
And receiving the ordered target data fed back by the second server.
In one embodiment, the ordering indication parameter further includes a preset ordering rule set by a user.
In one embodiment, querying a target source table in a distributed database by performing a first data interaction with a first server, and establishing a first temporary table in the distributed database, includes:
obtaining table structure information of a target source table;
based on the table structure information of the target source table, adding a sequence number column to obtain the table structure information of a first temporary table;
generating a first creation instruction about a first temporary table according to the table structure information of the first temporary table; the first creating instruction carries table structure information of a first temporary table;
Submitting the first creation instruction to the first server; the first server responds to the first creation instruction to create an empty table of a first temporary table in the distributed database; the first server queries a target source table according to the data export request; and the first server sorts the target data inquired from the target source table according to a preset sorting rule, and writes the target data and the serial number corresponding to the target data into an empty table of the first temporary table to obtain the first temporary table.
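As a minimal sketch of how the first creation instruction might be derived from the table structure information, the following Python function appends one sequence number column to a source table's column list and renders a DDL statement. The function name `build_temp_table_ddl`, the column name `seq_no`, the `BIGINT` type, and the example table are all assumptions made for illustration; the specification does not fix any of them.

```python
from typing import List, Tuple


def build_temp_table_ddl(source_columns: List[Tuple[str, str]],
                         temp_table: str,
                         seq_column: str = "seq_no") -> str:
    """Return a CREATE TABLE statement for the temporary table.

    The column list is the source table's columns plus one extra
    sequence number column (hypothetical name: seq_no).
    """
    if any(name == seq_column for name, _ in source_columns):
        raise ValueError(f"source table already has a column named {seq_column!r}")
    columns = list(source_columns) + [(seq_column, "BIGINT")]
    body = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {temp_table} (\n{body}\n);"


# Hypothetical source table structure: (column name, SQL type) pairs.
source_structure = [("account_id", "VARCHAR(32)"), ("amount", "DECIMAL(18,2)")]
ddl = build_temp_table_ddl(source_structure, "tmp_export_1")
print(ddl)
```

Keeping the sequence number in its own column leaves the source columns untouched, so the later sort step only needs to look at that one field.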
In one embodiment, after querying a target source table in a distributed database by a first data interaction with a first server and establishing a first temporary table in the distributed database, the method further comprises:
The first server releases an operation lock arranged on a target source table; the operation lock is set by the first server when the target source table is queried according to the data export request.
In one embodiment, establishing a first external table in the distributed database by performing a second data interaction with the first server, wherein the first server, according to the first external table, coordinates a plurality of node servers in a distributed cluster to write the target data carrying the sequence numbers into a preset directory of a second server in parallel based on the first temporary table, includes:
Generating a second creation instruction about the first external table according to the table structure information of the first temporary table;
submitting the second creation instruction to the first server; wherein the first server establishes, in response to the second creation instruction, a first external table associated with the first temporary table in the distributed database; the first external table stores parallel export parameters.
In one embodiment, the parallel export parameters include at least one of: a separator, a line break character, identification information of the node servers participating in the parallel export, and port configuration information associated with the parallel export.
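The parallel export parameters listed above can be pictured as a small configuration record. The sketch below is only an illustration: the class name `ParallelExportParams`, its field names, the default values, and the `encode_row` helper are hypothetical and are not taken from the specification or from any particular database product.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class ParallelExportParams:
    """Hypothetical container for the parameters held by the external table."""
    separator: str = "|"            # field separator in exported lines
    line_break: str = "\n"          # line break character ending each record
    node_ids: List[str] = field(default_factory=list)  # participating node servers
    port: int = 5000                # assumed port used for the parallel export

    def encode_row(self, values: Iterable[object]) -> str:
        """Render one row as text using the separator and line break."""
        return self.separator.join(str(v) for v in values) + self.line_break


params = ParallelExportParams(node_ids=["dn1", "dn2"], port=5000)
line = params.encode_row([1, "acct-001", "12.50"])
```

With these assumed defaults, a row whose first value is the sequence number is written as a single delimited line, which is the shape the later sort step relies on.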
In one embodiment, sorting the target data in the preset directory according to the sequence numbers by performing a third data interaction with the second server, to obtain sorted target data, includes:
generating a sorting request according to a preset protocol rule;
sending the sorting request to the second server; wherein the second server, in response to the sorting request, calls a preset sorting script file to sort the target data in the preset directory according to the sequence numbers, so as to obtain the sorted target data.
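The sorting step on the second server can be sketched in Python as follows. This is an in-memory stand-in for the preset sorting script, under several assumptions that the specification does not state: each node server writes one file ending in `.part`, the sequence number is the first field of every line, and the separator is `|`. The function name `sort_exported_files` is hypothetical.

```python
import os
import tempfile


def sort_exported_files(directory: str, output_path: str, separator: str = "|") -> int:
    """Merge all .part files in the directory into one file ordered by sequence number."""
    records = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".part"):
            continue  # ignore anything that is not an exported part file
        with open(os.path.join(directory, name), encoding="utf-8") as fh:
            for raw in fh:
                text = raw.rstrip("\n")
                if text:
                    seq_no = int(text.split(separator, 1)[0])
                    records.append((seq_no, text))
    records.sort(key=lambda item: item[0])  # numeric sort on the sequence number
    with open(output_path, "w", encoding="utf-8") as out:
        for _, text in records:
            out.write(text + "\n")
    return len(records)


# Demonstration with two part files written out of order.
with tempfile.TemporaryDirectory() as preset_dir:
    with open(os.path.join(preset_dir, "dn1.part"), "w", encoding="utf-8") as fh:
        fh.write("3|c\n1|a\n")
    with open(os.path.join(preset_dir, "dn2.part"), "w", encoding="utf-8") as fh:
        fh.write("2|b\n")
    result_path = os.path.join(preset_dir, "sorted.txt")
    count = sort_exported_files(preset_dir, result_path)
    with open(result_path, encoding="utf-8") as fh:
        sorted_lines = fh.read().splitlines()
```

Because the sequence numbers were already computed inside the database, this step only has to compare integers and never needs to re-apply the user's ordering rule.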
In one embodiment, the preset sorting script file includes a script file that encapsulates a sort command.
In one embodiment, after sorting the target data in the preset directory according to the sequence numbers by performing the third data interaction with the second server to obtain the sorted target data, the method further includes:
generating a formatting request according to a preset protocol rule;
sending the formatting request to the second server; and the second server responds to the formatting request and calls a preset formatting script file to perform formatting processing on the ordered target data so as to obtain the ordered target data meeting the preset format requirement.
In one embodiment, after sorting the target data in the preset directory according to the sequence numbers by performing the third data interaction with the second server to obtain the sorted target data, the method further includes:
Generating a verification request;
Sending the verification request to the second server; wherein the second server, in response to the verification request, counts the number of records in the sorted target data and the number of rows in the first temporary table, and checks, according to these two counts, whether the sorted target data meets a preset quality requirement.
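The count-based check can be illustrated with a few lines of Python. The sketch assumes that the quality requirement is simply that the two counts are equal, which is one plausible reading of the text; the function names `count_records` and `meets_quality_requirement` are hypothetical.

```python
from typing import Iterable


def count_records(lines: Iterable[str]) -> int:
    """Count non-empty lines, one line per exported record."""
    return sum(1 for line in lines if line.strip())


def meets_quality_requirement(exported_lines: Iterable[str], temp_table_rows: int) -> bool:
    """Assumed requirement: exported record count equals the temporary table row count."""
    return count_records(exported_lines) == temp_table_rows


complete = meets_quality_requirement(["1|a", "2|b", "3|c"], temp_table_rows=3)
truncated = meets_quality_requirement(["1|a", "2|b"], temp_table_rows=3)
```

A mismatch here would indicate that some node server failed to write its share of the rows, or wrote it twice.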
In one embodiment, in the case that the ordering indication parameter does not indicate that the target data is to be exported in order according to a preset ordering rule, the method further comprises:
Querying the target source table in the distributed database by performing a fourth data interaction with the first server, and establishing a second temporary table in the distributed database; the second temporary table comprises the queried target data;
Establishing a second external table in the distributed database by performing a fifth data interaction with the first server; wherein, according to the second external table, the first server coordinates a plurality of node servers in the distributed cluster to write the target data into the preset directory of the second server in parallel based on the second temporary table;
Receiving the target data fed back by the second server.
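The two branches described so far, ordered export and unordered export, can be summarized as a small dispatch on the ordering indication parameter. The sketch below only lists the interactions in order; the dictionary key `ordered`, the example rule string, and the function name `plan_interactions` are assumptions made for this illustration.

```python
from typing import Dict, List


def plan_interactions(ordering_param: Dict[str, object]) -> List[str]:
    """Return the sequence of interactions implied by the ordering indication parameter."""
    if ordering_param.get("ordered"):
        return [
            "first interaction: build first temporary table with sequence numbers",
            "second interaction: create first external table and export in parallel",
            "third interaction: sort exported data by sequence number",
            "receive ordered target data",
        ]
    return [
        "fourth interaction: build second temporary table without sequence numbers",
        "fifth interaction: create second external table and export in parallel",
        "receive target data",
    ]


ordered_plan = plan_interactions({"ordered": True, "rule": "trade_time ASC"})
unordered_plan = plan_interactions({"ordered": False})
```

The unordered path skips both the sequence number column and the sorting interaction, so it avoids the cost of sorting when the user does not need it.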
The present disclosure also provides a data export method applied to a first server, including:
Receiving a data export request; the data export request at least carries a data identifier of target data to be exported, a table identifier of a target source table where the target data is located, and an ordering indication parameter;
Determining, according to the ordering indication parameter, whether the target data is to be exported in order according to a preset ordering rule;
Under the condition that the target data is determined to be exported in order according to a preset ordering rule, querying a target source table in a distributed database by performing a first data interaction, and establishing a first temporary table in the distributed database; the first temporary table comprises the queried target data and the sequence numbers corresponding to the target data, the sequence numbers being obtained by sorting the target data based on the preset ordering rule;
Establishing a first external table in the distributed database by performing a second data interaction;
Coordinating, according to the first external table, a plurality of node servers in the distributed cluster to write the target data carrying the sequence numbers into a preset directory of a second server in parallel based on the first temporary table.
The present specification also provides a data export apparatus comprising:
the sending module is used for sending a data export request to the first server; the data export request at least carries a data identifier of target data to be exported, a table identifier of a target source table where the target data is located, and an ordering indication parameter;
The first interaction module is used for querying a target source table in the distributed database and establishing a first temporary table in the distributed database by performing a first data interaction with the first server, under the condition that the ordering indication parameter indicates that the target data is to be exported in order according to a preset ordering rule; the first temporary table comprises the queried target data and the sequence numbers corresponding to the target data, the sequence numbers being obtained by sorting the target data based on the preset ordering rule;
The second interaction module is used for establishing a first external table in the distributed database by performing a second data interaction with the first server; wherein, according to the first external table, the first server coordinates a plurality of node servers in a distributed cluster to write the target data carrying the sequence numbers into a preset directory of a second server in parallel based on the first temporary table;
the third interaction module is used for sorting the target data in the preset directory according to the sequence numbers by performing a third data interaction with the second server, to obtain sorted target data;
And the receiving module is used for receiving the ordered target data fed back by the second server.
The present specification also provides a terminal device comprising a processor, a data export apparatus, and a memory for storing instructions executable by the processor, the processor implementing the relevant steps of the data export method when executing the instructions.
The present specification also provides a computer readable storage medium having stored thereon computer instructions which when executed implement the relevant steps of the data export method.
With the data export method, device, and terminal equipment provided by the specification, when a user needs to export ordered target data from a distributed database according to a preset ordering rule, the terminal device provided with the data export device can generate a data export request carrying a data identifier of the target data to be exported, a table identifier of the target source table where the target data is located, and an ordering indication parameter, and send the request to a first server responsible for managing the distributed database. The terminal device can then perform a first data interaction with the first server to query the target source table in the distributed database and establish, in the distributed database, a first temporary table containing the queried target data and the sequence numbers obtained by sorting the target data based on the preset ordering rule. Next, the terminal device can establish a first external table in the distributed database by performing a second data interaction with the first server. According to the first external table, the first server can coordinate a plurality of node servers in the distributed cluster to write the target data carrying the sequence numbers, efficiently and in parallel based on the first temporary table, into a preset directory of the second server responsible for providing the data transfer service. Finally, the terminal device performs a third data interaction with the second server, so that the target data in the preset directory is sorted according to the sequence numbers it carries and the resulting ordered target data is fed back to the terminal device.
Therefore, through the terminal device the user can obtain, more efficiently and conveniently, target data that meets the user's requirements, is exported from the distributed database, and is ordered according to the preset ordering rule, which simplifies user operation and improves user experience.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure, the drawings that are required for the embodiments will be briefly described below, in which the drawings are only some of the embodiments described in the present disclosure, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of one embodiment of the structural composition of a system to which the data export method provided by the embodiments of the present specification is applied;
FIG. 2 is a flow chart of a data export method according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of a data export method according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a terminal device provided in an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a data export apparatus according to one embodiment of the present disclosure;
FIG. 6 is a schematic diagram of one embodiment of a data export method provided by embodiments of the present disclosure, in one example scenario;
FIG. 7 is a schematic diagram of one embodiment of a data export method provided by embodiments of the present disclosure, in one example scenario;
FIG. 8 is a schematic diagram of one embodiment of a data export method provided by embodiments of the present disclosure, in one example scenario.
Detailed Description
In order to make the technical solutions in the present specification better understood by those skilled in the art, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
Existing data export methods often make it difficult to export the required data from a distributed database efficiently and in order, which increases the user's operational burden and degrades the user experience.
To address the root cause of this problem, the present specification proposes the following. When a user needs to export ordered target data from a distributed database according to a preset ordering rule, a terminal device may generate a data export request carrying a data identifier of the target data to be exported, a table identifier of the target source table where the target data is located, and an ordering indication parameter, and send the request to a first server responsible for managing the distributed database. Following the data export request, the terminal device performs a first data interaction and a second data interaction with the first server, so that the first server can coordinate a plurality of node servers in the distributed cluster to write, in parallel and based on a first temporary table established in the distributed database, the target data carrying sequence numbers determined according to the preset ordering rule into a preset directory of a second server responsible for providing the data transfer service. Further, the terminal device may perform a third data interaction with the second server, so that on the second server side the target data in the preset directory is sorted according to the sequence numbers it carries, yielding ordered target data that is fed back to the terminal device. In this way, user operation is simplified, and the terminal device can efficiently and conveniently export target data that meets the user's requirements and is ordered according to the preset ordering rule.
The embodiment of the specification provides a data export method which can be particularly applied to a system comprising a distributed cluster and terminal equipment. Specifically, reference may be made to fig. 1. The distributed cluster may specifically include a plurality of node servers based on distributed computing.
A first server among the plurality of node servers (which may be referred to as a control node, or CN node) is responsible for managing, controlling, and coordinating the node servers in the distributed cluster. The first server need not store data of the distributed database. A second server among the plurality of node servers is responsible for coordinating the data export process involving the distributed database and for providing the data transfer service.
The distributed database may specifically be a relational database that employs a massively parallel processing (MPP) architecture. The data of each table in the distributed database may be distributed across the node servers in the distributed cluster (which may be referred to as physical nodes, data nodes, or DN nodes) according to a certain rule (for example, a hash-based rule).
The terminal device may be configured or installed in advance with a device or program supporting the data export service (e.g., a data export device). Through multiple data interactions with the first server and the second server, the terminal device provided with the data export device can export the data required by the user from the distributed database in order and provide the ordered target data to the user.
In this embodiment, the first server and the second server may specifically include a background server applied to a side of the data processing system and capable of implementing functions such as data transmission and data processing. Specifically, the first server and the second server may be, for example, an electronic device having a data operation function, a storage function, and a network interaction function. Or the first server and the second server may be software programs running in the electronic device and supporting data processing, storage and network interaction. In the present embodiment, the number of servers included in the first server and the second server is not particularly limited. The first server and the second server may be one server, or may be several servers, or a server cluster formed by several servers.
In this embodiment, the terminal device may specifically include a front-end device applied to a user side and capable of implementing functions such as data acquisition and data transmission. Specifically, the terminal device may be, for example, a desktop computer, a tablet computer, a notebook computer, a smart phone, etc. Or the terminal device may be a software application capable of running in the electronic device described above. For example, an application running on a desktop computer, or the like.
When the user needs to export the required ordered target data from the distributed database according to a preset ordering rule, the user can set the ordering indication parameter through the terminal device.
Specifically, by performing a setting operation on the terminal device, the user can set, as the ordering indication parameter, whether the target data is to be exported in order according to a preset ordering rule, together with the preset ordering rule to be used when sorting (for example, sorting by the number of strokes in the data name, from largest to smallest). In addition, through the setting operation the user may set the data identifier of the target data to be exported (for example, the data name of the target data) and the table identifier of the target source table in which the target data is located (for example, the table name of the target source table).
After the user completes the setting operation, the user may perform a confirmation operation on the terminal device. Correspondingly, the terminal equipment can respond to the confirmation operation to generate a data export request carrying at least the data identifier of the target data, the table identifier of the target source table and the ordering indication parameter, and the data export request is sent to the first server.
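A data export request of this kind could, for instance, be serialized as a small JSON document. The sketch below is an illustration only: the field names `data_id`, `table_id`, `ordering`, `ordered`, and `rule`, the example values, and the choice of JSON itself are assumptions, since the specification does not define a wire format.

```python
import json
from typing import Optional


def build_export_request(data_id: str, table_id: str,
                         ordered: bool, rule: Optional[str] = None) -> str:
    """Serialize a data export request carrying the three required items."""
    ordering = {"ordered": ordered}
    if ordered:
        if not rule:
            raise ValueError("an ordering rule is required when ordered export is requested")
        ordering["rule"] = rule  # preset ordering rule chosen by the user
    request = {"data_id": data_id, "table_id": table_id, "ordering": ordering}
    return json.dumps(request, ensure_ascii=False)


# Hypothetical identifiers and rule, for illustration.
payload = build_export_request("account_A1_trades", "trades", True, "trade_time ASC")
decoded = json.loads(payload)
```

Carrying the rule inside the ordering indication parameter lets the first server decide in a single step both whether to sort and how.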
The first server receives the data export request and parses it to obtain the ordering indication parameter it carries. According to the ordering indication parameter, the first server may determine whether the target data is to be exported in order according to a preset ordering rule.
In the case of determining that the target data is to be exported in order according to a preset ordering rule, the first server may extract that preset ordering rule from the ordering indication parameter. Further, the first server may send the terminal device a prompt regarding the first data interaction, to trigger the first data interaction with the terminal device.
The terminal device may then query the target source table in the distributed database by performing a first data interaction with the first server and build a first temporary table in the distributed database. The first temporary table may specifically include the target data queried during the first data interaction and the sequence numbers corresponding to the target data, obtained by sorting the target data based on the preset ordering rule.
Specifically, the terminal device may first obtain the table structure information of the target source table provided by the first server based on the table identifier of the target source table, and add a sequence number column to that table structure information to obtain the table structure information of the first temporary table, which differs from that of the target source table. The terminal device may generate a first creation instruction (for example, a DDL statement) regarding the first temporary table according to the table structure information of the first temporary table, and submit the first creation instruction to the first server.
Accordingly, the first server may respond to the first creation instruction; and creating an empty table of the first temporary table in the distributed database according to the table structure information of the first temporary table carried by the first creation instruction. Meanwhile, the first server can also query the target source table in the distributed database according to the data identification of the target data carried in the data export request and the table identification of the target source table so as to obtain a plurality of target data; the first server sorts the plurality of target data according to a preset sorting rule so as to determine the serial number of each target data. Further, the first server may write the queried target data into the empty table of the first temporary table, and write the sequence number of each target data into the newly added sequence number field of the row where the target data is located, so as to establish and obtain the corresponding first temporary table. The first temporary table may include a plurality of rows of data, where each row of data includes target data and a sequence number of the target data.
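The server-side part of the first data interaction can be imitated with Python's built-in `sqlite3` module standing in for the distributed database. Everything concrete here is assumed for illustration: the table names `trades` and `tmp_export`, the column names, the sample rows, and the rule "sort by trade time ascending".

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # single-node stand-in for the distributed database

# Hypothetical target source table, filled out of order.
conn.execute("CREATE TABLE trades (account_id TEXT, trade_time TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [("A1", "2021-03-01", 30.0), ("A1", "2021-01-01", 10.0),
     ("B2", "2021-01-15", 99.0), ("A1", "2021-02-01", 20.0)],
)

# Empty temporary table: source columns plus the added sequence number column.
conn.execute(
    "CREATE TABLE tmp_export (account_id TEXT, trade_time TEXT, amount REAL, seq_no INTEGER)"
)

# Query the target data and sort it by the assumed preset ordering rule.
rows = conn.execute(
    "SELECT account_id, trade_time, amount FROM trades "
    "WHERE account_id = ? ORDER BY trade_time ASC",
    ("A1",),
).fetchall()

# Write each row together with its sequence number (1-based position after sorting).
conn.executemany(
    "INSERT INTO tmp_export VALUES (?, ?, ?, ?)",
    [row + (seq_no,) for seq_no, row in enumerate(rows, start=1)],
)

temp_rows = conn.execute(
    "SELECT seq_no, trade_time FROM tmp_export ORDER BY seq_no"
).fetchall()
```

Once the sequence numbers are stored with the rows, the order survives the later parallel export even though the node servers write their shares independently.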
The terminal device may then establish the first external table in the distributed database by performing a second data interaction with the first server. The first external table may be understood as a communication configuration table that guides the parallel export of the data in the first temporary table from the distributed database. The first external table may specifically include a plurality of parallel export parameters, such as a separator, a line break character, identification information of the node servers participating in the parallel export, and port configuration information associated with the parallel export.
Specifically, the terminal device may generate a second creation instruction (for example, a DML statement) regarding the first external table according to the table structure information of the first temporary table, and submit the second creation instruction to the first server.
Accordingly, the first server may establish, in response to the second creation instruction, a first external table associated with the first temporary table in the distributed database. According to the first external table, the first server may coordinate the plurality of node servers in the distributed cluster to write the target data carrying the sequence numbers into the preset directory of the second server in parallel, based on the data recorded in the first temporary table. Upon determining that all the target data contained in the first temporary table has been written into the preset directory, the first server may generate a prompt regarding the third data interaction and send it to the terminal device, to trigger the terminal device to perform the third data interaction with the second server.
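The parallel write into the preset directory can be imitated on one machine with a thread pool, each thread playing the role of one node server. This is a sketch under stated assumptions: rows are assigned to nodes by sequence number modulo the node count, each node writes its own `.part` file, and the separator is `|`. None of these details, nor the names `write_shard`, `dn1`, and `dn2`, come from the specification.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def write_shard(directory: str, node_id: str,
                rows: List[Tuple[int, str]], separator: str = "|") -> str:
    """Write one node's rows, each prefixed with its sequence number, to its own file."""
    path = os.path.join(directory, f"{node_id}.part")
    with open(path, "w", encoding="utf-8") as fh:
        for seq_no, payload in rows:
            fh.write(f"{seq_no}{separator}{payload}\n")
    return path


# Rows of the temporary table as (sequence number, payload) pairs.
temp_table = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
node_ids = ["dn1", "dn2"]

# Assumed distribution rule: sequence number modulo the number of nodes.
shards = {node: [] for node in node_ids}
for seq_no, payload in temp_table:
    shards[node_ids[seq_no % len(node_ids)]].append((seq_no, payload))

with tempfile.TemporaryDirectory() as preset_dir:
    with ThreadPoolExecutor(max_workers=len(node_ids)) as pool:
        futures = [pool.submit(write_shard, preset_dir, n, shards[n]) for n in node_ids]
        written = [f.result() for f in futures]  # wait until every node has finished
    part_files = sorted(os.listdir(preset_dir))
    total_lines = 0
    for name in part_files:
        with open(os.path.join(preset_dir, name), encoding="utf-8") as fh:
            total_lines += sum(1 for _ in fh)
```

Waiting on every future before continuing mirrors the condition in the text that the third data interaction is triggered only after all target data has been written.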
Furthermore, by performing the third data interaction with the second server, the terminal device may have the target data in the preset directory sorted according to the serial number carried by the target data, so as to obtain sorted target data (i.e. ordered target data).
Specifically, the terminal device may generate a sorting request according to a preset protocol rule (for example, the SSH protocol or an RPC communication protocol) and send the sorting request to the second server. Correspondingly, the second server may respond to the sorting request by calling a preset sorting script file (for example, a script file that wraps the sort command) to sort the target data in the preset directory according to the serial number carried by the target data, obtain the sorted target data, and feed the sorted target data back to the terminal device.
The terminal device receives the sorted target data and displays it to the user.
Therefore, the user does not need to additionally sort the exported target data, and related data analysis or data statistics can be directly carried out by using the sorted target data.
For example, in a transaction risk prediction scenario, a user may predict a transaction risk for an account object according to a transaction record in which the account object is ordered according to a time sequence of transactions. For another example, in the regional sales statistics scenario, the user may perform statistical analysis on sales conditions of different regions of a company according to sales data of the company ordered according to the regional number sequence.
Through the above system, a user of a terminal device provided with the data export device can efficiently and conveniently obtain ordered target data that meets the user's requirements, is exported from the distributed database, and is sorted according to the preset ordering rule. This simplifies the user's operations, improves the user experience, and also improves data export efficiency.
Referring to fig. 2, an embodiment of the present disclosure provides a data export method. In particular implementations, the method may include the following.
S201: transmitting a data export request to a first server; the data export request at least carries a data identifier of target data to be exported, a table identifier of a target source table where the target data is located, and an ordering indication parameter.
S202: when the ordering indication parameter indicates that the target data is to be exported in order according to a preset ordering rule, querying a target source table in the distributed database and establishing a first temporary table in the distributed database by performing a first data interaction with the first server; the first temporary table contains the queried target data and the serial numbers corresponding to the target data, the serial numbers being obtained by sorting the target data based on the preset ordering rule.
S203: establishing a first external table in the distributed database by performing a second data interaction with the first server; according to the first external table, the first server coordinates a plurality of node servers in the distributed cluster to write the target data carrying the serial numbers into a preset directory of the second server in parallel, based on the first temporary table.
S204: sorting the target data in the preset directory according to the serial numbers by performing a third data interaction with the second server, to obtain sorted target data.
S205: and receiving the ordered target data fed back by the second server.
The method described above is applicable to a terminal device in which a device or a program supporting a data export service (e.g., a data export device) is installed. It can also be applied to a server in which such a device or program is installed. The present specification mainly takes a terminal device provided with a data export device as an example to describe the data export method. For the case of a server provided with the data export device, reference may be made to the embodiments applied to the terminal device, which are not repeated here.
Through this embodiment, a user only needs to perform simple operations on the terminal device provided with the data export device to efficiently and conveniently obtain target data that meets the user's requirements, is exported from the distributed database, and is sorted according to the preset ordering rule, so that the user's operations are simplified and the user experience is improved.
In some embodiments, the user may perform a corresponding setting operation on the terminal device according to specific requirements, so as to set the data identifier of the target data to be exported, the table identifier of the target source table where the target data is located, the ordering indication parameter, and similar data. Correspondingly, the terminal device may collect this data and generate a data export request from it. The data export request carries at least the data identifier of the target data, the table identifier of the target source table where the target data is located, and the ordering indication parameter.
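As a non-limiting illustration, the three items carried by the data export request can be sketched as a small Python structure. The class name, field names, example values, and the JSON encoding are assumptions made for this sketch only; the embodiments do not prescribe a wire format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DataExportRequest:
    """Hypothetical container for the fields a data export request carries."""
    data_id: str                      # data identifier of the target data
    source_table_id: str              # table identifier of the target source table
    sort_flag: Optional[int] = None   # ordering indication parameter (None means empty)
    sort_rule: Optional[str] = None   # optional user-defined preset ordering rule

    def to_json(self) -> str:
        # JSON is an assumed serialization, chosen only for readability.
        return json.dumps(asdict(self))


request = DataExportRequest(
    data_id="account_42_2021Q1",
    source_table_id="txn_records",
    sort_flag=1,
    sort_rule="txn_time ASC",
)
payload = request.to_json()
```

The serialized `payload` is what the terminal device would submit to the first server in this sketch.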
In some embodiments, the target data may be specific data that is selected by the user to be exported from the distributed database. Specifically, the target data may be a transaction record of a certain account object in a certain period of time, or may be sales data of a certain enterprise object in a certain area, or the like. The target source table may be a data table for storing target data in a distributed database.
In some embodiments, the data identifier of the target data may be any identifier capable of indicating the target data, for example, a data name of the target data, a storage address of the target data, a number of the target data, and the like. The table identifier of the target source table may specifically be any identifier capable of indicating the target source table, for example, the table name of the target source table, the storage address of the target source table, the number of the target source table, and the like.
In some embodiments, the above-mentioned ordering indication parameter may be used to indicate whether the target data needs to be exported in order according to a preset ordering rule.
In some embodiments, where the target data does not need to be exported in order according to a preset ordering rule, the data value of the ordering indication parameter may be 0 or empty. Correspondingly, when the first server detects that the data value of the ordering indication parameter carried by the data export request is 0 or empty, it can determine that the target data does not need to be exported in order according to a preset ordering rule; at this time, the first server may generate and feed back a prompt about a fourth data interaction, so as to trigger, with the terminal device, an export process of the target data that does not involve ordering.
In some embodiments, where the target data needs to be exported in order according to a preset ordering rule, the data value of the ordering indication parameter may be 1 or another non-0 value. Correspondingly, when the first server detects that the data value of the ordering indication parameter carried by the data export request is 1 or another non-0 value, it can determine that the target data needs to be exported in order according to a preset ordering rule; at this time, the first server may generate and feed back a prompt about the first data interaction, so as to trigger, with the terminal device, the export process of the target data that involves ordering.
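The interpretation of the ordering indication parameter described above can be sketched as follows. This is a minimal illustration in Python; the function names and the string labels for the two interactions are hypothetical.

```python
from typing import Optional


def needs_ordered_export(sort_flag: Optional[int]) -> bool:
    """Return True when the ordering indication parameter is 1 or another non-0 value.

    A value of 0 or an empty value (modelled here as None) means the target
    data does not need to be exported in order.
    """
    return sort_flag is not None and sort_flag != 0


def prompt_for(sort_flag: Optional[int]) -> str:
    # The first server prompts the first data interaction for ordered export,
    # and the fourth data interaction for export that does not involve ordering.
    if needs_ordered_export(sort_flag):
        return "first data interaction"
    return "fourth data interaction"
```

For instance, `prompt_for(1)` selects the ordered export path, while `prompt_for(0)` and `prompt_for(None)` both select the path that does not involve ordering.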
In some embodiments, the ordering indication parameter may specifically further include a preset ordering rule set by the user.
Specifically, where the target data needs to be exported in order according to a preset ordering rule, the ordering indication parameter may further carry the preset ordering rule, which indicates how the sorting is to be performed.
The preset ordering rule may be an ordering rule customized by the user according to specific situations and processing requirements. For example, the preset ordering rule may be a rule that sorts the target data by the number of strokes in its name from large to small, a rule that sorts the target data by its corresponding region number from small to large, a rule that sorts the target data by its corresponding region number from large to small, or the like. Of course, the preset ordering rules listed above are merely illustrative. In specific implementations, other types of ordering rules can be included according to specific situations and processing requirements. The present specification is not limited in this respect.
Through this embodiment, the user can flexibly define a suitable preset ordering rule according to specific requirements, so that the exported target data is sorted according to the user-defined ordering rule and the user's diverse export requirements can be met.
In some embodiments, when performing the setting operation, if the user determines according to specific requirements that the target data does not need to be exported in order according to a preset ordering rule, the user may leave the ordering indication icon unselected in the setting interface displayed by the terminal device, thereby completing the setting operation for the ordering indication parameter.
Conversely, if the user determines according to specific requirements that the target data needs to be exported in order according to a preset ordering rule, the user may first select the ordering indication icon in the setting interface displayed by the terminal device. The user may then define a preset ordering rule meeting the requirements in the input interface that pops up after the ordering indication icon is selected, thereby completing the setting operation for the ordering indication parameter. Of course, the user may also choose not to set a preset ordering rule, in which case the terminal device may randomly select one rule from a set of predefined ordering rules as the preset ordering rule.
In some embodiments, in order to ensure that the subsequent data export process can be performed normally, the terminal device may also perform a pre-verification process before sending the data export request to the first server. The pre-verification process may specifically include one or more of the following checks: checking whether the format of the data set by the user, such as the data identifier of the target data, the table identifier of the target source table, and the ordering indication parameter, meets the requirements; checking whether a data table matching the table identifier of the target source table exists in the distributed database; checking whether the network connection between the terminal device and the first server is normal; and checking whether the device or program installed on the terminal device that supports the data export service is available. The terminal device sends the data export request to the first server only when it confirms that every pre-verification check has passed. In contrast, when the terminal device confirms that at least one pre-verification check has failed, it may generate fault prompt information for the user, so as to prompt the user to check and eliminate the corresponding fault.
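One possible shape of the pre-verification process is sketched below in Python. The identifier format rule and the three injected check functions are assumptions for illustration; a real deployment would supply its own checks against the distributed database, the network, and the installed export program.

```python
import re
from typing import Callable, Dict, List

# Hypothetical format rule: table identifiers are plain SQL-style names.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def pre_verify(
    request: Dict[str, object],
    table_exists: Callable[[str], bool],
    server_reachable: Callable[[], bool],
    exporter_available: Callable[[], bool],
) -> List[str]:
    """Run the pre-verification checks and return the list of failed checks."""
    failures: List[str] = []
    table_id = str(request.get("source_table_id") or "")
    if not request.get("data_id") or not IDENTIFIER.match(table_id):
        failures.append("request format")
    elif not table_exists(table_id):
        # Only look the table up once its identifier is well formed.
        failures.append("target source table not found")
    if not server_reachable():
        failures.append("network connection to first server")
    if not exporter_available():
        failures.append("data export program unavailable")
    return failures


known_tables = {"txn_records"}
passed = pre_verify(
    {"data_id": "acct-42", "source_table_id": "txn_records"},
    lambda name: name in known_tables, lambda: True, lambda: True,
)
failed = pre_verify(
    {"data_id": "acct-42", "source_table_id": "missing_table"},
    lambda name: name in known_tables, lambda: False, lambda: True,
)
```

An empty result means the request may be sent; a non-empty result lists the faults to report to the user.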
In some embodiments, after receiving the data export request, the first server may first extract the ordering indication parameter from the data export request and determine, according to the ordering indication parameter, whether the target data is to be exported in order according to a preset ordering rule. If the first server determines that the target data is to be exported in order according to the preset ordering rule, it may generate and send a prompt about the first data interaction to the terminal device, so as to trigger, with the terminal device, the export process of the target data that involves ordering. Conversely, if the first server determines that the target data is not to be exported in order according to the preset ordering rule, it may generate and send a prompt about the fourth data interaction to the terminal device, so as to trigger, with the terminal device, an export process of the target data that does not involve ordering.
In some embodiments, in the case of determining that the target data is to be exported in order according to the preset ordering rule, the foregoing steps of performing the first data interaction with the first server, querying the target source table in the distributed database, and establishing the first temporary table in the distributed database may include the following when implemented:
S1: obtaining table structure information of a target source table;
S2: adding a sequence number field column to the table structure information of the target source table to obtain the table structure information of a first temporary table;
S3: generating a first creation instruction about a first temporary table according to the table structure information of the first temporary table; the first creating instruction carries table structure information of a first temporary table;
S4: submitting the first creation instruction to the first server; the first server responds to the first creation instruction to create an empty table of a first temporary table in the distributed database; the first server queries a target source table according to the data export request; and the first server sorts the target data inquired from the target source table according to a preset sorting rule, and writes the target data and the serial number corresponding to the target data into an empty table of the first temporary table to obtain the first temporary table.
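Steps S1 to S3 above can be sketched as follows. This is an illustrative Python sketch only: the column list format, the column name `seq_no`, its `BIGINT` type, and the table name `tmp_export_1` are assumptions, and the exact DDL syntax depends on the distributed database in use.

```python
from typing import List, Tuple

Column = Tuple[str, str]  # (column name, SQL type)


def temp_table_structure(source_columns: List[Column],
                         seq_column: str = "seq_no") -> List[Column]:
    """S2: add one sequence number column to the source table structure."""
    if any(name == seq_column for name, _ in source_columns):
        raise ValueError(f"source table already has a column named {seq_column!r}")
    return list(source_columns) + [(seq_column, "BIGINT")]


def build_create_statement(table_name: str, columns: List[Column]) -> str:
    """S3: render the first creation instruction as a DDL statement."""
    rendered = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table_name} ({rendered})"


# S1: table structure information as it might be returned by the first server.
source_structure = [("account_id", "VARCHAR(32)"), ("amount", "DECIMAL(18,2)")]
statement = build_create_statement("tmp_export_1",
                                   temp_table_structure(source_structure))
```

The resulting `statement` is the text that would be submitted to the first server in step S4.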
Through this embodiment, the terminal device provided with the data export device can, by means of the first data interaction with the first server, query the target source table in the distributed database to obtain the corresponding target data, have the target data sorted according to the preset ordering rule to determine the serial number corresponding to each target data, and have the queried target data stored together with its serial number to obtain the first temporary table.
In some embodiments, when the first data interaction is performed, the first server may first find the target source table in the distributed database according to the table identifier of the target source table, obtain the table structure information of the target source table, and send it to the terminal device. Based on the table structure information of the target source table, the terminal device may add a sequence number field column for holding the serial number of the target data, so as to obtain the table structure information of the first temporary table, which differs from that of the target source table. Further, the terminal device may generate, according to the table structure information of the first temporary table, a first creation instruction for instructing creation of the first temporary table, for example, a corresponding DDL statement. The first creation instruction carries the table structure information of the first temporary table. The terminal device may then submit the first creation instruction to the first server.
The first server may respond to the first creation instruction, and create an empty table of the first temporary table of the corresponding table structure in the distributed database according to the carried table structure information of the first temporary table. In addition, the first server may query the target source table according to the data identifier of the target data, and query the corresponding target data. Meanwhile, the first server can also determine the serial number corresponding to the target data by performing SQL-based sequencing operation on the queried target data according to a preset sequencing rule. Further, the first server may write the target data and the corresponding sequence number into an empty table of the first temporary table. Thus, a first temporary table can be established which stores target data and a serial number of the target data at the same time.
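The server-side handling described above (create the empty temporary table, query the target data, sort it, and write each row together with its serial number) can be illustrated with a small runnable sketch. Python's built-in `sqlite3` module is used here purely as a local stand-in for the distributed database; the table names, column names, and sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the distributed database

# Target source table with a few sample transaction records.
conn.execute("CREATE TABLE txn (account_id TEXT, txn_time TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO txn VALUES (?, ?, ?)",
    [("A1", "2021-03-02", 50.0), ("A2", "2021-03-01", 75.0), ("A1", "2021-03-01", 20.0)],
)

# Empty first temporary table: source structure plus a sequence number column.
conn.execute(
    "CREATE TABLE tmp_export (account_id TEXT, txn_time TEXT, amount REAL, seq_no INTEGER)"
)

# Query the target data (account A1) sorted by the preset ordering rule
# (here: transaction time ascending), then number the rows starting at 1.
rows = conn.execute(
    "SELECT account_id, txn_time, amount FROM txn WHERE account_id = ? ORDER BY txn_time",
    ("A1",),
).fetchall()
conn.executemany(
    "INSERT INTO tmp_export VALUES (?, ?, ?, ?)",
    [row + (seq,) for seq, row in enumerate(rows, start=1)],
)

result = conn.execute(
    "SELECT txn_time, seq_no FROM tmp_export ORDER BY seq_no"
).fetchall()
```

In a database that supports window functions, the numbering could instead be computed inside a single `INSERT ... SELECT` statement; the row-by-row numbering above is chosen only to keep the sketch portable.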
In some embodiments, after querying a target source table in a distributed database by performing a first data interaction with a first server and establishing a first temporary table in the distributed database, the method may further include, when implemented, the following: the first server releases an operation lock arranged on a target source table; the operation lock is set by the first server when the target source table is queried according to the data export request.
Through the embodiment, when the first server starts to query the target source table, the operation lock can be set on the target source table first, so that interference and influence caused by other instructions or operations on the operation of the target source table in the process of querying the target source table are avoided; after the query operation on the target source table is completed and the first temporary table is established, the first server can timely release the operation lock set before so that other instructions or operations can normally operate and process the target source table.
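The set-then-release behaviour of the operation lock can be sketched with a context manager, which guarantees that the lock is released even if the query fails. This is a single-process Python illustration using `threading.Lock`; an actual distributed database would use its own table-level locking mechanism, and all names below are hypothetical.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Iterator

_table_locks: Dict[str, threading.Lock] = {}
_registry_guard = threading.Lock()  # protects the lock registry itself


@contextmanager
def operation_lock(table_id: str) -> Iterator[None]:
    """Hold the operation lock on a table for the duration of the block."""
    with _registry_guard:
        lock = _table_locks.setdefault(table_id, threading.Lock())
    lock.acquire()       # set when the query on the target source table starts
    try:
        yield
    finally:
        lock.release()   # released once the first temporary table is established


def is_locked(table_id: str) -> bool:
    lock = _table_locks.get(table_id)
    return lock is not None and lock.locked()


with operation_lock("txn"):
    locked_during_query = is_locked("txn")  # query and temp-table build go here
locked_after_query = is_locked("txn")
```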
In some embodiments, establishing the first external table in the distributed database by performing the second data interaction with the first server, so that the first server coordinates, according to the first external table, a plurality of node servers in the distributed cluster to write the target data carrying the serial numbers into the preset directory of the second server in parallel based on the first temporary table, may include the following when implemented:
S1: generating a second creation instruction about the first external table according to the table structure information of the first temporary table;
S2: submitting the second creation instruction to the first server; wherein the first server establishes a first external table associated with the first temporary table in the distributed database in response to the second creation instruction; the first external table stores parallel export parameters.
Through the above embodiment, the terminal device provided with the data export device can, by means of the second data interaction with the first server, create in the distributed database a first external table for guiding the subsequent parallel export of the relevant data in the first temporary table.
In some embodiments, the first external table is associated with the first temporary table. The first external table may be understood in particular as a communication configuration table for guiding the export of the data in the first temporary table from the distributed database in a parallel manner.
In some embodiments, the first external table may specifically store parallel export parameters related to the parallel export. Specifically, the parallel export parameters may include at least one of: a separator, a line break character, identification information of the node servers participating in the parallel export, port configuration information associated with the parallel export, etc.
In this way, the relevant data in the first temporary table (for example, the target data carrying the serial numbers) can subsequently be exported efficiently according to the parallel export parameters stored in the first external table, and written accurately and uniformly into the preset directory of the second server.
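For illustration, the parallel export parameters listed above can be modelled as a small configuration object. The Python class below is a sketch; the field names, the `|` separator, the host names, and the port number are assumed values, not values taken from the embodiments.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ParallelExportParams:
    """Hypothetical view of the parameters stored in the external table."""
    separator: str                 # field separator written between columns
    line_break: str                # line break character ending each row
    node_servers: Tuple[str, ...]  # identifiers of the participating node servers
    port: int                      # port configuration for the parallel export

    def format_row(self, fields: Sequence[object]) -> str:
        # Render one exported row using the configured separator and line break.
        return self.separator.join(str(field) for field in fields) + self.line_break


params = ParallelExportParams(
    separator="|",
    line_break="\n",
    node_servers=("node-1", "node-2", "node-3"),
    port=8081,
)
line = params.format_row([1, "A1", "2021-03-01"])
```

Every participating node server formatting its rows from the same parameters is what keeps the files in the preset directory uniform.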
In some embodiments, the second server may specifically be a node server in the distributed cluster that is dedicated to providing the data transfer service. In some cases, the first server or the terminal device may also take the place of the second server to provide the corresponding data transfer service.
In some embodiments, after establishing the first external table, the first server may, according to the first external table, coordinate a plurality of node servers in the distributed cluster to export the target data carrying the serial numbers from the distributed database in a parallel manner based on the data in the first temporary table, and write the target data uniformly into the preset directory of the second server, which is responsible for providing the data transfer service.
The preset directory may be specifically understood as a file storage area set in the second server, where the file storage area is used for storing target data carrying a serial number.
It should be noted that, since the plurality of target data carrying the serial numbers are written by the plurality of node servers respectively, the target data stored in the preset directory is arranged out of order. However, unlike conventionally exported target data, the target data stored in the preset directory also carries the corresponding serial numbers.
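Why the data in the preset directory ends up out of order, yet remains recoverable, can be seen in a small simulation. In the Python sketch below, threads stand in for node servers and a temporary directory stands in for the preset directory; the round-robin data distribution, the file names, and the `serial|value` line layout are assumptions for the example.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List, Tuple

Row = Tuple[int, str]  # (serial number, target data)

rows: List[Row] = [(seq, f"record-{seq}") for seq in range(1, 7)]
node_count = 3
# Assume the rows are spread round-robin over the node servers.
shards = [rows[i::node_count] for i in range(node_count)]


def write_shard(directory: Path, node_index: int, shard: List[Row]) -> Path:
    """One node server writes its own rows, serial number first, to its own file."""
    path = directory / f"part-{node_index}.txt"
    path.write_text("".join(f"{seq}|{value}\n" for seq, value in shard))
    return path


with tempfile.TemporaryDirectory() as tmp:
    preset_dir = Path(tmp)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        list(pool.map(lambda args: write_shard(preset_dir, *args), enumerate(shards)))
    merged = [
        line
        for part in sorted(preset_dir.glob("part-*.txt"))
        for line in part.read_text().splitlines()
    ]

serials = [int(line.split("|")[0]) for line in merged]
```

Reading the part files back yields the serial numbers in the order 1, 4, 2, 5, 3, 6: every row is present, and the carried serial numbers are what allow the original order to be restored later.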
In some embodiments, the first server may generate and send a prompt about third data interaction to the terminal device under the condition that it is determined that all the target data carrying the serial number is exported from the distributed database and written into the preset directory of the second server, so as to trigger the terminal device to interact with the second server to obtain ordered target data obtained after ordering according to the preset ordering rule.
In some embodiments, the sorting the target data in the preset directory according to the sequence number by performing the third data interaction with the second server to obtain sorted target data may include the following when implemented:
S1: generating a sorting request according to a preset protocol rule;
S2: sending the sorting request to the second server; the second server responds to the sorting request by calling a preset sorting script file to sort the target data in the preset directory according to the serial numbers, so as to obtain sorted target data.
Through the above embodiment, the terminal device provided with the data export device can interact with the second server so that the target data in the preset directory is sorted according to the serial numbers it carries, thereby obtaining ordered target data.
In some embodiments, the preset protocol rule may be an SSH protocol, an RPC communication protocol, or the like. Of course, the above listed preset protocol rules are only one illustrative example. In specific implementation, other types of protocols can be introduced as the preset protocol rules according to specific situations and processing requirements.
In some embodiments, the preset sorting script file may specifically include a script file that wraps the sort command. The sort command is a command line program in Unix and Unix-like operating systems that sorts data according to one or more designated sort keys (e.g., the serial number); it is also simple to invoke and stable in execution.
Before specific implementation, a script file that wraps the sort command can be constructed in advance as the preset sorting script file and deployed to the second server in advance.
Through the above embodiment, the terminal device may conveniently sort the target data in the preset directory of the second server through the preset sorting script file, so as to obtain the ordered target data after sorting according to the preset sorting rule.
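The effect of sorting on the carried serial number can be sketched in a few lines. The Python function below only emulates a numeric sort on one key field; it is not the preset sorting script file itself, and the `|` separator and the position of the serial number as the first field are assumptions for the example.

```python
from typing import Iterable, List


def sort_by_serial(lines: Iterable[str], separator: str = "|") -> List[str]:
    """Sort exported lines numerically by the serial number in the first field."""
    return sorted(lines, key=lambda line: int(line.split(separator, 1)[0]))


unordered = ["10|record-j", "2|record-b", "1|record-a"]
ordered = sort_by_serial(unordered)
```

The key must be compared as a number: a plain text comparison would place `10` before `2`. With the Unix sort command, the same concern is addressed by requesting a numeric key, for example `sort -t '|' -k1,1n`.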
In some embodiments, it is considered that the target data is exported in parallel according to the first external table, so the target data stored in the preset directory may carry character identifiers (e.g., redundant separators, redundant fields, etc.) that are not present in the target source table. In addition, the sorted target data also carries a serial number, whereas the target data in the target source table does not. Therefore, the sorted target data obtained in the above manner may differ in format from the target data in the target source table and may not meet the preset format requirement. In order to meet the preset format requirement, the terminal device may further interact with the second server to format the sorted target data.
In some embodiments, after the target data in the preset directory is sorted according to the serial numbers by performing the third data interaction with the second server to obtain sorted target data, the method may further include the following when implemented:
S1: generating a formatting request according to a preset protocol rule;
S2: sending the formatting request to the second server; and the second server responds to the formatting request and calls a preset formatting script file to perform formatting processing on the ordered target data so as to obtain the ordered target data meeting the preset format requirement.
Through the above embodiment, the terminal device provided with the data export device and the second server can, through further data interaction, format the ordered target data in the preset directory, so as to obtain ordered target data that meets both the user's requirements and the preset format requirement.
In some embodiments, the preset formatting script file may specifically include a script file that wraps the sed command.
Correspondingly, the second server may receive and respond to the formatting request sent by the terminal device by calling the script file that wraps the sed command to perform formatting on the sorted target data, such as removing redundant fields and redundant separators and deleting the carried serial numbers, so as to obtain sorted target data that has the same format as the target data in the target source table and meets the preset format requirement.
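The kind of substitution performed during formatting can be illustrated as follows. The Python sketch uses regular-expression substitutions comparable to what a sed-based script might apply; the line layout (serial number as the first field, a redundant trailing separator) is an assumption for the example.

```python
import re


def strip_export_artifacts(line: str, separator: str = "|") -> str:
    """Remove the leading serial number and any redundant trailing separators."""
    sep = re.escape(separator)
    line = re.sub(rf"^\d+{sep}", "", line)  # delete the carried serial number
    # Caution: this also removes legitimately empty trailing fields.
    return re.sub(rf"{sep}+$", "", line)


formatted = strip_export_artifacts("3|A1|2021-03-01|20.0|")
```

After this step the line has the same fields as the corresponding row of the target source table, which is the stated goal of the formatting.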
In some embodiments, after the second server obtains the sorted target data, the sorted target data may be sent to the terminal device. Correspondingly, the terminal equipment can receive and display the sorted target data to the user. Thus, the user does not need to sort the exported target data in addition, and the sorted target data can be directly obtained.
In some embodiments, after the target data in the preset directory is sorted according to the serial numbers by performing the third data interaction with the second server to obtain sorted target data, the method may further include the following when implemented: generating a verification request; sending the verification request to the second server; wherein the second server responds to the verification request by counting the number of sorted target data and the number of rows of the first temporary table, and checks whether the sorted target data meets the preset quality requirement according to the number of sorted target data and the number of rows of the first temporary table.
Through the embodiment, after the terminal device provided with the data deriving device receives the sorted target data, the terminal device can also quickly verify the data quality of the sorted target data through data interaction with the second server, so that the sorted target data which is high in accuracy and meets the preset quality requirement is finally provided for the user.
In some embodiments, when the method is implemented, if the second server determines that the number of the sorted target data is equal to the number of the rows of the first temporary table, it may be determined that the sorted target data meets the preset quality requirement, and further may generate and send prompt information meeting the preset quality requirement to the terminal device. After receiving the prompt information meeting the preset quality requirement, the terminal equipment displays the ordered target data to the user.
In contrast, if the second server determines that the number of sorted target data is not equal to the number of rows of the first temporary table, it can determine that the sorted target data very probably contains errors and does not meet the preset quality requirement, and may further generate and send prompt information indicating that the preset quality requirement is not met to the terminal device. After receiving this prompt information, the terminal device does not display the sorted target data to the user, but instead triggers the data export to be performed again, so as to ensure that the sorted target data finally exported and displayed to the user is highly accurate and meets the preset quality requirement.
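The count-based quality check can be sketched as follows. The Python function names and the returned action labels are hypothetical; the comparison itself (number of sorted target data versus number of rows in the first temporary table) follows the description above.

```python
from typing import Sequence


def meets_quality_requirement(sorted_count: int, temp_table_rows: int) -> bool:
    """The export passes only when no row was lost or duplicated on the way."""
    return sorted_count == temp_table_rows


def next_action(sorted_lines: Sequence[str], temp_table_rows: int) -> str:
    # Equal counts: the sorted data may be shown to the user.
    # Unequal counts: the data export is triggered again.
    if meets_quality_requirement(len(sorted_lines), temp_table_rows):
        return "display"
    return "re-export"


complete = next_action(["1|a", "2|b", "3|c"], temp_table_rows=3)
incomplete = next_action(["1|a", "3|c"], temp_table_rows=3)
```

A count comparison is inexpensive but only detects missing or extra rows; it does not detect corrupted field values.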
In some embodiments, when it is determined that the target data does not need to be exported in order according to a preset ordering rule, the terminal device may be triggered to perform, with the first server, a data export process that does not involve ordering.
In some embodiments, when the ordering indication parameter does not indicate that the target data is to be exported in order according to a preset ordering rule, the method may further include the following when implemented:
S1: querying a target source table in a distributed database by performing fourth data interaction with the first server, and establishing a second temporary table in the distributed database; the second temporary table comprises inquired target data;
S2: establishing a second external table in the distributed database by performing a fifth data interaction with the first server; according to the second external table, the first server coordinates a plurality of node servers in the distributed cluster to write the target data into a preset directory of the second server in parallel, based on the second temporary table;
S3: and receiving target data fed back by the second server.
Through the embodiment, the data export method provided by the specification can also be applied to export of target data which does not relate to ordering, so that export requirements of a plurality of different types of users can be met simultaneously.
In some embodiments, querying the target source table in the distributed database by performing the fourth data interaction with the first server, and establishing in the distributed database the second temporary table containing the queried target data, may include: obtaining table structure information of the target source table; obtaining the table structure information of the second temporary table based on the table structure information of the target source table; generating a third creation instruction for the second temporary table according to the table structure information of the second temporary table, wherein the third creation instruction carries the table structure information of the second temporary table; and submitting the third creation instruction to the first server; wherein the first server creates an empty table of the second temporary table in the distributed database in response to the third creation instruction; the first server queries the target source table according to the data export request; and the first server writes the queried target data into the empty table of the second temporary table to obtain the second temporary table.
In some embodiments, establishing the second external table in the distributed database by performing the fifth data interaction with the first server includes: generating a fourth creation instruction for the second external table according to the table structure information of the second temporary table; and submitting the fourth creation instruction to the first server; wherein the first server establishes, in response to the fourth creation instruction, a second external table associated with the second temporary table in the distributed database; the second external table stores the parallel export parameters.
In some embodiments, the terminal device may receive the target data stored in the preset directory of the second server by performing data interaction with the second server.
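The creation instructions described above for the unordered case can be sketched as follows. This is a minimal illustration only: the helper names, the column list, and the `ExportParams` fields are assumptions chosen for the example, and the statement text is generic SQL rather than the syntax of any particular distributed database. The external table itself is represented only by the parameters it is said to store, because external table syntax differs between database products.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportParams:
    """Parallel export parameters the second external table is said to store (field names are hypothetical)."""
    delimiter: str
    line_break: str
    service_node: str  # hypothetical host name of the export service node
    port: int


def temp_table_ddl(temp_name, source_columns):
    """Build a creation instruction for an empty temporary table mirroring the source table structure."""
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in source_columns)
    return f"CREATE TABLE {temp_name} ({cols})"


def fill_temp_table_dml(temp_name, source_table, source_columns):
    """Build the statement that writes the queried target data into the empty temporary table."""
    names = ", ".join(name for name, _ in source_columns)
    return f"INSERT INTO {temp_name} ({names}) SELECT {names} FROM {source_table}"


# Hypothetical source table structure: (column name, column type) pairs.
t1_columns = [("id", "INTEGER"), ("amount", "REAL")]
ddl = temp_table_ddl("tmp2", t1_columns)
dml = fill_temp_table_dml("tmp2", "t1", t1_columns)
params = ExportParams(delimiter="|", line_break="\n", service_node="etl-host", port=5000)
```

In this sketch the temporary table simply copies the source structure; identifiers are interpolated directly into the statement text, so they are assumed to come from trusted configuration.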
As can be seen from the foregoing, in the data export method provided in the embodiments of the present specification, when a user needs to export ordered target data from a distributed database according to a preset ordering rule, the terminal device may generate a data export request carrying the data identifier of the target data to be exported, the table identifier of the target source table where the target data is located, and an ordering indication parameter, and send the request to a first server responsible for managing the distributed database. The terminal device may then, by performing a first data interaction with the first server, query the target source table in the distributed database and establish in the distributed database a first temporary table containing the queried target data and the sequence numbers obtained by sorting the target data based on the preset ordering rule. Next, the terminal device may establish a first external table in the distributed database by performing a second data interaction with the first server. According to the first external table, the first server can then coordinate a plurality of node servers in the distributed cluster to write the target data carrying the sequence numbers, based on the first temporary table, in parallel into a preset directory of a second server responsible for providing a data transfer service. Finally, by performing a third data interaction with the second server, the terminal device has the target data in the preset directory sorted according to the sequence numbers it carries, and the resulting ordered target data is fed back to the terminal device. In this way, the user can obtain, through the terminal device, ordered target data exported from the distributed database that meets the user's requirements more efficiently and conveniently, which simplifies the user's operations and improves the user experience.
Referring to fig. 3, the embodiment of the present disclosure further provides another data export method, which is applied to the first server, and when the method is implemented, the method may include the following steps:
S301: receiving a data export request; the data export request at least carries a data identifier of target data to be exported, a table identifier of a target source table where the target data is located, and an ordering indication parameter;
S302: determining, according to the ordering indication parameter, whether the target data is to be exported in order according to a preset ordering rule;
S303: in the case where it is determined that the target data is to be exported in order according to the preset ordering rule, querying a target source table in a distributed database by performing a first data interaction with the terminal device, and establishing a first temporary table in the distributed database; the first temporary table contains the queried target data and the sequence numbers corresponding to the target data, obtained by sorting the target data based on the preset ordering rule;
S304: establishing a first external table in the distributed database by performing a second data interaction with the terminal device;
S305: according to the first external table, coordinating a plurality of node servers in the distributed cluster to write the target data carrying the sequence numbers, based on the first temporary table, in parallel into a preset directory of a second server.
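Steps S302 and S303 can be illustrated with a small sketch that uses an in-memory SQLite database in place of the distributed database. This substitution, the request dictionary keys, the table names `t1` and `tmp1`, and the `seq_no` column are all assumptions made for the example; the specification does not name a database product or a request format. The sequence number is produced with the standard `ROW_NUMBER()` window function, which requires SQLite 3.25 or later.

```python
import sqlite3


def build_first_temp_table(conn, source, temp, columns, order_by):
    """Create the temporary table with an added sequence number column and fill it in sorted order."""
    col_defs = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    col_names = ", ".join(name for name, _ in columns)
    # Identifiers are interpolated directly; in this sketch they are assumed to be trusted.
    conn.execute(f"CREATE TABLE {temp} (seq_no INTEGER, {col_defs})")
    conn.execute(
        f"INSERT INTO {temp} (seq_no, {col_names}) "
        f"SELECT ROW_NUMBER() OVER (ORDER BY {order_by}), {col_names} FROM {source}"
    )


def handle_export_request(conn, request, columns):
    """Decide from the (hypothetical) ordering indication whether the ordered branch applies."""
    if request.get("order_by"):
        build_first_temp_table(conn, request["table"], "tmp1", columns, request["order_by"])
        return "ordered"
    return "unordered"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, 30.0), (2, 10.0), (3, 20.0)])

branch = handle_export_request(
    conn, {"table": "t1", "order_by": "amount"}, [("id", "INTEGER"), ("amount", "REAL")]
)
rows = conn.execute("SELECT seq_no, id FROM tmp1 ORDER BY seq_no").fetchall()
```

Because each row stores its own sequence number, the intended order survives even if the rows are later written out by different nodes in arbitrary order.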
The terminal device may specifically be a terminal device equipped with a data export device that provides a data export service.
The embodiments of the specification also provide a terminal device, which comprises a processor, a data export device, and a memory for storing instructions executable by the processor, wherein the processor, when executing the instructions, performs the following steps: sending a data export request to a first server, the data export request carrying at least a data identifier of the target data to be exported, a table identifier of the target source table where the target data is located, and an ordering indication parameter; in the case where the ordering indication parameter indicates that the target data is to be exported in order according to a preset ordering rule, querying the target source table in a distributed database by performing a first data interaction with the first server, and establishing a first temporary table in the distributed database, the first temporary table containing the queried target data and the sequence numbers corresponding to the target data, obtained by sorting the target data based on the preset ordering rule; establishing a first external table in the distributed database by performing a second data interaction with the first server, wherein, according to the first external table, the first server coordinates a plurality of node servers in a distributed cluster to write the target data carrying the sequence numbers, based on the first temporary table, in parallel into a preset directory of a second server; sorting the target data in the preset directory according to the sequence numbers by performing a third data interaction with the second server, to obtain sorted target data; and receiving the sorted target data fed back by the second server.
In order to carry out the above instructions more accurately, referring to fig. 4, the embodiments of the present specification further provide another specific terminal device, which includes a network communication port 401, a processor 402, and a memory 403; these structures are connected by an internal cable so that they can exchange data with one another. The terminal device may further be equipped with a data export device for supporting a data export service.
The network communication port 401 may be specifically configured to send a data export request to a first server; the data export request at least carries a data identifier of target data to be exported, a table identifier of a target source table where the target data is located, and an ordering indication parameter.
The processor 402 may be specifically configured to: in the case where the ordering indication parameter indicates that the target data is to be exported in order according to a preset ordering rule, query the target source table in a distributed database by performing a first data interaction with the first server, and establish a first temporary table in the distributed database, the first temporary table containing the queried target data and the sequence numbers corresponding to the target data, obtained by sorting the target data based on the preset ordering rule; establish a first external table in the distributed database by performing a second data interaction with the first server, wherein, according to the first external table, the first server coordinates a plurality of node servers in a distributed cluster to write the target data carrying the sequence numbers, based on the first temporary table, in parallel into a preset directory of a second server; sort the target data in the preset directory according to the sequence numbers by performing a third data interaction with the second server, to obtain sorted target data; and receive the sorted target data fed back by the second server.
The memory 403 may be used for storing a corresponding program of instructions.
In this embodiment, the network communication port 401 may be a virtual port bound to different communication protocols so that different data can be sent or received. For example, the network communication port may be a port responsible for web data communication, a port responsible for FTP data communication, or a port responsible for mail data communication. The network communication port may also be a physical communication interface or a communication chip. For example, it may be a wireless mobile network communication chip, such as a GSM or CDMA chip; it may also be a Wi-Fi chip or a Bluetooth chip.
In this embodiment, the processor 402 may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. The description is not intended to be limiting.
In this embodiment, the memory 403 may exist at several levels. In a digital system, anything that can hold binary data may serve as memory; in an integrated circuit, a circuit that has a storage function but no physical form of its own is also called a memory, such as a RAM or a FIFO; at the system level, a storage device in physical form is also called a memory, such as a memory module or a TF card.
The embodiments of the present specification also provide a computer storage medium based on the above data export method, the computer storage medium storing computer program instructions that, when executed, implement: sending a data export request to a first server, the data export request carrying at least a data identifier of the target data to be exported, a table identifier of the target source table where the target data is located, and an ordering indication parameter; in the case where the ordering indication parameter indicates that the target data is to be exported in order according to a preset ordering rule, querying the target source table in a distributed database by performing a first data interaction with the first server, and establishing a first temporary table in the distributed database, the first temporary table containing the queried target data and the sequence numbers corresponding to the target data, obtained by sorting the target data based on the preset ordering rule; establishing a first external table in the distributed database by performing a second data interaction with the first server, wherein, according to the first external table, the first server coordinates a plurality of node servers in a distributed cluster to write the target data carrying the sequence numbers, based on the first temporary table, in parallel into a preset directory of a second server; sorting the target data in the preset directory according to the sequence numbers by performing a third data interaction with the second server, to obtain sorted target data; and receiving the sorted target data fed back by the second server.
In the present embodiment, the storage medium includes, but is not limited to, a random access memory (Random Access Memory, RAM), a read-only memory (Read-Only Memory, ROM), a cache (Cache), a hard disk (Hard Disk Drive, HDD), or a memory card (Memory Card). The memory may be used to store computer program instructions. The network communication unit may be an interface for performing network connection communication, which is set in accordance with a standard prescribed by a communication protocol.
In this embodiment, the functions and effects of the program instructions stored in the computer storage medium may be explained in comparison with other embodiments, and are not described herein.
Referring to fig. 5, on a software level, the embodiment of the present disclosure further provides a data export device, which may specifically include the following structural modules:
The sending module 501 may be specifically configured to send a data export request to a first server; the data export request at least carries a data identifier of target data to be exported, a table identifier of a target source table where the target data is located, and an ordering indication parameter;
The first interaction module 502 may be specifically configured to, in the case where the ordering indication parameter indicates that the target data is to be exported in order according to a preset ordering rule, query the target source table in a distributed database by performing a first data interaction with the first server, and establish a first temporary table in the distributed database; the first temporary table contains the queried target data and the sequence numbers corresponding to the target data, obtained by sorting the target data based on the preset ordering rule;
The second interaction module 503 may be specifically configured to establish a first external table in the distributed database by performing a second data interaction with the first server; according to the first external table, the first server coordinates a plurality of node servers in a distributed cluster to write the target data carrying the sequence numbers, based on the first temporary table, in parallel into a preset directory of a second server;
The third interaction module 504 may be specifically configured to sort the target data in the preset directory according to the serial number by performing third data interaction with the second server, so as to obtain sorted target data;
The receiving module 505 may be specifically configured to receive the sorted target data fed back by the second server.
In some embodiments, depending on the specific situation and usage requirements, the data export device may be deployed on the terminal device side, or on the side of a server (e.g., the first server or the second server).
It should be noted that, the units, devices, or modules described in the above embodiments may be implemented by a computer chip or entity, or may be implemented by a product having a certain function. For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, when the present description is implemented, the functions of each module may be implemented in the same piece or pieces of software and/or hardware, or a module that implements the same function may be implemented by a plurality of sub-modules or a combination of sub-units, or the like. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
From the above, the data export device provided in the embodiments of the present specification makes it possible to obtain, more efficiently and conveniently, target data that has been exported from the distributed database and sorted according to the preset ordering rule as the user requires, thereby improving the user experience.
In one specific scenario example, a device for exporting ordered data files from a distributed database (Export Service Tool, hereinafter the data export device) may be constructed based on the data export method provided in the present specification.
Specifically, the device can be shown in fig. 6, and includes the following structural modules: the system comprises an entry parameter analysis module 1, an environment check module 2, a database query statement generation module 3, a distributed export execution module 4, an operating system single-point ordering execution module 5, a data file quality check module 6, a task control module 7 and a bus 8.
The entry parameter analysis module 1 and the environment check module 2 are connected through the bus 8 and together form a pre-verification module, which validates the input entry parameters, loads the necessary configuration parameters, and pre-verifies the availability of the environment. If the pre-verification passes and all necessary configuration has been loaded, a signal is sent to the task control module 7 via the bus 8.
The database query statement generation module 3, the distributed export execution module 4, the operating system single-point ordering execution module 5, and the data file quality check module 6 are connected through the bus 8 and together form an ordered data query export module, which implements the core function of the device.
The task control module 7 communicates with each functional module through the bus 8 to provide functions such as flow control, failed task retry, and the like.
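The flow control and failed-task retry functions attributed to the task control module can be sketched as below. The function names, the fixed attempt limit, and the flaky demonstration task are assumptions for illustration; the specification does not describe the retry policy in detail.

```python
import time


def run_with_retry(task, max_attempts=3, delay_seconds=0.0):
    """Run a task, retrying on any exception until it succeeds or the attempt limit is reached."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:  # a real tool would likely catch narrower error types
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error


def run_flow(steps, max_attempts=3):
    """Run named steps strictly in order; a step starts only after the previous one succeeded."""
    results = []
    for name, task in steps:
        results.append((name, run_with_retry(task, max_attempts)))
    return results


# Demonstration: the hypothetical export step fails twice before succeeding.
calls = {"n": 0}


def flaky_export():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("transient failure")
    return "done"


flow_result = run_flow([("pre_check", lambda: "ok"), ("export", flaky_export)])
```

If a step still fails after the last attempt, the exception propagates and the remaining steps are not run, which matches the idea of a controller that gates each stage on the previous one.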
In particular, the device may be connected to external systems (e.g., a distributed cluster), as shown in fig. 7. These include: a number of CN nodes 8 of the MPP database management system, a number of DN nodes 9 of the MPP database management system, a data export device 10 (e.g., corresponding to the terminal device), a data service device 11 of the database management system (e.g., corresponding to the first server), and an ETL server 12 (e.g., corresponding to the second server).
In a specific application scenario, the CN nodes 8 and DN nodes 9 are the main components of the MPP database management system cluster; a CN node may coordinate a plurality of DN nodes to export a data table in parallel to the ETL server 12, by way of the data export device 10 and port communication.
In this scenario example, the data export device 10 may communicate directly with a CN node through a database client to submit SQL query statements; it may connect to the data service device 11 through port communication to pre-check the availability of the data service device 11; and it may communicate with the ETL server 12 via the SSH protocol or an RPC communication service (e.g., according to preset protocol rules) to submit single-point sorting instructions and to verify the quality of the data file.
In particular, when the above-mentioned data export device 10 is used to export ordered data files from a database in a distributed manner, the following steps may be performed as shown in fig. 8:
Step S101: a pre-verification step, comprising an entry parameter check and an environment check. The entry parameter check is a validity check of the parameters, and the environment check is an availability check of the connected external systems or services.
Step S102: the data export device examines the entry parameters to determine whether a sort key is included.
If it is determined in step S102 that no sort key (e.g., the ordering indication parameter) is included, the unordered branch is entered (i.e., the export flow for target data that does not involve sorting).
Step S103: in the first step of the unordered branch, the data export device parses the source table name T1 (e.g., the table identifier of the target source table) in the entry parameters and, according to the table structure of the source table T1, submits a DDL statement to the distributed database to create a temporary table Tmp2 for storing the query result.
Step S104: in the second step of the unordered branch, the data export device 10 may submit a DML statement to the distributed database to write the result of the query on the source table T1 into the temporary table Tmp2. Once the write transaction completes, the complex association query calculation within the database system is finished, and the lock held on the source table T1 can be released, so that queries or writes to the source table T1 by other batch jobs or interfaces are not affected.
Step S105: in the third step of the unordered branch, the data export device 10 submits a DDL statement to the distributed database to create, according to the table structure of the temporary table Tmp2, an external table FT2 that stores the parallel export parameters, and specifies export parameters such as the delimiter, line break, export service node, and port.
Step S106: in the fourth step of the unordered branch, the data export device 10 submits a DML statement to the distributed database to write all data in the temporary table Tmp2 storing the query result into the external table FT2. The distributed database system communicates via port with the process of the data service device 11 on the ETL server and coordinates the DN nodes to write the data in the Tmp2 table in parallel into the designated directory of the ETL server, where it is exported as a data file in the file system. Upon receiving the export completion signal, the data export device 10 initiates the verification step S113.
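The parallel write in step S106 can be pictured with the following sketch, in which threads stand in for DN nodes and a temporary directory stands in for the designated directory on the ETL server. Both substitutions, along with the file naming scheme and the `|` delimiter, are assumptions for illustration only.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def write_node_part(export_dir, node_id, rows, delimiter="|"):
    """Write the rows held by one simulated node into its own part file."""
    part = Path(export_dir) / f"part_{node_id}.txt"
    with part.open("w", encoding="utf-8", newline="") as fh:
        for row in rows:
            fh.write(delimiter.join(str(value) for value in row) + "\n")
    return part


def parallel_export(export_dir, rows_by_node):
    """Let every simulated node write its rows concurrently; returns the part files written."""
    with ThreadPoolExecutor(max_workers=len(rows_by_node)) as pool:
        futures = [
            pool.submit(write_node_part, export_dir, node_id, rows)
            for node_id, rows in rows_by_node.items()
        ]
        return [future.result() for future in futures]


# Rows are distributed across nodes without regard to any global order.
rows_by_node = {0: [(3, "c")], 1: [(1, "a"), (2, "b")]}
with tempfile.TemporaryDirectory() as export_dir:
    parts = parallel_export(export_dir, rows_by_node)
    exported_lines = [
        line for part in parts for line in part.read_text(encoding="utf-8").splitlines()
    ]
```

Every row arrives in the directory, but the combined order depends on how rows were distributed across nodes, which is why the ordered branch needs a sequence number and a later sort.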
If it is determined in step S102 that a sort key (e.g., the ordering indication parameter) is included, the sorting branch is entered (i.e., the export flow for target data that involves sorting).
Step S107: in the first step of the sorting branch, the data export device 10 may parse the source table name T1 in the entry parameters and, according to the table structure of the source table T1 with a newly added sequence number field, submit a DDL statement to the distributed database to create the temporary table Tmp1 for storing the query result.
Step S108: in the second step of the sorting branch, the data export device 10 submits a DML statement containing the sort keyword to the distributed database, and the query result together with the sequence number field is written into the temporary table Tmp1. Once the write transaction completes, the complex association query calculation and the sorting query calculation within the database system are finished, and the lock held on the source table T1 can be released, so that queries or writes to the source table T1 by other batch jobs or interfaces are not affected.
Step S109: in the third step of the sorting branch, the data export device 10 submits a DDL statement to the distributed database to create, according to the table structure of the temporary table Tmp1 including the sequence number field, an external table FT1 that stores the parallel export parameters, and specifies export parameters such as the delimiter, line break, export service node, and port.
Step S110: in the fourth step of the sorting branch, the data export device 10 submits a DML statement to the distributed database to write all data in the temporary table Tmp1 storing the query result into the external table FT1. The distributed database system communicates via port with the process of the data service device 11 on the ETL server 12 and coordinates the DN nodes to write the data in the Tmp1 table in parallel into the designated directory of the ETL server 12, where it is exported as a data file in the file system. However, because the data in Tmp1 is not stored on the DN nodes in sequence, the data rows in the exported data file carry a sequence number field (i.e., target data carrying a sequence number) but are not yet sorted.
Step S111: in the fifth step of the sorting branch, the data export device 10 uses the SSH protocol or RPC communication to submit to the ETL server 12 a sorting request (e.g., a script wrapping a sort command) for the data file produced in step S110, which carries sequence numbers but is unsorted. After the sorting process finishes, the data file on the ETL server 12 carries sequence numbers and is sorted. Upon receiving the sorting completion signal, the data export device 10 initiates the file formatting step S112.
Step S112: in the sixth step of the sorting branch, the data export device 10 uses the SSH protocol or RPC communication to submit to the ETL server 12 a formatting request (e.g., a script wrapping a sed command) for the sorted data file with sequence numbers produced in S111, deleting the sequence number field and the redundant delimiter from the data file. Upon receiving the formatting completion signal, the data export device 10 initiates the verification step S113.
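The effect of steps S111 and S112 on the data file can be sketched in a few lines. The specification describes scripts wrapping sort and sed commands on the ETL server; the version below is an in-memory equivalent written for illustration, and it assumes that the sequence number is the first field of each line and that `|` is the delimiter, neither of which is stated in the specification.

```python
def sort_by_sequence(lines, delimiter="|"):
    """Order data lines by their leading sequence number, compared numerically."""
    return sorted(lines, key=lambda line: int(line.split(delimiter, 1)[0]))


def strip_sequence(lines, delimiter="|"):
    """Remove the leading sequence number field and the delimiter that follows it."""
    return [line.split(delimiter, 1)[1] for line in lines]


# Lines as they might arrive after the parallel export: numbered but unsorted.
exported = ["3|carol|20.0", "10|judy|99.0", "1|alice|10.0", "2|bob|15.0"]
ordered = sort_by_sequence(exported)
final = strip_sequence(ordered)
```

The numeric comparison matters: a plain text sort would place sequence number 10 before 2, so a sort script would likewise need a numeric key on the sequence number field.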
Step S113: a row count check step, which determines whether the row count check passes. The data export device 10 issues query commands to the distributed database and to the ETL server 12, respectively, and compares the number of rows in the intermediate result temporary table (Tmp1 or Tmp2) with the number of rows in the exported data file. If they match, the export flow ends and the export is complete (corresponding to step S114). If they do not match, the data export device retries according to a predetermined retry mechanism.
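The row count comparison in step S113 can be sketched as follows. The temporary table's row count is passed in as a plain integer, standing in for the result of a count query on Tmp1 or Tmp2, and a local temporary file stands in for the exported data file on the ETL server; both are assumptions for the example.

```python
import tempfile
from pathlib import Path


def count_data_file_rows(path):
    """Count the lines in an exported data file without loading it into memory at once."""
    with open(path, "r", encoding="utf-8") as fh:
        return sum(1 for _ in fh)


def row_count_check(temp_table_rows, data_file_path):
    """Return True when the exported file has exactly as many rows as the temporary table."""
    return temp_table_rows == count_data_file_rows(data_file_path)


with tempfile.TemporaryDirectory() as work_dir:
    data_file = Path(work_dir) / "export.txt"
    data_file.write_text("alice|10.0\nbob|15.0\n", encoding="utf-8")
    check_passed = row_count_check(2, data_file)  # counts agree
    check_failed = row_count_check(3, data_file)  # a row is missing from the file
```

A mismatch would be the trigger for the retry mechanism mentioned in the step; the check says nothing about row contents, only that no rows were lost or duplicated in transit.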
This scenario example verifies that the data export method provided in this specification solves the problem that, when querying and exporting data files from a distributed relational database, data files sorted by field cannot be exported efficiently in a distributed manner. Compared with the serial export commands of a distributed database (such as the COPY command), the method offers high data file export efficiency, a stable execution process, and high reliability. In addition, the method can be deployed flexibly, is only loosely coupled to external systems, and is simple to configure and easy to maintain.
Although the present description provides method operational steps as described in the examples or flowcharts, more or fewer operational steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one way of performing the order of steps and does not represent a unique order of execution. When implemented by an apparatus or client product in practice, the methods illustrated in the embodiments or figures may be performed sequentially or in parallel (e.g., in a parallel processor or multi-threaded processing environment, or even in a distributed data processing environment). The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, it is not excluded that additional identical or equivalent elements may be present in a process, method, article, or apparatus that comprises a described element. The terms first, second, etc. are used to denote a name, but not any particular order.
Those skilled in the art will also appreciate that, in addition to implementing the controller in a pure computer readable program code, it is well possible to implement the same functionality by logically programming the method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc. Such a controller can be regarded as a hardware component, and means for implementing various functions included therein can also be regarded as a structure within the hardware component. Or even means for achieving the various functions may be regarded as either software modules implementing the methods or structures within hardware components.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
From the above description of embodiments, it will be apparent to those skilled in the art that the present description may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solutions of the present specification may be embodied essentially in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and include several instructions to cause a computer device (which may be a personal computer, a mobile terminal, a server, or a network device, etc.) to perform the methods described in the various embodiments or portions of the embodiments of the present specification.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the others. The specification is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
Although the present specification has been described by way of example, it will be appreciated by those skilled in the art that there are many variations and modifications to the specification without departing from the spirit of the specification, and it is intended that the appended claims encompass such variations and modifications as do not depart from the spirit of the specification.

Claims (15)

1. A data export method, comprising:
Transmitting a data export request to a first server; the data export request at least carries a data identifier of target data to be exported, a table identifier of a target source table where the target data is located, and an ordering indication parameter;
Under the condition that the ordering indication parameters indicate that target data are to be exported in order according to a preset ordering rule, a first data interaction is carried out with a first server, a target source table in a distributed database is inquired, and a first temporary table is built in the distributed database; the first temporary table comprises inquired target data and sequence numbers corresponding to the target data, which are obtained after the target data are sequenced based on a preset sequencing rule; the data of each table in the distributed database is distributed and stored to each node server in the distributed cluster; when the first server starts to inquire the target source table, an operation lock is arranged on the target source table so as to avoid interference and influence of other instructions or operations on the operation of the target source table in the process of inquiring the target source table;
Establishing a first external table in the distributed database by performing a second data interaction with the first server; according to the first external table, the first server coordinates a plurality of node servers in a distributed cluster to write the target data carrying the sequence numbers, based on the first temporary table, in parallel into a preset directory of a second server; the first external table is a communication configuration table for guiding the data in the first temporary table to be exported from the distributed database in parallel;
Through third data interaction with the second server, sorting the target data in the preset catalogue according to the serial number to obtain sorted target data;
And receiving the ordered target data fed back by the second server.
2. The method of claim 1, wherein the ordering indication parameter further comprises a preset ordering rule set by a user.
3. The method of claim 1, wherein querying the target source table in the distributed database and establishing the first temporary table in the distributed database through the first data interaction with the first server comprises:
obtaining table structure information of the target source table;
adding a sequence number column to the table structure information of the target source table to obtain table structure information of the first temporary table;
generating a first creation instruction for the first temporary table according to the table structure information of the first temporary table, wherein the first creation instruction carries the table structure information of the first temporary table; and
submitting the first creation instruction to the first server, wherein the first server creates an empty first temporary table in the distributed database in response to the first creation instruction; the first server queries the target source table according to the data export request; and the first server sorts the target data queried from the target source table according to the preset ordering rule and writes the target data and the corresponding sequence numbers into the empty table to obtain the first temporary table.
4. The method of claim 3, wherein after querying the target source table in the distributed database and establishing the first temporary table in the distributed database through the first data interaction with the first server, the method further comprises:
the first server releasing the operation lock set on the target source table, wherein the operation lock is set by the first server when querying the target source table according to the data export request.
5. The method of claim 3, wherein establishing the first external table in the distributed database through the second data interaction with the first server comprises:
generating a second creation instruction for the first external table according to the table structure information of the first temporary table; and
submitting the second creation instruction to the first server, wherein the first server establishes, in response to the second creation instruction, the first external table associated with the first temporary table in the distributed database, and the first external table stores parallel export parameters.
6. The method of claim 5, wherein the parallel export parameters include at least one of: a delimiter, a line break character, identification information of the node servers participating in the parallel export, and port configuration information associated with the parallel export.
7. The method of claim 1, wherein sorting the target data in the preset directory according to the sequence numbers through the third data interaction with the second server, to obtain the sorted target data, comprises:
generating a sorting request according to a preset protocol rule; and
sending the sorting request to the second server, wherein the second server, in response to the sorting request, calls a preset sorting script file to sort the target data in the preset directory according to the sequence numbers carried by the target data, to obtain the sorted target data.
8. The method of claim 7, wherein the preset sorting script file comprises a script file encapsulating a sort command.
9. The method of claim 7, wherein after sorting the target data in the preset directory according to the sequence numbers through the third data interaction with the second server, the method further comprises:
generating a formatting request according to a preset protocol rule; and
sending the formatting request to the second server, wherein the second server, in response to the formatting request, calls a preset formatting script file to format the sorted target data, to obtain sorted target data that meets a preset format requirement.
10. The method of claim 7, wherein after sorting the target data in the preset directory according to the sequence numbers through the third data interaction with the second server, the method further comprises:
generating a verification request; and
sending the verification request to the second server, wherein the second server, in response to the verification request, counts the number of records in the sorted target data and the number of rows in the first temporary table, and checks, according to these two counts, whether the sorted target data meets a preset quality requirement.
11. The method of claim 1, wherein in a case where the ordering indication parameter does not indicate that the target data is to be exported in order according to the preset ordering rule, the method further comprises:
querying the target source table in the distributed database and establishing a second temporary table in the distributed database through a fourth data interaction with the first server, wherein the second temporary table comprises the queried target data;
establishing a second external table in the distributed database through a fifth data interaction with the first server, wherein the first server coordinates, according to the second external table, a plurality of node servers in the distributed cluster to write the target data into the preset directory of the second server in parallel based on the second temporary table; and
receiving the target data fed back by the second server.
12. A data export method, applied to a first server, comprising:
receiving a data export request, wherein the data export request carries at least a data identifier of target data to be exported, a table identifier of a target source table in which the target data is located, and an ordering indication parameter;
determining, according to the ordering indication parameter, whether the target data is to be exported in order according to a preset ordering rule;
in a case where it is determined that the target data is to be exported in order according to the preset ordering rule, querying the target source table in a distributed database and establishing a first temporary table in the distributed database through a first data interaction, wherein the first temporary table comprises the queried target data and sequence numbers corresponding to the target data, the sequence numbers being obtained by sorting the target data based on the preset ordering rule; data of each table in the distributed database is stored in a distributed manner across the node servers of a distributed cluster; and when the first server starts to query the target source table, an operation lock is set on the target source table to prevent other instructions or operations from interfering with operations on the target source table while it is being queried;
establishing a first external table in the distributed database through a second data interaction, wherein the first external table is a communication configuration table for directing the data in the first temporary table to be exported from the distributed database in parallel; and
coordinating, according to the first external table, a plurality of node servers in the distributed cluster to write the target data carrying the sequence numbers into a preset directory of a second server in parallel based on the first temporary table.
13. A data export apparatus, comprising:
a sending module, configured to send a data export request to a first server, wherein the data export request carries at least a data identifier of target data to be exported, a table identifier of a target source table in which the target data is located, and an ordering indication parameter;
a first interaction module, configured to, in a case where the ordering indication parameter indicates that the target data is to be exported in order according to a preset ordering rule, query the target source table in a distributed database and establish a first temporary table in the distributed database through a first data interaction with the first server, wherein the first temporary table comprises the queried target data and sequence numbers corresponding to the target data, the sequence numbers being obtained by sorting the target data based on the preset ordering rule; data of each table in the distributed database is stored in a distributed manner across the node servers of a distributed cluster; and when the first server starts to query the target source table, an operation lock is set on the target source table to prevent other instructions or operations from interfering with operations on the target source table while it is being queried;
a second interaction module, configured to establish a first external table in the distributed database through a second data interaction with the first server, wherein the first server coordinates, according to the first external table, a plurality of node servers in the distributed cluster to write the target data carrying the sequence numbers into a preset directory of a second server in parallel based on the first temporary table; and the first external table is a communication configuration table for directing the data in the first temporary table to be exported from the distributed database in parallel;
a third interaction module, configured to sort the target data in the preset directory according to the sequence numbers through a third data interaction with the second server, to obtain sorted target data; and
a receiving module, configured to receive the sorted target data fed back by the second server.
14. A terminal device comprising a processor, a data export apparatus, and a memory for storing processor-executable instructions, wherein the processor implements the steps of the method of any one of claims 1 to 11 when executing the instructions.
15. A computer readable storage medium having stored thereon computer instructions which when executed implement the steps of the method of any of claims 1 to 11.
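
As an illustration only, and not as part of the claims, the ordered export flow recited in claims 1, 3, 7 and 10 can be sketched in Python. The sketch is a single-process simulation under stated assumptions: the function names (build_temp_table, parallel_export, sort_exported, verify_export), the round-robin split of rows across nodes, and the "|" delimiter are all hypothetical and are not taken from the specification. Operation locks, external tables, and real node servers are not modeled.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[object, ...]


def build_temp_table(rows: Sequence[Row], sort_key: Callable[[Row], object]) -> List[Tuple[int, Row]]:
    """Sort the queried rows by the preset rule and attach a 1-based sequence number."""
    return list(enumerate(sorted(rows, key=sort_key), start=1))


def parallel_export(temp_table: List[Tuple[int, Row]], node_count: int, delimiter: str = "|") -> List[List[str]]:
    """Simulate node servers each writing their share of rows as delimited lines.

    Rows are spread round-robin (an assumption), so the combined output
    in the target directory is no longer in sequence order.
    """
    shards: List[List[str]] = [[] for _ in range(node_count)]
    for seq, row in temp_table:
        line = delimiter.join([str(seq), *(str(v) for v in row)])
        shards[seq % node_count].append(line)
    return shards


def sort_exported(shards: List[List[str]], delimiter: str = "|") -> List[str]:
    """Second-server step: restore order using the leading sequence number."""
    lines = [line for shard in shards for line in shard]
    return sorted(lines, key=lambda line: int(line.split(delimiter, 1)[0]))


def verify_export(sorted_lines: List[str], temp_table: List[Tuple[int, Row]]) -> bool:
    """Row-count check: exported line count must equal the temporary table row count."""
    return len(sorted_lines) == len(temp_table)


# Hypothetical source rows: (name, amount), ordered by amount.
source_rows = [("carol", 30), ("alice", 10), ("bob", 20)]
temp = build_temp_table(source_rows, sort_key=lambda r: r[1])
shards = parallel_export(temp, node_count=2)
result = sort_exported(shards)
ok = verify_export(result, temp)
```

Sorting on an integer sequence number, rather than re-applying the original ordering rule, means the second server only needs a single numeric key regardless of how complex the preset ordering rule was.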
CN202110341385.8A 2021-03-30 2021-03-30 Data export method and device and terminal equipment Active CN112860780B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110341385.8A CN112860780B (en) 2021-03-30 2021-03-30 Data export method and device and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110341385.8A CN112860780B (en) 2021-03-30 2021-03-30 Data export method and device and terminal equipment

Publications (2)

Publication Number Publication Date
CN112860780A CN112860780A (en) 2021-05-28
CN112860780B true CN112860780B (en) 2024-06-14

Family

ID=75993189

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110341385.8A Active CN112860780B (en) 2021-03-30 2021-03-30 Data export method and device and terminal equipment

Country Status (1)

Country Link
CN (1) CN112860780B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1874254A (en) * 2005-06-02 2006-12-06 华为技术有限公司 Method for browsing data based on structure of client end / server end
CN111510493A (en) * 2020-04-15 2020-08-07 中国工商银行股份有限公司 Distributed data transmission method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677754B (en) * 2015-12-30 2019-03-26 华为技术有限公司 Obtain the methods, devices and systems of subitem metadata in file system
CN106126633B (en) * 2016-06-22 2019-10-18 中国建设银行股份有限公司 Processing method, the device and system of noble metal data
CN107070953B (en) * 2017-06-09 2019-11-01 武汉虹旭信息技术有限责任公司 Link guard system and its method based on Dynamic Programming
CN109145034B (en) * 2017-06-15 2022-05-06 阿里巴巴集团控股有限公司 Resource presentation method and device and computer terminal
CN112527847A (en) * 2020-11-12 2021-03-19 贝壳技术有限公司 Data sorting method and device, electronic medium and storage medium

Also Published As

Publication number Publication date
CN112860780A (en) 2021-05-28

Similar Documents

Publication Publication Date Title
CN102236672A (en) Method and device for importing data
CN108628748B (en) Automatic test management method and automatic test management system
CN110399383A (en) Applied to the data processing method of server, device, calculate equipment, medium
CN108388604A (en) User right data administrator, method and computer readable storage medium
CN110941630A (en) Database operation and maintenance method, device and system
CN112463144A (en) Distributed storage command line service method, system, terminal and storage medium
EP1725953A1 (en) Apparatus and method for data consistency validation.
CN109740129B (en) Report generation method, device and equipment based on blockchain and readable storage medium
CN110188103A (en) Data account checking method, device, equipment and storage medium
CN110245145A (en) Structure synchronization method and apparatus of the relevant database to Hadoop database
CN112817995B (en) Data processing method and device, electronic equipment and storage medium
CN110222028A (en) A kind of data managing method, device, equipment and storage medium
CN107133233B (en) Processing method and device for configuration data query
CN112559525B (en) Data checking system, method, device and server
CN113836237A (en) Method and device for auditing data operation of database
CN112860780B (en) Data export method and device and terminal equipment
CN116974874A (en) Database testing method and device, electronic equipment and readable storage medium
CN110874365A (en) Information query method and related equipment thereof
CN110827001A (en) Accounting event bookkeeping method, system, equipment and storage medium
CN112445860A (en) Method and device for processing distributed transaction
CN111694724A (en) Testing method and device of distributed table system, electronic equipment and storage medium
CN115481026A (en) Test case generation method and device, computer equipment and storage medium
CN104216986A (en) Device and method for improving data query efficiency through pre-operation according to data update period
CN107704557B (en) Processing method and device for operating mutually exclusive data, computer equipment and storage medium
CN113760841A (en) Method and device for realizing distributed lock

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant