CN110471977A - Data exchange method, apparatus, device, and medium - Google Patents

Data exchange method, apparatus, device, and medium

Info

Publication number
CN110471977A
Authority
CN
China
Prior art keywords
data
etl
database
concurrently
data exchange
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910779089.9A
Other languages
Chinese (zh)
Other versions
CN110471977B (en)
Inventor
林鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dt Dream Technology Co Ltd
Original Assignee
Hangzhou Dt Dream Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dt Dream Technology Co Ltd
Priority to CN201910779089.9A
Publication of CN110471977A
Application granted
Publication of CN110471977B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a data exchange method, apparatus, device, and medium. The method comprises: obtaining back-end database information of a distributed database; controlling, according to the back-end database information, ETL working nodes to create data exchange task threads equal in number to the back-end databases; and controlling the ETL working nodes to execute the data exchange task threads, so as to extract target data from a source database directly and concurrently and to load the target data into a destination database directly and concurrently. Thus, after obtaining the back-end database information of the distributed database, the application can control the ETL working nodes to create as many data exchange task threads as there are back-end databases and to execute these threads so that the databases are read and written directly and concurrently. Because the data exchange task threads do not pass through the middleware, they are not subject to the middleware's single-point performance limit; data is read and written concurrently, and the data exchange performance of the distributed database improves.

Description

Data exchange method, apparatus, device, and medium
Technical field
This application relates to the field of database technology, and in particular to a data exchange method, apparatus, device, and medium.
Background
ETL (Extract-Transform-Load) describes the process of extracting data from a source, transforming it, and loading it into a destination. When an ETL tool handles a data exchange task for a distributed database, middleware is needed to parse SQL semantics and proxy the SQL service. Although scaling the middleware horizontally raises its overall proxy capacity, the data exchange for any one data table can only be proxied by a single middleware instance, so a single-point performance bottleneck remains and data exchange performance is low.
Summary of the invention
In view of this, the purpose of this application is to provide a data exchange method, apparatus, device, and medium that greatly improve the data exchange performance of a distributed database. The specific scheme is as follows:
In a first aspect, this application discloses a data exchange method applied to an ETL control node, comprising:
Obtaining back-end database information of a distributed database;
Controlling, according to the back-end database information, ETL working nodes to create data exchange task threads equal in number to the back-end databases;
Controlling the ETL working nodes to execute the data exchange task threads, so as to extract target data from a source database directly and concurrently and to load the target data into a destination database directly and concurrently.
Optionally, obtaining the back-end database information of the distributed database comprises:
When data is to be read from the distributed database, obtaining back-end database information that includes the number of back-end databases, the connection information of the back-end databases, and the permission information of the back-end databases;
Or, when data is to be written to the distributed database, obtaining back-end database information that includes the number of back-end databases, the sharding rule information of the back-end databases, the connection information of the back-end databases, and the permission information of the back-end databases.
Optionally, controlling the ETL working nodes to execute the data exchange task threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently, comprises:
When data is to be read from the distributed database, controlling the ETL working nodes to execute the data exchange task threads, so as to extract the target data from the back-end databases directly and concurrently and to load the target data into the destination database directly and concurrently;
Or, when data is to be written to the distributed database, controlling the ETL working nodes to execute the data exchange task threads, so as to extract the target data from the source database directly and concurrently and to load the target data into the back-end databases directly and concurrently.
Optionally, controlling the ETL working nodes to execute the data exchange task threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently, comprises:
When data is to be written to the distributed database, controlling the ETL working nodes to execute the data exchange task threads, which convert the sharding rules of the back-end databases into extraction rules, extract the target data from the source database directly and concurrently, and load the target data into the back-end databases directly and concurrently.
Optionally, while the ETL working nodes are controlled to execute the data exchange task threads so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently, the method further comprises:
When a data exchange task fails, determining whether the failure was caused by a failure of the data exchange task thread;
If so, controlling the ETL working node corresponding to the data exchange task to create a new data exchange task thread, rolling back the fault data in the destination database, and then extracting the data corresponding to the fault data from the source database again, until the data exchange task is completed.
Optionally, determining whether the data exchange task failure was caused by a failure of the data exchange task thread comprises:
If the data exchange task failure was not caused by a failure of the data exchange task thread, determining whether it was caused by a failure of the ETL working node;
If so, controlling an ETL working node in a normal operating state to create a new data exchange task thread, rolling back the fault data in the destination database, and then extracting the data corresponding to the fault data from the source database again, until the data exchange task is completed.
Optionally, rolling back the fault data in the destination database comprises:
Controlling the ETL working node to convert the sharding rule corresponding to the data exchange task into a filter rule, so as to select the fault data in the destination database, and deleting the fault data.
Optionally, after the ETL working nodes are controlled to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently, the method further comprises:
Aggregating the total data exchange information of the data exchange task threads, and controlling the display of the total data exchange information.
In a second aspect, this application discloses a data exchange apparatus, comprising:
An information obtaining module, configured to obtain back-end database information of a distributed database;
A thread creation module, configured to control, according to the back-end database information, ETL working nodes to create data exchange task threads equal in number to the back-end databases;
A data exchange control module, configured to control the ETL working nodes to execute the data exchange task threads, so as to extract target data from a source database directly and concurrently and to load the target data into a destination database directly and concurrently.
In a third aspect, this application discloses a data exchange device, comprising:
A memory and a processor;
Wherein the memory is configured to store a computer program;
The processor is configured to execute the computer program to implement the data exchange method disclosed above.
In a fourth aspect, this application discloses a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the data exchange method disclosed above.
As can be seen, this application obtains back-end database information of a distributed database; then, according to the back-end database information, controls ETL working nodes to create data exchange task threads equal in number to the back-end databases; and controls the ETL working nodes to execute the data exchange task threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently. Thus, after obtaining the back-end database information of the distributed database, the application can control the ETL working nodes to create as many data exchange task threads as there are back-end databases and to execute multiple data exchange task threads simultaneously, so that the source database and the destination database are accessed directly and concurrently. Because the data exchange task threads do not pass through the middleware, they are not subject to the middleware's single-point performance limit; data is read and written concurrently, and the data exchange performance of the distributed database improves.
Brief description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a data exchange method disclosed in this application;
Fig. 2 is a flowchart of a specific data exchange method disclosed in this application;
Fig. 3 is a flowchart of a specific data exchange method disclosed in this application;
Fig. 4 is a schematic diagram of a specific data exchange principle disclosed in this application;
Fig. 5 is a flowchart of a specific data exchange method disclosed in this application;
Fig. 6 is a schematic diagram of a specific data exchange working principle disclosed in this application;
Fig. 7 is a structural diagram of a data exchange apparatus disclosed in this application;
Fig. 8 is a structural diagram of an ETL control node disclosed in this application.
Detailed description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
Currently, when an ETL tool handles a data exchange task for a distributed database, middleware must be used to parse SQL semantics and proxy the SQL service, and the use of middleware introduces a single-point performance bottleneck. This application therefore provides a data exchange method that accesses the source database and the destination database directly and concurrently, so that data is read and written concurrently and the data exchange performance of the distributed database improves.
An embodiment of this application discloses a data exchange method applied to an ETL control node. Referring to Fig. 1, the method comprises:
Step S11: obtain back-end database information of the distributed database.
It should be understood that ETL describes the process of extracting data from a source, transforming it, and loading it into a destination. An ETL tool is software that implements the ETL function. To increase ETL exchange capacity and support access to more data sources, an ETL tool can be deployed as a cluster. In an ETL cluster, a control node manages multiple working nodes centrally, schedules jobs, and distributes each exchange job to one or more working nodes. The working nodes can be scaled horizontally and perform the actual data exchange work, periodically extracting, transforming, and loading data according to the exchange jobs.
A distributed database system usually uses relatively small computer systems. Each computer can be placed at a separate location and may hold a complete or partial copy of the database management system (DBMS) together with its own local database. Many computers at different locations are interconnected over a network and together form a large database that is complete, global, logically centralized, and physically distributed.
In this embodiment, the distributed database comprises back-end databases and middleware. The ETL control node interacts with the middleware to obtain the back-end database information of the distributed database.
In a first specific implementation, obtaining the back-end database information of the distributed database comprises: when data is to be read from the distributed database, obtaining back-end database information that includes the number of back-end databases, the connection information of the back-end databases, and the permission information of the back-end databases.
In a second specific implementation, obtaining the back-end database information of the distributed database comprises: when data is to be written to the distributed database, obtaining back-end database information that includes the number of back-end databases, the sharding rule information of the back-end databases, the connection information of the back-end databases, and the permission information of the back-end databases.
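The two cases differ only in whether sharding rules are part of the returned information. A minimal Python sketch of such a record follows. The class names, field names, and the `validate_for` helper are illustrative assumptions and do not come from the application itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BackendDatabase:
    """One back-end database of the distributed database (hypothetical shape)."""
    host: str                          # connection information
    port: int
    user: str                          # permission information
    password: str
    shard_rule: Optional[str] = None   # only required when writing


@dataclass(frozen=True)
class BackendInfo:
    """Back-end database information returned to the ETL control node."""
    backends: List[BackendDatabase]

    @property
    def count(self) -> int:
        """Number of back-end databases, i.e. the number of threads to create."""
        return len(self.backends)

    def validate_for(self, mode: str) -> None:
        """Check that the record carries what a read or a write requires."""
        if mode not in ("read", "write"):
            raise ValueError(f"unknown mode: {mode!r}")
        if mode == "write":
            missing = [b.host for b in self.backends if b.shard_rule is None]
            if missing:
                raise ValueError(f"sharding rule missing for: {missing}")
```

A record without sharding rules passes `validate_for("read")` but is rejected by `validate_for("write")`, which mirrors the extra field required in the second implementation.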
Step S12: according to the back-end database information, control the ETL working nodes to create data exchange task threads equal in number to the back-end databases.
In this embodiment, the back-end database information includes the number of back-end databases, and the ETL control node controls the ETL working nodes to create as many data exchange task threads as there are back-end databases. Using the connection information and permission information obtained for each back-end database, the ETL working nodes establish connections directly with the corresponding back-end databases, without going through the middleware. The data exchange task threads can be created by a single ETL working node or by multiple ETL working nodes.
Step S13: control the ETL working nodes to execute the data exchange task threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently.
It should be understood that when the ETL working nodes execute the data exchange task threads, the target data is extracted from the source database directly and concurrently and loaded into the destination database directly and concurrently. Specifically, the ETL working nodes execute more than one data exchange task thread at a time, which achieves concurrent access to the databases.
When the data exchange task threads are all created by one ETL working node, the threads execute in parallel, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently. For example, if there are 4 back-end databases, the ETL control node controls a first ETL working node to create 4 data exchange task threads and to execute those 4 threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently.
When the data exchange task threads are created by multiple ETL working nodes, the number of ETL working nodes equals the number of back-end databases, and the ETL working nodes correspond one-to-one with the back-end databases, two or more of the data exchange task threads execute in parallel, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently. For example, if there are 4 back-end databases, the ETL control node controls a first, a second, a third, and a fourth ETL working node to create 1 data exchange task thread each, and controls any two or more of these four ETL working nodes to execute their data exchange task threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently.
When the data exchange task threads are created by multiple ETL working nodes and the number of ETL working nodes is smaller than the number of back-end databases, two or more of the data exchange task threads execute in parallel, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently. For example, if there are 4 back-end databases, the ETL control node controls a first ETL working node and a second ETL working node to create 2 data exchange task threads each, and controls one or both of these ETL working nodes to execute the data exchange task threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently.
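The three cases above share one invariant: the total number of data exchange task threads equals the number of back-end databases, however many ETL working nodes create them. The sketch below illustrates this with a round-robin assignment. The function name and the round-robin policy are assumptions made for illustration; the application does not specify how threads are distributed among nodes.

```python
from typing import Dict, List


def plan_task_threads(backends: List[str], worker_nodes: List[str]) -> Dict[str, List[str]]:
    """Assign one task thread per back-end database to the available nodes.

    Returns a mapping from working node to the back-end databases whose
    task threads that node creates (round-robin, a hypothetical policy).
    """
    if not worker_nodes:
        raise ValueError("at least one ETL working node is required")
    plan: Dict[str, List[str]] = {node: [] for node in worker_nodes}
    for index, backend in enumerate(backends):
        node = worker_nodes[index % len(worker_nodes)]
        plan[node].append(backend)
    return plan


backends = ["E1", "E2", "E3", "E4"]
one_node = plan_task_threads(backends, ["G1"])                      # 4 threads on one node
four_nodes = plan_task_threads(backends, ["G1", "G2", "G3", "G4"])  # 1 thread per node
two_nodes = plan_task_threads(backends, ["G1", "G2"])               # 2 threads per node
```

In each plan the thread counts across nodes sum to 4, matching the 4 back-end databases used in the examples above.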
In a first specific implementation, controlling the ETL working nodes to execute the data exchange task threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently, comprises: when data is to be read from the distributed database, controlling the ETL working nodes to execute the data exchange task threads, so as to extract the target data from the back-end databases directly and concurrently and to load the target data into the destination database directly and concurrently.
In a second specific implementation, controlling the ETL working nodes to execute the data exchange task threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently, comprises: when data is to be written to the distributed database, controlling the ETL working nodes to execute the data exchange task threads, so as to extract the target data from the source database directly and concurrently and to load the target data into the back-end databases directly and concurrently.
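As a rough illustration of the read case, the sketch below runs one task thread per back-end database with Python's standard `concurrent.futures.ThreadPoolExecutor`. In-memory lists stand in for the back-end databases and the destination database, and the function name is hypothetical; a real ETL working node would hold direct database connections instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def exchange_from_backends(backends: Dict[str, List[dict]]) -> List[dict]:
    """Read case: one task thread per back-end database, no middleware in the path."""
    if not backends:
        return []
    destination: List[dict] = []      # stands in for the destination database
    lock = threading.Lock()           # serialises writes to the in-memory list

    def task(name: str) -> int:
        rows = list(backends[name])   # extract directly from one back end
        with lock:
            destination.extend(rows)  # load directly into the destination
        return len(rows)

    # The thread count equals the number of back-end databases.
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        list(pool.map(task, backends))
    return destination


shards = {
    "E11": [{"id": 1}, {"id": 2}],
    "E12": [{"id": 3}],
    "E13": [{"id": 4}, {"id": 5}],
}
result = exchange_from_backends(shards)
```

The order of rows in `result` depends on thread scheduling, so only the set of loaded rows is deterministic. The write case is symmetric: each thread reads its share from the source database and loads it into one back-end database.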
In this embodiment, after the ETL working nodes are controlled to execute the data exchange task threads so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently, the method may further include: aggregating the total data exchange information of the data exchange task threads, and controlling the display of the total data exchange information.
As can be seen, this application obtains back-end database information of a distributed database; then, according to the back-end database information, controls ETL working nodes to create data exchange task threads equal in number to the back-end databases; and controls the ETL working nodes to execute the data exchange task threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently. Thus, after obtaining the back-end database information of the distributed database, the application can control the ETL working nodes to create as many data exchange task threads as there are back-end databases and to execute multiple data exchange task threads simultaneously, so that the source database and the destination database are accessed directly and concurrently. Because the data exchange task threads do not pass through the middleware, they are not subject to the middleware's single-point performance limit; data is read and written concurrently, and the data exchange performance of the distributed database improves.
Referring to Fig. 2, an embodiment of this application discloses a specific data exchange method applied to an ETL control node. The method comprises:
Step S21: obtain back-end database information of the distributed database.
Step S22: according to the back-end database information, control the ETL working nodes to create data exchange task threads equal in number to the back-end databases.
Step S23: control the ETL working nodes to execute the data exchange task threads, so as to extract target data from the source database directly and concurrently and to load the target data into the destination database directly and concurrently.
Step S24: when a data exchange task fails, determine whether the failure was caused by a failure of the data exchange task thread.
It should be understood that in this embodiment, while the ETL working nodes execute the data exchange task threads, the ETL control node collects the task status of each thread. When a data exchange task fails, the ETL control node determines the cause of the failure and interacts with the middleware to obtain the sharding rule of the destination database corresponding to the data exchange task.
Step S25: if so, control the ETL working node corresponding to the data exchange task to create a new data exchange task thread, roll back the fault data in the destination database, and then extract the data corresponding to the fault data from the source database again, until the data exchange task is completed.
In this embodiment, if the data exchange task failure was caused by a failure of the data exchange task thread, the ETL working node corresponding to the data exchange task is controlled to create a new data exchange task thread, the fault data in the destination database is rolled back, and the data corresponding to the fault data is then extracted from the source database again, until the data exchange task is completed. Specifically, when the failure was caused by a failure of the data exchange task thread, the ETL working node corresponding to the data exchange task is controlled to create a new data exchange task thread and to convert the sharding rule obtained in the preceding step into a filter rule, which is used to select the fault data in the destination database corresponding to the data exchange task; the fault data is then deleted. After the deletion completes, the complete data corresponding to the fault data is extracted from the source database and loaded into the corresponding destination database. The filter rule includes, but is not limited to, a where condition and a limit condition. For example, the sharding rule is converted into a where condition, the fault data is selected from the destination database corresponding to the data exchange task, and the fault data is deleted. If the sharding rule corresponding to the data exchange task is data=20190726, the sharding rule is converted into the where rule where data=20190726, the fault data is selected from the destination database, and the fault data is deleted. After the deletion completes, the data corresponding to the sharding rule data=20190726 is extracted from the source database again and loaded into the destination database corresponding to the data exchange task.
If the data exchange task failure was not caused by a failure of the data exchange task thread, it is determined whether the failure was caused by a failure of the ETL working node. If so, an ETL working node in a normal operating state is controlled to create a new data exchange task thread, the fault data in the destination database is rolled back, and the data corresponding to the fault data is then extracted from the source database again, until the data exchange task is completed. If not, the data exchange task is stopped and ETL operation and maintenance personnel are notified to investigate; alternatively, after the destination database recovers, the fault data in the destination database is rolled back, and once the deletion completes, the complete data corresponding to the fault data is extracted from the source database and loaded into the corresponding destination database. For the specific steps of the rollback operation, refer to the content disclosed above.
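The failure handling described for this embodiment amounts to a small decision procedure: a thread failure is retried on the same ETL working node, a node failure is retried on a healthy node, and anything else stops the task for manual inspection. The sketch below is one hedged reading of that procedure. The enum, the function name, and the returned action strings are hypothetical, and falling back to "stop and notify" when no healthy node exists is an added assumption.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Cause(Enum):
    THREAD = "data exchange task thread failure"
    WORKER_NODE = "ETL working node failure"
    OTHER = "other, e.g. destination database unavailable"


def plan_recovery(cause: Cause, failed_node: str,
                  healthy_nodes: List[str]) -> Tuple[str, Optional[str]]:
    """Return (action, node) for a failed data exchange task."""
    if cause is Cause.THREAD:
        # The node itself is fine: create a new thread on the same node.
        return ("restart_thread", failed_node)
    if cause is Cause.WORKER_NODE:
        # Move the task to a node that is in a normal operating state.
        candidates = [n for n in healthy_nodes if n != failed_node]
        if candidates:
            return ("restart_thread", candidates[0])
    # Neither case applies (or no healthy node exists): stop and notify
    # ETL operation and maintenance personnel.
    return ("stop_and_notify", None)
```

Whenever the action is `restart_thread`, the rollback of fault data and the re-extraction from the source database would follow before the task is considered complete.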
The rollback operation ensures that data written to the destination database is not duplicated, which keeps the exchanged data complete and avoids occupying extra storage space.
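The rollback can be sketched with Python's built-in `sqlite3` module, using in-memory databases as stand-ins for the source and destination databases. The helper names are hypothetical, only simple `column=value` sharding rules are handled, and the table name is assumed to be a trusted identifier; none of this is prescribed by the application.

```python
import sqlite3
from typing import Tuple


def shard_rule_to_where(rule: str) -> Tuple[str, tuple]:
    """Convert a simple 'column=value' sharding rule into a where condition."""
    column, sep, value = rule.partition("=")
    column, value = column.strip(), value.strip()
    if not sep or not column.isidentifier():
        raise ValueError(f"unsupported sharding rule: {rule!r}")
    return f"{column} = ?", (value,)


def rollback_and_reload(src: sqlite3.Connection, dst: sqlite3.Connection,
                        table: str, rule: str) -> int:
    """Delete the fault data for one shard, then load that shard again."""
    where, params = shard_rule_to_where(rule)
    cursor = src.execute(f"SELECT * FROM {table} WHERE {where}", params)
    placeholders = ", ".join("?" for _ in cursor.description)
    rows = cursor.fetchall()
    with dst:  # one transaction: remove partial data, then insert the full shard
        dst.execute(f"DELETE FROM {table} WHERE {where}", params)
        dst.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    return len(rows)
```

Called with the rule `data=20190726`, the function deletes whatever partial rows for that shard reached the destination and reloads the shard from the source, so rows loaded before the failure are not written twice.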
Step S26: aggregate the total data exchange information of the data exchange task threads, and control the display of the total data exchange information.
It should be understood that after the data exchange task threads complete, the ETL control node aggregates the total data exchange information of the threads and controls its display on the corresponding front-end UI. The total data exchange information includes, but is not limited to, the names of the data tables exchanged, the total number of rows exchanged, and the data size. Displaying the total data exchange information on the front-end UI visualizes the data exchange, making it easy for users to obtain the total data exchange information and to judge whether the data exchange task is complete.
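A minimal sketch of the aggregation step follows, assuming each data exchange task thread reports the table name, row count, and byte size it handled. The `ThreadReport` record and the `summarize` function are hypothetical names; the application only lists the kinds of information that are aggregated.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class ThreadReport:
    """What one data exchange task thread reports on completion (assumed shape)."""
    table: str
    rows: int
    size_bytes: int


def summarize(reports: Iterable[ThreadReport]) -> Dict[str, Dict[str, int]]:
    """Aggregate per-thread reports into per-table totals for display."""
    totals: Dict[str, Dict[str, int]] = {}
    for report in reports:
        entry = totals.setdefault(report.table, {"rows": 0, "size_bytes": 0})
        entry["rows"] += report.rows
        entry["size_bytes"] += report.size_bytes
    return totals


summary = summarize([
    ThreadReport("orders", 100, 2048),
    ThreadReport("orders", 150, 3072),
    ThreadReport("users", 10, 512),
])
```

The resulting dictionary is the kind of per-table summary a front-end UI could render to show whether the exchange completed as expected.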
Referring to Fig. 3, an embodiment of this application discloses a specific data exchange method applied to an ETL control node. The method comprises:
Step S31: obtain back-end database information of the distributed database.
In this embodiment, the ETL tool needs to extract data from the distributed database. The ETL control node sends information about the data to be extracted to the middleware of the distributed database. After parsing this information, the middleware uses its own sharding rules and routing rules, together with the data information, to determine the back-end database information of the distributed database, which includes the number of back-end databases, the connection information of the back-end databases, and the permission information of the back-end databases, and returns the back-end database information to the ETL control node.
Step S32: according to the back-end database information, control the ETL working nodes to create data exchange task threads equal in number to the back-end databases.
Step S33: control the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the back-end databases and concurrently load the target data directly into the destination database.
It can be understood that, in this embodiment, the ETL tool needs to extract data from the distributed database, that is, to perform a read operation on the distributed database, so the back-end databases of the distributed database are the source databases.
Step S34: when a data exchange task fails, determine whether the failure of the data exchange task is caused by a failure of the data exchange task thread.
Step S35: if so, control the ETL working node corresponding to the data exchange task to create a new data exchange task thread, perform a rollback operation on the fault data in the destination database, and then re-extract the data corresponding to the fault data from the back-end database, until the data exchange task is completed.
Step S36: summarize the total data exchange information of the data exchange threads, and control the total data exchange information to be displayed.
When the ETL tool needs to extract data from the distributed database, the ETL control node, according to the obtained back-end database information, controls the ETL working nodes to create data exchange task threads equal in number to the back-end databases and to execute those threads, so that target data is concurrently extracted directly from the back-end databases and concurrently loaded directly into the destination database. The data exchange task threads do not pass through the middleware, so the single-point performance bottleneck of the middleware does not arise. As shown in Fig. 4, ETL control node F11 controls ETL working nodes G11, G12 and G13 to each create one data exchange task thread; these threads concurrently extract target data directly from back-end databases E11, E12 and E13, respectively, and concurrently load the target data directly into destination database H11. In the figure, middleware K11 is connected to ETL control node F11 by a dashed line, which represents the interaction through which ETL control node F11 obtains the back-end database information from middleware K11.
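The read flow of steps S31 to S33 can be sketched as follows. This is only an illustrative Python sketch under stated assumptions, not the implementation of the embodiment: the back-end databases and the destination database are simulated with in-memory lists, and the names `BackendInfo`, `DestinationDb`, `exchange_task` and `run_read_exchange` are hypothetical. It shows one data exchange task thread being created per back-end database, with each thread reading its back end directly rather than through the middleware.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class BackendInfo:
    """Hypothetical stand-in for one entry of the back-end database information."""

    def __init__(self, name, rows):
        self.name = name    # plays the role of the connection information
        self.rows = rows    # rows held by this simulated back-end database


class DestinationDb:
    """Simulated destination database; the lock makes concurrent loads safe."""

    def __init__(self):
        self.rows = []
        self._lock = Lock()

    def load(self, batch):
        with self._lock:
            self.rows.extend(batch)


def exchange_task(backend, destination):
    """One data exchange task thread: extract from one back end, load directly."""
    batch = list(backend.rows)   # direct extraction, no middleware involved
    destination.load(batch)
    return backend.name, len(batch)


def run_read_exchange(backends, destination):
    """Create as many task threads as there are back-end databases and run them."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        futures = [pool.submit(exchange_task, b, destination) for b in backends]
        return dict(f.result() for f in futures)
```

With three simulated back ends labelled after E11, E12 and E13 in Fig. 4, `run_read_exchange` returns the number of rows each thread moved, and the destination ends up holding the union of all back-end rows.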
Referring to Fig. 5, an embodiment of the present application discloses a specific data exchange method, applied to an ETL control node, the method comprising:
Step S41: obtain the back-end database information of the distributed database.
In this embodiment, the ETL tool needs to write data into the distributed database. The ETL control node sends information about the data to be written to the middleware of the distributed database. After parsing this data information, the middleware combines it with its own sharding rules and routing rules to obtain the back-end database information of the distributed database, and returns the back-end database information to the ETL control node. The back-end database information includes the number of back-end databases, the connection information of the back-end databases, the sharding rule information of the back-end databases and the permission information of the back-end databases.
Step S42: according to the back-end database information, control the ETL working nodes to create data exchange task threads equal in number to the back-end databases.
Step S43: control the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the source database and concurrently load the target data directly into the back-end databases.
It can be understood that, in this embodiment, the ETL tool needs to write data into the distributed database, that is, to perform a write operation on the distributed database, so the back-end databases of the distributed database are the destination databases. Specifically, the sharding rules obtained in step S41 are converted into extraction rules, target data is concurrently extracted directly from the source database according to those rules, and the target data is written directly into the back-end database corresponding to each sharding rule.
Step S44: when a data exchange task fails, determine whether the failure of the data exchange task is caused by a failure of the data exchange task thread.
Step S45: if so, control the ETL working node corresponding to the data exchange task to create a new data exchange task thread, perform a rollback operation on the fault data in the back-end database, and then re-extract the data corresponding to the fault data from the source database, until the data exchange task is completed.
Step S46: summarize the total data exchange information of the data exchange threads, and control the total data exchange information to be displayed.
When the ETL tool needs to write data into the distributed database, the ETL control node, according to the obtained back-end database information, controls the ETL working nodes to create data exchange task threads equal in number to the back-end databases and to execute those threads, so that target data is concurrently extracted directly from the source database and concurrently loaded directly into the back-end databases of the distributed database. The data exchange task threads do not pass through the middleware, so the single-point performance bottleneck of the middleware does not arise. As shown in Fig. 6, ETL control node F21 controls ETL working nodes G21, G22 and G23 to each create one data exchange task thread; these threads concurrently extract target data directly from source database H21 and concurrently load the target data directly into back-end databases E21, E22 and E23, respectively. In the figure, middleware K21 is connected to ETL control node F21 by a dashed line, which represents the interaction through which ETL control node F21 obtains the back-end database information from middleware K21.
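The conversion of a sharding rule into an extraction rule in the write flow can be illustrated with a short Python sketch. The embodiment does not specify the form of the sharding rule, so the sketch assumes a simple modulo rule on an integer `id` column; that rule and the names `make_extraction_rule`, `write_task` and `run_write_exchange` are hypothetical, and the source and back-end databases are simulated with lists.

```python
from concurrent.futures import ThreadPoolExecutor


def make_extraction_rule(shard_index, shard_count):
    """Turn the assumed rule 'id % shard_count == shard_index' into a row filter."""
    def rule(row):
        return row["id"] % shard_count == shard_index
    return rule


def write_task(source_rows, backend_rows, rule):
    """One task thread: extract the matching rows and load them into one back end."""
    selected = [row for row in source_rows if rule(row)]
    backend_rows.extend(selected)   # each thread writes to its own back end only
    return len(selected)


def run_write_exchange(source_rows, shard_count):
    """Run one task thread per back-end database, each with its own extraction rule."""
    backends = [[] for _ in range(shard_count)]
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        futures = [
            pool.submit(write_task, source_rows, backends[i],
                        make_extraction_rule(i, shard_count))
            for i in range(shard_count)
        ]
        counts = [f.result() for f in futures]
    return backends, counts
```

Because each thread extracts only the rows that its own back end is responsible for, the rows land on the correct back end without a routing step in the middleware, which is the point the paragraph above makes.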
Referring to Fig. 7, an embodiment of the present application discloses a data exchange apparatus, comprising:
an information obtaining module 11, configured to obtain the back-end database information of the distributed database;
a thread creation module 12, configured to control, according to the back-end database information, the ETL working nodes to create data exchange task threads equal in number to the back-end databases;
a data exchange control module 13, configured to control the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the source database and concurrently load the target data directly into the destination database.
It can be seen that the present application obtains the back-end database information of the distributed database; then, according to the back-end database information, controls the ETL working nodes to create data exchange task threads equal in number to the back-end databases; and controls the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the source database and concurrently load the target data directly into the destination database. Thus, after obtaining the back-end database information of the distributed database, the present application can control the ETL working nodes to create data exchange task threads equal in number to the back-end databases and to execute the multiple data exchange task threads simultaneously, so that the source database and the destination database are accessed directly and concurrently. The data exchange task threads do not pass through the middleware and are therefore not subject to its single-point performance limit, which enables concurrent reading and writing of data and improves the data exchange performance of the distributed database.
In this embodiment, the data exchange apparatus further comprises:
an information summarizing module, configured to summarize the total data exchange information of the data exchange threads and to control the total data exchange information to be displayed.
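As an illustration of what the information summarizing module might do, the following Python sketch aggregates per-thread reports into a single summary. The application does not state which fields the total data exchange information contains, so the fields used here (`thread_id`, `rows_exchanged`, `succeeded`) and the names `ThreadReport` and `summarize` are assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass
class ThreadReport:
    """Hypothetical per-thread result reported back to the ETL control node."""
    thread_id: str
    rows_exchanged: int
    succeeded: bool


def summarize(reports):
    """Combine the per-thread reports into one total data exchange summary."""
    return {
        "threads": len(reports),
        "failed_threads": sum(1 for r in reports if not r.succeeded),
        "total_rows": sum(r.rows_exchanged for r in reports),
    }
```

The resulting dictionary is the kind of aggregate that the control node could pass on for display.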
Further, an embodiment of the present application also discloses data exchange equipment. The data exchange equipment may be the ETL control node 20 shown in Fig. 8, and the ETL control node may specifically include, but is not limited to, a computer or the like.
In general, the ETL control node 20 in this embodiment includes a processor 21 and a memory 22.
The memory 22 is configured to store a computer program;
the processor 21 is configured to execute the computer program, so as to implement the data exchange method disclosed in the foregoing embodiments.
The processor 21 may include one or more processing cores, for example a four-core processor or an eight-core processor. The processor 21 may be implemented in at least one hardware form among a DSP (digital signal processor), an FPGA (field-programmable gate array) and a PLA (programmable logic array). The processor 21 may also include a main processor and a coprocessor: the main processor, also called a CPU (central processing unit), is a processor for processing data in the awake state; the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 21 may integrate a GPU (graphics processing unit), which is responsible for rendering and drawing the content to be shown on the display screen. In some embodiments, the processor 21 may include an AI (artificial intelligence) processor, which handles computing operations related to machine learning.
The memory 22 may include one or more computer-readable storage media, which may be non-transitory. The memory 22 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory devices. In this embodiment, the memory 22 is at least used to store the following computer program 221, which, after being loaded and executed by the processor 21, can implement the method steps executed by the ETL control node as disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 22 may also include an operating system 222, data 223 and the like, and the storage may be transient or permanent. The operating system 222 may be Windows, Unix, Linux or the like. The data 223 may include various kinds of data.
In some embodiments, the ETL control node 20 may further include a display screen 23, an input/output interface 24, a communication interface 25, a sensor 26, a power supply 27 and a communication bus 28.
Those skilled in the art will understand that the structure shown in Fig. 8 does not limit the ETL control node 20, which may include more or fewer components than illustrated.
Further, an embodiment of the present application also discloses a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the following steps:
obtaining the back-end database information of the distributed database; according to the back-end database information, controlling the ETL working nodes to create data exchange task threads equal in number to the back-end databases; and controlling the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the source database and concurrently load the target data directly into the destination database.
It can be seen that the present application obtains the back-end database information of the distributed database; then, according to the back-end database information, controls the ETL working nodes to create data exchange task threads equal in number to the back-end databases; and controls the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the source database and concurrently load the target data directly into the destination database. Thus, after obtaining the back-end database information of the distributed database, the present application can control the ETL working nodes to create data exchange task threads equal in number to the back-end databases and to execute the multiple data exchange task threads simultaneously, so that the source database and the destination database are accessed directly and concurrently. The data exchange task threads do not pass through the middleware and are therefore not subject to its single-point performance limit, which enables concurrent reading and writing of data and improves the data exchange performance of the distributed database.
In this embodiment, when the computer subprogram stored in the computer-readable storage medium is executed by a processor, the following step may be specifically implemented: when data needs to be read from the distributed database, obtaining back-end database information that includes the number of back-end databases, the connection information of the back-end databases and the permission information of the back-end databases.
In this embodiment, when the computer subprogram stored in the computer-readable storage medium is executed by a processor, the following step may be specifically implemented: when data needs to be written into the distributed database, obtaining back-end database information that includes the number of back-end databases, the sharding rule information of the back-end databases, the connection information of the back-end databases and the permission information of the back-end databases.
In this embodiment, when the computer subprogram stored in the computer-readable storage medium is executed by a processor, the following step may be specifically implemented: when data needs to be read from the distributed database, controlling the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the back-end databases and concurrently load the target data directly into the destination database.
In this embodiment, when the computer subprogram stored in the computer-readable storage medium is executed by a processor, the following step may be specifically implemented: when data needs to be written into the distributed database, controlling the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the source database and concurrently load the target data directly into the back-end databases.
In this embodiment, when the computer subprogram stored in the computer-readable storage medium is executed by a processor, the following step may be specifically implemented: when data needs to be written into the distributed database, controlling the ETL working nodes to execute the data exchange task threads, converting the sharding rules of the back-end databases into extraction rules, concurrently extracting target data directly from the source database, and concurrently loading the target data directly into the back-end databases.
In this embodiment, when the computer subprogram stored in the computer-readable storage medium is executed by a processor, the following steps may be specifically implemented: when a data exchange task fails, determining whether the failure of the data exchange task is caused by a failure of the data exchange task thread; if so, controlling the ETL working node corresponding to the data exchange task to create a new data exchange task thread, performing a rollback operation on the fault data in the destination database, and then re-extracting the data corresponding to the fault data from the source database, until the data exchange task is completed.
In this embodiment, when the computer subprogram stored in the computer-readable storage medium is executed by a processor, the following steps may be specifically implemented: if the failure of the data exchange task is caused by a failure of the data exchange task thread, determining whether the failure of the data exchange task is caused by a failure of the ETL working node; if so, controlling an ETL working node that is in a normal working state to create a new data exchange task thread, performing a rollback operation on the fault data in the destination database, and then re-extracting the data corresponding to the fault data from the source database, until the data exchange task is completed.
In this embodiment, when the computer subprogram stored in the computer-readable storage medium is executed by a processor, the following step may be specifically implemented: controlling the ETL working node to convert the sharding rule corresponding to the data exchange task into a selection rule, so as to filter out the fault data in the destination database, and deleting the fault data.
In this embodiment, when the computer subprogram stored in the computer-readable storage medium is executed by a processor, the following step may be specifically implemented: summarizing the total data exchange information of the data exchange threads, and controlling the total data exchange information to be displayed.
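The fault handling described above (create a new task thread, roll back the fault data selected by a rule derived from the sharding rule, then re-extract) can be sketched in Python as follows. This is a simplified single-threaded illustration under assumptions: the databases are in-memory lists, a task thread failure is simulated by raising `RuntimeError`, and the names `rollback`, `run_with_recovery` and `make_flaky_task` are hypothetical.

```python
def rollback(destination_rows, selection_rule):
    """Delete from the destination every row selected by the rule (the fault data)."""
    destination_rows[:] = [row for row in destination_rows if not selection_rule(row)]


def run_with_recovery(task, source_rows, destination_rows, selection_rule,
                      max_attempts=3):
    """Run a task; on failure, roll back its partial output and start a new attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            task(source_rows, destination_rows, selection_rule)
            return attempt                       # number of attempts that were needed
        except RuntimeError:
            rollback(destination_rows, selection_rule)
    raise RuntimeError("data exchange task did not complete")


def make_flaky_task(fail_times):
    """Build a demo task that fails part-way through its first `fail_times` runs."""
    state = {"left": fail_times}

    def task(source_rows, destination_rows, rule):
        matching = (row for row in source_rows if rule(row))
        for n, row in enumerate(matching):
            if state["left"] > 0 and n == 1:
                state["left"] -= 1
                raise RuntimeError("simulated task thread failure")
            destination_rows.append(row)

    return task
```

Note that this simple rollback deletes every destination row matching the selection rule, so it is only safe when the rows covered by that rule in the destination all originate from the failed task; a real system would need to account for rows that existed before the task started.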
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
The data exchange method, apparatus, equipment and medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, those of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A data exchange method, characterized in that it is applied to an ETL control node and comprises:
obtaining the back-end database information of a distributed database;
according to the back-end database information, controlling ETL working nodes to create data exchange task threads equal in number to the back-end databases;
controlling the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from a source database and concurrently load the target data directly into a destination database.
2. The data exchange method according to claim 1, characterized in that obtaining the back-end database information of the distributed database comprises:
when data needs to be read from the distributed database, obtaining back-end database information that includes the number of back-end databases, the connection information of the back-end databases and the permission information of the back-end databases;
or, when data needs to be written into the distributed database, obtaining back-end database information that includes the number of back-end databases, the sharding rule information of the back-end databases, the connection information of the back-end databases and the permission information of the back-end databases.
3. The data exchange method according to claim 2, characterized in that controlling the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the source database and concurrently load the target data directly into the destination database, comprises:
when data needs to be read from the distributed database, controlling the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the back-end databases and concurrently load the target data directly into the destination database;
or, when data needs to be written into the distributed database, controlling the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the source database and concurrently load the target data directly into the back-end databases.
4. The data exchange method according to claim 3, characterized in that controlling the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the source database and concurrently load the target data directly into the destination database, comprises:
when data needs to be written into the distributed database, controlling the ETL working nodes to execute the data exchange task threads, converting the sharding rules of the back-end databases into extraction rules, concurrently extracting target data directly from the source database, and concurrently loading the target data directly into the back-end databases.
5. The data exchange method according to claim 1, characterized in that, in the process of controlling the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from the source database and concurrently load the target data directly into the destination database, the method further comprises:
when a data exchange task fails, determining whether the failure of the data exchange task is caused by a failure of the data exchange task thread;
if so, controlling the ETL working node corresponding to the data exchange task to create a new data exchange task thread, performing a rollback operation on the fault data in the destination database, and then re-extracting the data corresponding to the fault data from the source database, until the data exchange task is completed.
6. The data exchange method according to claim 5, characterized in that determining whether the failure of the data exchange task is caused by a failure of the data exchange task thread comprises:
if the failure of the data exchange task is caused by a failure of the data exchange task thread, determining whether the failure of the data exchange task is caused by a failure of the ETL working node;
if so, controlling an ETL working node that is in a normal working state to create a new data exchange task thread, performing a rollback operation on the fault data in the destination database, and then re-extracting the data corresponding to the fault data from the source database, until the data exchange task is completed.
7. The data exchange method according to claim 6, characterized in that performing the rollback operation on the fault data in the destination database comprises:
controlling the ETL working node to convert the sharding rule corresponding to the data exchange task into a selection rule, so as to filter out the fault data in the destination database, and deleting the fault data.
8. A data exchange apparatus, characterized by comprising:
an information obtaining module, configured to obtain the back-end database information of a distributed database;
a thread creation module, configured to control, according to the back-end database information, ETL working nodes to create data exchange task threads equal in number to the back-end databases;
a data exchange control module, configured to control the ETL working nodes to execute the data exchange task threads, so as to concurrently extract target data directly from a source database and concurrently load the target data directly into a destination database.
9. Data exchange equipment, comprising:
a memory and a processor;
wherein the memory is configured to store a computer program;
the processor is configured to execute the computer program, so as to implement the data exchange method according to any one of claims 1 to 7.
10. A computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the data exchange method according to any one of claims 1 to 7.
CN201910779089.9A 2019-08-22 2019-08-22 Data exchange method, device, equipment and medium Active CN110471977B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910779089.9A CN110471977B (en) 2019-08-22 2019-08-22 Data exchange method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN110471977A true CN110471977A (en) 2019-11-19
CN110471977B CN110471977B (en) 2022-04-22

Family

ID=68512813

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910779089.9A Active CN110471977B (en) 2019-08-22 2019-08-22 Data exchange method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN110471977B (en)

Also Published As

Publication number Publication date
CN110471977B (en) 2022-04-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant