CN103763368A - Cross-data-center data synchronism method - Google Patents

Cross-data-center data synchronism method

Info

Publication number
CN103763368A
Authority
CN
China
Prior art keywords
data
data center
log
module
master
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410023373.0A
Other languages
Chinese (zh)
Other versions
CN103763368B (en)
Inventor
王恩东
文中领
张立强
袁冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN201410023373.0A priority Critical patent/CN103763368B/en
Publication of CN103763368A publication Critical patent/CN103763368A/en
Priority to PCT/CN2015/070416 priority patent/WO2015106656A1/en
Application granted granted Critical
Publication of CN103763368B publication Critical patent/CN103763368B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/273Asynchronous replication or reconciliation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a cross-data-center data synchronization method. The method comprises the steps of writing data and recording a log, scheduling and pushing the synchronization, replaying the log to complete data synchronization, and performing cross-data-center data access, thereby achieving asynchronous data synchronization. Compared with the prior art, the method achieves asynchronous data synchronization across data centers and improves data safety. It makes effective use of I/O resources within data centers and network resources between data centers, is practical, and is easy to popularize.

Description

A cross-data-center data synchronization method
Technical field
The present invention relates to the technical field of computer data storage, and specifically to a cross-data-center data synchronization method.
Background
With the arrival of the Internet era, interactive websites aimed at general Internet users, such as social networks, microblogs, and location-based services, are growing rapidly. Sites such as Google, Facebook, and Twitter, as well as Renren and microblogging services in China, provide interactive services over the Internet and wireless networks to hundreds of millions of users. Internet users around the world interact in many ways every day and generate data continuously; the volume of this data is several times that of the stand-alone computing era.
To store this data, Internet companies have built large data centers around the world, with the number of hosts in a single data center ranging from hundreds to tens of thousands. According to information from Google, the company has dozens of data centers and more than ten million servers worldwide to store the massive data its global users generate every day. Managing and using this data is a major challenge. It covers read and storage interfaces, indexing and addressing, configuration and management, and data replication between data centers. Among these, the need to support and research data synchronization across multiple data centers is particularly urgent.
Research on mass data storage is still in its infancy, and data synchronization between data centers leaves much to study and improve. Take HBase as an example: HBase replication relies on a Master/Slave architecture, and only in version 0.90.0 did it add a simple feature for replicating data between two data centers. Its replication tasks have no priority queue, and it performs no unified scheduling based on data center load. In addition, traditional cross-data-center synchronization algorithms mainly transmit and overwrite whole blocks of data, which consumes large amounts of network and I/O resources.
To address this situation, this patent proposes a cross-data-center data synchronization method based on log replay.
Summary of the invention
The technical task of the present invention is to remedy the deficiencies of the prior art by providing a cross-data-center data synchronization method.
The technical solution of the present invention is implemented as follows. The specific implementation process of the cross-data-center data synchronization method is:
Step 1. Data writing and log recording: a log recording module runs in the primary data center. When the primary data center receives a data request from a client, this module records the requested operation in the primary data center as a log entry. The module is integrated into the business flow of the primary data center as an embedded component or a plug-in.
Step 2. Synchronization scheduling and push: a scheduling module runs in the primary data center and is responsible for scheduling data replay operations. It triggers log push and replay operations according to the load of the primary data center, the load of the backup data center, and the scheduling policy. The push operation requested by the scheduling module is carried out by a log push module, which runs in the primary data center and transmits the data operation logs to the backup data center.
Step 3. Log replay to complete data synchronization: the data operation logs pushed by the log push module of the primary data center are received by a log replay module. This module runs in the backup data center and replays the data operation logs there, synchronizing the data of the two data centers.
Step 4. Cross-data-center data access is carried out, achieving asynchronous data synchronization.
The detailed process of step 1 is as follows. The client identifies, from its local configuration, the primary data center that holds the user's data and sends all data operations to that primary data center, where they are handled by its data nodes. After receiving a client request, the primary data center executes the user's operation according to the operation and content of the request. During this process, the log recording module intercepts the request to capture the requested operation and its related data. The log recording module determines whether the operation modifies data in the data center. If it does, the operation must be included in cross-data-center synchronization, and the log recording module saves the requested operation and related data, in a dedicated log format, to the asynchronous log recording area of the primary data center. Everything in this area is content to be replayed across data centers.
The detailed process of step 2 is as follows. The scheduling module running in the primary data center first monitors the following conditions:
1) the number of log entries in the asynchronous log recording area and the amount of data they involve;
2) the load of the primary data center, including network I/O and disk I/O;
3) the load of the backup data center, including network I/O and disk I/O.
When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered. The log push operation is carried out by the log push module, which writes the data operation logs in the asynchronous log recording area of the primary data center to the asynchronous log execution area of the backup data center. When the log push module has finished transmitting the logs across data centers, it notifies the scheduling module, which then drives the log replay module of the backup data center to replay the logs.
The detailed process of step 3 is as follows. After receiving the notification from the scheduling module, the log replay module running in the backup data center starts replaying the logs. It reads the data logs stored in the asynchronous log execution area, decodes their content to obtain the operation and related data for each log entry, and then executes that operation again on the relevant nodes of the backup data center. This makes the data in the backup data center consistent with the data in the primary data center and completes cross-data-center synchronization.
In step 4, the client obtains data by accessing the backup data center in the following two situations: when the client cannot connect to the primary data center; and when the client can connect to the primary data center but the primary data center is busy.
Compared with the prior art, the present invention has the following beneficial effects:
The cross-data-center data synchronization method of the present invention achieves asynchronous data synchronization across data centers and improves data safety. When users cannot access the primary data center, they can still obtain data from the backup data center. Because the replay process transmits only the differences in the data rather than the data itself, the method also reduces the amount of data transmitted and the inter-data-center bandwidth consumed by synchronization. In addition, the scheduling module schedules according to data center load, making effective use of I/O resources within each data center and network resources between data centers, and providing load balancing. The method is practical and easy to popularize.
Brief description of the drawings
Figure 1 is a schematic diagram of the implementation process of the present invention.
Detailed description
The cross-data-center data synchronization method of the present invention is described in detail below with reference to the accompanying drawing.
As shown in Figure 1, the cross-data-center data synchronization method achieves asynchronous data synchronization between data centers by replaying data operation logs. Its specific implementation process is as follows.
First, the following modules are implemented in software:
(1) Log recording module. Runs in the primary data center. When the primary data center receives a data request from a client, this module records the requested operation in the primary data center as a log entry. The module is integrated into the business flow of the primary data center as an embedded component or a plug-in.
(2) Scheduling module. Runs in the primary data center and is responsible for scheduling data replay operations. It triggers log push and replay operations according to information such as the load of the primary data center, the load of the backup data center, and the scheduling policy.
(3) Log push module. Runs in the primary data center and carries out the push operations requested by the scheduling module, transmitting the data operation logs to the backup data center.
(4) Log replay module. Runs in the backup data center. It receives the data operation logs pushed by the log push module of the primary data center and replays them in the backup data center, synchronizing the data of the two data centers.
These modules perform the following operations:
Step 1. Data writing and log recording.
Under normal conditions, the client identifies, from its local configuration, the primary data center that holds the user's data and sends all data operations, including reads, writes, and deletes, to the primary data center, where they are handled by its data nodes.
After receiving a client request, the primary data center executes the user's operation according to the operation and content of the request. During this process, the log recording module intercepts the request to capture the requested operation and its related data.
The log recording module determines whether the operation modifies data in the data center. If it does, the operation must be included in cross-data-center synchronization. In that case the log recording module saves the requested operation and related data, in a dedicated log format, to the asynchronous log recording area of the primary data center. Everything in this area is content to be replayed across data centers.
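The recording step above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the `MUTATING_OPS` set, the in-memory list standing in for the asynchronous log recording area, and the JSON encoding standing in for the dedicated log format are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field

# Hypothetical set of operations that modify data; the patent does not enumerate them.
MUTATING_OPS = {"write", "delete"}


@dataclass
class LogRecorder:
    """In-memory stand-in for the asynchronous log recording area (assumption)."""
    async_log_area: list = field(default_factory=list)
    next_seq: int = 1

    def intercept(self, op, key, value=None):
        """Record the request if it modifies data; return True if an entry was written."""
        if op not in MUTATING_OPS:
            return False  # read-only requests are not replayed across data centers
        entry = {"seq": self.next_seq, "op": op, "key": key, "value": value}
        # JSON is an assumed stand-in for the dedicated log format.
        self.async_log_area.append(json.dumps(entry))
        self.next_seq += 1
        return True


class PrimaryDataCenter:
    """Toy key-value store with the recorder embedded in its request path."""

    def __init__(self):
        self.store = {}
        self.recorder = LogRecorder()

    def handle(self, op, key, value=None):
        # Intercept first, then execute the requested operation locally.
        self.recorder.intercept(op, key, value)
        if op == "write":
            self.store[key] = value
            return None
        if op == "delete":
            self.store.pop(key, None)
            return None
        if op == "read":
            return self.store.get(key)
        raise ValueError(f"unsupported operation: {op}")
```

In this sketch a read leaves the log area untouched, while each write or delete appends one sequenced entry that a later push can ship to the backup data center.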
Step 2. Synchronization scheduling and log push.
The scheduling module running in the primary data center monitors the following conditions:
1) The number of log entries in the asynchronous log recording area and the amount of data they involve.
2) The load of the primary data center, including network I/O and disk I/O.
3) The load of the backup data center, mainly network I/O and disk I/O.
When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered. The trigger condition is typically that condition 1) is high while conditions 2) and 3) are low.
The log push operation is carried out by the log push module, which writes the data operation logs in the asynchronous log recording area of the primary data center to the asynchronous log execution area of the backup data center.
When the log push module has finished transmitting the logs across data centers, it notifies the scheduling module. The scheduling module then drives the log replay module of the backup data center to replay the logs.
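One way to express the trigger condition (backlog high, both loads low) is sketched below in Python. The patent leaves the scheduling policy to the administrator, so the threshold fields, their default values, and the representation of load as a 0.0 to 1.0 utilisation figure are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Load:
    """Utilisation of a data center, each value in the range 0.0 to 1.0 (assumed scale)."""
    network_io: float
    disk_io: float


@dataclass
class SchedulingPolicy:
    """Hypothetical administrator-configured thresholds."""
    min_pending_entries: int = 100
    min_pending_bytes: int = 1_000_000
    max_primary_load: float = 0.7
    max_backup_load: float = 0.7


def should_push(pending_entries, pending_bytes, primary, backup, policy):
    """Return True when the backlog is high and both data centers are lightly loaded."""
    backlog_high = (
        pending_entries >= policy.min_pending_entries
        or pending_bytes >= policy.min_pending_bytes
    )
    primary_ok = max(primary.network_io, primary.disk_io) <= policy.max_primary_load
    backup_ok = max(backup.network_io, backup.disk_io) <= policy.max_backup_load
    return backlog_high and primary_ok and backup_ok
```

A scheduler loop could call `should_push` periodically and, when it returns True, ask the log push module to transfer the pending entries. Treating the busier of network I/O and disk I/O as the load of a data center is a design choice of this sketch, not something the source specifies.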
Step 3. Log replay.
After receiving the notification from the scheduling module, the log replay module running in the backup data center starts replaying the logs.
The log replay module reads the data logs stored in the asynchronous log execution area and decodes their content to obtain the operation and related data for each log entry. It then executes that operation again on the relevant nodes of the backup data center, making the data in the backup data center consistent with the data in the primary data center. This completes cross-data-center synchronization of the data.
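The decode-and-re-execute loop can be sketched in Python as follows. The JSON entry layout (`seq`, `op`, `key`, `value`), the dictionary standing in for the backup data nodes, and the `last_applied_seq` check that skips entries already replayed are assumptions added for illustration; the source does not describe the log format or how repeated deliveries are handled.

```python
import json


def replay(execution_area, backup_store, last_applied_seq=0):
    """Replay encoded log entries, in order, against the backup store.

    Returns the sequence number of the last entry applied so that a later
    call can skip entries that were already replayed (assumed behaviour).
    """
    for raw in execution_area:
        entry = json.loads(raw)  # decode the log content
        if entry["seq"] <= last_applied_seq:
            continue  # already applied in an earlier replay
        if entry["op"] == "write":
            backup_store[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            backup_store.pop(entry["key"], None)
        else:
            raise ValueError(f"unsupported operation: {entry['op']}")
        last_applied_seq = entry["seq"]
    return last_applied_seq
```

Because each entry carries the operation rather than a full copy of the data set, replaying the entries in sequence order reproduces the primary's state on the backup.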
Step 4. Cross-data-center data access.
The client may obtain data by accessing the backup data center in two situations:
First, when the client cannot connect to the primary data center. This may be because the primary data center has failed, or because the network between the primary data center and the client has been interrupted.
When this happens, the client attempts to read from the backup data center, and only read operations can be performed. In addition, because it is uncertain whether synchronization between the backup data center and the primary data center has completed, the client displays a notice informing the user that the data comes from the backup data center and may have consistency problems.
Second, when the client can connect to the primary data center but the primary data center is busy.
When this happens, if the client determines that the user's operation is read-only, it can ask the primary data center whether the data involved has been synchronized to the backup data center. If the primary data center confirms that synchronization is complete, the client can read the requested data from the backup data center. In this case the backup data center provides load balancing.
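The two access cases can be condensed into a small routing function. The Python sketch below uses hypothetical names (`Source`, `choose_source`) and returns a flag telling the client whether to warn the user about possible inconsistency; rejecting non-read operations when the primary is unreachable follows the statement that only reads are allowed in that case.

```python
from enum import Enum


class Source(Enum):
    PRIMARY = "primary"
    BACKUP = "backup"
    REJECTED = "rejected"


def choose_source(is_read, primary_reachable, primary_busy, primary_confirms_synced):
    """Return (source, may_be_stale) for one client operation."""
    if not primary_reachable:
        if is_read:
            # Case 1: read from the backup and warn that the data may be stale.
            return Source.BACKUP, True
        # Only read operations are allowed while the primary is unreachable.
        return Source.REJECTED, False
    if is_read and primary_busy and primary_confirms_synced:
        # Case 2: the primary confirmed the data is synchronized, so offload the read.
        return Source.BACKUP, False
    return Source.PRIMARY, False
```

For example, a read issued while the primary is busy but has not yet confirmed synchronization stays on the primary, which avoids serving stale data purely for the sake of load balancing.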
The cross-data-center data synchronization method of the present invention achieves asynchronous data synchronization across data centers and improves data safety. When users cannot access the primary data center, they can still obtain data from the backup data center. The method effectively reduces the volume of data that synchronization generates between data centers: because the replay process transmits only the differences in the data rather than the data itself, it reduces both the amount of data transmitted and the inter-data-center bandwidth consumed by synchronization. Scheduling can follow data center load, making effective use of I/O resources within each data center and network resources between data centers, and providing load balancing.
The above is only an embodiment of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (5)

  1. A cross-data-center data synchronization method, characterized in that its specific implementation process is:
    Step 1. Data writing and log recording: a log recording module runs in the primary data center. When the primary data center receives a data request from a client, this module records the requested operation in the primary data center as a log entry. The module is integrated into the business flow of the primary data center as an embedded component or a plug-in;
    Step 2. Synchronization scheduling and push: a scheduling module runs in the primary data center and is responsible for scheduling data replay operations. It triggers log push and replay operations according to the load of the primary data center, the load of the backup data center, and the scheduling policy. The push operation requested by the scheduling module is carried out by a log push module, which runs in the primary data center and transmits the data operation logs to the backup data center;
    Step 3. Log replay to complete data synchronization: the data operation logs pushed by the log push module of the primary data center are received by a log replay module. This module runs in the backup data center and replays the data operation logs there, synchronizing the data of the two data centers;
    Step 4. Cross-data-center data access is carried out, achieving asynchronous data synchronization.
  2. The cross-data-center data synchronization method according to claim 1, characterized in that the detailed process of step 1 is: the client identifies, from its local configuration, the primary data center that holds the user's data and sends all data operations to that primary data center, where they are handled by its data nodes; after receiving a client request, the primary data center executes the user's operation according to the operation and content of the request, and during this process the log recording module intercepts the request to capture the requested operation and its related data; the log recording module determines whether the operation modifies data in the data center, and if it does, the operation must be included in cross-data-center synchronization, in which case the log recording module saves the requested operation and related data, in a dedicated log format, to the asynchronous log recording area of the primary data center, everything in this area being content to be replayed across data centers.
  3. The cross-data-center data synchronization method according to claim 1 or 2, characterized in that the detailed process of step 2 is: the scheduling module running in the primary data center first monitors the following conditions,
    1) the number of log entries in the asynchronous log recording area and the amount of data they involve;
    2) the load of the primary data center, including network I/O and disk I/O;
    3) the load of the backup data center, including network I/O and disk I/O;
    When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered. The log push operation is carried out by the log push module, which writes the data operation logs in the asynchronous log recording area of the primary data center to the asynchronous log execution area of the backup data center; when the log push module has finished transmitting the logs across data centers, it notifies the scheduling module, which then drives the log replay module of the backup data center to replay the logs.
  4. The cross-data-center data synchronization method according to claim 3, characterized in that the detailed process of step 3 is: after receiving the notification from the scheduling module, the log replay module running in the backup data center starts replaying the logs; it reads the data logs stored in the asynchronous log execution area, decodes their content to obtain the operation and related data for each log entry, and then executes that operation again on the relevant nodes of the backup data center, making the data in the backup data center consistent with the data in the primary data center and completing cross-data-center synchronization of the data.
  5. The cross-data-center data synchronization method according to claim 4, characterized in that in step 4 the client obtains data by accessing the backup data center in the following two situations: when the client cannot connect to the primary data center; and when the client can connect to the primary data center but the primary data center is busy.
CN201410023373.0A 2014-01-20 2014-01-20 Cross-data-center data synchronization method Active CN103763368B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410023373.0A CN103763368B (en) 2014-01-20 2014-01-20 A kind of method of data synchronization across data center
PCT/CN2015/070416 WO2015106656A1 (en) 2014-01-20 2015-01-09 Cross-data-center data synchronization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410023373.0A CN103763368B (en) 2014-01-20 2014-01-20 A kind of method of data synchronization across data center

Publications (2)

Publication Number Publication Date
CN103763368A true CN103763368A (en) 2014-04-30
CN103763368B CN103763368B (en) 2016-07-06

Family

ID=50530527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410023373.0A Active CN103763368B (en) 2014-01-20 2014-01-20 Cross-data-center data synchronization method

Country Status (2)

Country Link
CN (1) CN103763368B (en)
WO (1) WO2015106656A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104219288A (en) * 2014-08-14 2014-12-17 中国南方电网有限责任公司超高压输电公司 Multi-thread based distributed data synchronism method and system thereof
CN104519130A (en) * 2014-12-16 2015-04-15 北京中交兴路车联网科技有限公司 Trans-IDC (internet data center) data sharing caching method
WO2015106656A1 (en) * 2014-01-20 2015-07-23 浪潮电子信息产业股份有限公司 Cross-data-center data synchronization method
CN104899278A (en) * 2015-05-29 2015-09-09 北京京东尚科信息技术有限公司 Method and apparatus for generating data operation logs of Hbase database
CN105610917A (en) * 2015-12-22 2016-05-25 腾讯科技(深圳)有限公司 Method and system for achieving repair of synchronous data in system
CN106557530A (en) * 2015-09-30 2017-04-05 腾讯科技(深圳)有限公司 Operation system, data recovery method and device
CN110290214A (en) * 2019-06-28 2019-09-27 苏州浪潮智能科技有限公司 A kind of transmitting data file method and system

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110750594B (en) * 2019-09-30 2023-05-30 上海视云网络科技有限公司 Real-time cross-network database synchronization method based on mysql incremental log

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1677931A (en) * 2004-04-02 2005-10-05 鸿富锦精密工业(深圳)有限公司 Network daily-record data management system and method
CN101043375A (en) * 2007-03-15 2007-09-26 华为技术有限公司 Distributed system journal collecting method and system
US20100179940A1 (en) * 2008-08-26 2010-07-15 Gilder Clark S Remote data collection systems and methods
US20120151250A1 (en) * 2010-12-14 2012-06-14 Hitachi, Ltd. Failure recovery method in information processing system and information processing system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102075556B (en) * 2009-11-19 2014-11-26 北京明朝万达科技有限公司 Method for designing service architecture with large-scale loading capacity
CN103500229B (en) * 2013-10-24 2017-04-19 北京奇虎科技有限公司 Database synchronization method and database system
CN103763368B (en) * 2014-01-20 2016-07-06 浪潮电子信息产业股份有限公司 Cross-data-center data synchronization method

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015106656A1 (en) * 2014-01-20 2015-07-23 浪潮电子信息产业股份有限公司 Cross-data-center data synchronization method
CN104219288A (en) * 2014-08-14 2014-12-17 中国南方电网有限责任公司超高压输电公司 Multi-thread based distributed data synchronism method and system thereof
CN104219288B (en) * 2014-08-14 2018-03-23 中国南方电网有限责任公司超高压输电公司 Distributed Data Synchronization method and its system based on multithreading
CN104519130A (en) * 2014-12-16 2015-04-15 北京中交兴路车联网科技有限公司 Trans-IDC (internet data center) data sharing caching method
CN104519130B (en) * 2014-12-16 2018-02-27 北京中交兴路车联网科技有限公司 A kind of data sharing caching method across IDC
CN104899278A (en) * 2015-05-29 2015-09-09 北京京东尚科信息技术有限公司 Method and apparatus for generating data operation logs of Hbase database
CN106557530A (en) * 2015-09-30 2017-04-05 腾讯科技(深圳)有限公司 Operation system, data recovery method and device
CN106557530B (en) * 2015-09-30 2019-10-11 腾讯科技(深圳)有限公司 Operation system, data recovery method and device
CN105610917A (en) * 2015-12-22 2016-05-25 腾讯科技(深圳)有限公司 Method and system for achieving repair of synchronous data in system
CN105610917B (en) * 2015-12-22 2019-12-20 腾讯科技(深圳)有限公司 Method and system for realizing synchronous data repair in system
CN110290214A (en) * 2019-06-28 2019-09-27 苏州浪潮智能科技有限公司 A kind of transmitting data file method and system

Also Published As

Publication number Publication date
WO2015106656A1 (en) 2015-07-23
CN103763368B (en) 2016-07-06

Similar Documents

Publication Publication Date Title
CN103763368B (en) Cross-data-center data synchronization method
CN105335513B (en) A kind of distributed file system and file memory method
CN101877783B (en) Network video recorder clustering video monitoring system and method
CN107895253A (en) A kind of method that electricity transaction function carries out micro services transformation
CN106446159B (en) A kind of method of storage file, the first virtual machine and name node
CN105912428B (en) Realize that source data is converted into the system and method for virtual machine image in real time
CN103455577A (en) Multi-backup nearby storage and reading method and system of cloud host mirror image file
CN103647797A (en) Distributed file system and data access method thereof
CN104899274B (en) A kind of memory database Efficient Remote access method
US9557938B2 (en) Data retrieval based on storage device activation schedules
CN105892943A (en) Access method and system for block storage data in distributed storage system
WO2020173248A1 (en) Data synchronization method and device, terminal, and storage medium
CN106303428A (en) A kind of security protection cloud platform
CN105283847A (en) Local store data versioning
US20130031221A1 (en) Distributed data storage system and method
CN105828017B (en) A kind of cloud storage access system and method towards video conference
CN103561033B (en) User remotely accesses the device and method of HDFS cluster
CN102143228A (en) Cloud storage system, cloud client and method for realizing storage area network service
CN101370027A (en) Network storage system, method and application server
CN102982182A (en) Data storage planning method and device
CN110083306A (en) A kind of distributed objects storage system and storage method
CN110099084A (en) A kind of method, system and computer-readable medium guaranteeing storage service availability
CN102820998B (en) Realize the dual computer fault-tolerant service system towards office application and date storage method thereof
CN103036952B (en) A kind of enterprise-level isomery merges storage management system
CN109451079A (en) A kind of cloud USB flash disk and its storage method and storage system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant