CN112559473B - Priority-based two-way synchronization method and system - Google Patents

Priority-based two-way synchronization method and system

Info

Publication number
CN112559473B
Authority
CN
China
Prior art keywords
target
synchronized
object file
task queue
target operation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011446567.3A
Other languages
Chinese (zh)
Other versions
CN112559473A (en)
Inventor
刘启春
余院兰
孙峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Dream Database Co ltd
Original Assignee
Wuhan Dream Database Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Dream Database Co ltd filed Critical Wuhan Dream Database Co ltd
Priority to CN202011446567.3A priority Critical patent/CN112559473B/en
Publication of CN112559473A publication Critical patent/CN112559473A/en
Application granted granted Critical
Publication of CN112559473B publication Critical patent/CN112559473B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/178Techniques for file synchronisation in file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1873Versioning file systems, temporal file systems, e.g. file system supporting different historic versions of files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/52Program synchronisation; Mutual exclusion, e.g. by means of semaphores

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a priority-based two-way synchronization method and system. The two-way synchronization method comprises the following steps: acquiring an operation to be synchronized from a source end, and attributing the operation to be synchronized to a first task queue or a second task queue according to its operation type; a first execution thread acquires a first target operation from the first task queue, parses the first target operation to obtain the operation object related to the first target operation, and updates the version number of the object file of the operation object based on the log sequence number of the first target operation; a second execution thread acquires a second target operation from the second task queue and parses the second target operation to obtain an object identification number ID1 of the second target operation and a log serial number SCN1 of the second target operation; and the version number of the target object file is obtained based on the object identification number ID1, and synchronization is performed selectively according to the size relationship between the log serial number SCN1 and the version number of the target object file.

Description

Priority-based two-way synchronization method and system
Technical Field
The invention belongs to the technical field of data synchronization, and particularly relates to a priority-based two-way synchronization method and a priority-based two-way synchronization system.
Background
Synchronization performance is a key index of a real-time database data synchronization system. In practice, the upper-layer application of a database often generates (create table/insert) intermediate tables or intermediate data to store intermediate result sets or to process some data, and after these tables or data have been used, the upper-layer application deletes them (drop table) or clears them (truncate table).
Currently, database synchronization software synchronizes all data operations of a source database to a destination database strictly in the execution order of the source database, taking a source database transaction as the unit. In a system that performs a large amount of statistical analysis on database data, this has two drawbacks. On the one hand, operations on a large number of temporary tables, intermediate tables or intermediate data degrade the performance of the real-time data synchronization system; in particular, when many temporary tables and intermediate tables without primary keys or unique indexes are operated on, the synchronization speed is seriously slowed down. On the other hand, these data operations are useless for the destination database: over a given period of time, the net effect of synchronizing them on the actual data of the destination database is zero.
In view of this, overcoming the deficiencies of the prior art products is an urgent problem to be solved in the art.
Disclosure of Invention
Aiming at the defects or improvement requirements of the prior art, the invention provides a priority-based two-way synchronization method and a priority-based two-way synchronization system. The aim is to solve the problem that operations on a large number of temporary tables, intermediate tables or intermediate data degrade the performance of a real-time data synchronization system and seriously slow down the synchronization speed. Because the first execution thread and the second execution thread synchronize along two paths, and the version number of an object file is associated with the log serial number of the DDL operation corresponding to that object file, the version number of the object file indirectly reflects the execution status of the DDL operations related to the object. DML operations are then selectively synchronized according to the size relationship between the log serial number of the DML operation and the version number of the object file, which does not affect the consistency of synchronization and also accelerates the synchronization progress.
To achieve the above object, according to one aspect of the present invention, there is provided a two-way synchronization method based on priority, the two-way synchronization method including:
acquiring an operation to be synchronized from a source end, and attributing the operation to be synchronized to a first task queue or a second task queue according to the operation type of the operation to be synchronized, wherein the priority of the first task queue is higher than that of the second task queue;
a first execution thread acquires a first target operation from the first task queue, analyzes the first target operation to obtain an operation object related to the first target operation, and updates the version number of an object file of the operation object based on the log sequence number of the first target operation;
a second execution thread acquires a second target operation from the second task queue and analyzes the second target operation to obtain an object identification number ID1 of the second target operation and a log serial number SCN1 of the second target operation;
and acquiring a target object file based on the object identification number ID1, further acquiring the version number of the target object file, and selectively synchronizing according to the size relationship between the log serial number SCN1 and the version number of the target object file.
Preferably, the selectively synchronizing according to the size relationship between the log serial number SCN1 and the version number of the target object file includes:
judging whether the log serial number SCN1 is larger than the version number of the target object file;
and if the log serial number SCN1 is greater than the version number of the target object file, synchronizing the second target operation.
Preferably, after the determining whether the log serial number SCN1 is greater than the version number of the target object file, the method further includes:
and if the log serial number SCN1 is smaller than the version number of the target object file, discarding the second target operation.
Preferably, the object file of each operation object comprises the reference times of the object file and the object identification number of the operation object;
the two-way synchronization method further comprises the following steps:
the second execution thread acquires a target object file based on the object identification number ID1, and adds one to the reference times of the target object file;
and after the transaction to which the second target operation belongs is finished, subtracting one from the reference times of the target object file.
Preferably, the two-way synchronization method further comprises:
the first execution thread judges whether the reference times of the operation object related to the first target operation are 0 or not;
and if the reference times of the operation objects related to the first target operation are 0, executing the first target operation.
Preferably, the first execution thread, after determining whether the number of references of the operation object related to the first target operation is 0, further includes:
and if the reference times of the operation objects related to the first target operation are not 0, waiting until the reference times are 0, and then executing the first target operation.
Preferably, the two-way synchronization method further comprises:
after a first execution thread acquires a first target operation from the first task queue, setting a first synchronization mark for the first target operation, wherein the first synchronization mark indicates that the first target operation is not synchronized to a destination;
after the first target operation is synchronized to a destination, setting a second synchronization mark for the first target operation, wherein the second synchronization mark indicates that the first target operation is synchronized to the destination;
the first execution thread obtains a next operation from the first task queue.
Preferably, the obtaining the operation to be synchronized from the source end, and attributing the operation to be synchronized to the first task queue or the second task queue according to the operation type of the operation to be synchronized includes:
acquiring an operation to be synchronized from a source end, and analyzing the operation to be synchronized to obtain an operation type of the operation to be synchronized;
if the operation to be synchronized is a DROP operation or a TRUNCATE operation, adding the operation to be synchronized to the first task queue;
and if the operation to be synchronized is a DML operation, adding the operation to be synchronized to the second task queue.
Preferably, the operation object includes a table.
To achieve the above object, according to another aspect of the present invention, there is provided a synchronization system including at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor programmed to perform the two-way synchronization method of the present invention.
Generally, compared with the prior art, the technical scheme of the invention has the following beneficial effects: the invention provides a priority-based two-way synchronization method and a priority-based two-way synchronization system, wherein the two-way synchronization method comprises the following steps: acquiring an operation to be synchronized from a source end, and attributing the operation to be synchronized to a first task queue or a second task queue according to the operation type of the operation to be synchronized, wherein the priority of the first task queue is higher than that of the second task queue; a first execution thread acquires a first target operation from the first task queue, analyzes the first target operation to obtain an operation object related to the first target operation, and updates the version number of an object file of the operation object based on the log sequence number of the first target operation; a second execution thread acquires a second target operation from the second task queue and analyzes the second target operation to obtain an object identification number ID1 of the second target operation and a log serial number SCN1 of the second target operation; and acquiring a target object file based on the object identification number ID1, further acquiring the version number of the target object file, and selectively synchronizing according to the size relationship between the log serial number SCN1 and the version number of the target object file.
According to the invention, operations to be synchronized are attributed to different task queues according to their operation types, different execution threads then synchronize operations of different priorities, and high-priority operations influence the execution of low-priority operations. In actual execution, because the first execution thread and the second execution thread synchronize along two paths, and the version number of an object file is associated with the log serial number of the DDL operation corresponding to that object file, the version number of the object file indirectly reflects the execution status of the DDL operations related to the object. DML operations are then selectively synchronized according to the size relationship between the log serial number of the DML operation and the version number of the object file, so that the consistency of synchronization is not affected and the synchronization progress is accelerated.
Drawings
FIG. 1 is a schematic flowchart of a priority-based two-way synchronization method according to an embodiment of the present invention;
FIG. 2 is a flow diagram illustrating a first thread of execution according to an embodiment of the present invention;
FIG. 3 is a flow diagram illustrating a second thread of execution according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a synchronization system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the description of the present invention, the terms "inner", "outer", "longitudinal", "lateral", "upper", "lower", "top", "bottom", and the like indicate orientations or positional relationships based on those shown in the drawings, and are for convenience only to describe the present invention without requiring the present invention to be necessarily constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1:
In an actual application scenario, a source data synchronization system is deployed at the source database and a destination data synchronization system is deployed at the destination database. The destination data synchronization system is provided with a first task queue, a second task queue, a first execution thread and a plurality of second execution threads. The first task queue stores operations with a first priority and the second task queue stores operations with a second priority; the first execution thread processes operations with the first priority and the second execution threads process operations with the second priority, wherein the first priority is higher than the second priority.
Based on the foregoing synchronization environment, the present embodiment provides a two-way synchronization method based on priority, and referring to fig. 1, the two-way synchronization method includes the following steps:
Step 101: acquiring an operation to be synchronized from a source end, and attributing the operation to be synchronized to a first task queue or a second task queue according to the operation type of the operation to be synchronized, wherein the priority of the first task queue is higher than that of the second task queue.
The operation types of the operations to be synchronized comprise DML operations and DDL operations, wherein the DML operations comprise INSERT operations, UPDATE operations and DELETE operations; the DDL operations comprise a TRUNCATE operation, which clears a table, and a DROP operation, which deletes a table.
Specifically, the source data synchronization system is responsible for capturing data operation information MSGINFO of the source database, and the destination data synchronization system is responsible for synchronizing the data operation information to the destination database, where the MSGINFO at least includes the following information: table information (tab) of the data operation, a log sequence number (SCN) of the data operation, and an operation type (op) of the data operation, wherein the op comprises a DML operation and a DDL operation.
In this embodiment, an operation to be synchronized is obtained from the source end and parsed to obtain its operation type; if the operation to be synchronized is a DROP operation or a TRUNCATE operation, it is added to the first task queue; and if the operation to be synchronized is a DML operation, it is added to the second task queue.
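The routing in step 101 can be illustrated with a minimal Python sketch. The names `MsgInfo`, `route`, `HIGH_PRIORITY_OPS` and `DML_OPS` are hypothetical and chosen only for illustration; the fields `tab`, `scn` and `op` follow the MSGINFO description above. Raising an error for any other operation type is an assumption, since the text does not say how such operations are handled.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical names for illustration; only DROP/TRUNCATE and DML are named in the text.
HIGH_PRIORITY_OPS = {"DROP", "TRUNCATE"}
DML_OPS = {"INSERT", "UPDATE", "DELETE"}


@dataclass(frozen=True)
class MsgInfo:
    tab: str  # table touched by the operation
    scn: int  # log sequence number of the operation
    op: str   # operation type


def route(msg, first_queue, second_queue):
    """Append msg to the first (high-priority) or second (low-priority) queue."""
    if msg.op in HIGH_PRIORITY_OPS:
        first_queue.append(msg)
    elif msg.op in DML_OPS:
        second_queue.append(msg)
    else:
        # Assumption: other operation types are outside the scope of this sketch.
        raise ValueError(f"unsupported operation type: {msg.op}")


first_queue, second_queue = deque(), deque()
for m in [MsgInfo("t1", 5, "INSERT"), MsgInfo("t1", 10, "DROP"), MsgInfo("t2", 11, "UPDATE")]:
    route(m, first_queue, second_queue)
```

After the loop, the DROP operation sits in `first_queue` and the two DML operations sit in `second_queue`, in arrival order.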
Step 102: the first execution thread acquires a first target operation from the first task queue, parses the first target operation to obtain an operation object related to the first target operation, and updates the version number of an object file of the operation object based on the log sequence number of the first target operation.
Wherein the operation object comprises a table.
In this embodiment, the first execution thread fetches operations from the first task queue for synchronization, and the second execution thread fetches operations from the second task queue for synchronization. The priority of the first execution thread is higher than that of the second execution thread, and there may be multiple second execution threads, for example one execution thread for each transaction.
In an actual application scenario, an object file is set for each operation object. The object file stores the object identification number and the version number of the object file, where the version number of each object file is determined by the DDL operations associated with the operation object; in an alternative embodiment, the version number of each object file is equal to the log serial number of the DDL operation associated with the operation object. During synchronization, the object file corresponding to an object identification number can be found based on that object identification number. The version number of the object file influences the execution of the DML operations on the operation object, so that useless DML operations can be avoided. The object identification number of the operation object can be used as the name of the object file, which ensures that the name is unique and facilitates subsequent traversal and lookup.
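As a rough illustration of the object file described above, the following Python sketch keeps one record per operation object, keyed by object identification number. It uses an in-memory dictionary rather than files on disk, and the names `ObjectFile`, `object_files`, `get_object_file` and `record_ddl` are hypothetical. The initial version number of 0 is an assumption; the text does not state an initial value.

```python
from dataclasses import dataclass


@dataclass
class ObjectFile:
    object_id: int    # object identification number, also used as the lookup key
    version: int = 0  # assumed initial value; later the SCN of the latest DDL


# Hypothetical in-memory stand-in for the per-object files.
object_files = {}


def get_object_file(object_id):
    """Return the object file for object_id, creating it on first use."""
    return object_files.setdefault(object_id, ObjectFile(object_id))


def record_ddl(object_id, ddl_scn):
    """Set the version number to the log serial number of the DDL operation."""
    obj = get_object_file(object_id)
    obj.version = ddl_scn
    return obj


record_ddl(1, 10)  # e.g. a DROP on object 1 with log serial number 10
```

Keying the records by object identification number mirrors the suggestion to use that number as the file name, so that a lookup needs no further search.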
In this embodiment, the version number of the object file is first associated with the log serial number of the DDL operation corresponding to the object file, and the DML operation is then selectively synchronized according to the size relationship between the log serial number of the DML operation and the version number of the object file, which does not affect the consistency of synchronization and also accelerates the synchronization progress.
Step 103: the second execution thread acquires a second target operation from the second task queue and parses the second target operation to obtain an object identification number ID1 of the second target operation and a log serial number SCN1 of the second target operation.
Step 104: a target object file is acquired based on the object identification number ID1, the version number of the target object file is then acquired, and synchronization is performed selectively according to the size relationship between the log serial number SCN1 and the version number of the target object file.
In this embodiment, the second execution thread acquires a second target operation from the second task queue, and parses the second target operation to obtain an object identification number ID1 of the second target operation and a log serial number SCN1 of the second target operation, traverses the object file according to the object identification number ID1 to obtain a target object file of an operation object related to the second target operation, further determines a version number of the target object file, and selectively synchronizes according to a size relationship between the log serial number SCN1 and the version number of the target object file.
Specifically, it is determined whether the log serial number SCN1 is greater than the version number of the target object file. If the log serial number SCN1 is greater than the version number of the target object file, the second target operation is synchronized; if the log serial number SCN1 is smaller than the version number of the target object file, the second target operation is discarded.
Because the first execution thread and the second execution thread synchronize along two paths, and the version number of the object file is associated with the log serial number of the DDL operation corresponding to the object file, the version number of the object file indirectly reflects the execution status of the DDL operations related to the object. The DML operation is then selectively synchronized according to the size relationship between the log serial number of the DML operation and the version number of the object file, so that the consistency of synchronization is not affected and the synchronization progress is accelerated.
For ease of understanding, assume that the first execution thread fetches a DDL operation that is a DROP operation on table 1 with log sequence number 10, so that the version number of the object file corresponding to table 1 is updated to 10, and that the second task queue holds DML operations on table 1 with log sequence numbers 5 to 8. Under the conventional scheme, operations are synchronized strictly in log-sequence-number order, so the destination end first synchronizes the DML operations with log sequence numbers 5 to 8, then receives the DROP operation on table 1 and deletes table 1. The earlier synchronization is therefore useless: it wastes resources and slows down the synchronization progress. With the two-way synchronization scheme of this embodiment, the second execution thread checks whether the log sequence number of each DML operation is greater than 10 before synchronizing it, so the DML operations on table 1 with log sequence numbers 5 to 8 are not synchronized. The consistency of the synchronized transactions is not affected and the synchronization efficiency is improved. The foregoing is only a simple example for ease of understanding; in a practical application scenario, especially a statistics scenario with data volumes in the millions, the method can greatly improve the synchronization efficiency.
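The table 1 example can be replayed with a short Python sketch. The function name `select_for_sync` and the extra operation on `table2` are hypothetical additions for illustration, and treating a table with no recorded DDL as having version number 0 is an assumption. An operation whose log sequence number is not greater than the version number is discarded, in line with the comparison described above.

```python
def select_for_sync(dml_ops, versions):
    """Split DML operations into those to synchronize and those to discard.

    dml_ops  -- list of (table, scn) pairs
    versions -- mapping table -> version number of its object file
    """
    kept, discarded = [], []
    for tab, scn in dml_ops:
        # Assumption: a table with no recorded DDL has version number 0.
        if scn > versions.get(tab, 0):
            kept.append((tab, scn))
        else:
            discarded.append((tab, scn))
    return kept, discarded


# DROP on table1 with log sequence number 10 has updated the version number.
versions = {"table1": 10}
dml_ops = [("table1", 5), ("table1", 6), ("table1", 7), ("table1", 8), ("table2", 7)]

kept, discarded = select_for_sync(dml_ops, versions)
print(kept)       # [('table2', 7)]
print(discarded)  # [('table1', 5), ('table1', 6), ('table1', 7), ('table1', 8)]
```

The four operations on table 1 are skipped because the table is about to be dropped anyway, while the operation on the unrelated table is still synchronized.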
The foregoing describes the restriction of the first thread to the second thread, and correspondingly, the second thread will also restrict the first thread, as described in detail below:
In an actual application scenario, the object file of each operation object includes not only the version number of the object file and the object identification number of the operation object, but also the number of references to the object file. In the initial state the number of references is 0. When a transaction references the object, the number of references of the object file is increased by one, and after the transaction ends, the number of references is decreased by one. The number of references therefore shows directly whether an operation object is currently referenced. If an operation object is referenced, a TRUNCATE operation or a DROP operation on that operation object cannot be executed, because doing so would cause a synchronization error; the TRUNCATE operation or the DROP operation is executed only after the number of references of the operation object has returned to 0.
Based on this restriction of the first execution thread by the second execution thread, the two-way synchronization method further comprises: the second execution thread acquires a target object file based on the object identification number ID1 and adds one to the number of references of the target object file; and after the transaction to which the second target operation belongs has ended, one is subtracted from the number of references of the target object file. Specifically, an object hash table (object HASH) is set for each transaction. During synchronization, when a new operation object is involved, the operation object is stored in the object hash table and its number of references is updated, where a transaction adds one to the number of references of a given operation object only once. In this embodiment, the second execution thread determines, based on the object identification number ID1, whether the object to which ID1 belongs already exists in the object hash table; if it exists, the number of references is not increased, and if it does not exist, one is added to the number of references of the object file obtained based on ID1.
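The per-transaction reference counting can be sketched as follows. This is a single-threaded illustration only: the class names `ObjectFile` and `Transaction` and the methods `reference` and `finish` are hypothetical, and a real multi-threaded implementation would need to protect the counter with a lock or an atomic update, which the text does not describe.

```python
class ObjectFile:
    def __init__(self, object_id):
        self.object_id = object_id
        self.ref = 0  # number of transactions currently referencing the object


class Transaction:
    def __init__(self):
        # Object hash table: IDs of objects this transaction already references.
        self.object_hash = set()

    def reference(self, obj):
        """Add one to obj.ref the first time this transaction touches obj."""
        if obj.object_id not in self.object_hash:
            self.object_hash.add(obj.object_id)
            obj.ref += 1

    def finish(self, files):
        """On transaction end, subtract one from every referenced object."""
        for object_id in self.object_hash:
            files[object_id].ref -= 1
        self.object_hash.clear()


files = {1: ObjectFile(1)}
txn_a, txn_b = Transaction(), Transaction()

txn_a.reference(files[1])
txn_a.reference(files[1])  # same transaction, same object: counted once
txn_b.reference(files[1])
refs_while_running = files[1].ref  # 2

txn_a.finish(files)
txn_b.finish(files)
refs_after = files[1].ref  # 0
```

Counting each object once per transaction keeps the counter equal to the number of open transactions that touch the object, which is the quantity the first execution thread needs to see reach 0.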
Before the first execution thread performs synchronization, it first determines whether the number of references of the operation object related to the first target operation is 0. If the number of references is 0, the first target operation is executed. If the number of references is not 0, the thread waits until the number of references is 0 and then executes the first target operation.
In addition, the first execution thread sets a flag for the first target operation, and the flag shows whether the corresponding target operation has been synchronized to the destination-end database. Specifically, after obtaining a first target operation from the first task queue, the first execution thread sets a first synchronization flag for the first target operation, where the first synchronization flag indicates that the first target operation is not synchronized to the destination; for example, a first synchronization flag equal to 0 indicates that the first target operation is not synchronized to the destination. After the first target operation is synchronized to the destination, a second synchronization flag is set for the first target operation, where the second synchronization flag indicates that the first target operation has been synchronized to the destination; for example, a second synchronization flag equal to 1 indicates that the first target operation has been synchronized to the destination. The first execution thread then obtains the next operation from the first task queue.
In this embodiment, the DROP operation and the TRUNCATE operation are set as high-priority operations and the DML operation is set as a low-priority operation. When a low-priority execution thread executes a transaction whose table operations are slow, synchronization performance is low; in that case, the synchronization operation of the high-priority execution thread can interrupt the execution of the slow low-priority transaction, thereby improving synchronization performance.
Based on the foregoing description, the complete execution of the first thread of execution is briefly described below in conjunction with FIG. 2: first, MSGINFO information of a first target operation is obtained, wherein the MSGINFO information comprises tables related to the operation and log serial numbers of the operation, an object file corresponding to the first target operation is obtained according to an object identification number, the version number of the object file is set to be equal to the log serial number of the first target operation, and a mark of the first target operation is set to be 0. Then, judging whether the number of times of reference of the object related to the first target operation is equal to 0, if the number of times of reference is equal to 0, synchronizing the first target operation to a destination-end database, and setting the mark of the first target operation to be 1; if the number of times of reference is not equal to 0, wait for a set time (e.g., 10ms), and then determine whether the number of times of reference of the object involved in the first target operation is equal to 0.
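The first execution thread's flow can be sketched in Python as below. The function name `run_first_target_operation`, the dictionary layout, and the `apply_to_destination` callback are hypothetical. The `sleep` parameter is injectable only so that the example can simulate other transactions finishing; the 10 ms default follows the example interval in the text.

```python
import time


def run_first_target_operation(op, object_file, apply_to_destination,
                               poll_interval=0.01, sleep=time.sleep):
    """Process one DROP/TRUNCATE operation on the first execution thread."""
    object_file["version"] = op["scn"]  # version number := log serial number
    op["flag"] = 0                      # not yet synchronized to the destination
    while object_file["ref"] != 0:      # object still referenced by a transaction
        sleep(poll_interval)            # wait a set time, then check again
    apply_to_destination(op)            # synchronize to the destination database
    op["flag"] = 1                      # synchronized to the destination


table1 = {"object_id": 1, "version": 0, "ref": 2}
applied = []


def fake_sleep(_seconds):
    # Stand-in for a referencing transaction finishing while the thread waits.
    table1["ref"] -= 1


drop = {"tab": "table1", "scn": 10, "op": "DROP"}
run_first_target_operation(drop, table1, applied.append, sleep=fake_sleep)
```

Note that the version number is updated before the wait, so second execution threads can start discarding older DML operations on the table even while the DROP itself is still pending.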
Based on the foregoing description, the complete execution flow of the second execution thread is briefly described below with reference to FIG. 3. First, a second target operation is obtained from a transaction, and its MSGINFO information is obtained; the MSGINFO information comprises the tables involved in the operation and the log sequence number of the operation. The thread then checks whether the log sequence number of the second target operation is greater than the version number of the target object file. If it is, the thread checks whether Tabi_application_flag equals 0; if the flag equals 0, the thread waits for a set time and checks again. If the flag does not equal 0, the thread checks whether the operation object involved in the second target operation is in the object HASH, which stores the operation objects involved in the transaction. If the operation object is not in the object HASH, the thread sets Tabi.ref+1, adds the operation object to the object HASH, and synchronizes the second target operation to the destination database. It then checks whether the transaction to which the second target operation belongs has finished executing: if not, the next second target operation is obtained from the transaction; if so, the reference count of every target object file in the object HASH is decremented by 1 (i.e., Tabi.ref-1). If the log sequence number of the second target operation is not greater than the version number of the target object file, the thread proceeds directly to the step of checking whether the transaction has finished.
If the operation object involved in the second target operation is already in the object HASH, the second target operation is synchronized to the destination database, and the thread then checks whether the transaction to which the second target operation belongs has finished executing: if not, the next second target operation is obtained from the transaction; if so, Tabi.ref-1 is set.
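The flow of the second execution thread can be sketched as follows. This is an illustrative sketch only: `TableInfo`, `run_transaction`, and the field `sync_flag` (standing in for the flag the text calls Tabi_application_flag) are hypothetical names, and the default flag value of 1 for a table with no pending DROP or TRUNCATE is an assumption the text does not state.

```python
import time
from dataclasses import dataclass

@dataclass
class TableInfo:
    version: int = 0    # LSN of the latest DROP/TRUNCATE seen for the table
    sync_flag: int = 1  # 0 while that operation is pending (default 1 is assumed)
    ref: int = 0        # number of in-flight transactions referencing the table

def run_transaction(ops, tables, apply_to_destination, poll_seconds=0.01):
    """Synchronize the DML operations of one transaction."""
    object_hash = set()  # tables already referenced by this transaction
    for op in ops:
        tabi = tables[op["table_id"]]
        if op["lsn"] <= tabi.version:
            continue  # not newer than the object file version: skip the operation
        while tabi.sync_flag == 0:
            time.sleep(poll_seconds)  # high-priority operation still pending
        if op["table_id"] not in object_hash:
            tabi.ref += 1             # Tabi.ref+1 on first use in the transaction
            object_hash.add(op["table_id"])
        apply_to_destination(op)
    for table_id in object_hash:      # transaction finished: Tabi.ref-1
        tables[table_id].ref -= 1

tables = {1: TableInfo(version=20), 2: TableInfo()}
applied, refs_seen = [], []

def record(op):
    applied.append(op["lsn"])
    refs_seen.append(tables[op["table_id"]].ref)

run_transaction(
    [{"table_id": 1, "lsn": 15}, {"table_id": 1, "lsn": 25}, {"table_id": 2, "lsn": 26}],
    tables,
    record,
)
```

In this run the operation with log sequence number 15 is skipped because table 1 already carries version 20, while the other two operations are applied with a reference count of 1 held on their tables.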
Example 2:
In the present embodiment, the synchronization process of the second execution thread is explained on the basis of the foregoing embodiment 1:
In a practical application scenario, the commit transaction table is used to register transaction information, mainly to prevent inconsistent synchronized data after recovery from a data synchronization failure (a synchronization program exception, a database service exception, an operating system exception, or a hardware failure). For example, suppose that while committing a transaction in the destination database, the synchronization service successfully sends the commit command to the database service but crashes before receiving the execution result. Because the synchronization service and the database are two independent entities, the database may still execute the commit command even though the synchronization service has failed. In this situation, after the synchronization service recovers, it cannot learn directly from the database whether the commit command sent before the failure succeeded; it can only determine whether the last commit succeeded by querying the commit transaction table for the information of the last executed transaction.
At present, the synchronization service organizes the records in the commit transaction table per transaction. This approach does not suit all application scenarios: in a synchronization scenario with a large number of small transactions, it can turn the commit transaction table into an access hot spot.
To solve the foregoing problem, the present embodiment synchronizes DML operations in the second task queue as follows:
In a practical application scenario, the destination data synchronization system checks whether a commit transaction table exists in the destination database; if not, it creates one with the table identifier as the primary key and the log sequence number as an additional column.
The table creation statement is: CREATE TABLE TX (TABLE_ID INT PRIMARY KEY, LSN NUMBER).
If the commit transaction table exists, the table identifiers and the corresponding log sequence numbers in it are loaded into a filter container; the commit transaction table uses the table identifier as its primary key and the log sequence number as an additional column.
Here, the table identifier is the ID of a table.
The log sequence number corresponding to each table identifier serves as a filter log sequence number: transactions involving that table identifier are filtered against it to determine whether they were synchronized before the fault.
The log sequence number is the log sequence number of the commit operation. In practice, a transaction includes multiple DML operations (e.g., insert, delete, and update operations), each of which may involve a different table; each transaction corresponds to one commit operation, and the log sequence number in the additional column is updated with the commit log sequence number of that commit operation.
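The start-up step described above (create the commit transaction table if it is missing, otherwise load it into the filter container) can be sketched with Python's built-in `sqlite3` module. SQLite is used here only as a stand-in for the destination database, and the function name `load_filter_container` and the use of a plain dictionary as the filter container are assumptions made for this example.

```python
import sqlite3

def load_filter_container(conn):
    """Create the commit transaction table if missing, else load {table_id: lsn}."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'TX'"
    ).fetchone()
    if exists is None:
        # Table statement as given in the text.
        conn.execute("CREATE TABLE TX (TABLE_ID INT PRIMARY KEY, LSN NUMBER)")
        conn.commit()
        return {}
    # Each row is a (TABLE_ID, LSN) pair, so the cursor feeds dict() directly.
    return dict(conn.execute("SELECT TABLE_ID, LSN FROM TX"))

conn = sqlite3.connect(":memory:")
first_start = load_filter_container(conn)       # table is created, nothing to load
conn.execute("INSERT INTO TX VALUES (1, 100)")  # one row per synchronized table
conn.commit()
filter_container = load_filter_container(conn)  # simulated restart: rows are loaded
```

Because the table identifier is the primary key, the container holds at most one filter log sequence number per table.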
Then, in this embodiment, the second execution thread acquires the identifier of the transaction to which a DML operation belongs and the identifier of the table involved in the DML operation, and classifies DML operations using the transaction identifier and the table identifier as a combined key, thereby dividing one transaction into at least one sub-transaction.
For example, transaction A (transaction identifier ID1) includes operations 1 to 5, where operations 1 and 2 involve the table identified as ID1, operations 3 and 4 involve the table identified as ID2, and operation 5 involves the table identified as ID3. Operations 1 and 2 share the combined key (ID1, ID1) and are grouped into sub-transaction B; operations 3 and 4 share the combined key (ID1, ID2) and are grouped into sub-transaction C; operation 5 has the combined key (ID1, ID3) and forms sub-transaction D.
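The example can be reproduced with a small grouping routine. This is a minimal sketch: the function name `split_into_sub_transactions` and the dictionary layout of an operation are hypothetical, chosen only to mirror the transaction A example.

```python
from collections import defaultdict

def split_into_sub_transactions(operations):
    """Group operations by the combined key (transaction id, table id)."""
    sub_transactions = defaultdict(list)
    for op in operations:
        sub_transactions[(op["txn_id"], op["table_id"])].append(op["name"])
    return dict(sub_transactions)

transaction_a = [
    {"name": "operation 1", "txn_id": "ID1", "table_id": "ID1"},
    {"name": "operation 2", "txn_id": "ID1", "table_id": "ID1"},
    {"name": "operation 3", "txn_id": "ID1", "table_id": "ID2"},
    {"name": "operation 4", "txn_id": "ID1", "table_id": "ID2"},
    {"name": "operation 5", "txn_id": "ID1", "table_id": "ID3"},
]
subs = split_into_sub_transactions(transaction_a)
```

The three resulting keys correspond to sub-transactions B, C, and D.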
In a practical application scenario, if the operation is a commit operation, its commit transaction identifier and commit log sequence number are obtained. From the classified sub-transactions, the target sub-transactions whose transaction identifier equals the commit transaction identifier are obtained, and their table identifiers are extracted in turn.
For example, a commit operation whose commit transaction identifier is ID1 indicates that transaction A was committed at the source. Using the combined keys of the classified sub-transactions, the target sub-transactions whose transaction identifier equals ID1 are obtained, namely sub-transactions B, C, and D, and their table identifiers are extracted in turn, giving ID1, ID2, and ID3.
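Selecting the target sub-transactions for a commit operation can be sketched as a filter on the first component of the combined key. The function name `select_target_sub_transactions` and the extra transaction ID9 are hypothetical and serve only to show that unrelated sub-transactions are left out.

```python
def select_target_sub_transactions(sub_transactions, commit_txn_id):
    """Return the sub-transactions whose transaction id matches the commit."""
    return {
        key: ops
        for key, ops in sub_transactions.items()
        if key[0] == commit_txn_id
    }

sub_transactions = {
    ("ID1", "ID1"): ["operation 1", "operation 2"],  # sub-transaction B
    ("ID1", "ID2"): ["operation 3", "operation 4"],  # sub-transaction C
    ("ID1", "ID3"): ["operation 5"],                 # sub-transaction D
    ("ID9", "ID1"): ["operation 6"],                 # hypothetical other transaction
}
targets = select_target_sub_transactions(sub_transactions, "ID1")
# Extract the table identifiers of the target sub-transactions in order.
table_ids = [table_id for (_txn_id, table_id) in targets]
```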
Finally, the target sub-transactions are filtered based on the commit log sequence number, their table identifiers, and the filter container, so as to ensure the consistency of data synchronization.
Specifically, when filtering through the filter container, the filter log sequence number is looked up by the table identifier of the target sub-transaction, and the commit log sequence number is compared with it to determine whether the target sub-transaction was synchronized before the fault. If so, the target sub-transaction is discarded; if not, the corresponding filter log sequence number is updated in the commit transaction table and in the filter container with the commit log sequence number of the target sub-transaction, and the target sub-transaction is then synchronized.
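The filtering step can be sketched as follows. The text does not state the direction of the comparison, so this sketch assumes that a commit log sequence number less than or equal to the filter log sequence number means the sub-transaction was already synchronized before the fault. The names `filter_and_apply`, `persist_lsn`, and `apply_sub_txn` are hypothetical callbacks standing in for the commit transaction table update and the destination write.

```python
def filter_and_apply(target_sub_txns, commit_lsn, filter_container,
                     persist_lsn, apply_sub_txn):
    """Discard sub-transactions synchronized before the fault; apply the rest."""
    for (_txn_id, table_id), ops in target_sub_txns.items():
        filter_lsn = filter_container.get(table_id, 0)
        if commit_lsn <= filter_lsn:
            continue  # assumed rule: already synchronized, so discard
        persist_lsn(table_id, commit_lsn)       # update the commit transaction table
        filter_container[table_id] = commit_lsn  # keep the filter container in step
        apply_sub_txn(ops)                       # then synchronize the sub-transaction

container = {"T1": 100}   # T1 was synchronized up to commit LSN 100 before the fault
persisted, applied = {}, []
filter_and_apply(
    {("X", "T1"): ["a"], ("X", "T2"): ["b"]},
    100,
    container,
    persisted.__setitem__,
    applied.append,
)
```

Here the sub-transaction on T1 is discarded and the one on T2 is applied, after which both tables carry filter log sequence number 100.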
In this embodiment, the information in the commit transaction table is organized by table identifier and by the commit log sequence number of the transaction, so that during data synchronization each table registers only one row in the commit transaction table, which greatly reduces the amount of data in the table. When transaction operations are executed, the operations in each transaction are classified by table, so that one large transaction is divided into at least one small transaction (sub-transaction); operations on the same table from multiple small transactions are merged into one transaction for execution, and N small transactions are merged into one large transaction for execution. This effectively reduces the number of accesses to the commit transaction table and prevents an access hot spot.
In addition, when operations on different tables are applied to the destination in parallel, the transaction information of each table is maintained in a different row of the commit transaction table, which effectively prevents conflicts between the applying threads.
Example 3:
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a synchronization system according to an embodiment of the present invention. The synchronization system of the present embodiment includes one or more processors 41 and a memory 42. In FIG. 4, one processor 41 is taken as an example.
The processor 41 and the memory 42 may be connected by a bus or other means, such as the bus connection in fig. 4.
The memory 42, as a non-volatile computer-readable storage medium for the two-way synchronization method, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the methods of the above embodiments and the corresponding program instructions. The processor 41 executes various functional applications and data processing, and thereby implements the methods of the foregoing embodiments, by running the non-volatile software programs, instructions, and modules stored in the memory 42.
The memory 42 may include, among other things, high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 42 may optionally include memory located remotely from processor 41, which may be connected to processor 41 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
It should be noted that, because the contents of information interaction, execution process, and the like between modules and units in the apparatus and the system are based on the same concept as the processing method embodiment of the present invention, specific contents may refer to the description in the method embodiment of the present invention, and are not described herein again.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by associated hardware as instructed by a program, which may be stored on a computer-readable storage medium, which may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (9)

1. A two-way synchronization method based on priority is characterized by comprising the following steps:
acquiring an operation to be synchronized from a source end, and attributing the operation to be synchronized to a first task queue or a second task queue according to the operation type of the operation to be synchronized, wherein the priority of the first task queue is higher than that of the second task queue; the obtaining of the operation to be synchronized from the source end, and attributing the operation to be synchronized to the first task queue or the second task queue according to the operation type of the operation to be synchronized includes: acquiring an operation to be synchronized from a source end, and analyzing the operation to be synchronized to obtain an operation type of the operation to be synchronized; if the operation to be synchronized is a DROP operation or a TRUNCATE operation, adding the operation to be synchronized to the first task queue; if the operation to be synchronized is a DML operation, adding the operation to be synchronized to the second task queue;
a first execution thread acquires a first target operation from the first task queue, analyzes the first target operation to obtain an operation object related to the first target operation, and updates the version number of an object file of the operation object based on the log sequence number of the first target operation;
a second execution thread acquires a second target operation from the second task queue and analyzes the second target operation to obtain an object identification number ID1 of the second target operation and a log serial number SCN1 of the second target operation;
and acquiring a target object file based on the object identification number ID1, further acquiring the version number of the target object file, and selectively synchronizing according to the size relationship between the log serial number SCN1 and the version number of the target object file.
2. The two-way synchronization method according to claim 1, wherein the selectively synchronizing according to the size relationship between the log sequence number SCN1 and the version number of the target object file comprises:
judging whether the log serial number SCN1 is greater than the version number of the target object file;
and if the log serial number SCN1 is greater than the version number of the target object file, synchronizing the second target operation.
3. The two-way synchronization method according to claim 2, wherein said determining whether the log sequence number SCN1 is greater than the version number of the target object file further comprises:
and if the log serial number SCN1 is smaller than the version number of the target object file, discarding the second target operation.
4. The two-way synchronization method according to claim 1, wherein the object file of each operand includes the number of references of the object file and the object identification number of the operand;
the two-way synchronization method further comprises the following steps:
the second execution thread acquires a target object file based on the object identification number ID1, and adds one to the reference times of the target object file;
and after the transaction to which the second target operation belongs is ended, performing a subtraction operation on the reference times of the target object file.
5. The two-way synchronization method of claim 4, further comprising:
the first execution thread judges whether the reference times of the operation object related to the first target operation are 0 or not;
and if the reference times of the operation objects related to the first target operation are 0, executing the first target operation.
6. The two-way synchronization method according to claim 5, wherein after the first execution thread determines whether the number of references to the operation object involved in the first target operation is 0, the method further comprises:
and if the reference times of the operation objects related to the first target operation are not 0, waiting until the reference times are 0, and then executing the first target operation.
7. The two-way synchronization method of claim 1, further comprising:
after a first execution thread acquires a first target operation from the first task queue, setting a first synchronization mark for the first target operation, wherein the first synchronization mark indicates that the first target operation is not synchronized to a destination;
after the first target operation is synchronized to a destination, setting a second synchronization mark for the first target operation, wherein the second synchronization mark indicates that the first target operation is synchronized to the destination;
the first execution thread obtains a next operation from the first task queue.
8. The two-way synchronization method according to any one of claims 1 to 7, wherein the operation object comprises a table.
9. A synchronization system, characterized in that the synchronization system comprises at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor and programmed to perform the two-way synchronization method of any of claims 1-8.
CN202011446567.3A 2020-12-11 2020-12-11 Priority-based two-way synchronization method and system Active CN112559473B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011446567.3A CN112559473B (en) 2020-12-11 2020-12-11 Priority-based two-way synchronization method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011446567.3A CN112559473B (en) 2020-12-11 2020-12-11 Priority-based two-way synchronization method and system

Publications (2)

Publication Number Publication Date
CN112559473A CN112559473A (en) 2021-03-26
CN112559473B true CN112559473B (en) 2022-06-21

Family

ID=75061458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011446567.3A Active CN112559473B (en) 2020-12-11 2020-12-11 Priority-based two-way synchronization method and system

Country Status (1)

Country Link
CN (1) CN112559473B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115329006B (en) * 2022-08-31 2023-08-04 保利和悦生活科技服务有限公司 Data synchronization method and system for background and third party interface of network mall
CN117520458A (en) * 2023-12-08 2024-02-06 北京优炫软件股份有限公司 Parallel increment synchronization system based on database log analysis

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783573A (en) * 2018-12-18 2019-05-21 北京华夏电通科技有限公司 The method of data synchronization and terminal of multichannel push
CN110647579A (en) * 2019-08-16 2020-01-03 北京百度网讯科技有限公司 Data synchronization method and device, computer equipment and readable medium
CN111930828A (en) * 2020-05-29 2020-11-13 武汉达梦数据库有限公司 Data synchronization method and data synchronization system based on log analysis

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8023934B2 (en) * 2008-03-28 2011-09-20 Ianywhere Solutions, Inc. Synchronizing communications and data between mobile devices and servers
CN103617176B (en) * 2013-11-04 2017-03-15 广东电子工业研究院有限公司 One kind realizes the autosynchronous method of multi-source heterogeneous data resource

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于日志解析的数据库海量数据同步***的研究与实现;宋芳利;《中国优秀硕士学位论文全文数据库》;20170515;正文第34-75页 *

Also Published As

Publication number Publication date
CN112559473A (en) 2021-03-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventor after: Liu Qichun, Yu Yuanlan, Sun Feng
Inventor before: Fu Quan, Liu Qichun, Yu Yuanlan, Sun Feng
GR01 Patent grant