CN110597914A - Data transmission system, method, device and equipment - Google Patents

Data transmission system, method, device and equipment

Info

Publication number
CN110597914A
Authority
CN
China
Prior art keywords
change log
service change
service
message
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910879649.8A
Other languages
Chinese (zh)
Inventor
高元胜
刘少伟
陈璇
徐嘉亮
徐唐
沈仁奎
邓鑫鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mind Creation Information Technology Co Ltd
Original Assignee
Beijing Mind Creation Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mind Creation Information Technology Co Ltd
Priority to CN201910879649.8A
Publication of CN110597914A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2358 Change logging, detection, and notification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiments of this specification disclose a data transmission system, method, apparatus and device. The system comprises a service database, a message middleware and a data warehouse, wherein the service database is used for compiling service change logs in a service table change log set and then sending them to the message middleware, and the data warehouse is used for acquiring each compiled service change log from the message middleware according to a set period. In this technical scheme, the service change logs in the service table change log set are obtained from the service database, and the compiled service change logs are sent to the data warehouse for storage. Because the service change logs, rather than the service data itself, are transmitted to the data warehouse, data does not need to be queried directly from the service MySQL library, and other normal services on the service line are not affected. Therefore, the scheme can realize real-time transmission of service data and greatly improve the timeliness of data transmission.

Description

Data transmission system, method, device and equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data transmission system, method, apparatus, and device.
Background
In the prior art, most data sources of a data warehouse are business databases such as MySQL, and the construction of the warehouse depends on data synchronization. The commonly used data synchronization scheme is to connect directly to MySQL and run batch queries to obtain data, store the data in an intermediate medium, and finally load the files into a Hive table of the data warehouse. As the business scale keeps growing, querying a large amount of data directly from the business MySQL library easily causes slow queries on that library and affects normal services on the business line. The existing data synchronization methods therefore generally adopt T+1 delayed scheduling, in which data synchronization is performed in one batch at night. This greatly reduces the timeliness of the data in the data warehouse, so data query services with high timeliness requirements cannot be satisfied.
Therefore, there is a need to provide a less time consuming, more flexible method of data transmission.
Disclosure of Invention
In view of this, embodiments of the present application provide a data transmission system, method, apparatus, and device, which are used to improve timeliness of data transmission.
In order to solve the above technical problem, the embodiments of the present specification are implemented as follows:
A data transmission system provided in an embodiment of the present specification includes a service database, a message middleware and a data warehouse, wherein the service database is used for compiling service change logs in a service table change log set and then sending them to the message middleware, and the data warehouse is used for acquiring each compiled service change log from the message middleware according to a set period.
A data transmission method provided in an embodiment of the present specification includes:
acquiring a service change log set in a service database;
compiling each service change log in the service change log set to obtain a message with a set format, and storing the message in a message queue;
and sending the messages with the set format in the message queue to a data warehouse according to a set period.
Another data transmission method provided in an embodiment of this specification includes:
capturing a service change log set snapshot;
reading each service change log in the service change log set according to the service change log set snapshot;
and sending each service change log in the service change log set to a message middleware.
An embodiment of this specification provides a data transmission apparatus, including:
the service change log set acquisition module is used for acquiring a service change log set in a service database;
the service change log compiling module is used for compiling each service change log in the service change log set to obtain a message with a set format and storing the message into a message queue;
and the message sending module with the set format is used for sending the messages with the set format in the message queue to a data warehouse according to a set period.
Another data transmission apparatus provided in an embodiment of this specification, includes:
the snapshot capturing module is used for capturing a snapshot of the service change log set;
a reading module, configured to read each service change log in the service change log set according to the service change log set snapshot;
and the sending module is used for sending each service change log in the service change log set to the message middleware.
An embodiment of this specification provides a data transmission device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to:
acquiring a service change log set in a service database;
compiling each service change log in the service change log set to obtain a message with a set format, and storing the message in a message queue;
and sending the messages with the set format in the message queue to a data warehouse according to a set period.
Another data transmission device provided in an embodiment of this specification includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to:
capturing a service change log set snapshot;
reading each service change log in the service change log set according to the service change log set snapshot;
and sending each service change log in the service change log set to a message middleware.
The embodiment of the specification adopts at least one technical scheme which can achieve the following beneficial effects:
the technical scheme is that the service change log in the service table change log set is obtained from the service database, and the compiled service change log is sent to the data warehouse for storage. The service change log is transmitted to the data warehouse, so that data does not need to be directly acquired from the service MySQL library, and other normal services on a service line are not influenced. Because the service change log is a record of operation performed on the data in the service MySQL library and can also represent the data in the service MySQL library, the scheme can realize real-time transmission of service data and greatly improve the timeliness of data transmission.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic structural diagram of a data transmission system provided in an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of a data transmission method provided in an embodiment of the present disclosure;
fig. 3 is a schematic flow chart of another data transmission method provided in the embodiments of the present disclosure;
fig. 4 is a schematic flowchart of an embodiment of a data transmission method provided in an embodiment of the present specification;
fig. 5 is a schematic flow chart of the data real-time ingestion module starting from snapshot initialization;
fig. 6 is a schematic diagram of binlog;
fig. 7 is a schematic diagram of a message in the set format in the message queue;
fig. 8 is a schematic structural diagram of a data transmission apparatus corresponding to fig. 2 provided in an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of a data transmission apparatus corresponding to fig. 3 provided in an embodiment of the present disclosure;
fig. 10 is a schematic structural diagram of a data transmission device corresponding to fig. 2 provided in an embodiment of the present specification;
fig. 11 is a schematic structural diagram of a data transmission device corresponding to fig. 3 provided in an embodiment of this specification.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The data sources of a data warehouse are mostly business databases such as MySQL, and its construction depends on data synchronization. The common data synchronization scheme is to connect directly to MySQL and run batch queries to obtain data, store the data in an intermediate medium, and finally load the files into a Hive table of the data warehouse. As the business scale keeps growing, the data flow of batch query -> intermediate storage -> loading into Hive takes longer and longer and cannot meet the timing requirements of downstream data warehouse production.
In order to solve the above problem, an embodiment of this specification provides a scheme that can transmit data in real time. The data source of the data warehouse is changed: instead of querying MySQL, the scheme uses the service change logs in the service database, transmits the service change logs in place of the service data, and then restores the stored service change logs to service data. This avoids querying a large amount of data directly from the service MySQL library, so slow queries on the MySQL library are not caused and normal services on the service line are not affected. Therefore, the data transmission scheme provided by the embodiments of this specification can provide real-time data transmission without affecting other online operations.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of a data transmission system according to an embodiment of the present disclosure. As shown in fig. 1, the system includes a service database 11, a message middleware 12 and a data warehouse 13, wherein the service database 11 is used for compiling service change logs in a service table change log set and then sending them to the message middleware 12, and the data warehouse 13 is used for acquiring each compiled service change log from the message middleware 12 according to a set period.
The service database 11 may be a database in a service server, and is used for storing all service data, and may include data source information and may also include a change log. The change log is used for recording the process of the change of the service data, and is updated along with the change of the service data, so that the service data at any time can be obtained according to the original data structure table and the change log, and other online operations are not influenced. Therefore, when a large amount of data is transmitted, the data transmission system provided by the invention is adopted to transmit the data, so that the timeliness can be improved.
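The statement that the service data at any time can be obtained from the original table and the change log can be illustrated with a minimal sketch. It assumes, purely for illustration, that each change log entry is a dictionary with hypothetical keys `type`, `before` and `after`, and that every row has a primary key column named `id`; none of these names come from the specification.

```python
from typing import Any, Optional

Row = dict[str, Any]


def replay(base_rows: dict[int, Row], change_log: list[dict[str, Optional[Any]]]) -> dict[int, Row]:
    """Apply change log entries, in order, to a copy of the base table keyed by primary key."""
    rows = {pk: dict(row) for pk, row in base_rows.items()}
    for entry in change_log:
        op = entry["type"]
        if op in ("insert", "update"):
            after = entry["after"]
            rows[after["id"]] = dict(after)
        elif op == "delete":
            rows.pop(entry["before"]["id"], None)
        else:
            raise ValueError(f"unknown change type: {op}")
    return rows


# Hypothetical original table and change log.
base = {1: {"id": 1, "status": "new"}}
log = [
    {"type": "update", "before": {"id": 1, "status": "new"}, "after": {"id": 1, "status": "paid"}},
    {"type": "insert", "before": None, "after": {"id": 2, "status": "new"}},
    {"type": "delete", "before": {"id": 1, "status": "paid"}, "after": None},
]
state_after_first_change = replay(base, log[:1])
final_state = replay(base, log)
```

Replaying only a prefix of the log yields the table as it stood at that point in time, which is the property the paragraph above relies on.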
The message middleware utilizes an efficient and reliable message transfer mechanism for platform-independent data communication and integration of a distributed system based on data communication. By providing a messaging and message queuing model, it can extend inter-process communication in a distributed environment.
Message middleware is suitable for distributed environments where reliable data transfer is required. In a system that adopts the message middleware mechanism, different objects activate each other's events by passing messages, thereby completing the corresponding operations. The sender sends a message to the message server, and the message server stores the message in one of several queues and forwards it to the receiver when appropriate. Message middleware is often used to mask the differences between platforms and protocols and to enable collaboration between applications. Its advantage is that it provides both synchronous and asynchronous connections between clients and servers and can deliver or store-and-forward messages at any time, which is why it goes a step further than remote procedure calls.
In one or more embodiments of the present description, the message middleware 12 may employ Kafka. The purpose of Kafka is to unify online and offline message processing through the parallel loading mechanism of Hadoop, and also to provide real-time messages through clustering.
The data warehouse 13 is used to store the service change logs and may adopt Hive or the like, without any other limitation.
Fig. 2 is a schematic flowchart of a data transmission method provided in an embodiment of the present disclosure. In the data transmission system provided by this specification, the message middleware serves as the execution subject of this flow. From a program perspective, the execution subject of the flow may be a program installed in a server or a client.
As shown in fig. 2, the process may include the following steps:
step 202: and acquiring a service change log set in a service database.
In this embodiment of the present specification, the first data acquisition operation needs to be triggered by a client. Here the client is not a user client in the conventional sense but an operation end through which an operator works on the service data, and there may be a plurality of operation ends. Once a client has triggered a collection instruction, the data collection process continues. If a special condition arises during data collection, the task can be terminated or paused, the problem solved or the fault eliminated, and the task then restarted.
The service database may be stored in a service server. The service change logs in the service change log set keep growing as the service data changes.
In addition, in order to alleviate the read-write pressure caused by data growth, the prior art adopts operations of database splitting and table splitting of data sources, which makes the process of querying and importing the data warehouse become complex and heavy. The embodiment of the specification does not directly acquire data from a data source, but indirectly acquires the service change log, so that the problem caused by database division and table division is avoided.
Step 204: and compiling each service change log in the service change log set to obtain a message with a set format, and storing the message into a message queue.
To facilitate transmission and storage, after the service change logs are acquired, each change log needs to be compiled into a format that is suitable for storage or that meets the storage requirements of the data warehouse, for example a message that includes the data before and after the change. The compiled service change logs are then deposited into a message queue in chronological order.
Step 206: and sending the messages with the set format in the message queue to a data warehouse according to a set period.
In this embodiment, the condition for triggering the message middleware to send the message with the set format in the message queue to the data warehouse may be initiated by the client and may be continuously performed as long as triggering is performed once. The period can be customized, or determined according to the quantity or speed of the newly added data of the service change log, or determined according to the requirement of the downstream service on the timeliness of the data. For example, if the downstream traffic has a requirement for data timeliness of half an hour, then the period may be set to half an hour, or less than half an hour, with the shortest period reaching the order of minutes.
It should be noted that the operation of "obtaining the service change log set" in step 202 is performed in real time, and the operation of "sending the message with the set format in the message queue to the data warehouse" in step 206 is performed according to a set period. In theory, the time interval between the two "acquisition operations" of step 202 is less than the "set period" of step 206. However, the set period may also reach the minute level, i.e. every minute the message in the message queue is sent to the data warehouse.
In addition, the messages in the message queue are always saved, and the corresponding data in the message queue is not deleted because the messages are sent to the data warehouse. In addition, the data in the message queue is arranged in time sequence.
In the method in fig. 2, the service change log in the service table change log set is obtained from the service database, and the compiled service change log is sent to the data warehouse for storage. The service change log is transmitted to the data warehouse, so that data does not need to be directly acquired from the service MySQL library, and other normal services on a service line are not influenced. Because the service change log is a record of operation performed on the data in the service MySQL library and can also represent the data in the service MySQL library, the scheme can realize real-time transmission of service data and greatly improve the timeliness of data transmission.
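The interplay between real-time acquisition (step 202) and periodic sending (step 206) can be sketched as follows. This is an in-memory illustration only: the class name `PeriodicForwarder`, its methods, and the use of plain lists for the message queue and the data warehouse are all assumptions, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
from typing import Optional


class PeriodicForwarder:
    """Queue messages as they arrive; forward the unsent ones once per period."""

    def __init__(self, period_s: float) -> None:
        self.period_s = period_s
        self.queue: list[str] = []       # messages stay in the queue after sending
        self.warehouse: list[str] = []   # stand-in for the data warehouse
        self.sent_upto = 0               # index of the next message to send
        self.last_sent_at: Optional[float] = None

    def on_change_log(self, message: str) -> None:
        """Real-time path: every compiled change log is appended immediately."""
        self.queue.append(message)

    def tick(self, now: float) -> int:
        """Periodic path: send pending messages only if a full period has elapsed."""
        if self.last_sent_at is not None and now - self.last_sent_at < self.period_s:
            return 0
        batch = self.queue[self.sent_upto:]
        self.warehouse.extend(batch)
        self.sent_upto += len(batch)
        self.last_sent_at = now
        return len(batch)


forwarder = PeriodicForwarder(period_s=60)
forwarder.on_change_log("change-1")
first = forwarder.tick(now=0)     # first tick sends immediately
forwarder.on_change_log("change-2")
early = forwarder.tick(now=30)    # period not yet elapsed, nothing sent
later = forwarder.tick(now=60)    # period elapsed, pending message sent
```

The queue is never truncated, matching the statement that messages are retained after being sent to the data warehouse.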
Based on the method of fig. 2, the embodiments of the present specification also provide some specific implementations of the method, which are described below.
In one or more embodiments of the present specification, before compiling each service change log in the service change log set to obtain a message with a set format and storing the message in a message queue, the method further includes:
acquiring the structure information of a service data table in a service database;
after compiling each service change log in the service change log set, obtaining a message in a set format, which specifically includes:
compiling each service change log in the service change log set according to the service data table structure information to obtain a message with a set format, wherein the set format comprises: pre-change data, post-change data, data source information, and data type.
In order to obtain a message conforming to the set format from a service change log, the service data table structure information needs to be acquired from the service database. The service data table structure information may include the formats of a plurality of service data tables, such as which fields each table contains. Then, according to the service data table structure and the service change log, the specific data before and after the change recorded by that log can be recovered.
In one or more embodiments of the present description, setting the format may include: pre-change data, post-change data, data source information, and data type. The data types may include: update, insert, and delete, etc.
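As a rough illustration of this compilation step, the sketch below combines a raw change log entry, which carries only column values, with the table structure to produce a message containing the four fields of the set format. The JSON encoding, the function name `compile_change` and the raw entry keys `before_values` and `after_values` are assumptions made for the example, not part of the specification.

```python
import json
from typing import Any, Optional


def compile_change(raw: dict[str, Any], columns: list[str], source: dict[str, str]) -> str:
    """Pair raw column values with column names and emit a set-format message."""

    def to_row(values: Optional[list[Any]]) -> Optional[dict[str, Any]]:
        return None if values is None else dict(zip(columns, values))

    message = {
        "before": to_row(raw.get("before_values")),  # pre-change data
        "after": to_row(raw.get("after_values")),    # post-change data
        "source": source,                            # data source information
        "type": raw["type"],                         # update, insert or delete
    }
    return json.dumps(message, sort_keys=True)


# Hypothetical table structure and raw change log entry.
columns = ["id", "status"]
raw_entry = {"type": "update", "before_values": [1, "new"], "after_values": [1, "paid"]}
compiled = compile_change(raw_entry, columns, {"database": "shop", "table": "orders"})
```

For an insert the `before` field is null, and for a delete the `after` field is null, because the corresponding raw values are absent.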
In one or more embodiments of the present specification, the sending the message in the set format in the message queue to a data warehouse according to a set period specifically includes:
determining a first breakpoint position in the message queue;
acquiring messages with a set format behind the first breakpoint position in the message queue, wherein the messages with the set format in the message queue are sequenced according to the time sequence of entering the message queue;
and sending the message with the set format after the first breakpoint position to a data warehouse.
To avoid repeatedly sending messages in the message queue to the data warehouse, after each transmission a marker may be placed in the message queue to indicate the position of the last message sent. The next transmission then starts from the position after the marker, and when that period's transmission finishes, the last message sent is marked in turn. For example, Kafka's offsets may be saved in a file to manage the offset (i.e., to mark the breakpoint position), so that when a task fails and restarts, it can resume from the last committed offset.
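A minimal sketch of file-based breakpoint management is given below. It does not use a Kafka client; the message queue is a plain list, and the file layout (a JSON object with a hypothetical `next_offset` key) and function names are assumptions for illustration.

```python
import json
import os


def load_offset(path: str) -> int:
    """Return the saved breakpoint, or 0 when nothing has been sent yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return int(json.load(f)["next_offset"])
    except FileNotFoundError:
        return 0


def save_offset(path: str, next_offset: int) -> None:
    """Write the breakpoint to a temporary file, then rename it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump({"next_offset": next_offset}, f)
    os.replace(tmp_path, path)


def send_new_messages(queue: list, offset_path: str, warehouse: list) -> int:
    """Send only the messages after the breakpoint, then advance the breakpoint."""
    start = load_offset(offset_path)
    batch = queue[start:]
    warehouse.extend(batch)
    save_offset(offset_path, start + len(batch))
    return len(batch)
```

Because the offset is saved only after the batch is handed over, a crash between the two steps would cause that batch to be sent again on restart. This sketch therefore avoids losing messages but does not by itself guarantee exactly-once delivery.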
In one or more embodiments of the present specification, after compiling each change log in the service change log set to obtain a message with a set format, and storing the message in a message queue, the method further includes:
and merging and storing the messages that conform to the set format by topic, based on a regular expression over the names of the sub-libraries and sub-tables.
To facilitate the transmission and storage of data, after the service change logs are compiled, the resulting messages are stored by topic. The messages stored under one topic all relate to changes of the same service data table, which makes it easier to store the data in the data warehouse and to retrieve the service data externally.
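One way to picture the regular-expression-based merging is a small routing function that maps every shard of a logical table to a single topic. The naming pattern (`orders_db_<n>.order_<n>`), the topic name `orders` and the fallback rule are hypothetical and chosen only for the example.

```python
import re

# Hypothetical rule: all shards of the orders table share one topic.
SHARD_RULES = [
    (re.compile(r"^orders_db_\d+\.order_\d+$"), "orders"),
]


def topic_for(database: str, table: str, rules=SHARD_RULES) -> str:
    """Return the merged topic for a sharded table, else a per-table topic."""
    qualified_name = f"{database}.{table}"
    for pattern, topic in rules:
        if pattern.match(qualified_name):
            return topic
    return qualified_name


merged_a = topic_for("orders_db_03", "order_0127")
merged_b = topic_for("orders_db_11", "order_0004")
unsharded = topic_for("crm", "users")
```

With such a rule, changes from any number of sub-libraries and sub-tables land in one topic, so the downstream loader sees them as a single logical table.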
Fig. 3 is a schematic flow chart of another data transmission method provided in the embodiments of the present disclosure. The execution subject of fig. 3 is a business database. As shown in fig. 3, the method may include the steps of:
step 302: a snapshot of a set of business change logs is captured.
When a service change log in a service change log set is obtained, a snapshot of the service change log set needs to be obtained first, so that which service change logs need to be obtained are determined.
Step 304: and reading each service change log in the service change log set according to the service change log set snapshot.
After the snapshot of the service change log set is obtained, the service change logs of the service change log set can be read one by one. In the service change log set, the service change logs are arranged in chronological order.
Step 306: and sending each service change log in the service change log set to a message middleware.
The operations of steps 302-306 are performed in real time, and once the client triggers data transmission, the operations of steps 302-306 are continuously performed.
In one or more embodiments of the present specification, after the reading of each service change log in the service change log set according to the service change log set snapshot, the method further includes:
and marking the position of the last service change log in the service change log set.
Because the service change log set keeps growing, in order to avoid repeatedly reading the service change logs in the service change log set, the position of the last service change log read is marked after each read, so that the next read starts from the marked position and the newly read service change logs are sent to the message middleware.
In one or more embodiments of the present specification, the reading, according to the service change log set snapshot, each service change log in the service change log set specifically includes:
determining a second breakpoint position in the service change log set;
and acquiring one or more service change logs positioned after the second breakpoint position in the service change log set.
This implementation defines how the service change logs in the service change log set are obtained on every read after the first. The second breakpoint position marks the end of the data read last time: the service change logs before the second breakpoint position have already been read and sent to the message middleware, whereas the service change logs after it are newly generated and still need to be read and sent. This avoids reading data repeatedly, and when a task fails or is restarted, reading can also resume from the last breakpoint position.
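The breakpoint-based incremental read can be sketched with a small reader that remembers how far it has read. The class name `ChangeLogReader` and the representation of the service change log set as a growing Python list are assumptions for illustration.

```python
class ChangeLogReader:
    """Read only the change logs that appeared after the last breakpoint."""

    def __init__(self, breakpoint: int = 0) -> None:
        # Number of entries already read and sent to the message middleware.
        self.breakpoint = breakpoint

    def read_new(self, log_set: list[str]) -> list[str]:
        new_entries = log_set[self.breakpoint:]
        self.breakpoint += len(new_entries)  # mark the last entry read
        return new_entries


log_set = ["log-1", "log-2"]
reader = ChangeLogReader()
first_read = reader.read_new(log_set)   # everything, since nothing was read before
log_set.append("log-3")                 # the set keeps growing
second_read = reader.read_new(log_set)  # only the newly generated entry

# After a restart, a reader created with the saved breakpoint resumes in place.
resumed = ChangeLogReader(breakpoint=reader.breakpoint)
third_read = resumed.read_new(log_set)
```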
In one or more embodiments of the present specification, after the capturing the snapshot of the service change log set, the method further includes:
locking the service change log set snapshot and shielding write operations of other clients;
after reading each service change log in the service change log set according to the service change log set snapshot, the method further includes:
and releasing the lock on the service change log set snapshot.
In order to prevent other users from performing malicious operations on the service change log set, such as modifying or deleting service change logs, the method also provides a risk control measure: the snapshot of the service change log set is locked and write operations by other clients are blocked. The lock is released after each service change log in the service change log set has been read.
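The lock, read, release sequence can be illustrated in a single process with a mutex. This is only an analogy under stated assumptions: the class `ChangeLogSet` is hypothetical, and a real deployment would rely on the database's own locking rather than `threading.Lock`.

```python
import threading


class ChangeLogSet:
    """A change log set whose writers are blocked while a snapshot is read."""

    def __init__(self) -> None:
        self._entries: list[str] = []
        self._lock = threading.Lock()

    def append(self, entry: str) -> None:
        # Writers must hold the lock, so they wait while a snapshot is being read.
        with self._lock:
            self._entries.append(entry)

    def read_snapshot(self) -> list[str]:
        # Lock, copy every entry, then release the lock on leaving the block.
        with self._lock:
            return list(self._entries)


log_set = ChangeLogSet()
log_set.append("log-1")
snapshot = log_set.read_snapshot()
log_set.append("log-2")  # succeeds because the lock was released after the read
```

The snapshot is a copy, so entries written after the lock is released do not alter what was already read.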
Fig. 4 is a schematic flowchart of an embodiment of a data transmission method provided in an embodiment of the present disclosure. As shown in fig. 4, it includes 3 entities (data source, message queue and data warehouse) and 2 modules (data real-time ingestion module and data timing loading module). The data real-time ingestion module and the data timing loading module are built as REST services and support custom DSL syntax configuration. The data real-time ingestion module is based on parsing the MySQL binlog: using a full snapshot followed by continuous binlog parsing, it synchronizes MySQL seamlessly and compiles the binlog into the before-change and after-change table data required by Hive. The data timing loading module periodically fetches the messages accumulated in the Kafka message queue, writes each row of data into Hive, and stores it in partitions with minute granularity.
The platform manages the whole life cycle of the data real-time ingestion module and the data timing loading module in a unified way. Operations such as create, start, stop and hot update can be carried out interactively through a web system, or by calling the RESTful API with the highly abstract DSL. Meanwhile, the condition of the transmission link is monitored, and a monitoring dashboard and monitoring alarms are provided.
Fig. 5 is a schematic flow chart of the data real-time ingestion module starting from snapshot initialization. As shown in fig. 5, the data real-time ingestion module is configured to:
(1) Acquire a full-table snapshot: obtain a global read lock, read the table schema and the current binlog (binary log) position, release the read lock, then scan the current full table data in a transaction and write it, together with the table creation statement, into a specific message queue topic. (Fig. 6 is a schematic diagram of binlog.)
(2) Read binlog in real time: starting from the breakpoint recorded during the global snapshot, read binlog in real time, generate the table data before and after each change, continuously write it into the message queue, and merge it into one topic based on a regular expression over the names of the sub-libraries and sub-tables.
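The two phases above can be sketched with in-memory stand-ins. Here the binlog is a plain list, its "position" is simply the list length, and the function names and message keys are hypothetical; no MySQL connection or locking is performed.

```python
from typing import Any

Message = dict[str, Any]


def initial_snapshot(table_rows: list[dict], binlog: list[Message], ddl: str) -> tuple[int, list[Message]]:
    """Phase 1: record the binlog position, then emit the DDL and every current row."""
    position = len(binlog)  # in MySQL this would be read while the read lock is held
    messages: list[Message] = [{"type": "ddl", "statement": ddl}]
    messages += [{"type": "insert", "before": None, "after": dict(row)} for row in table_rows]
    return position, messages


def stream_binlog(binlog: list[Message], position: int) -> tuple[int, list[Message]]:
    """Phase 2: return the events after the recorded position and the new position."""
    events = binlog[position:]
    return position + len(events), events


# Hypothetical table and binlog contents.
binlog: list[Message] = [{"type": "insert", "before": None, "after": {"id": 1}}]
rows = [{"id": 1}]
position, snapshot_messages = initial_snapshot(rows, binlog, "CREATE TABLE t (id INT)")

binlog.append({"type": "update", "before": {"id": 1}, "after": {"id": 1, "paid": True}})
position, new_events = stream_binlog(binlog, position)
```

Because streaming starts at the position recorded during the snapshot, events that were already reflected in the snapshot are not replayed and later events are not missed.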
Fig. 7 is a message formatted in a message queue. The setting format comprises: pre-change data, post-change data, data source information, and data type.
The configuration syntax of the data real-time ingestion module is as follows:
wherein,
1. "name" designates a module.
2. "config" indicates the configuration:
(1) source specifies the database address and port number;
(2) "user" indicates the MySQL account name with the corresponding authority;
(3) "password" indicates the password for the account name;
(4) "database. include" indicates the name of the library that needs to be monitored, if the parameter is "database. include" indicates the name of the library that needs to be filtered, and the two configurations cannot exist simultaneously;
(5) the table.include indicates the table name to be monitored, the table name to be filtered is indicated when the parameter is the table.include, the two configurations can not exist simultaneously, and the data of the sub-database sub-tables are uniformly recorded into the kafka message queue through the regular expression;
(6) column. exception "indicates the column name that needs to be filtered, i.e., the message queue is not written when the column changes;
(7) servers specifies the address port number where kafka message queue brokers is located;
(8) topic specifies the kafka message queue topic;
(9) "heartbeat. interval. ms" indicates a probing period.
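Two of the rules in the list above lend themselves to a short sketch: the include and exclude settings are mutually exclusive, and a change that touches only filtered columns is not written to the queue. The helper names and the `"database.exclude"` key used in the usage note are hypothetical; the source does not show the exact spelling of the exclusion parameter.

```python
from typing import Any, Collection, Dict, Mapping, Optional

Row = Dict[str, Any]


def validate_filters(config: Mapping[str, Any],
                     include_key: str, exclude_key: str) -> None:
    """Reject a configuration that sets both the include and exclude list."""
    if include_key in config and exclude_key in config:
        raise ValueError(
            f"{include_key!r} and {exclude_key!r} cannot be set together")


def has_relevant_change(before: Optional[Row], after: Optional[Row],
                        ignored_columns: Collection[str]) -> bool:
    """Return True if the change should be written to the message queue."""
    if before is None or after is None:
        return True  # inserts and deletes are always relevant
    columns = (set(before) | set(after)) - set(ignored_columns)
    return any(before.get(name) != after.get(name) for name in columns)
```

For example, `validate_filters(cfg, "database.include", "database.exclude")` would raise if `cfg` contained both keys, and an update that only touches an ignored `updated_at` column would yield `False` from `has_relevant_change`.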
The data timing loading module fetches data from the Kafka message queue at a customizable frequency and creates a Hive external partitioned table for each topic. A write-ahead log (WAL) mechanism is used to ensure that each record is exported to HDFS exactly once. In addition, the Kafka offsets are saved in a file, so that after a failure and task restart, loading can resume from the last committed offset. The partitioning scheme is customizable, and partitions with minute granularity are supported.
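A minimal sketch of two ideas from this paragraph, minute-granularity partition paths and offsets persisted in a file, is given below. The path layout (`dt=.../hour=.../minute=...`), the JSON offset file and all function names are assumptions for illustration; the write-ahead log that provides the exactly-once guarantee is not modeled here.

```python
import json
import os
from datetime import datetime, timezone
from typing import Dict


def partition_path(base_url: str, topic: str, timestamp_ms: int) -> str:
    """Build a storage path whose finest partition level is one minute."""
    moment = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    suffix = moment.strftime("dt=%Y-%m-%d/hour=%H/minute=%M")
    return f"{base_url}/{topic}/{suffix}"


def save_offsets(path: str, offsets: Dict[str, int]) -> None:
    """Persist committed offsets; write to a temp file, then swap it in."""
    temp_path = path + ".tmp"
    with open(temp_path, "w", encoding="utf-8") as handle:
        json.dump(offsets, handle)
    os.replace(temp_path, path)


def load_offsets(path: str) -> Dict[str, int]:
    """Return the last committed offsets, or an empty mapping on first run."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

On restart, a loader would call `load_offsets` and resume each topic from the stored value, which corresponds to starting from the last committed offset.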
The configuration syntax of the data timing loading module is constructed as follows:
wherein,
"name" specifies the module.
"config" specifies the configuration:
(1) ms indicates the interval of the timed scheduling;
(2) servers specifies the addresses and port numbers of the Kafka message queue brokers;
(3) topic specifies the message queue topic;
(4) url indicates the HDFS address to write to;
(5) "live.metadata.uri" designates the metadata storage address;
(6) database designates the Hive database;
(7) format specifies the format of the storage path.
The embodiments of the specification can achieve the following beneficial effects: the advantages of real-time streaming are fully exploited, and the load of ingesting a large amount of data in a single run is spread out, so that online services are not affected. At the same time, the time needed to complete a single synchronization task is greatly reduced, downstream multi-level dependent production in the data warehouse is not affected, and the data synchronization delay of the data warehouse is reduced from the day level to the minute level.
Based on the same idea, the embodiment of the present specification further provides a device corresponding to the above method. Fig. 8 is a schematic structural diagram of a data transmission device corresponding to fig. 2 provided in an embodiment of the present disclosure. As shown in fig. 8, the apparatus may include:
a service change log set obtaining module 801, configured to obtain a service change log set in a service database;
a service change log compiling module 802, configured to compile each service change log in the service change log set to obtain a message in a set format, and store the message in a message queue;
a set-format message sending module 803, configured to send the messages in the set format in the message queue to a data warehouse according to a set period.
In one or more embodiments of the present description, the apparatus may further include:
a service data table structure information obtaining module, configured to obtain service data table structure information in the service database before each service change log in the service change log set is compiled into a message in a set format and stored in a message queue;
the service change log compiling module 802 may be specifically configured to: compiling each service change log in the service change log set according to the service data table structure information to obtain a message with a set format, wherein the set format comprises: pre-change data, post-change data, data source information, and data type.
In one or more embodiments of the present specification, the set-format message sending module 803 may specifically include:
a first breakpoint position determination unit, configured to determine a first breakpoint position in the message queue;
an acquisition unit, configured to acquire the messages in the set format located after the first breakpoint position in the message queue, wherein the messages in the set format in the message queue are ordered by the time at which they entered the message queue;
a sending unit, configured to send the messages in the set format located after the first breakpoint position to a data warehouse.
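The three units above can be illustrated with a small sketch in which the message queue and the data warehouse are plain lists and the breakpoint is the number of messages already sent. `load_once` and `breakpoint_pos` are hypothetical names chosen for this example.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def load_once(queue: List[Message], warehouse: List[Message],
              breakpoint_pos: int) -> int:
    """Send every message located after the breakpoint to the warehouse.

    The queue is assumed to be ordered by arrival time, so slicing from
    the breakpoint yields exactly the messages not yet sent.
    Returns the new breakpoint position.
    """
    pending = queue[breakpoint_pos:]
    warehouse.extend(pending)
    return breakpoint_pos + len(pending)
```

Calling `load_once` once per set period, and feeding the returned position into the next call, sends each message once even if some periods find nothing new.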
In one or more embodiments of the present description, the apparatus may further include: a merging and storing module, configured to, after each change log in the service change log set has been compiled into a message in a set format and stored in a message queue, merge and store the messages conforming to the set format by topic, based on a regular expression over the names of the sharded databases and tables.
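Merging by topic with a name regular expression might look like the sketch below. The shard naming scheme (`shop_01.orders_0007`) and the helper names are assumptions for illustration only.

```python
import re
from typing import Dict, List, Optional, Tuple

Route = Tuple[re.Pattern, str]


def compile_routes(routes: Dict[str, str]) -> List[Route]:
    """Turn {name regex: topic} into a list of compiled routing rules."""
    return [(re.compile(pattern), topic) for pattern, topic in routes.items()]


def topic_for(qualified_table: str, routes: List[Route]) -> Optional[str]:
    """Return the topic for a 'database.table' name, or None if unmatched."""
    for pattern, topic in routes:
        # fullmatch avoids routing look-alike names such as backup tables.
        if pattern.fullmatch(qualified_table):
            return topic
    return None
```

With the rule `{r"shop_\d+\.orders_\d+": "orders"}`, every shard of the orders table is written to the single topic `orders`, while tables that do not match any rule are left unrouted.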
Based on the same idea, the embodiment of the present specification further provides a device corresponding to the above method. Fig. 9 is a schematic structural diagram of a data transmission device corresponding to fig. 3 provided in an embodiment of the present disclosure. As shown in fig. 9, the apparatus may include:
a snapshot capturing module 901, configured to capture a snapshot of a service change log set;
a reading module 902, configured to read each service change log in the service change log set according to the service change log set snapshot;
a sending module 903, configured to send each service change log in the service change log set to the message middleware.
In one or more embodiments of the present description, the apparatus may further include:
a marking module, configured to mark the position of the last service change log in the service change log set after each service change log in the service change log set has been read according to the service change log set snapshot.
In one or more embodiments of the present specification, the reading module 902 may specifically include:
a second breakpoint position determination unit, configured to determine a second breakpoint position in the service change log set;
and the reading unit is used for acquiring one or more service change logs positioned behind the second breakpoint position in the service change log set.
In one or more embodiments of the present description, the apparatus may further include:
a locking module, configured to lock the service change log set snapshot after the service change log set snapshot is captured, blocking the write operations of other clients;
an unlocking module, configured to release the lock on the service change log set snapshot after each service change log in the service change log set has been read according to the service change log set snapshot.
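The lock, read and release ordering described by these modules can be sketched with a standard `threading.Lock`. `ChangeLogSet` is a hypothetical in-process stand-in for the service change log set; a real deployment would rely on the database's own locking rather than a Python lock.

```python
import threading
from typing import List, Tuple


class ChangeLogSet:
    """Hypothetical in-process change log set guarded by a single lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._logs: List[str] = []

    def append(self, entry: str) -> None:
        # Writers block while a snapshot read holds the lock.
        with self._lock:
            self._logs.append(entry)

    def read_snapshot(self) -> Tuple[List[str], int]:
        """Lock, copy the logs, mark the last position, then release."""
        with self._lock:
            snapshot = list(self._logs)
            last_position = len(snapshot) - 1  # -1 when the set is empty
        return snapshot, last_position

    def is_locked(self) -> bool:
        return self._lock.locked()
```

The `with` statement releases the lock even if copying fails, which mirrors the requirement that the lock be released once reading is complete.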
Based on the same idea, the embodiment of the present specification further provides a device corresponding to the above method.
Fig. 10 is a schematic structural diagram of an apparatus corresponding to fig. 2 provided in an embodiment of the present disclosure. As shown in fig. 10, the apparatus 1000 may include:
at least one processor 1010; and
a memory 1030 communicatively coupled to the at least one processor; wherein
the memory 1030 stores instructions 1020 executable by the at least one processor 1010 to enable the at least one processor 1010 to:
acquiring a service change log set in a service database;
compiling each service change log in the service change log set to obtain a message with a set format, and storing the message in a message queue;
and sending the messages with the set format in the message queue to a data warehouse according to a set period.
Fig. 11 is a schematic structural diagram of an apparatus corresponding to fig. 3 provided in an embodiment of the present disclosure. As shown in fig. 11, the device 1100 may include:
at least one processor 1110; and
a memory 1130 communicatively coupled to the at least one processor; wherein
the memory 1130 stores instructions 1120 executable by the at least one processor 1110, the instructions being executable by the at least one processor 1110 to enable the at least one processor 1110 to:
capturing a service change log set snapshot;
reading each service change log in the service change log set according to the service change log set snapshot;
and sending each service change log in the service change log set to a message middleware.
For convenience of description, the above devices are described as being divided into various units by function, and the units are described separately. Of course, when implementing the present application, the functionality of the units may be implemented in one or more pieces of software and/or hardware.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (13)

1. A data transmission system, the system comprising: the system comprises a service database, a message middleware and a data warehouse, wherein the service database is used for compiling service change logs in a service table change log set and then sending the service change logs to the message middleware, and the data warehouse is used for acquiring each compiled service change log from the message middleware according to a set period.
2. A method of data transmission, the method comprising:
acquiring a service change log set in a service database;
compiling each service change log in the service change log set to obtain a message with a set format, and storing the message in a message queue;
and sending the messages with the set format in the message queue to a data warehouse according to a set period.
3. The method of claim 2, wherein before compiling each service change log in the service change log set to obtain a message in a set format and storing the message in a message queue, the method further comprises:
acquiring the structure information of a service data table in a service database;
the compiling each service change log in the service change log set to obtain a message in a set format specifically comprises:
compiling each service change log in the service change log set according to the service data table structure information to obtain a message with a set format, wherein the set format comprises: pre-change data, post-change data, data source information, and data type.
4. The method according to claim 2, wherein the sending the messages in the set format in the message queue to a data warehouse according to a set period specifically comprises:
determining a first breakpoint position in the message queue;
acquiring messages with a set format behind the first breakpoint position in the message queue, wherein the messages with the set format in the message queue are sequenced according to the time sequence of entering the message queue;
and sending the message with the set format after the first breakpoint position to a data warehouse.
5. The method of claim 2, wherein after compiling each change log in the service change log set to obtain a message with a set format, and storing the message in a message queue, the method further comprises:
merging and storing the messages conforming to the set format by topic, based on a regular expression over the names of the sharded databases and tables.
6. A method of data transmission, the method comprising:
capturing a service change log set snapshot;
reading each service change log in the service change log set according to the service change log set snapshot;
and sending each service change log in the service change log set to a message middleware.
7. The method of claim 6, wherein after said reading each service change log in said service change log set according to said service change log set snapshot, further comprising:
and marking the position of the last service change log in the service change log set.
8. The method according to claim 6, wherein said reading each service change log in the service change log set according to the service change log set snapshot specifically comprises:
determining a second breakpoint position in the service change log set;
and acquiring one or more service change logs positioned after the second breakpoint position in the service change log set.
9. The method of claim 6, after said capturing a snapshot of a set of business change logs, further comprising:
locking the service change log set snapshot and shielding write operations of other clients;
after reading each service change log in the service change log set according to the service change log set snapshot, the method further includes:
and releasing the lock on the service change log set snapshot.
10. A data transmission apparatus, characterized in that the apparatus comprises:
the service change log set acquisition module is used for acquiring a service change log set in a service database;
the service change log compiling module is used for compiling each service change log in the service change log set to obtain a message with a set format and storing the message into a message queue;
and the message sending module with the set format is used for sending the messages with the set format in the message queue to a data warehouse according to a set period.
11. A data transmission apparatus, characterized in that the apparatus comprises:
the snapshot capturing module is used for capturing a snapshot of the service change log set;
a reading module, configured to read each service change log in the service change log set according to the service change log set snapshot;
and the sending module is used for sending each service change log in the service change log set to the message middleware.
12. A data transmission device, characterized in that the device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to:
acquiring a service change log set in a service database;
compiling each service change log in the service change log set to obtain a message with a set format, and storing the message in a message queue;
and sending the messages with the set format in the message queue to a data warehouse according to a set period.
13. A data transmission device, characterized in that the device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to:
capturing a service change log set snapshot;
reading each service change log in the service change log set according to the service change log set snapshot;
and sending each service change log in the service change log set to a message middleware.
CN201910879649.8A 2019-09-18 2019-09-18 Data transmission system, method, device and equipment Pending CN110597914A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910879649.8A CN110597914A (en) 2019-09-18 2019-09-18 Data transmission system, method, device and equipment

Publications (1)

Publication Number Publication Date
CN110597914A true CN110597914A (en) 2019-12-20

Family

ID=68860347

Country Status (1)

Country Link
CN (1) CN110597914A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102262662A (en) * 2011-07-22 2011-11-30 浪潮(北京)电子信息产业有限公司 System, device and method for realizing database data migration in heterogeneous platform
CN105472042A (en) * 2016-01-15 2016-04-06 中煤电气有限公司 WEB terminal controlled message middleware system and data transmission method thereof
CN107783975A (en) * 2016-08-24 2018-03-09 北京京东尚科信息技术有限公司 The method and apparatus of distributed data base synchronization process
CN108009252A (en) * 2017-12-04 2018-05-08 传神语联网网络科技股份有限公司 The method and device of data synchronization
CN109582731A (en) * 2018-10-18 2019-04-05 恒峰信息技术有限公司 A kind of real time data synchronization method and system
CN109753531A (en) * 2018-12-26 2019-05-14 深圳市麦谷科技有限公司 A kind of big data statistical method, system, computer equipment and storage medium
CN109800207A (en) * 2019-01-14 2019-05-24 深圳前海微众银行股份有限公司 Log analytic method, device, equipment and computer readable storage medium
CN110083660A (en) * 2019-04-29 2019-08-02 重庆天蓬网络有限公司 A kind of method, apparatus of synchrodata, medium and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
中国通信学会学术工作委员会: "《第九届中国通信学会学术年会论文集》", 31 December 2012 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111625552A (en) * 2020-05-20 2020-09-04 北京百度网讯科技有限公司 Data collection method, device, equipment and readable storage medium
CN111625552B (en) * 2020-05-20 2024-01-02 北京百度网讯科技有限公司 Data collection method, device, equipment and readable storage medium
CN113760845A (en) * 2020-08-17 2021-12-07 北京沃东天骏信息技术有限公司 Log processing method, system, device, client and storage medium
CN113743697A (en) * 2020-08-21 2021-12-03 西安京迅递供应链科技有限公司 Risk alarm method and device
CN112052227A (en) * 2020-09-25 2020-12-08 郑州阿帕斯数云信息科技有限公司 Data change log processing method and device and electronic equipment
CN112162904A (en) * 2020-09-25 2021-01-01 同程网络科技股份有限公司 Order change process integration method, order change process extraction method, order change process integration device and order change process extraction device
CN114422577A (en) * 2020-10-12 2022-04-29 腾讯科技(深圳)有限公司 Method and device for processing service change message
CN112417018A (en) * 2020-11-23 2021-02-26 中国工商银行股份有限公司 Data sharing method and device
CN112417018B (en) * 2020-11-23 2023-09-22 中国工商银行股份有限公司 Data sharing method and device
CN112445863A (en) * 2020-11-30 2021-03-05 永辉云金科技有限公司 Real-time data synchronization method and system
CN112445863B (en) * 2020-11-30 2024-06-18 永辉云金科技有限公司 Data real-time synchronization method and system
CN113765984A (en) * 2021-01-04 2021-12-07 北京沃东天骏信息技术有限公司 Data pushing method and device
CN112883367A (en) * 2021-01-26 2021-06-01 北京高因科技有限公司 Trigger data secure transmission method and device
CN113268540A (en) * 2021-03-26 2021-08-17 北京视博云信息技术有限公司 Data synchronization method and device
CN113434600A (en) * 2021-06-30 2021-09-24 青岛海尔科技有限公司 Data synchronization method and device
CN113434600B (en) * 2021-06-30 2023-06-09 青岛海尔科技有限公司 Data synchronization method and device
CN113821407A (en) * 2021-09-15 2021-12-21 浙江网新恩普软件有限公司 Storm distributed real-time computing method and system
CN113821407B (en) * 2021-09-15 2023-08-01 浙江浙大网新软件产业集团有限公司 Storm distributed real-time computing method and system
CN114328750A (en) * 2021-12-31 2022-04-12 北京发现角科技有限公司 Method and device for synchronizing service data with ODS (oxide dispersion strengthened) layer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191220
