CN111526188A - System and method for ensuring zero data loss based on Spark Streaming in combination with Kafka - Google Patents

System and method for ensuring zero data loss based on Spark Streaming in combination with Kafka

Info

Publication number
CN111526188A
Authority
CN
China
Prior art keywords
data
kafka
copy
leader
offset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010281180.0A
Other languages
Chinese (zh)
Other versions
CN111526188B (en
Inventor
王婧妍
徐晶
石波
胡佳
谢小明
施雪成
丁卫星
李渊
杨坤崇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Computer Technology and Applications
Original Assignee
Beijing Institute of Computer Technology and Applications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Computer Technology and Applications filed Critical Beijing Institute of Computer Technology and Applications
Priority to CN202010281180.0A priority Critical patent/CN111526188B/en
Publication of CN111526188A publication Critical patent/CN111526188A/en
Application granted granted Critical
Publication of CN111526188B publication Critical patent/CN111526188B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0407 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the identity of one or more communicating identities is hidden
    • H04L63/0421 Anonymous communication, i.e. the party's identifiers are hidden from the other party or parties, e.g. using an anonymizer
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a system and a method for ensuring zero data loss based on Spark Streaming combined with Kafka, and belongs to the technical field of real-time processing and sorting of streaming data. The system comprises a data caching module and a stream calculation module. The data caching module uses Kafka to cache data acquired from different sources and forwards it to the stream calculation module, setting parameters at Kafka's data production end, zookeeper cluster end, and data consumption end to prevent data loss. The stream calculation module uses the Kafka Direct API, a unique data ID, and partition offsets added to each piece of data so that no data is lost and no data is consumed twice. The two modules complement each other to guarantee zero data loss and reliable message transmission.

Description

System and method for ensuring zero data loss based on Spark Streaming in combination with Kafka
Technical Field
The invention belongs to the technical field of real-time processing and sorting of streaming data, and particularly relates to a system and a method for ensuring zero data loss based on Spark Streaming combined with Kafka.
Background
With the advent and popularization of the information age, data informatization is closely tied to daily life and work. The daily operation of an enterprise often generates TB-level data, with sources covering every type of data that Internet appliances can capture. Faced with such a huge volume of logs, traditional log processing frameworks can no longer meet current demands. Big data analysis is the analysis of data at enormous scale, and the real-time requirements that system services place on data are gradually increasing. Real-time big data analysis uses big data technology to complete the analysis of huge data sets efficiently and quickly, achieving near real-time results and reflecting the value and significance of the data in a more timely way. Real-time processing is widely applied, for example in the real-time recommendation scenarios of business departments, the real-time reports of data departments, and the real-time monitoring of operation and maintenance departments. Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the streaming data in a consumer-scale website; through Kafka's real-time data streams, feedback can be obtained promptly when an event occurs. Spark Streaming realizes the processing of real-time streaming data with high throughput and a fault-tolerance mechanism: it supports acquiring data from multiple data sources, processing it with high-level functions and complex algorithms, and finally storing the results in a file system or database. Kafka, used as a message system, provides message persistence capability, but it carries a hidden danger of data loss. How to configure Kafka to ensure zero data loss, and how to guarantee reliable message transmission when combining it with Spark Streaming, is a technical problem to be solved urgently.
The support technology of the real-time data platform mainly comprises four aspects: real-time data collection (e.g., FLUME), message middleware (e.g., Kafka), stream computation frameworks (e.g., Storm, Spark, Flink, and Beam), and real-time storage of data (e.g., HBase for column family storage). The most central technology of the real-time data platform is stream computing.
Disclosure of Invention
Technical problem to be solved
The technical problem to be solved by the invention is: how to prevent Kafka from losing data during transmission and consumption, and how to ensure reliability of the data consumption process.
(II) technical scheme
In order to solve the above technical problem, the invention provides a system for real-time stream processing based on the Kafka partitioning technology and Spark Streaming, which comprises a data caching module and a stream calculation module, wherein the data caching module is used for caching data acquired from different sources and forwarding the data to the stream calculation module, and the stream calculation module is used for processing the data after the data are read.
Preferably, the data caching module is specifically configured to use Kafka to cache the data acquired from different sources and forward it to the stream calculation module; when doing so, parameters are set from three aspects, namely Kafka's data production end, the zookeeper cluster end, and the data consumption end, so as to prevent data loss.
Preferably, the setting of parameters at the zookeeper cluster end by the data caching module to prevent data loss is specifically as follows. Kafka guarantees the order of messages within a partition: messages sent to a Kafka partition first are consumed first within that partition. Each Kafka topic has multiple partitions, each partition has multiple copies, and the copies are divided into one leader copy and the remaining follower copies. All messages are sent to the leader copy, and message consumption is also served from the leader copy and then synchronized to the other copies. When the leader copy becomes unavailable, a follower copy is elected as the new leader copy. A follower copy that keeps synchronized with the leader copy is a synchronized copy; one that cannot keep up is a non-synchronized copy. If the leader copy goes down, a follower copy must be elected as leader, and if a non-synchronized copy becomes the leader, part of the data is lost; this behavior is called unclean leader election. For such situations, the corresponding parameter is set to false to prevent unclean leader election, or the minimum number of synchronized copies is set to 1 to ensure that 1 synchronized copy exists when the host goes down.
Preferably, the setting of parameters at the data production end by the data caching module to prevent data loss is specifically as follows: after receiving a message, Kafka returns an ack parameter; with ack=1, once the leader copy has successfully written the message, the zookeeper cluster end, acting as the server side, feeds back a success response, so ack is set to 1.
Preferably, the setting of parameters at the data consumption end by the data caching module to prevent data loss is specifically as follows: a manual offset update is configured, either committing after a batch has been consumed, or using an accumulator so that when an exception occurs the offset of the record that failed processing is committed and the next consumption starts from that committed offset.
Preferably, the stream calculation module specifically uses Kafka Direct API, sets a data unique ID, and adds a partition offset to the data to solve the data loss problem.
Preferably, the method by which the stream calculation module uses the Kafka Direct API to solve the data loss problem is specifically: the Kafka Direct API uses the Spark Driver to calculate the range of offsets in Kafka that the next batch needs to process, and consumes data directly from the Kafka topic partitions.
Preferably, the method by which the stream calculation module sets a unique data ID to solve the data loss problem is specifically: when writing to the database, an upsert statement is adopted, updating the record if it already exists and inserting it if it does not; this method also sets the offset commit mode of Direct DStream consumption.
Preferably, the method by which the stream calculation module adds partition offsets to solve the data loss problem is specifically: the offset of each partition is added to each piece of data, and if the program goes down, the latest partition offset information is read from the database after restart.
The invention also provides a method for realizing zero data loss in the data storage and transmission process by using the system.
(III) advantageous effects
By means of the high scalability and high reliability of Kafka, the invention collects and summarizes the data acquired from data sources and persists it to disk, reducing the probability of data loss. The Kafka offset mechanism combined with Spark Streaming solves the problem of data loss during Kafka transmission and consumption, so that the data can be processed efficiently in real time while its reliability and reliable message transmission are guaranteed.
Drawings
FIG. 1 is a schematic diagram of zookeeper cluster replica classification in the present invention;
FIG. 2 is a schematic diagram of the ack mechanism at the Kafka production end in the present invention;
FIG. 3 is a schematic diagram of the Kafka Direct API of the present invention.
Detailed Description
In order to make the objects, contents, and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be made in conjunction with the accompanying drawings and examples.
The invention provides a system and a corresponding method for ensuring zero data loss based on the combination of Kafka and Spark Streaming. The system for real-time stream processing based on the Kafka partitioning technology and Spark Streaming comprises a data caching module and a stream calculation module. The data caching module caches the data acquired from different sources and forwards it to the stream calculation module. After reading the data, the stream calculation module processes it so that no data is lost and no data is consumed twice. The two modules complement each other to guarantee zero data loss. The invention uses the Kafka offset mechanism in combination with Spark Streaming to solve the problem of data loss during Kafka transmission and consumption and to ensure reliability of the data consumption process.
1. Data caching module
Kafka is a distributed, highly available, high-throughput messaging system. It has excellent message persistence capability: messages are persisted to local disk, partitioned, to prevent loss, and constant-time access performance is maintained even for data above the TB level. Running on zookeeper, it is fault tolerant and allows nodes in the cluster to fail without data loss. By sending compressed data in batches, data transmission overhead is reduced and throughput improved. Partitions are supported, and messages are ordered within the same partition, although global message ordering cannot be achieved. Every message in a Kafka partition carries a continuous sequence number, the offset, which uniquely identifies the message and records the sequence number of the next message to be provided to the consumer. Kafka carries a hidden danger of data loss: if the consumer has finished reading and the offset has already been committed, but Spark Streaming crashes before processing is finished, the offset has nevertheless been advanced and the unprocessed data can no longer be consumed, so that data is lost. The invention prevents data loss from three aspects: Kafka's data production end, the zookeeper cluster end, and the data consumption end.
1.1 Zookeeper cluster end
Kafka guarantees the order of messages within a partition: messages sent to a Kafka partition first are consumed first within that partition. Each Kafka topic has multiple partitions, each partition has multiple copies, one of which is the leader copy and the rest follower copies. All messages are sent to the leader copy, and message consumption is also served from the leader copy and then synchronized to the other copies. When the leader copy becomes unavailable, a follower copy is elected to become the new leader copy. As shown in FIG. 1, a follower copy that keeps synchronized with the leader copy is a synchronized copy, and one that cannot keep up is a non-synchronized copy. If the leader copy goes down, a follower copy must be elected as leader, and if a non-synchronized copy becomes the leader, part of the data is lost. This behavior is called unclean leader election. For such situations, the corresponding parameter (in Kafka, unclean.leader.election.enable) is set to false to prevent unclean leader election. Alternatively, the minimum number of synchronized copies can be set to 1, ensuring that 1 synchronized copy exists when the host goes down.
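The data loss caused by electing a non-synchronized copy as leader (unclean leader election) can be illustrated with a small in-memory sketch. This is a toy model of the semantics only, not actual broker code; `Replica` and `elect_leader` are hypothetical names:

```python
# Toy model of a partition with one leader and two follower replicas.
# Illustrates why electing an out-of-sync replica as leader loses data.

class Replica:
    def __init__(self, name):
        self.name = name
        self.log = []          # messages this replica has persisted

def elect_leader(replicas, in_sync, allow_unclean):
    """Pick a new leader; with allow_unclean=False only in-sync replicas qualify."""
    candidates = [r for r in replicas if allow_unclean or r in in_sync]
    if not candidates:
        raise RuntimeError("no eligible leader (partition unavailable)")
    return candidates[0]

leader = Replica("leader")
f1, f2 = Replica("f1"), Replica("f2")

# f1 keeps up with the leader; f2 lags behind (non-synchronized).
for msg in ["m1", "m2", "m3"]:
    leader.log.append(msg)
    f1.log.append(msg)
f2.log = ["m1"]                # f2 only replicated the first message

# The leader crashes. An unclean election may pick the lagging f2,
# silently discarding m2 and m3; a clean election must pick f1.
survivors, isr = [f2, f1], {f1}
unclean = elect_leader(survivors, isr, allow_unclean=True)
clean = elect_leader(survivors, isr, allow_unclean=False)
print(unclean.name, unclean.log)   # f2 ['m1']  -> m2, m3 are gone
print(clean.name, clean.log)       # f1 ['m1', 'm2', 'm3'] -> no loss
```

In real Kafka the second behavior corresponds to setting unclean.leader.election.enable to false on the broker, usually together with min.insync.replicas.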
1.2 Data production end
After Kafka receives a message, it returns an ack parameter. The ack parameter takes the values 0, 1, and all, representing different acknowledgement modes, as shown in FIG. 2. With ack=0, the producer does not wait for any response from the server after sending a message. With ack=1, the server side (the zookeeper cluster end) feeds back a success response once the leader copy has successfully written the message. With ack=all, the server side feeds back a success response only after all copies in Kafka have successfully written the data. ack=all ensures high reliability but reduces throughput. Therefore ack is set to 1 in this step, which ensures data reliability while preserving Kafka's high throughput.
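The trade-off between the ack modes can be sketched with a toy simulation of what has been durably replicated at acknowledgement time. This is illustrative only; `produce` and its arguments are hypothetical names, not the Kafka producer API:

```python
# Toy model of producer acks: which messages were acknowledged vs. which
# actually survive a leader crash before replication has caught up.

def produce(messages, acks, replicate):
    """Simulate sending messages; `replicate` says whether followers catch up
    before the leader crashes. Returns (acked, surviving) message lists."""
    leader, followers = [], [[], []]
    acked = []
    for m in messages:
        leader.append(m)
        if replicate:
            for f in followers:
                f.append(m)
        # acks=1 confirms after the leader write alone; acks='all' confirms
        # only once every in-sync follower also holds the message.
        if acks == 1 or (acks == "all" and replicate):
            acked.append(m)
    # Leader crashes; a follower takes over, so only replicated data survives.
    surviving = followers[0]
    return acked, surviving

# acks=1: all three messages were acked, yet none were replicated -> silent loss.
acked1, kept1 = produce(["a", "b", "c"], acks=1, replicate=False)
# acks='all': nothing is acked until followers have it -> no silent loss.
ackedA, keptA = produce(["a", "b", "c"], acks="all", replicate=False)
print(acked1, kept1)   # ['a', 'b', 'c'] []
print(ackedA, keptA)   # [] []
```

This is why acks=all is the most reliable mode, while acks=1, as chosen here, trades a small loss window for higher throughput.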
1.3 Data consumption end
A manual offset update is configured: the automatic-commit switch (in Kafka, enable.auto.commit) is set to false. With automatic offset commit only, consider pulling 30 records: the offset is committed automatically while only 20 records have been processed, an exception occurs while processing record 21, and when data is pulled again, reading starts after record 30, so records 21 to 30 are lost. To prevent such data loss, automatic commit is changed to manual commit: either commit after a batch has been consumed, or use an accumulator so that when an exception occurs the offset of the record that failed processing is committed and the next consumption starts from that committed offset.
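The 30-record example above can be reproduced with a minimal simulation of the two commit policies. This is a sketch of the offset semantics only; `consume` is a hypothetical helper, not the real consumer API:

```python
# Toy consumer loop contrasting eager auto-commit with manual
# commit-after-processing, using the 30-record scenario from the text.

def consume(records, fail_at, auto_commit):
    """Process records; crash at index `fail_at`. Return the committed offset,
    which is where a restarted consumer would resume reading."""
    committed = 0
    for i, rec in enumerate(records):
        if auto_commit:
            committed = len(records)   # whole fetched batch committed eagerly
        if i == fail_at:
            return committed           # exception: restart resumes at `committed`
        # ... process rec ...
        if not auto_commit:
            committed = i + 1          # commit only what was fully processed

    return committed

records = list(range(30))
# Auto-commit: crash at record 21 (index 20) but offset 30 is already
# committed, so records 21..30 are skipped on restart -> lost.
assert consume(records, fail_at=20, auto_commit=True) == 30
# Manual commit: restart re-reads from record 21 -> nothing lost.
assert consume(records, fail_at=20, auto_commit=False) == 20
```

With the real consumer this corresponds to disabling enable.auto.commit and committing offsets explicitly after each processed batch.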
2. Stream calculation module
Spark Streaming is an extension of the Spark core API that supports scalable, high-throughput, fault-tolerant processing of real-time data streams. Spark Streaming receives the real-time data stream and decomposes it into a series of short batch operations, the DStream, converting each batch into an RDD (resilient distributed dataset), then applying transformations to the RDDs and keeping the results in memory. In the receiver-based approach, Kafka's data is received by a Spark Streaming receiver and stored in Spark; once the data is stored, the Kafka offset in zookeeper is updated, so scenarios of data loss can still arise. The invention uses the Kafka Direct API, sets a unique data ID, and adds the partition offset to the data to solve the data loss problem.
2.1 Kafka Direct API: the streaming data from Kafka is consumed and processed in a way that guarantees zero data loss and prevents repeated consumption. As shown in FIG. 3, the Kafka Direct API uses the Spark Driver to calculate the range of offsets in Kafka that the next batch needs to process, and consumes data directly from the Kafka topic partitions.
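The driver-side offset-range computation can be sketched as follows. `next_offset_ranges` is a hypothetical helper illustrating the idea; in the real Direct API the driver builds per-partition OffsetRange values internally for each micro-batch:

```python
# Sketch of the Direct-API idea: the driver computes, for every batch, the
# exact [from, to) offset range to read from each partition, so each record
# is consumed exactly once per batch without a separate receiver.

def next_offset_ranges(last_processed, latest):
    """Map partition -> (from_offset, to_offset) for the next batch.
    `last_processed` holds offsets already handled; `latest` the current
    log-end offsets reported by the brokers. New partitions start at 0."""
    return {p: (last_processed.get(p, 0), latest[p]) for p in latest}

last = {"p0": 100, "p1": 250}
latest = {"p0": 130, "p1": 250, "p2": 40}   # p2 is a newly seen partition
ranges = next_offset_ranges(last, latest)
print(ranges)   # {'p0': (100, 130), 'p1': (250, 250), 'p2': (0, 40)}
```

An empty range such as p1's (250, 250) simply means that partition contributes no records to this batch.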
2.2 Setting a unique data ID: when writing to the database, an upsert statement is adopted, updating the record if it already exists and inserting it if it does not. This method requires setting the offset commit mode of Direct DStream consumption accordingly.
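The upsert-by-unique-ID idea can be illustrated with a toy sink, where a dict stands in for the database table keyed by the unique ID. A sketch only; the record shape is an assumption:

```python
# Toy idempotent sink: replaying a batch after a failure leaves exactly one
# copy of each record, because writes are keyed by the unique data ID.

def upsert(db, records):
    """Insert each record if its ID is absent, otherwise update it in place."""
    for rec in records:
        db[rec["id"]] = rec["value"]

db = {}
batch = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
upsert(db, batch)
upsert(db, batch)          # batch replayed after a crash: no duplicates
assert db == {1: "a", 2: "b"}
```

Because the write is idempotent, re-consuming a batch whose offset was never committed cannot produce duplicate rows.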
2.3 Adding partition offsets: the offset of each partition is added to each piece of data. If the program goes down, the latest partition offset information is read from the database after restart, which guarantees the atomicity of the data and its offset and solves both the data loss and the repeated consumption problems.
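Storing each record together with its partition offset can be sketched with a toy store; in a real system the two writes below would share a single database transaction, which is what gives the atomicity the text describes:

```python
# Sketch of writing data and its partition offset atomically, so a restarted
# job can resume from exactly the last persisted offset (toy "database").

class Store:
    def __init__(self):
        self.rows = []
        self.offsets = {}                  # partition -> next offset to read

    def write_atomically(self, partition, offset, value):
        # In practice: one transaction covering both the row and the offset.
        self.rows.append(value)
        self.offsets[partition] = offset + 1

store = Store()
for off, val in [(0, "a"), (1, "b"), (2, "c")]:
    store.write_atomically("p0", off, val)

# After a crash and restart, consumption resumes from the persisted offset,
# so nothing is lost and nothing is re-processed.
resume_from = store.offsets["p0"]
assert resume_from == 3
assert store.rows == ["a", "b", "c"]
```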
The invention collects and summarizes the data acquired from the data sources and persists it to disk by utilizing the high scalability and high reliability of Kafka, thereby reducing the probability of data loss. Combined with Spark Streaming data processing, the data can be processed efficiently in real time and its reliability guaranteed.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (10)

1. A system for real-time stream processing based on the Kafka partitioning technology and Spark Streaming, characterized by comprising a data caching module and a stream calculation module, wherein the data caching module is used for caching data acquired from different sources and forwarding the data to the stream calculation module; and the stream calculation module is used for processing the data after the data are read.
2. The system according to claim 1, wherein the data caching module is specifically configured to implement caching of data acquired from different sources by using Kafka and forward the data to the stream computation module, and when implementing caching of data acquired from different sources by using Kafka and forward the data to the stream computation module, parameters are set to prevent data loss in three aspects, specifically, from a data production end, a zookeeper cluster end, and a data consumption end of Kafka.
3. The system of claim 2, wherein the setting of parameters at the zookeeper cluster end by the data caching module to prevent data loss is specifically: Kafka guarantees the order of messages within a partition, and messages sent to a Kafka partition first are consumed first within that partition; each Kafka topic has multiple partitions, each partition has multiple copies, divided into one leader copy and the remaining follower copies; all messages are sent to the leader copy, and message consumption is also served from the leader copy and then synchronized to the other copies; when the leader copy is unavailable, a follower copy is elected as the new leader copy; a follower copy that keeps synchronized with the leader copy is a synchronized copy, and one that cannot keep up is a non-synchronized copy; if the leader copy goes down, a follower copy must be elected as leader, and if a non-synchronized copy becomes the leader, part of the data is lost, a behavior called unclean leader election; for such situations, a parameter is set to false to prevent unclean leader election, or the minimum number of synchronized copies is set to 1 to ensure that 1 synchronized copy exists when the host goes down.
4. The system of claim 2, wherein the setting of parameters at the data production end by the data caching module to prevent data loss is specifically: after receiving a message, Kafka returns an ack parameter; with ack=1, once the leader copy has successfully written the message, the zookeeper cluster end, as the server side, feeds back a success response, so ack is set to 1.
5. The system of claim 2, wherein the setting of parameters at the data consumption end by the data caching module to prevent data loss is specifically: a manual offset update is configured, either committing after a batch has been consumed, or using an accumulator so that when an exception occurs the offset of the record that failed processing is committed and the next consumption starts from that committed offset.
6. The system of claim 2, wherein the stream calculation module solves the data loss problem by using one of the Kafka Direct API, setting a unique data ID, and adding a partition offset to the data.
7. The system of claim 6, wherein the stream calculation module solves the data loss problem specifically using the Kafka Direct API: the Kafka Direct API uses the Spark Driver to calculate the range of offsets in Kafka that the next batch needs to process, consuming data directly from the Kafka topic partitions.
8. The system of claim 6, wherein the stream calculation module solves the data loss problem specifically using the method of setting a unique data ID: when writing to the database, an upsert statement is adopted, updating the record if it already exists and inserting it if it does not, and the method sets the offset commit mode of Direct DStream consumption.
9. The system of claim 6, wherein the stream calculation module solves the data loss problem specifically using the method of adding partition offsets: the offset of each partition is added to each piece of data, and if the program goes down, the latest partition offset information is read from the database after restart.
10. A method of achieving zero data loss during data storage and transmission using the system of any one of claims 1 to 9.
CN202010281180.0A 2020-04-10 2020-04-10 System and method for ensuring zero data loss based on Spark Streaming in combination with Kafka Active CN111526188B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010281180.0A CN111526188B (en) 2020-04-10 2020-04-10 System and method for ensuring zero data loss based on Spark Streaming in combination with Kafka

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010281180.0A CN111526188B (en) 2020-04-10 2020-04-10 System and method for ensuring zero data loss based on Spark Streaming in combination with Kafka

Publications (2)

Publication Number Publication Date
CN111526188A true CN111526188A (en) 2020-08-11
CN111526188B CN111526188B (en) 2022-11-22

Family

ID=71901685

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010281180.0A Active CN111526188B (en) 2020-04-10 2020-04-10 System and method for ensuring zero data loss based on Spark Streaming in combination with Kafka

Country Status (1)

Country Link
CN (1) CN111526188B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112087501A (en) * 2020-08-28 2020-12-15 北京明略昭辉科技有限公司 Transmission method and system for keeping data consistency
CN112269765A (en) * 2020-11-13 2021-01-26 中盈优创资讯科技有限公司 Method and device for improving data source reading performance of Spark structured stream file
CN115604290A (en) * 2022-12-13 2023-01-13 云账户技术(天津)有限公司(Cn) Kafka message execution method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160321308A1 (en) * 2015-05-01 2016-11-03 Ebay Inc. Constructing a data adaptor in an enterprise server data ingestion environment
CN106776855A (en) * 2016-11-29 2017-05-31 上海轻维软件有限公司 The processing method of Kafka data is read based on Spark Streaming
US20190102266A1 (en) * 2017-09-29 2019-04-04 Oracle International Corporation Fault-tolerant stream processing
CN110908788A (en) * 2019-12-02 2020-03-24 北京锐安科技有限公司 Spark Streaming based data processing method and device, computer equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160321308A1 (en) * 2015-05-01 2016-11-03 Ebay Inc. Constructing a data adaptor in an enterprise server data ingestion environment
CN106776855A (en) * 2016-11-29 2017-05-31 上海轻维软件有限公司 The processing method of Kafka data is read based on Spark Streaming
US20190102266A1 (en) * 2017-09-29 2019-04-04 Oracle International Corporation Fault-tolerant stream processing
CN110908788A (en) * 2019-12-02 2020-03-24 北京锐安科技有限公司 Spark Streaming based data processing method and device, computer equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王岩等: "一种基于Kafka的可靠的Consumer的设计方案", 《软件》 *
韩德志等: "基于Spark Streaming的实时数据分析***及其应用", 《计算机应用》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112087501A (en) * 2020-08-28 2020-12-15 北京明略昭辉科技有限公司 Transmission method and system for keeping data consistency
CN112087501B (en) * 2020-08-28 2023-10-24 北京明略昭辉科技有限公司 Transmission method and system for maintaining data consistency
CN112269765A (en) * 2020-11-13 2021-01-26 中盈优创资讯科技有限公司 Method and device for improving data source reading performance of Spark structured stream file
CN115604290A (en) * 2022-12-13 2023-01-13 云账户技术(天津)有限公司(Cn) Kafka message execution method, device, equipment and storage medium
CN115604290B (en) * 2022-12-13 2023-03-24 云账户技术(天津)有限公司 Kafka message execution method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111526188B (en) 2022-11-22

Similar Documents

Publication Publication Date Title
CN111526188B (en) System and method for ensuring zero data loss based on Spark Streaming in combination with Kafka
CN105959151B (en) A kind of Stream Processing system and method for High Availabitity
CN106953901B (en) Cluster communication system and method for improving message transmission performance
US9917913B2 (en) Large message support for a publish-subscribe messaging system
US20210112013A1 (en) Message broker system with parallel persistence
US20200059376A1 (en) Eventually consistent data replication in queue-based messaging systems
CN112507029B (en) Data processing system and data real-time processing method
CN101277272B (en) Method for implementing magnanimity broadcast data warehouse-in
US11595474B2 (en) Accelerating data replication using multicast and non-volatile memory enabled nodes
CN105493474B (en) System and method for supporting partition level logging for synchronizing data in a distributed data grid
CN110795503A (en) Multi-cluster data synchronization method and related device of distributed storage system
Spirovska et al. Wren: Nonblocking reads in a partitioned transactional causally consistent data store
CN111787055A (en) Redis-based transaction mechanism and multi-data center oriented data distribution method and system
CN112527844A (en) Data processing method and device and database architecture
Oleson et al. Operational information systems: An example from the airline industry
CN112965839A (en) Message transmission method, device, equipment and storage medium
US8359601B2 (en) Data processing method, cluster system, and data processing program
US20210357275A1 (en) Message stream processor microbatching
CN114676199A (en) Synchronization method, synchronization system, computer equipment and storage medium
CN116226139B (en) Distributed storage and processing method and system suitable for large-scale ocean data
US8201017B2 (en) Method for queuing message and program recording medium thereof
EP2025133B1 (en) Repository synchronization in a ranked repository cluster
CN116304390B (en) Time sequence data processing method and device, storage medium and electronic equipment
CN108989465B (en) Consensus method, server, storage medium and distributed system
US20100275217A1 (en) Global attribute uniqueness (gau) using an ordered message service (oms)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant