CN112486655A - High-concurrency data processing system and method - Google Patents

High-concurrency data processing system and method

Info

Publication number
CN112486655A
Authority
CN
China
Prior art keywords
server
message queue
load balancing
data
processing
Prior art date
Legal status
Pending
Application number
CN202011423413.2A
Other languages
Chinese (zh)
Inventor
黄泽鑫
罗晓
王敉佳
汪雄飞
Current Assignee
Gree Electric Appliances Inc of Zhuhai
Original Assignee
Gree Electric Appliances Inc of Zhuhai
Priority date
Filing date
Publication date
Application filed by Gree Electric Appliances Inc of Zhuhai
Priority to CN202011423413.2A
Publication of CN112486655A
Legal status: Pending

Classifications

    • G06F 9/5083 Techniques for rebalancing the load in a distributed system
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/546 Message passing systems or structures, e.g. queues
    • H04L 47/50 Queue scheduling
    • H04L 67/1023 Server selection for load balancing based on a hash applied to IP addresses or costs
    • H04L 67/63 Routing a service request depending on the request content or context

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a high-concurrency data processing system and method. The system comprises a database, a load balancing server, a plurality of message queue servers, and a plurality of stream processing servers arranged in one-to-one correspondence with the message queue servers. Each message queue server is provided with a topic channel composed of message queues and used for temporarily storing one type of data request; the topic channel temporarily stores the data requests distributed by the load balancing server. Each stream processing server processes, by means of stream processing technology, the data requests in the topic channel of its corresponding message queue server and stores the processing results in the database. The technical scheme of the invention can meet the demand of processing a large volume of highly concurrent data.

Description

High-concurrency data processing system and method
Technical Field
The invention relates to the field of servers, in particular to a high-concurrency data processing system and a high-concurrency data processing method.
Background
With the rapid development of the industrial internet, the number of industrial devices keeps growing, device data must be reported and stored, and the frequency of data reporting keeps increasing, which places great pressure on the server; meanwhile, real-time service processing accompanies the data reporting process. How to improve the processing capability of the server so that data can be processed and stored in real time in the face of highly concurrent reporting and processing from many devices is a problem that must be solved first, and therefore an architecture with high-concurrency processing capability is needed.
Disclosure of Invention
The invention aims to solve the technical problem in the prior art that the rapidly growing volume of data to be processed places heavy pressure on the server, and provides a high-concurrency data processing system and method.
In an embodiment of the present invention, a highly concurrent data processing system is provided, which includes a database, a load balancing server, a plurality of message queue servers, and a plurality of stream processing servers arranged in one-to-one correspondence with the message queue servers,
the load balancing server is used for distributing an external data request to the corresponding message queue server according to a pre-configured IP mapping relation and a scheduling algorithm;
each message queue server is provided with a topic channel composed of message queues and used for temporarily storing one type of data request, the topic channel being used for temporarily storing the data requests distributed by the load balancing server;
each stream processing server is used for processing, by means of stream processing technology, the data requests in the topic channel of its correspondingly arranged message queue server and for storing the processing results in the database.
In the embodiment of the invention, the load balancing server adopts an Nginx server.
In the embodiment of the invention, the message queue server adopts a Kafka server.
In the embodiment of the invention, the stream processing server adopts a Flink server.
In the embodiment of the invention, the message queue servers corresponding to different types of data requests, and the IP addresses of those message queue servers, are set in the pre-configured IP mapping relation and scheduling algorithm of the load balancing server.
In an embodiment of the present invention, a method for processing high-concurrency data is further provided, where the method includes:
the load balancing server distributes the external data request to the corresponding message queue server according to the pre-configured IP mapping relation and the scheduling algorithm;
the message queue server temporarily stores the data requests distributed by the load balancing server in a topic channel used for temporarily storing data requests of the same type;
the stream processing server obtains the data request in the topic channel of the corresponding message queue server, processes the data request by adopting a stream processing technology, and stores the processing result into a database.
Compared with the prior art, with the high-concurrency data processing system and method described above, the load balancing server distributes the external data request to the corresponding message queue server according to the pre-configured IP mapping relation and scheduling algorithm; the message queue server temporarily stores the data requests distributed by the load balancing server in a topic channel used for temporarily storing data requests of the same type; and the stream processing server obtains the data requests in the topic channel of the corresponding message queue server, processes them using stream processing technology, and stores the processing results in the database, so that a large volume of highly concurrent data processing demands can be met.
Drawings
FIG. 1 is a block diagram of a highly concurrent data processing system according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a data processing flow of a highly concurrent data processing system according to an embodiment of the present invention.
Detailed Description
As shown in fig. 1, in the embodiment of the present invention, a highly concurrent data processing system is provided, which includes a database, a load balancing server, a plurality of message queue servers, and a plurality of stream processing servers arranged in one-to-one correspondence with the message queue servers. Each component is described separately below.
The database is used for storing data. In the embodiment of the invention, the database adopts a MySQL database.
And the load balancing server is used for distributing the external data request to the corresponding message queue server according to the pre-configured IP mapping relation and the scheduling algorithm.
It should be noted that, in the embodiment of the present invention, the load balancing server employs an Nginx server. The Nginx server has the advantages of high speed, good scalability and high reliability. In the embodiment of the invention, the Nginx server is adopted as the load balancing server and, taking advantage of its high reliability, performs the reverse-proxy distribution of data requests. In the pre-configured IP mapping relationship and scheduling algorithm, the message queue servers corresponding to different types of data requests and their IP addresses are set, so that the load balancing server can send each data request to the corresponding message queue server according to the type of the data request.
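As an illustration only (the patent does not disclose a concrete configuration), such an IP mapping might be expressed in an Nginx configuration along the following lines. The sketch assumes that devices report over HTTP and that each Kafka server is reachable through an HTTP ingestion endpoint such as a REST proxy on port 8082, since Kafka brokers do not accept plain HTTP themselves; all host addresses, ports and URI paths are hypothetical.

```nginx
# Illustrative sketch only; not a configuration disclosed by the patent.
# Assumes each Kafka server is fronted by an HTTP ingestion endpoint
# (e.g. a REST proxy) on port 8082; addresses and paths are hypothetical.
events {}

http {
    # One upstream group per type of data request (the "IP mapping relation").
    # With several servers per group, the default round-robin or a directive
    # such as least_conn plays the role of the scheduling algorithm.
    upstream temperature_reports { server 192.168.1.11:8082; }
    upstream status_reports      { server 192.168.1.12:8082; }

    server {
        listen 80;

        # Each data request is reverse-proxied to the message queue server
        # that handles its type.
        location /report/temperature/ { proxy_pass http://temperature_reports; }
        location /report/status/      { proxy_pass http://status_reports; }
    }
}
```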
Each message queue server is provided with a topic channel composed of message queues and used for temporarily storing one type of data request; the topic channel temporarily stores the data requests distributed by the load balancing server.
It should be noted that, since the high-concurrency data processing system needs to process a large number of concurrent data requests, a plurality of message queue servers need to be provided. In the embodiment of the invention, the message queue server adopts a Kafka server. Kafka is a high-throughput, distributed messaging system based on the publish-subscribe model. In the embodiment of the invention, a Kafka server cluster is set up to handle the high-concurrency data requests. The Kafka server serves as the producer side and establishes topic channels; each topic channel is used for temporarily storing one type of data request, and the data requests distributed by the load balancing server are temporarily stored in the corresponding topic channel.
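To make the notion of a topic channel concrete, the following minimal Java sketch publishes one type of data request into its own Kafka topic. It is only an illustration: the broker address, topic name, key and JSON payload are assumptions, since the patent does not specify topic naming or serialization formats.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TopicChannelWriter {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumed address of one message queue (Kafka) server in the cluster.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.1.11:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Each type of data request is buffered in its own topic channel;
            // here a temperature report goes to the hypothetical topic
            // "temperature-report", keyed by device id.
            producer.send(new ProducerRecord<>("temperature-report", "device-001",
                    "{\"deviceId\":\"device-001\",\"temperature\":23.5}"));
        }
    }
}
```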
Each stream processing server is used for processing, by means of stream processing technology, the data requests in the topic channel of its correspondingly arranged message queue server and for storing the processing results in the database.
It should be noted that, in the embodiment of the present invention, the stream processing server employs a Flink server, and the plurality of Flink servers form a Flink server cluster. The Flink servers can not only respond to high-concurrency requests but also reduce the coupling between programs and perform independent business processing on the consumed data. Each Flink server in the Flink server cluster acts as a consumer, is designated to consume the data requests of the corresponding topic channel in its Kafka server, performs high-concurrency logic processing according to the business requirements, and stores the processed results in the MySQL database to serve subsequent API requests.
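A minimal Java sketch of such a consumer-side job is given below, using Flink's Kafka source and JDBC sink connectors. It is an illustration under assumptions: the topic name, consumer group, MySQL table and connection parameters are hypothetical, and the business processing step, which the patent leaves unspecified, is only marked by a comment.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.jdbc.JdbcConnectionOptions;
import org.apache.flink.connector.jdbc.JdbcSink;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TopicChannelConsumerJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Consume the topic channel assigned to this stream processing server.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("192.168.1.11:9092")   // assumed Kafka server address
                .setTopics("temperature-report")            // assumed topic channel name
                .setGroupId("flink-temperature-consumer")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> requests =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "topic-channel");

        // Business logic (filtering, aggregation, enrichment) would be applied
        // to "requests" here; the patent leaves it unspecified.

        // Store the processing result in the MySQL database (assumed table "device_data").
        requests.addSink(JdbcSink.sink(
                "INSERT INTO device_data (payload) VALUES (?)",
                (statement, payload) -> statement.setString(1, payload),
                new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
                        .withUrl("jdbc:mysql://192.168.1.20:3306/iot")
                        .withDriverName("com.mysql.cj.jdbc.Driver")
                        .withUsername("flink")
                        .withPassword("secret")
                        .build()));

        env.execute("high-concurrency topic channel consumer");
    }
}
```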
The configuration process of the high-concurrency data processing system is as follows:
Step 1: build the Kafka server cluster: select several machines, install the Kafka server software on them, and configure the IP address of each Kafka server;
Step 2: build the Nginx server: select a machine, install the Nginx server software, and modify its configuration file to set up the mapping relationship according to the IP address of each Kafka server, so that when the Nginx server receives a request sent by a device, it distributes the request to the corresponding Kafka server according to the mapping relationship and the scheduling algorithm;
Step 3: build the Flink server cluster: select several machines to set up the Flink servers, and configure the stream processing service in each Flink server so that each stream processing service points to the corresponding topic channel in one Kafka server;
Step 4: install the MySQL database and associate each Flink server with the MySQL database to realize data transmission.
As shown in fig. 2, the data processing procedure of the highly concurrent data processing system includes:
the load balancing server acquires an external data request;
the load balancing server distributes the external data request to the corresponding message queue server according to the pre-configured IP mapping relation and the scheduling algorithm;
the message queue server temporarily stores the data requests distributed by the load balancing server in a topic channel used for temporarily storing data requests of the same type;
the stream processing server obtains the data request in the topic channel of the corresponding message queue server, processes the data request by adopting a stream processing technology, and stores the processing result into a database.
In summary, with the high-concurrency data processing system and method of the present invention, the load balancing server distributes the external data request to the corresponding message queue server according to the pre-configured IP mapping relation and scheduling algorithm; the message queue server temporarily stores the data requests distributed by the load balancing server in a topic channel used for temporarily storing data requests of the same type; and the stream processing server obtains the data requests in the topic channel of the corresponding message queue server, processes them using stream processing technology, and stores the processing results in the database, so that a large volume of highly concurrent data processing demands can be met.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A high-concurrency data processing system, characterized by comprising a database, a load balancing server, a plurality of message queue servers, and a plurality of stream processing servers arranged in one-to-one correspondence with the message queue servers, wherein
the load balancing server is used for distributing an external data request to the corresponding message queue server according to a pre-configured IP mapping relation and a scheduling algorithm;
each message queue server is provided with a topic channel composed of message queues and used for temporarily storing one type of data request, the topic channel being used for temporarily storing the data requests distributed by the load balancing server;
each stream processing server is used for processing, by means of stream processing technology, the data requests in the topic channel of its correspondingly arranged message queue server and for storing the processing results in the database.
2. The highly concurrent data processing system according to claim 1, wherein the load balancing server employs a Nginx server.
3. The highly concurrent data processing system according to claim 1, wherein the message queue server employs a Kafka server.
4. The highly concurrent data processing system according to claim 1, wherein the stream processing server employs a Flink server.
5. The system according to claim 1, wherein the message queue servers corresponding to different types of data requests, and the IP addresses of the message queue servers, are set in the pre-configured IP mapping relation and scheduling algorithm of the load balancing server.
6. A high-concurrency data processing method, characterized by comprising the following steps:
the load balancing server distributes the external data request to the corresponding message queue server according to the pre-configured IP mapping relation and the scheduling algorithm;
the message queue server temporarily stores the data requests distributed by the load balancing server in a topic channel used for temporarily storing data requests of the same type;
the stream processing server obtains the data request in the topic channel of the corresponding message queue server, processes the data request by adopting a stream processing technology, and stores the processing result into a database.
7. The method for highly concurrent data processing according to claim 6, wherein the load balancing server employs a Nginx server.
8. The highly concurrent data processing method according to claim 6, wherein the message queue server employs a Kafka server.
9. The highly concurrent data processing method according to claim 6, wherein the stream processing server employs a Flink server.
10. The method according to claim 6, wherein the message queue servers corresponding to different types of data requests, and the IP addresses of the message queue servers, are set in the pre-configured IP mapping relation and scheduling algorithm of the load balancing server.
Application CN202011423413.2A, filed 2020-12-08 (priority date 2020-12-08): High-concurrency data processing system and method. Published as CN112486655A; legal status: Pending.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011423413.2A CN112486655A (en) 2020-12-08 2020-12-08 High-concurrency data processing system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011423413.2A CN112486655A (en) 2020-12-08 2020-12-08 High-concurrency data processing system and method

Publications (1)

Publication Number Publication Date
CN112486655A true CN112486655A (en) 2021-03-12

Family

ID=74940582

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011423413.2A Pending CN112486655A (en) 2020-12-08 2020-12-08 High-concurrency data processing system and method

Country Status (1)

Country Link
CN (1) CN112486655A (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105450618A (en) * 2014-09-26 2016-03-30 Tcl集团股份有限公司 Operation method and operation system of big data process through API (Application Programming Interface) server
US20190095992A1 (en) * 2017-09-24 2019-03-28 Annie Mafotsing Soh Method and system to facilitate decentralized money services software as a service
CN110365644A (en) * 2019-06-05 2019-10-22 华南理工大学 A method of building internet of things equipment high-performance monitoring platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
余宣杰 (Yu Xuanjie) et al.: 《银行大数据应用》 (Big Data Applications in Banking), 机械工业出版社 (China Machine Press), 31 August 2019, pages 59-60 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113691611A (en) * 2021-08-23 2021-11-23 湖南大学 Block chain distributed high-concurrency transaction processing method, system, equipment and storage medium
CN114095502A (en) * 2021-10-08 2022-02-25 浙江吉利控股集团有限公司 Service processing method, system, device and medium
CN114095502B (en) * 2021-10-08 2023-11-03 浙江吉利控股集团有限公司 Service processing method, system, device and medium

Similar Documents

Publication Publication Date Title
CN110677277B (en) Data processing method, device, server and computer readable storage medium
CN112486648A (en) Task scheduling method, device, system, electronic equipment and storage medium
CN114270344A (en) Messaging platform for delivering real-time messages
CN112486655A (en) High-concurrency data processing system and method
CN106155811B (en) Resource service device, resource scheduling method and device
CN105450618A (en) Operation method and operation system of big data process through API (Application Programming Interface) server
CN107172187A (en) A kind of SiteServer LBS and method
TWI786527B (en) User code operation method of programming platform, electronic equipment and computer-readable storage medium
CN105183299A (en) Human-computer interface service processing system and method
CN105471700B (en) A kind of methods, devices and systems of Message Processing
CN108304236B (en) User interface refreshing method based on message subscription under cloud platform
CN110300188A (en) Data transmission system, method and apparatus
US20150358249A1 (en) Communicating data in flows between first and second computers over a network
CN103338156B (en) A kind of name pipeline server concurrent communication method based on thread pool
Guo Aggregating uncertain incast transfers in BCube-like data centers
CN108259605B (en) Data calling system and method based on multiple data centers
CN111711675B (en) Solution for concurrent message transmission in local area network
TWI433517B (en) Method and apparatus of processing sip messages based on multiple cores
CN115378937A (en) Distributed concurrency method, device and equipment for tasks and readable storage medium
CN111683140B (en) Method and apparatus for distributing messages
CN110018782A (en) A kind of data read/write method and relevant apparatus
Aksyonov et al. Perspectives of modeling in metallurgical production (WIP)
CN102999390A (en) Rear-end resource control method and device under cloud computing environment
CN114490026A (en) Message consumption optimization method and terminal
CN105323225B (en) The interactive correspondence protocol fitting method and system of cross-terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination