CN112019386B - Distributed cluster alarm method, device and equipment based on cloud platform

Info

Publication number
CN112019386B
Authority
CN
China
Prior art keywords
alarm
target
task
fragments
kapacitor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010889242.6A
Other languages
Chinese (zh)
Other versions
CN112019386A (en)
Inventor
张连法
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Inspur Data Technology Co Ltd
Original Assignee
Beijing Inspur Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Inspur Data Technology Co Ltd
Priority to CN202010889242.6A
Publication of CN112019386A
Application granted
Publication of CN112019386B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/065 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving logical or physical relationship, e.g. grouping and hierarchies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Alarm Systems (AREA)

Abstract

The application discloses a distributed cluster alarm method based on a cloud platform. First, the task weight of each alarm type is determined according to the data volume processed per unit time by the alarm tasks of that type and their calculation complexity. All alarm types are then divided into fragments according to these weights, so that the alarm task load is balanced among the fragments. Next, a certain number of Kapacitor instances are generated, and the mapping relation between the instances and the fragments is maintained. Finally, the alarm tasks of each alarm type are executed by the corresponding instances. The method thus decomposes the alarm tasks and executes them on multiple Kapacitor instances, realizing a distributed alarm cluster that supports large-scale monitoring data in the cloud platform and helping to improve the alarm task throughput and system stability of the alarm system. In addition, the application also provides a distributed cluster alarm device, equipment and a readable storage medium based on the cloud platform, whose technical effects correspond to those of the method.

Description

Distributed cluster alarm method, device and equipment based on cloud platform
Technical Field
The present application relates to the field of computer technologies, and in particular, to a distributed cluster alarm method, apparatus, device, and readable storage medium based on a cloud platform.
Background
Kapacitor is an open-source alarm component for monitoring systems developed by InfluxData. It supports high-performance alarms and flexible alarm notification, and is widely used in the alarm modules of monitoring systems. As cloud platforms continue to grow in scale, providing an alarm service for a large-scale monitoring system becomes increasingly important, and a single-machine Kapacitor cannot bear the data scale of a large-scale monitoring alarm system.
Disclosure of Invention
The application aims to provide a distributed cluster alarm method, device, equipment and readable storage medium based on a cloud platform, to solve the problem that a single-machine Kapacitor cannot bear the data scale of a large-scale monitoring alarm system. The specific scheme is as follows:
In a first aspect, the present application provides a distributed cluster alarm method based on a cloud platform, including:
in a distributed cluster, determining the task weight of each alarm type according to the data volume processed per unit time by the alarm tasks of that alarm type and their calculation complexity;
dividing all alarm types of the distributed cluster according to a load balancing strategy, the number of target fragments and the task weight of each alarm type, to obtain the target number of fragments, wherein each fragment comprises one or more alarm types;
generating a corresponding number of Kapacitor instances according to the number of target fragments;
determining a mapping relation between the fragments and the Kapacitor instances;
determining a target alarm type of the current alarm task, and sending the current alarm task to a target Kapacitor instance according to the mapping relation and the target alarm type, so as to realize the alarm.
Preferably, the generating of a corresponding number of Kapacitor instances according to the number of target fragments includes:
generating a corresponding number of Kapacitor instances according to the number of target fragments and the number of target copies, wherein the corresponding number is equal to the product of the number of target fragments and the number of target copies.
Preferably, the determining of a target alarm type of the current alarm task, and the sending of the current alarm task to a target Kapacitor instance according to the mapping relation and the target alarm type to realize the alarm, includes:
determining a mapping relation between the alarm types and the Kapacitor instances;
determining a target alarm type of the current alarm task, and sending the current alarm task to a target Kapacitor instance according to the mapping relation between the alarm types and the Kapacitor instances, so as to realize the alarm.
Preferably, the dividing of all alarm types of the distributed cluster according to the load balancing strategy, the number of target fragments and the task weight of each alarm type, to obtain the target number of fragments, includes:
determining the sum of the task weights of all alarm types in the distributed cluster;
determining the target bearing value of each fragment according to the number of target fragments and the sum;
dividing all alarm types of the distributed cluster according to the target bearing value to obtain the target number of fragments, wherein the difference between the actual bearing value and the target bearing value of each fragment is smaller than a preset threshold, and the actual bearing value is the sum of the task weights of all alarm types contained in the fragment.
Preferably, the method further comprises the following steps:
after receiving an instruction to delete an alarm type, updating the actual bearing value of the fragment in which the alarm type to be deleted is located;
and/or,
after receiving an instruction to add an alarm type, adding the new alarm type to the fragment with the smallest actual bearing value, and updating the actual bearing value of that fragment.
Preferably, the alarm types include any one or more of the following: CPU utilization alarm, memory utilization alarm, network card flow rate alarm and disk read-write rate alarm.
Preferably, after the determining of the mapping relation between the fragments and the Kapacitor instances, the method further includes:
monitoring the running state of the Kapacitor instances;
if the running state of a Kapacitor instance is abnormal, updating the mapping relation between the fragments and the Kapacitor instances.
In a second aspect, the present application discloses a distributed cluster alarm device based on a cloud platform, including:
a weight calculation module, configured to determine, in a distributed cluster, the task weight of each alarm type according to the data volume processed per unit time by the alarm tasks of that alarm type and their calculation complexity;
a fragmentation module, configured to divide all alarm types of the distributed cluster according to a load balancing strategy, the number of target fragments and the task weight of each alarm type, to obtain the target number of fragments, wherein each fragment comprises one or more alarm types;
an instantiation module, configured to generate a corresponding number of Kapacitor instances according to the number of target fragments;
a mapping relation determination module, configured to determine the mapping relation between the fragments and the Kapacitor instances;
an alarm module, configured to determine the target alarm type of the current alarm task, and to send the current alarm task to a target Kapacitor instance according to the mapping relation and the target alarm type, so as to realize the alarm.
In a third aspect, the present application discloses distributed cluster alarm equipment based on a cloud platform, including:
a memory: for storing a computer program;
a processor: for executing the computer program to implement the steps of the cloud platform-based distributed cluster alarm method as described above.
In a fourth aspect, the present application discloses a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the cloud platform-based distributed cluster alarm method as described above.
The application provides a distributed cluster alarm method based on a cloud platform, which comprises the following steps: in the distributed cluster, determining the task weight of each alarm type according to the data volume processed per unit time by the alarm tasks of that alarm type and their calculation complexity; dividing all alarm types of the distributed cluster according to a load balancing strategy, the number of target fragments and the task weight of each alarm type, to obtain the target number of fragments, wherein each fragment comprises one or more alarm types; generating a corresponding number of Kapacitor instances according to the number of target fragments; determining a mapping relation between the fragments and the Kapacitor instances; and determining a target alarm type of the current alarm task, and sending the current alarm task to a target Kapacitor instance according to the mapping relation and the target alarm type, so as to realize the alarm.
In summary, the method first determines the task weight of each alarm type according to the data volume processed per unit time by the alarm tasks of that type and their calculation complexity. It then divides all alarm types into fragments according to the weights, keeping the weight borne by each fragment as even as possible so that the alarm tasks are balanced among the fragments. It then generates a certain number of Kapacitor instances and maintains the mapping relation between the Kapacitor instances and the fragments. Finally, it executes the alarm tasks of each alarm type on the corresponding Kapacitor instances. The method thus decomposes the alarm tasks and executes them on multiple Kapacitor instances, realizing a distributed alarm cluster that supports large-scale monitoring data in the cloud platform and improving the alarm task throughput and system stability of the alarm system.
In addition, the application also provides a distributed cluster alarm device, equipment and a readable storage medium based on the cloud platform, whose technical effects correspond to those of the method and are not repeated here.
Drawings
To explain the technical solutions of the embodiments of the present application or of the prior art more clearly, the drawings needed for describing them are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a first embodiment of the distributed cluster alarm method based on a cloud platform provided by the present application;
Fig. 2 is a schematic process diagram of a second embodiment of the distributed cluster alarm method based on a cloud platform provided by the present application;
Fig. 3 is a functional block diagram of an embodiment of the distributed cluster alarm device based on a cloud platform provided by the present application;
Fig. 4 is a schematic structural diagram of an embodiment of the distributed cluster alarm equipment based on a cloud platform provided by the present application.
Detailed Description
In order that those skilled in the art will better understand the disclosure, the following detailed description is given with reference to the accompanying drawings. It should be apparent that the described embodiments are only a few embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
At present, a single-machine Kapacitor cannot bear the data scale of a large-scale monitoring and alarm system, and the following problems exist: the Kapacitor distributed cluster solution is available only as a commercial edition; and there is no existing solution that builds a distributed alarm cluster from single-machine Kapacitor instances, nor has one been applied to a cloud platform.
To solve these problems, the application provides a distributed cluster alarm method, device, equipment and readable storage medium based on a cloud platform. They decompose the alarm tasks and execute them on a certain number of Kapacitor instances, thereby realizing a distributed alarm cluster that supports large-scale monitoring data in the cloud platform and improving the alarm task throughput and system stability of the alarm system.
Referring to fig. 1, a first embodiment of the distributed cluster alarm method based on a cloud platform provided by the present application is described below. The first embodiment includes:
S101, in a distributed cluster, determining the task weight of each alarm type according to the data volume processed per unit time by the alarm tasks of that alarm type and their calculation complexity;
S102, dividing all alarm types of the distributed cluster according to a load balancing strategy, the number of target fragments and the task weight of each alarm type, to obtain the target number of fragments, wherein each fragment comprises one or more alarm types;
S103, generating a corresponding number of Kapacitor instances according to the number of target fragments;
S104, determining a mapping relation between the fragments and the Kapacitor instances;
S105, determining a target alarm type of the current alarm task, and sending the current alarm task to a target Kapacitor instance according to the mapping relation and the target alarm type, so as to realize the alarm.
This embodiment fragments the alarm types, which may specifically include a CPU utilization alarm, a memory utilization alarm, a network card flow rate alarm, a disk read-write rate alarm and the like. During fragmentation, the task weight of each alarm type is first calculated according to the data volume processed per unit time by the alarm tasks of that type and their calculation complexity. The alarm types are then fragmented according to the weights and the number of target fragments, so that the weight borne by each fragment is as even as possible and the alarm tasks are balanced among the fragments.
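The weight calculation can be sketched as follows. The source states only that the weight depends on the data volume per unit time and the calculation complexity; it does not give the formula. The product used here, the names `AlarmType` and `task_weight`, and all numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmType:
    name: str
    records_per_unit_time: int  # data volume processed per unit time
    complexity: float           # relative cost of evaluating one record


def task_weight(alarm: AlarmType) -> float:
    # Assumption: weight = data volume x complexity. The source does not
    # specify how the two factors are combined.
    return alarm.records_per_unit_time * alarm.complexity


# Hypothetical figures, not taken from the patent's Table 1.
alarm_types = [
    AlarmType("cpu_utilization", 1000, 1.0),
    AlarmType("memory_utilization", 1000, 1.0),
    AlarmType("nic_flow_rate", 1000, 2.0),
    AlarmType("disk_rw_rate", 2000, 2.0),
]
weights = {a.name: task_weight(a) for a in alarm_types}
```

Any monotonic combination of the two factors would serve the same purpose; what matters for the later steps is only that each alarm type ends up with a single comparable number.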
In practical applications, the number of target fragments is determined according to the system design requirements. It depends mainly on the data scale and on the types and number of alarm tasks: the larger the data scale, the larger the number of target fragments.
The balance of alarm tasks among the fragments can be achieved as follows: determining the sum of the task weights of all alarm types in the distributed cluster; determining the target bearing value of each fragment according to the number of target fragments and the sum of the task weights; and dividing all alarm types of the distributed cluster according to the target bearing value to obtain the target number of fragments, wherein the difference between the actual bearing value and the target bearing value of each fragment is smaller than a preset threshold, and the actual bearing value is the sum of the task weights of all alarm types contained in the fragment.
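A minimal sketch of the balance condition follows. It assumes the target bearing value is the weight sum divided evenly by the number of fragments, which the source implies but does not state as a formula; the function names and the sample weights are hypothetical.

```python
from __future__ import annotations


def target_bearing_value(weights: dict[str, float], shard_count: int) -> float:
    # Assumption: each fragment should carry an equal share of the total weight.
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return sum(weights.values()) / shard_count


def is_balanced(shards: list[list[str]], weights: dict[str, float],
                threshold: float) -> bool:
    """Check that every fragment's actual bearing value is within
    `threshold` of the target bearing value."""
    target = target_bearing_value(weights, len(shards))
    for members in shards:
        actual = sum(weights[name] for name in members)
        if abs(actual - target) >= threshold:
            return False
    return True


# Hypothetical weights for the four alarm types named in the text.
weights = {"cpu": 1000, "memory": 1000, "nic": 2000, "disk": 4000}
even = is_balanced([["cpu", "memory", "nic"], ["disk"]], weights, threshold=500)
skewed = is_balanced([["cpu"], ["memory", "nic", "disk"]], weights, threshold=500)
```

Here `even` is true because both fragments carry exactly the target of 4000, while `skewed` is false because the fragments carry 1000 and 7000.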
A corresponding number of Kapacitor instances is then generated according to the number of target fragments, where each fragment supports one or more Kapacitor alarm instances. The corresponding number may therefore be equal to the number of target fragments, or an integer multiple of it. For example, a certain number of copies may be set for each Kapacitor instance, in which case the process is as follows: a corresponding number of Kapacitor instances is generated according to the number of target fragments and the number of target copies, wherein the corresponding number is equal to the product of the number of target fragments and the number of target copies.
The mapping relation between the fragments and the Kapacitor instances is maintained. During the alarm process, the target alarm type of the current alarm task is determined, and the current alarm task is then sent to the target Kapacitor instance according to the mapping relation, so as to realize the alarm.
As a preferred implementation, in addition to the mapping relation between the fragments and the Kapacitor instances, the mapping relation between the alarm types and the Kapacitor instances may also be maintained. In that case the alarm process is as follows: the target alarm type of the current alarm task is determined, and the current alarm task is sent to the target Kapacitor instance according to the mapping relation between the alarm types and the Kapacitor instances, so as to realize the alarm.
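The direct lookup from alarm type to Kapacitor instances can be sketched as below. The table contents, the instance names and the task dictionary layout are hypothetical. The sketch returns every instance of the owning fragment as a target, which is one possible reading of the source rather than a confirmed detail.

```python
from __future__ import annotations

# Hypothetical mapping tables; the instance names are illustrative only.
SHARD_TO_INSTANCES = {
    "shard0": ["kapacitor-0-0", "kapacitor-0-1"],
    "shard1": ["kapacitor-1-0", "kapacitor-1-1"],
}
TYPE_TO_SHARD = {
    "cpu_utilization": "shard0",
    "memory_utilization": "shard0",
    "nic_flow_rate": "shard0",
    "disk_rw_rate": "shard1",
}


def build_type_to_instances(type_to_shard: dict[str, str],
                            shard_to_instances: dict[str, list[str]]) -> dict[str, list[str]]:
    """Flatten the two-level lookup into alarm type -> Kapacitor instances."""
    return {
        alarm_type: list(shard_to_instances[shard])
        for alarm_type, shard in type_to_shard.items()
    }


def target_instances(task: dict, type_to_instances: dict[str, list[str]]) -> list[str]:
    """Return the instances that should receive the task, keyed by its alarm type."""
    alarm_type = task["alarm_type"]
    if alarm_type not in type_to_instances:
        raise ValueError(f"no Kapacitor instance mapped for alarm type {alarm_type!r}")
    return type_to_instances[alarm_type]


TYPE_TO_INSTANCES = build_type_to_instances(TYPE_TO_SHARD, SHARD_TO_INSTANCES)
task = {"id": "task-42", "alarm_type": "disk_rw_rate"}
targets = target_instances(task, TYPE_TO_INSTANCES)
```

Keeping the flattened table alongside the fragment table trades a little memory for a single dictionary lookup on each dispatch.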
The distributed cluster alarm method based on the cloud platform provided by this embodiment decomposes the alarm tasks and executes them on a certain number of Kapacitor instances, thereby realizing a distributed alarm cluster that supports large-scale monitoring data in the cloud platform and improving the alarm task throughput and system stability of the alarm system.
A second embodiment of the distributed cluster alarm method based on a cloud platform provided by the present application is described in detail below. The second embodiment is implemented on the basis of the first embodiment and extends it to a certain extent.
This embodiment supports disaster recovery for alarm tasks. Specifically, the running state of each Kapacitor instance is detected periodically. If a Kapacitor instance is abnormal, the alarm types of that instance are redistributed within its fragment, that is, the tasks of the alarm types corresponding to the abnormal Kapacitor instance are dispatched to other instances under the same fragment, thereby realizing disaster recovery for the alarm tasks.
As shown in fig. 2, the specific implementation process is as follows:
S21, fragmenting the alarm tasks.
All alarm types are fragmented by the alarm task fragmentation component through the following steps:
S211, calculating the task weight of each alarm type.
The weights are evaluated according to the monitoring data volume, the alarm list, the alarm types and the alarm tasks in the system. The system data volume is determined from the scale of the cloud platform, such as the number of hosts, the number of cloud hosts, the types of collected data and the data collection period. All alarm tasks and their types are then counted, and finally the task weight of each alarm type is calculated.
TABLE 1
[Table image not reproduced: for each alarm type, the number of data pieces per unit time, the calculation complexity and the task weight.]
The number of data pieces per unit time is obtained from data statistics, and the calculation complexity is determined by the nature of the alarm type. Suppose the alarm types are divided into 2 fragments. As shown in Table 1, the task weights of the CPU utilization alarm, the memory utilization alarm, the network card flow rate alarm and the disk read-write rate alarm total 8000, so the target bearing value of each fragment is 8000 / 2 = 4000. Allocating the alarm types in sequence, the CPU utilization alarm, the memory utilization alarm and the network card flow rate alarm belong to fragment 0 (shard0), and the disk read-write rate alarm belongs to fragment 1 (shard1), as shown in Table 2.
TABLE 2
[Table image not reproduced: assignment of the alarm types to fragment 0 (shard0) and fragment 1 (shard1).]
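The sequential allocation in this example can be sketched as follows. The individual weights are hypothetical, since the table values are not available in this text; they are chosen only so that they total 8000 and reproduce the assignment described above. The function name `allocate_sequentially` is likewise an assumption.

```python
from __future__ import annotations


def allocate_sequentially(weights: dict[str, float], shard_count: int) -> list[list[str]]:
    """Fill fragment 0 until its target bearing value is reached, then move on."""
    target = sum(weights.values()) / shard_count
    shards: list[list[str]] = [[] for _ in range(shard_count)]
    current, load = 0, 0.0
    for name, weight in weights.items():  # dicts preserve insertion order
        # Advance once the current fragment has reached its target, but never
        # past the last fragment, which absorbs any remainder.
        if load >= target and current < shard_count - 1:
            current += 1
            load = 0.0
        shards[current].append(name)
        load += weight
    return shards


# Hypothetical weights chosen to sum to 8000, as in the example above.
weights = {
    "cpu_utilization": 1000,
    "memory_utilization": 1000,
    "nic_flow_rate": 2000,
    "disk_rw_rate": 4000,
}
shards = allocate_sequentially(weights, 2)
```

A single sequential pass like this does not by itself guarantee that every fragment lands within the preset threshold of the target; with unevenly sized weights a separate balance check or a different ordering may be needed.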
S212, generating a mapping table of the fragments and the Kapacitor instances.
The product of the number of target fragments and the number of target copies is calculated as the total number of Kapacitor instances, and the Kapacitor instances are generated. The number of target fragments and the number of target copies are determined according to the system design requirements and depend on the data scale and on the types and number of alarm tasks. For example, a larger data scale calls for more fragments and more copies, and a higher security requirement for the alarm service calls for more copies. Taking 2 target fragments and 2 target copies as an example, 2 × 2 = 4 Kapacitor instances are required in total, as shown in Table 3:
TABLE 3
[Table image not reproduced: mapping between the fragments and the Kapacitor instances.]
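A mapping table of this kind can be sketched as below. The instance naming scheme `kapacitor-<fragment>-<copy>` and the function name are assumptions for illustration; the source does not say how instances are named or provisioned.

```python
from __future__ import annotations


def build_instance_table(shard_count: int, replica_count: int) -> dict[int, list[str]]:
    """Map each fragment index to the names of its Kapacitor instances."""
    if shard_count < 1 or replica_count < 1:
        raise ValueError("shard_count and replica_count must be at least 1")
    return {
        shard: [f"kapacitor-{shard}-{replica}" for replica in range(replica_count)]
        for shard in range(shard_count)
    }


table = build_instance_table(shard_count=2, replica_count=2)
total_instances = sum(len(instances) for instances in table.values())
```

With 2 fragments and 2 copies the table holds 4 instance names, matching the product stated in the text.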
S213, generating a mapping table of the alarm types and the Kapacitor instances.
When the system is initialized or the mapping is regenerated, all alarm types, namely the alarm type list, are traversed in sequence. Alarm types are allocated starting from fragment 0 until the target bearing value of that fragment is reached; allocation then continues with the next fragment until all alarm types have been allocated, yielding Table 4.
TABLE 4
[Table image not reproduced: mapping between the alarm types and the Kapacitor instances.]
S22, scheduling the alarm tasks.
The following operations are performed on the alarm fragments and the Kapacitor instances according to task changes and the system state:
S221, upon initialization, redeployment triggered by disaster recovery, or manual triggering: requesting the alarm type fragments, regenerating the mapping table of alarm types and Kapacitor instances, sending the alarm tasks to all Kapacitor instances of the corresponding fragment according to the mapping table, and starting the tasks in those Kapacitor instances.
S222, when an alarm type is added: selecting in turn the fragment and the Kapacitor instance with the lowest actual bearing weight, updating the mapping table of alarm types and Kapacitor instances, sending the alarm task to the Kapacitor instance in that fragment, and starting the task in that Kapacitor instance.
S223, when an alarm type is deleted: deleting the alarm type from all Kapacitor instances in the fragment, and updating the bearing weights of the fragment and the Kapacitor instances.
S224, when an alarm type is updated: directly updating the alarm type in all Kapacitor instances in the fragment.
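The add and delete operations of S222 and S223 can be sketched at the fragment level as follows. The class `ShardTable`, the alarm type `gpu_utilization` and all weights are hypothetical, and the per-instance bearing weights mentioned in the text are left out to keep the sketch short.

```python
from __future__ import annotations


class ShardTable:
    """Tracks which alarm types each fragment holds and their task weights."""

    def __init__(self, shard_count: int) -> None:
        self.members: list[dict[str, float]] = [{} for _ in range(shard_count)]

    def bearing(self, shard: int) -> float:
        # The actual bearing value is derived on demand, so it is always
        # up to date after an add or delete.
        return sum(self.members[shard].values())

    def add(self, alarm_type: str, weight: float) -> int:
        """Place a new alarm type on the fragment with the lowest bearing value."""
        if any(alarm_type in m for m in self.members):
            raise ValueError(f"alarm type {alarm_type!r} already exists")
        shard = min(range(len(self.members)), key=self.bearing)
        self.members[shard][alarm_type] = weight
        return shard

    def delete(self, alarm_type: str) -> int:
        """Remove an alarm type and return the fragment it was on."""
        for shard, m in enumerate(self.members):
            if alarm_type in m:
                del m[alarm_type]
                return shard
        raise KeyError(alarm_type)


table = ShardTable(2)
table.members[0] = {"cpu_utilization": 1000, "memory_utilization": 1000,
                    "nic_flow_rate": 2000}
table.members[1] = {"disk_rw_rate": 4000}
removed_from = table.delete("nic_flow_rate")  # fragment 0 now bears 2000
added_to = table.add("gpu_utilization", 500)  # lowest bearing is fragment 0
```

Because new alarm types always go to the least loaded fragment, incremental changes tend to keep the fragments close to balanced without a full regeneration of the mapping table.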
S23, alarm disaster recovery.
The availability of all Kapacitor instances is detected periodically. When a Kapacitor instance becomes abnormal or recovers, the following processing is carried out: setting the state of the Kapacitor instance to abnormal or normal; regenerating the mapping table of alarm types and Kapacitor instances; and redeploying the alarm tasks.
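One detection pass of this disaster recovery step can be sketched as below. The health probe is an injected callable, so no real Kapacitor call is made; in a deployment it might be an HTTP check against each instance, which is an assumption and not something the source specifies. All names and tables are hypothetical.

```python
from __future__ import annotations

from typing import Callable


def refresh_states(instances: list[str],
                   is_healthy: Callable[[str], bool]) -> dict[str, str]:
    """Probe every instance once and record it as normal or abnormal."""
    return {name: ("normal" if is_healthy(name) else "abnormal") for name in instances}


def regenerate_mapping(type_to_shard: dict[str, int],
                       shard_to_instances: dict[int, list[str]],
                       states: dict[str, str]) -> dict[str, list[str]]:
    """Rebuild alarm type -> instances, keeping only normal instances
    of the fragment that owns each alarm type."""
    return {
        alarm_type: [i for i in shard_to_instances[shard] if states.get(i) == "normal"]
        for alarm_type, shard in type_to_shard.items()
    }


shard_to_instances = {0: ["kapacitor-0-0", "kapacitor-0-1"],
                      1: ["kapacitor-1-0", "kapacitor-1-1"]}
type_to_shard = {"cpu_utilization": 0, "disk_rw_rate": 1}
all_instances = [i for group in shard_to_instances.values() for i in group]

down = {"kapacitor-0-0"}  # simulated failure
states = refresh_states(all_instances, lambda name: name not in down)
mapping = regenerate_mapping(type_to_shard, shard_to_instances, states)
```

After the simulated failure, tasks for `cpu_utilization` map only to the surviving copy in the same fragment. A caller would run this pass on a timer and redeploy the alarm tasks whenever the regenerated mapping differs from the previous one.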
This embodiment is mainly applied as a distributed alarm cluster solution in a cloud platform monitoring system. It supports a distributed alarm cluster for a large-scale monitoring alarm system in a cloud platform, supports alarm task fragmentation, high scalability, load balancing and alarm disaster recovery, and improves the alarm task throughput and system stability of the alarm system. The embodiment is applied to a cloud platform and supports the X86, MIPS and ARM architectures.
A distributed cluster alarm device based on a cloud platform provided by an embodiment of the present application is introduced below. The device described below and the distributed cluster alarm method based on a cloud platform described above may be cross-referenced.
As shown in fig. 3, the distributed cluster alarm device based on the cloud platform of this embodiment includes:
a weight calculation module 301, configured to determine, in a distributed cluster, the task weight of each alarm type according to the data volume processed per unit time by the alarm tasks of that alarm type and their calculation complexity;
a fragmentation module 302, configured to divide all alarm types of the distributed cluster according to a load balancing strategy, the number of target fragments and the task weight of each alarm type, to obtain the target number of fragments, wherein each fragment comprises one or more alarm types;
an instantiation module 303, configured to generate a corresponding number of Kapacitor instances according to the number of target fragments;
a mapping relation determination module 304, configured to determine the mapping relation between the fragments and the Kapacitor instances;
an alarm module 305, configured to determine the target alarm type of the current alarm task and to send the current alarm task to the target Kapacitor instance according to the mapping relation and the target alarm type, so as to realize the alarm.
In some specific embodiments, the instantiation module is specifically configured to:
generate a corresponding number of Kapacitor instances according to the number of target fragments and the number of target copies, wherein the corresponding number is equal to the product of the number of target fragments and the number of target copies.
In some specific embodiments, the alarm module is specifically configured to:
determine a mapping relation between the alarm types and the Kapacitor instances; and determine a target alarm type of the current alarm task, and send the current alarm task to a target Kapacitor instance according to the mapping relation between the alarm types and the Kapacitor instances, so as to realize the alarm.
In some specific embodiments, the fragmentation module is specifically configured to:
determine the sum of the task weights of all alarm types in the distributed cluster; determine the target bearing value of each fragment according to the number of target fragments and the sum; and divide all alarm types of the distributed cluster according to the target bearing value to obtain the target number of fragments, wherein the difference between the actual bearing value and the target bearing value of each fragment is smaller than a preset threshold, and the actual bearing value is the sum of the task weights of all alarm types contained in the fragment.
In some specific embodiments, the device further comprises:
a deletion module, configured to update the actual bearing value of the fragment in which the alarm type to be deleted is located after receiving an instruction to delete an alarm type;
and/or,
an adding module, configured to add the new alarm type to the fragment with the smallest actual bearing value after receiving an instruction to add an alarm type, and to update the actual bearing value of that fragment.
In some specific embodiments, the alarm types include any one or more of the following: CPU utilization alarm, memory utilization alarm, network card flow rate alarm and disk read-write rate alarm.
In some specific embodiments, the device further comprises:
a disaster recovery module, configured to monitor the running state of the Kapacitor instances and, if the running state of a Kapacitor instance is abnormal, to update the mapping relation between the fragments and the Kapacitor instances.
The distributed cluster alarm device based on the cloud platform in this embodiment is used to implement the foregoing distributed cluster alarm method based on the cloud platform, and therefore a specific implementation manner in the device may be seen in the foregoing embodiment of the distributed cluster alarm method based on the cloud platform, for example, the weight calculation module 301, the fragmentation module 302, the instantiation module 303, the mapping relationship determination module 304, and the alarm module 305 are respectively used to implement steps S101, S102, S103, S104, and S105 in the distributed cluster alarm method based on the cloud platform. Therefore, the detailed description thereof may refer to the description of the respective partial embodiments, which will not be presented herein.
In addition, since the distributed cluster alarm device based on the cloud platform of this embodiment is used to implement the foregoing distributed cluster alarm method based on the cloud platform, its effect corresponds to that of the method described above and is not described again here.
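How the modules fit together at routing time can be sketched in Python. Several points are assumptions rather than statements of this document: the task weight is taken as the product of data volume per unit time and a complexity factor, the instance names are hypothetical, and the sketch returns every copy (replica) that serves a fragment without deciding which copy actually receives the task.

```python
def task_weight(data_volume, complexity):
    """Assumed formula: data volume per unit time times a complexity factor."""
    return data_volume * complexity


def build_instances(shard_count, replica_count):
    """Create shard_count * replica_count instance names, grouped by shard."""
    return {
        shard: [f"kapacitor-{shard}-{replica}" for replica in range(replica_count)]
        for shard in range(shard_count)
    }


def route(alarm_type, type_to_shard, shard_to_instances):
    """Return the Kapacitor instances that serve the given alarm type."""
    return shard_to_instances[type_to_shard[alarm_type]]


# Hypothetical assignment of alarm types to two fragments, two copies each.
type_to_shard = {"cpu": 0, "disk": 0, "memory": 1, "network": 1}
shard_to_instances = build_instances(shard_count=2, replica_count=2)
instance_total = sum(len(group) for group in shard_to_instances.values())
targets = route("memory", type_to_shard, shard_to_instances)
```

Keeping the alarm type to fragment map separate from the fragment to instance map means a failed instance can be replaced by changing only the second map.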
In addition, the present application further provides distributed cluster alarm equipment based on a cloud platform, as shown in fig. 4, including:
a memory 100: configured to store a computer program;
a processor 200: configured to execute the computer program to implement the steps of the cloud platform based distributed cluster alarm method described above.
Finally, the present application provides a readable storage medium storing a computer program which, when executed by a processor, implements the steps of the cloud platform based distributed cluster alarm method described above.
In the present specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is brief, and the relevant points can be found in the description of the method.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The solutions provided in the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the above descriptions of the embodiments are only intended to help understand the method and core ideas of the present application. Meanwhile, a person skilled in the art may, according to the ideas of the present application, make changes to the specific embodiments and the application scope. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A distributed cluster alarm method based on a cloud platform is characterized by comprising the following steps:
in a distributed cluster, determining the task weight of each alarm type according to the data volume and the calculation complexity of the alarm task of each alarm type in unit time;
dividing all alarm types of the distributed cluster according to a load balancing strategy, the number of target fragments, and the task weight of each alarm type to obtain the target number of fragments, wherein each fragment comprises one or more alarm types;
generating a corresponding number of Kapacitor instances according to the number of target fragments;
determining a mapping relation between the fragments and the Kapacitor instance;
and determining a target alarm type of the current alarm task, and sending the current alarm task to a target Kapacitor instance according to the mapping relation and the target alarm type so as to realize alarm.
2. The method according to claim 1, wherein said generating a corresponding number of Kapacitor instances according to the number of target fragments comprises:
generating a corresponding number of Kapacitor instances according to the number of target fragments and the number of target copies, wherein the corresponding number is equal to the product of the number of target fragments and the number of target copies.
3. The method according to claim 2, wherein the determining a target alarm type of a current alarm task and sending the current alarm task to a target Kapacitor instance according to the mapping relation and the target alarm type to implement an alarm comprises:
determining a mapping relation between the alarm type and the Kapacitor instance;
and determining a target alarm type of the current alarm task, and sending the current alarm task to a target Kapacitor instance according to the mapping relation between the alarm type and the Kapacitor instance so as to realize alarm.
4. The method according to claim 1, wherein the dividing all alarm types of the distributed cluster according to the load balancing strategy, the number of target fragments, and the task weight of each alarm type to obtain the target number of fragments comprises:
determining the sum of task weights of all alarm types in the distributed cluster;
determining a target bearing value of each fragment according to the number of the target fragments and the sum;
and dividing all alarm types of the distributed cluster according to the target bearing value to obtain fragments of the target fragment number, wherein the difference value between the actual bearing value of each fragment and the target bearing value is smaller than a preset threshold value, and the actual bearing value is the sum of task weights of all alarm types contained in the fragment.
5. The method of claim 4, further comprising:
after receiving an instruction of deleting the alarm type, updating the actual bearing value of the fragment where the alarm type to be deleted is located;
and/or,
and after receiving an instruction of increasing the alarm type, adding the newly increased alarm type to the fragment with the minimum actual bearing value, and updating the actual bearing value of the fragment.
6. The method of claim 1, wherein the alarm type comprises any one or more of: CPU utilization rate alarm, memory utilization rate alarm, network card flow rate alarm, and disk read/write rate alarm.
7. The method according to any one of claims 1-6, wherein after said determining the mapping relation between the fragments and the Kapacitor instance, the method further comprises:
monitoring the running state of the Kapacitor instance;
and if the running state of the Kapacitor instance is abnormal, updating the mapping relation between the fragments and the Kapacitor instance.
8. A distributed cluster alarm device based on a cloud platform is characterized by comprising:
a weight calculation module: configured to determine, in a distributed cluster, the task weight of each alarm type according to the data volume and the calculation complexity of the alarm task of each alarm type in unit time;
a fragmentation module: configured to divide all alarm types of the distributed cluster according to a load balancing strategy, the number of target fragments, and the task weight of each alarm type to obtain the target number of fragments, wherein each fragment comprises one or more alarm types;
an instantiation module: configured to generate a corresponding number of Kapacitor instances according to the number of target fragments;
a mapping relation determination module: configured to determine the mapping relation between the fragments and the Kapacitor instances;
an alarm module: configured to determine a target alarm type of the current alarm task, and to send the current alarm task to a target Kapacitor instance according to the mapping relation and the target alarm type so as to realize the alarm.
9. Distributed cluster alarm equipment based on a cloud platform, characterized by comprising:
a memory: for storing a computer program;
a processor: for executing the computer program for implementing the steps of the cloud platform based distributed cluster alarm method according to any of claims 1-7.
10. A readable storage medium storing a computer program which, when executed by a processor, implements the steps of the cloud platform based distributed cluster alarm method according to any one of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010889242.6A CN112019386B (en) 2020-08-28 2020-08-28 Distributed cluster alarm method, device and equipment based on cloud platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010889242.6A CN112019386B (en) 2020-08-28 2020-08-28 Distributed cluster alarm method, device and equipment based on cloud platform

Publications (2)

Publication Number Publication Date
CN112019386A CN112019386A (en) 2020-12-01
CN112019386B true CN112019386B (en) 2022-12-06

Family

ID=73502971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010889242.6A Active CN112019386B (en) 2020-08-28 2020-08-28 Distributed cluster alarm method, device and equipment based on cloud platform

Country Status (1)

Country Link
CN (1) CN112019386B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559179A (en) * 2020-12-15 2021-03-26 建信金融科技有限责任公司 Job processing method and device
CN115002205B (en) * 2022-08-04 2022-11-08 浩鲸云计算科技股份有限公司 Kapacitor clustering method based on table routing proxy mode

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108809724A (en) * 2018-06-14 2018-11-13 郑州云海信息技术有限公司 Alarm management method and device in cloud data system
CN109245927A (en) * 2018-09-06 2019-01-18 郑州云海信息技术有限公司 Warning system and method in cloud data system
WO2020015092A1 (en) * 2018-07-18 2020-01-23 平安科技(深圳)有限公司 Instance monitoring method and apparatus, terminal device and medium

Also Published As

Publication number Publication date
CN112019386A (en) 2020-12-01

Similar Documents

Publication Publication Date Title
CN112019386B (en) Distributed cluster alarm method, device and equipment based on cloud platform
US9645756B2 (en) Optimization of in-memory data grid placement
US10162843B1 (en) Distributed metadata management
WO2018001110A1 (en) Method and device for reconstructing stored data based on erasure coding, and storage node
US20050246362A1 (en) System and method for dynamci log compression in a file system
JP2013514559A (en) Storage system
JP4506520B2 (en) Management server, message extraction method, and program
CN107087031B (en) Storage resource load balancing method and device
CN101964820A (en) Method and system for keeping data consistency
WO2015196686A1 (en) Data storage method and data storage management server
CN109241023A (en) Distributed memory system date storage method, device, system and storage medium
CN112286903A (en) Containerization-based relational database optimization method and device
CN105183399A (en) Data writing and reading method and device based on elastic block storage
CN109831508B (en) Caching method and equipment and storage medium
CN109597800B (en) Log distribution method and device
CN106775470B (en) Data storage method and system
CN110727508A (en) Task scheduling system and scheduling method
CN105095495A (en) Distributed file system cache management method and system
CN115981562A (en) Data processing method and device
CN109388335B (en) Data storage method and system
KR20220113710A (en) GPU Packet Aggregation System
US11855868B2 (en) Reducing the impact of network latency during a restore operation
CN115442262B (en) Resource evaluation method and device, electronic equipment and storage medium
CN107368355B (en) Dynamic scheduling method and device of virtual machine
US11442793B1 (en) Fully dynamic virtual proxies for data protection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant