CN111459763B - Cross-kubernetes cluster monitoring system and method

Info

Publication number: CN111459763B
Authority: CN (China)
Prior art keywords: monitoring, component, data, cluster, alcor
Legal status: Active
Application number: CN202010258248.3A
Other languages: Chinese (zh)
Other versions: CN111459763A (en)
Inventor: 董黎阳
Current Assignee: China Construction Bank Corp
Original Assignee: China Construction Bank Corp

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/301 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is a virtual computing platform, e.g. logically partitioned systems
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3093 Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
    • G06F11/3096 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents wherein the means or processing minimize the use of computing system or of computing system component resources, e.g. non-intrusive monitoring which minimizes the probe effect: sniffing, intercepting, indirectly deriving the monitored data from other directly available data
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/323 Visualisation of programs or trace data
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/815 Virtual

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer And Data Communications (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention provides a cross-kubernetes cluster monitoring system and method. The system comprises a plurality of open-sun Alcor clusters, and a prometheus-out component and a grafana-out component deployed outside the Alcor clusters. The prometheus, alertmanager and grafana monitoring components and the node-exporter, process-exporter and blackbox data acquisition components are installed in each Alcor cluster. The prometheus-out component synchronizes monitoring data from the prometheus monitoring components, and the grafana-out component displays the monitoring data. The scheme provides monitoring and data display across clusters.

Description

Cross-kubernetes cluster monitoring system and method
Technical Field
The invention relates to the technical field of kubernetes cluster monitoring, in particular to a cross-kubernetes cluster monitoring system and method.
Background
Container technology is currently a popular and leading-edge technology. Since the launch of Docker, software deployment has become easy, and deploying once and running anywhere has truly been realized. kubernetes is an open-source Docker container cluster management system that spans host clusters; it provides a full set of functions for containerized applications, such as resource scheduling, deployment and operation, service discovery, and scaling out and in. Open-sun Alcor is a container cloud platform developed and packaged on top of native kubernetes, and a massive number of applications and services are deployed on it. As applications multiply and user requirements grow more complex, multiple clusters inevitably run at the same time. In the prior art, a kubernetes cluster is monitored and its data displayed through the combined use of prometheus, alertmanager and grafana. However, when there are multiple clusters, each cluster has its own set of prometheus monitoring components, so a system administrator or a tenant must switch back and forth among the clusters to see the complete monitoring data. This increases the complexity of extracting monitoring data from open-sun Alcor, makes it difficult to aggregate the monitoring data of multiple clusters, and poses great challenges for analyzing and comparing monitoring data after a failure. The prior art lacks a cross-cluster solution for monitoring, alarming and data display for massive applications and services on multiple clusters.
Disclosure of Invention
The embodiment of the invention provides a cross-kubernetes cluster monitoring system and a cross-kubernetes cluster monitoring method, which solve the technical problem that a cross-cluster solution for monitoring, alarming and data displaying of massive applications and services on a plurality of clusters is lacking in the prior art.
The embodiment of the invention provides a cross-kubernetes cluster monitoring system, which comprises the following components:
a plurality of open-sun Alcor clusters, a prometheus-out component and a grafana-out component, the prometheus-out component and the grafana-out component being deployed outside the plurality of open-sun Alcor clusters;
a prometheus monitoring component, an alertmanager monitoring component, a grafana monitoring component, a node-exporter data acquisition component, a process-exporter data acquisition component and a blackbox data acquisition component are installed in each open-sun Alcor cluster;
the prometheus monitoring component is used for: acquiring monitoring data from the open-sun Alcor cluster components and the cluster container Docker, acquiring monitoring data of the open-sun Alcor cluster physical servers from the node-exporter data acquisition component, the process-exporter data acquisition component and the blackbox data acquisition component, generating alarm information according to the monitoring data, and sending the alarm information to the alertmanager monitoring component;
the alertmanager monitoring component is used for: managing the alarm information;
the grafana monitoring component is used for: acquiring monitoring data from the prometheus monitoring component for display;
the prometheus-out component is used for: synchronizing monitoring data from the prometheus monitoring components in the plurality of open-sun Alcor clusters, and adding a data distinguishing label to the monitoring data of each open-sun Alcor cluster;
the grafana-out component is used for: acquiring monitoring data from the prometheus-out component for display.
The embodiment of the invention also provides a cross-kubernetes cluster monitoring method, which comprises the following steps:
installing a prometheus monitoring component, an alertmanager monitoring component, a grafana monitoring component, a node-exporter data acquisition component, a process-exporter data acquisition component and a blackbox data acquisition component in each of a plurality of open-sun Alcor clusters, and deploying a prometheus-out component and a grafana-out component outside the plurality of open-sun Alcor clusters;
the prometheus monitoring component acquires monitoring data from the open-sun Alcor cluster components and the cluster container Docker, acquires monitoring data of the open-sun Alcor cluster physical servers from the node-exporter data acquisition component, the process-exporter data acquisition component and the blackbox data acquisition component, generates alarm information according to the monitoring data, and sends the alarm information to the alertmanager monitoring component;
the alertmanager monitoring component manages the alarm information;
the grafana monitoring component acquires monitoring data from the prometheus monitoring component for display;
the prometheus-out component synchronizes monitoring data from the prometheus monitoring components in the plurality of open-sun Alcor clusters, and adds a data distinguishing label to the monitoring data of each open-sun Alcor cluster;
the grafana-out component acquires monitoring data from the prometheus-out component for display.
The embodiment of the invention also provides computer equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the method when executing the computer program.
The embodiment of the invention also provides a computer readable storage medium, which stores a computer program for executing the method.
In the embodiment of the invention, a prometheus monitoring component, an alertmanager monitoring component and a grafana monitoring component are installed in each open-sun Alcor cluster, so that the collection load for monitoring information and the computation load for monitoring items are no longer concentrated on a single set of prometheus; the prometheus in each open-sun Alcor cluster is responsible only for data collection and monitoring item computation for its own cluster, which keeps the load controllable. A prometheus-out component and a grafana-out component are deployed outside the plurality of open-sun Alcor clusters; prometheus-out is responsible only for synchronizing the monitoring data of all clusters and performs no monitoring item computation, which reduces the computation load. Users can view the monitoring data of containers deployed on different clusters through the single grafana-out entry point, which improves the user experience.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a block diagram of a cross-kubernetes cluster monitoring system according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Explanation of technical terms
Docker is an open-source application container engine that lets developers package an application and its dependencies into a portable container and then publish it to any popular Linux machine; it can also implement virtualization.
kubernetes, abbreviated as K8s, is an open-source system for managing containerized applications on multiple hosts in a cloud platform. Its goal is to make deploying containerized applications simple and powerful, and it provides mechanisms for application deployment, planning, updating and maintenance.
Prometheus is an open-source combination of monitoring, alerting and a time-series database. Here it is used to collect and store the monitoring data of docker and kubernetes.
grafana is an open-source application written in the Go language. It is mainly used for visual display of large-scale metric data, is the most popular time-series data display tool in network architecture and application analysis, and currently supports most commonly used time-series databases. Here it is used to graphically display the monitoring data stored in prometheus.
The alertmanager is an independent alerting module. It receives alerts sent by clients such as Prometheus, processes them by grouping, deduplication and the like, and routes them to the correct receiver; alerts can be sent to the persons responsible for different modules according to different rules. Alerting with Prometheus is divided into two parts: alert rules in the Prometheus server send alerts to the alertmanager, and the alertmanager then manages these alerts, including silencing, inhibition and aggregation, and sends notifications through channels such as email, on-call notification systems and instant messaging platforms.
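The grouping, deduplication and routing described for the alertmanager can be illustrated with a minimal Python sketch. It is not the alertmanager implementation; the function names, the `cluster` label, the alert names and the receiver names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_alerts(alerts, group_by=("alertname", "cluster")):
    """Drop alerts with an identical label set, then group the rest by selected labels."""
    seen = set()
    groups = defaultdict(list)
    for alert in alerts:
        fingerprint = tuple(sorted(alert["labels"].items()))
        if fingerprint in seen:
            continue  # duplicate of an alert that was already received
        seen.add(fingerprint)
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


def pick_receiver(labels, routes, default="default-receiver"):
    """Return the receiver of the first route whose matchers all equal the alert labels."""
    for matchers, receiver in routes:
        if all(labels.get(key) == value for key, value in matchers.items()):
            return receiver
    return default


# Hypothetical alerts: the second entry is an exact duplicate of the first.
alerts = [
    {"labels": {"alertname": "HighCpu", "cluster": "alcor-a", "instance": "node-1"}},
    {"labels": {"alertname": "HighCpu", "cluster": "alcor-a", "instance": "node-1"}},
    {"labels": {"alertname": "HighCpu", "cluster": "alcor-a", "instance": "node-2"}},
    {"labels": {"alertname": "PortDown", "cluster": "alcor-b", "instance": "node-9"}},
]
# Hypothetical routing rules, evaluated in order.
routes = [
    ({"alertname": "PortDown"}, "network-team"),
    ({"cluster": "alcor-a"}, "cluster-a-team"),
]

groups = group_alerts(alerts)
for key, members in sorted(groups.items()):
    print(key, len(members), pick_receiver(members[0]["labels"], routes))
```

The sketch sends one notification target per group rather than per alert, which is the point of grouping: many alerts with the same cause reach the responsible person as a single batch.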
Apiserver is the bus for open-sun Alcor cluster message communication and the external API interface.
The Controller is the controller of the open-sun Alcor cluster state.
The Scheduler is the scheduler for open-sun Alcor cluster container services.
Coredns is the internal DNS resolution service of the open-sun Alcor cluster.
Helm chart is a kubernetes service template orchestration tool.
The main steps of setting up alarms and notifications are:
setting up and configuring the alertmanager;
configuring Prometheus to talk to the alertmanager;
creating alarm rules in Prometheus.
Exporter: any program that can provide monitoring sample data to Prometheus can be called an Exporter.
The node-exporter is mainly used to expose metrics to Prometheus, where the metrics include: CPU load, memory usage, network, and so on.
The process-exporter is mainly used to expose metrics to Prometheus, where the metrics include: the state of the processes running on the server.
Blackbox is mainly used to expose metrics to Prometheus, where the metrics include: server port status.
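What "exposing metrics" means can be sketched in a few lines of Python: an exporter serves its current samples as text, one `name{labels} value` line per sample, which Prometheus then scrapes. The sketch below only renders that text; it is not any exporter's actual code, it omits label-value escaping, and the sample values, the `groupname` value and the address are hypothetical.

```python
def render_metrics(samples):
    """Render (name, labels, value) tuples as one text line per sample."""
    lines = []
    for name, labels, value in samples:
        label_text = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        series = f"{name}{{{label_text}}}" if label_text else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


# Illustrative samples in the style of the three data acquisition components:
# server load, number of processes in a named group, and a port probe result.
samples = [
    ("node_load1", {}, 0.42),
    ("namedprocess_namegroup_num_procs", {"groupname": "dockerd"}, 1),
    ("probe_success", {"instance": "10.0.0.5:6443"}, 1),
]
print(render_metrics(samples), end="")
```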
In an embodiment of the present invention, a cross-kubernetes cluster monitoring system is provided, as shown in Fig. 1, comprising:
a plurality of open-sun Alcor clusters, a prometheus-out component and a grafana-out component, the prometheus-out component and the grafana-out component being deployed outside the plurality of open-sun Alcor clusters;
a prometheus monitoring component, an alertmanager monitoring component, a grafana monitoring component, a node-exporter data acquisition component, a process-exporter data acquisition component and a blackbox data acquisition component are installed in each open-sun Alcor cluster;
the prometheus monitoring component is used for: acquiring monitoring data from the open-sun Alcor cluster components and the cluster container Docker, acquiring monitoring data of the open-sun Alcor cluster physical servers from the node-exporter data acquisition component, the process-exporter data acquisition component and the blackbox data acquisition component, generating alarm information according to the monitoring data, and sending the alarm information to the alertmanager monitoring component;
the alertmanager monitoring component is used for: managing the alarm information;
the grafana monitoring component is used for: acquiring monitoring data from the prometheus monitoring component for display;
the prometheus-out component is used for: synchronizing monitoring data from the prometheus monitoring components in the plurality of open-sun Alcor clusters, and adding a data distinguishing label to the monitoring data of each open-sun Alcor cluster;
the grafana-out component is used for: acquiring monitoring data from the prometheus-out component for display.
In the embodiment of the invention, the prometheus monitoring component is specifically used for:
acquiring monitoring data from the open-sun Alcor cluster components (apiserver, controller, scheduler, coredns) and the cluster container Docker according to the set data acquisition rules, and acquiring the monitoring data of the open-sun Alcor cluster physical servers from the node-exporter data acquisition component, the process-exporter data acquisition component and the blackbox data acquisition component, where Docker and kubernetes are components of the open-sun Alcor cluster;
presetting a monitoring data storage period and storing the monitoring data according to the preset storage period, where the storage period of the monitoring data is 7 days;
generating alarm information from the monitoring data based on the set alarm item rules.
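How an alarm item rule turns monitoring data into alarm information can be sketched as a threshold check over recent samples. This is a simplified illustration in Python, not the Prometheus rule engine; the function name, the 5-minute hold window and the 0.90 threshold are assumptions made for the example.

```python
from datetime import datetime, timedelta


def rule_fires(samples, now, threshold, hold=timedelta(minutes=5)):
    """Fire when every sample in the trailing `hold` window exceeds the threshold."""
    window = [value for ts, value in samples if now - hold <= ts <= now]
    return bool(window) and all(value > threshold for value in window)


# Hypothetical CPU usage samples as (timestamp, value) pairs.
now = datetime(2020, 4, 3, 12, 0)
cpu = [
    (now - timedelta(minutes=10), 0.35),  # outside the hold window, ignored
    (now - timedelta(minutes=4), 0.93),
    (now - timedelta(minutes=2), 0.95),
    (now, 0.97),
]

print(rule_fires(cpu, now, threshold=0.90))  # all samples in the window exceed 0.90
print(rule_fires(cpu, now, threshold=0.96))  # two samples in the window do not exceed 0.96
```

Requiring the condition to hold for a whole window, rather than for a single sample, avoids generating alarm information for short spikes.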
In the embodiment of the invention, the grafana monitoring component and the grafana-out component are specifically used for: displaying the monitoring data through preset data display diagrams.
In the embodiment of the invention, 2 sets of the prometheus-out component and the grafana-out component can be deployed outside the plurality of open-sun Alcor clusters. Using the Prometheus federation mechanism, prometheus-out is configured to synchronize monitoring data from the prometheus in all open-sun Alcor clusters, to keep the data for a preset monitoring data storage period (e.g., 30 days), and to add a distinguishing label to the data of each cluster. grafana-out is configured with a generic data display view applicable to all open-sun Alcor clusters.
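A federation setup of this kind could be generated as follows. The Python sketch builds one scrape job per cluster for prometheus-out, each pulling from the in-cluster prometheus through its `/federate` endpoint and attaching a per-cluster label. The cluster names, addresses, the `cluster` label name and the catch-all `match[]` selector are assumptions for illustration; the patent does not specify them.

```python
import json


def federation_scrape_configs(clusters, match=('{job=~".+"}',)):
    """Build one federation scrape job per cluster, labelled with the cluster name."""
    configs = []
    for name, address in sorted(clusters.items()):
        configs.append({
            "job_name": f"federate-{name}",
            "honor_labels": True,          # keep the labels set by the in-cluster prometheus
            "metrics_path": "/federate",   # federation endpoint of the in-cluster prometheus
            "params": {"match[]": list(match)},
            "static_configs": [
                # The extra label distinguishes the data of each cluster.
                {"targets": [address], "labels": {"cluster": name}},
            ],
        })
    return configs


# Hypothetical clusters and addresses.
clusters = {
    "alcor-a": "prometheus.alcor-a.example:9090",
    "alcor-b": "prometheus.alcor-b.example:9090",
}
configs = federation_scrape_configs(clusters)
print(json.dumps({"scrape_configs": configs}, indent=2))
```

The storage period itself is not part of the scrape configuration; in current Prometheus releases it is set on the server, for example with the `--storage.tsdb.retention.time` flag, so the 30-day and 7-day periods would be configured separately on prometheus-out and on each in-cluster prometheus.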
Based on the same inventive concept, the embodiment of the invention also provides a cross-kubernetes cluster monitoring method, as described in the following embodiment.
Installing a prometheus monitoring component, an alertmanager monitoring component, a grafana monitoring component, a node-exporter data acquisition component, a process-exporter data acquisition component and a blackbox data acquisition component in each of a plurality of open-sun Alcor clusters, and deploying a prometheus-out component and a grafana-out component outside the plurality of open-sun Alcor clusters;
the prometheus monitoring component acquires monitoring data from the open-sun Alcor cluster components and the cluster container Docker, acquires monitoring data of the open-sun Alcor cluster physical servers from the node-exporter data acquisition component, the process-exporter data acquisition component and the blackbox data acquisition component, generates alarm information according to the monitoring data, and sends the alarm information to the alertmanager monitoring component;
the alertmanager monitoring component manages the alarm information;
the grafana monitoring component acquires monitoring data from the prometheus monitoring component for display;
the prometheus-out component synchronizes monitoring data from the prometheus monitoring components in the plurality of open-sun Alcor clusters, and adds a data distinguishing label to the monitoring data of each open-sun Alcor cluster;
the grafana-out component acquires monitoring data from the prometheus-out component for display.
In the embodiment of the invention, the implementation of the cross-kubernetes cluster monitoring system based on data acquisition in the open-sun Alcor environment is a cross-cluster monitoring and data display method for a large-scale container cloud platform, and specifically comprises the following installation steps:
I. Write the prometheus and alertmanager configuration files of prometheus operator, and predefine the data acquisition rules and alarm item rules, so that the monitoring and data display components in each cluster are installed in the form of an operator.
II. Write the grafana configuration file and predefine the data display diagrams, for later integration into the open-sun Alcor installation program through a helm chart.
III. Write a helm chart that integrates the installation of prometheus operator and the node-exporter, process-exporter and blackbox components into the open-sun Alcor installation program, so that the cluster monitoring and data display components are installed automatically when an open-sun Alcor cluster is installed.
IV. Deploy 2 sets of prometheus-out and grafana-out independently outside the clusters, modify the prometheus-out configuration file to form a federation with the prometheus in all clusters, acquire the monitoring data of the in-cluster prometheus in real time, and display the data uniformly through grafana-out.
The embodiment of the invention also provides computer equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the method when executing the computer program.
The embodiment of the invention also provides a computer readable storage medium, which stores a computer program for executing the method.
In summary, the cross-kubernetes cluster monitoring system and method provided by the invention relate to computer technology, docker container technology, kubernetes large-scale container management technology, prometheus system monitoring and alarming technology and grafana data display technology. The system and method help the open-sun Alcor clusters run more healthily, display cluster data more clearly and specifically, enhance the effectiveness of monitoring information, reduce the time required for maintenance, and provide data sources and an initial classification for big data analysis and cloud computing.
Specifically, the collection load for monitoring information and the computation load for monitoring items are no longer concentrated on a single set of prometheus: the prometheus in each open-sun Alcor cluster is responsible only for data collection and monitoring item computation for its own cluster, which keeps the load controllable, while prometheus-out is responsible only for synchronizing the monitoring data of all clusters and performs no monitoring item computation, which reduces the computation load. The prometheus and alertmanager in each cluster can be customized according to the characteristics of that cluster, without being constrained by other clusters. When the configuration is changed, the change can be verified in one cluster before it is rolled out to all clusters, which reduces the risk of configuration changes. The prometheus inside a cluster keeps only 7 days of data, which reduces the storage load inside the cluster; if data is lost in a failure, the lost data can be retrieved from prometheus-out, which ensures high availability of the data. Users can view the monitoring data of containers deployed on different clusters through the single grafana-out entry point, which improves the user experience.
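The two storage periods imply a simple lookup order when historical data is needed, sketched below in Python. The 7-day and 30-day values come from the description; the function name, the returned strings and the `in_cluster_available` flag are hypothetical and only illustrate the reasoning.

```python
from datetime import datetime, timedelta

IN_CLUSTER_RETENTION = timedelta(days=7)   # storage period of the in-cluster prometheus
OUT_RETENTION = timedelta(days=30)         # example storage period of prometheus-out


def choose_source(query_start, now, in_cluster_available=True):
    """Pick where data starting at `query_start` can still be read from."""
    age = now - query_start
    if in_cluster_available and age <= IN_CLUSTER_RETENTION:
        return "in-cluster prometheus"
    if age <= OUT_RETENTION:
        return "prometheus-out"
    return None  # older than both storage periods


now = datetime(2020, 4, 30)
print(choose_source(now - timedelta(days=3), now))
print(choose_source(now - timedelta(days=3), now, in_cluster_available=False))
print(choose_source(now - timedelta(days=20), now))
print(choose_source(now - timedelta(days=45), now))
```

The second call corresponds to the failure case in the text: recent data that the in-cluster prometheus has lost is still readable from prometheus-out.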
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations can be made to the embodiments of the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A cross-kubernetes cluster monitoring system, comprising: a plurality of open-sun Alcor clusters, a prometheus-out component and a grafana-out component, the prometheus-out component and the grafana-out component being deployed outside the plurality of open-sun Alcor clusters;
a prometheus monitoring component, an alertmanager monitoring component, a grafana monitoring component, a node-exporter data acquisition component, a process-exporter data acquisition component and a blackbox data acquisition component are installed in each open-sun Alcor cluster;
the prometheus monitoring component is used for: acquiring monitoring data from the open-sun Alcor cluster components and the cluster container Docker, acquiring monitoring data of the open-sun Alcor cluster physical servers from the node-exporter data acquisition component, the process-exporter data acquisition component and the blackbox data acquisition component, generating alarm information according to the monitoring data, and sending the alarm information to the alertmanager monitoring component;
the alertmanager monitoring component is used for: managing the alarm information;
the grafana monitoring component is used for: acquiring monitoring data from the prometheus monitoring component for display;
the prometheus-out component is used for: synchronizing monitoring data from the prometheus monitoring components in the plurality of open-sun Alcor clusters, and adding a data distinguishing label to the monitoring data of each open-sun Alcor cluster;
the grafana-out component is used for: acquiring monitoring data from the prometheus-out component for display;
the cross-kubernetes cluster monitoring system is implemented based on data acquisition under the open-sun Alcor cluster, the implementation specifically comprising:
writing the prometheus and alertmanager configuration files of prometheus operator, and predefining data acquisition rules and alarm item rules, so that the monitoring and data display components in each cluster are installed in the form of an operator;
writing a configuration file of grafana, predefining data display diagrams, and integrating them into the installation program of the open-sun Alcor cluster through a helm chart;
writing a helm chart that integrates the installation of prometheus operator and the node-exporter, process-exporter and blackbox components into the installation program of the open-sun Alcor cluster, so that the cluster monitoring and data display components are installed automatically when an open-sun Alcor cluster is installed;
independently deploying a set of the prometheus-out component and a set of the grafana-out component outside the clusters, modifying the configuration file of the prometheus-out component so that it forms a federation with the prometheus monitoring components in all clusters, acquiring the monitoring data of the prometheus monitoring components in the clusters in real time, and displaying the data uniformly through the grafana-out component.
2. The cross-kubernetes cluster monitoring system of claim 1, wherein the prometheus monitoring component is specifically configured to:
acquire monitoring data from the open-sun Alcor cluster components and the cluster container Docker according to the set data acquisition rules, and acquire the monitoring data of the open-sun Alcor cluster physical servers from the node-exporter data acquisition component, the process-exporter data acquisition component and the blackbox data acquisition component.
3. The cross-kubernetes cluster monitoring system of claim 1, wherein the prometheus monitoring component and the prometheus-out component are further configured to:
preset a monitoring data storage period, and store the monitoring data according to the preset monitoring data storage period.
4. The cross-kubernetes cluster monitoring system of claim 3, wherein the storage period of the monitoring data in the prometheus monitoring component is 7 days, and the storage period of the monitoring data in the prometheus-out component is 30 days.
5. The cross-kubernetes cluster monitoring system of claim 1, wherein the prometheus monitoring component is specifically configured to:
generate alarm information from the monitoring data based on the set alarm item rules.
6. The cross-kubernetes cluster monitoring system of claim 1, wherein the grafana monitoring component and the grafana-out component are specifically configured to: display the monitoring data through preset data display diagrams.
7. A cross-kubernetes cluster monitoring method applied to the cross-kubernetes cluster monitoring system of any one of claims 1-6, comprising:
installing a prometheus monitoring component, an alertmanager monitoring component, a grafana monitoring component, a node-exporter data acquisition component, a process-exporter data acquisition component and a blackbox data acquisition component in each of a plurality of open-air Alcor clusters, and deploying a prometheus-out component and a grafana-out component outside the plurality of open-air Alcor clusters;
the prometheus monitoring component acquires monitoring data from the open-air Alcor cluster component and the cluster container Docker, acquires monitoring data of the open-air Alcor cluster physical server from the node-exporter data acquisition component, the process-exporter data acquisition component and the blackbox data acquisition component, generates alarm information according to the monitoring data, and sends the alarm information to the alertmanager monitoring component;
the alertmanager monitoring component manages the alarm information;
the grafana monitoring component acquires monitoring data from the prometheus monitoring component for display;
the prometheus-out component synchronizes monitoring data from prometheus monitoring components in a plurality of open-air Alcor clusters, and adds a data distinguishing tag to the monitoring data of each open-air Alcor cluster;
the grafana-out component acquires monitoring data from the prometheus-out component for display.
8. The cross-kubernetes cluster monitoring method of claim 7, wherein the prometheus monitoring component acquiring monitoring data from the open-air Alcor cluster component and the cluster container Docker, and acquiring monitoring data of the open-air Alcor cluster physical server from the node-exporter data acquisition component, the process-exporter data acquisition component and the blackbox data acquisition component, comprises:
acquiring monitoring data from the open-air Alcor cluster component and the cluster container Docker according to the set data acquisition rule, and acquiring the monitoring data of the open-air Alcor cluster physical server from the node-exporter data acquisition component, the process-exporter data acquisition component and the blackbox data acquisition component.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 7 to 8 when executing the computer program.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the method of any of claims 7 to 8.
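
For illustration only, the federation between the prometheus-out component and the in-cluster prometheus monitoring components, together with the per-cluster data distinguishing tag and the 7-day and 30-day storage periods, might be sketched as follows. This is not the configuration used by the applicant. The cluster names, addresses, job name and match selector are hypothetical, and attaching the tag as a `cluster` label on each federation target is an assumption, since the claims do not state how the tag is added. The `/federate` path, the `honor_labels` and `match[]` settings and the `--storage.tsdb.retention.time` flag are standard Prometheus features.

```python
import json
from typing import Dict

# Storage periods from claim 4: 7 days in-cluster, 30 days out-of-cluster.
IN_CLUSTER_RETENTION = "7d"
OUT_OF_CLUSTER_RETENTION = "30d"


def retention_flag(period: str) -> str:
    """Return the standard Prometheus flag that sets the storage period."""
    return f"--storage.tsdb.retention.time={period}"


def build_federation_config(clusters: Dict[str, str]) -> dict:
    """Build a scrape configuration for the out-of-cluster Prometheus.

    `clusters` maps a (hypothetical) cluster name to the address of that
    cluster's in-cluster Prometheus. Each target receives a `cluster`
    label, which plays the role of the data distinguishing tag.
    """
    static_configs = [
        {"targets": [address], "labels": {"cluster": name}}
        for name, address in sorted(clusters.items())
    ]
    return {
        "scrape_configs": [
            {
                "job_name": "federate",        # hypothetical job name
                "honor_labels": True,          # keep labels set in-cluster
                "metrics_path": "/federate",   # Prometheus federation endpoint
                "params": {"match[]": ['{job=~".+"}']},  # assumed selector
                "static_configs": static_configs,
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical clusters and addresses, for demonstration only.
    example = {"alcor-a": "10.0.0.1:9090", "alcor-b": "10.0.0.2:9090"}
    # JSON is a subset of YAML, so this output can be saved as a config file.
    print(json.dumps(build_federation_config(example), indent=2))
    print(retention_flag(IN_CLUSTER_RETENTION))
    print(retention_flag(OUT_OF_CLUSTER_RETENTION))
```

With this layout, a query against the prometheus-out component can separate clusters by filtering on the `cluster` label, which is what allows the grafana-out component to display all clusters together.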
CN202010258248.3A 2020-04-03 2020-04-03 Cross-kubernetes cluster monitoring system and method Active CN111459763B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010258248.3A CN111459763B (en) 2020-04-03 2020-04-03 Cross-kubernetes cluster monitoring system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010258248.3A CN111459763B (en) 2020-04-03 2020-04-03 Cross-kubernetes cluster monitoring system and method

Publications (2)

Publication Number Publication Date
CN111459763A (en) 2020-07-28
CN111459763B (en) 2023-10-24

Family

ID=71685848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010258248.3A Active CN111459763B (en) 2020-04-03 2020-04-03 Cross-kubernetes cluster monitoring system and method

Country Status (1)

Country Link
CN (1) CN111459763B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112165502B (en) * 2020-08-06 2022-11-25 中信银行股份有限公司 Service discovery system, method and second server
CN112015753B (en) * 2020-08-31 2023-10-31 北京易捷思达科技发展有限公司 Monitoring system and method suitable for containerized deployment of open source cloud platform
CN112162821B (en) * 2020-09-25 2022-04-26 中国电力科学研究院有限公司 Container cluster resource monitoring method, device and system
CN112286628B (en) * 2020-10-19 2022-05-17 烽火通信科技股份有限公司 System for unified management of Kubernetes heterogeneous applications and operation method
CN112511339B (en) * 2020-11-09 2023-04-07 宝付网络科技(上海)有限公司 Container monitoring alarm method, system, equipment and storage medium based on multiple clusters
CN112711512A (en) * 2020-12-29 2021-04-27 北京浪潮数据技术有限公司 Prometheus monitoring method, device and equipment
CN112698915A (en) * 2020-12-31 2021-04-23 北京千方科技股份有限公司 Multi-cluster unified monitoring alarm method, system, equipment and storage medium
CN112328456B (en) * 2021-01-04 2021-12-03 北京电信易通信息技术股份有限公司 Cluster resource monitoring system based on service discovery
CN114003312A (en) * 2021-10-29 2022-02-01 广东智联蔚来科技有限公司 Big data service component management method, computer device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108921551A (en) * 2018-06-11 2018-11-30 西安纸贵互联网科技有限公司 Consortium blockchain system based on the Kubernetes platform
CN109245931A (en) * 2018-09-19 2019-01-18 四川长虹电器股份有限公司 Implementation method for log management and monitoring alarms of a Kubernetes-based container cloud platform
CN110086674A (en) * 2019-05-06 2019-08-02 山东浪潮云信息技术有限公司 Container-based application high-availability implementation method and system
CN110247810A (en) * 2019-07-09 2019-09-17 浪潮云信息技术有限公司 System and method for collecting container service monitoring data
CN110262944A (en) * 2019-06-21 2019-09-20 四川长虹电器股份有限公司 Method for monitoring and alerting on K8s cluster container resources

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190317824A1 (en) * 2018-04-11 2019-10-17 Microsoft Technology Licensing, Llc Deployment of services across clusters of nodes

Also Published As

Publication number Publication date
CN111459763A (en) 2020-07-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221010

Address after: 25 Financial Street, Xicheng District, Beijing 100033

Applicant after: CHINA CONSTRUCTION BANK Corp.

Address before: 25 Financial Street, Xicheng District, Beijing 100033

Applicant before: CHINA CONSTRUCTION BANK Corp.

Applicant before: Jianxin Financial Science and Technology Co.,Ltd.

GR01 Patent grant