CN111901405B - Multi-node monitoring method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN111901405B
Authority
CN
China
Prior art keywords
monitoring
monitoring data
data
node
host
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010706635.9A
Other languages
Chinese (zh)
Other versions
CN111901405A (en)
Inventor
郭健伟
季统凯
贺忠堂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
G Cloud Technology Co Ltd
Original Assignee
G Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by G Cloud Technology Co Ltd filed Critical G Cloud Technology Co Ltd
Priority to CN202010706635.9A priority Critical patent/CN111901405B/en
Publication of CN111901405A publication Critical patent/CN111901405A/en
Priority to PCT/CN2021/073799 priority patent/WO2022016845A1/en
Application granted granted Critical
Publication of CN111901405B publication Critical patent/CN111901405B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1001: Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1001: Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004: Server selection for load balancing
    • H04L67/101: Server selection for load balancing based on network conditions
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02: Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiment of the application discloses a multi-node monitoring method, a multi-node monitoring device, electronic equipment and a storage medium. According to the technical scheme provided by the embodiment of the application, each host starts collecting monitoring data of the corresponding public resource according to a pre-configured public resource monitoring item and monitoring weight; network delay data between each host and the monitoring data storage node are extracted; the reliability value of the monitoring data collected by each host is calculated from the network delay data and the monitoring weight using a predefined reliability calculation formula; and finally the monitoring data with the highest reliability value are screened out and stored in the monitoring data storage node. Because only the monitoring data with the highest reliability value are screened out and stored, monitoring data are collected and stored with high reliability and the system overhead of the cloud computing platform is reduced. Screening the monitoring data before storage also avoids repeated collection and storage of monitoring data, which reduces the occupation of the database's storage resources.

Description

Multi-node monitoring method and device, electronic equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of cloud computing monitoring, in particular to a multi-node monitoring method, a multi-node monitoring device, electronic equipment and a storage medium.
Background
Cloud computing platforms are emerging as a business computing model. A platform is typically made up of multiple computing nodes (i.e., hosts), through which the resources generated by the platform are managed and monitored. When the cloud computing platform is monitored, each host is generally used to acquire data about the resources on that host, and the data are stored in a corresponding database.
However, because there are multiple hosts, many resources in a cloud computing platform are shared, i.e., different hosts may share the same resource. Thus, when monitoring data are acquired, each host of the cloud computing platform repeatedly acquires and stores the monitoring data of the common resources. Repeatedly acquiring and storing the same monitoring data increases the volume of stored monitoring data and occupies the storage resources of the database.
Disclosure of Invention
The embodiment of the application provides a multi-node monitoring method, a multi-node monitoring device, electronic equipment and a storage medium, which can avoid repeated acquisition and storage of monitoring data and reduce occupation of storage resources.
In a first aspect, an embodiment of the present application provides a multi-node monitoring method, including:
each host machine starts to collect monitoring data of corresponding public resources according to a pre-configured public resource monitoring item and monitoring weight, wherein the monitoring weight is used for adjusting the collection frequency and the collection time point of the monitoring data;
extracting network delay data of each host and each monitoring data storage node, and calculating the reliability value of each monitoring data acquired by each host by using a predefined reliability calculation formula according to the network delay data and the monitoring weight;
and screening the monitoring data with the highest reliability value and storing the monitoring data in the monitoring data storage node.
Further, each host starts to collect monitoring data of the corresponding public resource according to a pre-configured public resource monitoring item and monitoring weight, including:
and each host machine collects initial data corresponding to the public resources according to the collection frequency and the collection time point, calculates an average value based on the collection frequency and the data quantity of the initial data, and takes the average value as the monitoring data.
Further, screening the monitoring data with the highest reliability value and storing the monitoring data in the monitoring data storage node, including:
and storing the data value, the data acquisition time, the corresponding public resource monitoring item name and the data acquisition object of the monitoring data with the highest reliability value in the monitoring data storage node.
Further, the monitoring data storage node is a time sequence type database.
Further, after screening the monitoring data with the highest reliability value and storing the monitoring data in the monitoring data storage node, the method further comprises:
and taking the host machine corresponding to the monitoring data with the highest reliability value as a monitoring node, keeping the monitoring node to collect the monitoring data corresponding to the public resource in a set period of time, and stopping the rest host machines from collecting the monitoring data corresponding to the public resource.
Further, after the host machine corresponding to the monitoring data with the highest reliability value is used as a monitoring node, the monitoring node is kept to collect the monitoring data of the corresponding public resource in a set period, and the rest host machines are stopped from collecting the monitoring data of the corresponding public resource, the method further comprises:
and after a set time period, recalculating the reliability values of the monitoring data of the corresponding public resources collected by each host machine, and redetermining the monitoring nodes according to the recalculated reliability values.
Further, before each host initiates to collect the monitoring data of the corresponding public resource according to the pre-configured public resource monitoring item and the monitoring weight, the method further comprises:
and each host machine sets the monitoring weight corresponding to the public resource according to user definition or random setting.
In a second aspect, embodiments of the present application provide a multi-node monitoring apparatus, comprising:
an acquisition module, configured to start collecting, through each host, monitoring data of the corresponding public resource according to a pre-configured public resource monitoring item and monitoring weight, wherein the monitoring weight is used for adjusting the collection frequency and the collection time point of the monitoring data;
the computing module is used for extracting network delay data of each host machine and each monitoring data storage node, and computing the reliability value of each monitoring data acquired by each host machine by using a predefined reliability computing formula according to the network delay data and the monitoring weight;
and the screening module is used for screening the monitoring data with the highest reliability value and storing the monitoring data in the monitoring data storage node.
In a third aspect, an embodiment of the present application provides an electronic device, including:
a memory and one or more processors;
the memory is used for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the multi-node monitoring method as described in the first aspect.
In a fourth aspect, embodiments of the present application provide a storage medium containing computer-executable instructions, which when executed by a computer processor, are for performing the multi-node monitoring method as described in the first aspect.
According to the embodiments of the application, each host starts collecting monitoring data of the corresponding public resource according to a pre-configured public resource monitoring item and monitoring weight; network delay data between each host and the monitoring data storage node are extracted; the reliability value of the monitoring data collected by each host is calculated from the network delay data and the monitoring weight using a predefined reliability calculation formula; and finally the monitoring data with the highest reliability value are screened out and stored in the monitoring data storage node. Because only the monitoring data with the highest reliability value are screened out and stored, monitoring data are collected and stored with high reliability and the system overhead of the cloud computing platform is reduced. Screening the monitoring data before storage also avoids repeated collection and storage of monitoring data, which reduces the occupation of the database's storage resources.
Drawings
FIG. 1 is a flow chart of a multi-node monitoring method provided in accordance with an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a cloud computing platform according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a multi-node monitoring device according to a second embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an electronic device according to a third embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the following detailed description of specific embodiments thereof is given with reference to the accompanying drawings. It is to be understood that the specific embodiments described herein are merely illustrative of the application and not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the matters related to the present application are shown in the accompanying drawings. Before discussing exemplary embodiments in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart depicts operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently, or at the same time. Furthermore, the order of the operations may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figures. The processes may correspond to methods, functions, procedures, subroutines, and the like.
The multi-node monitoring method of the present application calculates a reliability value for the monitoring data collected by each host (that is, each computing node) and, based on these reliability values, selects the monitoring data collected by one host for storage, so that monitoring data are not repeatedly obtained and stored and the occupation of the database's storage resources is reduced. A conventional cloud computing platform generally processes services with multiple computing nodes or hosts, and cloud computing monitoring monitors resources on top of this structure. Cloud computing is highly scheduled: it manages and shares various resources, so monitoring of public resources is unavoidable, and different hosts repeatedly acquire monitoring data of the same public resources. Repeatedly storing these data easily makes the data volume excessive; in particular, when the cloud computing platform has many hosts, the amount of data stored in the monitoring data storage node becomes too large. In addition, although every host obtains the same monitoring data, every host incurs additional storage overhead for them. Based on the above, the multi-node monitoring method of the embodiment of the application is provided to solve the technical problem that storage resources are excessively occupied when the monitoring data of an existing cloud computing platform are repeatedly stored.
Embodiment one:
FIG. 1 shows a flowchart of a multi-node monitoring method according to the first embodiment of the present application. The multi-node monitoring method provided in this embodiment may be executed by a multi-node monitoring device, which may be implemented in software and/or hardware and may consist of one physical entity or of two or more physical entities. Generally, the multi-node monitoring device is a cloud computing platform.
The following describes the cloud computing platform as an example of a main body for executing the multi-node monitoring method. Referring to fig. 1, the multi-node monitoring method specifically includes:
S110, each host starts collecting monitoring data of the corresponding public resource according to a pre-configured public resource monitoring item and monitoring weight, wherein the monitoring weight is used for adjusting the collection frequency and the collection time point of the monitoring data.
The cloud computing platform monitors various resources while it runs. Public resources are monitored according to the public resource monitoring items pre-configured on the hosts. FIG. 2 provides a schematic structural diagram of the cloud computing platform. Each host 11 is connected to a central management node 12, which is used to configure the relevant parameters. Each host 11 is also connected to the monitoring data storage node 13, and the monitoring data collected by the host 11 are stored in the monitoring data storage node 13. Specifically, before the public resources are monitored, a public resource monitoring item needs to be configured on each host in advance. The public resource monitoring item indicates the public resource to be monitored by the host, and the host collects monitoring data of that public resource accordingly. It should be noted that the public resource monitoring items configured on different hosts may be the same or different, and a host typically has multiple public resource monitoring items. In general, a corresponding number of hosts is configured for a public resource monitoring item according to actual monitoring needs. Subsequently, when the corresponding public resource is monitored, monitoring data are collected by each host on which that public resource monitoring item is configured.
In addition, on each host a monitoring weight is configured for each public resource monitoring item. The monitoring weight represents the system overhead the host pays for monitoring the corresponding public resource. Specifically, each host sets the monitoring weight of the corresponding public resource either as defined by the user or at random. It can be understood that the monitoring weights of different public resource monitoring items configured on the same host may be the same or different. The public resource monitoring items and monitoring weights can be configured by the user through the central management node: the user configures the public resource monitoring items and corresponding monitoring weights of all hosts there, and the central management node sends the corresponding configuration information to each host according to the user's configuration operation. The user can also configure the resource monitoring items and corresponding monitoring weights directly on a host. It should be noted that if no host has been configured for a certain public resource monitoring item, a host is assigned to that item at random and the monitoring weight of the item is set at random accordingly.
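The configuration step described above can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the function name `complete_config`, the host names, and the item names are hypothetical, and the weight range 1 to 60 is taken from the value range stated later in this embodiment.

```python
import random

# Assumed weight range, taken from the 1-60 range given in the description.
WEIGHT_RANGE = (1, 60)


def complete_config(hosts, items, config, rng=None):
    """Return a per-host {item: weight} mapping covering every item.

    `config` holds the user-defined assignments. Any monitoring item that no
    host has been configured for is given to one randomly chosen host with a
    randomly chosen weight, as the description states.
    """
    rng = rng or random.Random()
    # Copy so the user's configuration is not modified in place.
    completed = {host: dict(config.get(host, {})) for host in hosts}
    for item in items:
        if not any(item in completed[host] for host in hosts):
            host = rng.choice(hosts)
            completed[host][item] = rng.randint(*WEIGHT_RANGE)
    return completed


# Hypothetical example: one item configured by the user, one left unassigned.
user_config = {"host-1": {"ceph_rw_rate": 30}, "host-2": {"ceph_rw_rate": 10}}
full = complete_config(
    ["host-1", "host-2"],
    ["ceph_rw_rate", "shared_volume_usage"],
    user_config,
    random.Random(0),
)
```

In this sketch the user-defined weights are kept untouched, and only the unassigned item receives a random host and weight.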
Further, each host machine adjusts the collection frequency and the collection time point of the monitoring data of the corresponding public resource according to the monitoring weight of the pre-configured corresponding public resource monitoring item. The collection frequency represents the frequency of collecting the monitoring data in unit time, and the collection time point represents the time stamp of collecting the monitoring data. After that, when monitoring data acquisition of the corresponding public resource is performed, the monitoring data acquisition is performed based on the acquisition frequency and the acquisition time point.
In one embodiment, each host collects initial data of the corresponding public resource according to the collection frequency and the collection time points, calculates an average value based on the collection frequency and the amount of initial data, and uses the average value as the monitoring data. For example, when Ceph (a distributed file system) is monitored, the read-write rate of Ceph is collected as the monitoring data. The read-write rate of Ceph can be understood as the average amount of data read and written by Ceph within 1 minute. After the collection frequency and the collection time points are adjusted according to the monitoring weight, if the monitoring weight is 1, the host collects the read-write amount of Ceph only once within 1 minute and then takes the average; if the monitoring weight is 10, the host collects the read-write amount of Ceph 10 times within 1 minute and then takes the average. The resulting average value is used as the monitoring data, so the monitoring data are more general and representative and therefore more reliable. It should be noted that, in the embodiment of the present application, the monitoring weight generally takes a value from 1 to 60, corresponding to the number of collections of monitoring data within one minute for any public resource. For example, if the monitoring weight is 30, the collection interval is 60/30, i.e., monitoring data are collected once every 2 s. The collection time points are then 0 s, 2 s, 4 s, and so on, until the collection of the corresponding monitoring data is complete.
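The relationship between monitoring weight, collection time points, and the averaged monitoring value can be sketched as below. This is a minimal sketch under the assumptions stated in the text (a one-minute window and a weight of 1 to 60); the function names and the sample values are hypothetical.

```python
from statistics import mean


def collection_schedule(weight: int, window_s: int = 60) -> list[float]:
    """Return the collection time points (in seconds) within one window."""
    if not 1 <= weight <= 60:
        raise ValueError("monitoring weight is expected to be in 1..60")
    interval = window_s / weight  # e.g. 60 / 30 = 2 s between collections
    return [i * interval for i in range(weight)]


def monitoring_value(samples: list[float]) -> float:
    """Average the initial data collected in one window into one value."""
    return mean(samples)


points = collection_schedule(30)
print(points[:3])                               # [0.0, 2.0, 4.0]
print(monitoring_value([100.0, 120.0, 110.0]))  # 110.0
```

A weight of 1 yields a single time point at 0 s, so the "average" is simply that one sample, which matches the Ceph example above.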
S120, extracting network delay data of each host and the monitoring data storage node, and calculating the reliability value of the monitoring data collected by each host by using a predefined reliability calculation formula according to the network delay data and the monitoring weight.
Because several hosts are configured to collect monitoring data for one public resource, only the monitoring data with the highest reliability are screened out for storage, which avoids repeated storage of the monitoring data and saves system overhead. The reliability of the monitoring data collected by each host is determined from the network connectivity between the host and the monitoring data storage node and from the host's monitoring weight for the public resource. In the embodiment of the application, network delay data are used to represent the network connectivity between each host and the monitoring data storage node. It will be appreciated that the better the network connectivity between a host and the monitoring data storage node, the smaller the value of its network delay data.
Specifically, a predefined reliability calculation formula is used to calculate the reliability value of the monitoring data collected by each host. The reliability of each monitoring data is represented by a reliability value. It will be appreciated that the higher the reliability value, the higher the reliability of the monitoring data. The reliability calculation formula is as follows:
h=(w*0.3*0.01)+(100/g*0.7)*100
wherein h is the reliability value, w is the monitoring weight with a value of 1 to 60, and g is the network delay data. The coefficients 0.3 and 0.7 are the proportions with which the monitoring weight and the network connectivity, respectively, influence the reliability value; they are defined according to measured data. It should be noted that network connectivity accounts for 70% of the reliability value because the network connectivity between the host and the monitoring data storage node reflects the source of system overhead more closely than the monitoring weight does. The monitoring weight is set manually or at random, so its objectivity is relatively low, and it is therefore given the smaller proportion of 0.3. In addition, since network delay is inversely proportional to network connectivity, this part of the reliability value is calculated as "(100/g*0.7)*100", which differs from the way the monitoring weight term is calculated in the formula above. It should be noted that the above reliability calculation formula is only one form of calculating the reliability value of the monitoring data in the embodiment of the present application; in practical applications, other calculation formulas may be predefined according to the requirements of the reliability calculation. Furthermore, the network delay data g is measured in milliseconds; when the network delay data is extracted, only its numerical value is substituted into the formula, without its unit.
Further, based on the reliability calculation formula, network delay data of each host and the monitoring data storage node and monitoring weights corresponding to the monitoring data are extracted, substituted into the reliability calculation formula, and reliability values corresponding to the monitoring data collected by each host are calculated.
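The reliability calculation can be written directly from the formula given above. The sketch below is illustrative only: the function name `reliability` and the sample weights and delays are hypothetical, and the guard against a non-positive delay is an added assumption, since the text does not say how such a value would be handled.

```python
def reliability(weight: int, delay_ms: float) -> float:
    """Compute h = (w*0.3*0.01) + (100/g*0.7)*100 as given in the description.

    weight   -- monitoring weight w, expected in 1..60
    delay_ms -- network delay g between host and storage node, in milliseconds
    """
    if delay_ms <= 0:
        # Assumption: a delay of zero or less is treated as invalid input.
        raise ValueError("network delay must be positive")
    return (weight * 0.3 * 0.01) + (100 / delay_ms * 0.7) * 100


# Hypothetical hosts: the lower delay dominates, even with a smaller weight.
h_near = reliability(weight=10, delay_ms=5.0)
h_far = reliability(weight=60, delay_ms=20.0)
print(h_near > h_far)  # True
```

With w limited to 1 to 60, the weight term contributes at most 0.18, so under this formula the ranking of hosts is decided almost entirely by network delay; the weight acts mainly as a tie-breaker between hosts with equal delay.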
S130, screening the monitoring data with the highest reliability value and storing the monitoring data in the monitoring data storage node.
Finally, the reliability values calculated for the monitoring data of each host with the reliability calculation formula are compared, and the monitoring data with the highest reliability value are screened out and stored. When the monitoring data with the highest reliability value are stored, their data value, data acquisition time, corresponding public resource monitoring item name and data acquisition object are stored in the monitoring data storage node.
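The screening step can be sketched as a simple maximum over the per-host reliability values, followed by building the record with the four stored fields named above. This is a sketch under assumptions: the `Sample` class, its field names, and the example values are hypothetical, and the formula is repeated here only so the block is self-contained.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    host: str            # host that collected the data (hypothetical field)
    item: str            # public resource monitoring item name
    target: str          # data acquisition object
    value: float         # averaged monitoring value
    collected_at: float  # data acquisition time (epoch seconds)
    weight: int          # monitoring weight w
    delay_ms: float      # network delay g to the storage node


def reliability(weight: int, delay_ms: float) -> float:
    """Reliability formula from the description."""
    return (weight * 0.3 * 0.01) + (100 / delay_ms * 0.7) * 100


def screen(samples: list[Sample]) -> tuple[str, dict]:
    """Pick the sample with the highest reliability value; return host and record."""
    best = max(samples, key=lambda s: reliability(s.weight, s.delay_ms))
    record = {
        "value": best.value,
        "time": best.collected_at,
        "item": best.item,
        "object": best.target,
    }
    return best.host, record


samples = [
    Sample("host-1", "ceph_rw_rate", "ceph", 110.0, 1_600_000_000.0, 30, 12.0),
    Sample("host-2", "ceph_rw_rate", "ceph", 108.0, 1_600_000_000.0, 10, 4.0),
]
host, record = screen(samples)
print(host)  # host-2
```

Only the single selected record would be written to the storage node; the samples from the other hosts are discarded rather than stored.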
Further, in the embodiment of the present application, the monitoring data storage node may be a relational database or a time-series database. In terms of processing efficiency, storing the monitoring data is the most time-consuming overhead. If a relational database is used, the program must spend part of its resources obtaining the time, and a relational database cannot cope with processing a large amount of data. The time-series database InfluxDB removes part of this time cost: it can obtain the time of inserted data by itself and can process a large amount of data at once. Therefore, the monitoring data storage node of the embodiments of the present application is preferably a time-series database.
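As an illustration of letting the time-series database supply the timestamp, the sketch below formats one monitoring record as an InfluxDB line protocol string without a trailing timestamp, in which case the server assigns the time on write. The patent does not describe its storage schema, so the measurement name, the `object` tag, and the `value` field are assumptions, and no client library is used; the string would still have to be sent to the database by some other means.

```python
def _escape_measurement(text: str) -> str:
    """Escape commas and spaces in a measurement name."""
    return text.replace(",", "\\,").replace(" ", "\\ ")


def _escape_tag(text: str) -> str:
    """Escape commas, equals signs and spaces in a tag value."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(item: str, target: str, value: float) -> str:
    """Format one record as '<measurement>,object=<tag> value=<field>'.

    The timestamp is deliberately omitted so that the database fills it in.
    """
    return (
        f"{_escape_measurement(item)},"
        f"object={_escape_tag(target)} value={float(value)}"
    )


line = to_line_protocol("ceph_rw_rate", "ceph pool-1", 110.0)
```

Using the monitoring item name as the measurement and the acquisition object as a tag is one possible mapping of the four stored fields; other mappings are equally plausible.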
Further, after determining the monitoring data with the highest reliability value, the embodiment of the application further uses the host machine corresponding to the monitoring data with the highest reliability value as a monitoring node, keeps the monitoring node to collect the monitoring data corresponding to the public resource in a set period, and stops the rest host machines from collecting the monitoring data corresponding to the public resource. Further, after a set period of time, the reliability values of the corresponding public resource monitoring data collected by the hosts are recalculated, and the monitoring nodes are redetermined according to the recalculated reliability values.
For example, take 10 minutes as the set period. During the first monitoring data collection in a period (for example, the collection in the first minute), each host collects monitoring data according to its public resource monitoring items and the corresponding monitoring weights. Then, for each public resource, the reliability values of the monitoring data collected by the hosts are compared, and the monitoring data with the highest reliability value are determined and stored in the monitoring data storage node. The host corresponding to these monitoring data is taken as the monitoring node, and for the rest of the period the monitoring node alone collects and stores the monitoring data of the corresponding public resource. The process of determining the monitoring node is repeated every 10 minutes in the same way. While the monitoring node collects and stores monitoring data, the other hosts pause the collection of monitoring data for the corresponding public resource, which saves system overhead of the cloud computing platform.
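The periodic re-election of the monitoring node can be sketched as follows. This is a simplified simulation rather than the patent's implementation: the function names, host names, and delay figures are hypothetical, each entry in `delay_history` stands for the delays measured at the start of one set period, and the formula is repeated so the block is self-contained.

```python
def reliability(weight: int, delay_ms: float) -> float:
    """Reliability formula from the description."""
    return (weight * 0.3 * 0.01) + (100 / delay_ms * 0.7) * 100


def elect_monitor(weights: dict[str, int], delays: dict[str, float]) -> str:
    """Return the host whose monitoring data has the highest reliability value."""
    return max(weights, key=lambda h: reliability(weights[h], delays[h]))


def plan_periods(weights, delay_history):
    """For each period, return (monitoring node, hosts that pause collection)."""
    plan = []
    for delays in delay_history:
        monitor = elect_monitor(weights, delays)
        paused = sorted(h for h in weights if h != monitor)
        plan.append((monitor, paused))
    return plan


weights = {"host-a": 30, "host-b": 10}
delay_history = [
    {"host-a": 5.0, "host-b": 8.0},   # period 1: host-a is closer
    {"host-a": 20.0, "host-b": 4.0},  # period 2: host-b is closer
]
plan = plan_periods(weights, delay_history)
print(plan)  # [('host-a', ['host-b']), ('host-b', ['host-a'])]
```

Re-running the election at every period boundary lets the monitoring role follow changes in network delay instead of staying with the host chosen first.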
In the above manner, each host starts collecting monitoring data of the corresponding public resource according to a pre-configured public resource monitoring item and monitoring weight; network delay data between each host and the monitoring data storage node are extracted; the reliability value of the monitoring data collected by each host is calculated from the network delay data and the monitoring weight using a predefined reliability calculation formula; and finally the monitoring data with the highest reliability value are screened out and stored in the monitoring data storage node. Because only the monitoring data with the highest reliability value are screened out and stored, monitoring data are collected and stored with high reliability and the system overhead of the cloud computing platform is reduced. Screening the monitoring data before storage also avoids repeated collection and storage of monitoring data, which reduces the occupation of the database's storage resources.
Embodiment two:
Based on the above embodiment, FIG. 3 is a schematic structural diagram of a multi-node monitoring device according to a second embodiment of the present application. Referring to FIG. 3, the multi-node monitoring device provided in this embodiment specifically includes: an acquisition module 21, a calculation module 22 and a screening module 23.
The acquisition module 21 is configured to start collecting, through each host, monitoring data of the corresponding public resource according to a pre-configured public resource monitoring item and monitoring weight, where the monitoring weight is used to adjust the collection frequency and the collection time point of the monitoring data;
the calculation module 22 is configured to extract network delay data of each host and the monitoring data storage node, and to calculate, according to the network delay data and the monitoring weight, the reliability value of the monitoring data collected by each host using a predefined reliability calculation formula;
the screening module 23 is configured to screen the monitoring data with the highest reliability value and store the monitoring data in the monitoring data storage node.
With this device, each host machine starts collecting monitoring data of the corresponding public resource according to the pre-configured public resource monitoring item and monitoring weight. The network delay data between each host machine and the monitoring data storage node is then extracted, and the reliability value of each piece of monitoring data collected by each host machine is calculated from the network delay data and the monitoring weight using a predefined reliability calculation formula. Finally, the monitoring data with the highest reliability value is screened out and stored in the monitoring data storage node. Because only the monitoring data with the highest reliability value is stored, the collection and storage of monitoring data remain highly reliable while the system overhead of the cloud computing platform is reduced. Screening before storage also avoids collecting and storing the same monitoring data repeatedly, which reduces the storage resources occupied in the database.
The multi-node monitoring device provided in the second embodiment of the present application may be used to execute the multi-node monitoring method provided in the first embodiment, and has corresponding functions and beneficial effects.
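As an illustration of how the acquisition module might use the monitoring weight, the following Python sketch derives a collection interval and a set of collection time points from a weight, and averages the raw readings into one monitoring value as claim 2 describes. The text states only that the weight adjusts the collection frequency and time point; the specific mapping used here (interval equal to a base interval divided by the weight, time points at whole multiples of that interval), the reading of the average as an arithmetic mean, and all function names and the 60-second base interval are assumptions.

```python
def collection_interval(weight: float, base_interval_s: float = 60.0) -> float:
    """Hypothetical mapping: a larger monitoring weight means more frequent collection."""
    if weight <= 0:
        raise ValueError("monitoring weight must be positive")
    return base_interval_s / weight


def collection_time_points(
    weight: float, horizon_s: float, base_interval_s: float = 60.0
) -> list[float]:
    """Collection time points (seconds from start) that fall within the horizon."""
    interval = collection_interval(weight, base_interval_s)
    count = int(horizon_s // interval)
    # Multiply rather than accumulate to avoid floating-point drift.
    return [k * interval for k in range(1, count + 1)]


def averaged_monitoring_value(initial_data: list[float]) -> float:
    """Average the initial readings; the mean is used as the monitoring data."""
    if not initial_data:
        raise ValueError("no initial data collected")
    return sum(initial_data) / len(initial_data)
```

Under these assumptions, two hosts with different weights sample the same public resource at different rates and at different moments, so their monitoring data can be compared by reliability value afterwards.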
Embodiment three:
Referring to fig. 4, the electronic device provided in the third embodiment of the present application includes a processor 31, a memory 32, a communication module 33, an input device 34 and an output device 35. The electronic device may contain one or more processors and one or more memories. The processor, memory, communication module, input device and output device of the electronic device may be connected by a bus or by other means.
The memory 32, as a computer-readable storage medium, is used to store software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the multi-node monitoring method according to any embodiment of the present application (e.g., the acquisition module, calculation module and screening module in the multi-node monitoring device). The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and at least one application program required for a function, and the data storage area may store data created during use of the device. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory may further include memory located remotely from the processor, and this remote memory may be connected to the device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The communication module 33 is used for data transmission.
The processor 31 executes the various functional applications and data processing of the device by running the software programs, instructions and modules stored in the memory, thereby implementing the multi-node monitoring method described above.
The input device 34 may be used to receive entered numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output device 35 may include a display device such as a display screen.
The electronic device provided by the above embodiment can be used for executing the multi-node monitoring method provided by the above embodiment, and has corresponding functions and beneficial effects.
Embodiment four:
This embodiment of the present application also provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a multi-node monitoring method comprising: each host machine starts collecting monitoring data of the corresponding public resource according to a pre-configured public resource monitoring item and a monitoring weight, where the monitoring weight is used to adjust the collection frequency and the collection time point of the monitoring data; extracting network delay data between each host machine and a monitoring data storage node, and calculating the reliability value of each piece of monitoring data collected by each host machine using a predefined reliability calculation formula according to the network delay data and the monitoring weight; and screening out the monitoring data with the highest reliability value and storing it in the monitoring data storage node.
Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media such as CD-ROM, floppy disk or tape devices; computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory such as flash memory, magnetic media (e.g., hard disk) or optical storage; registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or may be located in a second, different computer system connected to the first computer system through a network such as the internet. The second computer system may provide program instructions to the first computer system for execution. The term "storage medium" may include two or more storage media residing in different locations (e.g., in different computer systems connected by a network). The storage medium may store program instructions (e.g., embodied as a computer program) executable by one or more processors.
Of course, the computer-executable instructions contained in the storage medium provided in the embodiments of the present application are not limited to the multi-node monitoring method described above, and may also perform the relevant operations in the multi-node monitoring method provided in any embodiment of the present application.
The multi-node monitoring device, the storage medium and the electronic device provided in the foregoing embodiments can perform the multi-node monitoring method provided in any embodiment of the present application. For technical details not described in the foregoing embodiments, refer to the multi-node monitoring method provided in any embodiment of the present application.
The foregoing description is only of the preferred embodiments of the present application and the technical principles employed. The present application is not limited to the specific embodiments described herein, but is capable of numerous obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the present application. Therefore, while the present application has been described in connection with the above embodiments, the present application is not limited to the above embodiments, but may include many other equivalent embodiments without departing from the spirit of the present application, and the scope of the present application is determined by the scope of the claims.

Claims (10)

1. A multi-node monitoring method, comprising:
each host machine starts to collect monitoring data of corresponding public resources according to a pre-configured public resource monitoring item and monitoring weight, wherein the monitoring weight is used for adjusting the collection frequency and the collection time point of the monitoring data;
extracting network delay data of each host and each monitoring data storage node, and calculating the reliability value of each monitoring data acquired by each host by using a predefined reliability calculation formula according to the network delay data and the monitoring weight;
and for the same public resource, screening the monitoring data with the highest reliability value and storing the monitoring data in the monitoring data storage node.
2. The multi-node monitoring method of claim 1, wherein each host initiates collection of monitoring data of a corresponding common resource according to a pre-configured common resource monitoring item and a monitoring weight, respectively, comprising:
and each host machine collects initial data corresponding to the public resources according to the collection frequency and the collection time point, calculates an average value based on the collection frequency and the data quantity of the initial data, and takes the average value as the monitoring data.
3. The multi-node monitoring method according to claim 1, wherein screening the monitoring data having the highest reliability value and storing the data in the monitoring data storage node comprises:
and storing the data value, the data acquisition time, the corresponding public resource monitoring item name and the data acquisition object of the monitoring data with the highest reliability value in the monitoring data storage node.
4. A multi-node monitoring method according to claim 3, wherein the monitoring data storage node is a time-sequential database.
5. The multi-node monitoring method according to claim 1, further comprising, after screening the monitoring data having the highest reliability value and storing the data in the monitoring data storage node:
and taking the host machine corresponding to the monitoring data with the highest reliability value as a monitoring node, keeping the monitoring node to collect the monitoring data corresponding to the public resource in a set period of time, and stopping the rest host machines from collecting the monitoring data corresponding to the public resource.
6. The multi-node monitoring method according to claim 5, wherein after the host machine corresponding to the monitoring data with the highest reliability value is used as a monitoring node, keeping the monitoring node to collect the monitoring data corresponding to the public resource for a set period of time, and stopping the rest of the host machines from collecting the monitoring data corresponding to the public resource, further comprising:
and after a set time period, recalculating the reliability values of the monitoring data of the corresponding public resources collected by each host machine, and redetermining the monitoring nodes according to the recalculated reliability values.
7. The multi-node monitoring method according to claim 1, further comprising, before each host initiates collection of monitoring data of a corresponding common resource according to a pre-configured common resource monitoring item and a monitoring weight, respectively:
and each host machine sets the monitoring weight corresponding to the public resource according to user definition or random setting.
8. A multi-node monitoring device, comprising:
an acquisition module, configured to start collecting, through each host machine, monitoring data of the corresponding public resource according to a pre-configured public resource monitoring item and a monitoring weight, wherein the monitoring weight is used for adjusting the collection frequency and the collection time point of the monitoring data;
a calculation module, configured to extract network delay data of each host machine and each monitoring data storage node, and to calculate the reliability value of each monitoring data collected by each host machine by using a predefined reliability calculation formula according to the network delay data and the monitoring weight;
and a screening module, configured to screen, for the same public resource, the monitoring data with the highest reliability value and store the monitoring data in the monitoring data storage node.
9. An electronic device, comprising:
a memory and one or more processors;
the memory is used for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the multi-node monitoring method of any one of claims 1-7.
10. A storage medium containing computer executable instructions which, when executed by a computer processor, are for performing the multi-node monitoring method of any of claims 1-7.
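The monitoring-node behaviour recited in claims 5 and 6 can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the claimed implementation: the class name `MonitoringNodeSelector`, its methods, and the use of seconds for the set period are all hypothetical, and the reliability values are taken as given inputs.

```python
from typing import Optional


class MonitoringNodeSelector:
    """Tracks which hosts should collect data for one public resource."""

    def __init__(self, hosts: list[str], hold_period_s: float) -> None:
        self.hosts = list(hosts)
        self.hold_period_s = hold_period_s      # the "set period of time"
        self.node: Optional[str] = None         # current monitoring node
        self.elected_at: Optional[float] = None

    def active_hosts(self, now_s: float) -> list[str]:
        """Hosts that should collect at time now_s."""
        if self.node is None or now_s - self.elected_at >= self.hold_period_s:
            # No node yet, or the period has elapsed: every host collects
            # again so that reliability values can be recalculated.
            return list(self.hosts)
        # Within the period only the monitoring node keeps collecting.
        return [self.node]

    def elect(self, now_s: float, reliability_by_host: dict[str, float]) -> str:
        """Make the host with the highest reliability value the monitoring node."""
        self.node = max(reliability_by_host, key=reliability_by_host.__getitem__)
        self.elected_at = now_s
        return self.node
```

In this sketch a caller would invoke `active_hosts` before each collection round and call `elect` whenever all hosts have reported, which re-determines the monitoring node after each period.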
CN202010706635.9A 2020-07-21 2020-07-21 Multi-node monitoring method and device, electronic equipment and storage medium Active CN111901405B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010706635.9A CN111901405B (en) 2020-07-21 2020-07-21 Multi-node monitoring method and device, electronic equipment and storage medium
PCT/CN2021/073799 WO2022016845A1 (en) 2020-07-21 2021-01-26 Multi-node monitoring method and apparatus, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010706635.9A CN111901405B (en) 2020-07-21 2020-07-21 Multi-node monitoring method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111901405A CN111901405A (en) 2020-11-06
CN111901405B true CN111901405B (en) 2023-05-05

Family

ID=73190386

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010706635.9A Active CN111901405B (en) 2020-07-21 2020-07-21 Multi-node monitoring method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN111901405B (en)
WO (1) WO2022016845A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111901405B (en) * 2020-07-21 2023-05-05 国云科技股份有限公司 Multi-node monitoring method and device, electronic equipment and storage medium
CN115473834B (en) * 2022-09-14 2024-04-02 中国电信股份有限公司 Monitoring task scheduling method and system
CN115685817B (en) * 2022-10-17 2024-07-05 南京邮电大学 Method, device and medium for processing data concurrency during CAN (controller area network) multi-node communication

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102761454A (en) * 2011-04-28 2012-10-31 中兴通讯股份有限公司 Method and system for monitoring internet of things
CN109062699A (en) * 2018-08-15 2018-12-21 郑州云海信息技术有限公司 A kind of resource monitoring method, device, server and storage medium
CN109688106A (en) * 2018-11-19 2019-04-26 中国科学院信息工程研究所 A kind of data collaborative acquisition method and system
CN109714402A (en) * 2018-12-12 2019-05-03 胡书恺 A kind of redundant data acquisition system and its operation application method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104917836A (en) * 2015-06-10 2015-09-16 北京奇虎科技有限公司 Method and device for monitoring and analyzing availability of computing equipment based on cluster
TWI595760B (en) * 2015-12-01 2017-08-11 廣達電腦股份有限公司 Management systems for managing resources of servers and management methods thereof
CN107844402A (en) * 2017-11-17 2018-03-27 北京联想超融合科技有限公司 A kind of resource monitoring method, device and terminal based on super fusion storage system
CN111258870A (en) * 2020-01-17 2020-06-09 中国建设银行股份有限公司 Performance analysis method, device, equipment and storage medium of distributed storage system
CN111901405B (en) * 2020-07-21 2023-05-05 国云科技股份有限公司 Multi-node monitoring method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2022016845A1 (en) 2022-01-27
CN111901405A (en) 2020-11-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant