CN113609139A - Monitoring data management method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113609139A
Authority
CN
China
Prior art keywords
influxdb
cluster
monitoring data
abnormal
clusters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111164311.8A
Other languages
Chinese (zh)
Inventor
孙辽东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202111164311.8A priority Critical patent/CN113609139A/en
Publication of CN113609139A publication Critical patent/CN113609139A/en
Priority to PCT/CN2022/078205 priority patent/WO2023050705A1/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2228: Indexing structures
    • G06F16/24: Querying
    • G06F16/242: Query formulation
    • G06F16/25: Integrating or interfacing systems involving database management systems
    • G06F16/256: Integrating or interfacing systems involving database management systems in federated or virtual databases
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a monitoring data management method, a monitoring data management device, an electronic device and a computer readable storage medium, wherein the method comprises the following steps: creating a plurality of InfluxDB clusters; calculating the index relationship between the servers and the InfluxDB clusters by using a Hash algorithm according to the number of the servers and the number of the InfluxDB clusters, and storing the index relationship into a relational database; and collecting monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster. Therefore, the monitoring data management method provided by the application realizes the rapid storage of the monitoring data of the huge number of servers.

Description

Monitoring data management method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of storage technologies, and in particular, to a method and an apparatus for managing monitoring data, an electronic device, and a computer-readable storage medium.
Background
The artificial intelligence platform stores monitoring data in an InfluxDB database. Because of InfluxDB's own limitations, a single node cannot support fast writing and fast querying of the monitoring data of a huge number of servers in a supercomputing scenario, which seriously degrades the user experience.
Therefore, how to implement fast reading and writing of monitoring data of a huge number of servers is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The present application aims to provide a monitoring data management method, a monitoring data management device, an electronic device, and a computer-readable storage medium, which implement fast reading and writing of monitoring data of a large number of servers.
In order to achieve the above object, the present application provides a monitoring data management method, including:
creating a plurality of InfluxDB clusters;
calculating the index relationship between the servers and the InfluxDB clusters by using a Hash algorithm according to the number of the servers and the number of the InfluxDB clusters, and storing the index relationship into a relational database;
and collecting monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster.
The method further includes:
acquiring a data query command;
analyzing the data query command into a plurality of data query subcommands; each data query subcommand corresponds to a single InfluxDB cluster;
distributing each data query subcommand to the corresponding InfluxDB cluster for execution to obtain response sub-information corresponding to each data query subcommand;
and summarizing and analyzing all the response sub-information by utilizing an analysis function to obtain response information corresponding to the data query command.
After acquiring the data query command, the method further includes:
judging whether the data query command contains an accurate query condition; the accurate query condition comprises a single server or a single InfluxDB cluster needing to be queried;
if yes, directly responding to the data query command;
if not, executing the step of analyzing the data query command into a plurality of data query subcommands.
The method further includes:
monitoring the monitoring data written into the InfluxDB cluster by utilizing an alarm engine in each InfluxDB cluster according to threshold information so as to generate alarm information;
and summarizing alarm information generated by the InfluxDB cluster.
The method further includes:
monitoring all the InfluxDB clusters for abnormality;
and if an abnormal InfluxDB cluster is detected, recovering the abnormal InfluxDB cluster according to the abnormal type.
Wherein the recovering the abnormal InfluxDB cluster according to the abnormal type includes:
if the abnormal type is node abnormality, selecting a normal node in the abnormal InfluxDB cluster to take over the abnormal node.
Wherein the recovering the abnormal InfluxDB cluster according to the abnormal type includes:
if the abnormal type is cluster abnormality, creating a new InfluxDB cluster, obtaining the index relation of the abnormal InfluxDB cluster from the relational database, and obtaining cached monitoring data from the servers corresponding to the abnormal InfluxDB cluster, so that the new InfluxDB cluster replaces the abnormal InfluxDB cluster.
In order to achieve the above object, the present application provides a monitoring data management apparatus, including:
the creating module is used for creating a plurality of InfluxDB clusters;
the computing module is used for computing the index relationship between the servers and the InfluxDB clusters by utilizing a Hash algorithm according to the number of the servers and the number of the InfluxDB clusters and storing the index relationship into a relational database;
and the write-in module is used for acquiring monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster.
To achieve the above object, the present application provides an electronic device including:
a memory for storing a computer program;
and a processor for implementing the steps of the monitoring data management method when executing the computer program.
To achieve the above object, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the monitoring data management method as described above.
According to the scheme, the monitoring data management method provided by the application comprises the following steps: creating a plurality of InfluxDB clusters; calculating the index relationship between the servers and the InfluxDB clusters by using a Hash algorithm according to the number of the servers and the number of the InfluxDB clusters, and storing the index relationship into a relational database; and collecting monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster.
According to the monitoring data management method, the monitoring data of a large number of servers are stored through the plurality of InfluxDB clusters, and the monitoring data of each server are written into the corresponding InfluxDB cluster in parallel, so that the monitoring data are written rapidly. Therefore, the monitoring data management method provided by the application realizes the rapid storage of the monitoring data of the huge number of servers. The application also discloses a monitoring data management device, an electronic device and a computer readable storage medium, which can also realize the technical effects.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort. The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a flow diagram illustrating a method of monitoring data management according to an exemplary embodiment;
FIG. 2 is a diagram illustrating an indexing relationship of a server to an InfluxDB cluster in accordance with an illustrative embodiment;
FIG. 3 is a flow diagram illustrating another monitoring data management method in accordance with an exemplary embodiment;
FIG. 4 is a block diagram illustrating a monitoring data management device in accordance with an exemplary embodiment;
FIG. 5 is a block diagram illustrating an electronic device in accordance with an exemplary embodiment.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application. In addition, in the embodiments of the present application, "first", "second", and the like are used for distinguishing similar objects, and are not necessarily used for describing a specific order or a sequential order.
The embodiment of the application discloses a monitoring data management method, which realizes the quick reading and writing of monitoring data of a large number of servers.
Referring to FIG. 1, a flowchart of a monitoring data management method according to an exemplary embodiment is shown. As shown in FIG. 1, the method includes:
S101: creating a plurality of InfluxDB clusters;
S102: calculating the index relationship between the servers and the InfluxDB clusters by using a Hash algorithm according to the number of the servers and the number of the InfluxDB clusters, and storing the index relationship into a relational database;
In this embodiment, a plurality of InfluxDB clusters are created to store the monitoring data of a large number of servers; that is, dynamic expansion of the InfluxDB clusters is realized based on the idea of database sharding, and Ansible (an automated operation and maintenance tool) can be used to complete one-click deployment of the entire artificial intelligence platform and dynamic expansion of the shards. In a specific implementation, the number of InfluxDB clusters to be created is calculated from the total number of servers and the number of servers served by a single InfluxDB cluster. For example, if there are 900 servers and one InfluxDB cluster stores the monitoring data of 200 servers, that is, the number of servers corresponding to a single InfluxDB cluster is 200, then 5 InfluxDB clusters need to be created (900 / 200, rounded up), with indexes 0, 1, 2, 3, and 4. The index relationship between the servers and the InfluxDB clusters is calculated with a Hash algorithm according to the number n of servers and the number of InfluxDB clusters. The calculation is shown in FIG. 2: (Hash(cluster index number) + Hash(node name)). FIG. 2 contains 2^32 positions, where a hollow circle represents a node position. When a new InfluxDB cluster is added, a position without an associated node must be found, and the index is then calculated in reverse from that position.
It can be understood that each server stores the index relationship between itself and its corresponding InfluxDB cluster, and each InfluxDB cluster also stores the index relationship between itself and its corresponding servers. All the index relationships may be stored in a relational database, such as MariaDB, for backup. When the relational database has a plurality of nodes, the in-memory data of the nodes is synchronized directly through RPC (Remote Procedure Call) in an incremental updating mode. Further, three scripted deployment modes are built in for each InfluxDB cluster: direct deployment on virtual or physical machines, containerized deployment, and k8s (Kubernetes) deployment.
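The cluster-count arithmetic and the ring lookup described for S102 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the use of SHA-256 as the hash, the names `cluster_count`, `ring_hash`, and `ClusterRing`, the node names, and the clockwise lookup rule are all hypothetical. Only the 900/200 example, the 2^32 ring size, and the form Hash(cluster index number) + Hash(node name) come from the description.

```python
import bisect
import hashlib
import math

RING_SIZE = 2 ** 32  # number of ring positions, as stated for FIG. 2


def cluster_count(num_servers: int, servers_per_cluster: int) -> int:
    """Clusters needed so that every server has a home (rounds up)."""
    return math.ceil(num_servers / servers_per_cluster)


def ring_hash(key: str) -> int:
    """Hash a string onto the ring. SHA-256 is an assumed choice of hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")  # 4 bytes, so already < 2**32


class ClusterRing:
    """Hypothetical ring with one point per (cluster index, node name) pair."""

    def __init__(self, clusters: dict):
        # clusters maps cluster index -> list of node names (assumed layout)
        points = []
        for index, node_names in clusters.items():
            for node in node_names:
                pos = (ring_hash(str(index)) + ring_hash(node)) % RING_SIZE
                points.append((pos, index))
        points.sort()
        self._positions = [p for p, _ in points]
        self._owners = [i for _, i in points]

    def cluster_for(self, server_name: str) -> int:
        """Walk clockwise from the server's position to the next node point."""
        pos = ring_hash(server_name)
        i = bisect.bisect_left(self._positions, pos) % len(self._positions)
        return self._owners[i]


n = cluster_count(900, 200)
ring = ClusterRing({i: [f"node-{i}-a", f"node-{i}-b"] for i in range(n)})
index_relation = {f"server-{s:03d}": ring.cluster_for(f"server-{s:03d}")
                  for s in range(900)}
```

The resulting `index_relation` dictionary plays the role of the index relationship that the description stores in the relational database. A ring lookup of this kind keeps most servers on their original cluster when a cluster is added, which fits the dynamic-expansion goal, although the description does not spell out the lookup rule.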
S103: and collecting monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster.
In this embodiment, each server includes the collection component Telegraf, which collects the server's own monitoring data and outputs it to the InfluxDB cluster. In a specific implementation, the data output mode of Telegraf is modified so that the output of monitoring data is completed dynamically: a new configuration item is added to the original Telegraf configuration file to define a new data output mode. In this output mode, the target InfluxDB cluster corresponding to a target server is determined according to the index relationship, and the monitoring data of the target server is written into that target InfluxDB cluster.
As a preferred embodiment, a custom adapter may also be provided to convert the collected monitoring data into a data format that the InfluxDB cluster can store. It should be noted that data warehousing supports both periodic writing and writing on cache-space overflow, whichever condition is met first. If the maximum number of attempts is exceeded or the corresponding InfluxDB cluster is temporarily unwritable, the data is stored in a cache pool; if the data still cannot be written after a preset time, it is discarded.
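The retry, cache-pool, and discard behaviour described above might look like the following sketch. The class name `BufferedWriter`, its parameters, and the use of `ConnectionError` to signal an unwritable cluster are assumptions made for illustration; the actual Telegraf output plugin is not shown in the description.

```python
import time
from collections import deque


class BufferedWriter:
    """Hypothetical writer: retry, then cache, then discard after a deadline."""

    def __init__(self, write_fn, max_attempts=3, max_age_seconds=60.0,
                 clock=time.monotonic):
        self._write_fn = write_fn          # callable that writes one point
        self._max_attempts = max_attempts  # attempts before caching
        self._max_age = max_age_seconds    # preset time before discarding
        self._clock = clock
        self.cache_pool = deque()          # entries are (enqueue_time, point)
        self.discarded = 0

    def _try_write(self, point) -> bool:
        for _ in range(self._max_attempts):
            try:
                self._write_fn(point)
                return True
            except ConnectionError:
                continue
        return False

    def write(self, point) -> bool:
        """Write a point, or park it in the cache pool if the cluster is down."""
        if self._try_write(point):
            return True
        self.cache_pool.append((self._clock(), point))
        return False

    def flush(self) -> None:
        """Retry cached points and drop those older than the preset time."""
        now = self._clock()
        remaining = deque()
        while self.cache_pool:
            enqueued, point = self.cache_pool.popleft()
            if self._try_write(point):
                continue
            if now - enqueued > self._max_age:
                self.discarded += 1
            else:
                remaining.append((enqueued, point))
        self.cache_pool = remaining
```

A periodic timer or a cache-size limit would call `flush()`; which trigger fires first is left to the caller in this sketch.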
According to the monitoring data management method provided by the embodiment of the application, monitoring data of a large number of servers are stored through a plurality of InfluxDB clusters, and the monitoring data of each server are written into the corresponding InfluxDB clusters in parallel, so that the monitoring data are written quickly. Therefore, the monitoring data management method provided by the application realizes the rapid storage of the monitoring data of the huge number of servers.
This embodiment introduces a data query method, specifically:
Referring to FIG. 3, a flowchart of another monitoring data management method according to an exemplary embodiment is shown. As shown in FIG. 3, the method includes:
S201: acquiring a data query command;
In this embodiment, the data query command may specifically be SQL (Structured Query Language). A global interceptor (following the Aspect idea in Java) is added at the InfluxDB layer to intercept all data query commands, that is, all DAO methods, and to determine whether the data query command contains an accurate query condition; the accurate query condition comprises a single server or a single InfluxDB cluster needing to be queried; if yes, the data query command is responded to directly; if not, the process proceeds to step S202.
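The routing decision made by the interceptor can be illustrated with a small sketch. The `Query` structure and the function names are hypothetical, and the description places this logic in a Java aspect rather than in Python; only the rule itself (single server or single cluster means a direct response, otherwise a split across clusters) comes from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Query:
    """Hypothetical parsed form of a data query command."""
    measurement: str
    server: Optional[str] = None   # set when a single server is queried
    cluster: Optional[int] = None  # set when a single cluster is queried


def has_accurate_condition(query: Query) -> bool:
    """True when the query names one server or one InfluxDB cluster."""
    return query.server is not None or query.cluster is not None


def target_clusters(query: Query, index_relation: dict, all_clusters) -> list:
    """Return the cluster indexes that must execute the query."""
    if query.cluster is not None:
        return [query.cluster]
    if query.server is not None:
        return [index_relation[query.server]]
    return sorted(all_clusters)  # no accurate condition: go to S202
```

For example, with `index_relation = {"server-007": 3}`, a query for `server-007` is routed to cluster 3 only, while a query with no server or cluster is routed to every cluster.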
S202: analyzing the data query command into a plurality of data query subcommands; each data query subcommand corresponds to a single InfluxDB cluster;
S203: distributing each data query subcommand to the corresponding InfluxDB cluster for execution to obtain response sub-information corresponding to each data query subcommand;
In a specific implementation, the data query command is analyzed into a plurality of data query subcommands according to the InfluxDB clusters to be queried. Each data query subcommand queries one InfluxDB cluster and is distributed to that cluster; the subcommands are executed in parallel, with blocking callbacks supported, so that the response sub-information from each InfluxDB cluster is obtained.
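A parallel fan-out with a blocking wait for all results could be sketched as below. The function `fan_out` and the `run_on_cluster` callable are placeholders for whatever client actually talks to each InfluxDB cluster; the thread pool is one assumed way to obtain the parallel execution the description mentions.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(query, cluster_indexes, run_on_cluster) -> dict:
    """Run one subcommand per cluster in parallel and block until all return.

    run_on_cluster(index, query) is a hypothetical callable that executes
    the subcommand on one cluster and returns its response sub-information.
    """
    cluster_indexes = list(cluster_indexes)
    workers = max(1, len(cluster_indexes))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {idx: pool.submit(run_on_cluster, idx, query)
                   for idx in cluster_indexes}
        # result() blocks, so the caller receives every sub-response at once
        return {idx: future.result() for idx, future in futures.items()}
```

Any exception raised on one cluster propagates out of `result()` in this sketch; a production version would need a policy for partial failures, which the description does not specify.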
S204: and summarizing and analyzing all the response sub-information by utilizing an analysis function to obtain response information corresponding to the data query command.
In a specific implementation, the SQL is parsed dynamically to determine whether data summarization with an analysis function is required. Annotation marks are added to the attributes of the ORM (Object Relational Mapping) objects that need to participate in the calculation, and analysis functions are customized based on the functions of InfluxDB, including mean value, maximum value, minimum value, variance, latest value, and the like. Taking the mean value function as an example, each InfluxDB cluster obtains a calculation result, i.e., response sub-information, using its own mean value function; the calculation results generated by all the InfluxDB clusters are then summed and divided by the number of InfluxDB clusters to obtain the final calculation result, i.e., the response information corresponding to the data query command.
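The summarization step for the mean, maximum, and minimum functions can be sketched as follows. The function name `merge_partials` and the string keys are assumptions; the mean branch follows the description literally (sum of per-cluster means divided by the number of clusters).

```python
def merge_partials(function: str, partials: list) -> float:
    """Combine per-cluster results (response sub-information) into one value."""
    if not partials:
        raise ValueError("no response sub-information to merge")
    if function == "mean":
        # As described: sum the per-cluster means, divide by cluster count.
        return sum(partials) / len(partials)
    if function == "max":
        return max(partials)
    if function == "min":
        return min(partials)
    raise ValueError(f"unsupported analysis function: {function}")
```

Note that the mean of per-cluster means equals the overall mean only when every cluster contributes the same number of points; if the counts differ, each cluster would also have to return its point count so that a weighted mean can be computed.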
Therefore, the embodiment realizes the parallel data query of a plurality of InfluxDB clusters, and the query results of all the InfluxDB clusters are summarized and analyzed by using the analysis function, so that the data query efficiency is improved.
On the basis of the above embodiment, as a preferred implementation, the method further includes: monitoring the monitoring data written into the InfluxDB cluster by utilizing an alarm engine in each InfluxDB cluster according to threshold information so as to generate alarm information; and summarizing alarm information generated by the InfluxDB cluster.
In specific implementation, an alarm engine is deployed in each InfluxDB cluster, and when a new InfluxDB cluster is added, the deployment of an alarm engine module is dynamically completed. The alarm engine is used for monitoring the written monitoring data according to threshold information to generate alarm information, the threshold information can be threshold range, enable/disable, alarm frequency and the like, and is issued to each alarm engine by the service module, and in addition, the service module can also update the threshold information in each alarm engine. Furthermore, the alarm information summarizing component is deployed for unified processing of the alarm information, and the unified processing may include data deduplication, alarm mail generation, alarm information storage and the like.
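A per-cluster alarm engine with updatable thresholds and a summarizing step might be sketched like this. The classes `Threshold` and `AlarmEngine`, the alarm dictionary layout, and the deduplication key are hypothetical; alarm frequency, mail generation, and storage are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    """Hypothetical threshold information issued by the service module."""
    low: float
    high: float
    enabled: bool = True


class AlarmEngine:
    """One engine per InfluxDB cluster, checking written monitoring data."""

    def __init__(self, cluster_index: int):
        self.cluster_index = cluster_index
        self.thresholds = {}  # metric name -> Threshold

    def update_threshold(self, metric: str, threshold: Threshold) -> None:
        self.thresholds[metric] = threshold

    def check(self, server: str, metric: str, value: float) -> Optional[dict]:
        """Return an alarm record when the value leaves the enabled range."""
        t = self.thresholds.get(metric)
        if t is None or not t.enabled or t.low <= value <= t.high:
            return None
        return {"cluster": self.cluster_index, "server": server,
                "metric": metric, "value": value}


def summarize(alarms) -> list:
    """Drop empty results and duplicate (server, metric) alarms."""
    seen, unique = set(), []
    for alarm in alarms:
        if alarm is None:
            continue
        key = (alarm["server"], alarm["metric"])
        if key not in seen:
            seen.add(key)
            unique.append(alarm)
    return unique
```

Calling `update_threshold` again for the same metric models the service module pushing new threshold information to an engine.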
On the basis of the above embodiment, as a preferred implementation, the method further includes: monitoring all the InfluxDB clusters for abnormity; and if the abnormal InfluxDB cluster is monitored, recovering the abnormal InfluxDB cluster according to the abnormal type.
In a specific implementation, to ensure that the plurality of InfluxDB clusters can store monitoring data normally, all the InfluxDB clusters are checked at regular intervals, and if a cluster's state is abnormal, a fast-recovery event is triggered. Each server caches its own monitoring data for the most recent period of time for use in abnormality recovery.
As a feasible implementation manner, if the abnormal type is node abnormality, that is, an abnormal node is detected in the abnormal InfluxDB cluster, a normal node is selected from the abnormal InfluxDB cluster to take over the abnormal node.
As another feasible implementation manner, if the abnormal type is cluster abnormality, a new InfluxDB cluster is created, the index relationship of the abnormal InfluxDB cluster is obtained from the relational database, and the cached monitoring data is obtained from the servers corresponding to the abnormal InfluxDB cluster, so that the new InfluxDB cluster replaces the abnormal InfluxDB cluster.
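The two recovery paths can be sketched as a dispatch on the abnormal type. All names and data layouts here are hypothetical: node health is a simple name-to-boolean map, the index relationship is a server-to-cluster dictionary, and the new cluster is assumed to inherit the index of the cluster it replaces, which the description does not state explicitly.

```python
def take_over_nodes(node_health: dict) -> dict:
    """Node abnormality: map each abnormal node to a normal node that takes over."""
    healthy = sorted(name for name, ok in node_health.items() if ok)
    if not healthy:
        raise RuntimeError("no normal node left; treat as a cluster abnormality")
    return {name: healthy[0]
            for name, ok in sorted(node_health.items()) if not ok}


def rebuild_cluster(failed_index: int, index_relation: dict,
                    server_caches: dict) -> dict:
    """Cluster abnormality: describe a replacement cluster and its replay data."""
    # Servers that were mapped to the failed cluster (from the relational DB).
    servers = sorted(s for s, idx in index_relation.items()
                     if idx == failed_index)
    # Recent monitoring data cached on each of those servers.
    points = [p for s in servers for p in server_caches.get(s, [])]
    return {"index": failed_index, "servers": servers, "points": points}


def recover(abnormal_type: str, **context):
    """Dispatch on the abnormal type, as in the two implementations above."""
    if abnormal_type == "node":
        return take_over_nodes(context["node_health"])
    if abnormal_type == "cluster":
        return rebuild_cluster(context["failed_index"],
                               context["index_relation"],
                               context["server_caches"])
    raise ValueError(f"unknown abnormal type: {abnormal_type}")
```

In this sketch the replacement cluster only recovers data that is still in the server-side caches, which matches the description's reliance on each server caching its most recent monitoring data.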
In the following, a monitoring data management apparatus provided in an embodiment of the present application is introduced, and a monitoring data management apparatus described below and a monitoring data management method described above may be referred to each other.
Referring to FIG. 4, a block diagram of a monitoring data management apparatus according to an exemplary embodiment is shown. As shown in FIG. 4, the apparatus includes:
a creating module 401, configured to create a plurality of InfluxDB clusters;
a calculating module 402, configured to calculate an index relationship between the servers and the InfluxDB clusters by using a hash algorithm according to the number of servers and the number of InfluxDB clusters, and store the index relationship in a relational database;
a writing module 403, configured to collect monitoring data of a target server, determine a target InfluxDB cluster corresponding to the target server according to the index relationship, and write the monitoring data into the target InfluxDB cluster.
The monitoring data management device provided by the embodiment of the application stores monitoring data of a large number of servers through a plurality of InfluxDB clusters, and the monitoring data of each server is written into the corresponding InfluxDB cluster in parallel, so that the monitoring data is written quickly. Therefore, the monitoring data management device provided by the application realizes the rapid storage of the monitoring data of a huge number of servers.
On the basis of the above embodiment, as a preferred implementation, the method further includes:
the acquisition module is used for acquiring a data query command;
the analysis module is used for analyzing the data query command into a plurality of data query subcommands; each data query subcommand corresponds to a single InfluxDB cluster;
the execution module is used for distributing each data query subcommand to the corresponding InfluxDB cluster for execution to obtain the response sub-information corresponding to each data query subcommand;
and the first summarizing module is used for summarizing and analyzing all the response sub-information by utilizing an analysis function to obtain the response information corresponding to the data query command.
On the basis of the above embodiment, as a preferred implementation, the method further includes:
the judging module is used for judging whether the data query command contains an accurate query condition; the accurate query condition comprises a single server or a single InfluxDB cluster needing to be queried; if yes, starting the working process of the response module; if not, starting the working process of the analysis module;
and the response module is used for responding to the data query command.
On the basis of the above embodiment, as a preferred implementation, the method further includes:
the monitoring module is used for monitoring the monitoring data written into the InfluxDB cluster by utilizing an alarm engine in each InfluxDB cluster according to threshold information so as to generate alarm information;
and the second summarizing module is used for summarizing the alarm information generated by the InfluxDB cluster.
On the basis of the above embodiment, as a preferred implementation, the method further includes:
the monitoring module is used for monitoring all the InfluxDB clusters for abnormity;
and the recovery module is used for recovering the abnormal InfluxDB cluster according to the abnormal type when the abnormal InfluxDB cluster is monitored.
On the basis of the above embodiment, as a preferred implementation manner, if the abnormal type is node abnormality, the recovery module is specifically a module that selects a normal node in the abnormal InfluxDB cluster to take over the abnormal node.
On the basis of the above embodiment, as a preferred implementation manner, the server caches the monitoring data of the most recent preset duration. If the abnormal type is cluster abnormality, the recovery module is specifically a module that creates a new InfluxDB cluster, obtains the index relationship of the abnormal InfluxDB cluster from the relational database, and obtains the cached monitoring data from the servers corresponding to the abnormal InfluxDB cluster, so that the new InfluxDB cluster replaces the abnormal InfluxDB cluster.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Based on the hardware implementation of the program modules, and in order to implement the method of the embodiments of the present application, an embodiment of the present application further provides an electronic device. FIG. 5 is a structural diagram of an electronic device according to an exemplary embodiment. As shown in FIG. 5, the electronic device includes:
a communication interface 1 capable of exchanging information with other devices, such as network devices;
and a processor 2, connected with the communication interface 1 to exchange information with other devices, and configured to execute the monitoring data management method provided by one or more of the above technical solutions when running a computer program. The computer program is stored in the memory 3.
In practice, of course, the various components in the electronic device are coupled together by the bus system 4. It will be appreciated that the bus system 4 is used to enable connection communication between these components. The bus system 4 comprises, in addition to a data bus, a power bus, a control bus and a status signal bus. For the sake of clarity, however, the various buses are labeled as bus system 4 in fig. 5.
The memory 3 in the embodiment of the present application is used to store various types of data to support the operation of the electronic device. Examples of such data include: any computer program for operating on an electronic device.
It will be appreciated that the memory 3 may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 3 described in the embodiments of the present application is intended to comprise, without being limited to, these and any other suitable types of memory.
The method disclosed in the above embodiment of the present application may be applied to the processor 2, or implemented by the processor 2. The processor 2 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 2. The processor 2 described above may be a general purpose processor, a DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor 2 may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 3, and the processor 2 reads the program in the memory 3 and in combination with its hardware performs the steps of the aforementioned method.
When the processor 2 executes the program, the corresponding processes in the methods according to the embodiments of the present application are realized, and for brevity, are not described herein again.
In an exemplary embodiment, the present application further provides a storage medium, i.e. a computer storage medium, specifically a computer readable storage medium, for example, including a memory 3 storing a computer program, which can be executed by a processor 2 to implement the steps of the foregoing method. The computer readable storage medium may be Memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface Memory, optical disk, or CD-ROM.
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be implemented by hardware executing program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic disk, an optical disc, or any other medium that can store program code.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for enabling an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic disk, an optical disc, or any other medium that can store program code.
The above describes only specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within its scope of protection. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims (10)

1. A method for managing monitored data, comprising:
creating a plurality of InfluxDB clusters;
calculating the index relationship between the servers and the InfluxDB clusters by using a Hash algorithm according to the number of the servers and the number of the InfluxDB clusters, and storing the index relationship into a relational database;
and collecting monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster.
2. The monitoring data management method according to claim 1, further comprising:
acquiring a data query command;
analyzing the data query command into a plurality of data query subcommands; each data query subcommand corresponds to a single InfluxDB cluster;
distributing each data query subcommand to the corresponding InfluxDB cluster for execution to obtain response sub-information corresponding to each data query subcommand;
and summarizing and analyzing all the response sub-information by utilizing an analysis function to obtain response information corresponding to the data query command.
3. The monitoring data management method according to claim 2, wherein after the obtaining the data query command, the method further comprises:
judging whether the data query command contains an accurate query condition; the accurate query condition comprises a single server or a single InfluxDB cluster needing to be queried;
if yes, directly responding to the data query command;
if not, the step of analyzing the data query command into a plurality of data query subcommands is executed.
4. The monitoring data management method according to claim 1, further comprising:
monitoring the monitoring data written into the InfluxDB cluster by utilizing an alarm engine in each InfluxDB cluster according to threshold information so as to generate alarm information;
and summarizing alarm information generated by the InfluxDB cluster.
5. The monitoring data management method according to any one of claims 1 to 4, characterized by further comprising:
monitoring all the InfluxDB clusters for abnormity;
and if the abnormal InfluxDB cluster is monitored, recovering the abnormal InfluxDB cluster according to the abnormal type.
6. The monitoring data management method according to claim 5, wherein the recovering the abnormal InfluxDB cluster according to the abnormal type includes:
and if the abnormal type is node abnormality, selecting a normal node in the abnormal InfluxDB cluster to take over the abnormal node.
7. The monitoring data management method according to claim 5, wherein each server caches its own monitoring data for a most recent preset duration, and the recovering the abnormal InfluxDB cluster according to the abnormal type includes:
if the abnormal type is cluster abnormality, a new InfluxDB cluster is created, the index relation of the abnormal InfluxDB cluster is obtained from the relational database, and cached monitoring data is obtained from a server corresponding to the abnormal InfluxDB cluster, so that the new InfluxDB cluster replaces the abnormal InfluxDB cluster.
8. A monitoring data management apparatus, comprising:
the creating module is used for creating a plurality of InfluxDB clusters;
the computing module is used for computing the index relationship between the servers and the InfluxDB clusters by utilizing a Hash algorithm according to the number of the servers and the number of the InfluxDB clusters and storing the index relationship into a relational database;
and the write-in module is used for acquiring monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the monitoring data management method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the monitoring data management method according to any one of claims 1 to 7.
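
As a non-limiting illustration of claims 1 and 2, the following Python sketch shows one way the index relationship could be computed, stored in a relational database, and then used for routed writes and fan-out queries. It is not taken from the application: the SHA-256-modulo mapping, the `server_index` table, the in-memory lists standing in for InfluxDB clusters, and all function names are assumptions made for illustration only.

```python
import hashlib
import sqlite3


def cluster_for(server_id, cluster_count):
    """Assumed hash rule: hash the server ID, then reduce modulo the cluster count."""
    digest = hashlib.sha256(server_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % cluster_count


def store_index(conn, server_ids, cluster_count):
    """Compute the server-to-cluster index and persist it in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS server_index ("
        "server_id TEXT PRIMARY KEY, cluster_id INTEGER NOT NULL)"
    )
    rows = [(sid, cluster_for(sid, cluster_count)) for sid in server_ids]
    conn.executemany("INSERT OR REPLACE INTO server_index VALUES (?, ?)", rows)
    conn.commit()


def lookup_cluster(conn, server_id):
    """Read the target cluster for one server back from the stored index."""
    row = conn.execute(
        "SELECT cluster_id FROM server_index WHERE server_id = ?", (server_id,)
    ).fetchone()
    if row is None:
        raise KeyError(server_id)
    return row[0]


def write_point(conn, clusters, point):
    """Route one monitoring data point to its target cluster (a plain list here)."""
    cluster_id = lookup_cluster(conn, point["server"])
    clusters[cluster_id].append(point)
    return cluster_id


def fan_out_query(clusters, metric, analyze):
    """Run one sub-query per cluster, then summarize the sub-responses."""
    sub_responses = [
        [p["value"] for p in cluster if p["metric"] == metric]
        for cluster in clusters
    ]
    return analyze(sub_responses)


def overall_mean(sub_responses):
    """Example analysis function: mean over all values from all clusters."""
    values = [v for sub in sub_responses for v in sub]
    return sum(values) / len(values) if values else None
```

In this sketch a caller would open a database connection (for example `sqlite3.connect(":memory:")`), call `store_index` once with the server list and the cluster count, and then call `write_point` for each collected data point. Because writes consult the stored table rather than recomputing the hash, the mapping stays stable for existing servers even if the hash rule or cluster count is changed later.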
CN202111164311.8A 2021-09-30 2021-09-30 Monitoring data management method and device, electronic equipment and storage medium Pending CN113609139A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111164311.8A CN113609139A (en) 2021-09-30 2021-09-30 Monitoring data management method and device, electronic equipment and storage medium
PCT/CN2022/078205 WO2023050705A1 (en) 2021-09-30 2022-02-28 Monitoring data management method and apparatus, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111164311.8A CN113609139A (en) 2021-09-30 2021-09-30 Monitoring data management method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113609139A true CN113609139A (en) 2021-11-05

Family

ID=78343324

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111164311.8A Pending CN113609139A (en) 2021-09-30 2021-09-30 Monitoring data management method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN113609139A (en)
WO (1) WO2023050705A1 (en)

Also Published As

Publication number Publication date
WO2023050705A1 (en) 2023-04-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211105
