CN113609139A - Monitoring data management method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN113609139A (application CN202111164311.8A)
- Authority
- CN
- China
- Prior art keywords
- influxdb
- cluster
- monitoring data
- abnormal
- clusters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/256—Integrating or interfacing systems involving database management systems in federated or virtual databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses a monitoring data management method, a monitoring data management device, an electronic device, and a computer-readable storage medium. The method includes the following steps: creating a plurality of InfluxDB clusters; calculating the index relationship between the servers and the InfluxDB clusters by using a hash algorithm according to the number of servers and the number of InfluxDB clusters, and storing the index relationship in a relational database; and collecting monitoring data of a target server, determining the target InfluxDB cluster corresponding to the target server according to the index relationship, and writing the monitoring data into the target InfluxDB cluster. The monitoring data management method provided by the application thereby achieves fast storage of the monitoring data of a very large number of servers.
Description
Technical Field
The present application relates to the field of storage technologies, and in particular, to a method and an apparatus for managing monitoring data, an electronic device, and a computer-readable storage medium.
Background
The artificial intelligence platform stores monitoring data in an InfluxDB database. Because of the limitations of InfluxDB itself, a single node cannot support fast writing and fast querying of the monitoring data of a very large number of servers in a supercomputing scenario, which seriously affects the user experience.
Therefore, how to implement fast reading and writing of monitoring data of a huge number of servers is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The present application aims to provide a monitoring data management method, a monitoring data management device, an electronic device, and a computer-readable storage medium, which implement fast reading and writing of monitoring data of a large number of servers.
In order to achieve the above object, the present application provides a monitoring data management method, including:
creating a plurality of InfluxDB clusters;
calculating the index relationship between the servers and the InfluxDB clusters by using a Hash algorithm according to the number of the servers and the number of the InfluxDB clusters, and storing the index relationship into a relational database;
and collecting monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster.
Wherein the method further includes:
acquiring a data query command;
analyzing the data query command into a plurality of data query subcommands; each data query subcommand corresponds to a single InfluxDB cluster;
distributing each data query subcommand to the corresponding InfluxDB cluster for execution to obtain the response sub-information corresponding to each data query subcommand;
and summarizing and analyzing all the response sub-information by utilizing an analysis function to obtain response information corresponding to the data query command.
After the obtaining the data query command, the method further includes:
judging whether the data query command contains an accurate query condition; the accurate query condition comprises a single server or a single InfluxDB cluster needing to be queried;
if yes, directly responding to the data query command;
if not, the step of analyzing the data query command into a plurality of data query subcommands is executed.
Wherein the method further includes:
monitoring the monitoring data written into the InfluxDB cluster by utilizing an alarm engine in each InfluxDB cluster according to threshold information so as to generate alarm information;
and summarizing alarm information generated by the InfluxDB cluster.
Wherein the method further includes:
monitoring all the InfluxDB clusters for abnormalities;
and if an abnormal InfluxDB cluster is detected, recovering the abnormal InfluxDB cluster according to the abnormality type.
Wherein recovering the abnormal InfluxDB cluster according to the abnormality type includes:
if the abnormality type is a node abnormality, selecting a normal node in the abnormal InfluxDB cluster to take over the abnormal node.
Wherein recovering the abnormal InfluxDB cluster according to the abnormality type includes:
if the abnormality type is a cluster abnormality, creating a new InfluxDB cluster, obtaining the index relationship of the abnormal InfluxDB cluster from the relational database, and obtaining cached monitoring data from the servers corresponding to the abnormal InfluxDB cluster, so that the new InfluxDB cluster replaces the abnormal InfluxDB cluster.
In order to achieve the above object, the present application provides a monitoring data management apparatus, including:
a creating module, configured to create a plurality of InfluxDB clusters;
the computing module is used for computing the index relationship between the servers and the InfluxDB clusters by utilizing a Hash algorithm according to the number of the servers and the number of the InfluxDB clusters and storing the index relationship into a relational database;
and the write-in module is used for acquiring monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster.
To achieve the above object, the present application provides an electronic device including:
a memory for storing a computer program;
and a processor for implementing the steps of the monitoring data management method when executing the computer program.
To achieve the above object, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the monitoring data management method as described above.
According to the scheme, the monitoring data management method provided by the application comprises the following steps: creating a plurality of InfluxDB clusters; calculating the index relationship between the servers and the InfluxDB clusters by using a Hash algorithm according to the number of the servers and the number of the InfluxDB clusters, and storing the index relationship into a relational database; and collecting monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster.
According to the monitoring data management method, the monitoring data of a large number of servers are stored through the plurality of InfluxDB clusters, and the monitoring data of each server are written into the corresponding InfluxDB cluster in parallel, so that the monitoring data are written rapidly. Therefore, the monitoring data management method provided by the application realizes the rapid storage of the monitoring data of the huge number of servers. The application also discloses a monitoring data management device, an electronic device and a computer readable storage medium, which can also realize the technical effects.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort. The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting it. In the drawings:
FIG. 1 is a flow diagram illustrating a method of monitoring data management according to an exemplary embodiment;
FIG. 2 is a diagram illustrating an indexing relationship of a server to an InfluxDB cluster in accordance with an illustrative embodiment;
FIG. 3 is a flow diagram illustrating another monitoring data management method in accordance with an exemplary embodiment;
FIG. 4 is a block diagram illustrating a monitoring data management device in accordance with an exemplary embodiment;
FIG. 5 is a block diagram illustrating an electronic device in accordance with an exemplary embodiment.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application. In addition, in the embodiments of the present application, "first", "second", and the like are used for distinguishing similar objects, and are not necessarily used for describing a specific order or a sequential order.
The embodiment of the application discloses a monitoring data management method, which realizes the quick reading and writing of monitoring data of a large number of servers.
Referring to FIG. 1, a flowchart of a monitoring data management method according to an exemplary embodiment is shown. As shown in FIG. 1, the method includes:
S101: creating a plurality of InfluxDB clusters;
S102: calculating the index relationship between the servers and the InfluxDB clusters by using a hash algorithm according to the number of servers and the number of InfluxDB clusters, and storing the index relationship in a relational database;
In this embodiment, a plurality of InfluxDB clusters are created to store the monitoring data of a large number of servers; that is, dynamic expansion of the InfluxDB clusters is realized based on the idea of database sharding, and Ansible (an automated operation and maintenance tool) can be used to complete one-click deployment and dynamic shard expansion of the entire artificial intelligence platform. In a specific implementation, the number of InfluxDB clusters to be created is calculated from the total number of servers and the number of servers corresponding to a single InfluxDB cluster. For example, if there are 900 servers and one InfluxDB cluster stores the monitoring data of 200 servers, then 900 / 200 = 4.5, which rounds up to 5 InfluxDB clusters, with indexes 0, 1, 2, 3, and 4. The index relationship between the servers and the InfluxDB clusters is calculated by a hash algorithm according to the number n of servers and the number of InfluxDB clusters. The calculation, illustrated in FIG. 2, is (Hash(cluster index number) + Hash(node name)). FIG. 2 contains 2^32 positions, where a hollow circle represents a node position. When a new InfluxDB cluster is added, a position with no associated node must be found, and the index is then calculated in reverse from that position.
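The cluster-count arithmetic and a hash-based server-to-cluster index can be sketched as follows. This is only one possible reading of FIG. 2, namely a consistent-hashing ring with 2^32 positions; it does not reproduce the exact formula (Hash(cluster index number) + Hash(node name)). The use of MD5, the `cluster-<i>` and `node-<i>` naming, and all function names are assumptions made for illustration.

```python
import hashlib
import math
from bisect import bisect_left

RING_SIZE = 2 ** 32  # number of ring positions, matching the 2^32 positions of FIG. 2


def cluster_count(num_servers: int, servers_per_cluster: int) -> int:
    """Number of InfluxDB clusters to create, rounded up."""
    return math.ceil(num_servers / servers_per_cluster)


def ring_position(key: str) -> int:
    """Hash a name onto the ring (MD5 is an assumed choice of hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


def build_index(server_names, num_clusters):
    """Assign each server to the first cluster at or after its ring position."""
    ring = sorted((ring_position(f"cluster-{i}"), i) for i in range(num_clusters))
    positions = [pos for pos, _ in ring]
    index = {}
    for name in server_names:
        # Wrap around to the first cluster when the server hashes past the last one.
        slot = bisect_left(positions, ring_position(name)) % len(ring)
        index[name] = ring[slot][1]
    return index


servers = [f"node-{i:03d}" for i in range(900)]   # hypothetical server names
n_clusters = cluster_count(len(servers), 200)     # 900 / 200 rounds up to 5
index = build_index(servers, n_clusters)
```

With only a handful of cluster positions on the ring, the share of servers per cluster can be uneven; a production design would need some balancing step, which this sketch does not attempt.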
It can be understood that each server stores the index relationship between itself and its corresponding InfluxDB cluster, and each InfluxDB cluster also stores the index relationship between itself and its corresponding servers. All the index relationships may additionally be stored in a relational database, such as MariaDB, for backup. When the relational database has multiple nodes, the in-memory data of the nodes can be synchronized directly between nodes by RPC (Remote Procedure Call) in an incremental update mode. Further, three scripted deployment modes are built in for each InfluxDB cluster: direct deployment on virtual or physical machines, containerized deployment, and k8s (Kubernetes) deployment.
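A minimal sketch of backing up the index relationship in a relational database is shown below. SQLite (from the Python standard library) stands in for MariaDB so that the example is self-contained, and `INSERT OR REPLACE` is SQLite syntax. The table name `server_cluster_index`, its columns, and the function names are assumptions and do not come from the source.

```python
import sqlite3


def save_index(conn, index):
    """Persist a {server_name: cluster_index} mapping (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS server_cluster_index ("
        "server_name TEXT PRIMARY KEY, cluster_index INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO server_cluster_index VALUES (?, ?)",
        list(index.items()),
    )
    conn.commit()


def lookup_cluster(conn, server_name):
    """Server side of the relationship: which cluster does this server write to?"""
    row = conn.execute(
        "SELECT cluster_index FROM server_cluster_index WHERE server_name = ?",
        (server_name,),
    ).fetchone()
    return None if row is None else row[0]


def servers_of_cluster(conn, cluster_index):
    """Cluster side of the relationship: which servers belong to this cluster?"""
    rows = conn.execute(
        "SELECT server_name FROM server_cluster_index "
        "WHERE cluster_index = ? ORDER BY server_name",
        (cluster_index,),
    ).fetchall()
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
save_index(conn, {"node-001": 0, "node-002": 3, "node-003": 0})
```

Keeping both lookup directions in one table mirrors the statement that servers and clusters each hold their side of the relationship, while the database holds the full backup.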
S103: and collecting monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relation, and writing the monitoring data into the target InfluxDB cluster.
In this embodiment, each server includes the collection component Telegraf, which collects the server's own monitoring data and outputs it to an InfluxDB cluster. In a specific implementation, the data output mode of Telegraf is modified so that the output of monitoring data is completed dynamically; that is, a new configuration item is added to the original Telegraf configuration file to define a new data output mode. In this output mode, the target InfluxDB cluster corresponding to a target server is determined according to the index relationship, and the monitoring data of the target server is written into that target InfluxDB cluster.
As a preferred embodiment, an adapter may also be customized to convert the collected monitoring data into a data format that the InfluxDB cluster can store. It should be noted that data storage supports both periodic writing and writing on cache-space overflow, whichever condition is met first. If the maximum number of write attempts is exceeded, or the corresponding InfluxDB cluster temporarily cannot be written to, the data is stored in a cache pool; if the data still cannot be written after a preset time, it is discarded.
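The retry, cache-pool, and discard behaviour described above can be sketched as follows. The class `BufferedWriter`, its parameters, the `fake_send` transport, and the use of `ConnectionError` to signal a failed write are all hypothetical; the periodic and overflow-triggered flush scheduling is left out, and a real client for InfluxDB would replace `fake_send`.

```python
import time
from collections import deque


class BufferedWriter:
    """Write batches with bounded retries; park failures in a cache pool."""

    def __init__(self, send, max_attempts=3, max_age_seconds=60.0,
                 clock=time.monotonic):
        self.send = send
        self.max_attempts = max_attempts
        self.max_age_seconds = max_age_seconds
        self.clock = clock
        self.pool = deque()  # entries are (time_queued, batch)

    def _try_send(self, batch):
        for _ in range(self.max_attempts):
            try:
                self.send(batch)
                return True
            except ConnectionError:
                continue
        return False

    def write(self, batch):
        """Try to write; on failure keep the batch in the cache pool."""
        if not self._try_send(batch):
            self.pool.append((self.clock(), batch))

    def flush_pool(self):
        """Retry pooled batches; return how many were discarded as too old."""
        discarded = 0
        for _ in range(len(self.pool)):
            queued_at, batch = self.pool.popleft()
            if self._try_send(batch):
                continue
            if self.clock() - queued_at >= self.max_age_seconds:
                discarded += 1          # still unwritable after the preset time
            else:
                self.pool.append((queued_at, batch))
        return discarded


# Demo with a hypothetical transport and a fake clock.
state = {"up": False, "now": 0.0}
sent = []


def fake_send(batch):
    if not state["up"]:
        raise ConnectionError("cluster temporarily unavailable")
    sent.append(batch)


writer = BufferedWriter(fake_send, max_attempts=2, max_age_seconds=10.0,
                        clock=lambda: state["now"])
writer.write(["cpu,host=node-001 usage=0.42"])   # fails twice, goes to the pool
pooled_after_failure = len(writer.pool)
state["up"] = True
writer.flush_pool()                               # cluster is back: batch delivered
state["up"] = False
writer.write(["cpu,host=node-001 usage=0.57"])   # pooled again
state["now"] = 60.0                               # the preset time has passed
discarded = writer.flush_pool()
```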
According to the monitoring data management method provided by the embodiment of the application, monitoring data of a large number of servers are stored through a plurality of InfluxDB clusters, and the monitoring data of each server are written into the corresponding InfluxDB clusters in parallel, so that the monitoring data are written quickly. Therefore, the monitoring data management method provided by the application realizes the rapid storage of the monitoring data of the huge number of servers.
This embodiment introduces a data query method, specifically:
Referring to FIG. 3, a flowchart of another monitoring data management method according to an exemplary embodiment is shown. As shown in FIG. 3, the method includes:
S201: acquiring a data query command;
In this embodiment, the data query command may specifically be SQL (Structured Query Language). A global interceptor (following the aspect-oriented idea in Java) is added to the InfluxDB layer, and all data query commands, that is, all DAO methods, are intercepted to judge whether the data query command contains an accurate query condition. The accurate query condition comprises a single server or a single InfluxDB cluster to be queried. If yes, the data query command is responded to directly; if not, the process proceeds to step S202.
S202: analyzing the data query command into a plurality of data query subcommands; each data query subcommand corresponds to a single InfluxDB cluster;
S203: distributing each data query subcommand to the corresponding InfluxDB cluster for execution to obtain the response sub-information corresponding to each data query subcommand;
In a specific implementation, the data query command is parsed into a plurality of data query subcommands according to the InfluxDB clusters to be queried. Each data query subcommand queries one InfluxDB cluster and is distributed to the corresponding InfluxDB cluster; the subcommands are executed in parallel, and blocking callbacks are supported, so that the response sub-information of each InfluxDB cluster is obtained.
S204: and summarizing and analyzing all the response sub-information by utilizing an analysis function to obtain response information corresponding to the data query command.
In a specific implementation, the SQL is parsed dynamically to determine whether data summarization using an analysis function is required. The attributes that need to participate in the calculation are marked with annotations in the Object Relational Mapping (ORM) object, and analysis functions are customized on the basis of the functions of InfluxDB; the analysis functions include mean, maximum, minimum, variance, latest value, and the like. Taking the mean function as an example, each InfluxDB cluster obtains a calculation result, i.e., response sub-information, by using its own mean function; the results generated by all the InfluxDB clusters are then summed, and the sum is divided by the number of InfluxDB clusters to obtain the final calculation result, i.e., the response information corresponding to the data query command.
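The summarization step for the mean, maximum, and minimum functions can be sketched as below. `merge_mean` follows the description above (sum of per-cluster means divided by the number of clusters). `merge_weighted_mean` is not in the source; it is added only as an assumption-labelled alternative for the case where clusters hold different numbers of samples, where a plain mean of means differs from the mean over all samples. All names are hypothetical.

```python
def merge_mean(sub_means):
    """Mean of the per-cluster means, as described in the text."""
    return sum(sub_means) / len(sub_means)


def merge_weighted_mean(sub_results):
    """Assumed alternative: sub_results is a list of (mean, sample_count)."""
    total = sum(count for _, count in sub_results)
    return sum(mean * count for mean, count in sub_results) / total


# Dispatch table from analysis-function name to its summarizing step.
MERGERS = {"mean": merge_mean, "max": max, "min": min}

# Two clusters: mean 10.0 over 100 samples and mean 40.0 over 300 samples.
plain = MERGERS["mean"]([10.0, 40.0])
weighted = merge_weighted_mean([(10.0, 100), (40.0, 300)])
highest = MERGERS["max"]([10.0, 40.0])
```

Here `plain` is 25.0 while `weighted` is 32.5, so the two agree only when every cluster contributes the same number of samples.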
Therefore, the embodiment realizes the parallel data query of a plurality of InfluxDB clusters, and the query results of all the InfluxDB clusters are summarized and analyzed by using the analysis function, so that the data query efficiency is improved.
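The query flow of steps S201 to S203 can be sketched as follows: a query that names a single server or a single cluster is answered by one cluster directly, and any other query is fanned out to every cluster in parallel. Representing a query as a Python dict instead of an SQL string, and the names `plan_query`, `run_query`, and `make_executor`, are assumptions for illustration; each executor stands in for a real call to one InfluxDB cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_query(query, index, all_clusters):
    """Return the cluster indexes that must receive the query."""
    if query.get("cluster") is not None:      # accurate condition: single cluster
        return [query["cluster"]]
    if query.get("server") is not None:       # accurate condition: single server
        return [index[query["server"]]]
    return list(all_clusters)                 # otherwise one subcommand per cluster


def run_query(query, index, executors):
    """executors maps cluster index -> callable(query) returning sub-information."""
    targets = plan_query(query, index, sorted(executors))
    if len(targets) == 1:
        only = targets[0]
        return {only: executors[only](query)}  # direct response
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = {c: pool.submit(executors[c], query) for c in targets}
        # result() blocks until the corresponding cluster has answered.
        return {c: f.result() for c, f in futures.items()}


def make_executor(cluster):
    """Hypothetical stand-in for executing a subcommand on one cluster."""
    return lambda query: f"cluster-{cluster}:{query['measurement']}"


index = {"node-001": 0, "node-002": 1}
executors = {c: make_executor(c) for c in (0, 1)}
exact = run_query({"measurement": "cpu", "server": "node-002"}, index, executors)
broad = run_query({"measurement": "cpu"}, index, executors)
```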
On the basis of the above embodiment, as a preferred implementation, the method further includes: monitoring the monitoring data written into the InfluxDB cluster by utilizing an alarm engine in each InfluxDB cluster according to threshold information so as to generate alarm information; and summarizing alarm information generated by the InfluxDB cluster.
In a specific implementation, an alarm engine is deployed in each InfluxDB cluster, and when a new InfluxDB cluster is added, the deployment of its alarm engine module is completed dynamically. The alarm engine monitors the written monitoring data according to threshold information to generate alarm information. The threshold information may include a threshold range, an enable/disable flag, an alarm frequency, and the like; it is issued to each alarm engine by the service module, and the service module can also update the threshold information in each alarm engine. Furthermore, an alarm information summarizing component is deployed for unified processing of the alarm information; the unified processing may include data deduplication, alarm mail generation, alarm information storage, and the like.
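A per-cluster alarm engine and a summarizing step with deduplication can be sketched as follows. Only the threshold range and the enable/disable flag are modelled; alarm frequency, mail generation, and storage are omitted. The class names, the alarm dict layout, and deduplication by (server, metric) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    low: float
    high: float
    enabled: bool = True


class AlarmEngine:
    """One engine per InfluxDB cluster; thresholds are pushed by a service module."""

    def __init__(self, cluster_index, thresholds):
        self.cluster_index = cluster_index
        self.thresholds = dict(thresholds)

    def update_threshold(self, metric, threshold):
        self.thresholds[metric] = threshold

    def check(self, server, metric, value):
        """Return an alarm dict if the value is out of range, else None."""
        t = self.thresholds.get(metric)
        if t is None or not t.enabled or t.low <= value <= t.high:
            return None
        return {"cluster": self.cluster_index, "server": server,
                "metric": metric, "value": value}


def summarize(alarms):
    """Drop empty results and keep the first alarm per (server, metric)."""
    seen, unique = set(), []
    for alarm in alarms:
        if alarm is None:
            continue
        key = (alarm["server"], alarm["metric"])
        if key not in seen:
            seen.add(key)
            unique.append(alarm)
    return unique


engine = AlarmEngine(0, {"cpu": Threshold(0.0, 0.9)})
a1 = engine.check("node-001", "cpu", 0.95)   # above the range: alarm
a2 = engine.check("node-001", "cpu", 0.97)   # duplicate of the same condition
a3 = engine.check("node-001", "cpu", 0.50)   # within range: no alarm
engine.update_threshold("cpu", Threshold(0.0, 0.9, enabled=False))
a4 = engine.check("node-001", "cpu", 0.99)   # threshold disabled: no alarm
summary = summarize([a1, a2, a3, a4])
```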
On the basis of the above embodiment, as a preferred implementation, the method further includes: monitoring all the InfluxDB clusters for abnormalities; and if an abnormal InfluxDB cluster is detected, recovering the abnormal InfluxDB cluster according to the abnormality type.
In a specific implementation, to ensure that the plurality of InfluxDB clusters can store monitoring data normally, all InfluxDB clusters are checked at regular intervals, and if the state of a cluster is abnormal, a fast recovery event is triggered. Each server caches its own monitoring data for the most recent period of time, for use in abnormality recovery.
As a feasible implementation manner, if the abnormality type is a node abnormality, that is, an abnormal node is detected in the abnormal InfluxDB cluster, a normal node is selected from the abnormal InfluxDB cluster to take over the abnormal node.
As another feasible implementation manner, if the abnormality type is a cluster abnormality, a new InfluxDB cluster is created, the index relationship of the abnormal InfluxDB cluster is obtained from the relational database, and the cached monitoring data is obtained from the servers corresponding to the abnormal InfluxDB cluster, so that the new InfluxDB cluster replaces the abnormal InfluxDB cluster.
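The two recovery paths can be sketched as a dispatch on the abnormality type. Clusters, the index relationship, and the server caches are modelled as plain dictionaries; the field names, the node status strings, and the function names are hypothetical, and creating a real replacement cluster is reduced to building a new dictionary.

```python
def recover_node(cluster):
    """Node abnormality: pick a normal node to take over each abnormal node."""
    healthy = [n for n, status in cluster["nodes"].items() if status == "ok"]
    failed = [n for n, status in cluster["nodes"].items() if status != "ok"]
    if not healthy:
        raise RuntimeError("no normal node left; treat as a cluster abnormality")
    return {bad: healthy[0] for bad in failed}


def recover_cluster(cluster_index, index_relation, server_caches):
    """Cluster abnormality: build a replacement and replay cached data into it."""
    servers = [s for s, c in index_relation.items() if c == cluster_index]
    replacement = {"index": cluster_index, "nodes": {"new-0": "ok"}, "points": []}
    for server in servers:
        replacement["points"].extend(server_caches.get(server, []))
    return replacement


def recover(abnormality_type, cluster, index_relation, server_caches):
    if abnormality_type == "node":
        return recover_node(cluster)
    if abnormality_type == "cluster":
        return recover_cluster(cluster["index"], index_relation, server_caches)
    raise ValueError(f"unknown abnormality type: {abnormality_type}")


cluster = {"index": 2, "nodes": {"influx-a": "ok", "influx-b": "down"}}
index_relation = {"node-001": 2, "node-002": 0, "node-003": 2}
caches = {"node-001": ["p1", "p2"], "node-003": ["p3"]}

takeover = recover("node", cluster, index_relation, caches)
replacement = recover("cluster", cluster, index_relation, caches)
```

The replacement keeps the index of the abnormal cluster, so the stored index relationship stays valid and only the cached recent data has to be replayed.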
In the following, a monitoring data management apparatus provided in an embodiment of the present application is introduced, and a monitoring data management apparatus described below and a monitoring data management method described above may be referred to each other.
Referring to FIG. 4, a block diagram of a monitoring data management device according to an exemplary embodiment is shown. As shown in FIG. 4, the device includes:
a creating module 401, configured to create a plurality of InfluxDB clusters;
a calculating module 402, configured to calculate the index relationship between the servers and the InfluxDB clusters by using a hash algorithm according to the number of servers and the number of InfluxDB clusters, and store the index relationship in a relational database;
a writing module 403, configured to collect monitoring data of a target server, determine the target InfluxDB cluster corresponding to the target server according to the index relationship, and write the monitoring data into the target InfluxDB cluster.
The monitoring data management device provided by the embodiment of the application stores monitoring data of a large number of servers through a plurality of InfluxDB clusters, and the monitoring data of each server is written into the corresponding InfluxDB cluster in parallel, so that the monitoring data is written quickly. Therefore, the monitoring data management device provided by the application realizes the rapid storage of the monitoring data of the huge servers.
On the basis of the above embodiment, as a preferred implementation, the device further includes:
an acquisition module, configured to acquire a data query command;
an analysis module, configured to analyze the data query command into a plurality of data query subcommands, where each data query subcommand corresponds to a single InfluxDB cluster;
an execution module, configured to distribute each data query subcommand to the corresponding InfluxDB cluster for execution to obtain the response sub-information corresponding to each data query subcommand;
and a first summarizing module, configured to summarize and analyze all the response sub-information by using an analysis function to obtain the response information corresponding to the data query command.
On the basis of the above embodiment, as a preferred implementation, the device further includes:
the judging module is used for judging whether the data query command contains an accurate query condition; the accurate query condition comprises a single server or a single InfluxDB cluster needing to be queried; if yes, starting the working process of the response module; if not, starting the working process of the analysis module;
and the response module is used for responding to the data query command.
On the basis of the above embodiment, as a preferred implementation, the device further includes:
the monitoring module is used for monitoring the monitoring data written into the InfluxDB cluster by utilizing an alarm engine in each InfluxDB cluster according to threshold information so as to generate alarm information;
and the second summarizing module is used for summarizing the alarm information generated by the InfluxDB cluster.
On the basis of the above embodiment, as a preferred implementation, the device further includes:
a monitoring module, configured to monitor all the InfluxDB clusters for abnormalities;
and a recovery module, configured to recover an abnormal InfluxDB cluster according to the abnormality type when the abnormal InfluxDB cluster is detected.
On the basis of the above embodiment, as a preferred implementation manner, if the abnormality type is a node abnormality, the recovery module is specifically a module that selects a normal node from the abnormal InfluxDB cluster to take over the abnormal node.
On the basis of the above embodiment, as a preferred implementation manner, the server caches monitoring data for a most recent preset duration. If the abnormality type is a cluster abnormality, the recovery module is specifically a module that creates a new InfluxDB cluster, obtains the index relationship of the abnormal InfluxDB cluster from the relational database, and obtains cached monitoring data from the servers corresponding to the abnormal InfluxDB cluster, so that the new InfluxDB cluster replaces the abnormal InfluxDB cluster.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Based on the hardware implementation of the above program modules, and in order to implement the method according to the embodiments of the present application, an embodiment of the present application further provides an electronic device. FIG. 5 is a structural diagram of an electronic device according to an exemplary embodiment. As shown in FIG. 5, the electronic device includes:
a communication interface 1, capable of exchanging information with other devices such as network devices;
and a processor 2, connected to the communication interface 1 to exchange information with other devices, and configured to execute the monitoring data management method provided by one or more of the above technical solutions when running a computer program. The computer program is stored in the memory 3.
In practice, of course, the various components in the electronic device are coupled together by the bus system 4. It will be appreciated that the bus system 4 is used to enable connection communication between these components. The bus system 4 comprises, in addition to a data bus, a power bus, a control bus and a status signal bus. For the sake of clarity, however, the various buses are labeled as bus system 4 in fig. 5.
The memory 3 in the embodiment of the present application is used to store various types of data to support the operation of the electronic device. Examples of such data include: any computer program for operating on an electronic device.
It will be appreciated that the memory 3 may be either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disk, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory may be Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 3 described in the embodiments of the present application is intended to comprise, without being limited to, these and any other suitable types of memory.
The method disclosed in the above embodiment of the present application may be applied to the processor 2, or implemented by the processor 2. The processor 2 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 2. The processor 2 described above may be a general purpose processor, a DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor 2 may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 3, and the processor 2 reads the program in the memory 3 and in combination with its hardware performs the steps of the aforementioned method.
When the processor 2 executes the program, the corresponding processes in the methods according to the embodiments of the present application are realized, and for brevity, are not described herein again.
In an exemplary embodiment, the present application further provides a storage medium, i.e. a computer storage medium, specifically a computer readable storage medium, for example, including a memory 3 storing a computer program, which can be executed by a processor 2 to implement the steps of the foregoing method. The computer readable storage medium may be Memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface Memory, optical disk, or CD-ROM.
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for enabling an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
The above describes only specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.
Claims (10)
1. A monitoring data management method, comprising:
creating a plurality of InfluxDB clusters;
calculating an index relationship between servers and the InfluxDB clusters by using a hash algorithm according to the number of the servers and the number of the InfluxDB clusters, and storing the index relationship in a relational database;
and collecting monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relationship, and writing the monitoring data into the target InfluxDB cluster.
2. The monitoring data management method according to claim 1, further comprising:
acquiring a data query command;
parsing the data query command into a plurality of data query subcommands, each data query subcommand corresponding to a single InfluxDB cluster;
distributing each data query subcommand to the corresponding InfluxDB cluster for execution to obtain response sub-information corresponding to each data query subcommand;
and aggregating and analyzing all the response sub-information by using an analysis function to obtain response information corresponding to the data query command.
3. The monitoring data management method according to claim 2, wherein after the acquiring of the data query command, the method further comprises:
determining whether the data query command contains an exact query condition, the exact query condition specifying a single server or a single InfluxDB cluster to be queried;
if yes, directly responding to the data query command;
if not, executing the step of parsing the data query command into a plurality of data query subcommands.
4. The monitoring data management method according to claim 1, further comprising:
checking, by an alarm engine in each InfluxDB cluster and according to threshold information, the monitoring data written into that InfluxDB cluster so as to generate alarm information;
and aggregating the alarm information generated by the InfluxDB clusters.
5. The monitoring data management method according to any one of claims 1 to 4, further comprising:
monitoring all the InfluxDB clusters for abnormalities;
and if an abnormal InfluxDB cluster is detected, recovering the abnormal InfluxDB cluster according to the abnormality type.
6. The monitoring data management method according to claim 5, wherein the recovering of the abnormal InfluxDB cluster according to the abnormality type comprises:
if the abnormality type is node abnormality, selecting a normal node in the abnormal InfluxDB cluster to take over from the abnormal node.
7. The monitoring data management method according to claim 5, wherein each server caches its own monitoring data for a most recent preset duration, and the recovering of the abnormal InfluxDB cluster according to the abnormality type comprises:
if the abnormality type is cluster abnormality, creating a new InfluxDB cluster, acquiring the index relationship of the abnormal InfluxDB cluster from the relational database, and acquiring the cached monitoring data from the servers corresponding to the abnormal InfluxDB cluster, so that the new InfluxDB cluster replaces the abnormal InfluxDB cluster.
8. A monitoring data management apparatus, comprising:
a creating module for creating a plurality of InfluxDB clusters;
a computing module for calculating an index relationship between servers and the InfluxDB clusters by using a hash algorithm according to the number of the servers and the number of the InfluxDB clusters, and storing the index relationship in a relational database;
and a writing module for collecting monitoring data of a target server, determining a target InfluxDB cluster corresponding to the target server according to the index relationship, and writing the monitoring data into the target InfluxDB cluster.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the monitoring data management method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the monitoring data management method according to any one of claims 1 to 7.
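As an illustration of the hash-based index relationship recited in claim 1, the following is a minimal sketch and not the applicant's implementation. It assumes a SHA-256 hash taken modulo the cluster count (the claims only say "a hash algorithm"), uses an in-memory SQLite table as the relational database, and stands in plain Python lists for the InfluxDB clusters. The names `cluster_index`, `build_index`, `target_cluster`, `write_point` and the table `server_cluster` are hypothetical.

```python
import hashlib
import sqlite3


def cluster_index(server_id: str, cluster_count: int) -> int:
    """Hash a server ID to a cluster slot (SHA-256 plus modulo is an assumption)."""
    digest = hashlib.sha256(server_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % cluster_count


def build_index(db: sqlite3.Connection, servers: list, cluster_count: int) -> None:
    """Compute the server-to-cluster index relationship and store it in a table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS server_cluster ("
        "server_id TEXT PRIMARY KEY, cluster_id INTEGER NOT NULL)"
    )
    rows = [(s, cluster_index(s, cluster_count)) for s in servers]
    db.executemany("INSERT OR REPLACE INTO server_cluster VALUES (?, ?)", rows)
    db.commit()


def target_cluster(db: sqlite3.Connection, server_id: str) -> int:
    """Look up the target cluster for a server from the stored index relationship."""
    row = db.execute(
        "SELECT cluster_id FROM server_cluster WHERE server_id = ?", (server_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no index entry for {server_id}")
    return row[0]


def write_point(db: sqlite3.Connection, clusters: dict, server_id: str, point: dict) -> int:
    """Route one monitoring point to its target cluster (a list stands in for InfluxDB)."""
    cid = target_cluster(db, server_id)
    clusters[cid].append((server_id, point))
    return cid


# Usage: six servers spread over three simulated clusters.
db = sqlite3.connect(":memory:")
build_index(db, [f"server-{i}" for i in range(6)], cluster_count=3)
clusters = {i: [] for i in range(3)}
written_to = write_point(db, clusters, "server-2", {"cpu_usage": 0.42})
```

Because the mapping is persisted, the writer only needs a table lookup per server, and the same table can later be read back when a failed cluster has to be rebuilt.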
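The query path of claims 2 and 3 can be sketched in the same spirit. This is an assumption-laden illustration: each cluster is a list of `(server_id, value)` pairs, `run_subquery` stands in for executing one subcommand on one InfluxDB cluster, and the analysis function is any callable over the merged values (for example `max` or `sum`). A query that names a single server or a single cluster is answered directly; otherwise one subquery per cluster is fanned out and the partial results are merged. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

Point = tuple[str, float]  # (server_id, value)


def run_subquery(points: list[Point], server: Optional[str] = None) -> list[float]:
    """Stand-in for executing one data query subcommand on one cluster."""
    return [value for sid, value in points if server is None or sid == server]


def query(
    clusters: dict[int, list[Point]],
    index: dict[str, int],
    analyze: Callable[[list[float]], float],
    server: Optional[str] = None,
    cluster: Optional[int] = None,
) -> float:
    """Answer a query directly when it is exact, else fan out and aggregate."""
    # Exact condition: a single server, resolved through the index relationship.
    if server is not None:
        return analyze(run_subquery(clusters[index[server]], server))
    # Exact condition: a single cluster.
    if cluster is not None:
        return analyze(run_subquery(clusters[cluster]))
    # General case: one subquery per cluster, executed concurrently.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(run_subquery, clusters.values()))
    merged = [value for part in partials for value in part]
    return analyze(merged)


# Usage: two simulated clusters holding three servers.
demo_clusters = {0: [("a", 1.0), ("b", 5.0)], 1: [("c", 3.0)]}
demo_index = {"a": 0, "b": 0, "c": 1}
overall_max = query(demo_clusters, demo_index, max)
single_server = query(demo_clusters, demo_index, max, server="c")
```

Checking for the exact condition first avoids touching clusters that cannot hold the requested data.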
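For the per-cluster alarm engine of claim 4, a minimal sketch follows. It assumes that the threshold information is an upper limit per metric and that an alarm fires when a value exceeds that limit; the claims do not specify the comparison. The functions `cluster_alarms` and `summarize_alarms` and the alarm record layout are hypothetical.

```python
def cluster_alarms(cluster_id: int, points: list, thresholds: dict) -> list:
    """Alarm engine for one cluster: flag metric values above their threshold."""
    alarms = []
    for server_id, metrics in points:
        for name, value in metrics.items():
            limit = thresholds.get(name)
            # Metrics without a configured threshold are ignored.
            if limit is not None and value > limit:
                alarms.append(
                    {
                        "cluster": cluster_id,
                        "server": server_id,
                        "metric": name,
                        "value": value,
                        "threshold": limit,
                    }
                )
    return alarms


def summarize_alarms(clusters: dict, thresholds: dict) -> list:
    """Collect the alarms generated by every cluster into one summary list."""
    summary = []
    for cluster_id, points in sorted(clusters.items()):
        summary.extend(cluster_alarms(cluster_id, points, thresholds))
    return summary


# Usage: each cluster holds (server_id, metrics) pairs.
alarm_clusters = {
    0: [("a", {"cpu": 0.95})],
    1: [("b", {"cpu": 0.20, "mem": 0.99})],
}
alarm_summary = summarize_alarms(alarm_clusters, {"cpu": 0.9, "mem": 0.9})
```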
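The two recovery paths of claims 6 and 7 can be sketched as follows, under stated assumptions: a cluster is modeled as a small dataclass with per-node health flags, the index relationship is a plain dict standing in for the relational database, and `server_cache` maps each server to its locally cached recent points. None of these names come from the source.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    cluster_id: int
    nodes: dict  # node name -> True if healthy
    active_node: str
    points: list = field(default_factory=list)


def recover_node(cluster: Cluster) -> str:
    """Node abnormality: a healthy node in the same cluster takes over."""
    for name, healthy in cluster.nodes.items():
        if healthy:
            cluster.active_node = name
            return name
    raise RuntimeError("no healthy node left; handle as a cluster abnormality")


def recover_cluster(failed_id: int, index: dict, server_cache: dict, node_names: list) -> Cluster:
    """Cluster abnormality: build a replacement and replay the servers' cached data."""
    replacement = Cluster(failed_id, {n: True for n in node_names}, node_names[0])
    for server_id, cluster_id in index.items():
        # Only servers mapped to the failed cluster contribute cached points.
        if cluster_id == failed_id:
            for point in server_cache.get(server_id, []):
                replacement.points.append((server_id, point))
    return replacement


# Usage: node takeover, then a full cluster replacement.
degraded = Cluster(1, {"n1": False, "n2": True}, "n1")
new_active = recover_node(degraded)

demo_index = {"s1": 1, "s2": 0, "s3": 1}
demo_cache = {"s1": [{"cpu": 0.1}], "s2": [{"cpu": 0.2}], "s3": [{"cpu": 0.3}, {"cpu": 0.4}]}
rebuilt = recover_cluster(1, demo_index, demo_cache, ["m1", "m2"])
```

Reusing the failed cluster's identifier keeps the stored index relationship valid, so writers need no remapping after the replacement; whether the source does this is not stated.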
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111164311.8A CN113609139A (en) | 2021-09-30 | 2021-09-30 | Monitoring data management method and device, electronic equipment and storage medium |
PCT/CN2022/078205 WO2023050705A1 (en) | 2021-09-30 | 2022-02-28 | Monitoring data management method and apparatus, electronic device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111164311.8A CN113609139A (en) | 2021-09-30 | 2021-09-30 | Monitoring data management method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113609139A (en) | 2021-11-05 |
Family
ID=78343324
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111164311.8A Pending CN113609139A (en) | 2021-09-30 | 2021-09-30 | Monitoring data management method and device, electronic equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113609139A (en) |
WO (1) | WO2023050705A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115604353A (en) * | 2022-10-27 | 2023-01-13 | 广西电网有限责任公司(Cn) | Data processing method and system in power monitoring system and computer equipment |
WO2023050705A1 (en) * | 2021-09-30 | 2023-04-06 | 苏州浪潮智能科技有限公司 | Monitoring data management method and apparatus, electronic device and storage medium |
CN117349128A (en) * | 2023-12-05 | 2024-01-05 | 杭州沃趣科技股份有限公司 | Fault monitoring method, device and equipment of server cluster and storage medium |
CN117472697A (en) * | 2023-12-26 | 2024-01-30 | 苏州元脑智能科技有限公司 | Cluster monitoring method and device, electronic equipment and storage medium |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116566865A (en) * | 2023-07-11 | 2023-08-08 | 湖南星汉数智科技有限公司 | Bag grabbing system and method |
CN116561825B (en) * | 2023-07-12 | 2023-09-26 | 北京亿赛通科技发展有限责任公司 | Data security control method and device and computer equipment |
CN116595057B (en) * | 2023-07-14 | 2024-02-27 | 腾讯科技(深圳)有限公司 | Data query method, device, computer equipment and computer program product |
CN116992065B (en) * | 2023-09-26 | 2024-01-12 | 之江实验室 | Graph database data importing method, system, electronic equipment and medium |
CN117573479A (en) * | 2023-12-12 | 2024-02-20 | 中国科学院计算机网络信息中心 | Information system multisource target oriented state monitoring method and system architecture |
CN117914738A (en) * | 2024-01-18 | 2024-04-19 | 北京微控工业网关技术有限公司 | Gateway management method and device, electronic equipment and storage medium |
CN117632666B (en) * | 2024-01-25 | 2024-05-07 | 杭州阿里云飞天信息技术有限公司 | Alarm method, equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107015872A (en) * | 2016-12-09 | 2017-08-04 | 上海壹账通金融科技有限公司 | The processing method and processing device of monitoring data |
CN109634519A (en) * | 2018-11-28 | 2019-04-16 | 平安科技(深圳)有限公司 | The method and storage medium of electronic device, monitoring data caching |
CN111752807A (en) * | 2020-07-01 | 2020-10-09 | 浪潮云信息技术股份公司 | Resource monitoring method based on Kubernetes |
CN113190623A (en) * | 2021-05-14 | 2021-07-30 | 京东数科海益信息科技有限公司 | Data processing method, device, server and storage medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6029951B2 (en) * | 2012-11-27 | 2016-11-24 | 株式会社日立製作所 | Time series database setting automatic generation method, setting automatic generation system and monitoring server |
CN111352809A (en) * | 2020-03-06 | 2020-06-30 | 苏州浪潮智能科技有限公司 | Distributed alarm method, system and computer readable storage medium |
CN112199249A (en) * | 2020-09-16 | 2021-01-08 | 中国建设银行股份有限公司 | Monitoring data processing method, device, equipment and medium |
CN112181942A (en) * | 2020-09-22 | 2021-01-05 | 中国建设银行股份有限公司 | Time sequence database system and data processing method and device |
CN113609139A (en) * | 2021-09-30 | 2021-11-05 | 苏州浪潮智能科技有限公司 | Monitoring data management method and device, electronic equipment and storage medium |
- 2021-09-30: CN202111164311.8A, published as CN113609139A (en), status: active, Pending
- 2022-02-28: WO PCT/CN2022/078205, published as WO2023050705A1 (en), status: unknown
Non-Patent Citations (2)
Title |
---|
刘金: "大规模集群状态时序数据采集、存储与分析", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 * |
陈祖斌: "《电网企业级管理信息***运维体系及实践》", 30 November 2016, 中国财富出版社 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023050705A1 (en) * | 2021-09-30 | 2023-04-06 | 苏州浪潮智能科技有限公司 | Monitoring data management method and apparatus, electronic device and storage medium |
CN115604353A (en) * | 2022-10-27 | 2023-01-13 | 广西电网有限责任公司(Cn) | Data processing method and system in power monitoring system and computer equipment |
CN115604353B (en) * | 2022-10-27 | 2024-05-17 | 广西电网有限责任公司 | Data processing method, system and computer equipment in power monitoring system |
CN117349128A (en) * | 2023-12-05 | 2024-01-05 | 杭州沃趣科技股份有限公司 | Fault monitoring method, device and equipment of server cluster and storage medium |
CN117349128B (en) * | 2023-12-05 | 2024-03-22 | 杭州沃趣科技股份有限公司 | Fault monitoring method, device and equipment of server cluster and storage medium |
CN117472697A (en) * | 2023-12-26 | 2024-01-30 | 苏州元脑智能科技有限公司 | Cluster monitoring method and device, electronic equipment and storage medium |
CN117472697B (en) * | 2023-12-26 | 2024-03-15 | 苏州元脑智能科技有限公司 | Cluster monitoring method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2023050705A1 (en) | 2023-04-06 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN113609139A (en) | Monitoring data management method and device, electronic equipment and storage medium | |
US11880340B2 (en) | Programmatically choosing preferred storage parameters for files in large-scale distributed storage systems | |
CN111600746B (en) | Network fault positioning method, device and equipment | |
US9552161B2 (en) | Repetitive data block deleting system and method | |
US11977532B2 (en) | Log record identification using aggregated log indexes | |
US20140164334A1 (en) | Data block backup system and method | |
CN112395157B (en) | Audit log acquisition method and device, computer equipment and storage medium | |
CN104346264A (en) | System and method for processing system event logs | |
US9338057B2 (en) | Techniques for searching data associated with devices in a heterogeneous data center | |
CN111145382A (en) | Log data processing method and device of automatic driving system | |
CN114443441B (en) | Storage system management method, device and equipment and readable storage medium | |
US20150088941A1 (en) | Programmatically choosing preferred storage parameters for files in large-scale distributed storage systems based on desired file reliability or availability | |
US20210397599A1 (en) | Techniques for generating a consistent view of an eventually consistent database | |
CN117376092A (en) | Fault root cause positioning method, device, equipment and storage medium | |
CN110580253B (en) | Time sequence data set loading method and device, storage medium and electronic equipment | |
US20100274764A1 (en) | Accessing snapshots of a time based file system | |
US20190050436A1 (en) | Content-based predictive organization of column families | |
WO2015042531A1 (en) | Programmatically choosing preferred storage parameters for files in large-scale distributed storage systems | |
CN113254269A (en) | Method, system, equipment and medium for repairing abnormal event of storage system | |
CN113886352A (en) | Metadata recovery method, device, equipment and medium for distributed file system | |
CN113590380A (en) | Database recovery method and system | |
CN107463484B (en) | Method and system for collecting monitoring records | |
CN112269677A (en) | Rollback operation device, method, equipment and medium under heterogeneous cloud platform | |
CN114301780B (en) | Automatic monitoring method and system suitable for multi-terminal operation and maintenance management system, electronic equipment and readable storage medium | |
CN113568883B (en) | Data writing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20211105 |