CN111782341B

CN111782341B - Method and device for managing clusters

Info

Publication number: CN111782341B
Application number: CN202010615162.1A
Authority: CN
Inventors: 李勇; 宋渊
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2020-06-30
Filing date: 2020-06-30
Publication date: 2024-04-05
Anticipated expiration: 2040-06-30
Also published as: CN111782341A

Abstract

The embodiments of the present application disclose a method and a device for managing clusters, which can be used in the technical field of cloud computing. The specific implementation scheme is as follows: acquiring cluster information of a target cluster through a user interface; determining, according to the cluster information, node information of nodes in the target cluster and component information of the nodes, wherein the node information comprises a node type and a number of nodes; creating the nodes according to the node information; and configuring, for the nodes, the components corresponding to the component information. This embodiment improves the efficiency of cluster construction.

Description

Method and device for managing clusters

Technical Field

The embodiment of the application relates to the technical field of computers, in particular to the technical field of cloud computing.

Background

With the development of Internet technology and the advent of the cloud era, various industries have generated and accumulated large-scale user data in their daily operations. Storing and managing this massive data is of great value, but it also costs enterprises a great deal. Traditional relational databases and single machines cannot process data at this scale, so big data clusters consisting of multiple computing nodes have become the general solution for enterprises.

At present, big data clusters are mainly built manually on physical machines, which involves lengthy and tedious steps such as hardware procurement, system installation, ecosystem deployment, and development of data analysis applications.

Disclosure of Invention

The embodiment of the application provides a method, a device, equipment and a storage medium for managing clusters.

In a first aspect, some embodiments of the present application provide a method for managing a cluster, the method comprising: acquiring cluster information of a target cluster through a user interface; determining, according to the cluster information, node information of nodes in the target cluster and component information of the nodes, wherein the node information comprises a node type and a number of nodes; creating the nodes according to the node information; and configuring, for the nodes, the components corresponding to the component information.

In a second aspect, some embodiments of the present application provide an apparatus for managing a cluster, the apparatus comprising: a first acquisition unit configured to acquire cluster information of a target cluster through a user interface; a first determining unit configured to determine, according to the cluster information, node information of nodes in the target cluster and component information of the nodes, the node information including a node type and a number of nodes; a creation unit configured to create the nodes according to the node information; and a first configuration unit configured to configure, for the nodes, the components corresponding to the component information.

In a third aspect, some embodiments of the present application provide an apparatus comprising: one or more processors; and a storage device having one or more programs stored thereon, which when executed by the one or more processors cause the one or more processors to implement the method as described in the first aspect.

In a fourth aspect, some embodiments of the present application provide a computer readable medium having stored thereon a computer program which when executed by a processor implements a method as described in the first aspect.

According to the technology of the application, the cluster construction efficiency is improved.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.

Drawings

The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:

FIG. 1 is an exemplary system architecture diagram to which some embodiments of the present application may be applied;

FIG. 2 is a schematic diagram according to a first embodiment of the present application;

FIG. 3 is a schematic diagram of an application scenario according to an embodiment of the present application;

FIG. 4 is a schematic diagram according to a second embodiment of the present application;

FIG. 5 is a schematic diagram according to a third embodiment of the present application;

FIG. 6 is a schematic diagram according to a fourth embodiment of the present application;

FIG. 7 is a schematic structural diagram of an electronic device suitable for use in implementing the method for managing clusters of embodiments of the present application.

Detailed Description

Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the method for managing clusters or the apparatus for managing clusters of the present application may be applied.

As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104, for example to receive or send messages. Various client applications, such as cloud computing applications, cloud service applications, and search applications, may be installed on the terminal devices 101, 102, 103. The terminal devices 101, 102, 103 may create data processing jobs to be executed by the cluster.

The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with display screens, including but not limited to smartphones, tablets, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above, and may be implemented as a plurality of pieces of software or software modules, or as a single piece of software or software module. The present invention is not particularly limited herein.

The server 105 may be a server that provides various services, such as a server that manages a cluster. A cluster may include a plurality of physical computers, on which a plurality of virtual machines (VMs) may be installed. A virtual machine is a complete computer system that is emulated by software, has complete hardware system functionality, and runs in a completely isolated environment.

The server 105 may acquire cluster information of the target cluster through the user interfaces of the terminal devices 101, 102, 103; determine, according to the cluster information, node information of the nodes in the target cluster and component information of the nodes, where the node information includes a node type and a number of nodes; create the nodes according to the node information; and configure, for the nodes, the components corresponding to the component information.

It should be noted that, the method for managing clusters provided in the embodiments of the present application may be performed by the server 105, and accordingly, the device for managing clusters may be disposed in the server 105.

The server may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules (e.g., to provide distributed services), or as a single software or software module. The present invention is not particularly limited herein.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continued reference to fig. 2, a flow 200 of one embodiment of a method for managing clusters according to the present application is shown. The method for managing clusters comprises the following steps:

Step 201, acquiring cluster information of a target cluster through a user interface.

In this embodiment, an execution body of the method for managing clusters (e.g., the server shown in FIG. 1) may acquire cluster information of a target cluster through a user interface. The target cluster may be a cluster that a user wants to create, and the user may enter the cluster information of the required cluster through the user interface, for example by clicking or typing. The cluster information may characterize the properties of the cluster and of the nodes in the cluster, and may, for example, include several of the following items: cluster name, cluster management password, cluster log address, node information, component information of nodes, cluster template information, high availability option information, image version information, private network information, exclusive system environment information, and disaster recovery scheme information. The cluster information may be determined according to the specific business scenario and budget. As an example, business scenarios may include a batch processing scenario and a stream computing scenario. In the batch processing scenario, the volume of data to be processed is large but the real-time requirement is low, so machines in this scenario place high demands on the central processing unit (CPU) and storage; for the stream computing scenario, machines with high disk throughput and large memory should be selected.

Step 202, determining node information of nodes in the target cluster and component information of the nodes according to the cluster information.

In this embodiment, the execution body may determine node information of the nodes in the target cluster and component information of the nodes according to the cluster information obtained in step 201, where the node information may include a node type and a number of nodes. Node types may be divided in several ways. They may be divided according to parameters of the node, for example its CPU and memory parameters. They may be divided according to the function of the node, for example into general, computing, memory, and big data types, in which case each type corresponds to CPU and memory parameters suited to that type. They may also be divided according to the role of the node in the cluster, for example into management node, task node, master node, and slave node. If the cluster information directly includes the node information and the component information of the nodes, the execution body may obtain them from the cluster information. If the cluster information instead includes other, indirect information that indicates the node information and component information, the execution body needs to further determine the node information and component information indicated by that indirect information.

Step 203, creating a node according to the node information.

In this embodiment, the execution body may create nodes according to the node information determined in step 202. The nodes in this embodiment may be implemented as virtual machines or as other forms of logical devices that are easy to create and destroy. A virtual machine (VM) is a complete computer system that is emulated by software, has complete hardware system functionality, and runs in a completely isolated environment. A physical machine provides the hardware environment for virtual machines and is sometimes referred to as a "host" or "host machine". Taking a node implemented as a virtual machine as an example, the execution body may create the node by constructing a virtual machine.

Step 204, configuring the components corresponding to the component information for the nodes.

In this embodiment, the execution body may configure, for the nodes created in step 203, the components corresponding to the component information. The components may include system components, application components, service components, and the like. A system component may indicate the system installed on a node, an application component may indicate an application installed on a node, and a service component may indicate a service running on a node. In addition, components on the nodes of a cluster will inevitably be stopped by the kernel from time to time because of unexpected situations such as a momentary memory shortage. A daemon process can be set up to restart such services automatically, thereby ensuring service availability.
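The automatic restart described above can be pictured with a minimal Python sketch of one pass of such a daemon. The function name and the `is_running` and `start` callbacks are hypothetical stand-ins for whatever process checks and start commands a real node would use; the present application does not define them.

```python
from typing import Callable, Iterable, List


def restart_stopped_components(
    components: Iterable[str],
    is_running: Callable[[str], bool],
    start: Callable[[str], None],
) -> List[str]:
    """One pass of a hypothetical daemon: restart every component that is down.

    Returns the names of the components that were restarted in this pass.
    """
    restarted = []
    for name in components:
        if not is_running(name):
            start(name)  # bring the stopped service back up
            restarted.append(name)
    return restarted
```

A real daemon would presumably call such a function on a fixed interval and log each restart.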

In flow 200 of the method for managing clusters in this embodiment, cluster information of the target cluster is obtained through a user interface, and the creation and configuration of the nodes in the cluster are completed automatically according to the cluster information. The user therefore does not need to build the cluster step by step on physical machines, which improves the efficiency of cluster construction.
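As a rough illustration of flow 200, the four steps can be sketched in Python as follows. All names (`ClusterInfo`, `NodeSpec`, `Node`, `build_cluster`, and so on) are hypothetical and are not defined by the present application, and nodes are plain in-memory objects standing in for virtual machines.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class NodeSpec:
    """Hypothetical node information: a node type and how many nodes of it."""
    node_type: str
    count: int


@dataclass
class ClusterInfo:
    """Hypothetical cluster information as entered through a user interface."""
    name: str
    node_specs: List[NodeSpec]
    components: Dict[str, List[str]]  # node type -> component names


@dataclass
class Node:
    """Stand-in for a virtual machine created for the cluster."""
    node_id: str
    node_type: str
    components: List[str] = field(default_factory=list)


def determine_info(info: ClusterInfo) -> Tuple[List[NodeSpec], Dict[str, List[str]]]:
    # Step 202: in this sketch the cluster information carries both items directly.
    return info.node_specs, info.components


def create_nodes(cluster_name: str, specs: List[NodeSpec]) -> List[Node]:
    # Step 203: a real system would request virtual machines here.
    nodes = []
    for spec in specs:
        for i in range(spec.count):
            nodes.append(Node(f"{cluster_name}-{spec.node_type}-{i}", spec.node_type))
    return nodes


def configure_components(nodes: List[Node], components: Dict[str, List[str]]) -> None:
    # Step 204: attach the components listed for each node's type.
    for node in nodes:
        node.components = list(components.get(node.node_type, []))


def build_cluster(info: ClusterInfo) -> List[Node]:
    # Step 201 is represented by the caller passing in `info`.
    specs, components = determine_info(info)
    nodes = create_nodes(info.name, specs)
    configure_components(nodes, components)
    return nodes
```

For instance, a `ClusterInfo` with one "master" node and two "task" nodes yields three `Node` objects, each carrying the components listed for its type.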

With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario according to the present embodiment. In the application scenario of fig. 3, the user 301 may input the cluster information of the target cluster through a user interface presented in the terminal 302, the terminal 302 sends the cluster information to the server 303, and the server 303 completes the construction of the target cluster according to the cluster information.

With further reference to FIG. 4, a flow 400 of yet another embodiment of a method for managing clusters is shown. In the flow 400 of this embodiment, scaling steps are added after step 204 of the embodiment corresponding to FIG. 2. The flow 400 specifically includes the following steps:

Step 401, obtaining a scaling condition associated with the target cluster.

In this embodiment, an execution body of the method for cluster scaling (e.g., the server shown in FIG. 1) may obtain a scaling condition associated with the target cluster. The target cluster may be the target cluster constructed through steps 201 to 204, with its automatic scaling function enabled. For the construction process of the target cluster, refer to steps 201 to 204, which are not described again here.

Here, the scaling condition may include a scale-out condition and/or a scale-in condition. Scaling conditions can be set based on time or based on metrics. When the business scale follows a time pattern with obvious peaks and troughs, for example processing and analysis scenarios that run at specific times such as daily or weekly reports, the scaling conditions can be set based on time. When business changes follow no time pattern but important jobs must still run on time, the cluster size needs to be adjusted dynamically according to cluster load metrics, and the scaling conditions can be set based on metrics.

Time-based scaling conditions may include a periodic scaling condition, for example 8 a.m. every Monday, and may also include a scaling condition at a specified point in time, for example 8 a.m. on May 20, 2020.
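A minimal Python sketch of the two kinds of time-based condition follows. The class names and fields are hypothetical, and the weekly condition is simplified to "a given weekday and hour"; a real scheduler would likely use a richer schedule format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class WeeklyCondition:
    """Periodic condition: met during a given hour of a given weekday (Monday is 0)."""
    weekday: int
    hour: int

    def is_met(self, now: datetime) -> bool:
        return now.weekday() == self.weekday and now.hour == self.hour


@dataclass(frozen=True)
class PointInTimeCondition:
    """One-off condition: met once the specified time point has been reached.

    The caller is expected to discard the condition after it has fired once.
    """
    at: datetime

    def is_met(self, now: datetime) -> bool:
        return now >= self.at
```

Both classes expose the same `is_met(now)` method, so a detection task could treat periodic and one-off conditions uniformly.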

A scaling condition may be established based on a metric in the monitoring information generated by the monitoring service of the cluster; for example, the scaling condition is met when the metric is greater than or less than a preset threshold. In addition, the statistic used for a cluster metric, such as the average, maximum, or minimum value, can be set. The monitoring information may include the load of the cluster, such as the memory usage of the resource manager or the cluster CPU load. The monitoring information may also include resource parameters and/or the status of the node on which a resource is located. The resources may include at least one of: virtual machines, persistent databases (e.g., MySQL), and cache databases (e.g., Redis). The resource parameters may include CPU usage, memory usage, number of host connections, database usage, and the like. The status of a node may be either alive or not alive, and can be detected by means of heartbeat detection.
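The threshold comparison and the choice of statistic can be sketched in Python as below. The `MetricCondition` class, its field names, and the metric names used with it are assumptions made for illustration; the monitoring samples are assumed to arrive as a mapping from metric name to recent values.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, Sequence

# Statistics that may be applied to the recent samples of a metric.
STATISTICS: Dict[str, Callable[[Sequence[float]], float]] = {
    "average": mean,
    "maximum": max,
    "minimum": min,
}


@dataclass(frozen=True)
class MetricCondition:
    metric: str       # hypothetical metric name, e.g. "rm_memory_usage"
    statistic: str    # "average", "maximum" or "minimum"
    comparison: str   # "greater" or "less"
    threshold: float

    def is_met(self, samples: Dict[str, Sequence[float]]) -> bool:
        values = samples.get(self.metric)
        if not values:
            return False  # no monitoring data: do not trigger scaling
        value = STATISTICS[self.statistic](values)
        if self.comparison == "greater":
            return value > self.threshold
        if self.comparison == "less":
            return value < self.threshold
        raise ValueError(f"unknown comparison: {self.comparison!r}")
```

Returning `False` when no samples are available is a design choice of this sketch: it avoids scaling on missing data rather than guessing.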

Step 402, in response to detecting that the scaling condition is satisfied, executing the scaling operation corresponding to the scaling condition.

In this embodiment, the execution body may, in response to detecting that the scaling condition acquired in step 401 is satisfied, execute the scaling operation corresponding to that scaling condition. Each scaling operation corresponds to a scaling condition. For example, if the scaling condition is that the memory usage of the resource manager is between 70% and 80%, the corresponding scaling operation may be to add 2 virtual machines. As another example, if the cluster workload is heavy from 8 a.m. to 9 a.m. every day, one scaling condition may be set to the current time being 8 a.m., with the corresponding scaling operation adding several task nodes, and another scaling condition may be set to the current time being 9 a.m., with the corresponding scaling operation removing several task nodes.

The scaling operation corresponding to a scaling condition may be set by the user or determined from historical data. For example, if historical data show that, when the memory usage of the resource manager is higher than 80%, adding 2 virtual machines brings the memory usage back down to a reasonable level, then when the scaling condition is set to the memory usage of the resource manager being higher than 80%, the corresponding scaling operation is to add 2 virtual machines.

In some optional implementations of this embodiment, executing the scaling operation corresponding to the scaling condition in response to detecting that the scaling condition is satisfied includes: in response to detecting that the scaling condition is satisfied, outputting transaction information of a scaling transaction according to the scaling operation corresponding to the scaling condition; and, in response to receiving indication information indicating that the scaling transaction succeeded, executing the scaling operation corresponding to the scaling condition. The transaction information may include a transaction number, transaction amount, payment method, and the like, and the indication that the transaction succeeded may come from an interface of a payment application. In this implementation, the transaction information of the scaling transaction is output automatically according to the scaling operation, and the scaling operation proceeds once the transaction succeeds, so the user does not need to develop a separate transaction flow, which further improves the convenience of scaling.
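The "output the transaction, then scale only on success" ordering can be sketched as follows. `TransactionInfo`, `scale_after_payment`, and the three callbacks are hypothetical; in particular, `wait_for_result` merely stands in for whatever interface of a payment application reports the outcome.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TransactionInfo:
    transaction_id: str
    amount: float
    payment_method: str


def scale_after_payment(
    transaction: TransactionInfo,
    output_transaction: Callable[[TransactionInfo], None],
    wait_for_result: Callable[[str], bool],
    scaling_operation: Callable[[], None],
) -> bool:
    """Output the transaction, then scale only if success is indicated.

    Returns True if the scaling operation was executed.
    """
    output_transaction(transaction)  # e.g. show the order to the user
    if wait_for_result(transaction.transaction_id):
        scaling_operation()
        return True
    return False  # transaction failed: leave the cluster unchanged
```

The same gate could plausibly be reused for a cluster construction transaction by passing the node creation step as `scaling_operation`.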

In some optional implementations of this embodiment, one or more scaling conditions may be associated with the target cluster, and when there are several, a priority order for detecting them may be determined. In addition, a detection period for the scaling conditions can be set. If no satisfied scaling condition is detected, the execution body can wait for the next detection. If a satisfied scaling condition is detected, the execution body can further determine whether the scaling operation to be executed is still within its cooldown time and whether, if the operation were executed, the number of resources in the cluster would exceed a preset range. If neither is the case, the scaling operation corresponding to the scaling condition can be executed.
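One reading of the cooldown and resource-range checks is sketched below in Python. The `ScalingGuard` class and its fields are hypothetical, timestamps are plain seconds, and the cluster size is reduced to a single node count for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingGuard:
    cooldown_seconds: float
    min_nodes: int
    max_nodes: int
    last_scaled_at: Optional[float] = None  # timestamp of the previous operation

    def allows(self, now: float, current_nodes: int, delta: int) -> bool:
        """Return True if an operation changing the node count by `delta` may run."""
        in_cooldown = (
            self.last_scaled_at is not None
            and now - self.last_scaled_at < self.cooldown_seconds
        )
        if in_cooldown:
            return False
        # Reject operations that would push the cluster outside the preset range.
        return self.min_nodes <= current_nodes + delta <= self.max_nodes

    def record(self, now: float) -> None:
        """Remember when the last scaling operation was executed."""
        self.last_scaled_at = now
```

A detection task would call `allows` after a condition is found to be satisfied, and call `record` only once the operation has actually been executed.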

In some optional implementations of this embodiment, the priority of each scaling condition may be determined according to its creation time: the later the creation time, the higher the priority. For example, among four conditions A, B, C, and D, where A is created first, B after A, C after B, and D last, the priority order from high to low is D, C, B, A. Because a newly created scaling condition may have been created by the user in response to a newly emerging problem, and may therefore be more important than the other scaling conditions, this implementation determines priority by creation time and checks newly created scaling conditions first in subsequent detection, which further optimizes the scaling flow.

In this embodiment, the execution body may check the scaling conditions one by one in the determined priority order until a satisfied scaling condition is detected. For example, if the four scaling conditions A, B, C, and D are ordered from high to low priority as D, C, B, A, the execution body first determines whether condition D is satisfied. If so, the scaling operation corresponding to D is executed; if not, the execution body goes on to check whether condition C is satisfied.
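The newest-first detection order can be sketched in a few lines of Python. `ScalingCondition` and `first_satisfied` are hypothetical names, and `is_met` stands in for whatever time-based or metric-based check a condition carries.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ScalingCondition:
    name: str
    created_at: float            # creation timestamp; later means higher priority
    is_met: Callable[[], bool]   # hypothetical detection callback


def first_satisfied(conditions: List[ScalingCondition]) -> Optional[ScalingCondition]:
    """Check conditions newest-first and return the first one that is satisfied."""
    by_priority = sorted(conditions, key=lambda c: c.created_at, reverse=True)
    for condition in by_priority:
        if condition.is_met():
            return condition  # stop at the first satisfied condition
    return None
```

With conditions A to D created in that order, the check visits D, then C, then B, then A, and stops as soon as one of them is satisfied.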

In some optional implementations of this embodiment, before executing the scaling operation corresponding to the scaling condition in response to detecting that the scaling condition is satisfied, the method further includes: starting a user-mode thread for the target cluster; and periodically triggering a detection task for detecting the scaling condition according to the timer setting of the user-mode thread. Because scaling conditions are associated with clusters, one user-mode thread can be started per cluster to manage the logic related to automatic scaling. User-mode threads are very lightweight, so triggering detection tasks from them can further improve scaling efficiency.
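Python has no direct counterpart of the lightweight threads described here, so the sketch below uses one asyncio task per cluster as a stand-in; this is an assumption of the example, not the mechanism of the present application. The function names are hypothetical, and the `rounds` parameter exists only to keep the demonstration finite.

```python
import asyncio
from typing import Callable, List


async def detection_loop(
    detect: Callable[[], None],
    interval_seconds: float,
    rounds: int,
) -> None:
    """Trigger `detect` once per interval; `rounds` bounds the loop for the demo."""
    for _ in range(rounds):
        detect()
        await asyncio.sleep(interval_seconds)  # timer between detection tasks


async def run_clusters(cluster_ids: List[str], interval_seconds: float, rounds: int) -> List[str]:
    """Start one lightweight task per cluster and record each triggered detection."""
    log: List[str] = []
    tasks = [
        asyncio.create_task(
            detection_loop(lambda cid=cid: log.append(cid), interval_seconds, rounds)
        )
        for cid in cluster_ids
    ]
    await asyncio.gather(*tasks)
    return log
```

Calling `asyncio.run(run_clusters(["c1", "c2"], 0.01, 3))` triggers three detections for each of the two clusters on a single operating system thread.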

Step 403, determining the service type provided by the target cluster.

In this embodiment, the execution body may determine the service type provided by the target cluster. The service type characterizes the service provided by the target cluster. It may be determined according to the function of the cluster and may include a storage type (such as data storage), a computing type (such as log processing), a hybrid type, and so on. The service type may also be determined according to the cluster template used when the cluster was built, or according to the tools installed on the cluster. For example, if the installed tool is HBase, a distributed, column-oriented open source database, the service type may be determined to be the HBase type. The installed tools may further include MapReduce (a programming model for parallel computation over large data sets), YARN (the resource manager of a distributed system infrastructure), Hive (a data warehouse tool), HDFS (a distributed file system), and the like.

Step 404, configuring, according to the service type, the target cluster after the scaling operation has been executed.

In this embodiment, the execution body may configure the target cluster after the scaling operation has been executed, according to the service type determined in step 403. Configuring the target cluster may include configuring the environment, configuring components, and so on. The parameters involved in configuring the environment may be preset, and the components used in configuring components may be written in advance. For example, if the service type provided by the target cluster is the storage type, a preset environment suitable for providing storage services, or components that provide storage services, can be configured for the cluster. As another example, if the target cluster is of the HBase type, a component responsible for storing the actual data and a component that reads and writes data in response to user requests may be configured for it. In addition, the execution body may configure only the resources added by the scaling operation, or may also configure other resources in the target cluster, for example by reinstalling or starting components on those other resources according to the service type.

In some optional implementations of this embodiment, configuring the target cluster after the scaling operation has been executed according to the service type includes: configuring components matching the service type for the resources added to the target cluster by the scaling operation. For example, for a cluster with HBase, MapReduce, and YARN installed, after a scaling operation is executed, the components required by the HBase, MapReduce, and YARN services, such as the RegionServer and the NodeManager, may be started automatically on the added resources, such as added virtual machines. In this implementation, only the components matching the service type are configured for the added resources, which makes configuration more precise and improves configuration efficiency.

As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, flow 400 of the method for managing clusters in this embodiment implements automatic scaling of the target cluster and configures the target cluster after the scaling operation according to its service type. This avoids the problem that a user who lacks experience with clusters cannot scale and configure the cluster reasonably according to business conditions, and improves the efficiency of cluster operation.
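Configuring only the added resources can be sketched as a lookup from installed services to the components each new node needs. The mapping table and function names below are hypothetical; the two entries follow the region server and node manager examples in the text, and a real deployment would define its own table.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping from an installed service to the worker-side component
# that should run on each newly added node.
WORKER_COMPONENTS: Dict[str, str] = {
    "HBase": "RegionServer",
    "YARN": "NodeManager",
}


def components_for_new_nodes(services: Iterable[str]) -> List[str]:
    """Return the components matching the installed services, in service order."""
    return [WORKER_COMPONENTS[s] for s in services if s in WORKER_COMPONENTS]


def configure_added_nodes(
    new_nodes: Iterable[str],
    services: Iterable[str],
) -> Dict[str, List[str]]:
    """Plan which components to start on each node added by a scaling operation."""
    components = components_for_new_nodes(services)
    return {node: list(components) for node in new_nodes}
```

Nodes that already existed before the scaling operation are deliberately left out of the plan, which mirrors the "added resources only" variant described above.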

With further reference to FIG. 5, a flow 500 of yet another embodiment of a method for managing clusters is illustrated. In the flow 500 of this embodiment, step 204 of the embodiment corresponding to FIG. 2 is further defined as including step 504, step 505, and step 506. The flow 500 specifically includes the following steps:

Step 501, acquiring cluster information of a target cluster through a user interface.

In this embodiment, a method execution body (e.g., a server shown in fig. 1) for managing a cluster may acquire cluster information of a target cluster through a user interface.

Step 502, determining node information of nodes in the target cluster and component information of the nodes according to the cluster information.

In this embodiment, the execution body may determine node information of the nodes in the target cluster and component information of the nodes according to the cluster information acquired in step 501.

In some optional implementations of this embodiment, the cluster information includes cluster template information, and determining node information of the nodes in the target cluster and component information of the nodes according to the cluster information includes: determining the node information and the component information of the nodes in the target cluster according to a pre-established correspondence among cluster template information, node information, and component information. The cluster template information may include information that identifies a cluster template, such as the name or number of the cluster template. The cluster template may be a template preset in the system or a template customized by the user. As an example, if the user enters the cluster template information "123", the execution body may look up the node information and the component information of the nodes corresponding to "123". In this implementation, the user can build a cluster simply by entering cluster template information, without entering node information, component information, and so on item by item, which further improves the efficiency of cluster construction.
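The template lookup amounts to a keyed table, as in the following Python sketch. The `ClusterTemplate` type and `resolve_template` function are hypothetical; the identifier "123" echoes the example in the text, while the template contents are invented for illustration.

```python
from typing import Dict, List, NamedTuple


class ClusterTemplate(NamedTuple):
    node_info: Dict[str, int]             # node type -> number of nodes
    component_info: Dict[str, List[str]]  # node type -> component names


# Hypothetical pre-established correspondence between template identifiers
# and the node and component information they stand for.
TEMPLATES: Dict[str, ClusterTemplate] = {
    "123": ClusterTemplate(
        node_info={"master": 1, "task": 3},
        component_info={"master": ["ResourceManager"], "task": ["NodeManager"]},
    ),
}


def resolve_template(template_id: str) -> ClusterTemplate:
    """Look up the node and component information for a cluster template."""
    try:
        return TEMPLATES[template_id]
    except KeyError:
        raise ValueError(f"unknown cluster template: {template_id!r}") from None
```

User-defined templates could be supported by adding entries to the same table at run time.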

Step 503, creating a node according to the node information.

In this embodiment, the execution body may create a node according to the node information determined in step 502.

In some optional implementations of the present embodiment, before creating the node from the node information, the method further includes: outputting transaction information of a cluster construction transaction according to the cluster information; and creating a node from the node information, comprising: in response to receiving the indication that the cluster construction transaction was successful, a node is created from the node information. The transaction information may include information such as transaction number, transaction amount, payment method, etc., and the indication of the success of the transaction may be from an interface of the payment application. In the implementation manner, the transaction information of the cluster construction transaction is automatically output according to the cluster information, and the cluster construction is continued in response to successful transaction, so that a user can complete the transaction in the cluster construction process without developing a transaction flow, and the convenience of the cluster construction is further improved.

In some alternative implementations of the present embodiment, the cluster information includes high availability option information, and creating a node according to the node information includes: in response to the high availability option information indicating that the target cluster is a high availability cluster, creating a master management node and a standby management node, wherein the standby management node replaces the master management node when the master management node fails, and the master management node and the standby management node are deployed in different machine rooms. The management node (Master node) is responsible for managing the cluster, and there may be one or more master management nodes and one or more standby management nodes. In this implementation, creating the standby management node in a different machine room reduces the probability that the management nodes become unavailable at the same time, which further improves the availability of the cluster. In addition, the master and standby management nodes may be provided with a component for managing the namespace of the file system (NameNode) and a resource management component (ResourceManager), and the availability of the resource management service may be ensured through a reliable coordination system for distributed systems (ZooKeeper).
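
The placement rule for management nodes can be sketched as below. This is a minimal illustration under assumptions: `plan_management_nodes`, `active_manager`, and the machine room names are hypothetical, and a real failover would be coordinated by a system such as ZooKeeper rather than by a simple scan.

```python
def plan_management_nodes(high_availability, machine_rooms):
    """Return (role, machine_room) placements for the management nodes."""
    if not high_availability:
        return [("master", machine_rooms[0])]
    if len(machine_rooms) < 2:
        raise ValueError("a high availability cluster needs two machine rooms")
    # Master and standby are deliberately placed in different machine rooms.
    return [("master", machine_rooms[0]), ("standby", machine_rooms[1])]


def active_manager(placements, failed_rooms):
    """Pick the first management node whose machine room is still healthy."""
    for role, room in placements:
        if room not in failed_rooms:
            return role
    return None


placements = plan_management_nodes(True, ["room-a", "room-b"])
```

With both rooms healthy the master serves; if `room-a` fails, `active_manager` falls through to the standby node.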

In some alternative implementations of the present embodiment, the node type includes at least one of: general, computing, memory, big data; and creating a node according to the node information includes: determining processor parameters and memory parameters of the node to be created according to the node type. The correspondence between the node type and the processor parameters and memory parameters of the node to be created may be determined through analysis of historical data; for example, such analysis may determine that a computing node uses an 8-core CPU and 16 GB of memory. In addition, a plurality of options for processor parameters and memory parameters may be provided in the user interface, and the processor parameters and memory parameters of the node to be created are then determined according to the selected option. In this implementation, node types are divided according to node function, so that a user can directly select the node type that matches the functional requirement without entering each parameter, which improves the efficiency of cluster construction.
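
The mapping from node type to hardware parameters can be sketched as a lookup table with an optional user override. Only the computing entry (8 cores, 16 GB) comes from the example above; the other values, and the names `NODE_TYPE_PARAMS` and `node_parameters`, are hypothetical placeholders.

```python
NODE_TYPE_PARAMS = {
    "general": {"cpu_cores": 4, "memory_gb": 16},    # hypothetical value
    "computing": {"cpu_cores": 8, "memory_gb": 16},  # example from the text
    "memory": {"cpu_cores": 4, "memory_gb": 32},     # hypothetical value
    "big data": {"cpu_cores": 8, "memory_gb": 64},   # hypothetical value
}


def node_parameters(node_type, selected_option=None):
    """Return processor and memory parameters for a node to be created.

    An option selected in the user interface, if any, takes precedence
    over the default associated with the node type.
    """
    if selected_option is not None:
        return dict(selected_option)
    if node_type not in NODE_TYPE_PARAMS:
        raise ValueError(f"unknown node type: {node_type!r}")
    return dict(NODE_TYPE_PARAMS[node_type])


params = node_parameters("computing")
```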

Step 504, building a directed acyclic graph according to the dependency relationship among the components corresponding to the component information.

In this embodiment, the execution body may build a directed acyclic graph (Directed Acyclic Graph, DAG) according to the dependency relationship among the components corresponding to the component information. A topological ordering of a directed acyclic graph is a linear ordering of all its nodes (the same graph may have more than one such ordering) that satisfies the following condition: for any two nodes U and V in the graph, if there is a directed edge pointing from U to V, then U must appear before V in the ordering. In addition, the dependency relationship may be stored in a matrix in which, for example, 1 indicates that a dependency relationship exists and 0 indicates that it does not.
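
As a sketch of how a dependency matrix yields a topological ordering, the following uses Kahn's algorithm, one standard technique; the application does not state which algorithm is used. The function name, the convention that `depends_on[i][j] == 1` means component `i` depends on component `j`, and the sample dependencies are assumptions made for illustration.

```python
from collections import deque


def topological_order(components, depends_on):
    """Order components so that every dependency precedes its dependents.

    depends_on[i][j] == 1 means components[i] depends on components[j].
    """
    n = len(components)
    remaining = [sum(row) for row in depends_on]  # unmet dependencies
    ready = deque(i for i in range(n) if remaining[i] == 0)
    order = []
    while ready:
        j = ready.popleft()
        order.append(components[j])
        for i in range(n):
            if depends_on[i][j]:
                remaining[i] -= 1
                if remaining[i] == 0:
                    ready.append(i)
    if len(order) != n:
        raise ValueError("the dependency relationship contains a cycle")
    return order


components = ["ZooKeeper", "HDFS", "YARN", "Spark"]
depends_on = [
    [0, 0, 0, 0],  # ZooKeeper depends on nothing
    [1, 0, 0, 0],  # HDFS depends on ZooKeeper (illustrative)
    [1, 0, 0, 0],  # YARN depends on ZooKeeper (illustrative)
    [0, 1, 1, 0],  # Spark depends on HDFS and YARN (illustrative)
]
order = topological_order(components, depends_on)
```

If the matrix encodes a cycle, no valid ordering exists and the function raises an error instead of returning a partial result.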

Step 505, determining the parallel configuration sequence of each component according to the directed acyclic graph.

In this embodiment, the execution body may determine the parallel configuration order of the components according to the directed acyclic graph. For example, starting from the first node of the directed acyclic graph, the components that have no unconfigured dependencies may be taken, round by round, as a group of components that can be configured in parallel. In addition, the most time-consuming path in the directed acyclic graph may be determined as the critical path, and the configuration time can be shortened further by preferentially configuring the components on the critical path.
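
The grouping and the critical path described above can be sketched as follows. This is an illustration under assumptions: the function names, the dictionary form of the dependencies, and the per-component durations are hypothetical, and every dependency is assumed to appear as a key of `dependencies`.

```python
def parallel_groups(dependencies):
    """Split components into groups that can be configured in parallel.

    dependencies maps each component to the set of components it needs.
    """
    pending = {c: set(deps) for c, deps in dependencies.items()}
    groups = []
    while pending:
        ready = sorted(c for c, deps in pending.items() if not deps)
        if not ready:
            raise ValueError("the dependency relationship contains a cycle")
        groups.append(ready)
        for c in ready:
            del pending[c]
        for deps in pending.values():
            deps.difference_update(ready)
    return groups


def critical_path(dependencies, duration):
    """Return (total time, path) of the most time-consuming chain."""
    best = {}

    def longest(component):
        if component not in best:
            time, path = max(
                (longest(dep) for dep in dependencies[component]),
                default=(0, []),
            )
            best[component] = (time + duration[component], path + [component])
        return best[component]

    return max(longest(component) for component in dependencies)


dependencies = {
    "ZooKeeper": set(),
    "HDFS": {"ZooKeeper"},
    "YARN": {"ZooKeeper"},
    "Spark": {"HDFS", "YARN"},
}
duration = {"ZooKeeper": 2, "HDFS": 5, "YARN": 3, "Spark": 4}  # hypothetical
groups = parallel_groups(dependencies)
critical = critical_path(dependencies, duration)
```

In this example HDFS and YARN land in the same group, and the chain through HDFS is the critical path because HDFS takes longer than YARN.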

Step 506, configuring each component for the node according to the parallel configuration sequence.

In this embodiment, the execution body may configure each component for a node in parallel configuration order.
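
One way to follow a parallel configuration order is to run each group on a thread pool and wait for it to finish before starting the next group. The sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor`; `configure_in_order`, the `configure` callback, and the sample groups are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def configure_in_order(groups, configure, max_workers=4):
    """Configure each group in parallel, one group after another."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for group in groups:
            # Consuming the map iterator waits for the whole group and
            # re-raises the first failure, so later groups never start
            # before their dependencies are configured.
            results.extend(pool.map(configure, group))
    return results


def configure(component):
    # Placeholder for installing and configuring one component on a node.
    return f"{component} configured"


groups = [["ZooKeeper"], ["HDFS", "YARN"], ["Spark"]]
results = configure_in_order(groups, configure)
```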

In some alternative implementations of the present embodiment, the component information includes mirrored version information; and configuring a component corresponding to the component information for the node, including: installing the system components indicated by the mirrored version information for the node. In the implementation manner, by inputting the mirror version information, the user can determine the system components used by the user according to the needs of the user, so that the flexibility of system construction is further improved.

In some optional implementations of the present embodiment, the method further includes: acquiring state information of the node, wherein the state information comprises at least one of the following: processor load information, memory information, input/output (IO) information, disk information; generating a chart according to the state information; and outputting a chart. In the implementation manner, the state information of the nodes is acquired, and the corresponding chart is output, so that a user can monitor the states of the nodes in the cluster more conveniently. The execution body can also acquire the state information and the quantity information of the configured components of the nodes and perform the visualization operation. In addition, the acquired information can be monitored by setting a threshold value and the like, and a monitoring item exceeding the threshold value can trigger an alarm and send a notification through a mail or a short message.
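
The threshold-based monitoring mentioned above can be sketched as a comparison of each monitoring item against its threshold, with a callback standing in for the mail or short message notification. The names `exceeded_items` and `monitor`, the item names, and the threshold values are hypothetical.

```python
def exceeded_items(state, thresholds):
    """Return the monitoring items whose value exceeds its threshold."""
    return sorted(
        item
        for item, value in state.items()
        if item in thresholds and value > thresholds[item]
    )


def monitor(state, thresholds, notify):
    """Trigger an alarm notification for every item over its threshold."""
    alarms = exceeded_items(state, thresholds)
    for item in alarms:
        notify(f"{item} exceeded threshold: {state[item]} > {thresholds[item]}")
    return alarms


sent = []  # stands in for a mail or short message gateway
state = {"cpu_load": 0.95, "memory_used": 0.40, "disk_used": 0.91}
thresholds = {"cpu_load": 0.90, "disk_used": 0.90}
alarms = monitor(state, thresholds, sent.append)
```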

In some alternative implementations of the present embodiment, the components include service components; the method further comprises the following steps: acquiring instruction information for a service initiated by a service component, the instruction information comprising at least one of: starting, stopping and restarting; and controlling the service according to the instruction information. In the implementation mode, the user can conveniently control the service in the cluster through inputting the instruction information, so that the cluster management efficiency is improved.
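
Controlling a service from instruction information can be sketched as a dispatch from the instruction to the matching operation. The `Service` class and `control_service` function are hypothetical stand-ins for whatever interface the service component actually exposes.

```python
class Service:
    """Minimal stand-in for a service started by a service component."""

    def __init__(self, name):
        self.name = name
        self.running = False
        self.start_count = 0

    def start(self):
        if not self.running:
            self.running = True
            self.start_count += 1

    def stop(self):
        self.running = False

    def restart(self):
        self.stop()
        self.start()


def control_service(service, instruction):
    """Apply a start, stop, or restart instruction and report the state."""
    handlers = {
        "start": service.start,
        "stop": service.stop,
        "restart": service.restart,
    }
    if instruction not in handlers:
        raise ValueError(f"unsupported instruction: {instruction!r}")
    handlers[instruction]()
    return service.running


service = Service("resource-manager")
states = [control_service(service, i) for i in ("start", "restart", "stop")]
```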

In this embodiment, the operations of step 501, step 502 and step 503 are substantially the same as those of step 201, step 202 and step 203, and will not be described herein. As can be seen from fig. 5, compared with the embodiment corresponding to fig. 2, in the flow 500 of the method for managing clusters in this embodiment, parallel paths are determined through the directed acyclic graph, and the construction time of the clusters can be further shortened through parallel configuration.

With further reference to fig. 6, as an implementation of the method shown in the foregoing figures, the present application provides an embodiment of an apparatus for managing a cluster, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied in various electronic devices.

As shown in fig. 6, the apparatus 600 for managing clusters of the present embodiment includes: a first acquisition unit 601, a first determination unit 602, a creation unit 603, and a first configuration unit 604. The first acquisition unit 601 is configured to acquire cluster information of a target cluster through a user interface; the first determination unit 602 is configured to determine node information of nodes in the target cluster and component information of the nodes according to the cluster information, the node information including a node type and a node number; the creation unit 603 is configured to create a node according to the node information; and the first configuration unit 604 is configured to configure a component corresponding to the component information for the node.

In this embodiment, for the specific processing of the first acquisition unit 601, the first determination unit 602, the creation unit 603, and the first configuration unit 604 of the apparatus 600 for managing clusters, reference may be made to steps 201, 202, 203, and 204 in the embodiment corresponding to fig. 2.

In some optional implementations of this embodiment, the first configuration unit is further configured to: establishing a directed acyclic graph according to the dependency relationship among the components corresponding to the component information; determining the parallel configuration sequence of each component according to the directed acyclic graph; and configuring each component for the nodes according to the parallel configuration sequence.

In some alternative implementations of the present embodiment, the cluster information includes high availability option information; and a creation unit further configured to: and in response to the high availability option information indicating that the target cluster is a high availability cluster, creating a master management node and a standby management node, wherein the standby management node replaces the master management node when the master management node fails, and the master management node and the standby management node are deployed in different machine rooms.

In some alternative implementations of the present embodiment, the component information includes mirrored version information; and a first configuration unit further configured to: installing the system components indicated by the mirrored version information for the node.

In some alternative implementations of the present embodiment, the node type includes at least one of: general, computing, memory, big data; and a creation unit further configured to: processor parameters and memory parameters of the node to be created are determined according to the node type.

In some optional implementations of the present embodiment, the cluster information includes cluster template information; and a first determination unit further configured to: and determining node information and component information of the nodes in the target cluster according to the corresponding relation between the pre-established cluster template information, the node information and the component information.

In some optional implementations of this embodiment, the apparatus further includes: a first output unit configured to: outputting transaction information of a cluster construction transaction according to the cluster information; and a creation unit further configured to: in response to receiving the indication that the cluster construction transaction was successful, a node is created from the node information.

In some optional implementations of this embodiment, the apparatus further includes: a second acquisition unit configured to acquire capacity expansion and contraction conditions associated with the target cluster; a capacity expansion and contraction unit configured to perform, in response to detecting that the capacity expansion and contraction conditions are satisfied, the capacity expansion and contraction operation corresponding to those conditions; a second determining unit configured to determine a service type provided by the target cluster; and a second configuration unit configured to configure, according to the service type, the target cluster on which the capacity expansion and contraction operation has been performed.

In some optional implementations of this embodiment, the second configuration unit is further configured to: configure components matching the service type for the resources added to the target cluster by the capacity expansion operation.

In some optional implementations of the present embodiment, the expansion and contraction unit is further configured to: in response to detecting that the expansion and contraction conditions are met, outputting transaction information of expansion and contraction transaction according to expansion and contraction operations corresponding to the expansion and contraction conditions; and executing the expansion and contraction operation corresponding to the expansion and contraction conditions in response to receiving the indication information representing the success of the expansion and contraction transaction.

In some optional implementations of this embodiment, the apparatus further includes: a third acquisition unit configured to acquire status information of the node, the status information including at least one of: processor load information, memory information, input/output information, disk information; a generation unit configured to generate a graph from the state information; and a second output unit configured to output the graph.

In some alternative implementations of the present embodiment, the components include service components; the apparatus further comprises: a fourth acquisition unit configured to acquire instruction information for a service initiated by the service component, the instruction information including at least one of: starting, stopping and restarting; and a control unit configured to control the service according to the instruction information.

According to embodiments of the present application, an electronic device and a readable storage medium are also provided.

Fig. 7 is a block diagram of an electronic device for managing a cluster according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.

As shown in fig. 7, the electronic device includes: one or more processors 701, memory 702, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 701 is illustrated in fig. 7.

Memory 702 is a non-transitory computer-readable storage medium provided herein. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the methods for managing clusters provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the methods for managing clusters provided herein.

The memory 702, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for managing a cluster in the embodiments of the present application (e.g., the first acquisition unit 601, the first determination unit 602, the creation unit 603, and the first configuration unit 604 shown in fig. 6). The processor 701 executes various functional applications of the server and data processing by running the non-transitory software programs, instructions, and modules stored in the memory 702, i.e., implements the method for managing clusters in the method embodiments described above.

Memory 702 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created through use of the electronic device for managing clusters, and the like. In addition, the memory 702 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 702 may optionally include memory located remotely from processor 701, and such remote memory may be connected via a network to the electronic device for managing clusters. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The electronic device for the method of managing clusters may further comprise: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703, and the output device 704 may be connected by a bus or in other ways; connection by a bus is taken as an example in fig. 7.

The input device 703, such as a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointer stick, one or more mouse buttons, a trackball, or a joystick, may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for managing clusters. The output device 704 may include a display apparatus, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.

Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.

These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

According to the technical scheme of the embodiment of the application, the cluster construction efficiency is improved.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.

The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims

1. A method for managing a cluster, comprising:

acquiring cluster information of a target cluster through a user interface;

determining node information of nodes in the target cluster and component information of the nodes according to the cluster information, wherein the node information comprises node types and node numbers;

creating a node according to the node information;

and configuring the component corresponding to the component information for the node, including: establishing a directed acyclic graph according to the dependency relationship among the components corresponding to the component information; determining the parallel configuration sequence of each component according to the directed acyclic graph; and configuring the components for the nodes according to the parallel configuration sequence.

2. The method of claim 1, wherein the cluster information includes high availability option information; and

the creating a node according to the node information includes:

and in response to the high availability option information indicating that the target cluster is a high availability cluster, creating a main management node and a standby management node, wherein the standby management node replaces the main management node when the main management node fails, and the main management node and the standby management node are deployed in different machine rooms.

3. The method of claim 1, wherein the component information comprises mirrored version information; and

the configuring the component corresponding to the component information for the node includes:

and installing the system component indicated by the mirror version information for the node.

4. The method of claim 1, wherein the node type comprises at least one of: general, computing, memory, big data; and

the creating a node according to the node information includes:

and determining processor parameters and memory parameters of the node to be created according to the node type.

5. The method of claim 1, wherein the cluster information comprises cluster template information; and

the determining node information of the nodes in the target cluster and the component information of the nodes according to the cluster information includes:

and determining node information of the nodes in the target cluster and component information of the nodes according to the corresponding relation between the pre-established cluster template information, the node information and the component information.

6. The method of claim 1, wherein prior to creating a node from the node information, the method further comprises:

outputting transaction information of a cluster construction transaction according to the cluster information; and

the creating a node according to the node information includes:

in response to receiving indication information representing success of the cluster construction transaction, creating a node according to the node information.

7. The method of claim 1, the method further comprising:

obtaining expansion and contraction conditions associated with a target cluster;

in response to detecting that the capacity expansion and contraction conditions are met, executing capacity expansion and contraction operations corresponding to the capacity expansion and contraction conditions;

determining the service type provided by the target cluster;

and configuring the target cluster after the capacity expansion and contraction operation is executed according to the service type.

8. The method of claim 7, wherein the configuring the target cluster after performing the scaling operation according to the service type comprises:

and configuring components matched with the service type for the resources added for the capacity expansion operation in the target cluster.

9. The method of claim 7, wherein the performing, in response to detecting that the scaling condition is satisfied, a scaling operation corresponding to the scaling condition comprises:

in response to detecting that the expansion and contraction conditions are met, outputting transaction information of an expansion and contraction transaction according to the expansion and contraction operation corresponding to the expansion and contraction conditions;

and in response to receiving indication information representing success of the expansion and contraction transaction, executing the expansion and contraction operation corresponding to the expansion and contraction conditions.

10. The method of claim 1, the method further comprising:

acquiring state information of the node, wherein the state information comprises at least one of the following: processor load information, memory information, input/output information, disk information;

generating a chart according to the state information;

outputting the chart.

11. The method of any of claims 1-10, wherein the component comprises a service component; and

the method further comprises the steps of:

acquiring instruction information for a service initiated by the service component, wherein the instruction information comprises at least one of the following: starting, stopping and restarting;

and controlling the service according to the instruction information.

12. An apparatus for managing a cluster, comprising:

a first acquisition unit configured to acquire cluster information of a target cluster through a user interface;

a first determining unit configured to determine node information of nodes in the target cluster and component information of the nodes according to the cluster information, wherein the node information includes node types and node numbers;

a creation unit configured to create a node from the node information;

a first configuration unit configured to configure a component corresponding to the component information for the node; the first configuration unit is further configured to: establishing a directed acyclic graph according to the dependency relationship among the components corresponding to the component information; determining the parallel configuration sequence of each component according to the directed acyclic graph; and configuring the components for the nodes according to the parallel configuration sequence.

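13. The apparatus of claim 12, wherein the cluster information comprises high availability option information; and

the creation unit is further configured to:

in response to the high availability option information indicating that the target cluster is a high availability cluster, create a main management node and a standby management node, wherein the standby management node replaces the main management node when the main management node fails, and the main management node and the standby management node are deployed in different machine rooms.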

14. The apparatus of claim 12, wherein the component information comprises mirrored version information; and

the first configuration unit is further configured to:

install the system component indicated by the mirrored version information for the node.

15. The apparatus of claim 12, wherein the node type comprises at least one of: general, computing, memory, big data; and

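the creation unit is further configured to:

determine processor parameters and memory parameters of the node to be created according to the node type.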

16. The apparatus of claim 12, wherein the cluster information comprises cluster template information; and

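the first determination unit is further configured to:

determine node information of the nodes in the target cluster and component information of the nodes according to the corresponding relation between the pre-established cluster template information, the node information and the component information.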

17. The apparatus of claim 12, wherein the apparatus further comprises:

a first output unit configured to: outputting transaction information of a cluster construction transaction according to the cluster information; and

the creation unit is further configured to: in response to receiving indication information representing success of the cluster construction transaction, create a node according to the node information.

18. The apparatus of claim 12, the apparatus further comprising:

a second acquisition unit configured to acquire the expansion and contraction conditions associated with the target cluster;

a capacity expansion and contraction unit configured to execute capacity expansion and contraction operation corresponding to the capacity expansion and contraction condition in response to detection that the capacity expansion and contraction condition is satisfied;

a second determining unit configured to determine a type of service provided by the target cluster;

and a second configuration unit configured to configure the target cluster after the expansion and contraction operation is executed according to the service type.

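19. The apparatus of claim 18, wherein the second configuration unit is further configured to:

configure components matched with the service type for the resources added for the capacity expansion operation in the target cluster.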

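20. The apparatus of claim 18, wherein the capacity expansion and contraction unit is further configured to:

in response to detecting that the expansion and contraction conditions are met, output transaction information of an expansion and contraction transaction according to the expansion and contraction operation corresponding to the expansion and contraction conditions;

and in response to receiving indication information representing success of the expansion and contraction transaction, execute the expansion and contraction operation corresponding to the expansion and contraction conditions.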

21. The apparatus of claim 12, the apparatus further comprising:

a third acquisition unit configured to acquire status information of the node, the status information including at least one of: processor load information, memory information, input/output information, disk information;

a generation unit configured to generate a graph from the state information;

and a second output unit configured to output the graph.

22. The apparatus of any of claims 12-21, wherein the component comprises a service component; and

the apparatus further comprises:

a fourth acquisition unit configured to acquire instruction information for a service initiated by the service component, the instruction information including at least one of: starting, stopping and restarting;

and a control unit configured to control the service according to the instruction information.

23. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-11.

24. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-11.