CN112235383A - Container service cluster node scheduling method and device, server and storage medium - Google Patents

Info

Publication number
CN112235383A
Authority
CN
China
Prior art keywords
node
nodes
container service
cluster
service cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011073137.1A
Other languages
Chinese (zh)
Other versions
CN112235383B (en)
Inventor
杨豪
焦劼
崔皓
于东超
朱文乐
阮厦城
杨晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202011073137.1A
Publication of CN112235383A
Application granted
Publication of CN112235383B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1074 Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application disclose a container service cluster node scheduling method and device, a server, and a storage medium, which can be applied in the field of cloud technology, for example cloud education. The method comprises the following steps: acquiring the cluster load resource occupancy of a target container service cluster; if the cluster load resource occupancy is greater than or equal to a load resource occupancy threshold, determining the number of nodes to be expanded according to the number of nodes of the target container service cluster and a node number threshold; acquiring node attribute determination information, and determining the nodes to be expanded according to the node attribute determination information and the number of nodes to be expanded; and expanding the target container service cluster based on the nodes to be expanded. The method can be executed by a cloud server and can shorten the capacity expansion time of the container service cluster.

Description

Container service cluster node scheduling method and device, server and storage medium
Technical Field
The present application relates to the field of network technologies, and in particular, to a method and an apparatus for scheduling container service cluster nodes, a server, and a storage medium.
Background
In the prior art, when a container instance in a container service cluster cannot be scheduled for lack of available resources, the node scheduling system automatically expands capacity by purchasing nodes. In this mode, the user usually waits more than six minutes for the node to be added and the container image to be pulled. Expanding the container service cluster therefore takes too long, which harms the user experience.
Disclosure of Invention
The embodiment of the application provides a method and a device for scheduling nodes of a container service cluster, a server and a storage medium, which can shorten the capacity expansion time of the container service cluster.
An aspect of the present application provides a method for scheduling a container service cluster node, including:
acquiring the cluster load resource occupation amount of a target container service cluster;
if the cluster load resource occupation is greater than or equal to the load resource occupation threshold, determining the number of nodes to be expanded according to the node number of the target container service cluster and the node number threshold;
acquiring node attribute determination information, and determining a node to be expanded according to the node attribute determination information and the number of the nodes to be expanded;
and expanding the capacity of the target container service cluster based on the node to be expanded.
With reference to the first aspect, in a possible implementation manner, the determining, according to the number of nodes of the target container service cluster and the threshold of the number of nodes, the number of nodes to be expanded includes:
if the number of the nodes is smaller than or equal to the threshold value of the number of the nodes, determining the number of the nodes to be expanded based on the target node expansion multiple and the number of the nodes;
and if the number of the nodes is larger than the threshold value of the number of the nodes, determining the threshold value of the number of the nodes as the number of the nodes to be expanded.
With reference to the first aspect, in a possible implementation manner, the node attribute determination information includes a shutdown node number and/or capacity expansion node routing information of the target container service cluster;
the determining a node to be expanded according to the node attribute determination information and the number of the nodes to be expanded includes:
if the number of the nodes to be expanded is less than or equal to the number of the shutdown nodes, determining M shutdown nodes as the nodes to be expanded from the shutdown nodes included in the target container service cluster, wherein M is equal to the number of the nodes to be expanded;
if the number of the nodes to be expanded is greater than the number of the shutdown nodes, acquiring K1 scheduling nodes based on the routing information of the expansion nodes, and determining the K1 scheduling nodes and all the shutdown nodes included in the target container service cluster as the nodes to be expanded, wherein K1 is equal to the difference between the number of the nodes to be expanded and the number of the shutdown nodes.
With reference to the first aspect, in a possible implementation manner, the capacity expansion node routing information includes subnet information;
the obtaining K1 scheduling nodes based on the capacity expansion node routing information includes:
determining a difference value K1 between the number of the nodes to be expanded and the number of the shutdown nodes;
and acquiring scheduling node configuration information from a node configuration information acquisition path indicated by the subnet information, and acquiring K1 scheduling nodes according to the scheduling node configuration information.
With reference to the first aspect, in a possible implementation manner, the capacity expansion node routing information includes a cluster identifier of a container service cluster;
the obtaining K1 scheduling nodes based on the capacity expansion node routing information includes:
determining a difference value K1 between the number of the nodes to be expanded and the number of the shutdown nodes;
and scheduling K1 shutdown nodes as scheduling nodes from the alternative container service cluster identified by the cluster identification.
With reference to the first aspect, in a possible implementation manner, the node to be expanded is a shutdown node;
the expanding the target container service cluster based on the node to be expanded includes:
and sending a first trigger instruction to the shutdown node according to the node identifier of the shutdown node, triggering the shutdown node to start and canceling isolation so as to realize the expansion of the target container service cluster.
With reference to the first aspect, in a possible implementation manner, the node to be expanded is a scheduling node;
the expanding the target container service cluster based on the node to be expanded includes:
and sending a second trigger instruction to the scheduling node according to the node identifier of the scheduling node, triggering the scheduling node to start and join the target container service cluster, so as to realize the expansion of the target container service cluster.
With reference to the first aspect, in one possible implementation, the method further includes:
if the cluster load resource occupation amount is smaller than the load resource occupation amount threshold value, acquiring a cluster state value of the target container service cluster;
determining the number of nodes to be reduced according to the cluster state value and the cluster load resource occupation amount;
acquiring the load resource occupation amount of each node in the target container service cluster, and determining N nodes from the target container service cluster as nodes to be subjected to capacity reduction according to the load resource occupation amount of each node, wherein N is equal to the number of the nodes to be subjected to capacity reduction;
and carrying out capacity reduction on the target container service cluster based on the nodes to be subjected to capacity reduction.
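The selection of the N nodes to be scaled down can be illustrated with a short sketch. The text only says that the choice depends on each node's load resource occupancy; picking the least-loaded nodes first is an assumption made here, and the function name `pick_nodes_to_scale_down` is hypothetical.

```python
from typing import Dict, List


def pick_nodes_to_scale_down(node_loads: Dict[str, float], n: int) -> List[str]:
    """Return up to n node identifiers, least-loaded first.

    Assumption: the source does not state the ranking rule; lowest load
    first is used here because removing idle nodes disturbs the fewest
    workloads. Ties are broken by node identifier for determinism.
    """
    ranked = sorted(node_loads, key=lambda node_id: (node_loads[node_id], node_id))
    return ranked[:max(n, 0)]


# Illustrative per-node occupancies (made-up values).
loads = {"node-a": 0.5, "node-b": 0.1, "node-c": 0.3}
print(pick_nodes_to_scale_down(loads, 2))
```

How N itself follows from the cluster state value and the cluster load resource occupancy is not specified in this passage, so the sketch takes N as an input.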
An embodiment of an aspect of the present application provides a container service cluster node scheduling apparatus, including:
the load resource acquisition module is used for acquiring the cluster load resource occupation amount of the target container service cluster;
the capacity expansion quantity determining module is used for determining the quantity of the nodes to be subjected to capacity expansion according to the node quantity of the target container service cluster and the node quantity threshold if the load resource occupation of the cluster is greater than or equal to the load resource occupation threshold;
the acquisition determining module is used for acquiring node attribute determining information and determining the nodes to be expanded according to the node attribute determining information and the number of the nodes to be expanded;
and the capacity expansion module is used for expanding the capacity of the target container service cluster based on the number of the nodes to be expanded.
With reference to the second aspect, in a possible implementation manner, the expansion quantity determining module includes:
a first capacity expansion quantity determining unit, configured to determine, if the number of nodes is less than or equal to the node quantity threshold, the number of nodes to be capacity expanded based on a target node capacity expansion multiple and the number of nodes;
and a second capacity expansion number determining unit, configured to determine the node number threshold as the number of nodes to be subjected to capacity expansion if the number of nodes is greater than the node number threshold.
With reference to the second aspect, in a possible implementation manner, the node attribute determination information includes the number of shutdown nodes and/or capacity expansion node routing information of the target container service cluster;
the above acquisition determining module includes:
a first node-to-be-expanded determining unit, configured to determine, if the number of the nodes to be expanded is less than or equal to the number of the shutdown nodes, M shutdown nodes as nodes to be expanded from the shutdown nodes included in the target container service cluster, where M is equal to the number of the nodes to be expanded;
and a second capacity expansion node determining unit, configured to, if the number of the nodes to be expanded is greater than the number of the shutdown nodes, obtain K1 scheduling nodes based on the capacity expansion node routing information, and determine the K1 scheduling nodes and all the shutdown nodes included in the target container service cluster as the nodes to be expanded, where K1 is equal to the difference between the number of the nodes to be expanded and the number of the shutdown nodes.
With reference to the second aspect, in a possible implementation manner, the capacity expansion node routing information includes subnet information;
the second capacity expansion node determining unit is specifically configured to:
determining a difference value K1 between the number of the nodes to be expanded and the number of the shutdown nodes;
and acquiring scheduling node configuration information from a node configuration information acquisition path indicated by the subnet information, and acquiring K1 scheduling nodes according to the scheduling node configuration information.
With reference to the second aspect, in a possible implementation manner, the capacity expansion node routing information includes a cluster identifier of the container service cluster;
the second capacity expansion node determining unit is specifically configured to:
determining a difference value K1 between the number of the nodes to be expanded and the number of the shutdown nodes;
and scheduling K1 shutdown nodes as scheduling nodes from the alternative container service cluster identified by the cluster identification.
With reference to the second aspect, in a possible implementation manner, the node to be expanded is a shutdown node;
the capacity expansion module includes:
and the starting-up capacity expansion unit is used for sending a first trigger instruction to the shutdown node according to the node identifier of the shutdown node, triggering the shutdown node to start up and canceling isolation so as to realize the capacity expansion of the target container service cluster.
With reference to the second aspect, in a possible implementation manner, the node to be expanded is a scheduling node;
the capacity expansion module includes:
and the scheduling capacity expansion unit is used for sending a second trigger instruction to the scheduling node according to the node identifier of the scheduling node, triggering the scheduling node to start and join the target container service cluster, so as to realize the capacity expansion of the target container service cluster.
With reference to the second aspect, in a possible implementation manner, the apparatus further includes a capacity reduction module.
The capacity reduction module is used for:
if the cluster load resource occupation amount is smaller than the load resource occupation amount threshold value, acquiring a cluster state value of the target container service cluster;
determining the number of nodes to be reduced according to the cluster state value and the cluster load resource occupation amount;
acquiring the load resource occupation amount of each node in the target container service cluster, and determining N nodes from the target container service cluster as nodes to be subjected to capacity reduction according to the load resource occupation amount of each node, wherein N is equal to the number of the nodes to be subjected to capacity reduction;
and carrying out capacity reduction on the target container service cluster based on the nodes to be subjected to capacity reduction.
An aspect of an embodiment of the present application provides a server, including a processor, a memory, and a transceiver that are connected to one another, wherein the memory is used for storing a computer program that supports the container service cluster node scheduling device in executing the container service cluster node scheduling method, and the computer program comprises program instructions; the processor is configured to invoke the program instructions to execute the container service cluster node scheduling method as described in the first aspect of the present application.
An aspect of an embodiment of the present application provides a computer storage medium, where a computer program is stored in the computer storage medium, and the computer program includes program instructions; the program instructions, when executed by a processor, cause the processor to perform a container service cluster node scheduling method as described above in an aspect of an embodiment of the present application.
In the embodiment of the application, if the cluster load resource occupancy of the target container service cluster is greater than or equal to the load resource occupancy threshold, the number of nodes to be expanded can be determined according to the number of nodes of the target container service cluster and the node number threshold, and the nodes to be expanded can be determined according to the node attribute determination information and the number of nodes to be expanded, so that the target container service cluster can be expanded based on the nodes to be expanded, a certain amount of idle available resources can be guaranteed to be reserved in the target container cluster all the time, and the expansion duration of the container service cluster is shortened.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic architecture diagram of a cluster node scheduling system provided in the present application;
Fig. 2 is a schematic flow chart of a container service cluster node scheduling method according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of expanding a target container service cluster based on a shutdown node according to the present application;
Fig. 4 is another schematic flow chart of a container service cluster node scheduling method provided in an embodiment of the present application;
Fig. 5 is a schematic flow chart of acquiring K1 scheduling nodes based on subnet information according to the present application;
Fig. 6 is a schematic flow chart of expanding the target container service cluster based on the node to be expanded according to the present application;
Fig. 7 is another schematic flow chart of expanding the target container service cluster based on the node to be expanded according to the present application;
Fig. 8 is a schematic flow chart of performing capacity reduction on a target container service cluster based on a node to be capacity reduced according to the present application;
Fig. 9 is a schematic flow chart of releasing a target container service cluster based on a shutdown node according to the present application;
Fig. 10 is a schematic flow chart of self-healing of a target container service cluster as provided herein;
Fig. 11 is a schematic structural diagram of a container service cluster node scheduling apparatus according to an embodiment of the present application;
Fig. 12 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The method for dispatching the container service cluster nodes can determine the number of nodes to be expanded according to the number of the nodes of the target container service cluster and the number threshold of the nodes under the condition that the cluster load resource occupation of the target container service cluster is larger than or equal to the load resource occupation threshold, determine the nodes to be expanded according to the node attribute determination information and the number of the nodes to be expanded, further expand the target container service cluster based on the nodes to be expanded, guarantee that certain idle available resources are reserved in the target container cluster all the time, and shorten the expansion time of the container service cluster.
The container service cluster node scheduling method provided by the present application is applicable to a cluster node scheduling system that includes a node scheduling platform, a database, and a target container service cluster. Please refer to fig. 1, which is an architecture diagram of the cluster node scheduling system provided by the present application. As shown in fig. 1, the system includes a node scheduling platform 100, a database 101, and a target container service cluster 102, where the target container service cluster 102 may include a plurality of nodes, specifically a node 102a, a node 102b, a node 102c, …, and a node 102n.
The node scheduling platform 100 and each node in the target container service cluster 102 may be, but are not limited to, a terminal or a server. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited herein. In short, the database 101 can be regarded as an electronic filing cabinet, that is, a place for storing electronic files, in which a user can add, query, update, and delete data. A "database" is a collection of data that is stored together, can be shared by multiple users, has as little redundancy as possible, and is independent of the application. The present application is described taking the database 101 as a Redis database.
In the container service cluster node scheduling method provided by the present application, the node scheduling platform 100 obtains the cluster load resource occupancy of the target container service cluster by accessing the target container service cluster 102, determines the number of nodes to be expanded according to the node number of the target container service cluster and the node number threshold when the cluster load resource occupancy is greater than or equal to the load resource occupancy threshold, obtains the node attribute determination information from the database 101, determines the nodes to be expanded according to the node attribute determination information and the number of nodes to be expanded, and then expands the target container service cluster 102 based on the nodes to be expanded.
Please refer to fig. 2, which is a flowchart illustrating a container service cluster node scheduling method according to an embodiment of the present application. As shown in fig. 2, the method provided by the present application may include the following steps:
s101, acquiring the cluster load resource occupation amount of the target container service cluster.
In some possible embodiments, the node scheduling platform may send a request for the cluster load resource occupancy to the target container service cluster at a preset frequency (e.g., once every three seconds). After receiving the request, the target container service cluster acquires the load resource occupancy of each of its nodes, calculates the cluster load resource occupancy from these per-node values, and returns the result to the node scheduling platform.
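The text does not say how the per-node values are combined into a cluster-level value. As a minimal sketch, assume that each node reports a used amount and a capacity in the same unit, and that the cluster occupancy is total used divided by total capacity. Both points are assumptions, as are the names `NodeLoad` and `cluster_load_occupancy`.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class NodeLoad:
    """Hypothetical per-node report; field names are illustrative only."""
    node_id: str
    used: float      # resources currently occupied on the node
    capacity: float  # total resources the node offers


def cluster_load_occupancy(loads: Iterable[NodeLoad]) -> float:
    """Return total used / total capacity, or 0.0 when there is no capacity.

    Assumption: the aggregation rule is not given in the source text.
    """
    loads = list(loads)
    total_capacity = sum(node.capacity for node in loads)
    if total_capacity == 0:
        return 0.0
    return sum(node.used for node in loads) / total_capacity


reports = [NodeLoad("node-a", 2, 4), NodeLoad("node-b", 1, 4)]
print(cluster_load_occupancy(reports))
```

A platform polling every three seconds would call such a function on each response and compare the result with the load resource occupancy threshold in step S102.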
S102, if the load resource occupation of the cluster is larger than or equal to the load resource occupation threshold, determining the number of the nodes to be expanded according to the number of the nodes of the target container service cluster and the node number threshold.
In some possible embodiments, if the cluster load resource occupancy is greater than or equal to the load resource occupancy threshold, the number of nodes of the target container service cluster is further compared with the node number threshold: if the number of nodes is less than or equal to the node number threshold, the number of nodes to be expanded is determined based on the target node expansion multiple and the number of nodes; if the number of nodes is greater than the node number threshold, the node number threshold is determined as the number of nodes to be expanded. The node number threshold is determined based on the success rate of purchasing nodes. The target node expansion multiple can be any positive number; in a cloud education scenario it is generally 1, because the user scale is basically fixed. Cloud Computing Education (CCEDU) refers to education platform services based on the cloud computing business model. On the cloud platform, education institutions, training institutions, enrollment service institutions, publicity institutions, industry associations, management institutions, industry media, legal institutions, and the like are pooled into a resource pool in the cloud; the resources are presented to and interact with one another and are matched on demand, which reduces education cost and improves efficiency.
For example, assuming that the number of nodes of the target container service cluster is 10, the node number threshold is 30, and the target node expansion multiple is 1, the product of the number of nodes (10) and the target node expansion multiple (1), namely 10, is determined as the number of nodes to be expanded.
For another example, assuming that the number of nodes of the target container service cluster is 40 and the threshold of the number of nodes is 30, the threshold of the number of nodes 30 is determined as the number of nodes to be expanded.
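The rule in step S102 can be written as a small function. This is a sketch under stated assumptions: the function name `nodes_to_expand_count` is hypothetical, and rounding a fractional product up to a whole node is an assumption, since the text allows any positive multiple but does not say how non-integer results are handled.

```python
import math


def nodes_to_expand_count(node_count: int, node_threshold: int,
                          expansion_multiple: float) -> int:
    """Number of nodes to add once the load threshold has been reached.

    At or below the node number threshold, the count is the current node
    count times the target node expansion multiple; above the threshold,
    the count is the threshold itself.
    """
    if expansion_multiple <= 0:
        raise ValueError("the expansion multiple must be positive")
    if node_count <= node_threshold:
        # Assumption: round up so a fractional product still yields whole nodes.
        return math.ceil(node_count * expansion_multiple)
    return node_threshold


# Second example from the text: 40 nodes with a threshold of 30.
print(nodes_to_expand_count(40, 30, 1))
```

In the second case the batch size is bounded by the threshold, which the text ties to the success rate of purchasing nodes.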
It should be noted that, to prevent the target container service cluster from wasting resources through repeated capacity expansion, the node scheduling platform may send a cluster state value update instruction carrying the number of nodes to be expanded to the Redis database after determining that number. On receiving the instruction, the Redis database updates the available amount of cluster resources in the cluster state value of the target container service cluster according to the number of nodes to be expanded.
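The bookkeeping described above might look like the following sketch. It uses an in-memory dictionary as a stand-in for the database rather than a real client, and it counts the available amount of cluster resources in whole nodes; both are assumptions, and the class and method names are hypothetical.

```python
from typing import Dict


class ClusterStateStore:
    """In-memory stand-in for the cluster state held in the database.

    Assumption: the available amount of cluster resources is tracked as a
    node count per cluster identifier; the source does not give the schema.
    """

    def __init__(self) -> None:
        self._available_nodes: Dict[str, int] = {}

    def record_expansion(self, cluster_id: str, nodes_to_expand: int) -> int:
        """Add the pending nodes to the cluster's available amount and return it."""
        updated = self._available_nodes.get(cluster_id, 0) + nodes_to_expand
        self._available_nodes[cluster_id] = updated
        return updated

    def available(self, cluster_id: str) -> int:
        """Return the recorded available amount, 0 for an unknown cluster."""
        return self._available_nodes.get(cluster_id, 0)


store = ClusterStateStore()
store.record_expansion("cluster-1", 10)
print(store.available("cluster-1"))
```

Recording the pending nodes before they come online lets a later polling cycle see that capacity is already on its way, which is one plausible reading of how the update avoids a second, redundant expansion.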
S103, obtaining node attribute determination information, and determining the nodes to be expanded according to the node attribute determination information and the number of the nodes to be expanded.
Optionally, the node attribute determination information may include the number of shutdown nodes of the target container service cluster.
In some feasible embodiments, the node scheduling platform obtains the number of shutdown nodes of the target container service cluster, and if the number of nodes to be expanded is less than or equal to the number of shutdown nodes, determines M shutdown nodes as nodes to be expanded from the shutdown nodes included in the target container service cluster, where M is equal to the number of nodes to be expanded.
Specifically, a node scheduling platform sends a shutdown node information acquisition request to a redis database, where the request carries a cluster identifier of a target container service cluster, the redis database acquires shutdown node information of the target container service cluster according to the cluster identifier carried by the request after receiving the request, and returns the shutdown node information of the target container service cluster, the node scheduling platform receives shutdown node information of the target container service cluster returned by the redis database, where the shutdown node information includes the number of shutdown nodes, and when the number of shutdown nodes is greater than or equal to the number of nodes to be expanded, M shutdown nodes are selected as the nodes to be expanded from all the shutdown nodes included in the shutdown node information.
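A minimal sketch of the selection step, assuming the shutdown node information has already been fetched and reduced to a list of node identifiers. The function name is hypothetical, and taking the first M entries is an arbitrary choice, because the text does not say which M shutdown nodes are picked.

```python
from typing import List, Optional


def pick_shutdown_nodes(shutdown_node_ids: List[str],
                        expand_count: int) -> Optional[List[str]]:
    """Pick M = expand_count shutdown nodes, or None if there are too few."""
    if expand_count > len(shutdown_node_ids):
        # Not covered by this embodiment: extra scheduling nodes are needed.
        return None
    # Assumption: any M shutdown nodes are acceptable, so take the first M.
    return shutdown_node_ids[:expand_count]
```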
S104, expanding the target container service cluster based on the nodes to be expanded.
Optionally, the node to be expanded may be a shutdown node.
In some feasible embodiments, the node scheduling platform sends a first trigger instruction to the shutdown node according to the node identifier of the shutdown node, and triggers the shutdown node to start up and cancel isolation, so as to implement capacity expansion of the target container service cluster.
Specifically, please refer to fig. 3, which is a flow diagram illustrating the process of expanding the target container service cluster based on the shutdown node according to the present application. As shown in fig. 3, the expanding of the target container service cluster by the node scheduling platform based on the shutdown node includes the following steps:
S1041, the node scheduling platform sends a startup instruction to the shutdown node.
Specifically, the node scheduling platform sends a startup instruction to the shutdown node according to the node identifier of the shutdown node.
S1042, the shutdown node starts up according to the startup instruction.
S1043, the shutdown node sends a node startup result to the node scheduling platform.
S1044, the node scheduling platform sends a node state query request to the target container service cluster, wherein the node state query request carries the node identifier of the shutdown node.
Specifically, the node scheduling platform receives the node startup result, and sends a shutdown node state query request to the target container service cluster when the node startup result is successful, where the request carries a node identifier of a shutdown node.
S1045, the target container service cluster acquires the node state of the shutdown node according to the node identifier of the shutdown node.
Specifically, the target container service cluster receives the request, and acquires the node state of the shutdown node according to the node identifier of the shutdown node carried by the request.
S1046, the target container service cluster sends the node state of the shutdown node to the node scheduling platform.
S1047, the node scheduling platform sends an isolation canceling instruction to the target container service cluster, wherein the isolation canceling instruction carries the node identifier of the shutdown node.
Specifically, the node scheduling platform receives the node state of the shutdown node, and sends an isolation cancellation instruction to the target container service cluster when the node state is the startup state, where the instruction carries the node identifier of the shutdown node.
S1048, the target container service cluster cancels the isolation of the shutdown node according to the node identifier of the shutdown node.
Specifically, the target container service cluster receives the instruction, and cancels the isolation of the shutdown node according to the node identifier of the shutdown node carried by the instruction.
S1049, the target container service cluster sends the node isolation cancellation result to the node scheduling platform.
S10410, the node scheduling platform determines that the capacity expansion of the shutdown node is successful if the isolation of the shutdown node is successfully cancelled.
Specifically, the node scheduling platform receives the node isolation cancellation result, and determines that the capacity expansion of the shutdown node is successful under the condition that the node isolation cancellation result is that the isolation cancellation is successful, so that the capacity expansion of the target container service cluster is realized.
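The sequence S1041 to S10410 is a chain of three checks, each gating the next. The sketch below illustrates this with an in-memory stand-in; `FakeCluster`, its methods, and the state strings are all hypothetical, and for brevity the startup call is placed on the same object even though the text sends the startup instruction to the node itself.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class FakeCluster:
    """In-memory stand-in for the shutdown node and the target cluster."""
    powered_on: Set[str] = field(default_factory=set)
    isolated: Set[str] = field(default_factory=set)

    def power_on(self, node_id: str) -> bool:          # S1041-S1043
        self.powered_on.add(node_id)
        return True

    def node_state(self, node_id: str) -> str:         # S1044-S1046
        return "running" if node_id in self.powered_on else "stopped"

    def cancel_isolation(self, node_id: str) -> bool:  # S1047-S1049
        self.isolated.discard(node_id)
        return True


def expand_with_shutdown_node(cluster: FakeCluster, node_id: str) -> bool:
    """Return True only if every stage of the flow succeeds (S10410)."""
    if not cluster.power_on(node_id):
        return False
    if cluster.node_state(node_id) != "running":
        return False
    return cluster.cancel_isolation(node_id)
```

Stopping at the first failed stage means a node that did not start is never returned to service.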
It should be noted that, in order to ensure that the actual number of capacity expansion nodes of the target container service cluster is consistent with the number of nodes to be expanded, an update instruction of the target container service cluster may be sent to the redis database after capacity expansion operation is performed on each shutdown node, where the update instruction carries the number of shutdown nodes that complete the capacity expansion operation and a node identifier, so that the redis database updates the cluster resource availability in the cluster state value of the target container service cluster according to the number of shutdown nodes that complete the capacity expansion operation and the node identifier carried by the instruction.
In the embodiment of the application, if the cluster load resource occupancy of the target container service cluster is greater than or equal to the load resource occupancy threshold, the number of nodes to be expanded can be determined according to the number of nodes of the target container service cluster and the node number threshold, and the nodes to be expanded can be determined according to the node attribute determination information and the number of nodes to be expanded, so that the target container service cluster can be expanded based on the nodes to be expanded, a certain amount of idle available resources can be guaranteed to be reserved in the target container cluster all the time, and the expansion duration of the container service cluster is shortened.
Please refer to fig. 4, which is another flowchart illustrating a container service cluster node scheduling method according to an embodiment of the present application. As shown in fig. 4, the method provided by the present application may include the following steps:
S201, acquiring the cluster load resource occupation amount of the target container service cluster.
Here, the specific implementation manner of step S201 may refer to the description of step S101 in the corresponding embodiment, and is not described herein again.
S202, judging whether the cluster load occupation amount is larger than or equal to a load resource occupation amount threshold value or not.
Specifically, if the cluster load occupation amount is greater than or equal to the load resource occupation amount threshold, step S203 is executed, otherwise, step S206 is executed.
Further, to avoid frequent capacity reduction of the target container service cluster caused by users frequently creating or deleting instances, the number of times that the cluster load occupation amount is smaller than the load resource occupation amount threshold within a preset time duration may be counted, and step S206 is executed only when that number reaches the preset number threshold.
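One way to picture this counting rule is a sliding time window over below-threshold samples. The class below is a hypothetical sketch, not the patented implementation: the names, the use of timestamps in seconds, and the choice to reset the counter after triggering are all assumptions.

```python
from collections import deque
from typing import Deque


class ScaleInGate:
    """Allow capacity reduction only after repeated low-load observations."""

    def __init__(self, window_seconds: float, count_threshold: int) -> None:
        self.window_seconds = window_seconds
        self.count_threshold = count_threshold
        self._below: Deque[float] = deque()

    def observe(self, now: float, load: float, load_threshold: float) -> bool:
        """Record one sample; return True when step S206 should run."""
        if load >= load_threshold:
            return False  # expansion branch (S203), not handled here
        self._below.append(now)
        # Drop samples that fell out of the preset time window.
        while self._below and now - self._below[0] > self.window_seconds:
            self._below.popleft()
        if len(self._below) >= self.count_threshold:
            self._below.clear()  # assumption: start counting afresh afterwards
            return True
        return False
```

A single dip in load, for example when a user deletes one instance, is therefore not enough to trigger capacity reduction.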
S203, determining the number of nodes to be expanded according to the number of the nodes of the target container service cluster and the threshold value of the number of the nodes.
Here, the specific implementation manner of step S203 may refer to the description of step S102 in the corresponding embodiment, and is not described herein again. Step S204 is then performed.
S204, acquiring node attribute determination information, and determining the nodes to be expanded according to the node attribute determination information and the number of the nodes to be expanded.
Optionally, the node attribute determination information may include the number of shutdown nodes and capacity expansion node routing information of the target container service cluster.
In some feasible embodiments, the node scheduling platform obtains the number of shutdown nodes and the routing information of capacity expansion nodes of the target container service cluster, and if the number of nodes to be expanded is greater than the number of shutdown nodes, obtains K1 scheduling nodes based on the routing information of capacity expansion nodes, and determines K1 scheduling nodes and all shutdown nodes included in the target container service cluster as nodes to be expanded, where K1 is equal to a difference value between the number of nodes to be expanded and the number of shutdown nodes.
In some possible implementations, the capacity expansion node routing information may include subnet information.
Here, please refer to step S103 in the embodiment shown in fig. 2 for a specific implementation manner of the node scheduling platform obtaining the number of shutdown nodes of the target container service cluster, which is not described herein again.
Please refer to fig. 5, which is a flowchart illustrating a process of acquiring K1 scheduling nodes based on subnet information according to the present application. As shown in fig. 5, the node scheduling platform acquiring K1 scheduling nodes based on the subnet information includes the following steps:
S2041, the node scheduling platform sends a network identifier acquisition request to the cluster management platform.
Specifically, the node scheduling platform sends a network identifier acquisition request to the cluster management platform, where the request carries the cluster identifier of the target container service cluster.
S2042, the cluster management platform obtains the network identifier of the private network where the target container service cluster is located according to the network identifier obtaining request.
Specifically, the cluster management platform acquires the network identifier of the private network where the target container service cluster is located according to the network identifier acquisition request.
S2043, the cluster management platform sends the network identifier of the private network where the target container service cluster is located to the node scheduling platform.
S2044, the node scheduling platform sends a subnet information acquisition request to the private network.
Specifically, the node scheduling platform sends a subnet information acquisition request to the private network, where the subnet information acquisition request carries the network identifier of the private network where the target container service cluster is located.
S2045, the private network obtains the subnet information according to the subnet information obtaining request.
S2046, the private network sends the subnet information to the node scheduling platform.
S2047, the node scheduling platform sends a node configuration information acquisition request to the universal cloud server.
Specifically, the node scheduling platform sends a node configuration information acquisition request to the universal cloud server, where the request carries subnet information.
S2048, the universal cloud server acquires the scheduling node configuration information according to the node configuration information acquisition request.
Specifically, the universal cloud server receives the request, and acquires the scheduling node configuration information according to the availability zone, indicated by the subnet information carried by the request, in which nodes can be purchased.
S2049, the universal cloud server sends scheduling node configuration information to the node scheduling platform.
S20410, the node scheduling platform sends a node purchasing instruction to the cluster management platform.
Specifically, the node scheduling platform receives the scheduling node configuration information and sends a node purchasing instruction to the cluster management platform, where the instruction carries the scheduling node configuration information.
S20411, the cluster management platform sends a node purchasing instruction to the selling node.
Specifically, the cluster management platform sends a node purchase instruction to K1 selling nodes with configuration information consistent with the configuration information of the scheduling node.
S20412, the selling node executes the purchasing operation according to the node purchasing instruction.
S20413, the selling node sends the purchase result to the cluster management platform.
S20414, the cluster management platform sends the purchase result to the node scheduling platform.
S20415, if the purchase result is that the purchase is successful, determining the K1 successfully purchased nodes as the K1 scheduling nodes.
Then, the node scheduling platform takes the K1 scheduling nodes and all shutdown nodes contained in the shutdown node information as the nodes to be expanded. In the present application, the cluster management platform may be Tencent Cloud Container Service (TKE).
In some possible implementations, the capacity expansion node routing information may include a cluster identification of the container service cluster.
Specifically, the node scheduling platform sends a shutdown node information acquisition request carrying the cluster identifier of the target container service cluster to the redis database. After receiving the request, the redis database acquires the shutdown node information of the target container service cluster according to the cluster identifier and returns it. Meanwhile, the node scheduling platform sends an alternative cluster identifier set acquisition request to the cluster management platform, where the request carries the network identifier of the private network where the target container service cluster is located. After receiving the request, the cluster management platform acquires the cluster identifiers of the container service clusters whose network identifier is consistent with the network identifier of that private network, generates an alternative cluster identifier set from these cluster identifiers, and returns the set. The node scheduling platform receives the alternative cluster identifier set, selects the cluster identifier of one container service cluster from it, and sends a shutdown node information acquisition request carrying that cluster identifier to the redis database. After receiving the request, the redis database acquires the shutdown node information of the alternative container service cluster identified by that cluster identifier and returns it.
Then, the node scheduling platform receives the shutdown node information of the target container service cluster and the shutdown node information of the alternative container service cluster, where the shutdown node information of the target container service cluster includes the number of shutdown nodes. When the number of shutdown nodes is smaller than the number of nodes to be expanded, the difference K1 between the number of nodes to be expanded and the number of shutdown nodes is determined, K1 shutdown nodes are selected from the shutdown nodes contained in the shutdown node information of the alternative container service cluster as scheduling nodes, and the K1 scheduling nodes together with all the shutdown nodes contained in the shutdown node information of the target container service cluster are taken as the nodes to be expanded.
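The split between the target cluster's own shutdown nodes and the K1 scheduling nodes taken from an alternative cluster can be sketched as follows. The function name and the list-based inputs are hypothetical; the behaviour when the alternative cluster holds fewer than K1 shutdown nodes is an assumption, as the text does not cover that case.

```python
from typing import List, Tuple


def plan_expansion(target_shutdown: List[str], alternative_shutdown: List[str],
                   expand_count: int) -> Tuple[List[str], List[str]]:
    """Return (target shutdown nodes to reuse, K1 scheduling nodes to borrow)."""
    if expand_count <= len(target_shutdown):
        return target_shutdown[:expand_count], []
    k1 = expand_count - len(target_shutdown)
    # Assumption: if fewer than K1 shutdown nodes exist in the alternative
    # cluster, take what is available.
    return list(target_shutdown), alternative_shutdown[:k1]
```

For instance, with 2 shutdown nodes in the target cluster and 4 nodes to be expanded, K1 is 2 and two shutdown nodes of the alternative cluster become scheduling nodes.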
Step S205 is then performed.
S205, expanding the capacity of the target container service cluster based on the node to be expanded.
In some feasible embodiments, the node scheduling platform sends a first trigger instruction to all shutdown nodes in the target container service cluster to trigger them to start up and cancel isolation, and sends a second trigger instruction to the K1 scheduling nodes according to their node identifiers to trigger them to start up and join the target container service cluster, thereby expanding the target container service cluster. The scheduling nodes here are nodes acquired by purchase.
Specifically, please refer to fig. 6, which is a flow diagram illustrating a process of expanding a target container service cluster based on a node to be expanded. As shown in fig. 6, the expanding the target container service cluster by the node scheduling platform based on the node to be expanded includes the following steps:
S2051, the node scheduling platform sends a startup instruction to the shutdown node.
S2052, the shutdown node starts up according to the startup instruction.
S2053, the shutdown node sends a node startup result to the node scheduling platform.
S2054, the node scheduling platform sends a node state query request to the target container service cluster, wherein the node state query request carries the node identifier of the shutdown node.
S2055, the target container service cluster acquires the node state of the shutdown node according to the node identifier of the shutdown node.
S2056, the target container service cluster sends the node state of the shutdown node to the node scheduling platform.
S2057, the node scheduling platform sends an isolation canceling instruction to the target container service cluster, wherein the isolation canceling instruction carries the node identifier of the shutdown node.
S2058, the target container service cluster cancels the isolation of the shutdown node according to the node identifier of the shutdown node.
S2059, the target container service cluster sends the node isolation cancellation result to the node scheduling platform.
S20510, the node scheduling platform determines that the capacity expansion of the shutdown node is successful if the isolation of the shutdown node is successfully cancelled.
Here, the specific implementation manner of steps S2051 to S20510 may refer to steps S1041 to S10410 in the embodiment corresponding to fig. 3, and details are not described here.
S20511, the node scheduling platform sends a starting instruction to the scheduling node.
Specifically, the node scheduling platform sends a startup instruction to the scheduling node according to the node identifier of the scheduling node.
S20512, the scheduling node starts up according to the startup instruction.
S20513, the scheduling node sends the node boot result to the node scheduling platform.
S20514, the node scheduling platform sends a node state obtaining request to the target container service cluster.
Specifically, the node scheduling platform receives the boot result, and sends a node state acquisition request to the target container service cluster when the boot result indicates that the boot is successful, where the request carries a node identifier of the scheduling node.
S20515, the target container service cluster acquires the node state of the scheduling node according to the node state acquisition request.
Specifically, the target container service cluster receives the node state acquisition request, and acquires the node state of the scheduling node according to the node identifier of the scheduling node carried by the node state acquisition request.
S20516, the target container service cluster sends the node state of the scheduling node to the node scheduling platform.
S20517, if the node state of the scheduling node indicates that it has successfully joined the cluster, determining that the capacity expansion of the scheduling node is successful.
Specifically, the node scheduling platform receives the node state of the scheduling node, and determines that the capacity expansion of the scheduling node is successful if the node state indicates that the scheduling node has successfully joined the cluster.
In some feasible embodiments, the node scheduling platform sends a first trigger instruction to all shutdown nodes in the target container service cluster to trigger them to start up and cancel isolation, and sends a second trigger instruction to the cluster management platform to trigger the K1 scheduling nodes to be released from the alternative container service cluster, join the target container service cluster, and start up, thereby expanding the target container service cluster. The scheduling nodes here are nodes obtained by scheduling from an alternative container service cluster.
Specifically, please refer to fig. 7, which is another schematic flow chart illustrating the expansion of the target container service cluster based on the node to be expanded according to the present application. As shown in fig. 7, the expanding the target container service cluster by the node scheduling platform based on the node to be expanded includes the following steps:
S2061, the node scheduling platform sends a startup instruction to the shutdown node.
S2062, the shutdown node starts up according to the startup instruction.
S2063, the shutdown node sends a node startup result to the node scheduling platform.
S2064, the node scheduling platform sends a node state query request to the target container service cluster, wherein the node state query request carries the node identifier of the shutdown node.
S2065, the target container service cluster acquires the node state of the shutdown node according to the node identifier of the shutdown node.
S2066, the target container service cluster sends the node state of the shutdown node to the node scheduling platform.
S2067, the node scheduling platform sends an isolation canceling instruction to the target container service cluster, wherein the isolation canceling instruction carries the node identifier of the shutdown node.
S2068, the target container service cluster cancels isolation to the shutdown node according to the node identification of the shutdown node.
S2069, the target container service cluster sends the node isolation cancellation result to the node scheduling platform.
S20610, the node scheduling platform determines that the capacity expansion of the shutdown node is successful if the isolation of the shutdown node is successfully cancelled.
Here, the specific implementation manner of steps S2061 to S20610 may refer to steps S1041 to S10410 in the embodiment corresponding to fig. 3, and details are not described here.
S20611, the node scheduling platform sends a capacity expansion request to the cluster management platform.
Specifically, the node scheduling platform sends a capacity expansion request to the cluster management platform, where the request carries the cluster identifier of the alternative container service cluster, the cluster identifier of the target container service cluster, and the node identifiers of the K1 scheduling nodes.
S20612, the cluster management platform removes the scheduling node from the alternative container service cluster according to the capacity expansion request, and adds the scheduling node into the target container service cluster.
Specifically, the cluster management platform removes the K1 scheduling nodes from the alternative container service cluster according to the capacity expansion request, and adds the K1 scheduling nodes to the target container service cluster.
S20613, the cluster management platform sends the capacity expansion result to the node scheduling platform.
S20614, the node scheduling platform determines that the capacity expansion of the scheduling node is successful under the condition that the capacity expansion result is successful.
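Steps S20611 to S20614 amount to moving K1 nodes from one cluster's membership to another's. The sketch below models cluster membership as plain sets keyed by cluster identifier; this representation, the function name, and the all-or-nothing failure behaviour are assumptions made for illustration.

```python
from typing import Dict, List, Set


def move_nodes(clusters: Dict[str, Set[str]], source_id: str, target_id: str,
               node_ids: List[str]) -> bool:
    """Remove node_ids from the alternative cluster and add them to the target."""
    source, target = clusters[source_id], clusters[target_id]
    if not set(node_ids) <= source:
        # Assumption: report failure without changing either cluster.
        return False
    source.difference_update(node_ids)
    target.update(node_ids)
    return True
```

The boolean return value plays the role of the capacity expansion result sent back in S20613.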
S206, obtaining the cluster state value of the target container service cluster, and determining the number of the nodes to be reduced according to the cluster state value and the cluster load resource occupation amount.
Specifically, the node scheduling platform sends a cluster state value acquisition request carrying the cluster identifier of the target container service cluster to the redis database. The redis database receives the request, acquires the cluster state value of the target container service cluster according to the cluster identifier, and returns it. The node scheduling platform receives the cluster state value, which includes the cluster resource availability and the cluster resource quantity to be determined, and calculates the difference between the two to obtain the total amount of idle cluster resources A. The total amount of resources to be reduced is then A1 = A/2, and the quotient of A1 and the preset total amount of resources per node a is determined as the number of nodes to be reduced.
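The arithmetic in this step can be written out directly: take the idle total as availability minus the quantity to be determined, halve it, and divide by the preset resources of one node. The function below is a hypothetical sketch; rounding the quotient down and returning 0 when nothing is idle are assumptions, since the text only speaks of a quotient.

```python
def nodes_to_reduce(available: float, to_be_determined: float,
                    per_node_resources: float) -> int:
    """Number of nodes to remove so that half of the idle resources remain."""
    idle_total = available - to_be_determined   # total idle cluster resources
    to_release = idle_total / 2                 # half of the idle total
    if to_release <= 0:
        return 0  # assumption: nothing idle means nothing to reduce
    # Assumption: round the quotient down to a whole number of nodes.
    return int(to_release // per_node_resources)
```

For example, with an availability of 100 units, 20 units to be determined, and 8 units per node, the idle total is 80, half of it is 40, and 40 divided by 8 gives 5 nodes.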
It should be noted that, in order to avoid a situation of repeated capacity reduction of the target container service cluster in the actual capacity reduction process, after determining the number of nodes to be subjected to capacity reduction, the node scheduling platform may send a cluster state value update instruction to the redis database, where the instruction carries the number of nodes to be subjected to capacity reduction, and the redis database receives the instruction and updates the available amount of cluster resources in the cluster state value of the target container service cluster according to the number of nodes to be subjected to capacity reduction carried by the instruction.
Step S207 is then performed.
S207, acquiring the load resource occupation amount of each node in the target container service cluster, and determining N nodes from the target container service cluster as nodes to be subjected to capacity reduction according to the load resource occupation amount of each node, wherein N is equal to the number of the nodes to be subjected to capacity reduction.
Specifically, the node scheduling platform sends a node load resource occupation acquisition request to the target container service cluster, the target container service cluster receives the node load resource occupation acquisition request and returns the node load resource occupation of each node in the target container service cluster according to the request, the node scheduling platform receives the node load resource occupation of each node in the target container service cluster, and selects N nodes with the node load resource occupation equal to 0 from the target container service cluster as the nodes to be contracted according to the node load resource occupation of each node.
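Selecting the nodes to be reduced is a filter on per-node load followed by taking N entries. In this sketch the mapping from node identifier to load and the function name are hypothetical, and returning fewer than N nodes when not enough are idle is an assumption.

```python
from typing import Dict, List


def pick_idle_nodes(node_loads: Dict[str, float], n: int) -> List[str]:
    """Return up to n node ids whose load resource occupation is exactly 0."""
    idle = [node_id for node_id, load in node_loads.items() if load == 0]
    # Assumption: if fewer than n nodes are idle, return only those.
    return idle[:n]
```

Restricting the choice to nodes with zero load means no running instance is affected by the capacity reduction.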
Step S208 is then performed.
S208, carrying out capacity reduction on the target container service cluster based on the nodes to be reduced.
Specifically, please refer to fig. 8, which is a flow diagram illustrating a process of performing capacity reduction on the target container service cluster based on the nodes to be reduced. As shown in fig. 8, the capacity reduction of the target container service cluster by the node scheduling platform based on the nodes to be reduced includes the following steps:
S2081, the node scheduling platform sends an isolation instruction to the target container service cluster.
Specifically, the node scheduling platform sends an isolation instruction to the target container service cluster, where the isolation instruction carries a node identifier of a node to be capacity reduced.
S2082, the target container service cluster executes isolation operation on the to-be-reduced-capacity node according to the isolation instruction.
Specifically, the target container service cluster receives an isolation instruction, and executes isolation operation on the to-be-reduced-capacity node according to the isolation instruction.
S2083, the target container service cluster sends the isolation result to the node scheduling platform.
S2084, the node scheduling platform sends a shutdown instruction to the node to be reduced.
Specifically, the node scheduling platform receives the isolation result, and sends a shutdown instruction to the node to be reduced according to the node identifier of the node to be reduced under the condition that the isolation result is successful.
S2085, the node to be reduced shuts down according to the shutdown instruction.
Specifically, the node to be reduced receives the shutdown instruction and executes the shutdown operation according to the shutdown instruction.
S2086, the node to be reduced sends a node shutdown result to the node scheduling platform.
S2087, the node scheduling platform confirms that the capacity reduction of the node to be reduced is successful under the condition that the shutdown result of the node is successful.
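The capacity reduction flow is the mirror image of expansion: isolate first, and shut down only if isolation succeeded. The in-memory `ScaleInCluster` class and its method names below are hypothetical stand-ins for the target cluster and the node.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ScaleInCluster:
    """In-memory stand-in for the target cluster and its nodes."""
    members: Set[str] = field(default_factory=set)
    isolated: Set[str] = field(default_factory=set)
    powered_off: Set[str] = field(default_factory=set)

    def isolate(self, node_id: str) -> bool:      # S2081-S2083
        if node_id not in self.members:
            return False
        self.isolated.add(node_id)
        return True

    def shut_down(self, node_id: str) -> bool:    # S2084-S2086
        self.powered_off.add(node_id)
        return True


def reduce_node(cluster: ScaleInCluster, node_id: str) -> bool:
    """Return True when the node is isolated and then shut down (S2087)."""
    if not cluster.isolate(node_id):
        return False  # never shut down a node that is not isolated
    return cluster.shut_down(node_id)
```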
It should be noted that, to ensure that the actual number of nodes removed from service in the target container service cluster is consistent with the number of nodes to be reduced, an update instruction for the target container service cluster may be sent to the redis database after each node to be reduced is successfully reduced. The update instruction carries the number and the node identifiers of the nodes that have completed the capacity reduction operation, so that the redis database updates the shutdown node information of the target container service cluster and the cluster resource availability in the cluster state value accordingly.
Further, when a node release trigger condition is met, the node scheduling platform releases the shutdown nodes of the target container service cluster.
Specifically, please refer to fig. 9, which is a flowchart illustrating a process of releasing the shutdown nodes of the target container service cluster according to the present application. As shown in fig. 9, the releasing, by the node scheduling platform, of the shutdown nodes of the target container service cluster includes the following steps:
S2091, the node scheduling platform sends a shutdown node information acquisition request to the database.
Specifically, under the condition that the current time is the preset node release time, the node scheduling platform sends a shutdown node information acquisition request to the redis database, where the request carries the cluster identifier of the target container service cluster.
S2092, the database acquires the shutdown node information of the target container service cluster according to the shutdown node information acquisition request.
Specifically, the redis database receives a shutdown node information acquisition request, and acquires shutdown node information of the target container service cluster according to the request.
S2093, the database sends the shutdown node information to the node scheduling platform.
S2094, the node scheduling platform sends a node removing instruction to the cluster management platform.
Specifically, the node scheduling platform receives shutdown node information of the target container service cluster, where the shutdown node information includes a node identifier of a shutdown node, and sends a node removal instruction to the cluster management platform, where the node removal instruction carries the node identifier of the shutdown node and the cluster identifier of the target container service cluster.
S2095, the cluster management platform removes the shutdown node from the target container service cluster according to the node removal instruction.
Specifically, the cluster management platform receives a node removal instruction, and removes the shutdown node from the target container service cluster according to the instruction.
S2096, the cluster management platform sends a destroy instruction to the shutdown node.
Specifically, after the cluster management platform removes the shutdown node from the target container service cluster, a destroy instruction is sent to the shutdown node according to the node identifier of the shutdown node.
S2097, the shutdown node executes the destruction operation according to the destroy instruction.
Specifically, the shutdown node receives the destruction instruction and executes the destruction operation according to the destruction instruction.
S2098, the shutdown node sends the destruction result to the cluster management platform.
S2099, the cluster management platform sends a message that the node removal and destruction are successful to the node scheduling platform.
Specifically, the cluster management platform receives the destruction result, and sends a message that the node is removed and destroyed successfully to the node scheduling platform when the destruction result is that the destruction is successful.
S20910, the node scheduling platform sends a target container service cluster updating instruction to the database.
Specifically, after receiving the message, the node scheduling platform determines that the shutdown node is successfully released, and sends a target container service cluster update instruction to the redis database, where the instruction carries the node identifier of the shutdown node that completes the release operation.
S20911, the database updates the cluster state value of the target container service cluster according to the target container service cluster update instruction.
Specifically, the redis database receives a target container service cluster updating instruction, and updates the total amount of cluster resources in the cluster state value of the target container service cluster according to the node identifier of the shutdown node which completes the release operation and is carried by the instruction.
S20912, the database sends the information of successful cluster updating to the node scheduling platform.
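The release flow of steps S2091 to S20912 can be sketched as follows. This is a minimal illustration, not the implementation of the present application: `ClusterStore` is an in-memory stand-in for the redis database, `ClusterManager` is a stand-in for the cluster management platform, and both interfaces are hypothetical. The sketch further assumes that the total amount of cluster resources is counted in nodes and that released nodes are dropped from the shutdown node information; neither assumption is stated in the description above.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterStore:
    """In-memory stand-in for the redis database (hypothetical layout)."""
    shutdown_nodes: dict = field(default_factory=dict)   # cluster id -> set of node ids
    total_resources: dict = field(default_factory=dict)  # cluster id -> total, counted in nodes


class ClusterManager:
    """Stand-in for the cluster management platform (hypothetical interface)."""

    def __init__(self, failing=()):
        self.failing = set(failing)  # node ids whose destruction is simulated to fail

    def remove_and_destroy(self, cluster_id, node_id):
        # S2095-S2098: remove the node from the cluster, then destroy it.
        # Returns True when the destruction result is "successful".
        return node_id not in self.failing


def release_shutdown_nodes(store, manager, cluster_id):
    """Run steps S2091-S20912 for one cluster and return the released node ids."""
    # S2091-S2093: fetch the shutdown node information by cluster identifier.
    candidates = sorted(store.shutdown_nodes.get(cluster_id, set()))
    released = []
    for node_id in candidates:
        # S2094-S2099: send the node removal instruction and wait for the result.
        if manager.remove_and_destroy(cluster_id, node_id):
            released.append(node_id)
    # S20910-S20911: update the total amount of cluster resources.
    # Assumption: the total is counted in nodes, so each release lowers it by one.
    store.total_resources[cluster_id] = (
        store.total_resources.get(cluster_id, 0) - len(released)
    )
    # Assumption: released nodes are also dropped from the shutdown node information.
    store.shutdown_nodes[cluster_id] = set(candidates) - set(released)
    return released
```

In this sketch only nodes whose destruction succeeds are reported back, so the stored total reflects nodes that were actually released.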
In addition, under the condition that a self-healing triggering condition is met (for example, the time is the preset self-healing time), the node scheduling platform updates the cluster state value and the shutdown node information of the target container service cluster.
Specifically, please refer to fig. 10, which is a schematic flow diagram for self-healing a target container service cluster according to the present application. As shown in fig. 10, the self-healing of the target container service cluster by the node scheduling platform includes the following steps:
S20101, the node scheduling platform sends a shutdown node information emptying request to the database.
Specifically, under the condition that the current time is the preset self-healing time, the node scheduling platform sends a shutdown node information clearing request to the redis database, wherein the request carries the cluster identifier of the target container service cluster.
S20102, the database empties the shutdown node information of the target container service cluster according to the shutdown node information emptying request.
Specifically, the redis database receives a shutdown node information clearing request, and clears the shutdown node information of the target container service cluster according to the request.
S20103, the database sends the emptying result to the node scheduling platform.
S20104, the node scheduling platform sends a cluster load resource occupation acquisition request to the target container service cluster.
Specifically, the node scheduling platform receives the emptying result, and sends a cluster load resource occupancy acquisition request to the target container service cluster when the emptying result is that emptying is successful.
S20105, the target container service cluster obtains the cluster load resource occupation amount of the target container service cluster.
S20106, the target container service cluster sends the cluster load resource occupation amount of the target container service cluster to the node scheduling platform.
Specifically, after receiving the request, the target container service cluster obtains the cluster load resource occupancy of the target container service cluster according to the request, and returns the cluster load resource occupancy of the target container service cluster.
S20107, the node scheduling platform sends a target container service cluster updating instruction to the database.
Specifically, the node scheduling platform receives the cluster load resource occupation amount of the target container service cluster, and sends a target container service cluster updating instruction to the redis database, where the instruction carries the cluster load resource occupation amount of the target container service cluster.
S20108, the database updates the cluster state value of the target container service cluster according to the target container service cluster updating instruction.
Specifically, the redis database receives an update instruction of the target container service cluster, and updates the total amount of cluster resources and the available amount of the cluster resources in the cluster state value of the target container service cluster according to the occupied amount of the cluster load resources of the target container service cluster carried by the update instruction.
S20109, the database sends the updating result to the node scheduling platform.
S20110, the node scheduling platform sends a node state acquisition request to the target container service cluster.
Specifically, the node scheduling platform receives the update result, and sends a node state acquisition request to the target container service cluster when the update result is that the update is successful.
S20111, the target container service cluster acquires a node state of each node in the target container service cluster according to the node state acquisition request.
Specifically, the target container service cluster receives a node state acquisition request, and acquires the node state of each node in the target container service cluster according to the request.
S20112, the target container service cluster sends the node status of each node in the target container service cluster to the node scheduling platform.
S20113, the node scheduling platform screens out shutdown nodes.
Specifically, the node scheduling platform receives the node state of each node in the target container service cluster, and determines the node with the node state of shutdown as a shutdown node.
S20114, the node scheduling platform sends a shutdown node information updating instruction to the database.
Specifically, the node scheduling platform sends a shutdown node information updating instruction to the redis database, where the instruction carries a node identifier of the shutdown node.
S20115, the database updates the shutdown node information according to the shutdown node information updating instruction.
Specifically, the redis database receives a shutdown node information updating instruction, and adds the shutdown node to the shutdown node information according to the instruction.
S20116, the database sends the updating result to the node scheduling platform.
S20117, the node scheduling platform determines that the self-healing of the target container service cluster is successful when the updating result from the database indicates success.
Specifically, the node scheduling platform receives the update result, and determines that the self-healing of the target container service cluster is successful under the condition that the update result is successful.
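The self-healing flow of steps S20101 to S20117 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of the present application: `store` is a plain dictionary standing in for the redis database, `FakeCluster` is a hypothetical stand-in for the target container service cluster, the node state string "shutdown" is chosen for the example, and the available amount is assumed to equal the total amount minus the occupied amount.

```python
class FakeCluster:
    """Stand-in for the target container service cluster (hypothetical interface)."""

    def __init__(self, total, used, node_states):
        self.total = total
        self.used = used
        self._node_states = dict(node_states)

    def load_usage(self):
        # S20105-S20106: report the cluster load resource occupancy.
        return {"total": self.total, "used": self.used}

    def node_states(self):
        # S20111-S20112: report the state of every node.
        return dict(self._node_states)


def self_heal(store, cluster, cluster_id):
    """Rebuild the cluster state value and shutdown node information in `store`."""
    # S20101-S20103: empty the possibly stale shutdown node information.
    store.setdefault("shutdown_nodes", {})[cluster_id] = set()
    # S20104-S20109: refresh the cluster state value from the live cluster.
    usage = cluster.load_usage()
    store.setdefault("state", {})[cluster_id] = {
        "total": usage["total"],
        # Assumption: available amount = total amount - occupied amount.
        "available": usage["total"] - usage["used"],
    }
    # S20110-S20113: screen out the nodes whose state is "shutdown".
    shutdown = {
        node_id
        for node_id, state in cluster.node_states().items()
        if state == "shutdown"
    }
    # S20114-S20116: write the shutdown nodes back to the database.
    store["shutdown_nodes"][cluster_id] = shutdown
    # S20117: every update in this sketch succeeded, so self-healing succeeded.
    return True
```

Because the shutdown node information is emptied first and then rebuilt from the live node states, any entry that no longer matches the cluster is discarded.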
In the embodiments of the present application, when the cluster load resource occupancy of the target container service cluster is greater than or equal to the load resource occupancy threshold, the number of nodes to be expanded can be determined according to the number of nodes of the target container service cluster and the node number threshold, the nodes to be expanded can be determined according to the node attribute determination information and the number of nodes to be expanded, and the target container service cluster can be expanded based on the nodes to be expanded. When the cluster load resource occupancy of the target container service cluster is smaller than the load resource occupancy threshold, the number N of nodes to be reduced can be determined according to the cluster load resource occupancy and the cluster state value of the target container service cluster, N nodes to be reduced can be determined from the target container service cluster according to the load resource occupancy of each node, and the target container service cluster can then be reduced based on those nodes. In this way, a certain amount of idle available resources is always reserved in the target container service cluster: a smaller amount during idle periods and a larger amount during busy periods. This shortens the capacity expansion time of the container service cluster and improves the user experience during peak periods.
Based on the description of the foregoing method embodiment, the present application further provides a container service cluster node scheduling apparatus, which may be a node scheduling platform in the foregoing method embodiment, and is configured to execute corresponding steps in the method provided in the embodiment of the present application. Please refer to fig. 11, which is a schematic structural diagram of a container service cluster node scheduling apparatus according to an embodiment of the present application. As shown in fig. 11, the container service cluster node scheduling apparatus 11 may include: a load resource obtaining module 111, a capacity expansion quantity determining module 112, an obtaining determining module 113, and a capacity expansion module 114.
A load resource obtaining module 111, configured to obtain a cluster load resource occupation amount of the target container service cluster;
a capacity expansion quantity determining module 112, configured to determine, if the cluster load resource occupancy is greater than or equal to the load resource occupancy threshold, the number of nodes to be expanded according to the number of nodes of the target container service cluster and the node number threshold;
an obtaining and determining module 113, configured to obtain node attribute determining information, and determine a node to be expanded according to the node attribute determining information and the number of the nodes to be expanded;
and the capacity expansion module 114 is configured to expand the target container service cluster based on the number of the nodes to be expanded.
In some possible embodiments, the capacity expansion quantity determining module 112 includes:
a first capacity expansion number determining unit 1121, configured to determine, if the number of nodes is less than or equal to the node number threshold, the number of nodes to be capacity expanded based on a target node capacity expansion multiple and the number of nodes;
a second capacity expansion number determining unit 1122, configured to determine the node number threshold as the number of nodes to be expanded if the number of nodes is greater than the node number threshold.
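The two branches handled by units 1121 and 1122 can be sketched as a single function. The passage above only states that the first branch is "based on" the target node capacity expansion multiple and the number of nodes; the sketch assumes this means their product, rounded up to a whole node. The function and parameter names are hypothetical.

```python
import math


def nodes_to_expand(node_count, node_count_threshold, expansion_multiple):
    """Return the number of nodes to be expanded for a cluster of `node_count` nodes."""
    if node_count <= node_count_threshold:
        # Unit 1121. Assumption: "based on the multiple and the number of nodes"
        # is taken to mean their product, rounded up to a whole node.
        return math.ceil(node_count * expansion_multiple)
    # Unit 1122: above the threshold, the threshold itself is the number to expand.
    return node_count_threshold
```

Under this assumption, a cluster at or below the threshold grows in proportion to its size, while a cluster above the threshold always grows by the threshold value.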
In some possible embodiments, the node attribute determination information includes the number of shutdown nodes and/or capacity expansion node routing information of the target container service cluster;
the obtaining determining module 113 includes:
a first node-to-be-expanded determining unit 1131, configured to determine, if the number of the nodes to be expanded is less than or equal to the number of the shutdown nodes, M shutdown nodes from the shutdown nodes included in the target container service cluster as nodes to be expanded, where M is equal to the number of the nodes to be expanded;
a second node-to-be-expanded determining unit 1132, configured to, if the number of the nodes to be expanded is greater than the number of the shutdown nodes, obtain K1 scheduling nodes based on the expansion node routing information, and determine the K1 scheduling nodes and all the shutdown nodes included in the target container service cluster as nodes to be expanded, where K1 is equal to a difference between the number of the nodes to be expanded and the number of the shutdown nodes.
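The selection performed by units 1131 and 1132 can be sketched as follows. Shutdown nodes already in the cluster are used first, and only the shortfall K1 is obtained as scheduling nodes. `fetch_scheduling_nodes` is a hypothetical callback standing in for whatever the capacity expansion node routing information resolves to; which M shutdown nodes are chosen is not specified above, so the sketch simply takes the first M.

```python
def pick_nodes_to_expand(count, shutdown_nodes, fetch_scheduling_nodes):
    """Return (shutdown nodes to power on, scheduling nodes to add)."""
    shutdown_nodes = list(shutdown_nodes)
    if count <= len(shutdown_nodes):
        # Unit 1131: M = count shutdown nodes are enough.
        # Assumption: the first M entries are taken; the source does not say which.
        return shutdown_nodes[:count], []
    # Unit 1132: use every shutdown node and obtain K1 scheduling nodes for the rest.
    k1 = count - len(shutdown_nodes)
    return shutdown_nodes, list(fetch_scheduling_nodes(k1))
```

Returning the two groups separately keeps them distinguishable for the later step, where shutdown nodes receive the first trigger instruction and scheduling nodes receive the second.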
In some possible embodiments, the capacity expansion node routing information includes subnet information;
the second node-to-be-expanded determination unit 1132 is specifically configured to:
determining a difference value K1 between the number of the nodes to be expanded and the number of the shutdown nodes;
and acquiring scheduling node configuration information from a node configuration information acquisition path indicated by the subnet information, and acquiring K1 scheduling nodes according to the scheduling node configuration information.
In some possible embodiments, the capacity expansion node routing information includes a cluster identifier of the container service cluster;
the second node-to-be-expanded determination unit 1132 is specifically configured to:
determining a difference value K1 between the number of the nodes to be expanded and the number of the shutdown nodes;
and scheduling K1 shutdown nodes as scheduling nodes from the alternative container service cluster identified by the cluster identification.
In some possible embodiments, the node to be expanded is a shutdown node;
the capacity expansion module 114 includes:
and a power-on capacity expansion unit 1141, configured to send a first trigger instruction to the power-off node according to the node identifier of the power-off node, trigger the power-off node to power on and cancel isolation, so as to implement capacity expansion of the target container service cluster.
In some possible embodiments, the node to be expanded is a scheduling node;
the capacity expansion module 114 includes:
and the scheduling capacity expansion unit 1142 is configured to send a second trigger instruction to the scheduling node according to the node identifier of the scheduling node, trigger the scheduling node to start up and join the target container service cluster, so as to implement capacity expansion of the target container service cluster.
In some possible embodiments, the apparatus further comprises: a capacity reduction module 115.
The capacity reduction module 115 is configured to:
if the cluster load resource occupation amount is smaller than the load resource occupation amount threshold value, acquiring a cluster state value of the target container service cluster;
determining the number of nodes to be reduced according to the cluster state value and the cluster load resource occupation amount;
acquiring the load resource occupation amount of each node in the target container service cluster, and determining N nodes from the target container service cluster as nodes to be subjected to capacity reduction according to the load resource occupation amount of each node, wherein N is equal to the number of the nodes to be subjected to capacity reduction;
and carrying out capacity reduction on the target container service cluster based on the nodes to be subjected to capacity reduction.
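The selection of the N nodes to be reduced can be sketched as follows. The passage above states only that the nodes are chosen according to the load resource occupancy of each node; the sketch assumes that the least-loaded nodes are preferred, with ties broken by node identifier. How N itself is derived from the cluster state value is not restated here, so N is taken as an input. The function name and data layout are hypothetical.

```python
import heapq


def pick_nodes_to_reduce(node_loads, n):
    """Return the ids of the n least-loaded nodes, lightest first.

    node_loads maps node id -> load resource occupancy of that node.
    """
    if n <= 0:
        return []
    # Assumption: nodes with the lowest load are scaled in first.
    return heapq.nsmallest(n, node_loads, key=lambda nid: (node_loads[nid], nid))
```

If N exceeds the number of nodes, this sketch returns every node rather than raising an error.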
It is understood that the container service cluster node scheduling apparatus 11 is used to implement the steps performed by the node scheduling platform in the embodiments of fig. 2 and fig. 4. For the specific implementation and corresponding beneficial effects of the functional modules included in the container service cluster node scheduling apparatus 11 in fig. 11, reference may be made to the specific descriptions of the embodiments in fig. 2 and fig. 4, which are not repeated here.
The container service cluster node scheduling apparatus 11 in the embodiment shown in fig. 11 may be implemented by the server 1200 shown in fig. 12. Please refer to fig. 12, which is a schematic structural diagram of a server provided in the present application. As shown in fig. 12, the server 1200 may include: one or more processors 1201, memory 1202, and transceiver 1203. The processor 1201, the memory 1202, and the transceiver 1203 are connected by a bus 1204. The transceiver 1203 is configured to receive or transmit data, and the memory 1202 is configured to store a computer program, where the computer program includes program instructions; the processor 1201 is configured to execute the program instructions stored in the memory 1202 to perform the following operations:
acquiring the cluster load resource occupation amount of a target container service cluster;
if the cluster load resource occupation is greater than or equal to the load resource occupation threshold, determining the number of nodes to be expanded according to the node number of the target container service cluster and the node number threshold;
acquiring node attribute determination information, and determining a node to be expanded according to the node attribute determination information and the number of the nodes to be expanded;
and expanding the capacity of the target container service cluster based on the node to be expanded.
In some possible embodiments, the processor 1201 determines, according to the number of nodes of the target container service cluster and a threshold of the number of nodes, the number of nodes to be expanded, and specifically executes the following steps:
if the number of the nodes is smaller than or equal to the threshold value of the number of the nodes, determining the number of the nodes to be expanded based on the target node expansion multiple and the number of the nodes;
and if the number of the nodes is larger than the threshold value of the number of the nodes, determining the threshold value of the number of the nodes as the number of the nodes to be expanded.
In some possible embodiments, the node attribute determination information includes the number of shutdown nodes and/or capacity expansion node routing information of the target container service cluster;
the processor 1201 determines a node to be expanded according to the node attribute determination information and the number of nodes to be expanded, and specifically executes the following steps:
if the number of the nodes to be expanded is less than or equal to the number of the shutdown nodes, determining M shutdown nodes as the nodes to be expanded from the shutdown nodes included in the target container service cluster, wherein M is equal to the number of the nodes to be expanded;
if the number of the nodes to be expanded is greater than the number of the shutdown nodes, acquiring K1 scheduling nodes based on the routing information of the expansion nodes, and determining the K1 scheduling nodes and all the shutdown nodes included in the target container service cluster as the nodes to be expanded, wherein K1 is equal to the difference between the number of the nodes to be expanded and the number of the shutdown nodes.
In some possible embodiments, the capacity expansion node routing information includes subnet information;
the processor 1201 obtains K1 scheduling nodes based on the capacity expansion node routing information, and specifically executes the following steps:
determining a difference value K1 between the number of the nodes to be expanded and the number of the shutdown nodes;
and acquiring scheduling node configuration information from a node configuration information acquisition path indicated by the subnet information, and acquiring K1 scheduling nodes according to the scheduling node configuration information.
In some possible embodiments, the capacity expansion node routing information includes a cluster identifier of the container service cluster;
the processor 1201 obtains K1 scheduling nodes based on the capacity expansion node routing information, and specifically executes the following steps:
determining a difference value K1 between the number of the nodes to be expanded and the number of the shutdown nodes;
and scheduling K1 shutdown nodes as scheduling nodes from the alternative container service cluster identified by the cluster identification.
In some possible embodiments, the node to be expanded is a shutdown node;
the processor 1201 expands the target container service cluster based on the node to be expanded, and specifically executes the following steps:
and sending a first trigger instruction to the shutdown node according to the node identifier of the shutdown node, triggering the shutdown node to start and canceling isolation so as to realize the expansion of the target container service cluster.
In some possible embodiments, the node to be expanded is a scheduling node;
the processor 1201 expands the target container service cluster based on the node to be expanded, and specifically executes the following steps:
and sending a second trigger instruction to the scheduling node according to the node identifier of the scheduling node, triggering the scheduling node to start and join the target container service cluster, so as to realize the expansion of the target container service cluster.
In some possible embodiments, the processor 1201 further performs the following steps:
if the cluster load resource occupation amount is smaller than the load resource occupation amount threshold value, acquiring a cluster state value of the target container service cluster;
determining the number of nodes to be reduced according to the cluster state value and the cluster load resource occupation amount;
acquiring the load resource occupation amount of each node in the target container service cluster, and determining N nodes from the target container service cluster as nodes to be subjected to capacity reduction according to the load resource occupation amount of each node, wherein N is equal to the number of the nodes to be subjected to capacity reduction;
and carrying out capacity reduction on the target container service cluster based on the nodes to be subjected to capacity reduction.
It should be understood that the server 1200 described in this embodiment may execute the container service cluster node scheduling method described in the embodiments corresponding to fig. 2 and fig. 4, and may also implement the container service cluster node scheduling apparatus described in the embodiment corresponding to fig. 11; details are not repeated here. Likewise, the beneficial effects of the same method are not described again.
Further, it should be noted that an embodiment of the present application also provides a computer storage medium. The computer storage medium stores a computer program executed by the aforementioned container service cluster node scheduling apparatus 11, and the computer program includes program instructions. When a processor executes the program instructions, it can perform the container service cluster node scheduling method described in the embodiment corresponding to fig. 2 or fig. 4, so details are not repeated here. Likewise, the beneficial effects of the same method are not described again. For technical details not disclosed in the computer storage medium embodiments of the present application, refer to the description of the method embodiments of the present application. As an example, the program instructions may be deployed to be executed on one computing device, on multiple computing devices at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network, which may comprise a blockchain system.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The method and the related apparatus provided by the embodiments of the present application are described with reference to the flowchart and/or the structural diagram of the method provided by the embodiments of the present application, and each flow and/or block of the flowchart and/or the structural diagram of the method, and the combination of the flow and/or block in the flowchart and/or the block diagram can be specifically implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block or blocks of the block diagram. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block or blocks of the block diagram. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block or blocks.
The above disclosure describes only preferred embodiments of the present application and is not to be construed as limiting the scope of the present application; equivalent variations and modifications made to the present application therefore still fall within its scope.

Claims (10)

1. A method for scheduling a container service cluster node is characterized by comprising the following steps:
acquiring the cluster load resource occupation amount of a target container service cluster;
if the cluster load resource occupation is greater than or equal to the load resource occupation threshold, determining the number of nodes to be expanded according to the number of nodes of the target container service cluster and the node number threshold;
acquiring node attribute determination information, and determining a node to be expanded according to the node attribute determination information and the number of the nodes to be expanded;
and expanding the capacity of the target container service cluster based on the node to be expanded.
2. The method of claim 1, wherein the determining the number of nodes to be expanded according to the number of nodes of the target container service cluster and a threshold of the number of nodes comprises:
if the number of the nodes is smaller than or equal to the threshold value of the number of the nodes, determining the number of the nodes to be expanded based on the target node expansion multiple and the number of the nodes;
and if the number of the nodes is larger than the threshold value of the number of the nodes, determining the threshold value of the number of the nodes as the number of the nodes to be expanded.
3. The method according to claim 1 or 2, wherein the node attribute determination information includes a shutdown node number and/or capacity expansion node routing information of the target container service cluster;
the determining the nodes to be expanded according to the node attribute determination information and the number of the nodes to be expanded includes:
if the number of the nodes to be expanded is less than or equal to the number of the shutdown nodes, determining M shutdown nodes as the nodes to be expanded from the shutdown nodes included in the target container service cluster, wherein M is equal to the number of the nodes to be expanded;
if the number of the nodes to be expanded is larger than the number of the shutdown nodes, acquiring K1 scheduling nodes based on routing information of the expansion nodes, and determining the K1 scheduling nodes and all the shutdown nodes included in the target container service cluster as the nodes to be expanded, wherein K1 is equal to the difference value between the number of the nodes to be expanded and the number of the shutdown nodes.
4. The method of claim 3, wherein the capacity expansion node routing information includes subnet information;
the obtaining K1 scheduling nodes based on the capacity expansion node routing information includes:
determining a difference value K1 between the number of the nodes to be expanded and the number of the shutdown nodes;
and acquiring scheduling node configuration information from a node configuration information acquisition path indicated by the subnet information, and acquiring K1 scheduling nodes according to the scheduling node configuration information.
5. The method of claim 3, wherein the capacity expansion node routing information comprises a cluster identification of a container service cluster;
the obtaining K1 scheduling nodes based on the capacity expansion node routing information includes:
determining a difference value K1 between the number of the nodes to be expanded and the number of the shutdown nodes;
scheduling K1 shutdown nodes from the cluster identification identified alternate container service cluster as scheduling nodes.
6. The method according to claim 3, wherein the node to be expanded is a shutdown node;
the expanding the target container service cluster based on the node to be expanded comprises:
and sending a first trigger instruction to the shutdown node according to the node identifier of the shutdown node, triggering the shutdown node to start and cancel isolation, so as to realize capacity expansion of the target container service cluster.
7. The method according to any one of claims 1-2, further comprising:
if the cluster load resource occupation amount is smaller than the load resource occupation amount threshold value, acquiring a cluster state value of the target container service cluster;
determining the number of nodes to be subjected to capacity reduction according to the cluster state value and the cluster load resource occupation amount;
acquiring the load resource occupation amount of each node in the target container service cluster, and determining N nodes from the target container service cluster as nodes to be subjected to capacity reduction according to the load resource occupation amount of each node, wherein N is equal to the number of the nodes to be subjected to capacity reduction;
and carrying out capacity reduction on the target container service cluster based on the nodes to be subjected to capacity reduction.
8. A container service cluster node scheduling apparatus, comprising:
the load resource acquisition module is used for acquiring the cluster load resource occupation amount of the target container service cluster;
the capacity expansion quantity determining module is used for determining the number of nodes to be expanded according to the number of nodes of the target container service cluster and a node number threshold if the cluster load resource occupation amount is greater than or equal to the load resource occupation amount threshold;
the acquisition determining module is used for acquiring node attribute determining information and determining the nodes to be expanded according to the node attribute determining information and the number of the nodes to be expanded;
and the capacity expansion module is used for expanding the capacity of the target container service cluster based on the number of the nodes to be expanded.
9. A server, comprising a processor, a memory and a transceiver, the processor, the memory and the transceiver being interconnected, wherein the transceiver is configured to receive or transmit data, the memory is configured to store program code, and the processor is configured to invoke the program code to perform the method of any of claims 1-7.
10. A computer storage medium, characterized in that the computer storage medium stores a computer program which is executed by a processor to implement the method of any one of claims 1-7.
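The node selection described in claims 3 and 7 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `Node`, `select_expansion_nodes`, `select_reduction_nodes` and the `acquire_scheduling_nodes` callback are hypothetical. Two points are assumptions rather than statements from the claims: the branch taken when there are enough shutdown nodes, and the choice of the least-loaded nodes for capacity reduction (claim 7 only says the choice depends on each node's load resource occupation amount).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Node:
    """Hypothetical node record; `load` is the node's load resource occupation."""
    node_id: str
    load: float = 0.0


def select_expansion_nodes(
    num_to_expand: int,
    shutdown_nodes: List[Node],
    acquire_scheduling_nodes: Callable[[int], List[Node]],
) -> List[Node]:
    """Pick the nodes to be expanded, in the spirit of claim 3."""
    if num_to_expand <= len(shutdown_nodes):
        # Assumed branch (not stated in this section): reuse shutdown nodes only.
        return list(shutdown_nodes[:num_to_expand])
    # K1 is the shortfall between the nodes needed and the shutdown nodes available.
    k1 = num_to_expand - len(shutdown_nodes)
    # The callback stands in for a lookup driven by the capacity expansion node
    # routing information (subnet information or an alternate cluster identification).
    scheduling_nodes = acquire_scheduling_nodes(k1)
    return list(shutdown_nodes) + list(scheduling_nodes)


def select_reduction_nodes(nodes: List[Node], n: int) -> List[Node]:
    """Pick N nodes to be subjected to capacity reduction, in the spirit of claim 7.

    Assumption: the N least-loaded nodes are chosen.
    """
    return sorted(nodes, key=lambda node: node.load)[:n]
```

For example, with two shutdown nodes and five nodes to be expanded, the callback is asked for K1 = 3 scheduling nodes, and the result contains both shutdown nodes followed by the three scheduling nodes.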
CN202011073137.1A 2020-10-09 2020-10-09 Container service cluster node scheduling method and device, server and storage medium Active CN112235383B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011073137.1A CN112235383B (en) 2020-10-09 2020-10-09 Container service cluster node scheduling method and device, server and storage medium

Publications (2)

Publication Number Publication Date
CN112235383A true CN112235383A (en) 2021-01-15
CN112235383B CN112235383B (en) 2024-03-22

Family

ID=74119961

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011073137.1A Active CN112235383B (en) 2020-10-09 2020-10-09 Container service cluster node scheduling method and device, server and storage medium

Country Status (1)

Country Link
CN (1) CN112235383B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106155805A (en) * 2015-04-14 2016-11-23 中兴通讯股份有限公司 Method and device for adjusting the number of nodes in a system
CN107329801A (en) * 2017-06-29 2017-11-07 深信服科技股份有限公司 Node management method and device, and multi-component server
CN110096349A (en) * 2019-04-10 2019-08-06 山东科技大学 Job scheduling method based on cluster node load state prediction
CN110677459A (en) * 2019-09-02 2020-01-10 金蝶软件(中国)有限公司 Resource adjusting method and device, computer equipment and computer storage medium

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801797A (en) * 2021-03-30 2021-05-14 支付宝(杭州)信息技术有限公司 Method and apparatus for processing ticket data using a down-link trusted device
CN113311766A (en) * 2021-06-03 2021-08-27 中国工商银行股份有限公司 Distributed system batch node monitoring method, node and system
CN113905449A (en) * 2021-09-30 2022-01-07 阿里巴巴达摩院(杭州)科技有限公司 Computing resource scheduling method, system and equipment
CN113905449B (en) * 2021-09-30 2024-04-05 杭州阿里云飞天信息技术有限公司 Computing resource scheduling method, system and equipment
CN114153518A (en) * 2021-10-25 2022-03-08 国网江苏省电力有限公司信息通信分公司 Autonomous capacity expansion and reduction method for cloud native MySQL cluster
CN114780232A (en) * 2022-03-25 2022-07-22 阿里巴巴(中国)有限公司 Cloud application scheduling method and device, electronic equipment and storage medium
CN116095083A (en) * 2023-01-16 2023-05-09 之江实验室 Computing method, computing system, computing device, storage medium and electronic equipment
CN116095083B (en) * 2023-01-16 2023-12-26 之江实验室 Computing method, computing system, computing device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN112235383B (en) 2024-03-22

Similar Documents

Publication Publication Date Title
CN112235383B (en) Container service cluster node scheduling method and device, server and storage medium
CN112035228B (en) Resource scheduling method and device
CN107066296B (en) Method and device for cleaning mirror image in cluster node
CN103841134A (en) API-based method for sending and receiving information, API-based apparatus, and API-based system
CN106445473B (en) Container deployment method and device
TWI781535B (en) Resource library management system, resource library management method and program product
CN112965817A (en) Resource management method and device and electronic equipment
CN112181677A (en) Service processing method and device, storage medium and electronic device
EP3079339A1 (en) Method, device, and esb system for data processing
CN112882765A (en) Digital twin model scheduling method and device
CN115168031A (en) Fog calculation system, method, electronic equipment and storage medium
CN111683114A (en) Method and device for upgrading equipment program, terminal equipment and storage medium
CN111367506A (en) Data generation method, data generation device, storage medium and electronic device
CN114130035A (en) User matching method, device, equipment and storage medium
CN110764838B (en) Service model loading method and system, electronic equipment and storage medium
CN116737393B (en) Resource deployment method and device, storage medium and electronic equipment
CN112199200B (en) Resource scheduling method and device, computer equipment and storage medium
CN112906245A (en) Multi-robot simulation method, system, simulation server and terminal
CN109840094B (en) Database deployment method and device and storage equipment
CN109032674B (en) Multi-process management method, system and network equipment
CN115563160A (en) Data processing method, data processing device, computer equipment and computer readable storage medium
CN114911577A (en) Method, device, equipment and storage medium for setting network isolation rule
CN111767345B (en) Modeling data synchronization method, modeling data synchronization device, computer equipment and readable storage medium
CN113590308A (en) Workflow processing method, device, equipment and medium for applying for cloud resources
CN111294374B (en) Heterogeneous equipment starting system, method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant