CN116662010B - Dynamic resource allocation method and system based on distributed system environment - Google Patents

Dynamic resource allocation method and system based on distributed system environment

Info

Publication number
CN116662010B
Authority
CN
China
Prior art keywords
resource
resource allocation
data
node
distributed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310705142.7A
Other languages
Chinese (zh)
Other versions
CN116662010A (en
Inventor
王琳
黄燕芬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhaoqing University
Original Assignee
Zhaoqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhaoqing University
Priority to CN202310705142.7A
Publication of CN116662010A
Application granted
Publication of CN116662010B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of computers and discloses a dynamic resource allocation method and system based on a distributed system environment, which are used to intelligently optimize resource allocation, improve system stability and operating efficiency, and reduce resource waste and management cost. The method comprises the following steps: acquiring a plurality of first distributed systems and system resource data of each first distributed system, and modularly integrating the plurality of first distributed systems according to the system resource data to obtain a second distributed system; receiving and responding to a plurality of historical resource allocation demands through the second distributed system while monitoring resource usage data; respectively constructing a plurality of system module network structure diagrams and analyzing node relations to obtain a node attribute data set; constructing resource allocation training data according to the node attribute data set and establishing a resource prediction model; and inputting a target resource allocation requirement into the resource prediction model to predict a resource allocation strategy, thereby obtaining a target resource allocation strategy.

Description

Dynamic resource allocation method and system based on distributed system environment
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a dynamic resource allocation method and system based on a distributed system environment.
Background
With the increasing popularity of distributed systems, the demand for resource allocation keeps growing, and resource allocation therefore needs to be optimized to reduce resource waste and management costs.
However, the prior art suffers from limited network transmission speed, latency and similar problems, which affect the accuracy and timeliness of resource allocation. In addition, selecting and tuning a resource allocation algorithm is itself a challenge, because factors such as system performance, user requirements and cost control must all be considered. As a result, existing systems have low stability and operating efficiency.
Disclosure of Invention
The invention provides a dynamic resource allocation method and a system based on a distributed system environment, which are used for realizing intelligent optimization of resource allocation, improving the stability and the operation efficiency of the system and reducing the resource waste and the management cost.
A first aspect of the present invention provides a dynamic resource allocation method based on a distributed system environment, the method including:
acquiring a plurality of first distributed systems and system resource data of each first distributed system, and modularly integrating the plurality of first distributed systems according to the system resource data to obtain a second distributed system;
receiving and responding to a plurality of historical resource allocation demands through the second distributed system, and monitoring resource usage data of the second distributed system to obtain a plurality of historical resource usage data;
respectively constructing a plurality of system module network structure diagrams corresponding to the plurality of historical resource usage data, and carrying out node relation analysis on the system module network structure diagrams to obtain a node attribute data set of each system module network structure diagram;
constructing resource allocation training data according to the node attribute data set of each system module network structure diagram, and constructing a resource prediction model according to the resource allocation training data;
acquiring a target resource allocation requirement to be processed, and inputting the target resource allocation requirement into the resource prediction model to predict a resource allocation strategy, thereby obtaining a target resource allocation strategy;
and carrying out dynamic resource allocation on the second distributed system according to the target resource allocation strategy, and carrying out running state monitoring and strategy optimization on the second distributed system to obtain an optimized resource allocation strategy.
With reference to the first aspect, in a first implementation manner of the first aspect of the present invention, the acquiring a plurality of first distributed systems and system resource data of each first distributed system, and modularly integrating the plurality of first distributed systems according to the system resource data, to obtain a second distributed system, includes:
acquiring a plurality of first distributed systems, and respectively acquiring resource data of the plurality of first distributed systems to acquire system resource data of each first distributed system;
Performing association relation analysis on the plurality of first distributed systems according to the system resource data to obtain a target system association relation;
And according to the association relation of the target systems, carrying out module division and modularized integration on the plurality of first distributed systems to obtain a second distributed system.
With reference to the first aspect, in a second implementation manner of the first aspect of the present invention, the receiving and responding, by the second distributed system, to a plurality of historical resource allocation requirements, and performing resource usage data monitoring on the second distributed system to obtain a plurality of historical resource usage data, includes:
receiving a plurality of historical resource allocation requirements, and respectively transmitting the plurality of historical resource allocation requirements to the second distributed system for demand response to obtain a plurality of response state information;
based on the plurality of response state information, updating the response mode of the second distributed system in real time, and recording a plurality of response modes;
acquiring the resource usage start time, the resource call sequence and the resource usage data amount of each response mode;
and generating a plurality of historical resource usage data according to the resource usage start time, the resource call sequence and the resource usage data amount.
With reference to the first aspect, in a third implementation manner of the first aspect of the present invention, the respectively constructing a plurality of system module network structure diagrams corresponding to the plurality of historical resource usage data, and performing node relation analysis on the system module network structure diagrams to obtain a node attribute data set of each system module network structure diagram, includes:
Constructing a resource topological relation diagram of the second distributed system according to the resource calling sequence in each historical resource use data;
Constructing a plurality of system module network structure diagrams corresponding to the plurality of historical resource use data through the resource topological relation diagram;
Performing node cluster analysis on the system module network structure diagram to obtain a plurality of initial nodes, and performing importance calculation on the plurality of initial nodes to obtain the importance of each initial node;
Performing master-slave node type division on the plurality of initial nodes according to the importance degree to obtain at least one master node and a plurality of slave nodes of a network structure diagram of each system module;
And respectively acquiring node attribute data of the at least one master node and the plurality of slave nodes to obtain a node attribute data set of a network structure diagram of each system module, wherein the node attribute data set comprises CPU load data, storage space data, memory occupancy rate and network bandwidth data of each node.
With reference to the first aspect, in a fourth implementation manner of the first aspect of the present invention, the constructing resource allocation training data according to the node attribute data set of the network structure diagram of each system module, and constructing a resource prediction model according to the resource allocation training data includes:
performing aggregate data coding on node attribute data sets of each system module network structure diagram to obtain a plurality of attribute coding vectors;
Taking the attribute coding vectors as resource allocation training data, and acquiring a training model;
respectively inputting the attribute coding vectors into the training model to perform model training to obtain a plurality of training prediction results;
and according to the training prediction results, performing model parameter optimization on the training model to obtain a resource prediction model.
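The "aggregate data coding" step above is not specified further in the text. As a minimal sketch only, one way to turn a node attribute data set into a fixed-length attribute coding vector is to aggregate each attribute over all nodes. The attribute names, the units and the choice of mean and max as aggregates are assumptions made for illustration, not the patented encoding.

```python
# Hypothetical attribute names for the four node attributes named in the text.
ATTRS = ("cpu_load", "storage_gb", "memory_occupancy", "bandwidth_mbps")


def encode_attribute_set(nodes):
    """Encode one node attribute data set as [mean, max] per attribute.

    The mean/max aggregation is an illustrative assumption; the source only
    says that the data set is encoded into an attribute coding vector.
    """
    if not nodes:
        raise ValueError("node attribute data set is empty")
    vector = []
    for attr in ATTRS:
        values = [node[attr] for node in nodes]
        vector.append(sum(values) / len(values))  # average over all nodes
        vector.append(max(values))                # most loaded node
    return vector


# One master node and one slave node with made-up measurements.
nodes = [
    {"cpu_load": 0.5, "storage_gb": 100.0, "memory_occupancy": 0.25, "bandwidth_mbps": 1000.0},
    {"cpu_load": 1.0, "storage_gb": 300.0, "memory_occupancy": 0.75, "bandwidth_mbps": 500.0},
]
vector = encode_attribute_set(nodes)
```

Because the vector length depends only on the number of attributes, network structure diagrams with different node counts would yield training samples of the same shape.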
With reference to the first aspect, in a fifth implementation manner of the first aspect of the present invention, the obtaining a target resource allocation requirement to be processed, and inputting the target resource allocation requirement into the resource prediction model to perform resource allocation policy prediction, to obtain a target resource allocation policy, includes:
Acquiring a target resource allocation requirement to be processed;
Inputting the target resource allocation requirement into the resource prediction model, wherein the resource prediction model comprises a two-layer convolution network, a coding network and a decoding network;
And predicting the resource allocation strategy of the target resource allocation requirement through the resource prediction model to obtain the target resource allocation strategy.
With reference to the first aspect, in a sixth implementation manner of the first aspect of the present invention, the dynamically allocating resources to the second distributed system according to the target resource allocation policy, and performing operation state monitoring and policy optimization on the second distributed system to obtain an optimized resource allocation policy, includes:
According to the target resource allocation strategy, carrying out dynamic resource allocation on the second distributed system, and acquiring resource load data of the second distributed system;
according to the resource load data, performing operation state analysis on the second distributed system to obtain an operation state index;
and matching a corresponding load balancing model according to the running state index, and carrying out policy optimization on the target resource allocation policy based on the load balancing model to obtain an optimized resource allocation policy.
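The matching of a load balancing model to a running state index can be pictured as a simple lookup. The sketch below is illustrative only: the index definition (highest utilisation across resources), the thresholds and the model names are all hypothetical, since the text does not define them.

```python
def running_state_index(load):
    """Hypothetical index: the highest utilisation among monitored resources."""
    if not load:
        raise ValueError("resource load data is empty")
    return max(load.values())


def match_load_balancing_model(index):
    """Map the index to a load balancing model name (names are placeholders)."""
    if index < 0.5:
        return "round_robin"
    if index < 0.8:
        return "weighted_least_load"
    return "migrate_and_throttle"


# Utilisation fractions collected after applying the target strategy (made up).
load = {"cpu": 0.62, "memory": 0.71, "network": 0.30}
index = running_state_index(load)
model = match_load_balancing_model(index)
```

In this sketch the selected model would then be used to adjust the target resource allocation strategy; that adjustment itself is left out because the source gives no detail on it.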
A second aspect of the present invention provides a dynamic resource allocation system based on a distributed system environment, the dynamic resource allocation system based on the distributed system environment comprising:
The acquisition module is used for acquiring a plurality of first distributed systems and system resource data of each first distributed system, and modularly integrating the plurality of first distributed systems according to the system resource data to obtain a second distributed system;
The monitoring module is used for receiving and responding to a plurality of historical resource allocation demands through the second distributed system, and monitoring the resource usage data of the second distributed system to obtain a plurality of historical resource usage data;
The analysis module is used for respectively constructing a plurality of system module network structure diagrams corresponding to the plurality of historical resource use data, and carrying out node relation analysis on the system module network structure diagrams to obtain a node attribute data set of each system module network structure diagram;
The building module is used for building resource allocation training data according to the node attribute data set of the network structure diagram of each system module and building a resource prediction model according to the resource allocation training data;
The prediction module is used for acquiring a target resource allocation requirement to be processed, inputting the target resource allocation requirement into the resource prediction model to predict a resource allocation strategy, and obtaining a target resource allocation strategy;
and the allocation module is used for carrying out dynamic resource allocation on the second distributed system according to the target resource allocation strategy, and carrying out running state monitoring and strategy optimization on the second distributed system to obtain an optimized resource allocation strategy.
With reference to the second aspect, in a first implementation manner of the second aspect of the present invention, the acquisition module is specifically configured to:
acquiring a plurality of first distributed systems, and respectively acquiring resource data of the plurality of first distributed systems to acquire system resource data of each first distributed system;
Performing association relation analysis on the plurality of first distributed systems according to the system resource data to obtain a target system association relation;
And according to the association relation of the target systems, carrying out module division and modularized integration on the plurality of first distributed systems to obtain a second distributed system.
With reference to the second aspect, in a second implementation manner of the second aspect of the present invention, the monitoring module is specifically configured to:
receiving a plurality of historical resource allocation requirements, and respectively transmitting the plurality of historical resource allocation requirements to the second distributed system for demand response to obtain a plurality of response state information;
based on the plurality of response state information, carrying out real-time update on a response mode of the second distributed system, and recording a plurality of response modes;
Acquiring resource use starting time, resource calling sequence and resource use data quantity of each response mode;
Generating a plurality of historical resource usage data according to the resource usage start time, the resource call sequence and the resource usage data amount.
With reference to the second aspect, in a third implementation manner of the second aspect of the present invention, the analysis module is specifically configured to:
Constructing a resource topological relation diagram of the second distributed system according to the resource calling sequence in each historical resource use data;
Constructing a plurality of system module network structure diagrams corresponding to the plurality of historical resource use data through the resource topological relation diagram;
Performing node cluster analysis on the system module network structure diagram to obtain a plurality of initial nodes, and performing importance calculation on the plurality of initial nodes to obtain the importance of each initial node;
Performing master-slave node type division on the plurality of initial nodes according to the importance degree to obtain at least one master node and a plurality of slave nodes of a network structure diagram of each system module;
And respectively acquiring node attribute data of the at least one master node and the plurality of slave nodes to obtain a node attribute data set of a network structure diagram of each system module, wherein the node attribute data set comprises CPU load data, storage space data, memory occupancy rate and network bandwidth data of each node.
With reference to the second aspect, in a fourth implementation manner of the second aspect of the present invention, the building module is specifically configured to:
performing aggregate data coding on node attribute data sets of each system module network structure diagram to obtain a plurality of attribute coding vectors;
Taking the attribute coding vectors as resource allocation training data, and acquiring a training model;
respectively inputting the attribute coding vectors into the training model to perform model training to obtain a plurality of training prediction results;
and according to the training prediction results, performing model parameter optimization on the training model to obtain a resource prediction model.
With reference to the second aspect, in a fifth implementation manner of the second aspect of the present invention, the prediction module is specifically configured to:
Acquiring a target resource allocation requirement to be processed;
Inputting the target resource allocation requirement into the resource prediction model, wherein the resource prediction model comprises a two-layer convolution network, a coding network and a decoding network;
And predicting the resource allocation strategy of the target resource allocation requirement through the resource prediction model to obtain the target resource allocation strategy.
With reference to the second aspect, in a sixth implementation manner of the second aspect of the present invention, the allocation module is specifically configured to:
According to the target resource allocation strategy, carrying out dynamic resource allocation on the second distributed system, and acquiring resource load data of the second distributed system;
according to the resource load data, performing operation state analysis on the second distributed system to obtain an operation state index;
and matching a corresponding load balancing model according to the running state index, and carrying out policy optimization on the target resource allocation policy based on the load balancing model to obtain an optimized resource allocation policy.
In the technical scheme provided by the invention, a plurality of first distributed systems and the system resource data of each first distributed system are acquired, and the first distributed systems are modularly integrated according to the system resource data to obtain a second distributed system; a plurality of historical resource allocation demands are received and responded to through the second distributed system while resource usage data is monitored; a plurality of system module network structure diagrams are constructed and node relations are analyzed to obtain a node attribute data set; and resource allocation training data is constructed according to the node attribute data set and a resource prediction model is established. The invention improves resource utilization, reduces resource waste, improves system performance and reliability, realizes automated and intelligent resource management with less manual intervention, and copes with continuously changing business demands and load fluctuations by optimizing the running state of the distributed system.
Drawings
FIG. 1 is a schematic diagram of an embodiment of a dynamic resource allocation method based on a distributed system environment according to an embodiment of the present invention;
FIG. 2 is a flow chart of monitoring resource usage data in an embodiment of the invention;
FIG. 3 is a flow chart of node relationship analysis in an embodiment of the invention;
FIG. 4 is a flowchart of establishing a resource prediction model in an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of a dynamic resource allocation system based on a distributed system environment in an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a dynamic resource allocation method and a system based on a distributed system environment, which are used for realizing intelligent optimization of resource allocation, improving the stability and the operation efficiency of the system and reducing the resource waste and the management cost. The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, a specific flow of an embodiment of the present invention is described below. Referring to FIG. 1, one embodiment of the dynamic resource allocation method based on a distributed system environment includes:
S101, acquiring a plurality of first distributed systems and system resource data of each first distributed system, and modularly integrating the plurality of first distributed systems according to the system resource data to obtain a second distributed system;
It is to be understood that the execution subject of the present invention may be a dynamic resource allocation system based on a distributed system environment, or a terminal or a server, which is not limited herein. The embodiment of the invention is described by taking a server as the execution subject.
Specifically, the server acquires a plurality of first distributed systems and collects the resource data of each system. Such resource data may include processing power, storage capacity, network bandwidth, CPU utilization, memory usage and the like. From this data the server learns the current state and the available resources of each system. The server then performs association relation analysis on the collected system resource data to determine the interrelationships and dependencies among the systems. By analyzing these association relations, the server learns how the systems communicate and transfer data, so that resource allocation and integration can be planned better. Based on the association relations, module division and modular integration are performed: the systems are divided into logically related modules, and these modules are integrated to build a larger second distributed system. Modular integration improves the scalability and flexibility of the system and makes resources easier to manage and allocate.
For example, assume that there are three distributed systems A, B and C. The resource data of each system is acquired, including the processing capacity and storage capacity of A, the network bandwidth and CPU utilization of B, and the memory utilization of C. Association relation analysis determines that a high-speed network connection exists between A and B, and that a data transmission dependency exists between B and C. According to these relations, A and B can be placed in one module and C in another, and the modules are integrated to construct the second distributed system. The resources of systems A and B can then be shared and mutually accessed, while the resources of C are managed independently. In this way, resources are better utilized and the systems can cooperate.
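The A/B/C example can be sketched as a grouping problem: systems joined by a sufficiently strong association end up in the same module. The sketch below uses a union-find structure; the numeric association strengths and the threshold are hypothetical values chosen so that the result reproduces the example (A and B together, C alone), and the source does not prescribe this algorithm.

```python
from collections import defaultdict


def group_into_modules(systems, strong_links):
    """Group systems into modules: strongly associated systems share a module."""
    parent = {s: s for s in systems}

    def find(x):
        # Follow parent pointers to the module representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in strong_links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two modules

    modules = defaultdict(list)
    for s in systems:
        modules[find(s)].append(s)
    return sorted(sorted(members) for members in modules.values())


systems = ["A", "B", "C"]
# (system, system, association strength); the strengths are made-up numbers.
associations = [("A", "B", 0.9), ("B", "C", 0.4)]
THRESHOLD = 0.5  # hypothetical cut-off for "same module"
strong = [(a, b) for a, b, weight in associations if weight >= THRESHOLD]
modules = group_into_modules(systems, strong)
```

With these assumed strengths, `modules` groups A with B and leaves C on its own, matching the division described in the example.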
S102, receiving and responding to a plurality of historical resource allocation demands through a second distribution system, and monitoring resource usage data of the second distribution system to obtain a plurality of historical resource usage data;
Specifically, the server needs to receive a plurality of historical resource allocation requirements and transmit the requirements to the second distribution system for response. Each historical resource allocation requirement describes information about the type of resource, quantity, and time required. The second distributed system processes these demands and generates corresponding response status information. The response status information may include resource allocation results, resource usage, task execution status, and the like. Based on the obtained response status information, a real-time update of the response mode of the second distributed system is required. The response mode refers to a resource scheduling strategy and a resource usage mode adopted by the system according to historical resource allocation requirements. Based on the response status of each historical demand, the system can update the response mode in real time and record. In order to generate the historical resource usage data, it is necessary to acquire the resource usage start time, the resource call order, and the resource usage data amount for each response mode. The resource usage start time refers to a point of time at which the resource starts to be used in each response mode. The resource call order describes the order and priority of calls to the resources by the system in response mode. The amount of resource usage data refers to the amount of resources actually used by the system in each time period. Based on the resource usage start time, the resource call order, and the amount of resource usage data, a plurality of historical resource usage data may be generated. These data reflect the execution of past resource allocation requirements and utilization of system resources. By analyzing these historical data, the resource utilization efficiency, bottlenecks, and performance status of the system can be known. 
For example, assume a cloud-computing-based distributed system that includes multiple virtual machines. The historical resource allocation requirements cover the processors, memory, and storage of the virtual machines. The system receives multiple historical resource allocation requirements: for instance, requirement A requires 2 virtual machines, each with 2 processors and 4GB of memory, and requirement B requires 3 virtual machines, each with 4 processors and 8GB of memory. The system allocates resources according to these requirements and records the response status information of each one, including the allocation result, the resource usage, and the task execution status. When updating the response mode, the system may adjust according to the priority of the requirements and the availability of resources. The system records the resource usage start time, the resource call order, and the resource usage data amount of each response mode. By analyzing these records, historical resource usage data are generated. For example, the system may derive historical resource usage data indicating that, during a certain period of time, virtual machine 1 was invoked first and used 2 processors and 4GB of memory, and virtual machine 2 was then invoked and used 2 processors and 4GB of memory. This historical resource usage data may be used to evaluate the resource utilization and performance of the system.
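The derivation of one historical resource usage record from the three recorded items can be sketched as follows. This is a minimal illustration only: the class names, field names, and identifiers such as `vm-1` are hypothetical and do not come from the described method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResourceCall:
    """One resource invocation observed while a requirement was served."""
    resource: str      # hypothetical identifier, e.g. "vm-1"
    start_time: float  # seconds since the start of the monitoring window
    processors: int
    memory_gb: int


@dataclass
class HistoricalUsage:
    """Historical resource usage data for one response mode."""
    start_time: float      # resource usage start time
    call_order: List[str]  # resource call order
    total_processors: int  # resource usage data amount (processors)
    total_memory_gb: int   # resource usage data amount (memory)


def build_usage(calls: List[ResourceCall]) -> HistoricalUsage:
    """Derive start time, call order and usage amount from raw call records."""
    if not calls:
        raise ValueError("at least one resource call is required")
    ordered = sorted(calls, key=lambda c: c.start_time)
    return HistoricalUsage(
        start_time=ordered[0].start_time,
        call_order=[c.resource for c in ordered],
        total_processors=sum(c.processors for c in ordered),
        total_memory_gb=sum(c.memory_gb for c in ordered),
    )


# Requirement A from the example: two virtual machines, 2 processors and 4GB each.
usage_a = build_usage([
    ResourceCall("vm-2", start_time=5.0, processors=2, memory_gb=4),
    ResourceCall("vm-1", start_time=0.0, processors=2, memory_gb=4),
])
print(usage_a)
```

Sorting by start time is one assumed way to recover the call order; a real system might instead read the order directly from its scheduler log.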
S103, respectively constructing a plurality of system module network structure diagrams corresponding to a plurality of historical resource use data, and carrying out node relation analysis on the system module network structure diagrams to obtain a node attribute data set of each system module network structure diagram;
It should be noted that a resource topology relation diagram of the second distributed system is constructed according to the resource call order in each historical resource usage data. The resource call order describes the order in which the system calls resources when executing a historical resource allocation requirement. By analyzing the resource call order, the dependencies and connections between resources can be determined, and a topology relation diagram between the resources is constructed accordingly. Based on the resource topology relation diagram, a plurality of system module network structure diagrams are constructed, each corresponding to one historical resource usage data. A system module network structure diagram reflects the associations and communication modes among the different modules in the system. Node cluster analysis is then performed on each system module network structure diagram. Node cluster analysis classifies similar nodes into the same group: by analyzing the similarity and connectivity among nodes, the nodes in the diagram are clustered into different groups, which yields a plurality of initial nodes. Based on the cluster analysis, the importance of each initial node is calculated. The importance may be evaluated according to factors such as the criticality of the node, its resource utilization, and the strength of its connections to other nodes. Through the importance calculation, at least one master node and a plurality of slave nodes are determined in each system module network structure diagram. Node attribute data of the master nodes and slave nodes are then acquired, including CPU load data, storage space data, memory occupancy rate, and network bandwidth data. These node attribute data provide specific indicators of each node's resource usage.
By analyzing the node attribute data, the performance, resource utilization, and bottlenecks of each node can be determined. For example, assume a distributed system comprising three modules A, B and C. The resource call order in the historical resource usage data indicates that module A is called first, then module B, and finally module C. A resource topology relation diagram of the system is constructed from this call order, showing the connections among A, B and C. According to the resource topology relation diagram, three system module network structure diagrams can be constructed, corresponding to A, B and C in the historical resource usage data respectively. Node cluster analysis is performed on these structure diagrams to obtain the initial nodes, for example grouping A and B into one group and C into another. Calculating the importance of the initial nodes may show that the importance of A and B is higher while that of C is lower. Thus, A and B are determined to be master nodes and C a slave node. Node attribute data of master nodes A and B and slave node C are then acquired, including CPU load data, storage space data, memory occupancy rate, and network bandwidth data. These data provide detailed information on the current state and resource utilization of each node. For example, the node attribute data of master node A show a CPU load of 80%, storage space usage of 60%, memory occupancy of 70%, and network bandwidth usage of 50%. Master node B has a CPU load of 60%, storage space usage of 80%, memory occupancy of 75%, and network bandwidth usage of 40%. Slave node C has a CPU load of 40%, storage space usage of 30%, memory occupancy of 50%, and network bandwidth usage of 30%.
By analyzing these node attribute data, the resource utilization and performance of each node can be determined. For example, the high CPU load of master node A may mean that it is handling a large number of tasks or that additional resources must be allocated to balance the load. Master node B has high storage and memory usage (80% and 75%), so expansion or optimization may be required. The resource usage of slave node C is relatively low, so it can be considered for lighter-weight tasks or as a standby node. By acquiring the node attribute data set of each system module network structure diagram, the resource usage and performance characteristics of every node in the system can be comprehensively understood. This helps system administrators or resource scheduling algorithms make decisions and optimizations that improve the overall efficiency and performance of the system.
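The kind of analysis described for nodes A, B and C can be sketched as a simple threshold check over the node attribute data set. The two thresholds below are assumptions chosen for illustration; the description does not specify any cut-off values.

```python
from typing import Dict, List

# Node attribute data from the example, expressed as utilization percentages.
NODE_ATTRIBUTES: Dict[str, Dict[str, float]] = {
    "A": {"cpu": 80, "storage": 60, "memory": 70, "bandwidth": 50},
    "B": {"cpu": 60, "storage": 80, "memory": 75, "bandwidth": 40},
    "C": {"cpu": 40, "storage": 30, "memory": 50, "bandwidth": 30},
}

HIGH_THRESHOLD = 80.0  # assumed cut-off for a metric under pressure
LOW_THRESHOLD = 50.0   # assumed cut-off for a lightly loaded node


def pressured_metrics(attrs: Dict[str, float]) -> List[str]:
    """Return the metrics of one node that reach the high threshold."""
    return sorted(name for name, value in attrs.items() if value >= HIGH_THRESHOLD)


def is_lightly_loaded(attrs: Dict[str, float]) -> bool:
    """A node is a standby candidate when every metric is at or below the low threshold."""
    return all(value <= LOW_THRESHOLD for value in attrs.values())


report = {node: pressured_metrics(attrs) for node, attrs in NODE_ATTRIBUTES.items()}
standby_candidates = [n for n, attrs in NODE_ATTRIBUTES.items() if is_lightly_loaded(attrs)]

print(report)              # metrics under pressure per node
print(standby_candidates)  # nodes suited to light tasks or standby duty
```

With these assumed thresholds, node A is flagged for CPU, node B for storage, and node C is the only standby candidate, which matches the reading of the example above.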
S104, constructing resource allocation training data according to the node attribute data set of the network structure diagram of each system module, and constructing a resource prediction model according to the resource allocation training data;
Specifically, set data encoding is performed on the node attribute data set of each system module network structure diagram. This means that the attribute data of each node are converted into a numerically encoded form so that they can be used as input data for training the model. For example, each attribute value may be converted to a binary vector using one-hot encoding. These attribute encoding vectors are used as the resource allocation training data. Each attribute encoding vector represents one sample and contains the attribute information of one node in the system module network structure diagram. The training data set consisting of the attribute encoding vectors of all nodes is taken as the input of the model and is used to build the resource prediction model. A machine learning algorithm suited to the problem may be selected, such as a decision tree, a neural network, or a support vector machine. The training data set is input into the model, which is trained to learn the relationship between node attributes and resource allocation. During model training, each attribute encoding vector is taken as an input and the model generates a corresponding training prediction result, that is, the model's prediction of the node's resource allocation. Based on these training prediction results, the parameters of the training model are optimized. By comparing the predictions with the actual resource allocation, the accuracy and performance of the model can be evaluated. According to the evaluation result, the parameters of the model are adjusted to improve its prediction and generalization capabilities, yielding a more accurate resource prediction model. For example, assume that the server has a system module network structure diagram including node A, node B, and node C. The attributes of each node include CPU load, storage space, memory occupancy, and network bandwidth.
The server encodes these attributes into binary vectors, e.g., classifies CPU load into three categories of low load, medium load, and high load, and represents them as [1, 0, 0], [0, 1, 0] and [0, 0, 1] using one-hot encoding. For each node, the server gets a similar attribute encoding vector. The training data set composed of the attribute coding vectors of all the nodes is used as input data for training a model. A suitable machine learning algorithm, such as a neural network, is selected and training data is input into the model for training. During the training process, the model generates training prediction results, i.e., predictions of node resource allocation. And according to the training prediction result, the server evaluates and optimizes the training model. By comparing with the actual resource allocation situation, the accuracy and performance of the model can be calculated. For example, assume that a server has a set of historical resource allocation data that includes node attributes and actual resource allocation conditions. The server uses these data to evaluate the prediction accuracy of the model, comparing the difference of the predicted result with the actual result. By evaluating the results, the server discovers that the model may have some deviation or error. To improve the performance of the model, the server performs model parameter optimization. The server attempts to improve the predictive and generalization capabilities of the model by adjusting its parameters or using more complex algorithms. This process may require multiple iterations and experiments to find the optimal model parameter configuration. The server establishes a resource prediction model which can predict the resource allocation of the nodes according to the node attribute data set of the input system module network structure diagram. 
This model may help system administrators or resource scheduling algorithms make decisions and optimizations to achieve more efficient resource allocation and utilization.
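The one-hot encoding step from the example (low, medium, and high load mapped to [1, 0, 0], [0, 1, 0] and [0, 0, 1]) can be sketched as follows. The bucket boundaries of 50% and 75% and the function names are assumptions made for illustration.

```python
from typing import Dict, List

LOAD_LEVELS = ["low", "medium", "high"]
ATTRIBUTE_ORDER = ("cpu", "storage", "memory", "bandwidth")


def load_level(percent: float) -> str:
    """Bucket a utilization percentage; the boundaries are assumed, not from the source."""
    if percent < 50:
        return "low"
    if percent < 75:
        return "medium"
    return "high"


def one_hot(level: str) -> List[int]:
    """Encode one load level as a binary vector over LOAD_LEVELS."""
    return [1 if level == candidate else 0 for candidate in LOAD_LEVELS]


def encode_node(attrs: Dict[str, float]) -> List[int]:
    """Concatenate the one-hot vectors of all attributes into one attribute encoding vector."""
    vector: List[int] = []
    for key in ATTRIBUTE_ORDER:
        vector.extend(one_hot(load_level(attrs[key])))
    return vector


# Node A from the earlier example: CPU 80%, storage 60%, memory 70%, bandwidth 50%.
vector_a = encode_node({"cpu": 80, "storage": 60, "memory": 70, "bandwidth": 50})
print(vector_a)
```

Each node yields one fixed-length vector (here 12 elements), so the vectors of all nodes can be stacked into a training data set.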
S105, acquiring a target resource allocation requirement to be processed, and inputting the target resource allocation requirement into a resource prediction model to predict a resource allocation strategy, so as to obtain a target resource allocation strategy;
Specifically, the server obtains a target resource allocation requirement to be processed. This is a task provided by a system administrator or a resource scheduling algorithm that includes resource demand information for each node in the distributed system. The target resource allocation requirement may include the expected CPU utilization, storage requirements, memory requirements, and so on of each node. The target resource allocation requirement is input into the resource prediction model. The resource prediction model is a trained model that predicts the resource allocation of nodes according to the input node attribute data set. The model may employ deep learning techniques, such as a two-layer convolutional network, an encoding network, and a decoding network, to capture complex relationships between node attributes. The resource prediction model then predicts the resource allocation strategy for the target resource allocation requirement: taking the requirement as input, the model generates a corresponding resource allocation strategy. The strategy specifies how the resources in the system are to be allocated so as to meet the needs of the nodes and maximize the performance and efficiency of the system. For example, assume that the server has a target resource allocation requirement to be processed, which includes the resource requirements of three nodes A, B and C. Node A requires 80% CPU utilization and 100GB of memory, node B requires 60% CPU utilization and 80GB of memory, and node C requires 70% CPU utilization and 120GB of memory. The server inputs these resource demands into the resource prediction model, which has been trained to predict resource allocation based on node attribute data.
The model generates a corresponding resource allocation policy, such as allocating node A to a server with a high-performance CPU and sufficient memory, node B to a medium-performance server, and node C to a lower-performance server with a larger memory. The generation of the resource allocation policy relies on accurate predictions of the target resource allocation requirement by the resource prediction model. By continuously optimizing and training the resource prediction model, the server improves the accuracy of the predictions and the quality of the allocation policies.
S106, carrying out dynamic resource allocation on the second distributed system according to the target resource allocation strategy, and carrying out running state monitoring and strategy optimization on the second distributed system to obtain an optimized resource allocation strategy.
Specifically, dynamic resource allocation is performed on the second distributed system according to the target resource allocation strategy: the resources in the system are reallocated according to the previously predicted strategy. This may involve migrating a node from one server to another, increasing or decreasing the resource quota of a node, and so on. Through dynamic resource allocation, the server flexibly adjusts the resource allocation in the system according to actual demand and the guidance of the prediction model so as to meet the demands of the nodes. Resource load data of the second distributed system are then acquired by monitoring and collecting the resource utilization of each node, such as CPU utilization, memory occupancy rate, and storage space usage. These data reflect the current operating state and resource utilization of the system. Operating state analysis is performed on the resource load data to obtain operating state indexes. By analyzing and calculating the resource load data, key operating state indexes can be obtained, such as the average load of the system, the load balance across nodes, and the resource utilization rate. These indexes help the server understand the overall performance and resource utilization of the system. A corresponding load balancing model is matched according to the operating state indexes, and the target resource allocation strategy is optimized based on that model. The load balancing model formulates a reasonable resource allocation strategy according to the operating state of the system and the resource load data. Depending on the nature and requirements of the system, the server selects an appropriate load balancing model, such as weighted round-robin, least connections, or least response time.
By matching and optimizing the target resource allocation strategy, the server further optimizes the allocation of resources and improves the performance and efficiency of the system. For example, assume that the target resource allocation policy is to allocate node A to a high-performance server and node B to a low-load server. According to this policy, the server performs dynamic resource allocation, migrating node A to a server with higher computing power and more storage space and migrating node B to a server with a lower load. The server then acquires resource load data of the second distributed system, such as the CPU utilization and memory occupancy rate of each server. By analyzing the resource load data, the server calculates the average load of the system and the load of each node. According to the operating state indexes, the server selects a suitable load balancing model for policy optimization. For example, the server finds that the average load of the system is high, the CPU utilization of node A is near saturation, and the CPU utilization of node B is low. In this case, the server considers a weighted round-robin load balancing model that distributes more requests to node B, balancing the load of the system and improving overall performance. Based on the selected load balancing model and the optimization, the server obtains an optimized resource allocation strategy. According to this strategy, the server allocates node A and node B to the servers appropriate for them, achieving better load balancing and resource utilization. Meanwhile, the server continuously monitors the operating state and resource load of the system and makes further policy adjustments and optimizations according to actual conditions.
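The weighted round-robin model mentioned in the example can be sketched as below. This uses the "smooth" weighted round-robin variant as one possible realization; the weights of 1 for node A and 3 for node B are hypothetical values reflecting that node A is near saturation while node B has spare capacity.

```python
from typing import Dict, List


def weighted_round_robin(weights: Dict[str, int], requests: int) -> List[str]:
    """Assign `requests` requests to nodes in proportion to their weights.

    Smooth variant: each round every node gains its weight, the node with the
    highest running score is chosen, and its score is reduced by the total weight.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be a non-empty mapping of positive integers")
    total = sum(weights.values())
    current = {node: 0 for node in weights}
    schedule: List[str] = []
    for _ in range(requests):
        for node, weight in weights.items():
            current[node] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        schedule.append(chosen)
    return schedule


# Hypothetical weights: node B receives three times as many requests as node A.
schedule = weighted_round_robin({"A": 1, "B": 3}, requests=8)
print(schedule)
```

The smooth variant interleaves the nodes rather than sending long bursts to one of them, which is one reason it is a common choice; a plain weighted rotation would give the same proportions.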
In the embodiment of the invention, a plurality of first distributed systems and the system resource data of each first distributed system are acquired, and the first distributed systems are modularized and integrated according to the system resource data to obtain a second distributed system; a plurality of historical resource allocation demands are received and responded to through the second distributed system, and resource usage data are monitored; a plurality of system module network structure diagrams are constructed and node relations are analyzed to obtain node attribute data sets; resource allocation training data are constructed according to the node attribute data sets and a resource prediction model is established. The invention improves resource utilization, reduces resource waste, improves the performance and reliability of the system, realizes automatic and intelligent resource management, reduces manual intervention, improves efficiency, and copes with continuously changing business demands and load fluctuations by optimizing the running state of the distributed system.
In a specific embodiment, the process of executing step S101 may specifically include the following steps:
(1) Acquiring a plurality of first distributed systems, and respectively acquiring resource data of the plurality of first distributed systems to acquire system resource data of each first distributed system;
(2) Performing association relation analysis on a plurality of first distributed systems according to system resource data to obtain a target system association relation;
(3) And according to the association relation of the target systems, carrying out module division and modularized integration on the plurality of first distributed systems to obtain a second distributed system.
Specifically, the server first acquires a plurality of first distributed systems and collects the resource data of each system. The resource data may include CPU utilization, memory usage, disk space usage, network bandwidth, and so on. From these data, the server knows the resource situation of each system. The server then performs association analysis to determine the association relations between the systems. This may be achieved by analyzing the communication patterns, data interactions, and shared resources between the systems. For example, assume that the server has three distributed systems A, B and C. By analyzing the traffic and data exchange patterns between them, the server determines that there is a strong association between A and B, while the association between A and C is weak. According to the determined target system association relation, the server performs module division and modularized integration on the plurality of first distributed systems to construct the second distributed system. Module division splits a system into different functional modules or subsystems, each responsible for a particular task or function. Modularized integration combines these modules to build a larger, more capable system. In the example above, analysis shows that data exchange and communication between A and B are frequent, while C interacts with A and B less. Based on this association, the server treats A and B as one module and C as another, and then integrates them to form the second distributed system. In this way, the server performs association relation analysis according to the system resource data and performs module division and modularized integration based on the association relation, thereby obtaining the second distributed system.
The system structure can better meet the communication requirements among the systems, and improves the efficiency and performance of the whole system.
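The module division in the A, B, C example can be sketched as grouping systems whose mutual traffic exceeds a threshold. The traffic volumes and the threshold below are hypothetical values; the description only states that the A-B association is strong and the others are weak. A small union-find structure is used as one possible grouping technique.

```python
from typing import Dict, List, Tuple


def group_systems(
    systems: List[str],
    traffic: Dict[Tuple[str, str], float],
    threshold: float,
) -> List[List[str]]:
    """Merge systems into one module when their pairwise traffic reaches the threshold."""
    parent = {s: s for s in systems}

    def find(s: str) -> str:
        # Follow parent links to the representative, compressing the path on the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), volume in traffic.items():
        if volume >= threshold:
            parent[find(a)] = find(b)

    modules: Dict[str, List[str]] = {}
    for s in systems:
        modules.setdefault(find(s), []).append(s)
    return sorted(modules.values())


# Hypothetical traffic volumes (e.g. messages per minute) between the first distributed systems.
modules = group_systems(
    ["A", "B", "C"],
    {("A", "B"): 900.0, ("A", "C"): 40.0, ("B", "C"): 25.0},
    threshold=500.0,
)
print(modules)
```

The resulting modules, one containing A and B and one containing C, are the units that would then be integrated into the second distributed system.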
In a specific embodiment, as shown in fig. 2, the process of executing step S102 may specifically include the following steps:
S201, receiving a plurality of historical resource allocation requirements, and respectively transmitting the plurality of historical resource allocation requirements to the second distributed system for demand response to obtain a plurality of response state information;
S202, based on a plurality of response state information, updating the response mode of the second distributed system in real time, and recording a plurality of response modes;
S203, acquiring resource use starting time, resource calling sequence and resource use data quantity of each response mode;
S204, generating a plurality of historical resource use data according to the resource use starting time, the resource calling sequence and the resource use data quantity.
Specifically, the server first receives a plurality of historical resource allocation requirements and transmits them to the second distributed system for demand response. Each historical resource allocation requirement includes specific demands on system resources, such as how much CPU, memory, and disk space is needed. The second distributed system responds to these requirements and generates corresponding response state information. While responding to the historical resource allocation requirements, the second distributed system updates its response mode in real time, and a plurality of response modes are recorded. The response mode refers to the strategy and manner that the system adopts in responding to a resource allocation requirement. For example, the system may employ a load balancing policy, a priority scheduling policy, or a dynamic resource allocation policy. Recording these response modes helps the server understand the operating characteristics and resource utilization of the system. The server then acquires the resource usage start time, the resource call order, and the resource usage data amount of each response mode. This information may be obtained from the logs or monitoring data of the system. The resource usage start time is the time at which each resource starts to be called in the response mode, the resource call order is the order in which the resources are called, and the resource usage data amount is the number or size of each resource used in the response mode. The server generates a plurality of historical resource usage data based on the resource usage start time, the resource call order, and the resource usage data amount. These historical resource usage data reflect the past resource usage of the system in different response modes.
By analyzing these data, the server learns how the system performs under different load conditions, which provides a reference for optimizing the resource allocation strategy. For example, on a distributed cloud platform, multiple users may submit historical resource allocation requirements, such as virtual machine creation and destruction requests. When users submit these requirements, the cloud platform passes them to the underlying resource management system, such as OpenStack. The resource management system allocates corresponding virtual machine instances according to the users' demands and records the response state information. The system can update the response mode in real time, for example by scaling elastically according to the current system load and dynamically adjusting the number of virtual machines. By monitoring resource usage, such as CPU utilization and memory occupancy rate, the resource usage start time, the resource call order, and the resource usage data amount of each response mode can be obtained. Suppose that, in a certain response mode, the system allocates a large number of virtual machine instances during peak hours to meet user demand. Historical resource usage data may be generated by recording the creation time and call order of each virtual machine instance, together with the CPU, memory, and disk usage of each instance. These data can be used to analyze the performance and resource utilization efficiency of the system under high load. When the resource allocation strategy is further optimized, the response modes can be clustered according to the historical resource usage data to find similar resource usage patterns. Future resource demands can be predicted from these patterns, and the resource allocation strategy can be optimized accordingly.
For example, if it is found that the user's demand for CPU resources is higher in a certain period of time, more CPU resources may be allocated in advance in the period of time to meet the user's demand.
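Finding the periods in which CPU demand is high, so that capacity can be allocated in advance, can be sketched as an hourly average over historical samples. The sample values and the 80% threshold are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def average_cpu_by_hour(samples: List[Tuple[int, float]]) -> Dict[int, float]:
    """Average the CPU utilization samples (hour of day, percent) per hour."""
    totals: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for hour, cpu in samples:
        totals[hour] += cpu
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in totals}


def peak_hours(samples: List[Tuple[int, float]], threshold: float) -> List[int]:
    """Return the hours whose average CPU utilization reaches the threshold."""
    averages = average_cpu_by_hour(samples)
    return sorted(hour for hour, avg in averages.items() if avg >= threshold)


# Hypothetical historical samples: (hour of day, CPU utilization in percent).
samples = [(9, 85.0), (9, 75.0), (14, 40.0), (14, 50.0), (20, 90.0)]
peaks = peak_hours(samples, threshold=80.0)
print(peaks)  # hours in which extra CPU could be allocated ahead of time
```

A scheduler could use the returned hours to raise CPU quotas shortly before each peak; how far in advance to do so is left open here.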
In a specific embodiment, as shown in fig. 3, the process of executing step S103 may specifically include the following steps:
S301, constructing a resource topological relation diagram of a second distributed system according to a resource calling sequence in each historical resource use data;
S302, constructing a network structure diagram of a plurality of system modules corresponding to a plurality of historical resource use data through a resource topological relation diagram;
S303, performing node cluster analysis on a system module network structure diagram to obtain a plurality of initial nodes, and performing importance calculation on the plurality of initial nodes to obtain the importance of each initial node;
S304, carrying out master-slave node type division on a plurality of initial nodes according to the importance degree to obtain at least one master node and a plurality of slave nodes of a network structure diagram of each system module;
S305, respectively acquiring node attribute data of at least one master node and a plurality of slave nodes to obtain a node attribute data set of a network structure diagram of each system module, wherein the node attribute data set comprises CPU load data, storage space data, memory occupancy rate and network bandwidth data of each node.
Specifically, the server constructs a resource topology relation diagram of the second distributed system according to the resource call order in each historical resource usage data. The resource call order may indicate the dependencies and communication traffic between resources. By analyzing the call order, the server determines the connections and dependencies between resources and constructs the resource topology relation diagram. With the help of the resource topology relation diagram, the server constructs a corresponding system module network structure diagram for each historical resource usage data. The system module network structure diagram is an abstract representation of the resource topology at the module level, with each module representing a functional unit or component of the system. By mapping the resource topology onto the module network structure, the server better understands the composition and function of the system. Further, the server performs node cluster analysis on the system module network structure diagram to identify a plurality of initial nodes. Node cluster analysis groups nodes into clusters based on their similarity, which helps the server discover clusters of nodes with similar features and functions. The server measures the contribution of each initial node to the system by calculating its importance. The importance calculation may be based on various metrics, such as node centrality and betweenness centrality. Based on the importance evaluation result, the server performs master-slave node type division on the plurality of initial nodes. The master node plays a key role in the system and is responsible for coordinating and managing other nodes. The slave nodes execute the tasks and instructions assigned by the master node.
By dividing the nodes into master and slave types, the server ensures that at least one master node and a plurality of slave nodes exist in each system module network structure diagram, which supports the stability and scalability of the system. The server obtains the node attribute data of the at least one master node and the plurality of slave nodes to obtain the node attribute data set of each system module network structure diagram. These node attribute data may include the CPU load data, storage space data, memory occupancy rate, and network bandwidth data of each node. By collecting and analyzing the attribute data, the server gains a detailed understanding of the performance and resource usage of each node, which provides a basis for system optimization and resource management. For example, assume that the server has a distributed system that includes multiple servers and storage devices. The server constructs a resource topology relation diagram according to the resource call order in the historical resource usage data and determines the connections and communication traffic between the servers. Based on the resource topology relation diagram, the server constructs a system module network structure diagram, abstracting the servers and storage devices into different modules, such as a front-end module, a back-end module, and a database module. The server performs node cluster analysis on the system module network structure diagram and clusters together nodes with similar functions and features, for example front-end server nodes in one group and back-end server nodes in another. The server calculates the importance of each initial node to determine how much it contributes to system performance. According to the importance evaluation result, the server divides the nodes into master and slave types, selecting at least one master node and a plurality of slave nodes.
For example, the server may designate front-end server nodes with high importance as master nodes, responsible for handling user requests and routing traffic, and designate the other front-end server nodes as slave nodes, responsible for assisting in handling requests. The server divides the back-end servers and the database servers in the same way. The server then acquires the node attribute data of the at least one master node and the plurality of slave nodes in each system module network structure diagram. These attribute data may include the CPU load data, storage space data, memory occupancy rate, and network bandwidth data of each node. For example, the server gathers the CPU utilization, disk space utilization, memory utilization, and network bandwidth utilization of the master node and the slave nodes.
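Steps S301 to S304 can be sketched as building a graph from the resource call orders, scoring each node, and splitting the nodes by score. Degree centrality is used here as a simple stand-in for the importance calculation; it is an assumption, and the description also allows other metrics. The module names and call orders are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def build_topology(call_orders: List[List[str]]) -> Dict[str, Set[str]]:
    """Link each resource to the resource called immediately after it."""
    graph: Dict[str, Set[str]] = {}
    for order in call_orders:
        for node in order:
            graph.setdefault(node, set())
        for caller, callee in zip(order, order[1:]):
            graph[caller].add(callee)
    return graph


def degree_centrality(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Fraction of the other nodes that each node is directly connected to."""
    neighbours: Dict[str, Set[str]] = {node: set() for node in graph}
    for src, targets in graph.items():
        for dst in targets:
            if src != dst:
                neighbours[src].add(dst)
                neighbours[dst].add(src)
    scale = max(len(graph) - 1, 1)
    return {node: len(adj) / scale for node, adj in neighbours.items()}


def split_master_slave(
    importance: Dict[str, float], masters: int = 1
) -> Tuple[List[str], List[str]]:
    """Rank nodes by importance (ties broken by name) and take the top ones as masters."""
    ranked = sorted(importance, key=lambda n: (-importance[n], n))
    return ranked[:masters], ranked[masters:]


# Hypothetical call orders taken from three historical resource usage records.
topology = build_topology([
    ["frontend", "backend"],
    ["frontend", "database"],
    ["frontend", "cache"],
])
importance = degree_centrality(topology)
master_nodes, slave_nodes = split_master_slave(importance)
print(master_nodes, slave_nodes)
```

In this toy topology the front-end node is connected to every other node and becomes the master. The node cluster analysis of S303 is omitted for brevity; a graph library could also supply betweenness centrality in place of the hand-written score.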
In a specific embodiment, as shown in fig. 4, the process of executing step S104 may specifically include the following steps:
S401, performing aggregate data coding on node attribute data sets of each system module network structure diagram to obtain a plurality of attribute coding vectors;
S402, taking a plurality of attribute coding vectors as resource allocation training data, and acquiring a training model;
S403, respectively inputting a plurality of attribute coding vectors into a training model to perform model training to obtain a plurality of training prediction results;
S404, according to a plurality of training prediction results, model parameter optimization is carried out on the training model, and a resource prediction model is obtained.
Specifically, for the node attribute data set of each system module network structure diagram, the server performs data encoding so that the training model can process it. The server converts each attribute into a numerical representation using a suitable encoding method, such as one-hot encoding or label encoding. This preserves the relationships between the attributes in a form the model can compute on. By encoding each node attribute data set, the server obtains a plurality of attribute coding vectors, each representing the attribute information of one node. The server takes the attribute coding vectors as resource allocation training data and acquires a training model. The training model may be any of a variety of machine learning models, such as a neural network, a decision tree, or a support vector machine. The model learns how to predict an appropriate resource allocation strategy from a node's attribute coding vector. The server inputs the plurality of attribute coding vectors into the training model for model training, and the model gradually learns the relation between node attributes and resource allocation through feedback from the training data and iterations of an optimization algorithm. In this way, the server obtains a plurality of training prediction results, that is, the model's resource allocation predictions for different node attribute vectors. According to the training prediction results, the server optimizes the model parameters of the training model. This may include adjusting the hyperparameters of the model, optimizing the loss function, or using regularization techniques. By optimizing the model parameters, the server improves the performance and accuracy of the model, so that it better predicts the resource allocation strategy corresponding to a node attribute vector. After the parameters of the training model have been optimized, the server obtains the resource prediction model.
The model can predict a suitable resource allocation strategy from the attribute coding vector of a node. By inputting future node attribute data into the trained model, the server obtains a corresponding resource allocation prediction result, which guides the system in allocating resources reasonably. For example, assume a distributed system in which the node attributes include the processing power, storage capacity, and network bandwidth of each node. The server encodes these attributes to obtain an attribute coding vector for each node. The server uses the attribute coding vectors as training data and builds a neural network model for training. Through repeated iteration and optimization, the server obtains a resource prediction model. Now assume that a new node joins whose attributes are high processing power, medium storage capacity, and low network bandwidth. The server encodes the attributes of this node into a corresponding vector and inputs it into the trained resource prediction model. The model makes a prediction from the attribute coding vector of the node and outputs a predicted resource allocation strategy. For example, the model may predict that the node should be allocated more computing resources and storage capacity, and higher network bandwidth, to match its high processing power. Using such resource allocation predictions, the server performs dynamic resource allocation according to the actual state of the system. If other nodes in the system have free resources, the server allocates a portion of those resources to the node to meet its needs. If resources in the system are scarce, the server preferentially allocates resources to the nodes with higher importance, so as to protect the overall performance and stability of the system. The server keeps optimizing the resource allocation model by continuously receiving historical resource allocation requirements and performing resource prediction.
By analyzing the historical response state information and the actual resource use condition, the server adjusts parameters of the model and the training method so as to improve the accuracy and effect of prediction.
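Steps S401 to S404 can be illustrated with a small sketch. It assumes that node attributes are given as the categorical levels used in the example above (low, medium, high), encodes them with one-hot encoding, and fits a plain linear model by stochastic gradient descent. The linear model is only a stand-in for the neural network, decision tree, or support vector machine mentioned in the description, and the training nodes and target values are invented for illustration.

```python
LEVELS = ["low", "medium", "high"]


def one_hot(level):
    """One-hot encode a categorical level as a list of floats."""
    return [1.0 if level == name else 0.0 for name in LEVELS]


def encode_node(node):
    """S401: concatenate the one-hot codes of all attributes into one vector."""
    return one_hot(node["cpu"]) + one_hot(node["storage"]) + one_hot(node["bandwidth"])


def predict(weights, bias, vector):
    """Linear prediction of the share of resources to allocate to a node."""
    return bias + sum(w * x for w, x in zip(weights, vector))


def train(samples, targets, epochs=2000, lr=0.05):
    """S403/S404: iterate over the training data and update the parameters."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, targets):
            error = predict(weights, bias, x) - y
            # Gradient step on the squared error of this sample.
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical training nodes and allocation targets (S402).
nodes = [
    {"cpu": "high", "storage": "medium", "bandwidth": "low"},
    {"cpu": "low", "storage": "low", "bandwidth": "low"},
    {"cpu": "medium", "storage": "high", "bandwidth": "high"},
    {"cpu": "high", "storage": "high", "bandwidth": "medium"},
]
targets = [0.8, 0.2, 0.5, 0.9]

samples = [encode_node(node) for node in nodes]
weights, bias = train(samples, targets)
predictions = [predict(weights, bias, x) for x in samples]
print([round(p, 3) for p in predictions])
```

The sketch only checks that the model reproduces its own training targets. Evaluating a real resource prediction model would require held-out data and a loss function chosen for the allocation task.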
In a specific embodiment, the process of executing step S105 may specifically include the following steps:
(1) Acquiring a target resource allocation requirement to be processed;
(2) Inputting the target resource allocation requirement into a resource prediction model, wherein the resource prediction model comprises a two-layer convolution network, a coding network and a decoding network;
(3) And predicting the resource allocation strategy of the target resource allocation requirement through a resource prediction model to obtain the target resource allocation strategy.
Specifically, the server collects the target resource allocation requirements to be processed. These requirements are requests submitted by system users, including requirements for computing resources, storage resources, network bandwidth, and the like. For example, a user may request a certain amount of computing resources and storage capacity to run a particular application. The server inputs these target resource allocation requirements into the resource prediction model. The resource prediction model is a trained model comprising a two-layer convolutional network, an encoding network, and a decoding network. These network structures are designed to extract and learn features from the target resource allocation requirements and to predict resource allocation policies. The server predicts the resource allocation strategy for the target resource allocation requirement through the resource prediction model. From the input target resource requirement, the model outputs a resource allocation strategy through the processing and calculation of its network structure. The policy directs the system to allocate the target resources reasonably, so as to meet the demand and optimize the performance and efficiency of the system. For example, assume a resource prediction model that has been trained to predict the allocation of computing resources and storage capacity. A target resource allocation requirement now arrives that asks for 100 compute cores and 1 TB of storage capacity. The server inputs the demand into the resource prediction model, and the model outputs a predicted resource allocation strategy. Based on the model's prediction, the system allocates appropriate computing resources and storage capacity to the demand. For example, the system may allocate a server with 100 computing cores and sufficient storage capacity to meet the demand.
In this way, the target resources are properly allocated, and the resource utilization and performance of the system are improved. By continuously collecting target resource allocation requirements and inputting them into the resource prediction model, the system implements dynamic resource allocation policy prediction. This embodiment helps the system allocate resources reasonably, improves resource utilization and system performance, and meets the requirements of users. In addition, continuously optimizing and retraining the resource prediction model improves the accuracy and effect of prediction, making the system more intelligent and adaptive.
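The worked example of a demand for 100 compute cores and 1 TB of storage can be sketched as below. The sketch does not implement the convolutional, encoding, and decoding networks; it replaces the predicted policy with an assumed best-fit rule, purely to show how a demand might be matched against available servers. The `Demand` and `Server` types, the server names, and the capacities are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Demand:
    cores: int
    storage_tb: float


@dataclass
class Server:
    name: str
    free_cores: int
    free_storage_tb: float


def best_fit(demand: Demand, servers: List[Server]) -> Optional[Server]:
    """Pick the server that satisfies the demand with the least leftover capacity."""
    candidates = [
        s for s in servers
        if s.free_cores >= demand.cores and s.free_storage_tb >= demand.storage_tb
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda s: (s.free_cores - demand.cores, s.free_storage_tb - demand.storage_tb),
    )


def allocate(demand: Demand, servers: List[Server]) -> Optional[str]:
    """Reserve the demanded resources on the chosen server and return its name."""
    chosen = best_fit(demand, servers)
    if chosen is None:
        return None
    chosen.free_cores -= demand.cores
    chosen.free_storage_tb -= demand.storage_tb
    return chosen.name


servers = [
    Server("server-1", free_cores=64, free_storage_tb=4.0),
    Server("server-2", free_cores=128, free_storage_tb=2.0),
    Server("server-3", free_cores=256, free_storage_tb=8.0),
]
demand = Demand(cores=100, storage_tb=1.0)
first = allocate(demand, servers)
second = allocate(demand, servers)
print(first, second)
```

The second identical demand lands on a different server because the first allocation has already consumed most of the capacity of the best-fitting one.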
In a specific embodiment, the process of executing step S106 may specifically include the following steps:
(1) According to the target resource allocation strategy, carrying out dynamic resource allocation on the second distributed system, and acquiring resource load data of the second distributed system;
(2) According to the resource load data, performing operation state analysis on the second distributed system to obtain an operation state index;
(3) And matching the corresponding load balancing model according to the running state index, and carrying out policy optimization on the target resource allocation policy based on the load balancing model to obtain an optimized resource allocation policy.
Specifically, dynamic resource allocation is performed on the second distributed system according to the target resource allocation strategy. Under this policy, the system may increase, decrease, or reallocate the resources of different nodes or instances according to the priority of the demand and the availability of resources. This ensures that resources are allocated as needed and can be adjusted flexibly as the demand changes. For example, if the load of a node is too high, the system may balance the load by dynamically allocating more resources to that node. The system then acquires resource load data of the second distributed system. It collects the resource usage of each node or instance, including CPU utilization, memory usage, network bandwidth, and the like. This data can be used to evaluate the current state and performance of the system. Operation state analysis is then carried out on the resource load data to obtain operation state indexes. The system analyzes the resource load data, for example the utilization of computing resources, the available memory, and the network delay, to evaluate the running state of the system. These indexes can be used to determine the health and performance of the system, such as whether a resource bottleneck or overload condition exists. A corresponding load balancing model is then matched according to the running state index. The load balancing model is a predefined rule or algorithm that decides the mode and strategy of resource allocation. Depending on the requirements and performance goals of the system, different load balancing models may be selected, such as round-robin, weighted round-robin, or least connections.
Policy optimization is then carried out on the target resource allocation policy based on the load balancing model to obtain an optimized resource allocation policy. The system optimizes the target resource allocation strategy according to the current resource load and the running state index, in combination with the load balancing model. The goals of the optimization include maximizing resource utilization, minimizing response time, and balancing load. The system adjusts the resource allocation strategy so that it better suits the current system state and demand, which improves overall performance. For example, assume a second distributed system consisting of a plurality of nodes. According to the target resource allocation policy, the system needs to allocate more computing resources to nodes with higher loads in order to balance the load and improve performance. From the resource load data, the server finds that the CPU utilization of node A is higher than that of the other nodes. Through running state analysis, the server obtains the indexes of node A: high CPU utilization and low memory occupancy. Based on these indexes, the system determines that node A needs more computing resources to handle the high load. Based on the selected load balancing model, the system decides to transfer a portion of the computing resources from other nodes to node A, thereby allocating resources dynamically. The system reallocates resources according to the resource requirements of node A and the resource availability of the other nodes, so that node A obtains the resources it needs. The system then optimizes the target resource allocation policy: according to the current resource load and the running state indexes, and in combination with the load balancing model, it adjusts the resource allocation strategy to improve the performance of the whole system.
For example, the system may adjust the weight or priority of the resource allocation according to the requirements of the load balancing model, so that the resource allocation is more balanced and efficient. In this embodiment, the process of dynamically allocating resources to the second distributed system and optimizing the resource allocation policy according to the target resource allocation policy is a continuous cycle. The system can continuously collect resource load data, analyze the running state and adjust and optimize the resources according to the running state index and the load balancing model. Thus, the system can maintain efficient resource utilization and performance under different load conditions so as to meet the changing demands.
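The matching of a load balancing model to the running state index can be sketched as follows. The description does not define the index or the matching rule, so this sketch assumes the index is the mean CPU utilization plus the spread between the busiest and the idlest node, and that a large spread selects a least-loaded rule while a small spread selects plain round-robin. The threshold of 0.3 and all function names are hypothetical.

```python
import itertools


def running_state_index(load):
    """Summarise per-node CPU utilization (0.0 to 1.0) as mean and spread."""
    values = list(load.values())
    return {"mean": sum(values) / len(values), "spread": max(values) - min(values)}


def match_model(index, spread_threshold=0.3):
    """Assumed rule: an unbalanced system uses least-loaded, otherwise round-robin."""
    return "least_loaded" if index["spread"] > spread_threshold else "round_robin"


def least_loaded(load):
    """Return the node with the lowest utilization (ties broken by name)."""
    return min(load, key=lambda node: (load[node], node))


def round_robin(nodes):
    """Return an endless iterator that cycles through the nodes in name order."""
    return itertools.cycle(sorted(nodes))


# Node A is overloaded, as in the example above.
unbalanced = {"A": 0.9, "B": 0.4, "C": 0.35}
model = match_model(running_state_index(unbalanced))
target = least_loaded(unbalanced)
print(model, target)

# A roughly balanced system falls back to round-robin.
balanced = {"A": 0.5, "B": 0.45, "C": 0.55}
balanced_model = match_model(running_state_index(balanced))
rotation = round_robin(balanced)
first_three = [next(rotation) for _ in range(3)]
print(balanced_model, first_three)
```

In the continuous cycle described above, these functions would be re-evaluated each time new resource load data is collected, so the matched model can change as the load shifts.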
The above describes a dynamic resource allocation method based on a distributed system environment in an embodiment of the present invention, and the following describes a dynamic resource allocation system based on a distributed system environment in an embodiment of the present invention, referring to fig. 5, and one embodiment of the dynamic resource allocation system based on a distributed system environment in an embodiment of the present invention includes:
an obtaining module 501, configured to obtain a plurality of first distributed systems and system resource data of each first distributed system, and modularly integrate the plurality of first distributed systems according to the system resource data to obtain a second distributed system;
The monitoring module 502 is configured to receive and respond to a plurality of historical resource allocation requirements through the second distributed system, and monitor resource usage data of the second distributed system to obtain a plurality of historical resource usage data;
An analysis module 503, configured to respectively construct a plurality of system module network structure diagrams corresponding to the plurality of historical resource usage data, and perform node relationship analysis on the system module network structure diagrams to obtain a node attribute data set of each system module network structure diagram;
the establishing module 504 is configured to construct resource allocation training data according to the node attribute data set of the network structure diagram of each system module, and establish a resource prediction model according to the resource allocation training data;
the prediction module 505 is configured to obtain a target resource allocation requirement to be processed, and input the target resource allocation requirement into the resource prediction model to perform resource allocation policy prediction, so as to obtain a target resource allocation policy;
and the allocation module 506 is configured to perform dynamic resource allocation on the second distributed system according to the target resource allocation policy, and perform operation state monitoring and policy optimization on the second distributed system to obtain an optimized resource allocation policy.
Optionally, the obtaining module 501 is specifically configured to:
acquiring a plurality of first distributed systems, and respectively acquiring resource data of the plurality of first distributed systems to acquire system resource data of each first distributed system;
Performing association relation analysis on the plurality of first distributed systems according to the system resource data to obtain a target system association relation;
And according to the association relation of the target systems, carrying out module division and modularized integration on the plurality of first distributed systems to obtain a second distributed system.
Optionally, the monitoring module 502 is specifically configured to:
receiving a plurality of historical resource allocation requirements, and respectively transmitting the plurality of historical resource allocation requirements to the second distributed system for demand response to obtain a plurality of response state information;
based on the plurality of response state information, carrying out real-time update on a response mode of the second distributed system, and recording a plurality of response modes;
Acquiring resource use starting time, resource calling sequence and resource use data quantity of each response mode;
Generating a plurality of historical resource usage data according to the resource usage start time, the resource call sequence and the resource usage data amount.
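As an illustration of the last two steps, the sketch below assembles historical resource usage data from the resource usage start time, resource call sequence, and resource usage data amount of each response mode. The record layout, the `ResponseMode` type, and the derived `hops` field are assumptions made for this example and are not specified in the description.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ResponseMode:
    start_time: float                # resource usage start time, seconds since epoch
    call_sequence: Tuple[str, ...]   # ordered resources called by this response
    data_amount: int                 # resource usage data amount, in bytes


def build_usage_data(modes: List[ResponseMode]) -> List[Dict]:
    """Turn recorded response modes into usage records ordered by start time."""
    records = []
    for mode in sorted(modes, key=lambda m: m.start_time):
        records.append({
            "start_time": mode.start_time,
            "call_sequence": list(mode.call_sequence),
            "data_amount": mode.data_amount,
            # Number of calls between consecutive resources in the sequence.
            "hops": max(len(mode.call_sequence) - 1, 0),
        })
    return records


modes = [
    ResponseMode(1700000060.0, ("frontend-1", "backend-1", "db-1"), 4096),
    ResponseMode(1700000000.0, ("frontend-2", "backend-1"), 1024),
]
usage_data = build_usage_data(modes)
print(usage_data[0]["call_sequence"], usage_data[0]["hops"])
```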
Optionally, the analysis module 503 is specifically configured to:
Constructing a resource topological relation diagram of the second distributed system according to the resource calling sequence in each historical resource use data;
Constructing a plurality of system module network structure diagrams corresponding to the plurality of historical resource use data through the resource topological relation diagram;
Performing node cluster analysis on the system module network structure diagram to obtain a plurality of initial nodes, and performing importance calculation on the plurality of initial nodes to obtain the importance of each initial node;
Performing master-slave node type division on the plurality of initial nodes according to the importance degree to obtain at least one master node and a plurality of slave nodes of a network structure diagram of each system module;
And respectively acquiring node attribute data of the at least one master node and the plurality of slave nodes to obtain a node attribute data set of a network structure diagram of each system module, wherein the node attribute data set comprises CPU load data, storage space data, memory occupancy rate and network bandwidth data of each node.
Optionally, the establishing module 504 is specifically configured to:
performing aggregate data coding on node attribute data sets of each system module network structure diagram to obtain a plurality of attribute coding vectors;
Taking the attribute coding vectors as resource allocation training data, and acquiring a training model;
respectively inputting the attribute coding vectors into the training model to perform model training to obtain a plurality of training prediction results;
and according to the training prediction results, performing model parameter optimization on the training model to obtain a resource prediction model.
Optionally, the prediction module 505 is specifically configured to:
Acquiring a target resource allocation requirement to be processed;
Inputting the target resource allocation requirement into the resource prediction model, wherein the resource prediction model comprises a two-layer convolution network, a coding network and a decoding network;
And predicting the resource allocation strategy of the target resource allocation requirement through the resource prediction model to obtain the target resource allocation strategy.
Optionally, the allocation module 506 is specifically configured to:
According to the target resource allocation strategy, carrying out dynamic resource allocation on the second distributed system, and acquiring resource load data of the second distributed system;
according to the resource load data, performing operation state analysis on the second distributed system to obtain an operation state index;
and matching a corresponding load balancing model according to the running state index, and carrying out policy optimization on the target resource allocation policy based on the load balancing model to obtain an optimized resource allocation policy.
Through the cooperation of the above components, a plurality of first distributed systems and system resource data of each first distributed system are acquired, and the plurality of first distributed systems are modularly integrated according to the system resource data to obtain a second distributed system; a plurality of historical resource allocation demands are received and responded to through the second distributed system, and resource usage data is monitored; a plurality of system module network structure diagrams are respectively constructed and node relations are analyzed to obtain node attribute data sets; resource allocation training data is constructed according to the node attribute data sets, and a resource prediction model is established. The invention improves resource utilization, reduces resource waste, improves the performance and reliability of the system, realizes automatic and intelligent resource management, reduces manual intervention, improves efficiency, and copes with continuously changing business demands and load fluctuations by optimizing the running state of the distributed system.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied, in whole or in part, in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. The dynamic resource allocation method based on the distributed system environment is characterized by comprising the following steps:
acquiring a plurality of first distributed systems and system resource data of each first distributed system, and modularly integrating the plurality of first distributed systems according to the system resource data to obtain a second distributed system;
Receiving and responding to a plurality of historical resource allocation demands through the second distributed system, and monitoring resource usage data of the second distributed system to obtain a plurality of historical resource usage data;
Respectively constructing a plurality of system module network structure diagrams corresponding to the plurality of historical resource use data, and carrying out node relation analysis on the system module network structure diagrams to obtain a node attribute data set of each system module network structure diagram; the method specifically comprises the following steps: constructing a resource topological relation diagram of the second distributed system according to the resource calling sequence in each historical resource use data; constructing a plurality of system module network structure diagrams corresponding to the plurality of historical resource use data through the resource topological relation diagram; performing node cluster analysis on the system module network structure diagram to obtain a plurality of initial nodes, and performing importance calculation on the plurality of initial nodes to obtain the importance of each initial node; performing master-slave node type division on the plurality of initial nodes according to the importance degree to obtain at least one master node and a plurality of slave nodes of a network structure diagram of each system module; respectively acquiring node attribute data of the at least one master node and the plurality of slave nodes to obtain a node attribute data set of a network structure diagram of each system module, wherein the node attribute data set comprises CPU load data, storage space data, memory occupancy rate and network bandwidth data of each node;
Constructing resource allocation training data according to node attribute data sets of the network structure diagram of each system module, and constructing a resource prediction model according to the resource allocation training data;
Acquiring a target resource allocation requirement to be processed, and inputting the target resource allocation requirement into the resource prediction model to predict a resource allocation strategy to obtain a target resource allocation strategy;
and carrying out dynamic resource allocation on the second distributed system according to the target resource allocation strategy, and carrying out running state monitoring and strategy optimization on the second distributed system to obtain an optimized resource allocation strategy.
2. The method for dynamic resource allocation in a distributed system environment according to claim 1, wherein the acquiring a plurality of first distributed systems and system resource data of each first distributed system, and modularly integrating the plurality of first distributed systems according to the system resource data, to obtain a second distributed system, includes:
acquiring a plurality of first distributed systems, and respectively acquiring resource data of the plurality of first distributed systems to acquire system resource data of each first distributed system;
Performing association relation analysis on the plurality of first distributed systems according to the system resource data to obtain a target system association relation;
And according to the association relation of the target systems, carrying out module division and modularized integration on the plurality of first distributed systems to obtain a second distributed system.
3. The method for dynamic resource allocation in a distributed system environment according to claim 1, wherein said receiving and responding to a plurality of historical resource allocation demands by the second distributed system, and performing resource usage data monitoring on the second distributed system, to obtain a plurality of historical resource usage data, includes:
receiving a plurality of historical resource allocation requirements, and respectively transmitting the plurality of historical resource allocation requirements to the second distributed system for demand response to obtain a plurality of response state information;
based on the plurality of response state information, carrying out real-time update on a response mode of the second distributed system, and recording a plurality of response modes;
Acquiring resource use starting time, resource calling sequence and resource use data quantity of each response mode;
Generating a plurality of historical resource usage data according to the resource usage start time, the resource call sequence and the resource usage data amount.
4. The method for dynamic resource allocation in a distributed system environment according to claim 1, wherein said constructing resource allocation training data according to the node attribute data set of the network structure diagram of each system module, and constructing a resource prediction model according to the resource allocation training data, comprises:
performing aggregate data coding on node attribute data sets of each system module network structure diagram to obtain a plurality of attribute coding vectors;
Taking the attribute coding vectors as resource allocation training data, and acquiring a training model;
respectively inputting the attribute coding vectors into the training model to perform model training to obtain a plurality of training prediction results;
and according to the training prediction results, performing model parameter optimization on the training model to obtain a resource prediction model.
5. The method for dynamic resource allocation in a distributed system environment according to claim 1, wherein the obtaining the target resource allocation requirement to be processed, and inputting the target resource allocation requirement into the resource prediction model to perform resource allocation policy prediction, and obtaining the target resource allocation policy, includes:
Acquiring a target resource allocation requirement to be processed;
Inputting the target resource allocation requirement into the resource prediction model, wherein the resource prediction model comprises a two-layer convolution network, a coding network and a decoding network;
And predicting the resource allocation strategy of the target resource allocation requirement through the resource prediction model to obtain the target resource allocation strategy.
6. The method for dynamic resource allocation in a distributed system environment according to claim 1, wherein said performing dynamic resource allocation on the second distributed system according to the target resource allocation policy, and performing operation state monitoring and policy optimization on the second distributed system to obtain an optimized resource allocation policy includes:
According to the target resource allocation strategy, carrying out dynamic resource allocation on the second distributed system, and acquiring resource load data of the second distributed system;
according to the resource load data, performing operation state analysis on the second distributed system to obtain an operation state index;
and matching a corresponding load balancing model according to the running state index, and carrying out policy optimization on the target resource allocation policy based on the load balancing model to obtain an optimized resource allocation policy.
7. A dynamic resource allocation system based on a distributed system environment, the dynamic resource allocation system based on the distributed system environment comprising:
the acquisition module is used for acquiring a plurality of first distributed systems and system resource data of each first distributed system, and modularly integrating the plurality of first distributed systems according to the system resource data to obtain a second distributed system;
The monitoring module is used for receiving and responding to a plurality of historical resource allocation demands through the second distributed system, and monitoring the resource usage data of the second distributed system to obtain a plurality of historical resource usage data;
The analysis module is used for respectively constructing a plurality of system module network structure diagrams corresponding to the plurality of historical resource use data, and carrying out node relation analysis on the system module network structure diagrams to obtain a node attribute data set of each system module network structure diagram; the method specifically comprises the following steps: constructing a resource topological relation diagram of the second distributed system according to the resource calling sequence in each historical resource use data; constructing a plurality of system module network structure diagrams corresponding to the plurality of historical resource use data through the resource topological relation diagram; performing node cluster analysis on the system module network structure diagram to obtain a plurality of initial nodes, and performing importance calculation on the plurality of initial nodes to obtain the importance of each initial node; performing master-slave node type division on the plurality of initial nodes according to the importance degree to obtain at least one master node and a plurality of slave nodes of a network structure diagram of each system module; respectively acquiring node attribute data of the at least one master node and the plurality of slave nodes to obtain a node attribute data set of a network structure diagram of each system module, wherein the node attribute data set comprises CPU load data, storage space data, memory occupancy rate and network bandwidth data of each node;
The building module is used for building resource allocation training data according to the node attribute data set of each system module network structure diagram, and building a resource prediction model according to the resource allocation training data;
The prediction module is used for acquiring a target resource allocation requirement to be processed, inputting the target resource allocation requirement into the resource prediction model to predict a resource allocation strategy, and obtaining a target resource allocation strategy;
and the allocation module is used for carrying out dynamic resource allocation on the second distributed system according to the target resource allocation strategy, and carrying out running state monitoring and strategy optimization on the second distributed system to obtain an optimized resource allocation strategy.
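The analysis module's topology, importance and master/slave steps can be illustrated with a minimal Python sketch. The claim does not fix an importance measure, so normalized degree centrality is used here purely as an assumption. The function names, the threshold parameter and the resource names are hypothetical, and the node cluster analysis step is omitted for brevity.

```python
from collections import defaultdict


def build_topology(call_sequences):
    """Build an undirected adjacency map from resource call sequences.

    Resources that are called consecutively are treated as linked (assumption).
    """
    graph = defaultdict(set)
    for seq in call_sequences:
        for node in seq:
            graph[node]  # register every node, even if it has no edges
        for a, b in zip(seq, seq[1:]):
            if a != b:
                graph[a].add(b)
                graph[b].add(a)
    return dict(graph)


def importance(graph):
    """Normalized degree centrality; an assumed stand-in for 'importance'."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neigh) / (n - 1) for node, neigh in graph.items()}


def split_master_slave(scores, threshold):
    """Nodes at or above the threshold become masters; the rest are slaves.

    If no node reaches the threshold, the top-scoring node is promoted so
    that at least one master always exists.
    """
    masters = sorted(n for n, s in scores.items() if s >= threshold)
    if not masters and scores:
        masters = [max(sorted(scores), key=scores.get)]
    slaves = sorted(n for n in scores if n not in masters)
    return masters, slaves


sequences = [["gateway", "cache", "db"], ["gateway", "db"], ["gateway", "queue"]]
scores = importance(build_topology(sequences))
masters, slaves = split_master_slave(scores, threshold=0.9)
print(masters, slaves)
```

In this toy input "gateway" is linked to every other resource, so it is the only node that reaches the threshold and becomes the master node.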
8. The dynamic resource allocation system in a distributed system environment according to claim 7, wherein the acquisition module is specifically configured to:
acquiring a plurality of first distributed systems, and respectively acquiring resource data of the plurality of first distributed systems to acquire system resource data of each first distributed system;
Performing association relation analysis on the plurality of first distributed systems according to the system resource data to obtain a target system association relation;
And, according to the target system association relation, carrying out module division and modularized integration on the plurality of first distributed systems to obtain the second distributed system.
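One way to picture the association analysis and module division in claim 8 is the sketch below. The claim does not say how association is measured. This example assumes cosine similarity between per-system resource vectors and groups systems that exceed a threshold using union-find. The system names, the vector layout (CPU cores, memory, storage) and the 0.95 threshold are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length resource vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def group_systems(resources, threshold=0.95):
    """Group systems whose resource profiles are associated (similarity >= threshold)."""
    names = sorted(resources)
    parent = {n: n for n in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(resources[a], resources[b]) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical profiles: (CPU cores, memory in GB, storage in GB).
profiles = {
    "sys_a": [8, 32, 500],
    "sys_b": [16, 64, 1000],
    "sys_c": [64, 8, 10],
}
print(group_systems(profiles))
```

Each resulting group would correspond to one module of the integrated second distributed system. Here sys_a and sys_b share a proportional profile and fall into the same module, while sys_c stands alone.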
9. The dynamic resource allocation system in a distributed system environment according to claim 7, wherein the monitoring module is specifically configured to:
receiving a plurality of historical resource allocation requirements, and respectively transmitting the plurality of historical resource allocation requirements to the second distributed system for demand response to obtain a plurality of response state information;
based on the plurality of response state information, updating the response mode of the second distributed system in real time, and recording a plurality of response modes;
Acquiring resource use starting time, resource calling sequence and resource use data quantity of each response mode;
Generating a plurality of historical resource usage data according to the resource usage start time, the resource call sequence and the resource usage data amount.
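The three fields recorded per response mode in claim 9 map naturally onto a small record type. The following Python sketch is illustrative only: the class name, the field names and the dictionary keys are hypothetical, and ordering records by start time is an assumption rather than something the claim requires.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ResourceUsageRecord:
    """One historical resource usage record (hypothetical field names)."""
    start_time: float                # resource usage start time, e.g. epoch seconds
    call_sequence: Tuple[str, ...]   # order in which resources were called
    data_amount: int                 # resource usage data amount, e.g. bytes


def build_usage_history(responses):
    """Turn raw per-response-mode measurements into records sorted by start time."""
    records = [
        ResourceUsageRecord(
            start_time=float(r["start_time"]),
            call_sequence=tuple(r["call_sequence"]),
            data_amount=int(r["data_amount"]),
        )
        for r in responses
    ]
    return sorted(records, key=lambda rec: rec.start_time)


history = build_usage_history([
    {"start_time": 20, "call_sequence": ["cpu", "disk"], "data_amount": 512},
    {"start_time": 5, "call_sequence": ["net"], "data_amount": 64},
])
for rec in history:
    print(rec)
```

Keeping the call sequence as an ordered tuple preserves the information the analysis module later needs to build the resource topological relation diagram.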
CN202310705142.7A 2023-06-14 2023-06-14 Dynamic resource allocation method and system based on distributed system environment Active CN116662010B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310705142.7A CN116662010B (en) 2023-06-14 2023-06-14 Dynamic resource allocation method and system based on distributed system environment

Publications (2)

Publication Number Publication Date
CN116662010A CN116662010A (en) 2023-08-29
CN116662010B CN116662010B (en) 2024-05-07

Family

ID=87715049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310705142.7A Active CN116662010B (en) 2023-06-14 2023-06-14 Dynamic resource allocation method and system based on distributed system environment

Country Status (1)

Country Link
CN (1) CN116662010B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116881744B (en) * 2023-09-07 2023-12-22 北京中科朗易科技有限责任公司 Operation and maintenance data distribution method, device, equipment and medium based on Internet of things
CN117273374A (en) * 2023-10-17 2023-12-22 深圳达普信科技有限公司 Cloud service open platform based on supply chain ecosystem
CN117240806B (en) * 2023-11-16 2024-02-06 北京邮电大学 Network resource allocation and scheduling method under super fusion architecture

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108900358A (en) * 2018-08-01 2018-11-27 重庆邮电大学 Virtual network function dynamic migration method based on deepness belief network resource requirement prediction
WO2022171066A1 (en) * 2021-02-10 2022-08-18 ***通信有限公司研究院 Task allocation method and apparatus based on internet-of-things device, and network training method and apparatus
CN115118602A (en) * 2022-06-21 2022-09-27 中船重工信息科技有限公司 Container resource dynamic scheduling method and system based on usage prediction
CN115629883A (en) * 2022-10-31 2023-01-20 中国农业银行股份有限公司 Resource prediction method, resource prediction device, computer equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7536373B2 (en) * 2006-02-14 2009-05-19 International Business Machines Corporation Resource allocation using relational fuzzy modeling
US20200287923A1 (en) * 2019-03-08 2020-09-10 International Business Machines Corporation Unsupervised learning to simplify distributed systems management

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
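Policy-based dynamic spectrum resource allocation technology; 王琳 (Wang Lin) et al.; Journal of China Academy of Electronics and Information Technology; Vol. 13 (No. 1); pp. 108-114 *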

Also Published As

Publication number Publication date
CN116662010A (en) 2023-08-29

Similar Documents

Publication Publication Date Title
CN116662010B (en) Dynamic resource allocation method and system based on distributed system environment
Askarizade Haghighi et al. An energy-efficient dynamic resource management approach based on clustering and meta-heuristic algorithms in cloud computing IaaS platforms: Energy efficient dynamic cloud resource management
Wu et al. Energy and migration cost-aware dynamic virtual machine consolidation in heterogeneous cloud datacenters
CN107404523A (en) Cloud platform adaptive resource dispatches system and method
Huang et al. Scalable orchestration of service function chains in NFV-enabled networks: A federated reinforcement learning approach
CN104317658A (en) MapReduce based load self-adaptive task scheduling method
CN110221920B (en) Deployment method, device, storage medium and system
CN113037800B (en) Job scheduling method and job scheduling device
Li An adaptive overload threshold selection process using Markov decision processes of virtual machine in cloud data center
Wei et al. Multi-dimensional resource allocation in distributed data centers using deep reinforcement learning
CN113467944B (en) Resource deployment device and method for complex software system
Hummaida et al. Scalable virtual machine migration using reinforcement learning
Taghizadeh et al. An efficient data replica placement mechanism using biogeography-based optimization technique in the fog computing environment
Najafizadegan et al. An autonomous model for self‐optimizing virtual machine selection by learning automata in cloud environment
CN113641445B (en) Cloud resource self-adaptive configuration method and system based on depth deterministic strategy
Tekiyehband et al. An efficient dynamic service provisioning mechanism in fog computing environment: A learning automata approach
Li et al. QoS-aware and multi-objective virtual machine dynamic scheduling for big data centers in clouds
CN114490049A (en) Method and system for automatically allocating resources in containerized edge computing
Premalatha et al. Optimal Energy-efficient Resource Allocation and Fault Tolerance scheme for task offloading in IoT-FoG Computing Networks
CN107155215B (en) Distribution method and device of application home service cluster
CN117290102A (en) Cross-domain heterogeneous resource scheduling method and device
CN116668442A (en) High-precision cooperative scheduling system and method for network cloud resources driven by intention
CN114466014B (en) Service scheduling method and device, electronic equipment and storage medium
CN116257336A (en) Operator intelligent parallelization stream processing method and device under fluctuation data stream scene
Arravinth et al. Multi-Agent with Multi Objective-Based Optimized Resource Allocation on Inter-Cloud.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant