CN116340006B - Computing power resource idle prediction method based on deep learning and storage medium - Google Patents
- Publication number
- CN116340006B (application CN202310603689.6A)
- Authority
- CN
- China
- Prior art keywords: data, power resource, deep learning, model, resource idle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a deep-learning-based computing power resource idle prediction method and a storage medium. The method comprises the following steps: collecting business behavior data, classifying the data, and labeling the classified data to obtain a training set; preprocessing the business behavior data, where the preprocessing includes data expansion and imbalance processing; constructing a feature extraction model for computing power resource idle prediction, inputting the training set into the model for training, and extracting features of the business behavior data with the trained model; and predicting idle computing power resources from the business behavior data through the computing power resource idle prediction model. The invention retains strong robustness and generalization capability under conditions such as insufficient service data and low universality of application scenarios; at the same time, by incorporating an attention mechanism, the proposed deep learning model effectively achieves adaptive refinement in the feature extraction stage.
Description
Technical Field
The invention relates to the technical field of deep learning, in particular to a computational power resource idle prediction method based on deep learning and a computer readable storage medium.
Background
Computing infrastructure takes various forms, is widely distributed, and has complex ownership, so measuring computing power is difficult. Isolated computing units have fixed functions and limited capacity, and single-point supply cannot meet the diversity of service demands, restricting the development of emerging industries. As computing power becomes a new driving force for social development, supplying it flexibly, on demand, and in measured quantity, like basic utilities such as water and electricity, will be the future trend. The core feature of a computing power network is unified management of computing and network resources, so that the network can sense in real time both users' computing demands and the network's computing state. After analysis, the network can schedule computing resources of different locations and types to serve users. Therefore, planning the use of computing power reasonably, and achieving reasonable allocation of computing resources by accurately predicting when they will be idle, is of great significance.
Chinese patent publication No. CN114896070A proposes a GPU resource allocation method for deep learning tasks: it predicts the resource demand of a deep learning task and allocates GPU resources in a container cloud cluster according to the predicted demand, thereby realizing GPU resource sharing in the cluster and improving GPU utilization.
Chinese patent publication No. CN114035945A proposes a computing power resource allocation method comprising the following steps: when the computing resources of a target device in a local area network are insufficient, obtain the idle computing resources that support devices in the network can provide, together with each support device's quality-of-service coefficient, which at least includes a reliability coefficient indicating how reliably the device has historically provided computing support with its idle resources; then, according to the idle resources available from the support devices, the quality-of-service coefficients, and the target computing resources required by the target device, determine at least one first support device to provide computing support for the target device and the first idle computing resources each first support device must supply, these resources being used to process the computing task corresponding to the target resources. The method avoids wasting idle computing resources and improves the reliability of task processing.
The above methods can allocate computing resources reasonably, but they only feed back the idle state of computing power in real time through a system, after which a person must judge whether the computing power is idle; they cannot predict idle periods from the real-time state of the service, so computing power is easily wasted over a period of time. Further improvement is therefore needed in this respect.
Disclosure of Invention
In order to solve the technical problems, the invention provides a computing power resource idle prediction method based on deep learning and a computer readable storage medium.
In order to achieve the above object, the present invention provides a method for predicting the idle computing power resource based on deep learning, comprising the following steps:
Collecting business behavior data, classifying the data, and labeling the classified data to obtain a training set; preprocessing business behavior data, wherein the preprocessing comprises data expansion and unbalance processing; constructing a feature extraction model of computational power resource idle prediction, inputting a training set into the model for training, and extracting features of business behavior data by using the trained model; and predicting the idle computing power resources of the business behavior data through the idle computing power resource prediction model.
Optionally, the preprocessing further comprises redundant sample removal of the business behavior data.
Optionally, the imbalance processing applies a random oversampling algorithm to the computing power resource idle prediction data: samples of the minority class are drawn at random and combined with the initial minority-class samples, so that the number of minority-class samples equals the number of majority-class samples.
Optionally, the data set processed by the random oversampling algorithm is as follows:
|S′| = |S_maj| + |S_min| + |E|,
where S_maj and S_min denote the majority-class and minority-class samples, E is the data set obtained by randomly sampling the minority-class samples, and S′ is the data set obtained by imbalance processing of the initial data set S.
Optionally, the feature extraction model of the computational resource free prediction comprises a residual neural network comprising a convolutional layer, a pooling layer, a residual block, dropout, and a softmax classifier.
Optionally, the computational resource idle prediction model is an extreme learning machine.
Optionally, the network structure of the extreme learning machine includes an input layer, a hidden layer, and an output layer.
Optionally, the method further comprises: optimizing the residual neural network through an evolutionary algorithm, taking the inverse of the error as the fitness function, which serves as the criterion for measuring individual fitness in the population. The formula is as follows:
F = 1/E, E = (1/2)·Σ_j (y_j − P(x, w))²,
where E is the error function, P is the overall network output, w is the weight vector, x is the input vector, F is the fitness, j indexes the selections, and y_j is the theoretical output.
Optionally, the residual neural network utilizes an attention mechanism to improve the threshold value, and the network automatically generates a corresponding threshold value according to the data.
To achieve the above object, the present invention provides a computer-readable storage medium storing computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform the deep learning-based power resource idling prediction method of any one of the above.
According to the technical scheme of the invention, business behavior data are collected and labeled, then expanded and imbalance-processed; features are extracted and modeled to predict idle periods of computing power, achieving reasonable allocation of computing resources. Compared with the prior art, the method retains strong robustness and generalization capability under conditions such as insufficient service data and low universality of application scenarios; at the same time, by incorporating an attention mechanism, the proposed deep learning model effectively achieves adaptive refinement in the feature extraction stage.
Drawings
In order to more clearly illustrate the embodiments of the invention and the technical solutions of the prior art, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of the computing power resource idle prediction method based on deep learning according to the invention.
FIG. 2 is a flow chart of a method for unbalanced processing of computational power resource idle prediction data according to the present invention.
Fig. 3 is a schematic structural diagram of a residual neural network according to the present invention.
Fig. 4 is a flow chart of the adaptive threshold generation process of the present invention.
Detailed Description
In order to further explain the technical means and effects adopted by the present invention to achieve its intended purpose, the following detailed description sets forth specific embodiments, structures, features, and effects of the deep-learning-based computing power resource idle prediction method according to the present invention, with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The invention provides a computational power resource idle prediction method based on deep learning, referring to fig. 1, the method comprises the following steps:
Step S10: collecting business behavior data, classifying the data, and labeling the classified data to obtain a training set.
The data used by the invention for computing power resource idle prediction are business behavior data, comprising a series of service-related indexes. The invention collects a computing power resource idle prediction feature library covering network traffic features, data packet features, device operation features, and the like, and labels the data as training samples for model training.
Step S20: preprocessing the business behavior data, where the preprocessing includes data expansion and imbalance processing.
In one embodiment of the present invention, the preprocessing further includes redundant sample removal for the business activity data.
Redundant samples are removed from the data to improve data quality. Meanwhile, sample expansion and imbalance processing are performed. Referring to fig. 2, the invention uses a random oversampling algorithm to process the imbalanced computing power resource idle prediction samples: samples of the minority class are drawn at random and combined with the initial minority-class samples to obtain a balanced prediction data set, raising the number of minority-class samples to the level of the majority class.
Assuming the initial data set is S and the data set obtained by randomly sampling the minority-class samples is E, the data set produced by the random oversampling algorithm satisfies:
|S′| = |S_maj| + |S_min| + |E|,
where S_maj and S_min denote the majority-class and minority-class samples, respectively, and S′ is the final data set after imbalance processing.
The data set E is drawn from the minority-class samples S_min by random sampling and combined with S_maj and S_min to obtain S′, balancing the classes in the computing power resource idle prediction data set.
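The oversampling step above can be sketched in a few lines. This is a minimal illustration on a toy binary-labeled data set; the function name and the (feature, label) tuples are illustrative, not the patent's implementation:

```python
import random

def random_oversample(majority, minority, seed=0):
    """Random oversampling sketch: draw minority-class samples with
    replacement (the set E) and merge them with S_maj and S_min so
    the two classes balance."""
    rng = random.Random(seed)
    deficit = len(majority) - len(minority)
    extra = [rng.choice(minority) for _ in range(deficit)]  # E
    return majority + minority + extra                      # S' = S_maj + S_min + E

majority = [(i, 0) for i in range(90)]   # 90 majority-class samples
minority = [(i, 1) for i in range(10)]   # 10 minority-class samples
balanced = random_oversample(majority, minority)
```

After this step both classes contribute 90 samples, so |S′| = |S_maj| + |S_min| + |E| = 90 + 10 + 80 = 180.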
Step S30: constructing a feature extraction model for computing power resource idle prediction, inputting the training set into the model for training, and extracting features of the business behavior data with the trained model.
In practical computing power resource idle prediction scenarios, supervised learning requires large manually labeled data sets, but labeling cost and quality limit the scale of training data actually available. How to learn feature representations with good generalization from limited labeled data is therefore an important problem in idle prediction tasks.
The invention provides a feature extraction model using a residual neural network based on an evolution algorithm and an attention mechanism as an idle prediction of computational resources.
Referring to fig. 3, the residual neural network structure includes a convolution layer, a pooling layer, a residual block, a Dropout, and a softmax classifier, and the specific structure of each layer is as follows:
(1) Convolutional layer
The convolution layer extracts object features; deep feature extraction is achieved by setting hyperparameters such as the number of convolution layers and the convolution window size. A further characteristic of convolutional neural networks is parameter sharing: the same kernel weights are reused at every position, and the values of these weights determine the quality of the convolution; the kernel itself is a set of weight parameters. During convolution, the kernel slides over the input, computing the product with the corresponding region at each position, and the remainder is handled by padding. The convolution operation between the input features and the kernel, with a bias term added, can be expressed as:
y_j = Σ_{i∈M_j} (x_i * K_ij) + b_j,
where x_i is the ith channel of the input features; y_j is the jth channel of the output features; K is the convolution kernel; b_j is a bias term; and M_j is the set of input channels used to compute the jth output channel.
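A literal, unoptimized rendering of this formula for one output channel; the function name, toy input, and per-channel kernels are illustrative assumptions, not the patent's trained layer:

```python
import numpy as np

def conv_channel(x_channels, kernels, bias, M_j):
    """One output channel y_j = sum_{i in M_j} (x_i * K_ij) + b_j,
    computed as a valid cross-correlation with stride 1."""
    kh, kw = kernels[0].shape
    H, W = x_channels[0].shape
    out = np.full((H - kh + 1, W - kw + 1), float(bias))
    for i in M_j:                     # sum contributions of channels in M_j
        for r in range(out.shape[0]):
            for c in range(out.shape[1]):
                out[r, c] += np.sum(x_channels[i][r:r + kh, c:c + kw] * kernels[i])
    return out

x = [np.ones((4, 4)), 2.0 * np.ones((4, 4))]   # two input channels
k = [np.ones((3, 3)), np.ones((3, 3))]          # one 3x3 kernel per channel
y = conv_channel(x, k, bias=1.0, M_j=[0, 1])    # each output cell: 9 + 18 + 1 = 28
```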
(2) Batch normalization
Batch normalization (BN) is a normalization method proposed to address the internal covariate shift problem. Introducing batch normalization into the proposed algorithm accelerates model convergence and, more importantly, alleviates to some extent the vanishing-gradient problem in deep networks, making deep models easier and more stable to train.
The batch normalization selects a small batch in the deep learning training process, then calculates the mean value and variance of the small batch data, and the input of each layer of neural network is kept in the same distribution in the training process after the processing.
Unlike general normalization methods, batch normalization is an operation embedded between the layers of a deep neural network. The calculation process of BN is expressed as:
μ = (1/N_batch) Σ_i x_i;
σ² = (1/N_batch) Σ_i (x_i − μ)²;
z_i = (x_i − μ) / √(σ² + ε);
y_i = γ·z_i + β;
where x_i and y_i represent the input and output features of the ith observation in the batch; N_batch is the number of samples per batch in the classification task; γ and β are two trainable parameters that adaptively learn a more suitable feature distribution; ε is a constant close to zero; μ is the batch mean, σ² is the batch variance, and z_i is the normalized input.
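The four BN equations map directly to code. A training-mode sketch (per-feature statistics over the batch axis; the toy batch is illustrative):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Training-mode batch normalization over a mini-batch:
    statistics are computed per feature column."""
    mu = x.mean(axis=0)                 # batch mean
    var = x.var(axis=0)                 # batch variance
    z = (x - mu) / np.sqrt(var + eps)   # normalized input z_i
    return gamma * z + beta             # y_i = gamma * z_i + beta

x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = batch_norm(x)
```

With γ = 1 and β = 0 the output columns have mean 0 and variance approximately 1, which is the "same distribution" property the text describes.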
(3) Activation function design
ReLU, the most common activation function, avoids the gradient explosion and vanishing-gradient problems associated with S-shaped activation functions and accelerates the convergence of the neural network.
The algorithm for ReLU is as follows:
y=max(0,x);
Wherein: x and y are the input and output, respectively, of the ReLU activation function.
When the input signal oscillates, the ReLU algorithm discards the negative part of the signal, impairing the classification and prediction ability of the model. The method of the present invention therefore adopts LReLU (leaky ReLU) as the activation function to address this problem. The specific algorithm is as follows:
y = x, x > 0; y = a·x, x ≤ 0;
where x and y are the input and output of the LReLU activation function, respectively; a is chosen from practical experience, and extensive experiments show the effect is best when a lies in the range 0 to 0.5.
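A one-line sketch of the LReLU above; a = 0.1 is an illustrative choice inside the 0 to 0.5 range stated in the text:

```python
import numpy as np

def lrelu(x, a=0.1):
    """Leaky ReLU: y = x for x > 0 and y = a*x otherwise, so negative
    inputs are attenuated rather than discarded as with plain ReLU."""
    return np.where(x > 0, x, a * x)

y = lrelu(np.array([-2.0, 0.0, 3.0]))  # negative input scaled by a
```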
(4) Basic principle of residual error module
The residual building block (RBB) is the core of ResNet. An RBB uses a shortcut connection to skip convolutional layer blocks, avoiding gradient explosion and vanishing, which helps build deeper network structures and improves final model performance.
The execution path of the convolution layer block F (x) is "input x→bn layer→activation function relu→convolution layer→bn layer→activation function relu→convolution layer→output F (x)". When the input dimension and the output dimension of the convolution layer block are the same, the output value of the shortcut connection is the input value x, and the final output result of the residual error module is shown as the following formula:
y=F(x)+x;
when the input and output dimensions are different, the shortcut connection needs to use a convolution layer with a convolution kernel size of 1×1 to match the dimension of the output result, so as to obtain the output H (x) of the shortcut connection, and the final output result is shown in the following formula:
y=F(x)+H(x);
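The two shortcut cases above can be sketched with placeholder callables; conv_block and projection stand in for the BN → ReLU → conv path and the 1×1 projection H, and are illustrative assumptions:

```python
import numpy as np

def residual_block(x, conv_block, projection=None):
    """Shortcut connection: y = F(x) + x when the dimensions of F(x)
    and x match, otherwise y = F(x) + H(x) with a projection H."""
    fx = conv_block(x)
    if fx.shape == x.shape:
        return fx + x          # identity shortcut
    return fx + projection(x)  # projected shortcut H(x)

x = np.ones(4)
same = residual_block(x, lambda v: 2.0 * v)   # F(x) = 2x, so y = 2x + x = 3x
grown = residual_block(x, lambda v: np.zeros(8),
                       projection=lambda v: np.concatenate([v, v]))
```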
(5) Squeeze-and-excitation network structure
The invention adopts a squeeze-and-excitation network structure (SENet), which automatically learns the importance of each channel and strengthens the connections among channels, thereby improving model performance. The core of the structure consists of the two operations Squeeze and Excitation.
The Squeeze operation is a global pooling of the input features, compressing each feature map into one real number with a global receptive field. The specific algorithm is as follows:
F_sq(x_i) = (1/(H×W)) Σ_{h=1}^{H} Σ_{w=1}^{W} x_i(h, w),
where x_i represents the ith input feature map of size H×W.
The Excitation operation mainly consists of 2 fully connected layers and 2 activation functions; it captures channel correlations and generates a weight for each channel. The algorithm is as follows:
y_i = F_ex(F_sq(x_i), ω) = σ(ω_2·δ(ω_1·F_sq(x_i)));
where ω_1 represents the weights of the first fully connected layer; ω_2 represents the weights of the second fully connected layer; F_sq(x_i) is the output of the Squeeze operation; δ is the ReLU activation function; and σ is the Sigmoid function, whose specific algorithm is:
σ(x) = 1/(1 + e^(−x)),
where x represents the output after the two fully connected calculations.
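The Squeeze and Excitation steps combine into a channel-reweighting sketch. The weight matrices here are random and untrained, purely to show the data flow (an assumption of this sketch, not the patent's trained block):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squeeze_excite(x, w1, w2):
    """SENet sketch: Squeeze = global average pooling per channel,
    Excitation = two fully connected layers (ReLU then Sigmoid)
    producing one weight per channel. x has shape (C, H, W)."""
    c = x.shape[0]
    squeezed = x.reshape(c, -1).mean(axis=1)   # F_sq: one real number per channel
    hidden = np.maximum(0.0, w1 @ squeezed)    # delta = ReLU after first FC layer
    weights = sigmoid(w2 @ hidden)             # sigma = Sigmoid after second FC layer
    return x * weights[:, None, None]          # channel-wise reweighting

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w1 = rng.standard_normal((2, 4))   # reduce 4 channels to 2
w2 = rng.standard_normal((4, 2))   # expand back to 4 channel weights
y = squeeze_excite(x, w1, w2)
```

Because the Sigmoid weights lie in (0, 1), each channel is attenuated in proportion to its learned importance.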
(6) Cross entropy loss function
Softmax is typically the activation function of the final output layer used together with the cross-entropy loss; it maps the neural network's outputs into (0, 1) to represent the probabilities of the different classes. The algorithm is as follows:
y_j = e^(x_j) / Σ_{k=1}^{N_class} e^(x_k),
where N_class is the number of categories in the classification task, x_j represents the jth output of the previous layer, and y_j represents the jth predicted value of the neural network.
(7) Global average pooling layer
Global average pooling (GAP) averages each feature map; it greatly reduces the number of parameters during training, accelerates the network's computation, and is a common pooling operation in deep learning.
(8) Residual block design
In the residual block, a numerical threshold is usually set to remove redundant noise; thresholding of this kind is widely applied in denoising. The formula is as follows:
y = x − τ, x > τ;
y = 0, −τ ≤ x ≤ τ;
y = x + τ, x < −τ;
where x represents the input data and τ is the threshold: features within the interval [−τ, τ] are set to 0, and features farther from 0 are shrunk toward 0.
Building on the deep residual network, the threshold is improved with an attention mechanism so that the network automatically generates a suitable threshold from the data to suppress noise; each group of data receives its own channel weighting according to the importance of the sample. In the adaptive threshold generation process shown in fig. 4, the data undergo global pooling, then batch normalization and activation, and the output is mapped into [0, 1] with a Sigmoid function; this scaling coefficient is denoted α, and the final threshold can be expressed as α×a, so that different samples correspond to different thresholds. Adding this adaptive threshold block into the residual network turns it into a residual shrinkage module, eliminating or weakening noise.
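The soft-thresholding formula and a per-sample threshold can be sketched together. Here the attention sub-network that produces α is replaced by a Sigmoid of the mean absolute feature, purely as a stand-in (an assumption of this sketch, not the patent's trained block):

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft thresholding used in the residual shrinkage module:
    values inside [-tau, tau] become 0, values outside shrink
    toward 0 by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def adaptive_threshold(x):
    """Per-sample threshold tau = alpha * a, with a the mean absolute
    feature and alpha in (0, 1) from a Sigmoid (stand-in for the
    attention branch)."""
    a = np.mean(np.abs(x))               # average absolute feature value
    alpha = 1.0 / (1.0 + np.exp(-a))     # Sigmoid maps into (0, 1)
    return alpha * a

x = np.array([-3.0, -0.5, 0.2, 2.0])
y = soft_threshold(x, tau=1.0)           # small features zeroed, large ones shrunk
tau = adaptive_threshold(x)              # sample-specific threshold
```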
Because the fully connected layer flattens the convolution output and then classifies every feature map, its parameter count is huge, often accounting for most of the network's total, making training very slow. To solve this, global convolution is introduced into the network: the feature map of each channel is convolved directly, so that one feature map outputs one value, and the result is fed into the classifier. In a recognition task, the global convolution produces one feature map for each particular class in the final convolution layer.
Replacing the original fully connected layer with GAP greatly reduces the parameters to be computed and speeds up the network; unlike a fully connected layer, GAP requires no large set of trainable parameters, which avoids overfitting. GAP summarizes spatial information and is therefore more robust to spatial transformations of the input.
(9) Evolutionary algorithm optimized residual neural network
Building on the residual neural network, an evolutionary algorithm is used for optimization in place of the original back-propagation. In the evolutionary algorithm adopted by the invention, the inverse of the error serves as the fitness function, the criterion for measuring individual fitness in the population. The formula is as follows:
F = 1/E, E = (1/2)·Σ_j (y_j − P(x, w))²,
where E is the error function, P is the overall network output, w is the weight vector, x is the input vector, F is the fitness, j indexes the selections, and y_j is the theoretical output.
The traditional evolution algorithm often adopts a mode of 'roulette' in the working process, the probability of selecting individuals in the population is random, the optimal individuals are most likely to be lost in the selection mode, and larger errors can be generated in the actual operation process, so that the invention improves the selection operator, firstly, the individuals in the population are rearranged by using a sorting method, and the probability of selecting the individuals after the rearrangement is as follows:
p = s(1 − p0)^(b−1);

where a is the number of individuals in the population of the evolutionary algorithm, p0 is the probability that the optimal individual is selected, s is the normalized value of p0, and b is the rank of the individual after the population is rearranged.
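A sketch of this rank-based selection operator (NumPy; the normalization s = p0 / (1 − (1 − p0)^a) is an assumption borrowed from standard normalized geometric ranking selection, since the text only states that s is the normalized value of p0):

```python
import numpy as np

def rank_selection_probs(a: int, p0: float) -> np.ndarray:
    """Selection probabilities p = s * (1 - p0)^(b - 1) for ranks b = 1..a.
    The normalization s = p0 / (1 - (1 - p0)**a) is assumed so that the
    probabilities sum to exactly 1 (normalized geometric ranking)."""
    s = p0 / (1.0 - (1.0 - p0) ** a)
    ranks = np.arange(1, a + 1)          # b = 1 is the best individual
    return s * (1.0 - p0) ** (ranks - 1)

def select(population, fitness, p0=0.2, rng=None):
    """Rearrange the population by fitness (best first), then draw one
    individual according to the rank-based probabilities above."""
    rng = rng or np.random.default_rng()
    order = np.argsort(fitness)[::-1]    # sorting step: best gets rank 1
    probs = rank_selection_probs(len(population), p0)
    return population[order[rng.choice(len(population), p=probs)]]
```

Unlike roulette-wheel selection, the probability of picking the best individual is fixed at s regardless of how the raw fitness values are scaled, so the optimal individual is far less likely to be lost.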
And S40, predicting the idle computing power resources of the business behavior data through the idle computing power resource prediction model.
The present invention proposes using an extreme learning machine as the prediction model. Unlike a conventional single-hidden-layer feed-forward neural network (SLFN), the extreme learning machine assigns its input weights and hidden-layer biases randomly and does not adjust them through error back-propagation. The output weights of the network are determined directly by solving a linear model, so the training stage of the extreme learning machine completes in a single pass and is extremely fast. The network structure of the extreme learning machine comprises an input layer, a hidden layer and an output layer, where the connection between the input layer and the hidden layer is established by the input weights ω, and the connection between the hidden layer and the output layer is established by the output weights β.
Assume the given input data is a training dataset {(x_i, t_i) | i = 1, …, N} consisting of N arbitrary distinct samples, where each sample x_i = [x_i1, x_i2, …, x_in]^T ∈ R^n contains n features and each label t_i = [t_i1, t_i2, …, t_im]^T ∈ R^m covers the m output categories. The output of a standard SLFN containing L hidden neurons can be expressed as:

o_j = Σ_{i=1}^{L} β_i g(ω_i · x_j + b_i),  j = 1, …, N;

where ω_i = [ω_i1, ω_i2, …, ω_in]^T ∈ R^n (i = 1, …, L) is the input weight of the i-th hidden-layer neuron, b_i is the bias of the i-th hidden-layer neuron, β_i = [β_i1, β_i2, …, β_im]^T ∈ R^m is the output weight of the i-th hidden-layer neuron, o_j = [o_j1, o_j2, …, o_jm]^T ∈ R^m is the network output for the j-th sample, and g(·) is the activation function. In extreme learning machines, the Sigmoid function is commonly taken as the activation function:

g(x) = 1 / (1 + e^(−x)).
The loss function of the standard SLFN is E = Σ_{j=1}^{N} ‖o_j − t_j‖. If the network parameters ω, b and β are fully adjustable, an error infinitely close to zero is attainable, in which case the formula becomes:

Σ_{i=1}^{L} β_i g(ω_i · x_j + b_i) = t_j,  j = 1, …, N.
Thus, the above N equations can be combined into the matrix form Hβ = T;

wherein

H = [ g(ω_1·x_1 + b_1) … g(ω_L·x_1 + b_L)
      ⋮                        ⋮
      g(ω_1·x_N + b_1) … g(ω_L·x_N + b_L) ]  (N×L),

β = [β_1^T, …, β_L^T]^T (L×m),  T = [t_1^T, …, t_N^T]^T (N×m).
The matrix H is the hidden-layer output and T is the matrix of true class labels. The output weights β are obtained by solving the least-squares problem:

β̂ = H†T;

where H† is the Moore–Penrose (MP) generalized inverse of the hidden-layer output matrix H.
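The single-pass training procedure above can be sketched in NumPy (a minimal illustration; the function names, hidden-layer size and the toy XOR-like dataset are assumptions for demonstration, not part of the patent):

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation g(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def elm_train(X, T, L, rng=None):
    """Train an extreme learning machine in one pass: the input weights and
    hidden-layer biases are random and fixed; the output weights beta are
    the least-squares solution beta = pinv(H) @ T."""
    rng = rng or np.random.default_rng(0)
    n = X.shape[1]
    W = rng.standard_normal((n, L))   # input weights omega (never trained)
    b = rng.standard_normal(L)        # hidden-layer biases (never trained)
    H = sigmoid(X @ W + b)            # hidden-layer output matrix, N x L
    beta = np.linalg.pinv(H) @ T      # Moore-Penrose solution of H beta = T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Linear readout of the hidden layer: o = g(X W + b) beta."""
    return sigmoid(X @ W + b) @ beta

# Toy usage: fit an XOR-like mapping with L = 20 hidden neurons.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0.0], [1.0], [1.0], [0.0]])
W, b, beta = elm_train(X, T, L=20)
pred = elm_predict(X, W, b, beta)
```

With L ≥ N and random weights, H generically has full row rank, so the pseudo-inverse solution reproduces the training targets almost exactly, which is why no iterative optimization is needed.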
The method provided by the invention retains strong robustness and generalization capability under conditions such as insufficient business data and low universality across application scenarios; moreover, the deep learning model of the invention is combined with an attention mechanism, which effectively achieves adaptive refinement in the feature extraction stage.
Embodiments of the present invention provide a computer-readable storage medium storing computer-executable instructions for execution by one or more processors, e.g., to perform the method steps S10 through S40 of fig. 1 described above.
In particular, the computer-readable storage medium can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM may be available in many forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The memory components or memories of the operating environment described in the embodiments of the present invention are intended to comprise one or more of these and/or any other suitable types of memory.
In this specification, the embodiments are described progressively; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments.
Claims (9)
1. A method for computing power resource idle prediction based on deep learning, the method comprising:
collecting business behavior data, classifying the data, and labeling the classified data to obtain a training set;
Preprocessing business behavior data, wherein the preprocessing comprises data expansion and unbalance processing;
Constructing a feature extraction model for computing power resource idle prediction, inputting the training set into the model for training, and performing feature extraction on the preprocessed business behavior data using the trained model;
Predicting the computing power resource idle of the business behavior data with the feature extraction completed through the computing power resource idle prediction model;
The feature extraction model of the computational power resource idle prediction comprises a residual neural network, wherein the residual neural network comprises a convolution layer, a pooling layer, a residual block, a Dropout and a softmax classifier;
The convolution layer is used for extracting features of the object; deep feature extraction of the data is realized by setting hyper-parameters such as the number of convolution layers and the convolution window size;
in the feature extraction model, performing global pooling processing on the input features by utilizing the Squeeze operation;
the algorithm is shown as follows:

z_i = F_sq(u_i) = (1/(H×W)) Σ_{a=1}^{H} Σ_{b=1}^{W} u_i(a, b);

wherein, in the formula: F_sq(·) represents the Squeeze operation function, and u_i represents the i-th feature of the input, of size H×W;
In the feature extraction model, the Excitation operation is used to capture the correlations between channels and to generate the weight of each corresponding channel;
the algorithm is shown as follows:

s = F_ex(z, W) = σ(W_2 δ(W_1 z));

wherein: s is the output corresponding to the input z; F_sq(·) represents the Squeeze operation function; F_ex(·) represents the Excitation operation function; W_1 represents the first full-connection-layer calculation; W_2 represents the second full-connection-layer calculation; δ represents the activation function ReLU; and σ is the Sigmoid function, whose specific algorithm is as follows:

σ(x) = 1 / (1 + e^(−x));

wherein: x represents the output value after the two full-connection calculations.
2. The deep learning based computational power resource idleness prediction method according to claim 1, wherein the preprocessing further comprises redundant sample removal of business behavior data.
3. The deep learning-based computational power resource idle prediction method of claim 1, wherein the unbalance processing adopts a random oversampling algorithm: minority-class samples are drawn at random, and the sampled copies are combined with the initial minority-class samples so that the number of minority-class samples equals the number of majority-class samples.
4. The deep learning-based computational power resource idle prediction method of claim 3, wherein the data set processed by the random oversampling algorithm is as follows:

S′ = S ∪ E;

wherein E is the set of samples obtained by randomly sampling the minority-class samples in the data set, and S′ represents the data set obtained by performing data-unbalance processing on the initial data set S.
5. The deep learning-based computational power resource idle prediction method of claim 1, wherein the computational power resource idle prediction model is an extreme learning machine.
6. The deep learning based computational power resource idle prediction method of claim 5, wherein the network structure of the extreme learning machine comprises an input layer, an implicit layer, and an output layer.
7. The deep learning based computing power resource idleness prediction method of claim 5, wherein the method further comprises:
optimizing the residual neural network through an evolutionary algorithm, taking the inverse of the error as the fitness function, the standard for measuring individual fitness in the population, according to the formulas:

E = Σ_j (y_j − P(w, x))²;

F = 1/E;

wherein E is the error function, P is the overall output, w is the weight vector, x is the input vector, F is the fitness, j indexes the selections, and y_j is the theoretical output.
8. The deep learning-based computational power resource idle prediction method of claim 5, wherein the residual neural network uses an attention mechanism to improve thresholding, so that the network automatically generates a corresponding threshold from the data itself.
9. A storage medium being a computer-readable storage medium storing computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform the deep learning based power resource idleness prediction method of any of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310603689.6A CN116340006B (en) | 2023-05-26 | 2023-05-26 | Computing power resource idle prediction method based on deep learning and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116340006A CN116340006A (en) | 2023-06-27 |
CN116340006B true CN116340006B (en) | 2024-05-17 |
Family
ID=86884383
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310603689.6A Active CN116340006B (en) | 2023-05-26 | 2023-05-26 | Computing power resource idle prediction method based on deep learning and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116340006B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116805926B (en) * | 2023-08-21 | 2023-11-17 | 上海飞旗网络技术股份有限公司 | Network service type identification model training method and network service type identification method |
CN116820784B (en) * | 2023-08-30 | 2023-11-07 | 杭州谐云科技有限公司 | GPU real-time scheduling method and system for reasoning task QoS |
CN117252488B (en) * | 2023-11-16 | 2024-02-09 | 国网吉林省电力有限公司经济技术研究院 | Industrial cluster energy efficiency optimization method and system based on big data |
CN117971511B (en) * | 2024-04-02 | 2024-07-12 | 青岛欧亚丰科技发展有限公司 | Collaborative visual simulation platform |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111740971A (en) * | 2020-06-15 | 2020-10-02 | 郑州大学 | Network intrusion detection model SGM-CNN based on class imbalance processing |
CN113542241A (en) * | 2021-06-30 | 2021-10-22 | 杭州电子科技大学 | Intrusion detection method and device based on CNN-BiGRU mixed model |
CN115987552A (en) * | 2022-11-18 | 2023-04-18 | 八维通科技有限公司 | Network intrusion detection method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||