CN112416588B

CN112416588B - Resource allocation method based on random forest algorithm

Info

Publication number: CN112416588B
Application number: CN202011313348.8A
Authority: CN
Inventors: 邹适宇; 李复名; 谢爱平; 周涛
Original assignee: CETC 29 Research Institute
Current assignee: CETC 29 Research Institute
Priority date: 2020-11-20
Filing date: 2020-11-20
Publication date: 2022-06-07
Anticipated expiration: 2040-11-20
Also published as: CN112416588A

Abstract

The invention discloses a resource allocation method based on a random forest algorithm, comprising the following steps: S1: a resource allocation mathematical model construction step, in which a resource allocation mathematical model is defined for the target objects; S2: a cost function design step; S3: a random forest training data set construction step, in which the training data set of the random forest is constructed using a classification approach based on the resource allocation mathematical model and the cost function; S4: a random forest generation step; S5: a target object prediction step based on historical data; S6: a resource allocation step based on the prediction result. The method satisfies the task-completion constraints and has a high probability of finding the optimal solution, thereby addressing the tendency of intelligent optimization algorithms such as genetic algorithms to fall into local optima.

Description

Resource allocation method based on random forest algorithm

Technical Field

The invention belongs to the technical field of resource allocation, and particularly relates to a random forest algorithm for resource allocation.

Background

With the rapid development of modern society and of science and technology, the problem of matching resources to individuals arises in many practical scenarios. The resource allocation problem is the problem of mapping and dividing resources of size N among P individuals according to some strategy for a given target, so as to achieve system goals such as improving efficiency, saving cost, allocating resources reasonably, or optimizing overall benefit. For relatively simple, small-scale resource allocation problems, deterministic algorithms such as exhaustive search, branch and bound, and dynamic programming can find the optimum quickly. However, once the problem size or complexity grows beyond a certain point, the search efficiency of these algorithms drops sharply and they soon become impractical. In particular, the resource allocation problem is NP-hard, so the applicability of deterministic algorithms is very limited: for instances that cannot be optimized within polynomial time, their failure is inevitable.

With the continuous improvement of computer capacity and computing speed, the emergence of large-scale parallel processing technology, and the gradual maturing of parallel and distributed theory, intelligent optimization algorithms that imitate characteristics of natural biological and physical processes and of human behavior, such as genetic algorithms, simulated annealing, particle swarm optimization and fireworks algorithms, have attracted wide attention from researchers working on combinatorial optimization problems, including resource allocation. These algorithms overcome many of the shortcomings of traditional deterministic algorithms and offer new ideas and means for solving resource allocation problems. However, they still have the following disadvantages: (1) their generality and robustness need improvement; (2) their scalability is insufficient, and their performance degrades rapidly as the problem size grows; (3) they are prone to falling into local optima. As a result, existing intelligent optimization algorithms cannot adequately meet the solving requirements of some large-scale resource allocation problems.

Disclosure of Invention

The invention aims to address the tendency of intelligent optimization algorithms such as genetic algorithms to fall into local optima, and provides a random-forest-based resource allocation algorithm that satisfies the task-completion constraints and has a high probability of finding the optimal solution.

The purpose of the invention is realized by the following technical scheme:

A resource allocation method based on a random forest algorithm comprises the following steps: S1: a resource allocation mathematical model construction step, in which a resource allocation mathematical model is defined for the target objects; S2: a cost function design step; S3: a random forest training data set construction step, in which the training data set of the random forest is constructed using a classification approach based on the resource allocation mathematical model and the cost function; S4: a random forest generation step; S5: a target object prediction step based on historical data; S6: a resource allocation step based on the prediction result.

According to a preferred embodiment, the step S1 specifically includes: the target object set T contains M types of target objects, denoted T_i = {T₁, T₂, ..., T_M}, i ∈ {1, 2, …, M}, where each type of target object contains a different combination of individuals. There are N types of resources, denoted R_i = {R₁, R₂, ..., R_N}, i ∈ {1, 2, ..., N}, and the N types of resources are combined to obtain a resource attribute combination table.

According to a preferred embodiment, the cost function designing step of step S2 includes: different cost functions are designed for different resource allocation problems.

According to a preferred embodiment, the step S3 specifically includes: randomly sampling the target object set; obtaining, according to the resource attributes, the n types of resources that can be allocated to the m individuals in the sampled target object; combining the n types of resources to obtain the different allocation schemes that satisfy the constraint conditions; computing the cost value of each allocation scheme and selecting the scheme with the lowest cost function value; looking up the resource attribute combination table and taking the index of that scheme as the label of the target object; repeating this process until an initial training set of the target size has been generated; and performing feature extraction on the initial training set to obtain the training data set.

According to a preferred embodiment, said step S4 includes: performing a rounds of random sampling with replacement on the training set, drawing b samples in each round, to obtain a sampling sets of b samples each; and training one decision tree on each sampling set to obtain a decision tree models, all of which together form the random forest.

According to a preferred embodiment, in step S4, when training a node of each decision tree model, a subset of the sample features is randomly selected from all the sample features at the node, and the optimal feature within that subset is used to split the node into the left and right subtrees of the decision tree.

According to a preferred embodiment, said step S5 includes: mining and analyzing the historical data, and predicting from the time sequence information the target objects that need resource allocation in each time period.

According to a preferred embodiment, the step S5 specifically includes: analyzing and mining the historical data of the target objects to obtain their attribute information, classifying the target objects according to the time sequence information, predicting each class with the random forest method to obtain prediction results for the different time periods, and taking each result as a test set for allocation, that is, performing resource allocation on the predicted target object set.

According to a preferred embodiment, the step S6 specifically includes: inputting the target object set, having each decision tree output a prediction label, having the random forest output the final prediction label by voting, and obtaining the final resource allocation scheme from the label index.

The main scheme and the further selection schemes can be freely combined to form a plurality of schemes which are all adopted and claimed by the invention; in the invention, the selection (each non-conflict selection) and other selections can be freely combined. The skilled person in the art can understand various combinations according to the prior art and the common general knowledge after understanding the solution of the present invention, and the combinations are all the technical solutions to be protected by the present invention, and are not exhaustive here.

Compared with the traditional intelligent bionic algorithm, the resource allocation algorithm based on the random forest has the following beneficial effects:

(1) the algorithm can effectively solve the problems that the traditional bionic algorithm is easy to fall into local optimization and the like, and can calculate to obtain a global optimal resource allocation scheme on the basis of meeting resource allocation constraint conditions.

(2) The algorithm can meet the requirement of resource allocation on timeliness, model training is realized through the existing data by a method for generating random forests, a decision tree model is formed, and finally, the resource allocation scheme is rapidly acquired.

(3) The algorithm is based on analysis and mining of historical data of a target object, the behavior characteristics of the target object are analyzed, a target portrait is completed, then a resource allocation problem is converted into a classification problem by utilizing a classification idea, and a random forest algorithm is adopted for optimal solution.

Drawings

FIG. 1 is a schematic diagram of the resource allocation method of the present invention;

FIG. 2 is a graph of random forest learning for the resource allocation method of the present invention.

Detailed Description

The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.

It should be noted that, in order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are clearly and completely described below, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments.

Thus, the following detailed description of the embodiments of the present invention is not intended to limit the scope of the invention as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.

In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. indicate orientations and positional relationships that are conventionally used in the products of the present invention, and are used merely for convenience in describing the present invention and for simplicity in description, but do not indicate or imply that the devices or elements referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and therefore, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.

Furthermore, the terms "horizontal", "vertical", "overhang" and the like do not imply that the components are required to be absolutely horizontal or overhang, but may be slightly inclined. For example, "horizontal" merely means that the direction is more horizontal than "vertical" and does not mean that the structure must be perfectly horizontal, but may be slightly inclined.

In the description of the present invention, it should also be noted that, unless otherwise explicitly specified or limited, the terms "disposed," "mounted," "connected," and "connected" are to be construed broadly and may, for example, be fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.

In addition, it should be noted that, in the present invention, if the specific structures, connection relationships, position relationships, power source relationships, and the like are not written in particular, the structures, connection relationships, position relationships, power source relationships, and the like related to the present invention can be known by those skilled in the art without creative work on the basis of the prior art.

Referring to fig. 1, the invention discloses a resource allocation method based on a random forest algorithm, the resource allocation method comprising the following steps:

Step S1: a resource allocation mathematical model construction step, in which a resource allocation mathematical model is defined for the target objects.

Preferably, the step S1 specifically includes: the target object set T contains M types of target objects, denoted T_i = {T₁, T₂, ..., T_M}, i ∈ {1, 2, …, M}, where each type of target object contains a different combination of individuals. There are N types of resources, denoted R_i = {R₁, R₂, ..., R_N}, i ∈ {1, 2, ..., N}, and the N types of resources are combined to obtain a resource attribute combination table. That is, there are N types of resources that can be allocated to the M types of target objects.

Step S2: a cost function design step. The cost function represents the overall cost that an allocation scheme imposes on the system; the lower the cost, the better the allocation scheme. Different cost functions need to be designed for different resource allocation problems.

Step S3: a random forest training data set construction step, in which the training data set of the random forest is constructed using a classification approach based on the resource allocation mathematical model and the cost function.

Preferably, the step S3 specifically includes: randomly sampling the target object set; obtaining, according to the resource attributes, the n types of resources that can be allocated to the m individuals in the sampled target object; combining the n types of resources to obtain the different allocation schemes that satisfy the constraint conditions; computing the cost value of each allocation scheme and selecting the scheme with the lowest cost function value; looking up the resource attribute combination table and taking the index of that scheme as the label of the target object; repeating this process until an initial training set of the target size has been generated; and performing feature extraction on the initial training set to obtain the training data set.

Step S4: a random forest generation step. The step S4 includes: performing a rounds of random sampling with replacement on the training set, drawing b samples in each round, to obtain a sampling sets of b samples each; and training one decision tree on each sampling set to obtain a decision tree models, all of which together form the random forest.
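The sampling part of step S4 can be illustrated with a minimal sketch. The function name `bootstrap_sets`, the fixed seed and the toy training set are assumptions made for illustration; the patent only specifies a rounds of sampling with replacement, b samples per round.

```python
import random

def bootstrap_sets(training_set, a, b, seed=0):
    """Draw `a` sampling sets, each holding `b` items sampled with replacement."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    return [[rng.choice(training_set) for _ in range(b)] for _ in range(a)]

# Hypothetical toy training set of (features, label) pairs.
toy_training_set = [([1.0, 0.2], 0), ([0.4, 0.9], 1), ([0.8, 0.1], 0), ([0.3, 0.7], 1)]

# a = 5 sampling sets of b = 4 samples each; one decision tree would be
# trained on each sampling set.
sampling_sets = bootstrap_sets(toy_training_set, a=5, b=4)
```

Because the sampling is done with replacement, a sampling set may contain the same training sample more than once and may omit others, which is what makes the individual trees differ.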

Further, in step S4, when training a node of each decision tree model, a subset of the sample features is randomly selected from all the sample features at the node, and the optimal feature within that subset is used to split the node into the left and right subtrees of the decision tree.
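A node split of this kind might look like the sketch below. The patent does not say how the "optimal feature" is scored; weighted Gini impurity is assumed here, and the names `gini` and `best_split` as well as the toy samples are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(samples, n_candidate_features, rng):
    """Pick the best (feature_index, threshold) among a random feature subset.

    samples: list of (features, label) pairs. Returns None if no split exists.
    """
    n_features = len(samples[0][0])
    # Randomly select part of the features at this node.
    candidates = rng.sample(range(n_features), n_candidate_features)
    best, best_score = None, float("inf")
    for f in candidates:
        for threshold in sorted({x[f] for x, _ in samples}):
            left = [y for x, y in samples if x[f] <= threshold]
            right = [y for x, y in samples if x[f] > threshold]
            if not left or not right:
                continue  # a split must send samples to both subtrees
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
            if score < best_score:
                best, best_score = (f, threshold), score
    return best

# Hypothetical samples: feature 0 separates the two labels cleanly.
toy_samples = [([1.0, 5.0], 0), ([2.0, 3.0], 0), ([8.0, 4.0], 1), ([9.0, 6.0], 1)]
split = best_split(toy_samples, n_candidate_features=2, rng=random.Random(0))
```

Restricting each node to a random feature subset reduces the correlation between trees, which is the usual motivation for this step in random forests.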

Step S5: a target object prediction step based on historical data. The step S5 includes: mining and analyzing the historical data, and predicting from the time sequence information the target objects that need resource allocation in each time period.

Preferably, the step S5 specifically includes: analyzing and mining the historical data of the target objects to obtain their attribute information, classifying the target objects according to the time sequence information, predicting each class with the random forest method to obtain prediction results for the different time periods, and taking each result as a test set for allocation, that is, performing resource allocation on the predicted target object set.

Step S6: a resource allocation step based on the prediction result. The step S6 specifically includes: inputting the target object set, having each decision tree output a prediction label, having the random forest output the final prediction label by voting, and obtaining the final resource allocation scheme from the label index.
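The voting and lookup in step S6 can be sketched as follows. The per-tree predictions and the contents of `combination_table` are made-up illustration values, and plain majority voting is assumed since the patent only says "voting".

```python
from collections import Counter

def forest_vote(tree_predictions):
    """Return the label predicted by the most trees (majority vote)."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical resource attribute combination table:
# label index -> combination of resources to allocate.
combination_table = {0: ("R1",), 1: ("R2",), 2: ("R1", "R2")}

# Hypothetical labels output by five decision trees for one target object.
tree_predictions = [2, 0, 2, 1, 2]

label = forest_vote(tree_predictions)   # label chosen by the forest
allocation = combination_table[label]   # final resource allocation scheme
```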

Compared with the traditional intelligent bionic algorithm, the resource allocation algorithm based on the random forest has the following beneficial effects: (1) the algorithm can effectively solve the problems that the traditional bionic algorithm is easy to fall into local optimization and the like, and can calculate to obtain a global optimal resource allocation scheme on the basis of meeting resource allocation constraint conditions. (2) The algorithm can meet the requirement of resource allocation on timeliness, model training is realized through the existing data by a method for generating random forests, a decision tree model is formed, and finally, the resource allocation scheme is rapidly acquired. (3) The algorithm is based on analysis and mining of historical data of a target object, the behavior characteristics of the target object are analyzed, a target portrait is completed, then a resource allocation problem is converted into a classification problem by utilizing a classification idea, and a random forest algorithm is adopted for optimal solution.

Examples

The invention is further explained by taking the problem of resource allocation of the anti-unmanned aerial vehicle system as an example. The method comprises the following specific steps:

1) Define the resource allocation model. The target object set T contains M types of unmanned aerial vehicle combinations whose flight frequency bands differ, denoted T_i = {T₁, T₂, ..., T_M}, i ∈ {1, 2, …, M}, where each target object contains different unmanned aerial vehicles and is a combination of at most t types of unmanned aerial vehicles. There are N types of anti-unmanned aerial vehicle systems, denoted R_i = {R₁, R₂, ..., R_N}, i ∈ {1, 2, ..., N}; that is, N types of anti-unmanned aerial vehicle systems can be allocated to the M types of unmanned aerial vehicle combinations.

2) Combine the resource attributes. The N types of anti-unmanned aerial vehicle systems are arranged and combined, with no more than t types of anti-unmanned aerial vehicle systems in each combination, to obtain the resource attribute combination table.
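One way to build such a table is sketched below. It assumes unordered combinations (order within a combination is ignored) and uses placeholder resource names; `build_combination_table` is a hypothetical helper, not a name from the patent.

```python
from itertools import combinations

def build_combination_table(resources, t):
    """List every combination of at most t resource types.

    The position of a combination in the returned list serves as its index,
    which later acts as the class label.
    """
    table = []
    for size in range(1, t + 1):
        table.extend(combinations(resources, size))
    return table

# Hypothetical example: N = 4 anti-UAV system types, at most t = 2 per combination.
table = build_combination_table(["R1", "R2", "R3", "R4"], t=2)
```

The table size grows as the sum of C(N, k) for k = 1..t, so with the 30 resource types used in the evaluation the choice of t strongly affects the number of class labels.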

3) Design a cost function for the anti-unmanned aerial vehicle system resource allocation problem. The cost function is designed for the anti-unmanned aerial vehicle system resource allocation problem as follows:

where N is the number of anti-unmanned aerial vehicle systems to be allocated, i is the type index of the anti-unmanned aerial vehicle system, p_i and w_i are the price and the weight of the i-th type of anti-unmanned aerial vehicle system, d_i is the straight-line distance between the anti-unmanned aerial vehicle system and the target object, a₁ is the weight of the price, a₂ is the weight of the path cost, and a₁ + a₂ = 1.
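The formula itself is not reproduced in this text, so the sketch below is only one plausible form built from the listed symbols, not the patent's actual cost function. It assumes that the path cost of a system is its weight times its distance (w_i · d_i) and that the total cost is the weighted sum a₁·p_i + a₂·w_i·d_i over the allocated systems; both are assumptions.

```python
def allocation_cost(systems, a1=0.5):
    """Assumed cost: sum over systems of a1 * price + a2 * weight * distance.

    systems: list of (price, weight, distance) tuples, one per allocated system.
    a1 is the price weight; a2 = 1 - a1 is the path-cost weight (a1 + a2 = 1).
    """
    a2 = 1.0 - a1
    return sum(a1 * p + a2 * w * d for p, w, d in systems)

# Hypothetical scheme with two allocated systems.
cost = allocation_cost([(10.0, 2.0, 3.0), (4.0, 1.0, 5.0)], a1=0.5)
```

Whatever the exact form, the role of the function is the same: it assigns each feasible allocation scheme a scalar so that the scheme with the lowest value can be chosen as the label.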

4) Randomly sample the target object set and record the sampling result as T_i = [t₁, t₂, ..., t_m], meaning that the target object contains m types of unmanned aerial vehicles. The n types of anti-unmanned aerial vehicle systems that can be allocated to the m types of unmanned aerial vehicles are combined, subject to the constraint that the anti-unmanned aerial vehicle systems in an allocation scheme must provide full frequency coverage of all the unmanned aerial vehicles. The allocation schemes that satisfy the constraint are denoted W_i = {w₁, w₂, ..., w_k}, i ∈ {1, 2, ..., k}.

5) Compute the cost value of each allocation scheme obtained in step 4), select the allocation scheme with the lowest cost function value, look up the resource attribute combination table, and take the index of that allocation scheme as the label of the target object.
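Steps 4) and 5) can be sketched together as a small brute-force labelling routine. The system names, covered frequency bands and per-system costs below are invented for illustration, and the cost of a scheme is simplified to the sum of per-system costs rather than the patent's cost function.

```python
from itertools import combinations

# Hypothetical anti-UAV systems: name -> (covered frequency bands in MHz, cost).
SYSTEMS = {
    "R1": ({840, 845}, 5.0),
    "R2": ({1430}, 3.0),
    "R3": ({840, 845, 1430}, 9.0),
}

def label_target(target_bands, systems, t):
    """Return (label, table): the index of the cheapest feasible combination."""
    names = sorted(systems)
    # Resource attribute combination table: combinations of at most t systems.
    table = [c for size in range(1, t + 1) for c in combinations(names, size)]
    best_index, best_cost = None, float("inf")
    for index, combo in enumerate(table):
        covered = set().union(*(systems[n][0] for n in combo))
        if not target_bands <= covered:
            continue  # constraint: every band used by the target must be covered
        cost = sum(systems[n][1] for n in combo)
        if cost < best_cost:
            best_index, best_cost = index, cost
    return best_index, table

# A sampled target object whose unmanned aerial vehicles use 840 MHz and 1430 MHz.
label, table = label_target({840, 1430}, SYSTEMS, t=2)
```

Here the pair ("R1", "R2") covers both bands at cost 8.0 and beats the single system "R3" at cost 9.0, so its table index becomes the label. This exhaustive search is only run offline to generate training labels; at allocation time the trained forest predicts the index directly.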

6) Generate the random forest training data set. Steps 4) and 5) are repeated to obtain the initial training data set. The maximum, minimum, mean and variance are selected as the feature values of a sample, and feature extraction is performed on the data set to obtain {F₁, F₂, ..., F_n} as the training data set, where F_n = {[max_n, min_n, mean_n, var_n]} holds the maximum, minimum, mean and variance of the n-th sample.
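The feature extraction in step 6) can be sketched as below. It assumes that a sample is a list of numeric values such as frequency bands in MHz, and it uses the population variance; the patent does not say whether population or sample variance is meant.

```python
from statistics import mean, pvariance

def extract_features(sample):
    """Map a raw sample to the feature vector [max, min, mean, variance]."""
    return [max(sample), min(sample), mean(sample), pvariance(sample)]

# Hypothetical sample: frequency bands (MHz) used by one target object.
features = extract_features([840, 845, 1430, 2400])
```

Reducing each sample to four summary statistics gives every training example the same length regardless of how many unmanned aerial vehicles the target object contains.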

7) Generate the random forest. Perform a rounds of random sampling with replacement on the training set, drawing b samples in each round, to obtain a sampling sets of b samples each. Train one decision tree on each sampling set to obtain a decision tree models, all of which together form the random forest. When training a node of each decision tree model, a subset of the sample features is randomly selected from all the sample features at the node, and the optimal feature within that subset is used to split the node into the left and right subtrees of the decision tree. The resulting learning curve of the random forest model is shown in fig. 2. For each input test sample, every decision tree outputs a prediction label and the random forest outputs the final prediction label by voting. To evaluate the performance of the random forest algorithm, 30 types of anti-unmanned aerial vehicle system resources and 10 types of unmanned aerial vehicle flight frequency bands are set; fig. 2 shows the relationship between the accuracy of the allocation schemes generated by the algorithm and the number of training samples.

8) Target object prediction based on historical data. The historical data of the unmanned aerial vehicle flight is analyzed and mined to obtain data such as flight time, coordinates and use frequency band, and the form of the data is shown in table 1.

Time	Location	Uplink remote control link	Downlink remote control link	Information transmission	Other
1:00	[200,200]	840 MHz	845 MHz	1430 MHz	2400 MHz

TABLE 1

The flight frequency bands of the unmanned aerial vehicles are classified by function (uplink and downlink remote control links, information transmission, and so on). Based on the time sequence information of the daily flight data, a random forest algorithm is used to predict, for each function, the frequency bands used by unmanned aerial vehicles flying in the key area in each time period. The prediction result is recorded as O_i = {o₁, o₂, ..., o_m}, i ∈ {1, 2, ..., m}, that is, the frequency bands commonly used by unmanned aerial vehicles in the key area in each time period, and it serves as the test set for the resource allocation scheme.

9) Test the random forest model and evaluate its accuracy. For each input test sample, every decision tree outputs a prediction label and the random forest outputs the final prediction label by voting. To evaluate the performance of the random forest algorithm, 30 types of anti-unmanned aerial vehicle system resources and 10 types of unmanned aerial vehicle flight frequency bands are set. The KNN and decision tree classification algorithms are also used to solve the mathematical model, and their performance is compared with that of the random forest algorithm. Tables 2 and 3 show the comparison for 1000 and 200 training samples, respectively. The random forest algorithm achieves the highest accuracy in both cases, and its advantage is more pronounced on the smaller data set, although its running time (about 0.18 s) is longer than that of KNN and the decision tree (a few milliseconds).

Name of algorithm	KNN	Decision tree	Random forest
Accuracy (%)	93.57	97.62	98.65
Algorithm duration (s)	0.002	0.004	0.178

TABLE 2

Name of algorithm	KNN	Decision tree	Random forest
Accuracy (%)	71.39	83.61	87.44
Algorithm duration (s)	0.001	0.002	0.176

TABLE 3

The foregoing basic embodiments of the invention and their various further alternatives can be freely combined to form multiple embodiments, all of which are contemplated and claimed herein. In the scheme of the invention, each selection example can be combined with any other basic example and selection example at will. Numerous combinations will be known to those skilled in the art.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims

1. A resource allocation method based on a random forest algorithm, characterized in that the resource allocation method comprises the following steps:

S1: a resource allocation mathematical model construction step, in which a resource allocation mathematical model is defined for the target objects;

the step S1 specifically includes: the target object set T contains M types of target objects, denoted T_i = {T₁, T₂, ..., T_M}, i ∈ {1, 2, …, M}, wherein each type of target object comprises a different combination of individuals; there are N types of resources, denoted R_i = {R₁, R₂, ..., R_N}, i ∈ {1, 2, ..., N}; and the N types of resources are combined to obtain a resource attribute combination table;

S2: a cost function design step;

the cost function design step of step S2 includes: designing different cost functions for different resource allocation problems;

S3: a random forest training data set construction step, in which the training data set of the random forest is constructed using a classification approach based on the resource allocation mathematical model and the cost function;

the step S3 specifically includes: randomly sampling the target object set; obtaining, according to the resource attributes, the n types of resources that can be allocated to the m individuals in the sampled target object; and combining the n types of resources to obtain the different allocation schemes that satisfy the constraint conditions;

then computing the cost value of each allocation scheme and selecting the allocation scheme with the lowest cost function value, looking up the resource attribute combination table, taking the index of that allocation scheme as the label of the target object, repeating this process until an initial training set of the target size has been generated, and performing feature extraction on the initial training set to obtain the training data set;

S4: a random forest generation step;

the step S4 includes:

performing a rounds of random sampling with replacement on the training set, drawing b samples in each round, to obtain a sampling sets of b samples each;

training one decision tree on each sampling set to obtain a decision tree models, wherein all the decision trees together form the random forest;

S5: a target object prediction step based on historical data;

the step S5 includes: mining and analyzing the historical data, and predicting from the time sequence information the target objects that need resource allocation in each time period;

S6: a resource allocation step based on the prediction result;

the step S6 specifically includes:

inputting the target object set, having each decision tree output a prediction label, having the random forest output the final prediction label by voting, and obtaining the final resource allocation scheme from the label index.

2. The resource allocation method based on the random forest algorithm as claimed in claim 1, wherein in step S4, when training a node of each decision tree model, a subset of the sample features is randomly selected from all the sample features at the node, and the optimal feature within that subset is used to split the node into the left and right subtrees of the decision tree.

3. The resource allocation method based on the random forest algorithm as claimed in claim 1, wherein the step S5 specifically comprises:

analyzing and mining the historical data of the target object to obtain attribute information of the target object, classifying the target object according to time sequence information, predicting the target object by adopting a random forest method respectively to obtain prediction results of different time periods, and taking each result as an allocated test set, namely performing resource allocation on the target object set obtained by prediction.