CN111800828A - Mobile edge computing resource allocation method for ultra-dense network - Google Patents
- Publication number: CN111800828A
- Application number: CN202010597779.5A
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H04W28/16 — Central resource management; negotiation of resources or communication parameters, e.g. negotiating bandwidth or QoS [Quality of Service] (H04W — Wireless communication networks; H04W28/00 — Network traffic management; network resource management)
- H04W72/20 — Control channels or signalling for resource management (H04W72/00 — Local resource management)
- Y02D30/70 — Reducing energy consumption in wireless communication networks (Y02D — Climate change mitigation technologies in information and communication technologies, i.e. ICT aiming at the reduction of their own energy use; Y02D30/00 — Reducing energy consumption in communication networks)
Abstract
The invention discloses a mobile edge computing resource allocation method for an ultra-dense network. The NOMA-MEC communication system in the ultra-dense network comprises M = {1,2,…,M} small base stations, each equipped with an MEC server to execute the computing tasks offloaded by users. Assuming the set of users served by each small base station is N = {1,2,…,N}, the N users are divided into Y = {1,2,…,Y} groups with K = {1,2,…,K} users in each group. The method solves the problem that the prior art struggles to handle mutual interference among users, which degrades their computing performance.
Description
[ technical field ]
The invention belongs to the technical field of wireless communication, and particularly relates to a mobile edge computing resource allocation method for an ultra-dense network.
[ background of the invention ]
With the rapid development of fifth-generation (5G) mobile communication technology, the deployment of ultra-dense networks (UDNs) has become a main architecture for future development. UDNs can effectively improve system capacity and data transmission rate to guarantee users' quality of service. However, handling computation-intensive tasks in UDNs is a huge challenge due to users' limited computing power. As an emerging technology, mobile edge computing (MEC) has been proposed to relieve the computational pressure on users in UDNs. Specifically, MEC offloads computation-intensive tasks to the edge of the network to reduce users' energy consumption and task latency.
In MEC systems, how to improve spectrum resource utilization among users is a significant challenge, as it directly affects energy consumption and task delay. As an emerging multiple access method, non-orthogonal multiple access (NOMA) can effectively improve a system's spectral efficiency by allocating the same resource to multiple users. Thus, in some work, NOMA has been applied to MEC systems to reduce energy consumption and task delay.
The mean field game (MFG) is a tool suited to scenarios with a large number of interacting players and can model the relationship between individuals and the population in UDNs. In particular, in UDNs, the MFG averages the effects among members, simplifying complex models.
The authors in document 1, "Learning deep mean field games for modeling large population behavior" [in International Conference on Learning Representations, Vancouver, Canada, Apr. 2018], derived an equilibrium solution of the mean field game using a Markov decision process (MDP) to predict the evolution of the population distribution over time.
Unlike the existing literature, the present invention models the NOMA-MEC system in UDN scenarios, where each small base station (SBS) is equipped with an MEC server. When a user is unable to handle a large number of computing tasks, some of the tasks are offloaded to the MEC server. First, a user clustering matching algorithm (UCMA) based on channel gain differences is proposed to cluster users, improving their data rate. Then, the NOMA-MEC system is modeled within an MFG theoretical framework, and the equilibrium solution of the MFG is obtained using the deep deterministic policy gradient (DDPG) algorithm from reinforcement learning, so as to reduce users' energy consumption and task delay.
[ summary of the invention ]
The invention aims to provide a mobile edge computing resource allocation method for an ultra-dense network, to solve the problem that the prior art struggles to handle mutual interference among users, which degrades their computing performance.
The technical scheme adopted by the invention is a mobile edge computing resource allocation method for an ultra-dense network. The NOMA-MEC communication system in the ultra-dense network comprises M = {1,2,…,M} small base stations, each equipped with an MEC server to execute the computing tasks offloaded by users; assuming the set of users served by each small base station is N = {1,2,…,N}, the N users are divided into Y = {1,2,…,Y} groups with K = {1,2,…,K} users in each group;
the resource allocation method is implemented according to the following steps:
step one, an uplink NOMA-MEC communication system is constructed, and each SBS is provided with an MEC server to serve a plurality of users;
step two, clustering all users in the NOMA-MEC communication system according to their channel gain differences; users within a cluster adopt the NOMA transmission mode, while TDMA is adopted between clusters;
step three, calculating the calculation cost of the user, namely the time delay and the energy consumption when the user processes the task; wherein the computational cost comprises a local computational cost and an off-load computational cost of the user;
modeling the NOMA-MEC communication system into an MFG framework; the SINR and the channel gain of a user are expressed as a state space, and the transmitting power, the unloading decision factor and the resource allocation factor of the user are expressed as an action space; constructing a reward function of the user according to the calculation cost of the user;
and step five, obtaining the equilibrium solution of the mean field game, i.e., the optimal resource allocation scheme in the mobile edge computing system, using a DDPG-based reinforcement learning method.
Further, the specific method of the second step is as follows:
in the NOMA-MEC communication system model established in step one, all users served by each SBS are sorted by channel gain, and the users with the M largest channel gains are sequentially selected as the first users of the M NOMA clusters;
the remaining users are then assigned by a greedy matching method: the user that maximizes the sum of channel gain differences within the NOMA cluster is selected;
when the number of users cannot be uniformly allocated to each cluster, the redundant users are randomly allocated to different clusters, and the channel gain of each user in the clusters is different.
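The three steps above can be sketched as follows — a minimal illustration assuming a fixed per-cluster capacity and greedy gain-difference matching; the function and variable names are ours, not the patent's, and redundant users here fill whichever cluster has room rather than being placed strictly at random:

```python
import math

def cluster_users(gains, num_clusters):
    """Greedy channel-gain-difference clustering (a sketch of the UCMA idea).

    gains: channel gain of each user; num_clusters: number of NOMA clusters.
    The strongest users seed the clusters; each remaining user joins the
    cluster (with room left) that maximizes the sum of channel-gain differences.
    """
    n = len(gains)
    cap = math.ceil(n / num_clusters)  # uniform allocation; leftovers fill remaining room
    order = sorted(range(n), key=lambda u: gains[u], reverse=True)
    clusters = [[u] for u in order[:num_clusters]]  # top-gain users head the clusters
    for u in order[num_clusters:]:
        open_clusters = [c for c in range(num_clusters) if len(clusters[c]) < cap]
        best = max(open_clusters,
                   key=lambda c: sum(abs(gains[u] - gains[v]) for v in clusters[c]))
        clusters[best].append(u)
    return clusters
```

Pairing users with dissimilar gains in this way is what makes successive interference cancellation effective within each NOMA cluster.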
Further, the specific mode of the third step is as follows:
3.1) local computing cost of the user:
let x bemkRepresenting the offload variable for the kth user in the mth group, for a local computation model, i.e., a user can complete a computation task locally without offloading the computation task to the MEC server, assume fmlk> 0 denotes the local computing capacity of the kth user in the mth group, then when the user performs the task locally, its time is:
when calculating the energy consumption of local calculation, a common model for calculating the energy consumption is used, i.e., ═ κ f2. Where κ is an energy coefficient depending on the chip structure, and the local energy consumption of the kth user in the mth group can be expressed as:
according to equations (5) and (6), the local computation cost of the kth user in the mth group can be expressed as:
wherein the content of the first and second substances,andweight coefficients representing delay and energy consumption, respectively, and
3.2) offload computation cost for the user:
the process of offloading to the MEC server comprises two parts, transmission and computation at the MEC server; the transmission time and execution time are respectively:
where $f_{s}$ is the computing capacity of the MEC server;
the total time of the offloading process is:
the energy consumption of the offloading process also has two parts, namely the energy consumption of transmission and the energy consumption of executing the computing task at the MEC server, respectively:
according to equation (11) and equation (12), the total energy consumption of the offloading process is expressed as:
thus, the offload computation cost function of the kth user in the mth group is expressed as:
3.3) total calculated cost of the user:
combining the local computation cost from 3.1 and the offload computation cost from 3.2, the overall computation cost function for the user to complete the computing task can be expressed as:
further, the specific steps of the fourth step are as follows:
in the NOMA-MEC system of the ultra-dense network, the SINR and channel gain of the kth user in the mth group constitute the state space, expressed as:
$s_{mk}(t)=\{\tau_{mk}(t),h_{mk}(t)\}$ (16),
each user selects an action $a_{mk}(t)$ from the action space $A$ according to its current state $s_{mk}(t)$; the action of the kth user in the mth group consists of its transmit power, offload variable, and weight coefficient, and the action $a_{mk}(t)\in A$ is expressed as:
$a_{mk}(t)=\{p_{mk}(t),x_{mk},\lambda_{mk}\}$ (17),
where $\lambda_{mk}$ is the weight coefficient balancing delay and energy consumption;
according to the analysis of the user calculation cost in the third step, the cost function of the user is expressed as:
therefore, the reward function for the kth user in the mth group is expressed as:
in the mean field game, the Hamilton-Jacobi-Bellman (HJB) equation and the Fokker-Planck-Kolmogorov (FPK) equation describe the entire system model;
when the kth user in the mth group is in state $s_{mk}(t)$ and selects action $a_{mk}(t)$, its FPK equation can be expressed as:
$\pi_{mk}(t+1)=\pi_{mk}(t)P_{mk}(p_{mk},x_{mk},\lambda_{mk})$ (20),
where $\pi_{mk}(t+1)$ is the state distribution of the kth user in the mth group at time $(t+1)$, and $P_{mk}(p_{mk},x_{mk},\lambda_{mk})$ is the probability that the kth user in the mth group transitions from its state at time $t$ to its state at time $(t+1)$, which is mainly determined by the users' actions;
according to the definition of the reward function, the value function of state $s_{mk}(t)$ at time $t$ (i.e., the HJB equation) is expressed as:
the Nash equilibrium solution of the MFG is then solved based on the FPK and HJB equations.
Further, the concrete mode of the step five is as follows:
the equilibrium solution of the MFG is solved with the DDPG algorithm, whose objective function is defined as:
where $\theta^{\mu}$ is the parameter of the policy network that generates deterministic actions, and $\theta^{\mu}$ is updated through the policy gradient;
there are two networks in the Actor part: an online policy network and a target policy network. The deterministic policy $\mu$ directly yields the action at each instant, $a_{t}=\mu(s_{t}|\theta^{\mu})$, as a determined value. Like the Actor part, the Critic part also has two networks, namely an online Q network and a target Q network. The Q function (i.e., action-value function) defined by the Bellman equation is the expected reward of selecting an action under the deterministic policy, and a Q network is used to fit the Q function, i.e.:
$Q^{\mu}(s_{t},a_{t})=\mathbb{E}[R+\gamma Q(s_{t+1},\mu(s_{t+1}))]$ (23),
where $Q^{\mu}(s_{t},a_{t})$ denotes the expected value obtained by selecting action $a_{t}$ with the deterministic policy $\mu$ in state $s_{t}$; to measure the performance of the policy, the performance objective is defined as follows:
where $\beta$ represents the behavior policy and $\rho^{\beta}$ is the probability density function of the state space. In the Critic part, the mean square error is taken as the loss function, i.e.:
thus, the gradient of the loss function $L(\theta^{Q})$ with respect to $\theta^{Q}$ can be derived from the standard back-propagation algorithm, i.e.:
by updating the gradient in real time, the objective function tends to converge, and finally an optimal strategy is obtained, namely an optimal resource allocation scheme in the mobile edge computing system is obtained.
Compared with the prior art, the invention has the beneficial effects that:
1. The NOMA-MEC system is formulated as an MFG theoretical framework, and the equilibrium solution of the MFG is obtained through reinforcement learning, thereby minimizing the user's computation cost, including energy consumption and time delay.
2. The invention constructs an uplink NOMA-MEC system in an ultra-dense network, in which each SBS is equipped with an MEC server to serve multiple users. In this system, all users served by each SBS are divided into different clusters according to the user clustering algorithm to increase users' data rate.
3. The NOMA-MEC system in the ultra-dense network is modeled as an MFG framework. The equilibrium solution of the MFG is then solved with the DDPG method, reducing users' energy consumption and task delay by learning a dynamic resource allocation strategy.
4. Experiments verify that the proposed method can effectively learn the optimal resource allocation strategy and, compared with other methods, more effectively reduces users' computation delay and energy consumption.
[ description of the drawings ]
FIG. 1 is a system diagram of the mobile edge computing of the ultra-dense network proposed by the present invention;
FIG. 2 is a schematic diagram of the relationship between the mean field game and the reinforcement learning algorithm of the present invention;
FIG. 3 is a schematic diagram of the present invention employing a reinforcement learning algorithm to optimize resource allocation in a NOMA-MEC system;
FIG. 4 is a diagram illustrating the relationship between energy consumption and maximum transmit power for different algorithm comparisons according to the present invention;
fig. 5 is a schematic diagram of the relationship between the calculated delay and the maximum transmission power under comparison of different algorithms.
[ detailed description ]
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
Different from the existing literature, from the viewpoint of relieving network resource pressure and overcoming the limitations of mobile devices themselves, the invention studies resource optimization in the uplink NOMA-MEC system of an ultra-dense network, combining a deep reinforcement learning algorithm to minimize system delay and energy consumption by optimizing the power and offloading strategies.
Step one, constructing a system model:
an uplink NOMA-MEC system is constructed, each SBS being equipped with a MEC server to serve multiple users.
The specific construction mode is as follows:
as shown in fig. 1, the present invention considers a NOMA-MEC communication system in an ultra-dense network with M = {1,2,…,M} small base stations, each equipped with an MEC server to execute users' offloaded computing tasks. Assuming the set of users served by each small base station is N = {1,2,…,N}, the users need to be grouped in order to reduce inter-user interference. In the present invention, the N users are divided into Y = {1,2,…,Y} groups, with K = {1,2,…,K} users in each group.
For information transmission, the total system bandwidth B is divided into Y sub-channels, the bandwidth of each sub-channel is denoted $B_{sc}$, and the users in each group transmit information simultaneously on their sub-channel.
And step two, clustering all users in the system through a user clustering algorithm so as to improve the data transmission rate of the users. The intra-cluster users adopt a NOMA transmission mode, and the inter-cluster users adopt a Time Division Multiple Access (TDMA) transmission mode.
The specific mode of the second step is as follows:
in the NOMA-MEC communication system model established in step one, all users served by each SBS are sorted by channel gain, and the users with the M largest channel gains are sequentially selected as the first users of the M NOMA clusters. Next, from the remaining users, the user that maximizes the sum of channel gain differences within a NOMA cluster is selected according to a greedy matching method. Further, when the number of users cannot be uniformly allocated to each cluster, the redundant users are randomly allocated to different clusters, ensuring the channel gains of users within a cluster differ.
And step three, calculating the calculation cost of the user, namely the time delay and the energy consumption when the user processes the task. Including the local computational cost and the offload computational cost of the user.
The third step is specifically as follows:
users complete clustering according to the clustering algorithm in step two. During information transmission, NOMA is adopted within a cluster and TDMA between clusters, so any user is interfered not only by users in the same cluster but also by users served by other SBSs transmitting in the same time slot.
For users in a NOMA cluster, a user with larger channel gain is interfered by users with smaller channel gain, while the user with the smallest channel gain is not interfered by other users. Thus, the interference experienced by a user within a NOMA cluster can be expressed as:
where $p_{mf}$ denotes the transmit power of the fth user in the mth NOMA cluster, and $h_{mf}$ denotes the channel gain of the fth user in the mth group.
Secondly, in an ultra-dense network, users served by different small base stations may interfere with one another when transmitting tasks in the same time slot, which may be expressed as:
where $p_{jk}$ denotes the transmit power of the kth user in group $j$, and $h_{jk}$ denotes the channel gain of the kth user in group $j$.
So the SINR of the kth user in the mth group is expressed as:
where $\sigma^{2}$ is the power of the additive white Gaussian noise; the data rate of the kth user in the mth group is expressed as:
$R_{mk}=W_{sc}\log(1+\tau_{mk})$ (4),
where $W_{sc}=W_{total}/M$ and $W_{total}$ is the system bandwidth.
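A small numerical sketch of Eqs. (1)-(4) under the stated SIC decoding order — intra-cluster interference comes only from users with smaller channel gain. The function and parameter names are illustrative, and the base-2 logarithm in Eq. (4) is our assumption:

```python
import math

def noma_sinr_and_rate(p, h, k, inter_cell, noise, w_sc):
    """SINR (Eq. 3) and data rate (Eq. 4) of user k in one NOMA cluster.

    p, h: transmit powers and channel gains of the cluster's users;
    inter_cell: aggregate interference from other small base stations (Eq. 2);
    noise: additive white Gaussian noise power; w_sc: sub-channel bandwidth.
    """
    # Eq. (1): after SIC, only users with smaller channel gain interfere with user k
    intra = sum(p[f] * h[f] for f in range(len(h)) if h[f] < h[k])
    sinr = p[k] * h[k] / (intra + inter_cell + noise)
    rate = w_sc * math.log2(1 + sinr)  # log base 2 assumed
    return sinr, rate
```

Note that the weakest-gain user in the cluster gets `intra == 0`, matching the statement that it suffers no intra-cluster interference.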
The computing task of the kth user in the mth group may be defined as a tuple $(d_{mk},c_{mk},T_{mk}^{\max})$, where $d_{mk}$ represents the input data required by the kth user in the mth group to complete the computing task, $c_{mk}$ represents the number of CPU cycles required to compute $d_{mk}$, and $T_{mk}^{\max}$ represents the latest time by which the kth user in the mth group must complete the computing task.
Let $x_{mk}$ denote the offload variable of the kth user in the mth group; for the local computation model, assume $f_{mk}^{l}>0$ represents the local computing capacity of the kth user in the mth group; then when the user executes the task locally, its time is:
when calculating the energy consumption of local computation, a common energy model is used, i.e., $\varepsilon=\kappa f^{2}$, where $\kappa$ is an energy coefficient depending on the chip architecture, so the local energy consumption of the kth user in the mth group can be expressed as:
according to equations (5) and (6), the computation cost of the kth user in the mth group under local computation can be expressed as:
where $\lambda_{mk}^{t}$ and $\lambda_{mk}^{e}$ are weight coefficients for delay and energy consumption, respectively, with $\lambda_{mk}^{t}+\lambda_{mk}^{e}=1$. A larger $\lambda_{mk}^{t}$ indicates the user is delay-sensitive and focuses more on computing time; otherwise, the user's energy is low and the energy consumption of the computing task is emphasized.
The process of offloading to the MEC server comprises two parts, transmission and computation at the MEC server; the transmission time and execution time are respectively:
where $f_{s}$ is the computing capacity of the MEC server. The total time of the offloading process is therefore:
similarly, the energy consumption of the offloading process also has two parts, namely the energy consumption of transmission and the energy consumption of executing the computing task at the MEC server, respectively:
according to equation (11) and equation (12), the total energy consumption of the offloading process can be expressed as:
thus, the cost function of the kth user in the mth group during the offloading process can be expressed as:
further, the cost function of the kth user in the mth group to complete the computing task can be expressed as:
Step four, establishing a cost function:
modeling NOMA-MEC as an MFG framework, wherein SINR and channel gain of a user are expressed as a state space, and transmitting power, an unloading decision factor and a resource allocation factor of the user are expressed as an action space; and constructing a reward function of the user according to the calculation cost of the user.
The fourth step comprises the following specific steps:
when many users are simultaneously computing tasks, the interference can become very severe. This severely reduces the data transfer rate for the user, thereby increasing the time delay and power consumption when offloading computing tasks. Since each user is an independent individual, it only considers his interests in the ultra-dense scenario. Therefore, the present invention expresses this model as the MFG theoretical framework.
The state of each user comes only from its own local observations. In the NOMA-MEC system of the ultra-dense network, the SINR and channel gain of the kth user in the mth group constitute the state space, expressed as:
$s_{mk}(t)=\{\tau_{mk}(t),h_{mk}(t)\}$ (16),
each user selects an action $a_{mk}(t)$ from the action space $A$ according to its current state $s_{mk}(t)$; the action of the kth user in the mth group consists of its transmit power, offload variable, and weight coefficient, and the action $a_{mk}(t)\in A$ is expressed as:
$a_{mk}(t)=\{p_{mk}(t),x_{mk},\lambda_{mk}\}$ (17),
where $\lambda_{mk}$ is the weight coefficient balancing delay and energy consumption.
It is an object of the invention to minimize the user's computation cost subject to the maximum delay constraint. From the analysis of the user's computation cost in step three, the user's cost function can be expressed as:
therefore, the reward function for the kth user in the mth group can be expressed as:
in the mean field game, the Hamilton-Jacobi-Bellman (HJB) equation and the Fokker-Planck-Kolmogorov (FPK) equation describe the entire system model. When the kth user in the mth group is in state $s_{mk}(t)$ and selects action $a_{mk}(t)$, its FPK equation can be expressed as:
$\pi_{mk}(t+1)=\pi_{mk}(t)P_{mk}(p_{mk},x_{mk},\lambda_{mk})$ (20),
where $\pi_{mk}(t+1)$ is the state distribution of the kth user in the mth group at time $(t+1)$, and $P_{mk}(p_{mk},x_{mk},\lambda_{mk})$ is the probability that the kth user in the mth group transitions from its state at time $t$ to its state at time $(t+1)$, which is mainly determined by the user's actions.
According to the definition of the reward function, the value function of state $s_{mk}(t)$ at time $t$ (i.e., the HJB equation) is expressed as:
the Nash equilibrium solution of the MFG can then be solved based on the FPK and HJB equations.
And step five, obtaining the equilibrium solution of the mean field game using a DDPG-based reinforcement learning method.
The concrete mode of the fifth step is as follows:
the DDPG algorithm is adopted to solve the equilibrium solution of the MFG, the problem of continuous action space can be solved, and the relation between the MFG and reinforcement learning is shown in figure 2. The DDPG algorithm can be used for resource optimization problems in many communication scenarios.
A schematic diagram for optimizing resource allocation in a NOMA-MEC system using the DDPG algorithm is shown in fig. 3. The DDPG algorithm is an Actor-Critic framework, so the DDPG algorithm is mainly divided into an Actor part and a Critic part to describe the process of the DDPG algorithm. The Actor part outputs a specific action a by minimizing the action Q (s, a) through a deterministic strategy mu on the premise of inputting a state s; the criticic part is to output Q (s, a) updated by bellman's equation on the premise of inputting state s and a specific action a. Thus, the objective function of the DDPG algorithm can be defined as:
wherein, thetaμIs a policy network that generates deterministic actionsAnd θ isμThe update is performed by a policy gradient.
There are two main networks in the Actor part, an online policy network and a target policy network. Deterministic policy μ for directly deriving action a at each instantt=μ(st|θμ) A determined value. Like the Actor portion, the criticc portion also has two networks, namely an online Q network and a target Q network. The Q function (i.e. action value function) defined by bellman's equation is the reward expectation value for selecting an action under a deterministic policy, fitting the Q function using a Q network, i.e.:
Qμ(st,at)=E[R+γQ(st+1,μ(st+1))](23),
wherein Q isμ(st,at) Is shown in state stSelecting action a with deterministic policy μtThe expected value obtained, to measure the performance of the policy, defines the performance goals as follows:
where β represents the behavior strategy, ρβIs a probability density function of the state space. The purpose of the training is to target the performance of the Q network JβMaximizing and minimizing the loss of the Q network. In Critic part, the mean square error is taken as a loss function, i.e.:
L(θQ)=E[R+γQ′(st+1,μ′(st+1|θμ′)|θQ′)-Q(st,at|θQ)](25),
thus, the loss function L with respect to θ can be derived from standard back propagation algorithmsQI.e.:
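The Critic-side computation of Eqs. (23), (25), and (26) can be sketched with a linear Q-function approximator. All names are illustrative, and the target value Q'(s', μ'(s')) is supplied here as a precomputed number rather than produced by actual target networks:

```python
def q_value(theta, phi):
    """Linear action-value approximation Q(s, a; theta) = theta . phi(s, a)."""
    return sum(t * f for t, f in zip(theta, phi))

def critic_loss_and_grad(theta, batch, gamma):
    """Mean-square Bellman error (Eq. (25)) and its gradient w.r.t. theta (Eq. (26)).

    batch: (phi_sa, reward, q_target_next) tuples, where q_target_next plays
    the role of the target networks' Q'(s_{t+1}, mu'(s_{t+1})).
    """
    loss, grad = 0.0, [0.0] * len(theta)
    for phi, r, q_next in batch:
        y = r + gamma * q_next           # Bellman target (Eq. (23))
        delta = q_value(theta, phi) - y  # TD error
        loss += delta ** 2
        for i in range(len(theta)):
            grad[i] += 2.0 * delta * phi[i]  # d(delta^2)/d(theta_i)
    n = len(batch)
    return loss / n, [g / n for g in grad]
```

Descending this gradient drives the online Q network toward the Bellman target, which is the real-time update that makes the objective converge to the optimal resource allocation strategy.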
Example:
The diagrams provided in the following example and the specific parameter values in the models mainly serve to explain the basic idea of the present invention and to verify it by simulation; they can be adjusted appropriately according to the actual scenario and requirements of a specific application environment.
The invention studies a NOMA-MEC system in an ultra-dense network in which 60 small base stations are randomly distributed within a 10 km × 10 km area, the coverage radius of each small base station is 20 m, and 64 users are randomly distributed near the small base stations.
To implement the DDPG algorithm, the Actor network and Critic network use fully-connected neural networks with three hidden layers, each containing 300 neurons. For the Actor network, the output layer uses a Sigmoid activation function to ensure the output action probability lies between 0 and 1. For the Critic network, a ReLU activation function is used for each layer. The learning rates of the Actor and Critic networks are set to 0.0001 and 0.001, respectively.
Fig. 4 and 5 show the effect of the maximum transmit power under different algorithms and multiple access modes. In fig. 4, it can be observed that the system's energy consumption gradually increases with the maximum transmit power. For a fixed maximum transmit power, the NOMA scheme achieves lower energy consumption, because users in a NOMA cluster can simultaneously use the full spectrum resources to transmit information, which reduces the system's energy consumption. As can be seen from fig. 5, the computation delay decreases as the maximum transmit power increases: a larger maximum transmit power increases the user's computation speed and data transmission rate, reducing the computation delay.
Claims (5)
1. A method for allocating computing resources at a mobile edge of a very dense network,
the resource allocation method is based on an ultra-dense network, wherein the NOMA-MEC communication system in the ultra-dense network comprises M = {1,2,…,M} small base stations, each equipped with an MEC server to execute the computing tasks offloaded by users; assuming the set of users served by each small base station is N = {1,2,…,N}, the N users are divided into Y = {1,2,…,Y} groups with K = {1,2,…,K} users in each group;
the resource allocation method is implemented according to the following steps:
step one, an uplink NOMA-MEC communication system is constructed, and each SBS is provided with an MEC server to serve a plurality of users;
step two, clustering all users in the NOMA-MEC communication system according to their channel gain differences; users within a cluster adopt the NOMA transmission mode, while TDMA is adopted between clusters;
step three, calculating the calculation cost of the user, namely the time delay and the energy consumption when the user processes the task; wherein the computational cost comprises a local computational cost and an off-load computational cost of the user;
modeling the NOMA-MEC communication system into an MFG framework; the SINR and the channel gain of a user are expressed as a state space, and the transmitting power, the unloading decision factor and the resource allocation factor of the user are expressed as an action space; constructing a reward function of the user according to the calculation cost of the user;
and step five, obtaining the equilibrium solution of the mean field game, i.e., the optimal resource allocation scheme in the mobile edge computing system, using a DDPG-based reinforcement learning method.
2. The method for allocating computing resources on the mobile edge of the ultra-dense network as claimed in claim 1, wherein the specific method in the second step is:
in the NOMA-MEC communication system model established in step one, all users served by each SBS are sorted by channel gain, and the users with the M largest channel gains are sequentially selected as the first users of the M NOMA clusters;
the remaining users are then assigned by a greedy matching method: the user that maximizes the sum of channel gain differences within the NOMA cluster is selected;
when the number of users cannot be uniformly allocated to each cluster, the redundant users are randomly allocated to different clusters, and the channel gain of each user in the clusters is different.
3. The mobile edge computing resource allocation method for an ultra-dense network as claimed in claim 1 or 2, wherein the third step is specifically:
3.1) local computing cost of the user:
let x_mk denote the offloading variable of the kth user in the mth group; in the local computation model, i.e., when the user completes the computation task locally without offloading it to the MEC server, let f_mk^l denote the local computing capability of the kth user in the mth group; the time for the user to execute the task locally is then:
when calculating the energy consumption of local computation, the commonly used energy-consumption model E = κf² is adopted, where κ is an energy coefficient that depends on the chip architecture; the local energy consumption of the kth user in the mth group can then be expressed as:
according to equations (5) and (6), the local computation cost of the kth user in the mth group can be expressed as:
wherein λ_mk and (1 − λ_mk) are the weight coefficients of delay and energy consumption, respectively;
3.2) offloading computation cost of the user:
offloading a task to the MEC server comprises two parts, transmission and computation at the MEC server; the transmission time and the execution time are respectively:
where f_s is the computing power of the MEC server;
the total time of the offloading process is:
the energy consumption of the offloading process likewise has two parts, namely the energy consumed during transmission and the energy consumed executing the computation task at the MEC server, which are respectively:
according to equations (11) and (12), the total energy consumption of the offloading process is expressed as:
thus, the offloading computation cost function of the kth user in the mth group is expressed as:
3.3) total calculated cost of the user:
from the local computation cost of 3.1) and the offloading computation cost of 3.2), the total computation cost function for the user to complete the computation task can be expressed as:
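The cost model of step three can be illustrated with a small sketch. All function names, parameter names, and numeric values below are illustrative assumptions; the claim's equations (5)–(15) define the actual model, and here the server-side execution energy is omitted from the user's offloading energy, which is a common but not universal modeling choice:

```python
# Weighted delay/energy computation cost for one user, following the
# structure of step three: local cost vs. offloading cost, selected by
# the offloading variable x (0 = local, 1 = offload), with weight lam
# on delay and (1 - lam) on energy.

def local_cost(cycles, f_local, kappa, lam):
    t = cycles / f_local               # local execution time
    e = kappa * f_local ** 2 * cycles  # E = kappa * f^2 energy per cycle
    return lam * t + (1 - lam) * e

def offload_cost(bits, rate, p_tx, cycles, f_server, lam):
    t_tx = bits / rate                 # uplink transmission time
    t_exec = cycles / f_server         # execution time at the MEC server
    e = p_tx * t_tx                    # transmission energy at the user side
    return lam * (t_tx + t_exec) + (1 - lam) * e

def total_cost(x, **kw):
    if x == 0:
        return local_cost(kw['cycles'], kw['f_local'], kw['kappa'], kw['lam'])
    return offload_cost(kw['bits'], kw['rate'], kw['p_tx'],
                        kw['cycles'], kw['f_server'], kw['lam'])
```

For example, a 10^6-cycle task on a 10^8 Hz local CPU with κ = 10^-26 takes 0.01 s and consumes 10^-4 J, giving a weighted cost of 0.00505 with lam = 0.5.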
4. The mobile edge computing resource allocation method for an ultra-dense network as claimed in claim 1 or 2, wherein the fourth step is specifically:
in the NOMA-MEC system of the ultra-dense network, the SINR and channel gain of the kth user in the mth group constitute the state space, which is expressed as:
s_mk(t) = {τ_mk(t), h_mk(t)} (16),
each user selects an action a_mk(t) from the action space A according to its current state s_mk(t); the action of the kth user in the mth group consists of its transmit power, offloading variable and weight coefficient, and the action a_mk(t) ∈ A is expressed as:
a_mk(t) = {p_mk(t), x_mk, λ_mk} (17),
where λ_mk is the weight coefficient of delay and energy consumption;
according to the analysis of the user calculation cost in the third step, the cost function of the user is expressed as:
therefore, the reward function for the kth user in the mth group is expressed as:
in the mean field game, the Hamilton-Jacobi-Bellman (HJB) equation and the Fokker-Planck-Kolmogorov (FPK) equation describe the entire system model;
when the kth user in the mth group in state s_mk(t) selects action a_mk(t), its FPK equation can be expressed as:
π_mk(t+1) = π_mk(t)P_mk(p_mk, x_mk, λ_mk) (20),
where π_mk(t+1) is the state distribution of the kth user in the mth group at time (t+1), and P_mk(p_mk, x_mk, λ_mk) is the probability that the kth user in the mth group transfers from its state at time t to its state at time (t+1), determined mainly by the user's action;
according to the definition of the reward function, the value function of state s_mk(t) at time t (i.e., the HJB equation) is expressed as:
a Nash equilibrium solution of the MFG is then solved based on the FPK and HJB equations.
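The FPK update of equation (20) is, in discrete time, an evolution of the state distribution under an action-dependent transition matrix. A minimal sketch (the function name and the two-state example are assumptions for illustration):

```python
# One FPK step: the state distribution evolves as pi(t+1) = pi(t) * P,
# where P is the transition matrix induced by the user's chosen action
# (power, offloading variable, weight coefficient).

def fpk_step(pi, P):
    """pi: list of state probabilities; P: row-stochastic transition matrix."""
    n = len(pi)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
```

Because P is row-stochastic, the updated vector remains a probability distribution; iterating this step yields the mean field that each user's HJB value function is evaluated against.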
5. The mobile edge computing resource allocation method for an ultra-dense network as claimed in claim 1 or 2, wherein the fifth step is specifically:
the equilibrium solution of the MFG is solved by a DDPG algorithm, whose objective function is defined as:
where θ^μ is the parameter of the policy network that generates deterministic actions, and θ^μ is updated through the policy gradient;
the Actor part contains two networks, an online policy network and a target policy network; the deterministic policy μ directly gives the action at each time step, a_t = μ(s_t|θ^μ), as a determined value; similarly, the Critic part also contains two networks, an online Q network and a target Q network; the Q function (i.e., the action-value function) defined by the Bellman equation is the expected reward of selecting an action under the deterministic policy, and a Q network is used to fit the Q function, i.e.:
Q^μ(s_t, a_t) = E[R + γQ(s_{t+1}, μ(s_{t+1}))] (23),
where Q^μ(s_t, a_t) denotes the expected return obtained by selecting action a_t in state s_t under the deterministic policy μ; to measure the performance of the policy, the performance objective is defined as follows:
where β denotes the behavior policy and ρ^β is the probability density function over the state space; in the Critic part, the mean square error is taken as the loss function, i.e.:
L(θ^Q) = E[(R + γQ′(s_{t+1}, μ′(s_{t+1}|θ^{μ′})|θ^{Q′}) − Q(s_t, a_t|θ^Q))²] (25),
thus, the gradient of the loss function L with respect to θ^Q can be derived by the standard backpropagation algorithm, i.e.:
by updating the gradient in real time, the objective function converges, and finally the optimal policy, namely the optimal resource allocation scheme in the mobile edge computing system, is obtained.
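The Critic update of equations (23)–(26) can be sketched with scalar linear approximators. This is a toy illustration only: real DDPG uses neural networks for Q(s, a|θ^Q) and μ(s|θ^μ), and every name and value below is an assumption:

```python
# Core DDPG update rules with 1-D linear Q and policy:
#   target   y  = R + gamma * Q'(s', mu'(s'))          (Bellman target, eq. (23))
#   loss     L  = (y - Q(s, a))^2                       (MSE critic loss, eq. (25))
#   tracking theta' <- tau*theta + (1 - tau)*theta'     (soft target-network update)

def td_target(r, s_next, gamma, theta_mu_t, theta_q_t):
    a_next = theta_mu_t * s_next                  # target policy mu'(s')
    return r + gamma * theta_q_t * s_next * a_next  # target critic Q'(s', a')

def critic_step(theta_q, s, a, y, lr):
    q = theta_q * s * a                           # Q(s, a | theta_q)
    grad = -2 * (y - q) * s * a                   # dL/d theta_q for the MSE loss
    return theta_q - lr * grad                    # one gradient-descent step

def soft_update(theta, theta_target, tau):
    return tau * theta + (1 - tau) * theta_target
```

Computing the target y with the separate, slowly tracking target networks (μ′, Q′) rather than the online networks is what stabilizes the bootstrapped MSE objective as the gradient is updated each step.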
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010597779.5A CN111800828B (en) | 2020-06-28 | 2020-06-28 | Mobile edge computing resource allocation method for ultra-dense network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111800828A true CN111800828A (en) | 2020-10-20 |
CN111800828B CN111800828B (en) | 2023-07-18 |
Family
ID=72803807
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010597779.5A Active CN111800828B (en) | 2020-06-28 | 2020-06-28 | Mobile edge computing resource allocation method for ultra-dense network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111800828B (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112468568A (en) * | 2020-11-23 | 2021-03-09 | 南京信息工程大学滨江学院 | Task relay unloading method of mobile edge computing network |
CN112492691A (en) * | 2020-11-26 | 2021-03-12 | 辽宁工程技术大学 | Downlink NOMA power distribution method of deep certainty strategy gradient |
CN112601256A (en) * | 2020-12-07 | 2021-04-02 | 广西师范大学 | MEC-SBS clustering-based load scheduling method in ultra-dense network |
CN112654081A (en) * | 2020-12-14 | 2021-04-13 | 西安邮电大学 | User clustering and resource allocation optimization method, system, medium, device and application |
CN112738822A (en) * | 2020-12-25 | 2021-04-30 | 中国石油大学(华东) | NOMA-based security offload and resource allocation method in mobile edge computing environment |
CN113055854A (en) * | 2021-03-16 | 2021-06-29 | 西安邮电大学 | NOMA-based vehicle edge computing network optimization method, system, medium and application |
CN113517920A (en) * | 2021-04-20 | 2021-10-19 | 东方红卫星移动通信有限公司 | Calculation unloading method and system for simulation load of Internet of things in ultra-dense low-orbit constellation |
CN113543342A (en) * | 2021-07-05 | 2021-10-22 | 南京信息工程大学滨江学院 | Reinforced learning resource allocation and task unloading method based on NOMA-MEC |
CN113938997A (en) * | 2021-09-30 | 2022-01-14 | 中国人民解放军陆军工程大学 | Resource allocation method for secure MEC system in NOMA (non-access-oriented multi-media access) Internet of things |
CN114827191A (en) * | 2022-03-15 | 2022-07-29 | 华南理工大学 | Dynamic task unloading method for fusing NOMA in vehicle-road cooperative system |
CN115022937A (en) * | 2022-07-14 | 2022-09-06 | 合肥工业大学 | Topological feature extraction method and multi-edge cooperative scheduling method considering topological features |
CN115460080A (en) * | 2022-08-22 | 2022-12-09 | 昆明理工大学 | Block chain assisted time-varying mean field game edge calculation unloading optimization method |
CN117857559A (en) * | 2024-03-07 | 2024-04-09 | 北京邮电大学 | Metropolitan area optical network task unloading method based on average field game and edge server |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107819840A (en) * | 2017-10-31 | 2018-03-20 | 北京邮电大学 | Distributed mobile edge calculations discharging method in the super-intensive network architecture |
CN109548013A (en) * | 2018-12-07 | 2019-03-29 | 南京邮电大学 | A kind of mobile edge calculations system constituting method of the NOMA with anti-eavesdropping ability |
US20190124667A1 (en) * | 2017-10-23 | 2019-04-25 | Commissariat A L'energie Atomique Et Aux Energies Alternatives | Method for allocating transmission resources using reinforcement learning |
CN109951897A (en) * | 2019-03-08 | 2019-06-28 | 东华大学 | A kind of MEC discharging method under energy consumption and deferred constraint |
CN110798849A (en) * | 2019-10-10 | 2020-02-14 | 西北工业大学 | Computing resource allocation and task unloading method for ultra-dense network edge computing |
CN111245539A (en) * | 2020-01-07 | 2020-06-05 | 南京邮电大学 | NOMA-based efficient resource allocation method for mobile edge computing network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111800828B (en) | Mobile edge computing resource allocation method for ultra-dense network | |
CN109729528B (en) | D2D resource allocation method based on multi-agent deep reinforcement learning | |
CN113950066B (en) | Single server part calculation unloading method, system and equipment under mobile edge environment | |
CN111586720B (en) | Task unloading and resource allocation combined optimization method in multi-cell scene | |
CN111445111B (en) | Electric power Internet of things task allocation method based on edge cooperation | |
Wen et al. | Federated dropout—a simple approach for enabling federated learning on resource constrained devices | |
CN113873022A (en) | Mobile edge network intelligent resource allocation method capable of dividing tasks | |
CN110798849A (en) | Computing resource allocation and task unloading method for ultra-dense network edge computing | |
CN112118287B (en) | Network resource optimization scheduling decision method based on alternative direction multiplier algorithm and mobile edge calculation | |
Chen et al. | Multiuser computation offloading and resource allocation for cloud–edge heterogeneous network | |
CN112788605B (en) | Edge computing resource scheduling method and system based on double-delay depth certainty strategy | |
CN110856259A (en) | Resource allocation and offloading method for adaptive data block size in mobile edge computing environment | |
Cheng et al. | Efficient resource allocation for NOMA-MEC system in ultra-dense network: A mean field game approach | |
CN113596785A (en) | D2D-NOMA communication system resource allocation method based on deep Q network | |
CN112860429A (en) | Cost-efficiency optimization system and method for task unloading in mobile edge computing system | |
Lu et al. | Cost-efficient resources scheduling for mobile edge computing in ultra-dense networks | |
Sha et al. | DRL-based task offloading and resource allocation in multi-UAV-MEC network with SDN | |
CN114650228A (en) | Federal learning scheduling method based on computation unloading in heterogeneous network | |
Bhandari et al. | Optimal Cache Resource Allocation Based on Deep Neural Networks for Fog Radio Access Networks | |
Geng et al. | Deep reinforcement learning-based computation offloading in vehicular networks | |
Hu et al. | Deep reinforcement learning for task offloading in edge computing assisted power IoT | |
CN114885422A (en) | Dynamic edge computing unloading method based on hybrid access mode in ultra-dense network | |
Xu et al. | Deep reinforcement learning for communication and computing resource allocation in RIS aided MEC networks | |
CN114828018A (en) | Multi-user mobile edge computing unloading method based on depth certainty strategy gradient | |
Liu et al. | Multi-User Dynamic Computation Offloading and Resource Allocation in 5G MEC Heterogeneous Networks With Static and Dynamic Subchannels |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||