CN111800828A - Mobile edge computing resource allocation method for ultra-dense network - Google Patents


Publication number
CN111800828A
CN111800828A CN202010597779.5A CN202010597779A CN111800828A CN 111800828 A CN111800828 A CN 111800828A CN 202010597779 A CN202010597779 A CN 202010597779A CN 111800828 A CN111800828 A CN 111800828A
Authority
CN
China
Prior art keywords
user
users
expressed
noma
energy consumption
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010597779.5A
Other languages
Chinese (zh)
Other versions
CN111800828B (en)
Inventor
李立欣
程倩倩
张敬敏
王大伟
李旭
梁微
林文晟
李煊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN202010597779.5A
Publication of CN111800828A
Application granted
Publication of CN111800828B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/16 Central resource management; Negotiation of resources or communication parameters, e.g. negotiating bandwidth or QoS [Quality of Service]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W72/00 Local resource management
    • H04W72/20 Control channels or signalling for resource management
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention discloses a mobile edge computing resource allocation method for an ultra-dense network. The NOMA-MEC communication system in the ultra-dense network comprises a set M = {1, 2, …, M} of small base stations, each equipped with an MEC server that executes the computing tasks offloaded by users. The set of users served by each small base station is N = {1, 2, …, N}; the N users are divided into Y = {1, 2, …, Y} groups, with K = {1, 2, …, K} users in each group. The method addresses the difficulty the prior art has in handling mutual interference among users, which degrades the users' computing performance.

Description

Mobile edge computing resource allocation method for ultra-dense network
[ technical field ]
The invention belongs to the technical field of wireless communication, and particularly relates to a mobile edge computing resource allocation method for an ultra-dense network.
[ background of the invention ]
With the rapid development of fifth-generation (5G) mobile communication technology, ultra-dense networks (UDNs) have become a principal architecture for future deployments. A UDN can effectively improve system capacity and data transmission rate to guarantee users' quality of service. However, handling computation-intensive tasks in UDNs is a major challenge because users' computing capability is limited. Mobile edge computing (MEC) is an emerging technology proposed to relieve the computational pressure on users in UDNs. Specifically, MEC offloads computation-intensive tasks to the network edge to reduce users' energy consumption and task delay.
In MEC systems, how to improve the spectrum resource utilization between users is a significant challenge, as it directly affects energy consumption and task delay. As an emerging multiple access method, non-orthogonal multiple access (NOMA) can effectively improve the spectrum efficiency of a system by allocating the same resource to multiple users. Thus, in some work, NOMA has been applied to MEC systems to reduce energy consumption and task delays.
Mean field game (MFG) theory is a tool suited to scenarios with a large number of players, and it can model the relationship between individuals and the population in UDNs. Specifically, in UDNs the MFG averages the effects among members, which simplifies complex models.
The authors of document 1, "Learning Deep Mean Field Games for Modeling Large Population Behavior" [International Conference on Learning Representations, Vancouver, Canada, Apr. 2018], demonstrate an equilibrium solution of the mean field game using a Markov decision process (MDP) to predict the evolution of the population distribution over time.
Document 2, "Collaborative Artificial Intelligence (AI) for User-Cell Association in Ultra-Dense Cellular Systems" [IEEE International Conference on Communications Workshops (ICC Workshops), Kansas City, MO, May 2018], proposes a neural Q-learning algorithm to solve the user association problem in ultra-dense network systems.
Unlike the existing literature, the present invention models the NOMA-MEC system in a UDN scenario, where each small base station (SBS) is equipped with an MEC server. When a user cannot handle a large number of computing tasks, some of the tasks are offloaded to the MEC server. First, a user clustering matching algorithm (UCMA) based on channel gain differences is proposed to cluster users and thereby improve their data rates. Then, an MFG theoretical framework is established on the NOMA-MEC system model, and the equilibrium solution of the MFG is obtained with the deep deterministic policy gradient (DDPG) algorithm from reinforcement learning, to reduce users' energy consumption and task delay.
[ summary of the invention ]
The invention aims to provide a mobile edge computing resource allocation method for an ultra-dense network, to solve the problem that the prior art has difficulty handling mutual interference among users, which degrades the users' computing performance.
The technical scheme adopted by the invention is a mobile edge computing resource allocation method for an ultra-dense network. The NOMA-MEC communication system in the ultra-dense network comprises a set M = {1, 2, …, M} of small base stations, each equipped with an MEC server that executes the computing tasks offloaded by users; the set of users served by each small base station is N = {1, 2, …, N}, the N users are divided into Y = {1, 2, …, Y} groups, and each group has K = {1, 2, …, K} users;
the resource allocation method is implemented according to the following steps:
step one, constructing an uplink NOMA-MEC communication system in which each SBS is equipped with an MEC server to serve a plurality of users;
step two, clustering all users in the NOMA-MEC communication system according to their channel gain differences; users within a cluster use NOMA transmission, and TDMA transmission is used between clusters;
step three, calculating the computation cost of the user, namely the delay and the energy consumption incurred when the user processes a task; the computation cost comprises the user's local computation cost and offloading computation cost;
step four, modeling the NOMA-MEC communication system as an MFG framework; the SINR and the channel gain of a user form the state space, and the transmit power, the offloading decision factor and the resource allocation factor of the user form the action space; a reward function of the user is constructed from the user's computation cost;
step five, obtaining the equilibrium solution of the mean field game, namely the optimal resource allocation scheme in the mobile edge computing system, by a DDPG-based reinforcement learning method.
Further, the specific method of the second step is as follows:
in the NOMA-MEC communication system model established in step one, all users served by each SBS are sorted by channel gain, and the first M users in the sorted order are selected in turn as the first users of the M NOMA clusters;
then, by greedy matching, the user that gives the NOMA cluster the largest sum of channel gain differences is selected from the remaining users;
when the users cannot be divided evenly among the clusters, the surplus users are randomly assigned to different clusters, and the channel gains of the users within each cluster are distinct.
Further, the specific mode of the third step is as follows:
3.1) local computing cost of the user:
let $x_{mk}$ denote the offloading variable of the kth user in the mth group. In the local computation model, i.e., when a user completes the computing task locally without offloading it to the MEC server, let $f_{mk}^{l} > 0$ denote the local computing capacity of the kth user in the mth group; when the user executes the task locally, the time taken is:
Figure BDA0002558022200000041
The energy consumption of local computation follows a common energy model of the form $\kappa f^{2}$, where $\kappa$ is an energy coefficient that depends on the chip structure. The local energy consumption of the kth user in the mth group can be expressed as:
Figure BDA0002558022200000042
according to equations (5) and (6), the local computation cost of the kth user in the mth group can be expressed as:
Figure BDA0002558022200000043
where
Figure BDA0002558022200000044
and
Figure BDA0002558022200000045
weight coefficients representing delay and energy consumption, respectively, and
Figure BDA0002558022200000046
3.2) offloading computation cost of the user:
offloading a computation to the MEC server involves two parts, transmission and execution at the MEC server, whose transmission time and execution time are respectively:
Figure BDA0002558022200000047
Figure BDA0002558022200000048
where $f_{s}$ is the computing capacity of the MEC server;
the total time of the offloading process is:
Figure BDA0002558022200000049
the energy consumption of the offloading process likewise has two parts, the energy consumed during transmission and the energy consumed executing the computing task at the MEC server, which are respectively:
Figure BDA00025580222000000410
Figure BDA00025580222000000411
according to equations (11) and (12), the total energy consumption of the offloading process is expressed as:
Figure BDA0002558022200000051
thus, the offload computation cost function for the kth user in the mth group is expressed as:
Figure BDA0002558022200000052
3.3) total computation cost of the user:
combining the local computation cost from 3.1 and the offloading computation cost from 3.2, the overall cost function for the user to complete the computing task can be expressed as:
Figure BDA0002558022200000053
further, the specific steps of the fourth step are as follows:
in the NOMA-MEC system of the ultra-dense network, the SINR and channel gain of the kth user in the mth group form the state space, which is expressed as:
$s_{mk}(t) = \{\tau_{mk}(t),\ h_{mk}(t)\}$ (16),
each user selects an action $a_{mk}(t)$ from the action space $A$ according to its current state $s_{mk}(t)$; the action of the kth user in the mth group consists of its power, offloading variable and weight coefficients, and the action $a_{mk}(t) \in A$ is expressed as:
amk(t)={pmk(t),xmkmk} (17),
where
Figure BDA0002558022200000054
denotes the weight coefficients of delay and energy consumption;
according to the analysis of the user calculation cost in the third step, the cost function of the user is expressed as:
Figure BDA0002558022200000055
therefore, the reward function for the kth user in the mth group is expressed as:
Figure BDA0002558022200000056
in the mean field game, the Hamilton-Jacobi-Bellman (HJB) equation and the Fokker-Planck-Kolmogorov (FPK) equation describe the entire system model;
when the kth user in the mth group selects action $a_{mk}(t)$ in state $s_{mk}(t)$, its FPK equation can be expressed as:
πmk(t+1)=πmk(t)Pmk(pmk,xmkmk) (20),
where $\pi_{mk}(t+1)$ is the state of the kth user in the mth group at time $(t+1)$, and $P_{mk}(\cdot)$ is the probability that the kth user in the mth group transitions from its state at time $t$ to its state at time $(t+1)$, which is mainly determined by the user's actions;
according to the definition of the reward function, the value function (i.e., the HJB equation) of the state $s_{mk}(t)$ at time $t$ is expressed as:
Figure BDA0002558022200000061
solving a Nash equilibrium solution for the MFG based on the FPK and HJB equations.
Further, the concrete mode of the step five is as follows:
the equilibrium solution of the MFG is obtained with the DDPG algorithm, whose objective function is defined as:
Figure BDA0002558022200000062
where $\theta^{\mu}$ is the parameter of the policy network that generates deterministic actions, and $\theta^{\mu}$ is updated by the policy gradient;
the Actor part has two main networks, an online policy network and a target policy network. The deterministic policy $\mu$ directly yields a definite action value at each time step, $a_{t} = \mu(s_{t} \mid \theta^{\mu})$. Like the Actor part, the Critic part also has two networks, an online Q network and a target Q network. The Q function (i.e., the action-value function) defined by the Bellman equation is the expected reward of selecting an action under the deterministic policy, and a Q network is used to fit it, i.e.:
$Q^{\mu}(s_{t}, a_{t}) = \mathbb{E}\left[R + \gamma Q(s_{t+1}, \mu(s_{t+1}))\right]$ (23),
where $Q^{\mu}(s_{t}, a_{t})$ denotes the expected value obtained by selecting action $a_{t}$ in state $s_{t}$ under the deterministic policy $\mu$. To measure the performance of the policy, the performance objective is defined as follows:
Figure BDA0002558022200000071
where $\beta$ denotes the behavior policy and $\rho^{\beta}$ is the probability density function of the state space. In the Critic part, the mean square error is taken as the loss function, i.e.:
Figure BDA0002558022200000072
thus, the gradient of the loss function $L$ with respect to $\theta^{Q}$ can be obtained by the standard backpropagation algorithm, i.e.:
Figure BDA0002558022200000073
by updating the gradient in real time, the objective function tends to converge, and finally an optimal strategy is obtained, namely an optimal resource allocation scheme in the mobile edge computing system is obtained.
Compared with the prior art, the invention has the beneficial effects that:
1. The NOMA-MEC system is formulated within an MFG theoretical framework, and the equilibrium solution of the MFG is obtained through reinforcement learning, so that the user's computation cost, comprising energy consumption and delay, is minimized.
2. The invention constructs an uplink NOMA-MEC system in an ultra-dense network in which each SBS is equipped with an MEC server to serve a plurality of users. In this system, all users served by each SBS are divided into clusters by a user clustering algorithm to increase the users' data rates.
3. The NOMA-MEC system in the ultra-dense network is modeled as an MFG framework. The equilibrium solution of the MFG is then obtained with the DDPG method, and users' energy consumption and task delay are reduced by learning a dynamic resource allocation strategy.
4. Experiments verify that the proposed method effectively learns the optimal resource allocation strategy and, compared with other methods, reduces users' computation delay and energy consumption more effectively.
[ description of the drawings ]
FIG. 1 is a system diagram of the mobile edge computing of the ultra-dense network proposed by the present invention;
FIG. 2 is a schematic diagram of the relationship between the mean field game and the reinforcement learning algorithm of the present invention;
FIG. 3 is a schematic diagram of the present invention employing a reinforcement learning algorithm to optimize resource allocation in a NOMA-MEC system;
FIG. 4 is a diagram of the relationship between energy consumption and maximum transmit power for the different algorithms compared;
FIG. 5 is a diagram of the relationship between computation delay and maximum transmit power for the different algorithms compared.
[ detailed description ]
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
Unlike the existing literature, and with a view to relieving pressure on network resources and overcoming the inherent limitations of mobile devices, the invention studies resource optimization in an uplink NOMA-MEC system in an ultra-dense network and, using a deep reinforcement learning algorithm, minimizes system delay and energy consumption by optimizing the power and offloading strategies.
Step one, constructing a system model:
An uplink NOMA-MEC system is constructed in which each SBS is equipped with an MEC server to serve multiple users.
The specific construction is as follows:
As shown in fig. 1, the invention considers a NOMA-MEC communication system in an ultra-dense network with a set M = {1, 2, …, M} of small base stations, each equipped with an MEC server that executes the computing tasks offloaded by users. The set of users served by each small base station is N = {1, 2, …, N}. To reduce interference between users, the users are grouped: the N users are divided into Y = {1, 2, …, Y} groups, and each group has K = {1, 2, …, K} users.
For information transmission, the bandwidth $B$ of the whole system is divided into $Y$ subchannels, the bandwidth of each subchannel is denoted $B_{sc}$, and the users in each group transmit simultaneously in their subchannel.
And step two, clustering all users in the system through a user clustering algorithm so as to improve the data transmission rate of the users. The intra-cluster users adopt a NOMA transmission mode, and the inter-cluster users adopt a Time Division Multiple Access (TDMA) transmission mode.
The specific mode of the second step is as follows:
In the NOMA-MEC communication system model established in step one, all users served by each SBS are sorted by channel gain, and the first M users in the sorted order are selected in turn as the first users of the M NOMA clusters. Next, by greedy matching, the user that gives the NOMA cluster the largest sum of channel gain differences is selected from the remaining users. When the users cannot be divided evenly among the clusters, the surplus users are randomly assigned to different clusters, and the channel gains of the users within each cluster are distinct.
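The clustering procedure of step two can be illustrated with a minimal Python sketch. It is not the patented implementation: the descending sort order, the round-by-round filling of clusters, the use of absolute gain differences, and the names `cluster_users`, `gain_gap` and `seed` are all assumptions made for illustration.

```python
import random

def cluster_users(gains, num_clusters, seed=0):
    """Greedy channel-gain-difference clustering (illustrative sketch).

    gains: list of channel gains, one per user (index = user id).
    Returns a list of clusters, each a list of user ids.
    """
    rng = random.Random(seed)
    # Sort user ids by channel gain, largest first (assumed ordering).
    order = sorted(range(len(gains)), key=lambda u: gains[u], reverse=True)
    clusters = [[u] for u in order[:num_clusters]]   # first user of each cluster
    remaining = order[num_clusters:]
    per_cluster = len(gains) // num_clusters         # even share per cluster

    def gain_gap(cluster, u):
        # Sum of absolute gain differences between u and the current members.
        return sum(abs(gains[u] - gains[v]) for v in cluster)

    # Fill clusters round by round with the best-matching remaining user.
    for _ in range(per_cluster - 1):
        for cluster in clusters:
            best = max(remaining, key=lambda u: gain_gap(cluster, u))
            cluster.append(best)
            remaining.remove(best)

    # Surplus users go to distinct, randomly chosen clusters.
    targets = rng.sample(range(num_clusters), len(remaining))
    for u, c in zip(remaining, targets):
        clusters[c].append(u)
    return clusters
```

For example, `cluster_users([0.9, 0.1, 0.5, 0.3, 0.7], 2)` starts the two clusters with users 0 and 4, pairs them with users 1 and 3 respectively, and places the surplus user 2 in a randomly chosen cluster.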
Step three, calculating the computation cost of the user, namely the delay and the energy consumption incurred when the user processes a task. It comprises the user's local computation cost and offloading computation cost.
The third step is specifically as follows:
Users are clustered by the clustering algorithm of step two. During information transmission, NOMA is used within each cluster and TDMA between clusters, so any user suffers interference not only from users in the same cluster but also from users served by other SBSs in the same time slot.
For users in the NOMA cluster, users with greater channel gain will be interfered by users with smaller channel gain. The user with the smallest channel gain is not interfered by other users. Thus, the interference experienced by a user within a NOMA cluster can be expressed as:
Figure BDA0002558022200000091
where $p_{mf}$ denotes the transmit power of the fth user in the mth NOMA cluster and $h_{mf}$ denotes the channel gain of the fth user in the mth group.
Secondly, in an ultra-dense network, users served by different small base stations interfere with one another when they transmit tasks in the same time slot; this interference can be expressed as:
Figure BDA0002558022200000101
where $p_{jk}$ denotes the transmit power of the kth user in group j and $h_{jk}$ denotes the channel gain of the kth user in group j.
The SINR of the kth user in the mth group is therefore expressed as:
Figure BDA0002558022200000102
where
Figure BDA0002558022200000107
is the power of the additive white Gaussian noise. The data rate of the kth user in the mth group is expressed as:
$R_{mk} = W_{sc}\log(1+\tau_{mk})$ (4),
where $W_{sc} = W_{total}/M$ and $W_{total}$ is the system bandwidth.
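The interference rule and equation (4) can be sketched in Python as follows. The SINR expression used here (received power divided by intra-cluster interference plus inter-cell interference plus noise) is the usual form and is an assumption, since the original equations (1) to (3) are not reproduced in the text; the base-2 logarithm and the function names are likewise assumptions.

```python
import math

def sinr(k, powers, gains, inter_cell_interference, noise_power):
    """SINR of user k inside one NOMA cluster (illustrative sketch).

    Following the text, user k is interfered only by cluster members
    whose channel gain is smaller than its own.
    """
    intra = sum(p * h for j, (p, h) in enumerate(zip(powers, gains))
                if j != k and h < gains[k])
    return powers[k] * gains[k] / (intra + inter_cell_interference + noise_power)

def data_rate(bandwidth_sc, tau):
    # Equation (4); the base-2 logarithm (rate in bit/s) is an assumption.
    return bandwidth_sc * math.log2(1.0 + tau)
```

With two users of unit power, gains 2 and 1, no inter-cell interference and unit noise power, both users see an SINR of 1: the stronger user is interfered by the weaker one, while the weaker user only sees noise.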
The computing task of the kth user in the mth group may be defined as
Figure BDA0002558022200000103
where $d_{mk}$ denotes the input data required by the kth user in the mth group to complete the computing task, $c_{mk}$ denotes the number of CPU cycles the kth user in the mth group needs to process $d_{mk}$, and
Figure BDA0002558022200000104
denotes the latest time by which the kth user in the mth group must complete the computing task.
Let $x_{mk}$ denote the offloading variable of the kth user in the mth group. For the local computation model, let
Figure BDA0002558022200000105
denote the local computing capacity of the kth user in the mth group; when the user executes the task locally, the time taken is:
Figure BDA0002558022200000106
The energy consumption of local computation follows a common energy model of the form $\kappa f^{2}$, where $\kappa$ is an energy coefficient that depends on the chip structure, so the local energy consumption of the kth user in the mth group can be expressed as:
Figure BDA0002558022200000111
according to equations (5) and (6), the computation cost of the kth user in the mth group in the local computation can be expressed as:
Figure BDA0002558022200000112
where
Figure BDA0002558022200000113
and
Figure BDA0002558022200000114
weight coefficients representing delay and energy consumption, respectively, and
Figure BDA0002558022200000115
When
Figure BDA0002558022200000116
holds, the user is delay-sensitive and places more weight on computing time; otherwise, the user is low on energy and places more weight on the energy consumption of the computing task.
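The local computation cost can be sketched in Python under stated assumptions: the local time is taken as CPU cycles divided by computing capacity, the local energy as $\kappa f^{2}$ per CPU cycle, and the cost as a weighted sum of the two. The original equations (5) to (7) are not reproduced in the text, so these forms, the function name and the parameter names `w_time` and `w_energy` are illustrative only.

```python
def local_cost(cpu_cycles, f_local, kappa, w_time, w_energy):
    """Weighted local computation cost (illustrative sketch).

    cpu_cycles: CPU cycles the task needs (c_mk in the text).
    f_local:    local computing capacity in cycles per second.
    kappa:      chip-dependent energy coefficient.
    w_time, w_energy: weights for delay and energy (hypothetical names).
    """
    t_local = cpu_cycles / f_local                # time = cycles / speed
    e_local = kappa * f_local ** 2 * cpu_cycles   # kappa * f^2 per cycle (assumed)
    return w_time * t_local + w_energy * e_local
```

A larger `w_time` models the delay-sensitive user described above, and a larger `w_energy` models a user that is short of energy.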
Offloading a computation to the MEC server involves two parts, transmission and execution at the MEC server, whose transmission time and execution time are respectively:
Figure BDA0002558022200000117
Figure BDA0002558022200000118
where $f_{s}$ is the computing capacity of the MEC server. The total time of the offloading process is therefore:
Figure BDA0002558022200000119
Similarly, the energy consumption of the offloading process has two parts, the energy consumed during transmission and the energy consumed executing the computing task at the MEC server, which are respectively:
Figure BDA00025580222000001110
Figure BDA00025580222000001111
According to equations (11) and (12), the total energy consumption of the offloading process can be expressed as:
Figure BDA00025580222000001112
Thus, the cost function of the kth user in the mth group for the offloading process can be expressed as:
Figure BDA0002558022200000121
Further, the cost function of the kth user in the mth group for completing the computing task can be expressed as:
Figure BDA0002558022200000122
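The offloading cost described above can be sketched in Python as follows. Several parts are assumptions, because equations (8) to (15) are not reproduced in the text: the transmission time is taken as data size over data rate, the transmission energy as transmit power times transmission time, the execution energy at the MEC server is passed in as a hypothetical parameter `exec_energy`, and the total cost mixes the local and offloading costs linearly through the offloading variable.

```python
def offload_cost(data_bits, cpu_cycles, rate, tx_power, f_server,
                 exec_energy, w_time, w_energy):
    """Weighted cost of offloading a task to the MEC server (sketch)."""
    t_tx = data_bits / rate            # upload time (assumed form)
    t_exec = cpu_cycles / f_server     # execution time at the MEC server
    e_tx = tx_power * t_tx             # energy spent transmitting (assumed form)
    total_time = t_tx + t_exec         # two-part delay, as in the text
    total_energy = e_tx + exec_energy  # two-part energy, as in the text
    return w_time * total_time + w_energy * total_energy

def total_cost(x, c_local, c_offload):
    # Assumed combination: x in [0, 1] is the offloaded share of the task.
    return (1.0 - x) * c_local + x * c_offload
```

Under this assumed combination, `x = 0` reduces to purely local execution and `x = 1` to full offloading.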
Step four, establishing a cost function:
modeling NOMA-MEC as an MFG framework, wherein SINR and channel gain of a user are expressed as a state space, and transmitting power, an unloading decision factor and a resource allocation factor of the user are expressed as an action space; and constructing a reward function of the user according to the calculation cost of the user.
The fourth step comprises the following specific steps:
When many users compute tasks simultaneously, interference can become very severe. This sharply reduces users' data transmission rates and thereby increases the delay and energy consumption of offloading computing tasks. Since each user is an independent individual that considers only its own interests in the ultra-dense scenario, the invention formulates this model within the MFG theoretical framework.
The state of each user comes only from its own local observations. In the NOMA-MEC system of the ultra-dense network, the SINR and channel gain of the kth user in the mth group form the state space, which is expressed as:
$s_{mk}(t) = \{\tau_{mk}(t),\ h_{mk}(t)\}$ (16),
Each user selects an action $a_{mk}(t)$ from the action space $A$ according to its current state $s_{mk}(t)$. The action of the kth user in the mth group consists of its power, offloading variable and weight coefficients, and the action $a_{mk}(t) \in A$ is expressed as:
amk(t)={pmk(t),xmkmk} (17),
where
Figure BDA0002558022200000123
denotes the weight coefficients of delay and energy consumption.
An object of the invention is to minimize the user's computation cost subject to the maximum delay. From the analysis of the user's computation cost in step three, the user's cost function can be expressed as:
Figure BDA0002558022200000131
therefore, the reward function for the kth user in the mth group can be expressed as:
Figure BDA0002558022200000132
In the mean field game, the Hamilton-Jacobi-Bellman (HJB) equation and the Fokker-Planck-Kolmogorov (FPK) equation describe the entire system model. When the kth user in the mth group selects action $a_{mk}(t)$ in state $s_{mk}(t)$, its FPK equation can be expressed as:
πmk(t+1)=πmk(t)Pmk(pmk,xmkmk) (20),
where $\pi_{mk}(t+1)$ is the state of the kth user in the mth group at time $(t+1)$, and $P_{mk}(\cdot)$ is the probability that the kth user in the mth group transitions from its state at time $t$ to its state at time $(t+1)$, which is mainly determined by the user's actions.
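The update in equation (20) can be illustrated for a finite state space with a short Python sketch. Treating $\pi$ as a probability distribution over discretized states and $P$ as a row-stochastic transition matrix fixed by the chosen action is an assumption made for illustration; the function name `fpk_step` is hypothetical.

```python
def fpk_step(pi, P):
    """One step of pi(t+1) = pi(t) P for a finite state space (sketch).

    pi: list of state probabilities at time t (sums to 1).
    P:  row-stochastic matrix; P[i][j] is the probability of moving
        from state i to state j under the chosen action.
    """
    n_next = len(P[0])
    return [sum(pi[i] * P[i][j] for i in range(len(pi))) for j in range(n_next)]
```

Starting from `pi = [1.0, 0.0]` with `P = [[0.9, 0.1], [0.2, 0.8]]`, one step gives `[0.9, 0.1]`, and applying the step repeatedly propagates the distribution forward in time.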
According to the definition of the reward function, the value function (i.e., the HJB equation) of the state $s_{mk}(t)$ at time $t$ is expressed as:
Figure BDA0002558022200000133
the nash equilibrium solution for the MFG can be solved based on the FPK and HJB equations.
Step five, obtaining the equilibrium solution of the mean field game by a DDPG-based reinforcement learning method.
The fifth step is specifically as follows:
The DDPG algorithm is used to solve for the equilibrium solution of the MFG because it can handle continuous action spaces; the relation between the MFG and reinforcement learning is shown in figure 2. The DDPG algorithm can be applied to resource optimization problems in many communication scenarios.
A schematic diagram of optimizing resource allocation in a NOMA-MEC system with the DDPG algorithm is shown in fig. 3. The DDPG algorithm follows an Actor-Critic framework, so it is described in two parts, the Actor part and the Critic part. Given an input state s, the Actor part outputs a specific action a through the deterministic policy μ by maximizing the action value Q(s, a); given an input state s and a specific action a, the Critic part outputs Q(s, a), updated through the Bellman equation. Thus, the objective function of the DDPG algorithm can be defined as:
Figure BDA0002558022200000141
where $\theta^{\mu}$ is the parameter of the policy network that generates deterministic actions, and $\theta^{\mu}$ is updated by the policy gradient.
The Actor part has two main networks, an online policy network and a target policy network. The deterministic policy $\mu$ directly yields a definite action value at each time step, $a_{t} = \mu(s_{t} \mid \theta^{\mu})$. Like the Actor part, the Critic part also has two networks, an online Q network and a target Q network. The Q function (i.e., the action-value function) defined by the Bellman equation is the expected reward of selecting an action under the deterministic policy, and a Q network is used to fit it, i.e.:
$Q^{\mu}(s_{t}, a_{t}) = \mathbb{E}\left[R + \gamma Q(s_{t+1}, \mu(s_{t+1}))\right]$ (23),
where $Q^{\mu}(s_{t}, a_{t})$ denotes the expected value obtained by selecting action $a_{t}$ in state $s_{t}$ under the deterministic policy $\mu$. To measure the performance of the policy, the performance objective is defined as follows:
Figure BDA0002558022200000142
where $\beta$ denotes the behavior policy and $\rho^{\beta}$ is the probability density function of the state space. The purpose of training is to maximize the performance objective $J_{\beta}$ and to minimize the loss of the Q network. In the Critic part, the mean square error is taken as the loss function, i.e.:
$L(\theta^{Q}) = \mathbb{E}\left[R + \gamma Q'\left(s_{t+1}, \mu'(s_{t+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right) - Q(s_{t}, a_{t} \mid \theta^{Q})\right]$ (25),
Thus, the gradient of the loss function $L$ with respect to $\theta^{Q}$ can be obtained by the standard backpropagation algorithm, i.e.:
Figure BDA0002558022200000151
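The Critic update described above can be sketched in plain Python. The sketch takes the networks as ordinary callables, builds the target value from the target networks as in equation (23), and averages the squared temporal-difference error as a mean square error. The soft target update shown last is the standard DDPG mechanism and is not described in the text, so it is an assumption, as are all function and parameter names.

```python
def td_targets(batch, target_actor, target_critic, gamma):
    """y = r + gamma * Q'(s', mu'(s')) for each (s, a, r, s') transition."""
    return [r + gamma * target_critic(s_next, target_actor(s_next))
            for (_, _, r, s_next) in batch]

def critic_loss(batch, critic, targets):
    """Mean squared temporal-difference error over the batch."""
    errors = [y - critic(s, a) for (s, a, _, _), y in zip(batch, targets)]
    return sum(e * e for e in errors) / len(errors)

def soft_update(target_params, online_params, rate):
    """Move target parameters a small step toward the online ones
    (standard DDPG practice, assumed here)."""
    return [(1.0 - rate) * t + rate * o
            for t, o in zip(target_params, online_params)]
```

In a training loop, the online Q network would be adjusted to reduce `critic_loss`, the online policy network would follow the policy gradient, and `soft_update` would let the two target networks track their online counterparts slowly.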
Example:
The figures provided in the following example and the specific parameter values set in the models serve mainly to explain the basic idea of the invention and to verify it by simulation; they can be adjusted as appropriate to the actual scenario and requirements of a specific application environment.
The example considers a NOMA-MEC system in an ultra-dense network in which 60 small base stations are randomly distributed within a 10 km × 10 km area, the coverage range of each small base station is 20 m, and 64 users are randomly distributed near the small base stations.
To implement the DDPG algorithm, the Actor network and the Critic network each use a fully connected neural network with three hidden layers of 300 neurons. In the Actor network, the output layer uses a Sigmoid activation function so that the final action output lies between 0 and 1. In the Critic network, a ReLU activation function is used for each layer. The learning rates of the Actor network and the Critic network are set to 0.0001 and 0.001, respectively.
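The network shapes described above can be sketched with the Python standard library alone. The layer sizes, the output activations and the learning rates come from the text; the input and output dimensions (a two-element state and a three-element action), the ReLU activations in the Actor's hidden layers, the weight initialization and all names are assumptions. The text does not say whether "each layer" of the Critic includes its output layer; this sketch applies ReLU there as well.

```python
import math
import random

HIDDEN = [300, 300, 300]          # three hidden layers of 300 neurons
ACTOR_LR, CRITIC_LR = 1e-4, 1e-3  # learning rates stated in the text

def init_mlp(sizes, rng):
    # One (weights, biases) pair per layer, small uniform initial weights.
    return [([[rng.uniform(-0.05, 0.05) for _ in range(n_in)]
              for _ in range(n_out)], [0.0] * n_out)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def relu(v):
    return [max(0.0, x) for x in v]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def forward(layers, x, out_act):
    # Hidden layers use ReLU; the last layer uses the given output activation.
    for i, (w, b) in enumerate(layers):
        x = [sum(wi * xi for wi, xi in zip(row, x)) + bi
             for row, bi in zip(w, b)]
        x = out_act(x) if i == len(layers) - 1 else relu(x)
    return x

rng = random.Random(0)
actor = init_mlp([2] + HIDDEN + [3], rng)    # state (SINR, gain) -> 3 action parts
critic = init_mlp([5] + HIDDEN + [1], rng)   # state + action -> Q value

state = [1.5, 0.8]
action = forward(actor, state, sigmoid)        # each component lies in (0, 1)
q_value = forward(critic, state + action, relu)
```

The sketch covers only the forward pass; the two learning rates are recorded as constants and would be used by whichever optimizer trains the networks.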
Fig. 4 and 5 show the effect of maximum transmit power for different algorithms and different multiple access modes. In fig. 4, it can be observed that the energy consumption of the system gradually increases with increasing maximum transmit power. The NOMA scheme can achieve lower energy consumption when the maximum transmission power is fixed. This is because users in a NOMA cluster can simultaneously use the full spectrum resources to transmit information, which can reduce the energy consumption of the system. As can be seen from fig. 5, the calculation delay decreases as the maximum transmission power increases. This is because, when the maximum transmission power is large, the calculation speed and the data transmission rate of the user become large, resulting in a reduction in calculation delay.

Claims (5)

1. A mobile edge computing resource allocation method for an ultra-dense network,
the resource allocation method is based on an ultra-dense network, wherein a NOMA-MEC communication system in the ultra-dense network comprises a set $\mathcal{M} = \{1, 2, \ldots, M\}$ of small base stations (SBSs), each equipped with an MEC server to execute the computation tasks offloaded by users; the set of users served by each small base station is $\mathcal{N} = \{1, 2, \ldots, N\}$, and the $N$ users are divided into $Y$ groups indexed by $\mathcal{Y} = \{1, 2, \ldots, Y\}$, with $K$ users indexed by $\mathcal{K} = \{1, 2, \ldots, K\}$ in each group;
the resource allocation method is implemented according to the following steps:
step one, constructing an uplink NOMA-MEC communication system in which each SBS is equipped with an MEC server to serve multiple users;
step two, clustering all users in the NOMA-MEC communication system according to the differences in their channel gains; users within a cluster use NOMA transmission, and TDMA transmission is used between clusters;
step three, calculating the computation cost of each user, namely the delay and energy consumption incurred when the user processes a task; the computation cost comprises the user's local computation cost and offloading computation cost;
step four, modeling the NOMA-MEC communication system as a mean field game (MFG) framework; the SINR and channel gain of a user form the state space, and the transmit power, offloading decision factor, and resource allocation factor of the user form the action space; a reward function of the user is constructed from the user's computation cost;
step five, obtaining the equilibrium solution of the mean field game, namely the optimal resource allocation scheme in the mobile edge computing system, by using a DDPG-based reinforcement learning method.
2. The mobile edge computing resource allocation method for an ultra-dense network according to claim 1, wherein step two specifically comprises:
in the NOMA-MEC communication system model established in step one, sorting all users served by each SBS according to channel gain, and then selecting, in order, the M users with the largest channel gains as the first users of the M NOMA clusters;
then, according to a greedy matching method, selecting from the other users the user that gives the NOMA cluster the maximum sum of channel gain differences;
when the users cannot be evenly allocated to the clusters, the surplus users are randomly allocated to different clusters, and the channel gains of the users within each cluster are different.
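The clustering steps of this claim can be sketched as follows. This is a minimal illustration, assuming that the "sum of channel gain differences" is the sum of absolute gain differences between a candidate user and the users already in the cluster, and that clusters are filled one user per cluster per round; the function name `greedy_noma_clusters` and its parameters are hypothetical and do not appear in the patent.

```python
import random


def greedy_noma_clusters(gains, num_clusters, rng=None):
    """Group user indices into NOMA clusters by channel-gain difference.

    gains        -- list of channel gains, one per user
    num_clusters -- number of NOMA clusters to form
    rng          -- random.Random used only for the surplus users
    """
    rng = rng or random.Random(0)

    # Sort users by channel gain, strongest first; the strongest users
    # become the first member of each cluster.
    order = sorted(range(len(gains)), key=lambda u: gains[u], reverse=True)
    clusters = [[u] for u in order[:num_clusters]]
    remaining = order[num_clusters:]

    def gain_gap(user, cluster):
        # Assumed metric: sum of absolute gain differences to current members.
        return sum(abs(gains[user] - gains[v]) for v in cluster)

    # Greedy matching: while every cluster can still receive one more user,
    # each cluster picks the remaining user with the largest gain gap.
    while len(remaining) >= num_clusters:
        for cluster in clusters:
            best = max(remaining, key=lambda u: gain_gap(u, cluster))
            cluster.append(best)
            remaining.remove(best)

    # Surplus users that cannot be spread evenly go to distinct random clusters.
    for cluster, user in zip(rng.sample(clusters, len(remaining)), remaining):
        cluster.append(user)
    return clusters
```

With gains `[0.9, 0.1, 0.8, 0.2, 0.5]` and two clusters, users 0 and 2 become the cluster heads, users 1 and 3 are matched to them greedily, and user 4 is placed at random.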
3. The mobile edge computing resource allocation method for an ultra-dense network according to claim 1 or 2, wherein step three specifically comprises:
3.1) local computation cost of the user:
let $x_{mk}$ denote the offloading variable of the kth user in the mth group; in the local computation model, the user completes the computation task locally without offloading it to the MEC server; let
Figure FDA0002558022190000021
denote the local computing capability of the kth user in the mth group; then, when the user executes the task locally, the execution time is:
Figure FDA0002558022190000022
when calculating the energy consumption of local computation, the commonly used energy consumption model $\kappa f^{2}$ is adopted, where $\kappa$ is an energy coefficient that depends on the chip architecture; the local energy consumption of the kth user in the mth group can then be expressed as:
Figure FDA0002558022190000023
according to equations (5) and (6), the local computation cost of the kth user in the mth group can be expressed as:
Figure FDA0002558022190000024
where
Figure FDA0002558022190000025
and
Figure FDA0002558022190000026
denote the weight coefficients of delay and energy consumption, respectively, and
Figure FDA0002558022190000027
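As a rough numerical illustration of the local computation cost in 3.1, the sketch below assumes the model widely used in the MEC literature: the task needs a known number of CPU cycles, the execution time is cycles divided by the local CPU frequency, and $\kappa f^{2}$ is the energy consumed per cycle. The equations themselves are only available as images here, so these forms, the function name `local_cost`, and all parameter names are assumptions rather than the patent's exact expressions.

```python
def local_cost(cycles, f_local, kappa, w_time, w_energy):
    """Weighted local computation cost (assumed model, see lead-in).

    cycles   -- CPU cycles required by the task (hypothetical parameter)
    f_local  -- local CPU frequency in cycles per second
    kappa    -- energy coefficient depending on the chip architecture
    w_time   -- weight coefficient of delay
    w_energy -- weight coefficient of energy consumption
    Returns (cost, delay, energy).
    """
    delay = cycles / f_local                  # time to run the task locally
    energy = kappa * f_local ** 2 * cycles    # kappa * f^2 per cycle, times cycles
    cost = w_time * delay + w_energy * energy
    return cost, delay, energy


# Example: a 1e9-cycle task on a 1 GHz device with kappa = 1e-27
cost, delay, energy = local_cost(1e9, 1e9, 1e-27, 0.5, 0.5)
```

In this example the delay is 1 s and the energy is about 1 J, so equal weights give a cost of about 1.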
3.2) offloading computation cost of the user:
offloading a computation task to the MEC server involves two parts, transmission and computation at the MEC server, whose transmission time and execution time are, respectively:
Figure FDA0002558022190000031
Figure FDA0002558022190000032
where $f_{s}$ is the computing capability of the MEC server;
the total time of the offloading process is:
Figure FDA0002558022190000033
the energy consumption of the offloading process likewise has two parts, namely the energy consumed during transmission and the energy consumed in executing the computation task at the MEC server, which are, respectively:
Figure FDA0002558022190000034
Figure FDA0002558022190000035
according to equations (11) and (12), the total energy consumption of the offloading process is expressed as:
Figure FDA0002558022190000036
thus, the offloading computation cost function of the kth user in the mth group is expressed as:
Figure FDA0002558022190000037
3.3) total computation cost of the user:
from the local computation cost and the offloading computation cost obtained in 3.1 and 3.2, the overall cost function for the user to complete the computation task can be expressed as:
Figure FDA0002558022190000038
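The offloading cost in 3.2 and the overall cost in 3.3 can be illustrated with the sketch below. Since the patent's equations are only available as images, every formula here is an assumption: transmission time is data size divided by uplink rate, server execution time is cycles divided by $f_{s}$, each energy term is a power multiplied by the corresponding time, and the overall cost blends the local and offloading costs through the offloading variable $x$. The names `offload_cost`, `total_cost`, and `p_server` are hypothetical.

```python
def offload_cost(data_bits, cycles, rate, p_tx, f_server, p_server,
                 w_time, w_energy):
    """Weighted offloading cost under the assumed model in the lead-in.

    data_bits -- size of the task input in bits (hypothetical parameter)
    cycles    -- CPU cycles required by the task (hypothetical parameter)
    rate      -- uplink transmission rate in bits per second
    p_tx      -- transmit power of the user in watts
    f_server  -- computing capability of the MEC server in cycles per second
    p_server  -- power drawn while the server executes (hypothetical)
    Returns (cost, total_delay, total_energy).
    """
    t_tx = data_bits / rate          # transmission time
    t_exec = cycles / f_server       # execution time at the MEC server
    e_tx = p_tx * t_tx               # energy spent transmitting
    e_exec = p_server * t_exec       # energy spent executing (assumed form)
    delay = t_tx + t_exec
    energy = e_tx + e_exec
    return w_time * delay + w_energy * energy, delay, energy


def total_cost(x, c_local, c_offload):
    """Assumed overall cost: x = 0 means local execution, x = 1 means offloading."""
    return (1 - x) * c_local + x * c_offload
```

For a 1 Mbit task needing 2e9 cycles, sent at 1 Mbit/s with 0.1 W to a 1 GHz server drawing 0.5 W, the delay is 3 s and the energy 1.1 J, so equal weights give an offloading cost of 2.05.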
4. The mobile edge computing resource allocation method for an ultra-dense network according to claim 1 or 2, wherein step four specifically comprises:
in the NOMA-MEC system of the ultra-dense network, the SINR and channel gain of the kth user in the mth group form the state, and the state space is expressed as:
$s_{mk}(t) = \{\tau_{mk}(t), h_{mk}(t)\}$ (16),
according to its current state $s_{mk}(t)$, each user selects an action $a_{mk}(t)$ from the action space $A$; the action of the kth user in the mth group consists of its transmit power, offloading variable, and weight coefficients, and the action $a_{mk}(t) \in A$ is expressed as:
amk(t)={pmk(t),xmkmk} (17),
where
Figure FDA0002558022190000041
denotes the weight coefficients of delay and energy consumption;
according to the analysis of the user's computation cost in step three, the cost function of the user is expressed as:
Figure FDA0002558022190000042
therefore, the reward function for the kth user in the mth group is expressed as:
Figure FDA0002558022190000043
in the mean field game, the Hamilton-Jacobi-Bellman (HJB) equation and the Fokker-Planck-Kolmogorov (FPK) equation describe the entire system model;
when the kth user in the mth group selects action $a_{mk}(t)$ in state $s_{mk}(t)$, its FPK equation can be expressed as:
πmk(t+1)=πmk(t)Pmk(pmk,xmkmk) (20),
where $\pi_{mk}(t+1)$ is the state of the kth user in the mth group at time $(t+1)$, and $P_{mk}(\cdot)$ in equation (20) is the probability that the kth user in the mth group transitions from its state at time $t$ to its state at time $(t+1)$, which is mainly determined by the user's actions;
according to the definition of the reward function, the value function (i.e., the HJB equation) of state $s_{mk}(t)$ at time $t$ is expressed as:
Figure FDA0002558022190000044
solving a Nash equilibrium solution for the MFG based on the FPK and HJB equations.
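The forward update in equation (20) can be illustrated numerically. The sketch below assumes that $\pi_{mk}(t)$ is a row vector of probabilities over a finite set of discretised states and that the action-dependent transition probabilities are collected in a row-stochastic matrix; this discretisation and the function name `fpk_step` are assumptions made for illustration and are not stated in the patent.

```python
def fpk_step(pi, transition):
    """One forward step pi(t+1) = pi(t) * P for a row-vector distribution.

    pi         -- list of state probabilities at time t
    transition -- matrix P where transition[i][j] is the probability of
                  moving from state i to state j under the chosen action
    """
    num_next = len(transition[0])
    return [sum(pi[i] * transition[i][j] for i in range(len(pi)))
            for j in range(num_next)]


# Two discretised states; the matrix is fixed here, whereas in the patent the
# transition probability depends on the user's action.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi_t = [1.0, 0.0]
pi_t1 = fpk_step(pi_t, P)     # one step forward
pi_t2 = fpk_step(pi_t1, P)    # two steps forward
```

Each step preserves the total probability mass as long as every row of the transition matrix sums to one.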
5. The mobile edge computing resource allocation method for an ultra-dense network according to claim 1 or 2, wherein step five specifically comprises:
solving the equilibrium solution of the MFG with the DDPG algorithm, wherein the objective function of the DDPG algorithm is defined as:
Figure FDA0002558022190000051
where $\theta^{\mu}$ is the parameter of the policy network that generates deterministic actions, and $\theta^{\mu}$ is updated via the policy gradient;
the Actor part contains two networks, an online policy network and a target policy network. The deterministic policy $\mu$ directly yields a definite action value at each time instant, $a_{t} = \mu(s_{t} \mid \theta^{\mu})$. Like the Actor part, the Critic part also has two networks, namely an online Q network and a target Q network. The Q function (i.e., the action-value function) defined by the Bellman equation is the expected reward of selecting an action under the deterministic policy, and a Q network is used to fit the Q function, i.e.:
$Q^{\mu}(s_{t}, a_{t}) = E[R + \gamma Q(s_{t+1}, \mu(s_{t+1}))]$ (23),
where $Q^{\mu}(s_{t}, a_{t})$ denotes the expected value obtained by selecting action $a_{t}$ in state $s_{t}$ under the deterministic policy $\mu$; to measure the performance of the policy, the performance objective is defined as follows:
Figure FDA0002558022190000052
where $\beta$ denotes the behavior policy and $\rho^{\beta}$ is the probability density function of the state space. In the Critic part, the mean square error is taken as the loss function, i.e.:
$L(\theta^{Q}) = E\big[\big(R + \gamma Q'(s_{t+1}, \mu'(s_{t+1} \mid \theta^{\mu'}) \mid \theta^{Q'}) - Q(s_{t}, a_{t} \mid \theta^{Q})\big)^{2}\big]$ (25),
thus, the gradient of the loss function $L$ with respect to $\theta^{Q}$ can be obtained by the standard back-propagation algorithm, i.e.:
Figure FDA0002558022190000053
by updating the gradient in real time, the objective function tends to converge, and finally an optimal strategy is obtained, namely an optimal resource allocation scheme in the mobile edge computing system is obtained.
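The Critic loss of equation (25) can be checked on a small batch with the sketch below. It assumes the Q values have already been produced by the online and target networks, so only the arithmetic of the target $R + \gamma Q'$ and the mean square error is shown; the function names and the batch layout are hypothetical, and no network or gradient step is implemented.

```python
def td_target(reward, gamma, q_next_target):
    """Bootstrapped target R + gamma * Q'(s_{t+1}, mu'(s_{t+1}))."""
    return reward + gamma * q_next_target


def critic_loss(batch, gamma):
    """Mean square error between the target and the online Q estimate.

    batch -- list of (reward, q_next_target, q_online) tuples, where
             q_next_target comes from the target Q network evaluated at the
             target policy's action and q_online from the online Q network
    """
    errors = [td_target(r, gamma, q_next) - q for r, q_next, q in batch]
    return sum(e * e for e in errors) / len(errors)


# Two transitions with discount factor 0.9:
#   target 1 + 0.9 * 2 = 2.8 against Q = 2.3 gives error 0.5
#   target 0 + 0.9 * 1 = 0.9 against Q = 0.9 gives error 0.0
batch = [(1.0, 2.0, 2.3), (0.0, 1.0, 0.9)]
loss = critic_loss(batch, gamma=0.9)
```

The loss for this batch is (0.25 + 0) / 2 = 0.125; in a full DDPG implementation its gradient with respect to the online Q network parameters would drive the Critic update.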
CN202010597779.5A 2020-06-28 2020-06-28 Mobile edge computing resource allocation method for ultra-dense network Active CN111800828B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010597779.5A CN111800828B (en) 2020-06-28 2020-06-28 Mobile edge computing resource allocation method for ultra-dense network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010597779.5A CN111800828B (en) 2020-06-28 2020-06-28 Mobile edge computing resource allocation method for ultra-dense network

Publications (2)

Publication Number Publication Date
CN111800828A true CN111800828A (en) 2020-10-20
CN111800828B CN111800828B (en) 2023-07-18

Family

ID=72803807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010597779.5A Active CN111800828B (en) 2020-06-28 2020-06-28 Mobile edge computing resource allocation method for ultra-dense network

Country Status (1)

Country Link
CN (1) CN111800828B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107819840A (en) * 2017-10-31 2018-03-20 北京邮电大学 Distributed mobile edge calculations discharging method in the super-intensive network architecture
CN109548013A (en) * 2018-12-07 2019-03-29 南京邮电大学 A kind of mobile edge calculations system constituting method of the NOMA with anti-eavesdropping ability
US20190124667A1 (en) * 2017-10-23 2019-04-25 Commissariat A L'energie Atomique Et Aux Energies Alternatives Method for allocating transmission resources using reinforcement learning
CN109951897A (en) * 2019-03-08 2019-06-28 东华大学 A kind of MEC discharging method under energy consumption and deferred constraint
CN110798849A (en) * 2019-10-10 2020-02-14 西北工业大学 Computing resource allocation and task unloading method for ultra-dense network edge computing
CN111245539A (en) * 2020-01-07 2020-06-05 南京邮电大学 NOMA-based efficient resource allocation method for mobile edge computing network

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112468568A (en) * 2020-11-23 2021-03-09 南京信息工程大学滨江学院 Task relay unloading method of mobile edge computing network
CN112468568B (en) * 2020-11-23 2024-04-23 南京信息工程大学滨江学院 Task relay unloading method for mobile edge computing network
CN112492691A (en) * 2020-11-26 2021-03-12 辽宁工程技术大学 Downlink NOMA power distribution method of deep certainty strategy gradient
CN112492691B (en) * 2020-11-26 2024-03-26 辽宁工程技术大学 Downlink NOMA power distribution method of depth deterministic strategy gradient
CN112601256B (en) * 2020-12-07 2022-07-15 广西师范大学 MEC-SBS clustering-based load scheduling method in ultra-dense network
CN112601256A (en) * 2020-12-07 2021-04-02 广西师范大学 MEC-SBS clustering-based load scheduling method in ultra-dense network
CN112654081A (en) * 2020-12-14 2021-04-13 西安邮电大学 User clustering and resource allocation optimization method, system, medium, device and application
CN112738822A (en) * 2020-12-25 2021-04-30 中国石油大学(华东) NOMA-based security offload and resource allocation method in mobile edge computing environment
CN113055854A (en) * 2021-03-16 2021-06-29 西安邮电大学 NOMA-based vehicle edge computing network optimization method, system, medium and application
CN113517920A (en) * 2021-04-20 2021-10-19 东方红卫星移动通信有限公司 Calculation unloading method and system for simulation load of Internet of things in ultra-dense low-orbit constellation
CN113543342A (en) * 2021-07-05 2021-10-22 南京信息工程大学滨江学院 Reinforced learning resource allocation and task unloading method based on NOMA-MEC
CN113543342B (en) * 2021-07-05 2024-03-29 南京信息工程大学滨江学院 NOMA-MEC-based reinforcement learning resource allocation and task unloading method
CN113938997A (en) * 2021-09-30 2022-01-14 中国人民解放军陆军工程大学 Resource allocation method for secure MEC system in NOMA (non-access-oriented multi-media access) Internet of things
CN113938997B (en) * 2021-09-30 2024-04-30 中国人民解放军陆军工程大学 Resource allocation method of secure MEC system in NOMA (non-volatile memory access) Internet of things
CN114827191A (en) * 2022-03-15 2022-07-29 华南理工大学 Dynamic task unloading method for fusing NOMA in vehicle-road cooperative system
CN114827191B (en) * 2022-03-15 2023-11-03 华南理工大学 Dynamic task unloading method for fusing NOMA in vehicle-road cooperative system
CN115022937A (en) * 2022-07-14 2022-09-06 合肥工业大学 Topological feature extraction method and multi-edge cooperative scheduling method considering topological features
CN115022937B (en) * 2022-07-14 2022-11-11 合肥工业大学 Topological feature extraction method and multi-edge cooperative scheduling method considering topological features
CN115460080A (en) * 2022-08-22 2022-12-09 昆明理工大学 Block chain assisted time-varying mean field game edge calculation unloading optimization method
CN115460080B (en) * 2022-08-22 2024-04-05 昆明理工大学 Blockchain-assisted time-varying average field game edge calculation unloading optimization method
CN117857559A (en) * 2024-03-07 2024-04-09 北京邮电大学 Metropolitan area optical network task unloading method based on average field game and edge server

Also Published As

Publication number Publication date
CN111800828B (en) 2023-07-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant