CN116843016A - Federated learning method, system and medium based on reinforcement learning under mobile edge computing network - Google Patents
- Publication number
- CN116843016A (application CN202310580633.3A)
- Authority
- CN
- China
- Prior art keywords
- machine learning
- learning model
- model parameters
- user equipment
- federal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
Abstract
The invention discloses a federated learning method, system, and medium based on reinforcement learning under a mobile edge computing network, comprising the following steps: the edge server downloads a machine learning model to be trained to the user devices through a base station; each user device trains the machine learning model using its local data to obtain machine learning model parameters w_i(k), which are uploaded to the edge server through the base station; according to the local data volume of the devices to be aggregated, the edge server aggregates the machine learning model parameters of all devices to be aggregated to obtain the aggregate value of the machine learning model parameters, which is downloaded through the base station to the user devices joining federated learning. The system comprises an edge server and user devices. The medium stores a computer program. The invention jointly considers the energy consumption of the federated learning process and the loss function value of the task model to optimize the federated aggregation strategy, thereby reducing energy consumption while ensuring the accuracy of the task model.
Description
Technical Field
The invention relates to the technical fields of mobile edge computing, reinforcement learning, and federated learning, and in particular to a federated learning method, system, and medium based on reinforcement learning under a mobile edge computing network.
Background
In recent years, with the continuous emergence of new technologies such as computer vision, natural language processing, and recommendation systems, artificial intelligence has entered a period of vigorous development. However, owing to data islands and the demands of green communication, the traditional approach of gathering all data onto a single device to train an artificial intelligence model there has difficulty handling training data distributed across individual mobile devices.
Mobile edge computing is an emerging technology that processes data locally and offloads computing tasks to the network edge. By deploying a federated learning framework in a mobile edge computing network, data distributed across devices can be trained efficiently in a decentralized manner to obtain a fused model.
Federated learning was proposed to build a distributed machine learning model from multiparty data. Typically, a federated learning system includes at least one parameter server and a number of working devices. Each working device is responsible for updating the model locally, and the parameter server is responsible for aggregating the model. Specifically, each working device trains the model locally and then uploads it to the parameter server, which aggregates the received models with some policy weighting and then sends the aggregated model back to each working device. The content transmitted between each working device and the parameter server contains only model parameters and no raw data, so the model can be trained in a decentralized manner, greatly improving training efficiency while protecting the privacy of all devices.
However, a mobile edge computing network contains many devices with different computing resources, and these devices are subject to large uncertainties such as going offline, powering down, and network congestion. The data volume across devices is unevenly distributed and varies over time, as do the devices' computing power and battery endurance, which leads to slow model convergence and high training energy consumption.
Disclosure of Invention
The invention aims to provide a federated learning method based on reinforcement learning under a mobile edge computing network, which comprises the following steps:
1) Determining the user devices currently joining federated learning;
the edge server downloads a machine learning model to be trained to the user devices through a base station;
2) Each user device trains the machine learning model using its local data to obtain machine learning model parameters w_i(k), which are uploaded to the edge server through the base station;
3) The edge server checks the received machine learning model parameters against locally preset convergence conditions; if any machine learning model parameters do not meet the convergence conditions, go to step 4); if all machine learning model parameters meet the convergence conditions, training of the machine learning model is complete;
4) The edge server selects n_t user devices as the devices to be aggregated;
according to the local data volume of the devices to be aggregated, the edge server aggregates the machine learning model parameters of all devices to be aggregated to obtain the aggregate value of the machine learning model parameters w̄(k_t), which is downloaded through the base station to the user devices joining federated learning;
5) The user devices joining federated learning take the aggregate value w̄(k_t) as the new machine learning model parameters, update the machine learning model, let the iteration count k = k+1, and return to step 2) until a trained machine learning model is obtained.
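The five steps above can be sketched as a minimal training loop. This is a sketch only: it assumes a least-squares task model, synchronous participation by every device, and illustrative constants (learning rate, data sizes) that the patent does not fix.

```python
import numpy as np

def local_update(w, X, y, alpha=0.1):
    """One local gradient step on a least-squares loss (a stand-in task
    model; the patent leaves the concrete task model open)."""
    grad = 2 * X.T @ (X @ w - y) / len(y)   # first-order gradient of F_i
    return w - alpha * grad

def aggregate(params, data_sizes):
    """Weight each device's parameters by its local data volume |D_i|."""
    wts = np.asarray(data_sizes, float)
    wts /= wts.sum()
    return sum(wt * p for wt, p in zip(wts, params))

rng = np.random.default_rng(0)
devices = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]
pooled_mse = lambda w: sum(((X @ w - y) ** 2).mean() for X, y in devices)

w_global = np.zeros(3)
mse_start = pooled_mse(w_global)
for k in range(50):                                              # step 5): iterate
    local = [local_update(w_global, X, y) for X, y in devices]   # step 2)
    w_global = aggregate(local, [len(y) for _, y in devices])    # step 4)
mse_end = pooled_mse(w_global)
```

Over the rounds, the pooled loss of the aggregated model decreases, which is the convergence behavior the stopping check in step 3) tests for.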
Further, the machine learning model parameters w_i(k) are as follows:
w_i(k) = w_i(k−1) − α∇F_i(w_i(k−1))
where w_i(k−1) are the machine learning model parameters updated at the (k−1)-th iteration; ∇F_i(w_i(k−1)) is the first-order gradient at the machine learning model parameters of the (k−1)-th iteration; α is the learning rate.
Further, the machine learning model parameter aggregate value w̄(k_t) is as follows:
w̄(k_t) = [Σ_{i=1}^{N} x_{t,i}|D_i|w_i(k_{t,i})] / [Σ_{i=1}^{N} x_{t,i}|D_i|]
where |D_i| is the local data volume of the i-th user device; w_i(k_{t,i}) are the machine learning model parameters of the i-th user device; x_{t,i} ∈ {0,1} indicates whether device i participates in the t-th round of aggregation; N is the number of user devices.
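Given the participation indicators x_{t,i}, data volumes |D_i|, and device parameters w_i, the data-volume-weighted aggregation is a direct computation; the device values below are illustrative only.

```python
import numpy as np

def federated_aggregate(params, data_sizes, participate):
    """w_bar = sum_i x_i*|D_i|*w_i / sum_i x_i*|D_i|, with x_i in {0,1}."""
    num = sum(x * d * w for x, d, w in zip(participate, data_sizes, params))
    den = sum(x * d for x, d in zip(participate, data_sizes))
    return num / den

w = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([2.0, 2.0])]
w_bar = federated_aggregate(w, data_sizes=[10, 30, 60], participate=[1, 1, 0])
# the third device sits this round out; devices 1 and 2 carry weights 10/40 and 30/40
```

A non-participating device (x_{t,i} = 0) contributes neither to the numerator nor to the normalizing denominator, so the remaining weights still sum to one.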
Further, based on a dynamic asynchronous federated aggregation algorithm, the edge server selects n_t user devices as the devices to be aggregated, in the time order in which their machine learning model parameters are received.
Further, the number of devices to be aggregated n_t is determined by the dynamic asynchronous federated aggregation algorithm.
Further, determining the number of devices to be aggregated n_t comprises the following steps:
s1) Using the edge server as the agent, the agent obtains feedback information from the user devices and thereby establishes the perception state s_t = (t, E_t, H_t, ΔF_t), where t is the number of aggregation rounds and ΔF_t is the global loss function difference between two adjacent aggregations;
here the time E_t required to complete a machine learning model parameter aggregation, the energy H_t required to complete a machine learning model parameter aggregation, and the global loss function value F_t are as follows:
E_t = max_{i: x_{t,i}=1} (t_i^cmp + t_i^up),  H_t = Σ_{i=1}^{N} x_{t,i}(e_i^cmp + e_i^up),  F_t = [Σ_{i=1}^{N} |D_i|F_i] / [Σ_{i=1}^{N} |D_i|]
where F_i is the loss function value of the i-th user device;
the i-th user equipment updates the learning model parameters w i (k) The time requiredEnergy consumption->The following is shown:
wherein, K, C,f i The device chip architecture comprises an effective switch capacitor of a device chip architecture, the number of CPU rounds required by single data training, the data quantity of each batch on the ith user equipment and the frequency of the device CUP.
The time t_i^up and energy e_i^up required by the i-th user device to upload the machine learning model parameters w_i(k) to the edge server are as follows:
t_i^up = s / (b_i log₂(1 + p_i g_i / (N_0 b_i))),  e_i^up = p_i t_i^up
where s, b_i, p_i, g_i, and N_0 are, respectively, the model size, the bandwidth occupied by the i-th user device, the average transmission power of the i-th user device, the channel gain between the i-th user device and the edge server, and the power spectral density of the Gaussian noise.
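A sketch of the computation and communication cost model, using the standard CPU-cycle and Shannon-rate forms implied by the variable list. All numeric values (κ, C, batch size, bandwidth, and so on) are made-up illustrations, not values from the patent.

```python
import math

def local_compute_cost(kappa, C, batch, f):
    """Per-update cost on device i: time t = C*batch/f,
    energy e = kappa*C*batch*f^2 (standard CPU-cycle model)."""
    return C * batch / f, kappa * C * batch * f ** 2

def upload_cost(s, b, p, g, N0):
    """Shannon-rate upload: rate r = b*log2(1 + p*g/(N0*b)),
    time t = s/r, energy e = p*t."""
    rate = b * math.log2(1 + p * g / (N0 * b))
    return s / rate, p * s / rate

# illustrative numbers: 20k cycles/sample, 64-sample batch, 1 GHz CPU,
# 1 Mbit model over 1 MHz bandwidth
t_cmp, e_cmp = local_compute_cost(kappa=1e-28, C=2e4, batch=64, f=1e9)
t_up, e_up = upload_cost(s=1e6, b=1e6, p=0.5, g=1e-6, N0=1e-13)
```

With these numbers the local update takes about 1.3 ms while the upload dominates at a fraction of a second, which is why the aggregation policy weighs communication energy against model accuracy.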
s2) The edge server feeds the perception state s_t into a pre-stored deep neural network as input data to obtain the action a_t with the maximum reward value r_t, and takes the action a_t as the number of devices to be aggregated.
Further, the loss function Loss(θ) of the deep neural network is as follows:
Loss(θ) = E[(y_j − Q(s_j, a_j; θ))²]
where Q(s_j, a; θ) is the value of performing action a; E[·] denotes the expectation;
the target value y_j is as follows:
y_j = r_j + γ max_{a′} Q(s_{j+1}, a′; θ)
where r_j is the reward for performing action a_j; s_{j+1} is the next perception state; γ is the attenuation factor; θ are the deep neural network parameters; a′ ranges over the action space of s_{j+1}.
further, the loss function gradient of the deep neural networkThe following is shown:
in the method, in the process of the invention,to reward gradients.
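A minimal semi-gradient sketch of the loss, target, and gradient above. The linear `TinyQNet` is a stand-in for the patent's deep neural network, and the 4-dimensional state layout (t, E_t, H_t, ΔF_t) is an assumption taken from the perception-state description.

```python
import numpy as np

rng = np.random.default_rng(1)

class TinyQNet:
    """Linear Q-network over the 4-dim perception state; a stand-in
    for the deep neural network in the text."""
    def __init__(self, state_dim=4, n_actions=5):
        self.theta = rng.normal(scale=0.1, size=(n_actions, state_dim))
    def q(self, s):
        return self.theta @ s          # one value per candidate action

def dqn_step(net, s, a, r, s_next, gamma=0.9, lr=0.05):
    """One step on Loss = (y - Q(s,a;theta))^2 with target
    y = r + gamma * max_a' Q(s',a'); semi-gradient update on theta."""
    y = r + gamma * net.q(s_next).max()
    td = y - net.q(s)[a]
    net.theta[a] += lr * td * s        # descend the loss gradient for action a
    return td

net = TinyQNet()
s = np.array([1.0, 0.5, 0.3, 0.1])
td0 = dqn_step(net, s, a=2, r=1.0, s_next=s)
td1 = dqn_step(net, s, a=2, r=1.0, s_next=s)
```

Repeating the update on the same transition shrinks the temporal-difference error, which is the convergence behavior the loss above is minimizing.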
A system applying the reinforcement-learning-based federated learning method under the mobile edge computing network of any one of claims 1-8, the system being configured to complete training of a machine learning model so as to obtain a machine learning model meeting preset requirements;
the system comprises an edge server and a plurality of user devices;
when in work, the edge server downloads a machine learning model to be trained to user equipment through a base station;
each user equipment trains the machine learning model by using the local data to obtain machine learning model parameters w i (k) Uploading the data to an edge server through a base station;
the edge server judges the received machine learning model parameters and local preset convergence conditions, and if all the machine learning model parameters meet the convergence conditions, the machine learning model training is completed;
if there are machine learning model parameters which do not meet the convergence condition, then n is selected t The user equipment is used as equipment to be aggregated, and machine learning model parameters of all the equipment to be aggregated are aggregated to obtain an aggregate value of the machine learning model parametersDownloading the information to user equipment added with federal learning through a base station;
the user device aggregates machine learning model parameters into valuesAs new machine learning model parameters, the machine learning model is updated and the machine learning model is continuously trained by using the local data.
A computer-readable storage medium having a computer program stored thereon;
the steps of the above method are performed when the computer program is called.
The technical effects of the invention are as follows. The invention provides a federated learning method based on reinforcement learning under a mobile edge computing network, with the following beneficial effects:
The dynamics and uncertainty of the network are considered when optimizing the federated aggregation policy, so that the system can run normally and stably in most network environments.
Furthermore, the invention jointly considers the energy consumption of the federated learning process and the loss function value of the task model to optimize the federated aggregation strategy, thereby reducing energy consumption while ensuring the accuracy of the task model.
Furthermore, the federated aggregation strategy used by the invention is based on a reinforcement learning algorithm, can meet the requirements of different networks and users, and continues to optimize the algorithm's network during use, so that the system achieves a better effect.
Drawings
FIG. 1 is a system model diagram;
FIG. 2 is a block diagram of reinforcement learning;
FIG. 3 is a federal learning flow chart based on reinforcement learning;
FIG. 4 is a flow chart of a reinforcement learning algorithm.
Detailed Description
The present invention is further described below with reference to examples, but this should not be construed as limiting the scope of the above subject matter of the invention to the following examples. Various substitutions and alterations made according to ordinary technical knowledge and customary means of the art, without departing from the technical spirit of the invention, are all intended to be included within the scope of the invention.
Example 1:
referring to fig. 1 to 4, a federated learning method based on reinforcement learning under a mobile edge computing network includes the following steps:
1) Determining the user devices currently joining federated learning;
the edge server downloads a machine learning model to be trained to the user devices through a base station;
2) Each user device trains the machine learning model using its local data to obtain machine learning model parameters w_i(k), which are uploaded to the edge server through the base station;
3) The edge server checks the received machine learning model parameters against locally preset convergence conditions; if any machine learning model parameters do not meet the convergence conditions, go to step 4); if all machine learning model parameters meet the convergence conditions, training of the machine learning model is complete;
4) The edge server selects n_t user devices as the devices to be aggregated;
according to the local data volume of the devices to be aggregated, the edge server aggregates the machine learning model parameters of all devices to be aggregated to obtain the aggregate value of the machine learning model parameters w̄(k_t), which is downloaded through the base station to the user devices joining federated learning;
5) The user devices joining federated learning take the aggregate value w̄(k_t) as the new machine learning model parameters, update the machine learning model, let the iteration count k = k+1, and return to step 2) until a trained machine learning model is obtained.
Example 2:
referring to fig. 1 to 4, a federated learning method based on reinforcement learning in a mobile edge computing network has the same technical content as embodiment 1; further, the machine learning model parameters w_i(k) are as follows:
w_i(k) = w_i(k−1) − α∇F_i(w_i(k−1))
where w_i(k−1) are the machine learning model parameters updated at the (k−1)-th iteration; ∇F_i(w_i(k−1)) is the first-order gradient at the machine learning model parameters of the (k−1)-th iteration; α is the learning rate.
Example 3:
referring to fig. 1 to 4, a federated learning method based on reinforcement learning in a mobile edge computing network has the same technical content as any one of embodiments 1-2; further, the machine learning model parameter aggregate value w̄(k_t) is as follows:
w̄(k_t) = [Σ_{i=1}^{N} x_{t,i}|D_i|w_i(k_{t,i})] / [Σ_{i=1}^{N} x_{t,i}|D_i|]
where |D_i| is the local data volume of the i-th user device; w_i(k_{t,i}) are the machine learning model parameters of the i-th user device; x_{t,i} ∈ {0,1} indicates whether device i participates in the t-th round of aggregation; N is the number of user devices.
Example 4:
referring to fig. 1 to fig. 4, a federated learning method based on reinforcement learning in a mobile edge computing network has the technical content of any one of embodiments 1-3; further, based on a dynamic asynchronous federated aggregation algorithm, the edge server selects n_t user devices as the devices to be aggregated, in the time order in which their machine learning model parameters are received.
Example 5:
referring to fig. 1 to fig. 4, a federated learning method based on reinforcement learning in a mobile edge computing network has the technical content of any one of embodiments 1-4; further, the number of devices to be aggregated n_t is determined by the dynamic asynchronous federated aggregation algorithm.
Example 6:
referring to fig. 1 to 4, a federated learning method based on reinforcement learning in a mobile edge computing network has the technical content of any one of embodiments 1-5; further, determining the number of devices to be aggregated n_t comprises the following steps:
s1) Using the edge server as the agent, the agent obtains feedback information from the user devices and thereby establishes the perception state s_t = (t, E_t, H_t, ΔF_t), where t is the number of aggregation rounds and ΔF_t is the global loss function difference between two adjacent aggregations;
here the time E_t required to complete a machine learning model parameter aggregation, the energy H_t required to complete a machine learning model parameter aggregation, and the global loss function value F_t are as follows:
E_t = max_{i: x_{t,i}=1} (t_i^cmp + t_i^up),  H_t = Σ_{i=1}^{N} x_{t,i}(e_i^cmp + e_i^up),  F_t = [Σ_{i=1}^{N} |D_i|F_i] / [Σ_{i=1}^{N} |D_i|]
where F_i is the loss function value of the i-th user device;
the time t_i^cmp and energy e_i^cmp required by the i-th user device to update the learning model parameters w_i(k) are as follows:
t_i^cmp = C|D̂_i| / f_i,  e_i^cmp = κ C |D̂_i| f_i²
where κ, C, |D̂_i|, and f_i are, respectively, the effective switched capacitance of the device chip architecture, the number of CPU cycles required to train a single data sample, the data volume of each batch on the i-th user device, and the device CPU frequency.
The time t_i^up and energy e_i^up required by the i-th user device to upload the machine learning model parameters w_i(k) to the edge server are as follows:
t_i^up = s / (b_i log₂(1 + p_i g_i / (N_0 b_i))),  e_i^up = p_i t_i^up
where s, b_i, p_i, g_i, and N_0 are, respectively, the model size, the bandwidth occupied by the i-th user device, the average transmission power of the i-th user device, the channel gain between the i-th user device and the edge server, and the power spectral density of the Gaussian noise.
s2) The edge server feeds the perception state s_t into a pre-stored deep neural network as input data to obtain the action a_t with the maximum reward value r_t, and takes the action a_t as the number of devices to be aggregated.
Example 7:
referring to fig. 1 to fig. 4, a federated learning method based on reinforcement learning under a mobile edge computing network has the technical content of any one of embodiments 1-6; further, the loss function Loss(θ) of the deep neural network is as follows:
Loss(θ) = E[(y_j − Q(s_j, a_j; θ))²]
where Q(s_j, a; θ) is the value of performing action a; E[·] denotes the expectation;
the target value y_j is as follows:
y_j = r_j + γ max_{a′} Q(s_{j+1}, a′; θ)
where r_j is the reward for performing action a_j; s_{j+1} is the next perception state; γ is the attenuation factor; θ are the deep neural network parameters; a′ ranges over the action space of s_{j+1}.
example 8:
referring to fig. 1 to 4, a federated learning method based on reinforcement learning under a mobile edge computing network has the technical content of any one of embodiments 1-7; further, the loss function gradient ∇_θ Loss(θ) of the deep neural network is as follows:
∇_θ Loss(θ) = E[(y_j − Q(s_j, a_j; θ)) ∇_θ Q(s_j, a_j; θ)]
where ∇_θ Q(s_j, a_j; θ) is the gradient of the action value with respect to the network parameters.
Example 9:
a system applying the reinforcement-learning-based federated learning method under the mobile edge computing network of any one of embodiments 1-8, the system being configured to complete training of a machine learning model so as to obtain a machine learning model meeting preset requirements;
the system comprises an edge server and a plurality of user devices;
in operation, the edge server downloads a machine learning model to be trained to the user devices through a base station;
each user device trains the machine learning model using its local data to obtain machine learning model parameters w_i(k), which are uploaded to the edge server through the base station;
the edge server checks the received machine learning model parameters against locally preset convergence conditions, and if all machine learning model parameters meet the convergence conditions, training of the machine learning model is complete;
if any machine learning model parameters do not meet the convergence conditions, n_t user devices are selected as the devices to be aggregated, and the machine learning model parameters of all devices to be aggregated are aggregated to obtain the aggregate value of the machine learning model parameters w̄(k_t), which is downloaded through the base station to the user devices joining federated learning;
the user devices take the aggregate value w̄(k_t) as the new machine learning model parameters, update the machine learning model, and continue training the machine learning model with their local data.
Example 10:
a computer-readable storage medium having a computer program stored thereon;
the computer program, when invoked, performs the steps of the method of any one of embodiments 1-8.
Example 11:
a federated learning method based on reinforcement learning under a mobile edge computing network mainly comprises the following steps:
1) At the current time t, federated learning starts, and the N devices within the signal range of the edge base station that are to join federated learning are read from the network.
2) Each device joining federated learning trains locally and updates the model parameters w_i(k); the specific update rule is:
w_i(k) = w_i(k−1) − α∇F_i(w_i(k−1))
The time required to update the learning model parameters w_i(k) can be calculated from the CPU cycles:
t_i^cmp = C|D̂_i| / f_i
The energy consumed by each device is calculated likewise:
e_i^cmp = κ C |D̂_i| f_i²
The updated parameters are then uploaded to the edge server through the base station, and the time and energy consumed by the upload are calculated from the information transmission model:
t_i^up = s / (b_i log₂(1 + p_i g_i / (N_0 b_i))),  e_i^up = p_i t_i^up
3) According to the dynamic asynchronous federated aggregation algorithm, the parameters uploaded by the first n_t devices, in the order in which each device's model parameters are received, are selected; the edge server weights and sums these model parameters according to the data volume |D_i| of the corresponding devices:
w̄(k_t) = [Σ_{i=1}^{N} x_{t,i}|D_i|w_i(k_{t,i})] / [Σ_{i=1}^{N} x_{t,i}|D_i|]
The edge server then sends the updated model parameters to each device joining federated learning, and at the same time obtains the global loss function value:
F_t = [Σ_{i=1}^{N} |D_i|F_i] / [Σ_{i=1}^{N} |D_i|]
Meanwhile, the time and energy required by each round of federated aggregation can be calculated from the specific devices participating in the aggregation:
E_t = max_{i: x_{t,i}=1} (t_i^cmp + t_i^up),  H_t = Σ_{i=1}^{N} x_{t,i}(e_i^cmp + e_i^up)
4) During model aggregation, the value n_t is determined by a DQN trained with the reinforcement learning algorithm.
4.1) The edge server serves as the agent, and the mobile edge computing network where the devices are located serves as the environment. From the messages fed back by the devices, the agent perceives the state s_t = (t, E_t, H_t, ΔF_t), comprising the number of aggregation rounds, the energy and time consumed, and the loss function value of the model. It outputs the value of each action in the corresponding state, i.e., of each possible number of devices participating in this round of federated aggregation, and selects the action a_t with the maximum value to execute, obtaining the reward r_t. The actual value of executing a_t in state s_t is Q(s_t, a_t; θ).
4.2) A deep neural network is used to form a policy π that, when the current state is input, outputs the action with the greatest value. Upon choosing to execute this action, the agent receives the reward r_t, defined so that lower energy consumption and a lower loss function value yield a higher reward.
The energy consumption of federated learning is reduced by maximizing the reward.
4.3) Through the policy π, the agent randomly selects actions in the corresponding states and obtains the returned rewards. After this round of aggregation is completed, the next round of aggregation begins, and this step is repeated.
4.4) After the agent has collected sufficient experience, its policy network is trained:
Loss(θ) = E[(y_j − Q(s_j, a_j; θ))²]
where the target value is updated through the value function:
y_j = r_j + γ max_{a′} Q(s_{j+1}, a′; θ)
The agent then updates the network parameters following the gradient descent algorithm:
θ ← θ − η ∇_θ Loss(θ)
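The exploration in step 4.3) is commonly realized with an ε-greedy policy over the predicted Q-values. The patent does not name its exploration scheme, so the following is a conventional sketch under that assumption.

```python
import random

def epsilon_greedy(q_values, eps, rng):
    """With probability eps take a random action (explore); otherwise
    take the argmax of the predicted Q-values (exploit)."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])

rng = random.Random(0)
greedy = epsilon_greedy([0.1, 0.9, 0.4], eps=0.0, rng=rng)   # pure exploitation
explored = epsilon_greedy([0.1, 0.9, 0.4], eps=1.0, rng=rng) # pure exploration
```

Annealing ε toward zero as experience accumulates shifts the agent from the random selection of 4.3) to the trained greedy policy used in 5.1).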
5) The federated aggregation policy is dynamically updated according to the reinforcement learning algorithm, and federated aggregation is performed with this policy.
5.1) At the edge server, when devices upload aggregation requests, the agent predicts the value function with the network updated by the above training, selects the number of devices participating in aggregation, and performs federated aggregation.
5.2) After the action is executed, the current federated learning environment is updated.
5.3) The edge server broadcasts the aggregated parameters to each device participating in federated learning.
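Steps 5.1) to 5.3) can be sketched as a serving routine: the trained network's Q-values pick n_t, and the first n_t uploads to arrive are aggregated by data volume. The action-to-count mapping (index 0 meaning one device) and all numbers are assumptions.

```python
import numpy as np

def serve_aggregation_round(q_values, arrivals, data_sizes):
    """Greedy action index -> device count n_t, take the first n_t
    uploads to arrive (asynchronous selection), and return their
    data-volume aggregation weights."""
    n_t = int(np.argmax(q_values)) + 1           # assumed index-to-count map
    chosen = arrivals[:n_t]                      # first n_t devices to upload
    wts = np.array([data_sizes[i] for i in chosen], float)
    return n_t, chosen, wts / wts.sum()

q = [0.2, 0.8, 0.5]                              # Q-values for n_t in {1, 2, 3}
n_t, chosen, wts = serve_aggregation_round(q, arrivals=[2, 0, 1],
                                           data_sizes=[5, 10, 15])
```

Here the network favors aggregating two devices, so the two earliest uploaders (devices 2 and 0) are combined with weights proportional to their data volumes.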
Claims (10)
1. A federated learning method based on reinforcement learning under a mobile edge computing network, characterized by comprising the following steps:
1) Determining the user devices currently joining federated learning;
The edge server downloads a machine learning model to be trained to user equipment through a base station;
2) Each user device trains the machine learning model using its local data to obtain machine learning model parameters w_i(k), which are uploaded to the edge server through the base station;
3) The edge server checks the received machine learning model parameters against locally preset convergence conditions; if any machine learning model parameters do not meet the convergence conditions, go to step 4); if all machine learning model parameters meet the convergence conditions, training of the machine learning model is complete;
4) The edge server selects n_t user devices as the devices to be aggregated;
according to the local data volume of the devices to be aggregated, the edge server aggregates the machine learning model parameters of all devices to be aggregated to obtain the aggregate value of the machine learning model parameters w̄(k_t), which is downloaded through the base station to the user devices joining federated learning;
5) The user devices joining federated learning take the aggregate value w̄(k_t) as the new machine learning model parameters, update the machine learning model, let the iteration count k = k+1, and return to step 2) until a trained machine learning model is obtained.
2. The reinforcement-learning-based federated learning method under a mobile edge computing network of claim 1, wherein the machine learning model parameters w_i(k) are as follows:
w_i(k) = w_i(k−1) − α∇F_i(w_i(k−1))
where w_i(k−1) are the machine learning model parameters updated at the (k−1)-th iteration; ∇F_i(w_i(k−1)) is the first-order gradient at the machine learning model parameters of the (k−1)-th iteration; α is the learning rate.
3. The reinforcement-learning-based federated learning method under a mobile edge computing network of claim 1, wherein the machine learning model parameter aggregate value w̄(k_t) is as follows:
w̄(k_t) = [Σ_{i=1}^{N} x_{t,i}|D_i|w_i(k_{t,i})] / [Σ_{i=1}^{N} x_{t,i}|D_i|]
where |D_i| is the local data volume of the i-th user device; w_i(k_{t,i}) are the machine learning model parameters of the i-th user device; x_{t,i} ∈ {0,1} indicates whether device i participates in the t-th round of aggregation; N is the number of user devices.
4. The reinforcement-learning-based federated learning method under a mobile edge computing network of claim 1, wherein, based on a dynamic asynchronous federated aggregation algorithm, the edge server selects n_t user devices as the devices to be aggregated, in the time order in which their machine learning model parameters are received.
5. The reinforcement-learning-based federated learning method under a mobile edge computing network of claim 1, wherein the number of devices to be aggregated n_t is determined by a dynamic asynchronous federated aggregation algorithm.
6. The reinforcement-learning-based federated learning method under a mobile edge computing network of claim 1, wherein determining the number of devices to be aggregated n_t comprises the following steps:
s 1) using the edge server as an agent, the agent obtaining feedback information from the user equipment, thereby establishing a perception statet is the number of polymerization rounds; ΔF (delta F) t Global loss function difference values for two adjacent aggregations; />Is an energy aggregate value;
wherein the time E_t required to complete one round of machine learning model parameter aggregation, the energy H_t required to complete that aggregation, and the global loss function value F_t are given by:

E_t = max over devices with x_{t,i}=1 of (t_i^{cp} + t_i^{cm}),  H_t = Σ_{i=1..N} x_{t,i}·(e_i^{cp} + e_i^{cm}),  F_t = Σ_{i=1..N} |D_i|·F_i(w_i(k)) / Σ_{i=1..N} |D_i|

where t_i^{cp} and t_i^{cm} are the local update and upload times of device i, e_i^{cp} and e_i^{cm} are the corresponding energy consumptions, and F_i(w_i(k)) is the loss function value corresponding to the i-th user equipment;
The time t_i^{cp} required by the i-th user equipment to update the learning model parameters w_i(k) and the corresponding energy consumption e_i^{cp} are given by:

t_i^{cp} = C·B_i / f_i,  e_i^{cp} = κ·C·B_i·f_i²

where κ, C, B_i, and f_i are respectively the effective switched capacitance of the device chip architecture, the number of CPU cycles required to train a single data sample, the batch data size on the i-th user equipment, and the CPU frequency of the device.
The time t_i^{cm} required by the i-th user equipment to upload the machine learning model parameters w_i(k) to the edge server and the corresponding energy consumption e_i^{cm} are given by:

t_i^{cm} = s / (b_i·log2(1 + p_i·g_i / (N_0·b_i))),  e_i^{cm} = p_i·t_i^{cm}

where s, b_i, p_i, g_i, and N_0 are respectively the model size, the bandwidth occupied by the i-th user equipment, the average transmission power of the i-th user equipment, the channel gain between the i-th user equipment and the edge server, and the power spectral density of the Gaussian noise.
s2) The edge server feeds the perception state s_t as input into a pre-stored deep neural network to obtain the action a_t with the maximum reward value r_t, and takes action a_t as the number of devices to be aggregated.
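Step s2 amounts to a greedy argmax over the Q-network's outputs. A toy sketch, in which the stand-in network, its output values, and the state vector are all illustrative assumptions:

```python
import numpy as np

def select_n_t(state, q_network):
    """Feed the perception state s_t through a Q-network and pick the action
    (candidate device count) whose predicted value is largest."""
    q_values = q_network(np.asarray(state, dtype=float))
    return int(np.argmax(q_values)) + 1  # index 0 corresponds to n_t = 1

# Toy stand-in for the pre-stored network: favours the middle action.
toy_q = lambda s: np.array([0.1, 0.7, 0.3])
n_t = select_n_t([3, 0.05, 1.2], toy_q)  # -> 2
```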
7. The reinforcement-learning-based federated learning method under a mobile edge computing network of claim 6, wherein the loss function Loss(θ) of the deep neural network is given by:

Loss(θ) = E[(y_j − Q(s_j, a; θ))²]

where Q(s_j, a; θ) is the value of performing action a; E[·] denotes expectation;
and the target value y_j is given by:

y_j = r_j + γ·max_{a'∈A'} Q(s_{j+1}, a'; θ)

where r_j is the reward for performing action a_j; s_{j+1} is the next perception state; γ is the attenuation (discount) factor; θ are the deep neural network parameters; A' is the action space of s_{j+1}.
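The target and loss of claim 7 can be sketched directly; this single-sample form (function names and numbers are illustrative assumptions) omits the replay buffer and target network that a full DQN would add:

```python
import numpy as np

def dqn_target(r_j, q_next, gamma):
    """y_j = r_j + gamma * max_{a'} Q(s_{j+1}, a'; theta)."""
    return r_j + gamma * float(np.max(q_next))

def td_loss(q_sa, y_j):
    """Single-sample squared TD error (y_j - Q(s_j, a_j; theta))^2."""
    return (y_j - q_sa) ** 2

y = dqn_target(r_j=1.0, q_next=np.array([0.5, 2.0]), gamma=0.5)  # 1 + 0.5*2 = 2.0
loss = td_loss(q_sa=1.5, y_j=y)                                  # 0.25
```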
8. The reinforcement-learning-based federated learning method under a mobile edge computing network of claim 6, wherein the gradient of the loss function of the deep neural network, ∇_θ Loss(θ), is given by:

∇_θ Loss(θ) = E[(y_j − Q(s_j, a; θ))·∇_θ Q(s_j, a; θ)]

where ∇_θ Q(s_j, a; θ) is the gradient of Q(s_j, a; θ), the value of performing action a.
9. A system applying the reinforcement-learning-based federated learning method under a mobile edge computing network according to any one of claims 1 to 8, wherein the system is used to complete training of a machine learning model and obtain a machine learning model meeting preset requirements;
the system comprises an edge server and a plurality of user devices;
in operation, the edge server downloads the machine learning model to be trained to the user equipment through a base station;
each user equipment trains the machine learning model using its local data to obtain machine learning model parameters w_i(k), which it uploads to the edge server through the base station;
the edge server checks the received machine learning model parameters against a locally preset convergence condition; if all machine learning model parameters satisfy the convergence condition, training of the machine learning model is complete;
if there are machine learning model parameters that do not satisfy the convergence condition, the edge server selects n_t user equipments as the devices to be aggregated, aggregates the machine learning model parameters of all the devices to be aggregated to obtain the aggregate value of the machine learning model parameters, and downloads it through the base station to the user equipment joining federated learning;
the user equipment takes the aggregate value of the machine learning model parameters as its new machine learning model parameters, updates the machine learning model, and continues training the machine learning model using its local data.
10. A computer-readable storage medium, characterized in that a computer program is stored thereon;
when the computer program is invoked, the steps of the method of any one of claims 1-8 are performed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310580633.3A CN116843016A (en) | 2023-05-22 | 2023-05-22 | Federal learning method, system and medium based on reinforcement learning under mobile edge computing network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116843016A true CN116843016A (en) | 2023-10-03 |
Family
ID=88164265
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310580633.3A Pending CN116843016A (en) | 2023-05-22 | 2023-05-22 | Federal learning method, system and medium based on reinforcement learning under mobile edge computing network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116843016A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117938957A (en) * | 2024-03-22 | 2024-04-26 | 精为技术(天津)有限公司 | Edge cache optimization method based on federal deep learning |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||