CN115796271A - Federated learning method based on client selection and gradient compression - Google Patents

Federated learning method based on client selection and gradient compression

Info

Publication number
CN115796271A
Authority
CN
China
Prior art keywords
client
target
compression
compression ratio
training
Prior art date
Legal status
Pending
Application number
CN202211412335.5A
Other languages
Chinese (zh)
Inventor
许杨
姜志达
徐宏力
Current Assignee
Suzhou Institute Of Higher Studies University Of Science And Technology Of China
Original Assignee
Suzhou Institute Of Higher Studies University Of Science And Technology Of China
Priority date
Filing date
Publication date
Application filed by Suzhou Institute Of Higher Studies University Of Science And Technology Of China
Priority to CN202211412335.5A
Publication of CN115796271A
Current legal status: Pending

Landscapes

  • Computer And Data Communications (AREA)

Abstract

The invention provides a federated learning method based on client selection and gradient compression, which comprises the following steps in each training round: the parameter server selects target clients for the current round of training, determines a corresponding compression ratio for each target client, and sends the global model and the corresponding compression ratios to the target clients; each target client trains the global model on its local data set, computes the corresponding model update, and sparsifies the original model update according to its assigned compression ratio; each target client then sends the compressed model update to the parameter server, which aggregates the updates, refreshes the global model, and starts the next training round. Through the joint optimization of client selection and compression-ratio decisions, the method effectively accelerates federated learning over heterogeneous clients, balances resource overhead against training performance, and improves the efficiency of local data processing.

Description

Federated learning method based on client selection and gradient compression
Technical Field
The invention belongs to the field of distributed machine learning, and in particular relates to a federated learning method based on client selection and gradient compression.
Background
In recent years, the Internet of Things and mobile devices have generated massive amounts of data at the network edge, and these data have great potential for training machine learning models and developing intelligent applications. However, transmitting edge data to a centralized entity for model training may cause network congestion and compromise user privacy. Federated learning is a new distributed learning paradigm in which multiple clients cooperate with a parameter server to train a model without exposing their local data. Federated learning can therefore effectively protect data privacy while fully utilizing the computing resources of edge devices.
Despite its many advantages, federated learning faces several challenges in actual deployment. (1) Limited communication resources: clients participating in federated learning must iteratively communicate with the parameter server over bandwidth-limited networks, and the resulting overhead limits the utility of federated learning. (2) Dynamic network conditions: the communication conditions of wireless channels fluctuate over time due to link instability and bandwidth contention. (3) Heterogeneous client properties: client heterogeneity generally comprises capability heterogeneity and data heterogeneity. On the one hand, clients differ greatly in computing and communication capability owing to hardware limitations and scattered geographic locations; on the other hand, owing to user preferences and local environments, the local data on different clients follow different distributions, and such heterogeneous statistics introduce bias into the training process and ultimately degrade model accuracy.
To reduce communication overhead, existing schemes use model/gradient compression techniques to shrink the transmitted data, but they typically assign a fixed or identical compression ratio to every client, ignoring the clients' heterogeneous and dynamically varying capabilities. In addition, considering resource overhead and client availability, the parameter server usually selects a subset of clients rather than all clients to participate in federated learning; however, existing client-selection schemes cannot simultaneously address network dynamics and client heterogeneity, which hinders efficient federated learning.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a federated learning method based on client selection and gradient compression, which jointly optimizes gradient compression and client selection to address the key challenges of limited resources, network dynamics, and client heterogeneity, thereby realizing efficient federated learning and improving the efficiency of data processing.
The invention provides a federated learning method based on client selection and gradient compression. Each training round comprises the following steps:
s1, a parameter server selects target clients for the current round of training, determines corresponding compression ratios for the target clients, and sends a global model and the corresponding compression ratios to the target clients;
s2, the target client trains a global model on a local data set, updates model parameters corresponding to the global model, and sparsizes original model update parameters according to a compression ratio corresponding to the target client;
and S3, the target client sends the compressed model update parameters to the parameter server, which aggregates them, updates the global model, and starts the next training round.
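Steps S1-S3 can be sketched as a single round loop. This is a minimal illustration, not the patented implementation: `select_fn`, `ratio_fn`, and `compress_fn` are hypothetical stand-ins for the client-selection, compression-ratio, and sparsification components, and `local_train` is assumed to return a gradient-like update vector.

```python
import numpy as np

def training_round(global_model, clients, select_fn, ratio_fn, compress_fn):
    """One federated training round following steps S1-S3."""
    # S1: the parameter server picks target clients and a per-client compression ratio
    targets = select_fn(clients)
    ratios = {c: ratio_fn(c) for c in targets}
    # S2: each target client trains locally, then sparsifies its model update
    updates = [compress_fn(c.local_train(global_model), ratios[c]) for c in targets]
    # S3: the server aggregates the compressed updates into the global model
    return global_model + np.mean(updates, axis=0)
```

The concrete choices for the three callbacks are what the optional embodiments below describe.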
Optionally, the parameter server selects target clients performing the current round of training and determines a corresponding compression ratio for each target client, including:
in each iteration, the parameter server first selects clients and determines compression ratios for them, then keeps, between the current selection and the selection of the previous iteration, the one with the smaller compression error as the currently selected target clients;
removing the client with the smallest compression ratio among the currently selected target clients and entering the next iteration;
and after a certain number of iterations, obtaining the final target client and the compression ratio corresponding to each target client.
Optionally, the selecting, by the parameter server in S1, a target client for the current round of training includes:
taking the difference between the aggregated model update of the clients selected by the parameter server and the aggregated model update of all clients as the approximation error;
and converting the approximation-error minimization problem into a submodular maximization problem, then using a greedy algorithm to add the client with the largest marginal gain to the selected set until the resource limit is reached, so as to obtain the target clients.
Optionally, the determining, in S1, a corresponding compression ratio for each target client includes:
minimizing the compression error under the time-resource constraint, and obtaining the optimal compression ratio for each target client with a linear programming solver.
Optionally, sparsifying the original model update according to the compression ratio corresponding to the target client includes:
according to the corresponding compression ratio, using a compression algorithm to retain the gradient elements whose absolute values exceed a threshold, and setting the remaining gradient elements to 0.
The method is based on a federated learning scenario and mainly aims to address the challenges of limited resources, network dynamics, and client heterogeneity through the joint optimization of client selection and gradient compression, thereby accelerating the training process. The method differs from prior approaches mainly as follows: client selection accounts for data heterogeneity and encourages gradient diversity; compression ratios are decided adaptively according to heterogeneous and dynamic client capabilities; and the two decisions are jointly optimized to balance resource overhead and training performance.
Compared with the scheme in the prior art, the invention has the advantages that:
1. The method introduces diversity into client selection and, under resource limits, selects clients with representative gradient information to participate in training, promoting fairness and reducing the bias brought by non-IID data.
2. The adaptively decided compression ratios take the dynamic and heterogeneous capabilities of clients into account, so that each client transmits a compressed gradient suited to its own capability, preventing weaker clients from becoming the bottleneck of model training.
3. The method considers the coupling between client selection and gradient compression and jointly optimizes them to fully address the challenges of limited resources, network dynamics, and client heterogeneity.
The invention discloses a heterogeneity-aware federated learning method based on adaptive client selection and gradient compression, which uses client selection to reduce the bias introduced by non-independent and identically distributed (non-IID) data, uses gradient compression to reduce communication overhead, assigns each client a compression ratio matched to its heterogeneous and dynamic capability, and narrows the differences in completion time between clients. Considering data heterogeneity, the method selects the most representative subset of clients to participate in training; after local training, each selected client adaptively compresses its model update according to its own capability and uploads it to the parameter server for aggregation, thereby accelerating model convergence, realizing efficient federated learning, and improving data-processing efficiency.
Drawings
Fig. 1 is a flowchart of a federated learning method based on client selection and gradient compression according to an embodiment of the present invention.
Fig. 2 is a diagram of the client selection and gradient compression effects provided by an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some structures related to the present invention are shown in the drawings, not all of them.
Examples
Fig. 1 is a flowchart of a federated learning method based on client selection and gradient compression according to an embodiment of the present invention. The federated learning process comprises a plurality of training rounds, each of which includes the following steps:
s1, a parameter server selects target clients for the training of the current round, determines corresponding compression ratios for the target clients, and sends a global model and the corresponding compression ratios to the target clients.
Because the data distribution of each client in federated learning differs, some clients provide similar and redundant gradient information that does not reflect the true global data distribution; selecting such clients wastes resources and biases the global model toward particular clients. For this reason, the present embodiment selects a subset of clients with diverse gradient information such that their aggregation effect is similar to that of all clients, reducing the negative impact of non-IID data while promoting fairness.
Specifically, the client-selection process introduces diversity and is solved via submodular maximization. The approximation error is defined as the difference between the aggregated model update of the selected clients and the aggregated model update of all clients; the approximation-error minimization problem is then converted into a submodular maximization problem, and a greedy algorithm repeatedly adds the client with the largest marginal gain to the selected set until the resource limit is reached. This submodular-maximization approach selects clients with representative gradient information, so that the aggregated model update of the target clients approximates the aggregated model update of all clients. By encouraging gradient diversity, redundant communication is reduced and under-represented clients gain influence, compensating for the bias introduced by non-IID data.
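As one way to make the greedy step concrete, the sketch below uses a facility-location objective over pairwise gradient cosine similarities as a plausible submodular surrogate for "the selected clients' aggregated update resembles that of all clients". The similarity measure and the fixed client budget are illustrative assumptions, not details taken from the patent:

```python
import numpy as np

def marginal_gain(sim, selected, candidate):
    """Marginal gain of adding `candidate` under the facility-location
    objective G(S) = sum_i max_{j in S} sim[i, j]."""
    if not selected:
        return sim[:, candidate].sum()
    covered = sim[:, selected].max(axis=1)
    return np.maximum(covered, sim[:, candidate]).sum() - covered.sum()

def greedy_client_selection(gradients, budget):
    """Greedily add the client with the largest marginal gain until `budget`
    clients are chosen (the budget stands in for the resource limit)."""
    G = np.stack(gradients)
    unit = G / np.linalg.norm(G, axis=1, keepdims=True)
    sim = unit @ unit.T                        # pairwise cosine similarity
    selected = []
    while len(selected) < budget:
        remaining = [c for c in range(len(gradients)) if c not in selected]
        gains = [marginal_gain(sim, selected, c) for c in remaining]
        selected.append(remaining[int(np.argmax(gains))])
    return selected
```

Two clients with nearly identical gradients contribute almost no marginal gain to each other, so the greedy step naturally prefers diverse gradient information.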
Furthermore, the compression ratio of each target client strongly affects the balance between resource overhead and training performance: a large compression ratio retains most gradient information but leaves the communication overhead high, while a small compression ratio effectively reduces the communicated data volume but can degrade model accuracy. To strike this balance, the compression-ratio decision minimizes the compression error under a time-resource constraint, and a linear programming solver can be used to obtain the optimal compression ratio for each client. Different compression ratios are assigned to the selected clients according to their dynamic and heterogeneous capabilities: clients with stronger capability compress their gradients lightly, while weaker clients compress heavily, so that all clients achieve approximately the same completion time and the synchronization-barrier problem is alleviated.
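The patent leaves the compression-ratio decision to a generic linear programming solver. Under a single shared completion-time constraint and a linear error proxy (both illustrative assumptions), the LP reduces to a fractional knapsack that can be solved greedily, as sketched here:

```python
def decide_compression_ratios(sizes, bandwidths, weights, deadline, r_min=0.01):
    """Choose ratios r_i in [r_min, 1] to minimize sum_i weights[i] * (1 - r_i)
    (a linear proxy for compression error) subject to
    sum_i sizes[i] * r_i / bandwidths[i] <= deadline."""
    n = len(sizes)
    ratios = [r_min] * n
    budget = deadline - sum(sizes[i] * r_min / bandwidths[i] for i in range(n))
    # raise ratios first where a unit of upload time buys the most error reduction
    order = sorted(range(n), key=lambda i: weights[i] * bandwidths[i] / sizes[i],
                   reverse=True)
    for i in order:
        if budget <= 0:
            break
        time_per_unit = sizes[i] / bandwidths[i]   # upload time per unit of ratio
        room = min(1.0 - r_min, budget / time_per_unit)
        ratios[i] = r_min + room
        budget -= room * time_per_unit
    return ratios
```

A client with high bandwidth ends up with a large ratio (light compression), while a slow client is compressed heavily, equalizing completion times as described above.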
Further, considering the tight coupling between client selection and compression-ratio decisions, this embodiment fixes one decision while optimizing the other, solving the client-selection problem and the compression-ratio problem iteratively.
Specifically, the joint optimization of client selection and compression-ratio decisions is an alternating iteration that fixes one decision while optimizing the other. First, clients are selected via submodular maximization, and a linear program is solved to determine compression ratios for the selected clients; if the current strategy reduces the compression error, it replaces the previous strategy. Then the client with the smallest compression ratio is removed from the candidate set, preventing over-compressed clients from being selected in the next iteration. After M iterations (where M is the number of clients to select), the final client selection and compression-ratio strategy is obtained; see fig. 2, which illustrates the client selection and gradient compression effects provided by an embodiment of the present invention. In this way, the embodiment balances resource overhead and training performance through the joint optimization of client selection and compression-ratio decisions.
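The alternating procedure just described can be outlined as follows, with `select_fn`, `ratio_fn`, and `error_fn` as hypothetical placeholders for the submodular selection, linear-programming, and compression-error components:

```python
def joint_optimize(clients, select_fn, ratio_fn, error_fn, num_iters):
    """Alternate client selection and compression-ratio decisions, keeping the
    best strategy seen so far and dropping the most heavily compressed client
    each iteration so over-compressed clients leave the candidate pool."""
    candidates = list(clients)
    best = None                                  # (selected, ratios, error)
    for _ in range(num_iters):
        if not candidates:
            break
        selected = select_fn(candidates)
        ratios = ratio_fn(selected)              # dict: client -> ratio
        err = error_fn(selected, ratios)
        if best is None or err < best[2]:
            best = (selected, ratios, err)       # keep the better strategy
        candidates.remove(min(ratios, key=ratios.get))
    return best
```

With `num_iters` set to the number of clients to select, this mirrors the M-iteration loop of the embodiment.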
S2, the target client trains the global model on its local data set, updates the model parameters of the global model, and sparsifies the original model update parameters according to the compression ratio corresponding to the target client.
Specifically, using a compression method, only the gradient elements with the largest absolute values are retained according to the corresponding compression ratio, and the remaining gradient elements are set to 0, yielding the compressed model update.
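A minimal Top-k sparsifier in this spirit (the embodiment leaves the exact compressor open) might look like:

```python
import numpy as np

def top_k_sparsify(grad, ratio):
    """Keep the fraction `ratio` of gradient entries with the largest absolute
    value and zero out the rest (ties at the threshold may keep slightly more
    than k entries)."""
    k = max(1, int(round(ratio * grad.size)))
    threshold = np.partition(np.abs(grad).ravel(), -k)[-k]   # k-th largest |g|
    return np.where(np.abs(grad) >= threshold, grad, 0.0)
```

In practice only the surviving values and their indices would be transmitted, which is where the communication savings come from.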
Illustratively, embodiments of the present invention may use a Top-k compression method, a Random-k compression method, a quantization method, or the like.
Furthermore, an error-compensation mechanism can be used during model compression to further improve compression performance: it accumulates the error caused by uploading only the compressed gradient, ensuring that every gradient element eventually has an opportunity to be aggregated.
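A sketch of such an error-compensation (error-feedback) mechanism, wrapping a simple Top-k step assumed here purely for illustration:

```python
import numpy as np

class ErrorFeedbackCompressor:
    """Accumulate the residual dropped by sparsification and add it back
    before the next compression, so every gradient element eventually gets
    a chance to be uploaded and aggregated."""
    def __init__(self):
        self.residual = None

    def compress(self, grad, ratio):
        if self.residual is None:
            self.residual = np.zeros_like(grad)
        corrected = grad + self.residual             # re-inject past error
        k = max(1, int(round(ratio * corrected.size)))
        threshold = np.partition(np.abs(corrected).ravel(), -k)[-k]
        sent = np.where(np.abs(corrected) >= threshold, corrected, 0.0)
        self.residual = corrected - sent             # remember what was dropped
        return sent
```

Because the residual keeps growing until an element crosses the threshold, even consistently small gradient entries are transmitted eventually.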
For example, the local data trained in this embodiment may be image segmentation data, image recognition data, and the like, and the efficiency of data processing may be greatly improved by selecting a target client and determining a corresponding compression ratio for the target client during the training process.
And S3, the target client sends the compressed model update parameters to the parameter server, which aggregates the compressed gradients, updates the global model, and starts the next training round.
According to the technical scheme of this embodiment, client selection reduces the bias caused by non-independent and identically distributed (non-IID) data, gradient compression reduces communication overhead, each client is assigned a compression ratio matched to its heterogeneous and dynamic capability, and the differences in completion time between clients are narrowed. Considering data heterogeneity, the method selects the most representative subset of clients to participate in training; after local training, each selected client adaptively compresses its model update according to its own capability and uploads it to the parameter server for aggregation, thereby accelerating model convergence, realizing efficient federated learning, and improving data-processing efficiency.
The above examples are provided only to illustrate the technical concepts and features of the present invention; their purpose is to enable those skilled in the art to understand and implement the invention, not to limit its scope. All equivalent changes and modifications made according to the spirit of the present invention shall fall within its protection scope.

Claims (5)

1. A federated learning method based on client selection and gradient compression, characterized in that each training round comprises the following steps:
s1, a parameter server selects target clients for the current round of training, determines corresponding compression ratios for the target clients, and sends a global model and the corresponding compression ratios to the target clients;
s2, the target client trains a global model on a local data set, updates model parameters corresponding to the global model, and sparsifies original model update parameters according to a compression ratio corresponding to the target client;
and S3, the target client sends the compressed model updating parameters to a parameter server for the parameter server to aggregate and update the global model, and the next training round is started.
2. The method of claim 1, wherein the parameter server selects target clients for the current round of training and determines a corresponding compression ratio for each of the target clients, comprising:
in each iteration, the parameter server first selects clients and determines compression ratios for them, then keeps, between the current selection and the selection of the previous iteration, the one with the smaller compression error as the currently selected target clients;
removing the client with the smallest compression ratio among the currently selected target clients and entering the next iteration;
and after a certain number of iterations, obtaining the final target client and the compression ratio corresponding to each target client.
3. The method of claim 2, wherein the parameter server in S1 selects a target client for the current round of training, comprising:
taking the difference between the aggregated model update of the clients selected by the parameter server and the aggregated model update of all clients as the approximation error;
and converting the approximation-error minimization problem into a submodular maximization problem, using a greedy algorithm to add the client with the largest marginal gain to the selected set until the resource limit is reached, so as to obtain the target clients.
4. The method of claim 2, wherein determining a corresponding compression ratio for each of the target clients in S1 comprises:
minimizing the compression error under the time-resource constraint, and obtaining the optimal compression ratio for each target client with a linear programming solver.
5. The method of claim 1, wherein sparsifying the original model update according to the compression ratio corresponding to the target client comprises:
according to the corresponding compression ratio, using a compression algorithm to retain the gradient elements whose absolute values exceed a threshold, and setting the remaining gradient elements to 0.
CN202211412335.5A 2022-11-11 2022-11-11 Federated learning method based on client selection and gradient compression Pending CN115796271A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211412335.5A CN115796271A (en) 2022-11-11 2022-11-11 Federated learning method based on client selection and gradient compression


Publications (1)

Publication Number Publication Date
CN115796271A (en) 2023-03-14

Family

ID=85437009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211412335.5A Pending CN115796271A (en) 2022-11-11 2022-11-11 Federated learning method based on client selection and gradient compression

Country Status (1)

Country Link
CN (1) CN115796271A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117216596A (en) * 2023-08-16 2023-12-12 中国人民解放军总医院 Federal learning optimization communication method, system and storage medium based on gradient clustering
CN117216596B (en) * 2023-08-16 2024-04-30 中国人民解放军总医院 Federal learning optimization communication method, system and storage medium based on gradient clustering
CN117196014A (en) * 2023-09-18 2023-12-08 深圳大学 Model training method and device based on federal learning, computer equipment and medium
CN117196014B (en) * 2023-09-18 2024-05-10 深圳大学 Model training method and device based on federal learning, computer equipment and medium
CN117349672A (en) * 2023-10-31 2024-01-05 深圳大学 Model training method, device and equipment based on differential privacy federal learning
CN117557870A (en) * 2024-01-08 2024-02-13 之江实验室 Classification model training method and system based on federal learning client selection
CN117557870B (en) * 2024-01-08 2024-04-23 之江实验室 Classification model training method and system based on federal learning client selection


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination