CN116205311A - Federated learning method based on Shapley value

Federated learning method based on Shapley value

Info

Publication number
CN116205311A
Authority
CN
China
Prior art keywords
model parameters
clients
client
federated learning
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310124072.6A
Other languages
Chinese (zh)
Inventor
朱亚萍
赵生捷
Current Assignee
Tongji University
Original Assignee
Tongji University
Priority date
Filing date
Publication date
Application filed by Tongji University
Priority to CN202310124072.6A
Publication of CN116205311A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention provides a federated learning method based on Shapley values. The method takes into account the differences in data distribution across clients in federated learning: when the global model parameters are computed, the clients' local model parameters are weighted and aggregated according to each local model's contribution to the global training objective. After each training iteration, a weighted graph is constructed from the cosine similarities between the clients' local model parameters, and the Shapley value of each client vertex in the graph is calculated. Based on these Shapley values, the server assigns a weight coefficient to each client's model parameters and aggregates them accordingly to obtain the global model parameters for the next round, until the training objective is reached.

Description

Federated learning method based on Shapley value
Technical Field
The invention belongs to the field of machine learning, and in particular relates to a federated machine learning method.
Background
With the large-scale growth of intelligent terminals and Internet-of-Things devices, processing massive amounts of data has become an essential technology in the era of digital transformation, and for large-scale application scenarios federated machine learning has become one of the key techniques. As a novel distributed learning method, federated learning alleviates the heavy computational load a single server would incur when processing large-scale data by distributing training across multiple clients that share the cost. At the same time, because the clients model jointly without sharing their original data, privacy is protected and data security is guaranteed.
Federated learning trains local models simultaneously on multiple clients and aggregates the local models from different clients into a global model. However, the data in federated learning is typically distributed unevenly across clients and is usually non-IID (not independent and identically distributed), which gives rise to a data heterogeneity problem. If the local models of different clients are aggregated indiscriminately, this heterogeneity degrades the overall training effect. A well-designed and effective method is therefore needed to cope with data heterogeneity in federated learning and improve the overall training result.
Disclosure of Invention
Technical problem: in federated learning, the data on the clients participating in training is typically non-IID, and the effect of each client's locally trained model on the overall training objective often differs across clients and across iterations. Therefore, when the local model parameters from different clients are aggregated, they must be treated differently: the differences among the local models trained by the different clients should be fully taken into account, so as to mitigate the adverse effect of data heterogeneity on the overall federated learning result.
Technical scheme: to solve the above technical problem, the invention provides a federated learning method based on Shapley values. When the local model parameters of the clients are aggregated in federated learning, a weight coefficient is set based on each client's Shapley value, and the local model parameters are then weighted and aggregated according to these coefficients to obtain the global model parameters.
Further, a weighted graph is constructed at the end of each training iteration of federated learning: the vertices of the graph are all clients participating in that round, every pair of clients is connected by an edge, and the weight of an edge is the cosine similarity between the local model parameters of the two clients it connects.
The Shapley value of each client is calculated from the constructed graph. For a vertex (i.e. client) i in the graph, its Shapley value (denoted φ_i^t) is computed as

φ_i^t = Σ_{s∈S_i} [(|s|−1)!(n−|s|)!/n!] · Σ_{j∈s, j≠i} e_ij

where S_i denotes the set of all vertex subsets that contain vertex i, |s| is the number of elements in subset s, n is the number of vertices in the graph, j ranges over the vertices of s other than i, and e_ij is the weight of the edge connecting vertices i and j.
The global model parameters after each iteration are the weighted sum of the local model parameters of all clients participating in the current round, where the weighting coefficient of client i's local model parameters is a function of φ_i^t. Specifically,

w^t = Σ_{i∈L_t} f(φ_i^t) · w_i^t

where w^t denotes the global model parameters after round t, L_t is the set of all clients participating in round t, f(φ_i^t) is a function of the Shapley value φ_i^t, and w_i^t is the local model parameter vector obtained by client i in round t.
Beneficial effects: the method fully considers the possible data differences between clients in federated learning. When the local model parameters are aggregated, the contribution of each client's local model to the overall training objective in each iteration is measured by its Shapley value, and different weighting coefficients are set according to these contributions, thereby reducing the adverse effect of data heterogeneity on federated learning.
Drawings
Fig. 1 is a flow chart of the federated learning method based on Shapley values according to the present invention.
Detailed Description
A federated learning method based on Shapley values: when the local model parameters of the clients are aggregated in federated learning, a weight coefficient is set based on each client's Shapley value, and the local model parameters are then weighted and aggregated according to these coefficients to obtain the global model parameters.
The design of the scheme of the invention is described in further detail below with reference to fig. 1 and the related formulas.
When federated learning starts, the central server randomly selects the clients that will participate in the next round of iterative training and sends the initial global model parameters to all selected clients. Each client then trains locally, starting from the global model parameters, to obtain a new round of local model parameters.
Assume that in the t-th round of federated training, n clients participate, denoted by the set L_t. Client i (i ∈ L_t) obtains the local model parameters w_i^t in this round of training.
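This local training step can be illustrated with a minimal Python sketch. It is not part of the patent: the `local_update` name, the linear model, and the squared loss are assumptions chosen purely for illustration. Each selected client starts from the broadcast global parameters and runs a few epochs of gradient descent on its own data.

```python
import numpy as np

def local_update(global_w: np.ndarray, X: np.ndarray, y: np.ndarray,
                 lr: float = 0.1, epochs: int = 5) -> np.ndarray:
    """Return client-local parameters w_i^t, trained from the broadcast global model."""
    w = global_w.copy()
    for _ in range(epochs):
        # Gradient of the mean squared error (1/m) * ||Xw - y||^2 on this client's data
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w
```

After this step, each client holds its own w_i^t, which is what gets uploaded to the server in the next paragraph.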
After the round of training ends, each client uploads its locally trained model parameters to the server, and the server constructs a graph from them: the vertices of the graph are the clients participating in this round, every pair of clients is connected by an edge, and the weight of an edge is the cosine similarity between the local model parameters of the two clients it connects. Specifically, the weight e_ij of the edge formed between vertices i and j is

e_ij = (w_i^t · w_j^t) / (‖w_i^t‖ ‖w_j^t‖)

where the numerator is the dot product of the vectors w_i^t and w_j^t, and ‖w_i^t‖ and ‖w_j^t‖ are their norms.
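The graph construction above can be sketched in Python as follows. The function names and the representation of each client's parameters as a single flattened vector are illustrative assumptions, not from the patent.

```python
import numpy as np

def cosine_similarity(w_i: np.ndarray, w_j: np.ndarray) -> float:
    """Dot product of the two parameter vectors divided by the product of their norms."""
    return float(w_i @ w_j / (np.linalg.norm(w_i) * np.linalg.norm(w_j)))

def build_weighted_graph(params: list) -> np.ndarray:
    """Return the n x n symmetric edge-weight matrix E with E[i, j] = e_ij."""
    n = len(params)
    edges = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            edges[i, j] = edges[j, i] = cosine_similarity(params[i], params[j])
    return edges  # diagonal stays zero: a vertex has no edge to itself
```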
The server then calculates the Shapley value of each vertex in the constructed graph. Specifically, for a vertex (i.e. client) i in the graph, its Shapley value (denoted φ_i^t) is computed as

φ_i^t = Σ_{s∈S_i} [(|s|−1)!(n−|s|)!/n!] · Σ_{j∈s, j≠i} e_ij

where S_i denotes the set of all vertex subsets that contain vertex i, |s| is the number of elements in subset s, n is the number of vertices in the graph, j ranges over the vertices of s other than i, and e_ij is the weight of the edge connecting vertices i and j.
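An exact-enumeration sketch of this per-vertex Shapley computation is given below. It assumes, as the formula suggests, that a vertex's marginal contribution to a coalition is the total weight of its edges into that coalition; the `shapley_values` name and the matrix representation are illustrative.

```python
from itertools import combinations
from math import factorial

def shapley_values(edges):
    """Shapley value of every vertex, given an n x n symmetric edge-weight matrix."""
    n = len(edges)
    phi = [0.0] * n
    for i in range(n):
        others = [v for v in range(n) if v != i]
        # Enumerate every subset s containing i (represented here by s \ {i})
        for size in range(len(others) + 1):
            for subset in combinations(others, size):
                s = size + 1                                 # |s|, counting vertex i itself
                coef = factorial(s - 1) * factorial(n - s) / factorial(n)
                marginal = sum(edges[i][j] for j in subset)  # i's edges into the subset
                phi[i] += coef * marginal
    return phi
```

Note that for this particular marginal (sum of edge weights into the coalition), the enumeration collapses algebraically to half the vertex's weighted degree, φ_i = (1/2) Σ_j e_ij, which gives an O(n²) shortcut when the number of clients is large.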
After the Shapley value of every client vertex has been calculated, the server computes a weighted sum of the local model parameters of all clients participating in this round to obtain the new global model parameters:

w^t = Σ_{i∈L_t} f(φ_i^t) · w_i^t

where w^t denotes the global model parameters after round t, and f(φ_i^t) is a function of φ_i^t.
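A sketch of this aggregation step follows. The patent states only that each coefficient is a function of the client's Shapley value; the shift-and-normalize choice below is an illustrative assumption (made because cosine-similarity-based Shapley values can be negative), not the patent's exact formula.

```python
import numpy as np

def aggregate(local_params, shapley):
    """Weighted sum of local parameter vectors, coefficients derived from Shapley values."""
    s = np.asarray(shapley, dtype=float)
    s = s - s.min()  # cosine similarities lie in [-1, 1], so Shapley values may be negative
    if s.sum() == 0.0:
        weights = np.full(len(s), 1.0 / len(s))  # degenerate case: plain averaging
    else:
        weights = s / s.sum()
    return sum(w * p for w, p in zip(weights, local_params))
```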
After each iteration, the server sends the aggregated global model parameters to the clients selected to participate in the next round, and those clients begin a new round of training from these parameters, until the overall convergence target is reached.
The above description covers only preferred embodiments of the present invention; the scope of the invention is not limited to these embodiments, and all equivalent modifications or variations made according to the present disclosure fall within the scope of the claims.

Claims (5)

1. A federated learning method based on Shapley values, characterized in that, when the local model parameters of the clients are aggregated in federated learning, a weight coefficient is set based on each client's Shapley value, and the local model parameters are then weighted and aggregated according to these coefficients to obtain the global model parameters.
2. The Shapley-value-based federated learning method according to claim 1, characterized in that a weighted graph is constructed at the end of each training iteration of federated learning, the vertices of the graph are all clients participating in that round, every pair of clients is connected by an edge, and the weight of an edge is the cosine similarity between the local model parameters of the two clients it connects.
3. The Shapley-value-based federated learning method according to claim 1, characterized in that the weight e_ij of the edge formed between vertices i and j is

e_ij = (w_i^t · w_j^t) / (‖w_i^t‖ ‖w_j^t‖)

where the numerator is the dot product of the local parameter vectors w_i^t and w_j^t, and ‖w_i^t‖ and ‖w_j^t‖ are their norms.
4. The Shapley-value-based federated learning method according to claim 2, characterized in that the Shapley value of each client is calculated from the constructed graph; for a vertex (i.e. client) i in the graph, its Shapley value (denoted φ_i^t) is computed as

φ_i^t = Σ_{s∈S_i} [(|s|−1)!(n−|s|)!/n!] · Σ_{j∈s, j≠i} e_ij

where S_i denotes the set of all vertex subsets that contain vertex i, |s| is the number of elements in subset s, n is the number of vertices in the graph, j ranges over the vertices of s other than i, and e_ij is the weight of the edge connecting vertices i and j.
5. The Shapley-value-based federated learning method according to claim 1, characterized in that the global model parameters after each iteration are the weighted sum of the local model parameters of all clients participating in that round, where the weighting coefficient of client i's local model parameters is a function of its Shapley value φ_i^t; specifically,

w^t = Σ_{i∈L_t} f(φ_i^t) · w_i^t

where w^t denotes the global model parameters after round t, L_t is the set of all clients participating in round t, f(φ_i^t) is a function of φ_i^t, and w_i^t is the local model parameters obtained by client i in round t.
CN202310124072.6A 2023-02-16 2023-02-16 Federated learning method based on Shapley value Pending CN116205311A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310124072.6A CN116205311A (en) Federated learning method based on Shapley value


Publications (1)

Publication Number Publication Date
CN116205311A true CN116205311A (en) 2023-06-02

Family

ID=86508965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310124072.6A Pending CN116205311A (en) Federated learning method based on Shapley value

Country Status (1)

Country Link
CN (1) CN116205311A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116522988A * 2023-07-03 2023-08-01 粤港澳大湾区数字经济研究院(福田) Federated learning method, system, terminal and medium based on graph structure learning
CN116522988B * 2023-07-03 2023-10-31 粤港澳大湾区数字经济研究院(福田) Federated learning method, system, terminal and medium based on graph structure learning
CN117057442A * 2023-10-09 2023-11-14 之江实验室 Model training method, device and equipment based on federated multi-task learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination