CN115907029B - Method and system for defending against federated learning poisoning attacks - Google Patents

Method and system for defending against federated learning poisoning attacks

Info

Publication number
CN115907029B
CN115907029B (application CN202211391958A)
Authority
CN
China
Prior art keywords
model
update
global
round
participant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211391958.9A
Other languages
Chinese (zh)
Other versions
CN115907029A (en)
Inventor
王伟
刘敬楷
吕晓婷
李超
段莉
金一
***
刘吉强
刘鹏睿
许向蕊
曹鸿博
李珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Original Assignee
Beijing Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University
Priority to CN202211391958.9A
Publication of CN115907029A
Application granted
Publication of CN115907029B
Legal status: Active


Abstract

The invention provides a defense method and system against federated learning poisoning attacks, belonging to the technical field of network security. At the beginning of each round of federated training, the global model is transmitted to each participant; in the first round, the global model is initialized. The participants perform a specified number of local training epochs based on their local data and the received global model and submit parameter updates, from which the server aggregates a new global model. The method computes the deviation of each layer of a participant's model update as well as the deviation of the overall update, takes the number of deviations exceeding a threshold as an anomaly score, and selects the updates of the participants with the minimum anomaly score for aggregation. This achieves finer-grained screening than distance measures over all parameters alone; because the number of selected updates depends on the degree of parameter anomaly, the convergence speed and accuracy of the model are preserved while both targeted and untargeted poisoning attacks are effectively countered.

Description

Method and system for defending against federated learning poisoning attacks
Technical Field
The invention relates to the technical field of network security, and in particular to a model-structure-based defense method and system against federated learning poisoning attacks.
Background
Federated learning (Federated Learning) is a machine learning framework that enables multiple participants to train local models on their private data and to complete the training of a shared global model by exchanging local model updates in each round.
With the advent of the General Data Protection Regulation (GDPR), the California Consumer Privacy Act, and personal information protection legislation, enterprises and individuals are increasingly concerned about data privacy and security, and the phenomenon of data silos has emerged. This has drawn attention to federated learning as a way to address the data-silo problem. Moreover, because they benefit from more data resources, federated models tend to outperform purely local machine learning models.
While protecting participants' private data, federated learning also carries potential security risks. Malicious participants may compromise the global model by poisoning their private data or their local model. In a typical federated learning framework, a central server aggregates the local models uploaded by participants into a global model, so current research mainly defends against poisoning attacks in the central server's aggregation step.
Distance-based defenses over all parameters: these methods perform distance detection on the parameters of the local models submitted by participants and filter out parameters that lie far from the rest before aggregating the global model. However, detection based on all parameters easily overlooks poisoning attacks that make only subtle alterations, and such attacks are highly threatening.
Coordinate-wise aggregation defenses: these methods compare participants in every parameter dimension and select the median value in each dimension. However, they rely on the median; once the median is compromised, the global model is vulnerable.
Differential privacy (Differential Privacy) based defenses: these methods blunt poisoned models by adding noise to the global model, but the added noise degrades model accuracy. In particular, they defend poorly against untargeted poisoning attacks.
Similarity-based defenses over all parameters: these methods screen and aggregate the global model based on the similarity of the parameters submitted by clients. However, such defenses are narrowly targeted: when model parameters with high similarity are selected for aggregation, they are susceptible to poisoning by several colluding malicious participants; when multiple sets of parameters with low similarity are selected, they are susceptible to poisoning by a single malicious participant.
Root-dataset-based defenses: these methods require the central server to maintain a local model trained on a clean dataset (the root dataset) to guide aggregation of the global model. However, they place high demands on the root dataset, and the accuracy of the global model suffers in the presence of non-independent and identically distributed (Non-IID) data.
Disclosure of Invention
The invention aims to provide a model-structure-based defense method and system against federated learning poisoning attacks that can effectively resist highly covert poisoning attacks as well as a wide range of targeted and untargeted poisoning attacks, so as to solve at least one of the technical problems in the background art.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
In one aspect, the invention provides a defense method against federated learning poisoning attacks, comprising the following steps:
transmitting the global model to each participant at the beginning of each round of federated training; initializing the global model in the first round of federated training;
aggregating a new global model from the received parameter updates; the participants perform a specified number of local training epochs based on their local data and the received global model, and submit updated model parameters.
Preferably, in each round of federated training, the parameter-updated models submitted by the participants are segmented based on the machine learning model structure; the deviation of each layer of each participant's local model update from the other participants' updates of that layer is calculated, and at the same time the deviation of each participant's overall model update from the other participants' overall updates; an anomaly score is then computed for each participant and the global model of the round is aggregated.
Preferably, segmenting the parameter-updated models submitted by the participants based on the machine learning model structure comprises: according to the machine learning model used, storing each participant's model update in segments by layer of the model structure:

$U_i^t = \{u_i^{t,1}, u_i^{t,2}, \dots, u_i^{t,K}\}$

where $u_i^{t,k}$ denotes the local update parameters of the $k$-th layer submitted by participant $i$ in the $t$-th round of federated training.
Preferably, calculating the deviation of each layer of each participant's local model update from the other participants' updates, and simultaneously calculating the deviation of each participant's overall model update from the other participants' overall updates, comprises:

computing, by way of the Euclidean distance, the sum of the Euclidean distances between each participant's layer-$k$ model update and the layer-$k$ updates of the remaining participants:

$d_i^{t,k} = \sum_{j \neq i} \left\| u_i^{t,k} - u_j^{t,k} \right\|_2$

normalizing the Euclidean distance sums of the same layer across different participants by max-min normalization, yielding the relative distance of each participant's layer-$k$ update as the deviation of that participant's layer-$k$ model update:

$D_i^{t,k} = \dfrac{d_i^{t,k} - \min_j d_j^{t,k}}{\max_j d_j^{t,k} - \min_j d_j^{t,k}}$

where $\min_j d_j^{t,k}$ denotes the minimum, over participants, of the Euclidean distance sums of the $k$-th layer updates in the $t$-th round of global training, and $\max_j d_j^{t,k}$ the corresponding maximum.

Similarly, the deviation $D_i^{t,\mathrm{all}}$ of each participant's overall model update from the other participants' overall updates is calculated.

The per-layer update deviations and the overall update deviation of each participant are stored in a deviation set:

$\mathcal{D}_i^t = \{D_i^{t,1}, D_i^{t,2}, \dots, D_i^{t,K}, D_i^{t,\mathrm{all}}\}$
Preferably, calculating the anomaly score of each participant and aggregating the global model of the round comprises:

selecting a threshold $\lambda$ as the boundary between abnormal and normal; comparing each element of each participant's deviation set against the threshold, and counting for each participant the number of elements in the deviation set that exceed the threshold as the anomaly score:

$\mathrm{Score}_i = \left| \{ D \in \mathcal{D}_i^t : D > \lambda \} \right|$

The participants with the smallest anomaly score are selected and added to the set SelectSet.
Preferably, when the number of elements in SelectSet is one, that model update is used directly as the global model update to aggregate the global model of the round; when SelectSet contains multiple elements, the mean of those participants' model updates is calculated as the global model update and the global model of the round is aggregated:

$G_t = G_{t-1} + \mathrm{AVERAGE}(\mathrm{SelectSet})$

where $G_t$ denotes the global model of the current round and $G_{t-1}$ the global model of the previous round.
In a second aspect, the invention provides a defense system against federated learning poisoning attacks, comprising:
an initialization module for transmitting the global model to each participant at the beginning of each round of federated training, and for initializing the global model in the first round of federated training;
an aggregation module for aggregating a new global model from the received parameter updates; the participants perform a specified number of local training epochs based on their local data and the received global model, and submit updated model parameters.
In a third aspect, the invention provides a non-transitory computer-readable storage medium storing computer instructions which, when executed by a processor, implement the defense method against federated learning poisoning attacks described above.
In a fourth aspect, the invention provides a computer program product comprising a computer program which, when run on one or more processors, implements the defense method against federated learning poisoning attacks described above.
In a fifth aspect, the invention provides an electronic device, comprising: a processor, a memory, and a computer program; wherein the processor is connected to the memory, the computer program is stored in the memory, and when the electronic device runs, the processor executes the computer program stored in the memory, causing the electronic device to execute instructions implementing the defense method against federated learning poisoning attacks described above.
Term interpretation:
model structure: the structure of a machine learning network generally comprises a convolution layer, a bias layer, a full connection layer and the like.
Federal learning poisoning attacks: in the federal learning environment, malicious participants modify local data or modify local models, and the purpose of embedding backdoors or damaging global models is achieved by submitting the local models.
The beneficial effects of the invention are as follows: the model updates submitted by participants are evaluated at the level of model layers; the deviation of each layer's update and the deviation of the overall update are calculated, the number of deviations exceeding a threshold is taken as an anomaly score, and the updates of the participants with the smallest anomaly score are selected for aggregation. This achieves finer-grained screening than considering only the distance over all parameters; since the number of selected updates depends on the degree of parameter anomaly, the convergence speed and accuracy of the model are guaranteed, while both targeted and untargeted poisoning attacks are effectively countered.
The advantages of additional aspects of the invention will be set forth in part in the description which follows, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a model-structure-based federated learning poisoning attack defense method according to an embodiment of the present invention.
Fig. 2 is a flowchart of a model-structure-based federated learning poisoning attack defense method according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements throughout or elements having like or similar functionality. The embodiments described below by way of the drawings are exemplary only and should not be construed as limiting the invention.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, and/or groups thereof.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
In order that the invention may be readily understood, a further description of the invention will be rendered by reference to specific embodiments that are illustrated in the appended drawings and are not to be construed as limiting embodiments of the invention.
It will be appreciated by those skilled in the art that the drawings are merely schematic representations of examples and that the elements of the drawings are not necessarily required to practice the invention.
Example 1
Embodiment 1 provides a defense system against federated learning poisoning attacks, comprising:
an initialization module for transmitting the global model to each participant at the beginning of each round of federated training, and for initializing the global model in the first round. For example, an initialization module in a central server transmits the global model to the participants at the start of each round, and initializes the global model during the first round of federated training.
an aggregation module for aggregating a new global model from the received parameter updates; the participants perform a specified number of local training epochs based on their local data and the received global model. For example, participant $i$ performs the prescribed local training based on its local data and the received global model, and after training transmits its model parameter update $U_i$ to the central server. The central server then aggregates the new global model from the model updates received from the participants.
In this embodiment 1, the defense method against federated learning poisoning attacks is implemented with the system described above and comprises:
transmitting the global model to each participant at the beginning of each round of federated training; initializing the global model in the first round of federated training;
aggregating a new global model from the received parameter updates; the participants perform a specified number of local training epochs based on their local data and the received global model, and submit updated model parameters.
In each round of federated training, the parameter-updated models submitted by the participants are segmented based on the machine learning model structure; the deviation of each layer of each participant's local model update from the other participants' updates is calculated, and simultaneously the deviation of each participant's overall model update from the other participants' overall updates; an anomaly score is then computed for each participant and the global model of the round is aggregated.

Segmenting the parameter-updated models submitted by the participants based on the machine learning model structure comprises: according to the machine learning model used, storing each participant's model update in segments by layer of the model structure:

$U_i^t = \{u_i^{t,1}, u_i^{t,2}, \dots, u_i^{t,K}\}$

where $u_i^{t,k}$ denotes the local update parameters of the $k$-th layer submitted by participant $i$ in the $t$-th round of federated training.
Calculating the deviation of each layer of each participant's local model update from the other participants' updates, and simultaneously calculating the deviation of each participant's overall model update from the other participants' overall updates, comprises:

computing, by way of the Euclidean distance, the sum of the Euclidean distances between each participant's layer-$k$ model update and the layer-$k$ updates of the remaining participants:

$d_i^{t,k} = \sum_{j \neq i} \left\| u_i^{t,k} - u_j^{t,k} \right\|_2$

normalizing the Euclidean distance sums of the same layer across different participants by max-min normalization, yielding the relative distance of each participant's layer-$k$ update as the deviation of that participant's layer-$k$ model update:

$D_i^{t,k} = \dfrac{d_i^{t,k} - \min_j d_j^{t,k}}{\max_j d_j^{t,k} - \min_j d_j^{t,k}}$

where $\min_j d_j^{t,k}$ denotes the minimum, over participants, of the Euclidean distance sums of the $k$-th layer updates in the $t$-th round of global training, and $\max_j d_j^{t,k}$ the corresponding maximum.

Similarly, the deviation $D_i^{t,\mathrm{all}}$ of each participant's overall model update from the other participants' overall updates is calculated.

The per-layer update deviations and the overall update deviation of each participant are stored in a deviation set:

$\mathcal{D}_i^t = \{D_i^{t,1}, D_i^{t,2}, \dots, D_i^{t,K}, D_i^{t,\mathrm{all}}\}$
Calculating the anomaly score of each participant and aggregating the global model of the round comprises:

selecting a threshold $\lambda$ as the boundary between abnormal and normal; comparing each element of each participant's deviation set against the threshold, and counting for each participant the number of elements exceeding the threshold as the anomaly score:

$\mathrm{Score}_i = \left| \{ D \in \mathcal{D}_i^t : D > \lambda \} \right|$

The participants with the smallest anomaly score are selected and added to the set SelectSet. When SelectSet contains a single element, that model update is used directly as the global model update to aggregate the global model of the round; when SelectSet contains multiple elements, the mean of those participants' model updates is calculated as the global model update and the global model of the round is aggregated:

$G_t = G_{t-1} + \mathrm{AVERAGE}(\mathrm{SelectSet})$

where $G_t$ denotes the global model of the current round and $G_{t-1}$ the global model of the previous round.
Example 2
In this embodiment 2, a model aggregation method is provided that achieves effective defense against various poisoning attacks in federated learning while preserving the accuracy of the federated learning model. The model-structure-based defense method against federated learning poisoning attacks comprises the following steps:
in federal learning, which includes a central server and a plurality of participants, each participant has data that is characteristic of the same. The central server transmits the global model to the various participants at the beginning of each round of federal training. The central server needs to initialize the global model during the first pass of federal training. The participator i carries out local training of specified rounds based on the local data and the received global model, and transmits update U of model parameters to a central server after the training is finished i . The central server then needs to update with the received models of the various participants to aggregate the new global model.
In each round of federated training, the central server segments the model updates submitted by the participants based on the machine learning model structure; it calculates the deviation of each layer of each participant's local model update from the other participants' updates, and simultaneously the deviation of each participant's overall model update from the other participants' overall updates; it then computes an anomaly score for each participant and aggregates the global model of the round.
Segmenting the model updates submitted by the participants comprises:

the central server stores the model updates submitted by the participants in segments, by layer of the model structure of the machine learning model used, as shown in formula (1) below. The layers of the model structure here generally include convolutional layers, bias layers, and fully connected layers. Since all participants train locally from the same global model, the per-layer segments of different participants' updates have exactly the same dimensions and shapes.

$U_i^t = \{u_i^{t,1}, u_i^{t,2}, \dots, u_i^{t,K}\} \quad (1)$

where $u_i^{t,k}$ denotes the local update parameters of the $k$-th layer submitted by participant $i$ in the $t$-th round of federated training.
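To make the segmentation concrete, here is a minimal Python sketch, assuming PyTorch and state-dict-style updates; the function name and the flattening of each layer are illustrative choices of this sketch, not specified by the patent:

```python
import torch

def segment_update(update_state_dict: dict) -> dict:
    """Store one participant's model update segmented by layer, as in formula (1).

    Each entry corresponds to u_i^{t,k}: the update of layer k submitted by
    participant i in round t. Because all participants train from the same
    global model, per-layer segments have identical shapes across participants.
    """
    # Flattening each layer simplifies the Euclidean-distance computations
    # later; this is a convenience of the sketch, not a requirement.
    return {name: tensor.detach().flatten().clone()
            for name, tensor in update_state_dict.items()}
```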
Calculating the deviation of each layer of each participant's local model update from the other participants' updates, and simultaneously calculating the deviation of each participant's overall model update from the other participants' overall updates, comprises:

computing, by way of the Euclidean distance, the sum of the Euclidean distances between each participant's layer-$k$ model update and the layer-$k$ updates of the remaining participants, as shown in formula (2) below:

$d_i^{t,k} = \sum_{j \neq i} \left\| u_i^{t,k} - u_j^{t,k} \right\|_2 \quad (2)$

normalizing the Euclidean distance sums of the same layer across participants by max-min normalization, yielding the relative distance of each participant's layer-$k$ update as the deviation $D_i^{t,k}$ of that participant's layer-$k$ model update, as shown in formula (3) below:

$D_i^{t,k} = \dfrac{d_i^{t,k} - \min_j d_j^{t,k}}{\max_j d_j^{t,k} - \min_j d_j^{t,k}} \quad (3)$

where $\min_j d_j^{t,k}$ denotes the minimum, over participants, of the Euclidean distance sums of the $k$-th layer updates in the $t$-th round of global training, and $\max_j d_j^{t,k}$ the corresponding maximum.

Similarly, the deviation $D_i^{t,\mathrm{all}}$ of each participant's overall model update from the other participants' overall updates is calculated.

The per-layer update deviations and the overall update deviation of each participant are stored in a deviation set, as shown in formula (4) below:

$\mathcal{D}_i^t = \{D_i^{t,1}, D_i^{t,2}, \dots, D_i^{t,K}, D_i^{t,\mathrm{all}}\} \quad (4)$
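A hedged sketch of formulas (2) through (4), reusing the `segment_update` output above; the epsilon guard against a zero max-min range and the `"__all__"` key for the overall deviation are assumptions of this sketch:

```python
import torch

def layer_deviations(segmented_updates: list) -> list:
    """Per participant and per layer: the max-min normalized sum of Euclidean
    distances to the other participants' updates of the same layer
    (formulas (2) and (3))."""
    n = len(segmented_updates)
    deviations = [dict() for _ in range(n)]
    for k in segmented_updates[0]:
        # Formula (2): sum of Euclidean distances to the other participants.
        sums = torch.tensor([
            sum(torch.dist(segmented_updates[i][k], segmented_updates[j][k]).item()
                for j in range(n) if j != i)
            for i in range(n)
        ])
        # Formula (3): max-min normalization across participants.
        lo, hi = sums.min(), sums.max()
        normalized = (sums - lo) / (hi - lo + 1e-12)  # epsilon guards a zero range
        for i in range(n):
            deviations[i][k] = normalized[i].item()
    return deviations

def deviation_sets(segmented_updates: list) -> list:
    """Formula (4): per-layer deviations plus the overall-update deviation,
    computed the same way over each participant's concatenated update."""
    deviations = layer_deviations(segmented_updates)
    whole = [{"__all__": torch.cat(list(u.values()))} for u in segmented_updates]
    for i, d in enumerate(layer_deviations(whole)):
        deviations[i]["__all__"] = d["__all__"]
    return deviations
```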
Calculating the anomaly score of each participant and aggregating the global model of the round comprises:

the central server selects a threshold $\lambda$ as the boundary between abnormal and normal. Each element of each participant's deviation set is compared against the threshold, and the number of elements exceeding the threshold is counted for each participant as the anomaly score, as shown in formula (5) below:

$\mathrm{Score}_i = \left| \{ D \in \mathcal{D}_i^t : D > \lambda \} \right| \quad (5)$

The participants with the smallest anomaly score are selected and added to the set SelectSet. When SelectSet contains a single element, that model update is used directly as the global model update to aggregate the global model of the round; when SelectSet contains multiple elements, the mean of those participants' model updates is calculated as the global model update and the global model of the round is aggregated, as shown in formula (6) below:

$G_t = G_{t-1} + \mathrm{AVERAGE}(\mathrm{SelectSet}) \quad (6)$

where $G_t$ denotes the global model of the current round and $G_{t-1}$ the global model of the previous round.
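A sketch of formulas (5) and (6) under the same assumptions as above; `lam` plays the role of $\lambda$, and keeping all minimum-score participants matches the SelectSet description:

```python
import torch

def aggregate_round(global_prev: dict, segmented_updates: list,
                    dev_sets: list, lam: float) -> dict:
    """Formula (5): anomaly score = number of deviations above the threshold.
    Formula (6): G_t = G_{t-1} + AVERAGE(SelectSet)."""
    scores = [sum(1 for d in dev.values() if d > lam) for dev in dev_sets]
    best = min(scores)
    select_set = [u for u, s in zip(segmented_updates, scores) if s == best]
    # Averaging also covers the single-element case: the mean of one update
    # is the update itself, which is what the patent prescribes.
    new_global = {}
    for name, g in global_prev.items():
        mean_update = torch.stack([u[name] for u in select_set]).mean(dim=0)
        new_global[name] = g + mean_update.view_as(g)
    return new_global
```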
In summary, in embodiment 2 the model updates submitted by the participants are evaluated at the level of model layers: the deviation of each layer's update and the deviation of the overall update are calculated, the number of deviations exceeding the threshold serves as the anomaly score, and the updates of the participants with the smallest anomaly score are selected and aggregated. This achieves finer-grained screening than considering only the distance over all parameters. The number of selected updates depends on the degree of parameter anomaly, with a theoretical range of $[1, n-1]$; provided malicious participants do not exceed fifty percent, the convergence speed and accuracy of the model are guaranteed, while targeted and untargeted poisoning attacks are effectively countered.
Example 3
As shown in fig. 1 and fig. 2, embodiment 3 provides a model-structure-based defense method against federated learning poisoning attacks that effectively prevents malicious participants from damaging the model or embedding backdoors through poisoning attacks, without affecting the accuracy of the federated learning global model.
The execution flow of the method provided in this embodiment is shown in fig. 2 and comprises the following specific steps:
step 1, initiating a federation learning task to complete basic federation settings, such as a data set, a network structure, a batch size, a learning rate, a local training round and the like. The private data holder selects whether to apply for participating in the federal learning task, and the task publisher determines that n members participate in federal learning to jointly construct a global model.
Step 2: in each training round, the central server first broadcasts the global model $G_{t-1}$ to the participants (in the first round, the initialized global model $G_0$).
After receiving the global model $G_{t-1}$ from the central server, benign participants perform local training on the global model using their own private data and submit model updates once training is finished. Malicious participants mount poisoning attacks: for example, an untargeted poisoning attack via label flipping, or a targeted poisoning attack by embedding a pixel block into samples of the private dataset and changing the labels of the trigger-embedded samples. The malicious model updates are submitted once the poisoning is complete. Hedged sketches of these two attacks appear below.
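For illustration only, here are minimal sketches of the two attacks just described; the specific flip rule, trigger size, placement, and target class are arbitrary assumptions of this sketch, since the patent only specifies label reversal and pixel-block embedding:

```python
import torch

def label_flip(labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Untargeted poisoning via label reversal: map each class c to
    num_classes - 1 - c (one common flip rule; others are possible)."""
    return num_classes - 1 - labels

def embed_pixel_block(images: torch.Tensor, labels: torch.Tensor,
                      target_class: int, size: int = 3):
    """Targeted poisoning: stamp a pixel-block trigger into a corner of each
    image and relabel the poisoned samples with the attacker's target class."""
    poisoned = images.clone()
    poisoned[..., :size, :size] = 1.0  # white pixel block in the top-left corner
    return poisoned, torch.full_like(labels, target_class)
```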
Step 3: the central server stores each received participant model update segmented by model layer, in tensor format, as shown in formula (1) below:

$U_i^t = \{u_i^{t,1}, u_i^{t,2}, \dots, u_i^{t,K}\} \quad (1)$

where $i$ denotes the participant, $t$ the round of global training, and $k$ the layer of the model; $u_i^{t,k}$ denotes the local update parameters of the $k$-th layer submitted by participant $i$ in the $t$-th round of federated training.
Step 4: using the Euclidean distance, the central server calculates the sum $d_i^{t,k}$ of the Euclidean distances between each participant's layer-$k$ model update and the layer-$k$ updates of the remaining participants, as shown in formula (2) below:

$d_i^{t,k} = \sum_{j \neq i} \left\| u_i^{t,k} - u_j^{t,k} \right\|_2 \quad (2)$

The Euclidean distance sums of the same layer across participants are normalized by max-min normalization, yielding the relative distance of each participant's layer-$k$ update as the deviation $D_i^{t,k}$ of that participant's layer-$k$ model update, as shown in formula (3) below:

$D_i^{t,k} = \dfrac{d_i^{t,k} - \min_j d_j^{t,k}}{\max_j d_j^{t,k} - \min_j d_j^{t,k}} \quad (3)$

where $\min_j d_j^{t,k}$ denotes the minimum, over participants, of the Euclidean distance sums of the $k$-th layer updates in the $t$-th round of global training, and $\max_j d_j^{t,k}$ the corresponding maximum.

Similarly, the deviation $D_i^{t,\mathrm{all}}$ of each participant's overall model update from the other participants' overall updates is calculated, and each participant's per-layer update deviations and overall update deviation are stored in a deviation set $\mathcal{D}_i^t$, as shown in formula (4) below:

$\mathcal{D}_i^t = \{D_i^{t,1}, D_i^{t,2}, \dots, D_i^{t,K}, D_i^{t,\mathrm{all}}\} \quad (4)$
Step 5: the central server selects a threshold $\lambda$ as the boundary between abnormal and normal model updates. Each element of each participant's deviation set is compared against the threshold, and the number of elements exceeding the threshold is counted for each participant as the anomaly score, as shown in formula (5) below:

$\mathrm{Score}_i = \left| \{ D \in \mathcal{D}_i^t : D > \lambda \} \right| \quad (5)$

Step 6: the central server selects all participants with the smallest anomaly score and adds their submitted model updates to the set SelectSet. Participants that are filtered out are considered to have submitted abnormal model updates.
Step 7: when the number of elements in SelectSet equals 1, that element is used directly as the global model update to aggregate the global model of the round; when SelectSet contains more than one element, the mean of those participants' model updates is calculated as the global model update and the global model of the round is aggregated, as shown in formula (6) below:

$G_t = G_{t-1} + \mathrm{AVERAGE}(\mathrm{SelectSet}) \quad (6)$
Convergence detection is then performed on the global model. If the model has converged, federated learning is complete; if not, steps 2 through 7 are repeated until convergence. The convergence condition is set by the task publisher, for example a specific number of global training rounds or the model reaching a certain accuracy on the test set.
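Tying steps 2 through 7 together, here is a hedged server-loop sketch reusing the helper functions from the sketches in embodiment 2; the `client.local_train` interface and the fixed round budget standing in for the task publisher's convergence condition are assumptions of this sketch:

```python
def run_federated_training(global_model: dict, clients: list,
                           num_rounds: int, lam: float) -> dict:
    """One round per iteration: broadcast (step 2), collect and segment
    updates (step 3), score deviations (steps 4-5), select and aggregate
    (steps 6-7), then repeat until the round budget is exhausted."""
    for t in range(num_rounds):
        # Step 2: each participant trains locally on the broadcast model
        # and returns its parameter update as a state dict.
        updates = [client.local_train(global_model) for client in clients]
        segmented = [segment_update(u) for u in updates]                    # step 3
        devs = deviation_sets(segmented)                                    # steps 4-5
        global_model = aggregate_round(global_model, segmented, devs, lam)  # steps 6-7
    return global_model
```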
Example 4
Embodiment 4 of the present invention provides a non-transitory computer-readable storage medium storing computer instructions which, when executed by a processor, implement a defense method against federated learning poisoning attacks, the method comprising:
transmitting the global model to each participant at the beginning of each round of federated training; initializing the global model in the first round of federated training;
aggregating a new global model from the received parameter updates; the participants perform a specified number of local training epochs based on their local data and the received global model, and submit updated model parameters.
Example 5
Embodiment 5 of the present invention provides a computer program product comprising a computer program which, when run on one or more processors, implements a defense method against federated learning poisoning attacks, the method comprising:
transmitting the global model to each participant at the beginning of each round of federated training; initializing the global model in the first round of federated training;
aggregating a new global model from the received parameter updates; the participants perform a specified number of local training epochs based on their local data and the received global model, and submit updated model parameters.
Example 6
Embodiment 6 of the present invention provides an electronic device, comprising: a processor, a memory, and a computer program; wherein the processor is connected to the memory, the computer program is stored in the memory, and when the electronic device runs, the processor executes the computer program stored in the memory, causing the electronic device to execute instructions implementing a defense method against federated learning poisoning attacks, the method comprising:
transmitting the global model to each participant at the beginning of each round of federated training; initializing the global model in the first round of federated training;
aggregating a new global model from the received parameter updates; the participants perform a specified number of local training epochs based on their local data and the received global model, and submit updated model parameters.
In summary, the defense method against federated learning poisoning attacks of the embodiments of the present invention resists poisoning attacks by jointly examining the per-layer model updates and the overall model update submitted by federated learning participants. Compared with distance detection based only on all parameters, the invention offers a finer-grained detection scheme; compared with coordinate-wise aggregation, whole layers of parameters carry more information, which reduces the incidental influence of the parameter distribution in any single dimension and preserves the speed and direction of model training. The invention achieves effective defense against untargeted and targeted poisoning attacks under federated learning, imposes fewer usage restrictions and provides more comprehensive protection than existing defenses, and improves the robustness of federated learning models. It contributes positively to the security of federated learning systems and models, provides both a theoretical basis and a practical method, and can promote the application and development of federated learning in practice.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the foregoing describes embodiments of the present invention in conjunction with the drawings, this is not intended to limit the scope of the invention; various changes and modifications that a person skilled in the art could make without inventive effort still fall within the scope of the invention.

Claims (6)

1. A defense method against federated learning poisoning attacks, characterized by comprising the following steps:
transmitting the global model to each participant at the beginning of each round of federated training; initializing the global model in the first round of federated training;
aggregating a new global model from the received parameter updates; the participants perform a specified number of local training epochs based on their local data and the received global model, and submit updated model parameters;
in each round of federated training, segmenting the parameter-updated models submitted by the participants based on the machine learning model structure; calculating the deviation of each layer of each participant's local model update from the other participants' updates, and simultaneously the deviation of each participant's overall model update from the other participants' overall updates; computing an anomaly score for each participant and aggregating the global model of the round;
segmenting the parameter-updated models submitted by the participants based on the machine learning model structure comprises: according to the machine learning model used, storing each participant's model update in segments by layer of the model structure: $U_i^t = \{u_i^{t,1}, u_i^{t,2}, \dots, u_i^{t,K}\}$, where $u_i^{t,k}$ denotes the local update parameters of the $k$-th layer submitted by participant $i$ in the $t$-th round of federated training;
calculating the deviation of each layer of each participant's local model update from the other participants' updates, and simultaneously calculating the deviation of each participant's overall model update from the other participants' overall updates, comprises:

computing, by way of the Euclidean distance, the sum of the Euclidean distances between each participant's layer-$k$ model update and the layer-$k$ updates of the remaining participants:

$d_i^{t,k} = \sum_{j \neq i} \left\| u_i^{t,k} - u_j^{t,k} \right\|_2$

normalizing the Euclidean distance sums of the same layer across participants by max-min normalization, yielding the relative distance of each participant's layer-$k$ update as the deviation of that participant's layer-$k$ model update:

$D_i^{t,k} = \dfrac{d_i^{t,k} - \min_j d_j^{t,k}}{\max_j d_j^{t,k} - \min_j d_j^{t,k}}$

where $\min_j d_j^{t,k}$ denotes the minimum, over participants, of the Euclidean distance sums of the $k$-th layer updates in the $t$-th round of global training, and $\max_j d_j^{t,k}$ the corresponding maximum;

similarly, calculating the deviation $D_i^{t,\mathrm{all}}$ of each participant's overall model update from the other participants' overall updates;

storing the per-layer update deviations and the overall update deviation of each participant in a deviation set: $\mathcal{D}_i^t = \{D_i^{t,1}, D_i^{t,2}, \dots, D_i^{t,K}, D_i^{t,\mathrm{all}}\}$.
2. The defense method against federated learning poisoning attacks according to claim 1, wherein calculating the anomaly score of each participant and aggregating the global model of the round comprises:
selecting a threshold $\lambda$ as the boundary between abnormal and normal; comparing each element of each participant's deviation set against the threshold, and counting for each participant the number of elements in the deviation set that exceed the threshold as the anomaly score $\mathrm{Score}_i = \left| \{ D \in \mathcal{D}_i^t : D > \lambda \} \right|$; and selecting the participants with the smallest anomaly score to add to the set SelectSet.
3. The defense method against federated learning poisoning attacks according to claim 2, wherein when the number of elements in SelectSet is one, that model update is used directly as the global model update to aggregate the global model of the round; when SelectSet contains multiple elements, the mean of those participants' model updates is calculated as the global model update and the global model of the round is aggregated:

$G_t = G_{t-1} + \mathrm{AVERAGE}(\mathrm{SelectSet})$

where $G_t$ denotes the global model of the current round and $G_{t-1}$ the global model of the previous round.
4. A defense system against federated learning poisoning attacks based on the defense method according to any one of claims 1 to 3, comprising:
an initialization module for transmitting the global model to each participant at the beginning of each round of federated training, and for initializing the global model in the first round of federated training;
an aggregation module for aggregating a new global model from the received parameter updates; the participants perform a specified number of local training epochs based on their local data and the received global model, and submit updated model parameters.
5. A non-transitory computer-readable storage medium storing computer instructions which, when executed by a processor, implement the defense method against federated learning poisoning attacks according to any one of claims 1 to 3.
6. An electronic device, comprising: a processor, a memory, and a computer program; wherein the processor is connected to the memory, the computer program is stored in the memory, and when the electronic device runs, the processor executes the computer program stored in the memory, causing the electronic device to execute instructions implementing the defense method against federated learning poisoning attacks according to any one of claims 1 to 3.
CN202211391958.9A 2022-11-08 2022-11-08 Method and system for defending against federal learning poisoning attack Active CN115907029B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211391958.9A CN115907029B (en) 2022-11-08 2022-11-08 Method and system for defending against federal learning poisoning attack

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211391958.9A CN115907029B (en) 2022-11-08 2022-11-08 Method and system for defending against federal learning poisoning attack

Publications (2)

Publication Number Publication Date
CN115907029A CN115907029A (en) 2023-04-04
CN115907029B (en) 2023-07-21

Family

ID=86493119

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211391958.9A Active CN115907029B (en) 2022-11-08 2022-11-08 Method and system for defending against federal learning poisoning attack

Country Status (1)

Country Link
CN (1) CN115907029B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11829193B2 (en) * 2020-08-14 2023-11-28 Tata Consultancy Services Limited Method and system for secure online-learning against data poisoning attack
CN117408332A (en) * 2023-10-19 2024-01-16 华中科技大学 De-centralized AI training and transaction platform and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021114821A1 (en) * 2019-12-12 2021-06-17 支付宝(杭州)信息技术有限公司 Isolation forest model construction and prediction method and device based on federated learning
CN113723477A (en) * 2021-08-16 2021-11-30 同盾科技有限公司 Cross-feature federal abnormal data detection method based on isolated forest
CN113962988A (en) * 2021-12-08 2022-01-21 东南大学 Power inspection image anomaly detection method and system based on federal learning
CN114266361A (en) * 2021-12-30 2022-04-01 浙江工业大学 Model weight alternation-based federal learning vehicle-mounted and free-mounted defense method and device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11170320B2 (en) * 2018-07-19 2021-11-09 Adobe Inc. Updating machine learning models on edge servers
CN113704807A (en) * 2020-05-20 2021-11-26 中国科学技术大学 Defense method aiming at user-level attack under privacy protection federal learning framework
CN114372046A (en) * 2021-05-13 2022-04-19 青岛亿联信息科技股份有限公司 Parking flow prediction model training method based on federal learning
CN113283185B (en) * 2021-07-23 2021-11-12 平安科技(深圳)有限公司 Federal model training and client imaging method, device, equipment and medium
CN113591145B (en) * 2021-07-28 2024-02-23 西安电子科技大学 Federal learning global model training method based on differential privacy and quantization
CN113779563A (en) * 2021-08-05 2021-12-10 国网河北省电力有限公司信息通信分公司 Method and device for defending against backdoor attack of federal learning
CN113965359B (en) * 2021-09-29 2023-08-04 哈尔滨工业大学(深圳) Federal learning data poisoning attack-oriented defense method and device
CN114764499A (en) * 2022-03-21 2022-07-19 大连理工大学 Sample poisoning attack resisting method for federal learning
CN115021905B (en) * 2022-05-24 2023-01-10 北京交通大学 Method for aggregating update parameters of local model for federated learning
CN115238825A (en) * 2022-09-01 2022-10-25 北京百度网讯科技有限公司 Model training method, device, equipment and medium based on federal learning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021114821A1 (en) * 2019-12-12 2021-06-17 支付宝(杭州)信息技术有限公司 Isolation forest model construction and prediction method and device based on federated learning
CN113723477A (en) * 2021-08-16 2021-11-30 同盾科技有限公司 Cross-feature federal abnormal data detection method based on isolated forest
CN113962988A (en) * 2021-12-08 2022-01-21 东南大学 Power inspection image anomaly detection method and system based on federal learning
CN114266361A (en) * 2021-12-30 2022-04-01 浙江工业大学 Model weight alternation-based federal learning vehicle-mounted and free-mounted defense method and device

Also Published As

Publication number Publication date
CN115907029A (en) 2023-04-04


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant