CN114548429B - Safe and efficient horizontal federated neural network model training method - Google Patents
- Publication number
- CN114548429B (application CN202210452869.4A)
- Authority
- CN
- China
- Prior art keywords
- weight coefficient
- neural network
- network model
- gradient
- client
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a safe and efficient horizontal federated neural network model training method comprising the following steps. S1: an initiator and a participant synchronously initialize their models. S2: the initiator and the participant synchronize the batch number m. S3: the initiator and the participant each complete m batch processings, and the initiator, with the cooperation of the participant, calculates the aggregation gradient corresponding to the weight coefficient of each feature data contained in the horizontal federated neural network model after each batch processing, obtaining m aggregation gradients for the weight coefficient of each feature data. S4: the initiator calculates the latest value of the weight coefficient of each feature data and sends it to the participant, and the initiator and the participant each assign the latest value to the weight coefficients of the feature data contained in their own horizontal federated neural network models. Steps S3-S4 are repeated until the set iteration number T is reached. The invention protects the data security of both the initiator and the participant, trains very efficiently, and is convenient for large-scale commercial deployment.
Description
Technical Field
The invention relates to the technical field of neural network model training, and in particular to a safe and efficient horizontal federated neural network model training method.
Background
In horizontal federated learning risk-control scenarios, homomorphic encryption and secret sharing are the common security protocols. However, the computational complexity of horizontal federated neural network modeling is high: if the federated neural network algorithm is built on homomorphic encryption or secret-sharing operators, large-scale commercial use is difficult to achieve, and training a complex horizontal federated neural network model such as ResNet takes especially long. At the same time, financial risk-control scenarios have high security requirements, so a safe and efficient horizontal federated neural network model training method is needed.
Existing horizontal federated neural network modeling methods are mainly based on cryptographic security protocols such as secret sharing or homomorphic encryption; their computational complexity is several times or even dozens of times that of plaintext computation, which falls short of large-scale commercial requirements. Moreover, in a two-party scenario, even if the aggregation gradient is computed over secret-shared or homomorphically encrypted ciphertext, the participant's plaintext gradient value can be deduced in reverse. For example, if the initiator's gradient is 5, the participant's gradient is 3, and the initiator knows that the final aggregate gradient is 8, then the initiator can deduce that the participant's gradient is 8 - 5 = 3, regardless of whether the participant used secret sharing or homomorphic encryption.
Disclosure of Invention
To solve these technical problems, the invention provides a safe and efficient horizontal federated neural network model training method. By adding random noise obeying a normal distribution, the respective gradients of the initiator and the participant are protected from being deduced in reverse by the other party, which protects data security; at the same time, the gradient computation complexity is essentially the same as for plaintext, so model training is very efficient and large-scale commercial use is easy to achieve.
In order to solve the problems, the invention adopts the following technical scheme:
The invention relates to a safe and efficient horizontal federated neural network model training method, which is used for joint risk-control modeling between financial institutions and comprises the following steps:
S1: the initiator client and the participant client synchronously initialize their respective horizontal federated neural network models and the weight coefficient of each feature data contained in the models;
S2: the initiator client and the participant client synchronize the batch number m, and each divides its own samples used for training the horizontal federated neural network model into m batches;
S3: the initiator client and the participant client each complete the m batch processings;
the initiator client, with the cooperation of the participant client, calculates the aggregation gradient corresponding to the weight coefficient of each feature data contained in the horizontal federated neural network model after each batch processing, obtaining m aggregation gradients for the weight coefficient of each feature data;
the initiator client, with the cooperation of the participant client, calculates the aggregation gradient g_ij corresponding to the weight coefficient of the jth feature data contained in the horizontal federated neural network model after the ith batch processing, where 1 ≤ i ≤ m, 1 ≤ j ≤ d, and d is the number of feature data contained in the horizontal federated neural network model, through the following steps:
N1: the initiator client calculates the average gradient ga_ij corresponding to the weight coefficient of the jth feature data of its local horizontal federated neural network model after the ith batch processing, and the participant client calculates the average gradient gb_ij corresponding to the weight coefficient of the jth feature data of its local horizontal federated neural network model after the ith batch processing;
N2: the initiator client adds noise N to the average gradient ga_ij to obtain the noisy gradient gan_ij, and the participant client adds noise N to the average gradient gb_ij to obtain the noisy gradient gbn_ij;
the noise N is random noise obeying a normal distribution with expectation 0 and variance (Cσ)², where σ is a standard deviation and C is a noise coefficient;
N3: the participant client sends the noisy gradient gbn_ij to the initiator client, and the initiator client calculates the aggregation gradient g_ij corresponding to the weight coefficient of the jth feature data contained in the horizontal federated neural network model after the ith batch processing as g_ij = (gan_ij + gbn_ij)/2;
S4: the initiator client calculates the mean value of m aggregation gradients corresponding to the weight coefficient of each feature data to obtain an average aggregation gradient corresponding to the weight coefficient of each feature data, calculates the latest value of the weight coefficient of each feature data according to the learning rate mu and the average aggregation gradient corresponding to the weight coefficient of each feature data, and gives the latest value to the weight coefficient of each feature data contained in the lateral federal neural network model of the local system;
the initiator client side sends the latest value of the weight coefficient of each feature data to the participant client side, and the participant client side gives the latest value to the weight coefficient of each feature data contained in the local lateral federal neural network model;
S5: steps S3-S4 are repeated until the set iteration number T is reached.
In this scheme, the initiator client and the participant client initialize identical horizontal federated neural network models and identical weight coefficients for each feature data contained in them; the model contains d feature data, and every sample used by the initiator client and the participant client to train the model contains the same d feature data.
The initiator client and the participant client synchronize the batch number m, divide their samples used for training the horizontal federated neural network model into m batches, and thereby complete batch alignment. Then each client substitutes its own m batches of samples into its local horizontal federated neural network model for training, completing m batch processings. The initiator client calculates a noisy gradient for the weight coefficient of each feature data contained in its local model after each batch processing, so the weight coefficient of each feature data corresponds to m noisy gradients gan; the participant client likewise calculates a noisy gradient for the weight coefficient of each feature data after each batch processing, so each weight coefficient corresponds to m noisy gradients gbn. The initiator client aggregates the noisy gradients gan and gbn obtained from the same batch for the weight coefficient of each feature data, obtaining the aggregation gradient g for the weight coefficient of each feature data.
The initiator client then calculates the mean of the m aggregation gradients for the weight coefficient of each feature data, computes the latest value of each weight coefficient using the learning rate μ, and assigns the latest values to the weight coefficients; it sends the calculated latest values to the participant client, which assigns them to its own weight coefficients. Steps S3-S4 are repeated T times to complete the training of the horizontal federated neural network model.
In the process of calculating the aggregation gradient g: first, the initiator client calculates the average gradient ga corresponding to the weight coefficient of each feature data of its local horizontal federated neural network model after the current batch processing, and the participant client calculates the average gradient gb for the same batch; then the initiator client adds noise N to each average gradient ga to obtain the noisy gradient gan, and the participant client adds noise N to each average gradient gb to obtain the noisy gradient gbn; finally, the initiator client takes the mean of the noisy gradients gan and gbn for the weight coefficient of each feature data, obtaining the aggregation gradient g.
Because the added noise N is normally distributed random noise with expectation 0 and variance (Cσ)², when the initiator client finally takes the mean of the m aggregation gradients for the weight coefficient of each feature data, all the noise N added by the initiator client and all the noise N added by the participant client essentially cancel each other out. The calculated mean of the m aggregation gradients therefore deviates only very slightly from the actual gradient mean, and the training accuracy of the horizontal federated neural network model is essentially unaffected.
In this method, during the calculation of the aggregation gradient g, the initiator client adds noise N to its computed average gradients ga and the participant client adds noise N to its computed average gradients gb. Since N is random noise, the initiator client cannot learn the actual values of the participant client's average gradients gb, and the final mean of the m aggregation gradients deviates only slightly from the actual gradient mean; consequently, neither party can deduce the other's gradient values in reverse, and data security is protected. Because the method does not encrypt gradients with a cryptographic security protocol, the gradient computation complexity is essentially the same as for plaintext, so model training is very efficient and large-scale commercial use is easy to achieve.
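The cancellation of the zero-mean noise under averaging can be checked numerically. This is a sketch with illustrative names, averaging many independently noised batch aggregates:

```python
import random

def mean_of_noisy_aggregates(ga_list, gb_list, sigma=1.0, C=0.1, seed=42):
    # Each batch i contributes g_i = ((ga_i + N) + (gb_i + N')) / 2 with two
    # independent zero-mean noise draws; averaging over m batches makes the
    # accumulated noise shrink like 1/sqrt(m), so the result stays close to
    # the true gradient mean.
    rng = random.Random(seed)
    g = [(ga + rng.gauss(0, C * sigma) + gb + rng.gauss(0, C * sigma)) / 2
         for ga, gb in zip(ga_list, gb_list)]
    return sum(g) / len(g)
```

For m = 10000 batches with true per-batch gradients 1.0 and 3.0, the estimate stays very close to the true mean 2.0, while any single noisy gradient still hides the other party's value.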
The method can be used for joint risk-control modeling between financial institutions; the feature data contained in the horizontal federated neural network model may be, for example, a user sample's income, age, monthly telephone charge, monthly repayment amount, total debt, and the like.
Preferably, the step S2 includes the steps of:
the initiator client calculates the batch number m from the number A of its samples used for training the horizontal federated neural network model and its batch size p, m = ⌈A/p⌉, where ⌈·⌉ denotes rounding up; it sends m to the participant client and divides its training samples into m batches according to the batch size p;
the participant client calculates its batch size q from the number B of its samples used for training the horizontal federated neural network model and the batch number m, q = ⌈B/m⌉, and divides its training samples into m batches according to the batch size q.
The batch size p means that a single batch processes at most p samples, and the batch size q means that a single batch processes at most q samples.
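The two ceiling computations above (m from the initiator's A and p, then q from the participant's B and the shared m) can be sketched directly; the function name is an assumption:

```python
import math

def batch_params(A, p, B):
    # Initiator: number of batches m = ceil(A / p).
    m = math.ceil(A / p)
    # Participant: batch size q = ceil(B / m), so its B samples also form m batches.
    q = math.ceil(B / m)
    return m, q
```

For example, A = 1000 samples with batch size p = 64 gives m = 16 batches, and a participant with B = 500 samples then uses q = 32.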
Preferably, in step S3, the initiator client completes one batch processing by inputting the feature data contained in each sample of the current batch into its horizontal federated neural network model for training;
the participant client completes one batch processing in step S3 by inputting the feature data contained in each sample of the current batch into its horizontal federated neural network model for training.
Preferably, the step N1 includes the following steps:
N11: the initiator client calculates the average gradient ga_ij corresponding to the weight coefficient of the jth feature data of its local horizontal federated neural network model after the ith batch processing through the following specific steps:
M1: the initiator client inputs each sample (comprising feature data) of the ith batch processing into its horizontal federated neural network model and calculates the gradient corresponding to the weight coefficient of each feature data; the rth gradient corresponding to the weight coefficient of the jth feature data, calculated after the rth sample of the ith batch processing is input into the initiator's horizontal federated neural network model, is gak_ijr, where 1 ≤ r ≤ u, u is the total number of samples contained in the ith batch, and 1 ≤ u ≤ p;
after the initiator client has input all u samples of the ith batch processing into its horizontal federated neural network model for training, u gradients can be calculated for the weight coefficient of each feature data; the u gradients corresponding to the weight coefficient of the jth feature data are gak_ij1, gak_ij2, …, gak_iju;
M2: the initiator client standardizes each of the u gradients corresponding to the weight coefficient of the jth feature data, obtaining u standardized gradients;
the rth standardized gradient obtained by standardizing the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data is gas_ijr;
M3: the initiator client calculates the average gradient ga_ij corresponding to the weight coefficient of the jth feature data as ga_ij = (gas_ij1 + gas_ij2 + … + gas_iju)/u;
N12: the participant client calculates the average gradient gb_ij corresponding to the weight coefficient of the jth feature data of its local horizontal federated neural network model after the ith batch processing through the following specific steps:
F1: the participant client inputs each sample (comprising feature data) of the ith batch processing into its horizontal federated neural network model for training and calculates the gradient corresponding to the weight coefficient of each feature data; the tth gradient corresponding to the weight coefficient of the jth feature data, calculated after the tth sample of the ith batch processing is input into the participant's horizontal federated neural network model for training, is gbk_ijt, where 1 ≤ t ≤ v, v is the total number of samples contained in the ith batch processing, and 1 ≤ v ≤ q;
after the participant client has input all v samples of the ith batch processing into its horizontal federated neural network model for training, v gradients can be calculated for the weight coefficient of each feature data; the v gradients corresponding to the weight coefficient of the jth feature data are gbk_ij1, gbk_ij2, …, gbk_ijv;
F2: the participant client standardizes each of the v gradients corresponding to the weight coefficient of the jth feature data, obtaining v standardized gradients;
the tth standardized gradient obtained by standardizing the tth gradient gbk_ijt corresponding to the weight coefficient of the jth feature data is gbs_ijt;
F3: the participant client calculates the average gradient gb_ij corresponding to the weight coefficient of the jth feature data as gb_ij = (gbs_ij1 + gbs_ij2 + … + gbs_ijv)/v.
Preferably, in step M1, the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data, calculated after the initiator client inputs the rth sample of the ith batch processing into its horizontal federated neural network model for training, is obtained as follows:
the rth sample of the ith batch processing is input into the local horizontal federated neural network model for training, and the partial derivative of the model function of the horizontal federated neural network model with respect to the weight coefficient of the jth feature data is taken, yielding the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data;
in step F1, the tth gradient gbk_ijt corresponding to the weight coefficient of the jth feature data, calculated after the participant client inputs the tth sample of the ith batch processing into its horizontal federated neural network model for training, is obtained as follows:
the tth sample of the ith batch processing is input into the local horizontal federated neural network model for training, and the partial derivative of the model function of the horizontal federated neural network model with respect to the weight coefficient of the jth feature data is taken, yielding the tth gradient gbk_ijt corresponding to the weight coefficient of the jth feature data.
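The patent takes these partial derivatives analytically from the model function. A hedged numerical sketch (central differences; the function name and signature are assumptions) illustrates what the per-sample gradients gak_ijr and gbk_ijt represent:

```python
def gradient_wrt_weight(L, f, j, x, eps=1e-6):
    # Central-difference estimate of the partial derivative of the model
    # function L(f, x) with respect to the weight coefficient f_j,
    # evaluated at one sample x.
    f_hi = list(f); f_hi[j] += eps
    f_lo = list(f); f_lo[j] -= eps
    return (L(f_hi, x) - L(f_lo, x)) / (2 * eps)
```

For L(f, x) = (f·x)² the exact derivative with respect to f_j is 2(f·x)·x_j, which the estimate matches closely.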
Preferably, in step M2, the initiator client standardizes the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data to obtain the rth standardized gradient gas_ijr, computed from the gradient vector G = [gak_ij1, gak_ij2, …, gak_iju].
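The exact standardization formula is elided in this text. One common choice in private gradient aggregation is norm clipping with the coefficient C, as in DP-SGD; the sketch below is that assumption, not necessarily the patent's formula:

```python
def clip_gradient(gak_ijr, C):
    # Scale the gradient so its magnitude never exceeds C (DP-SGD-style
    # clipping); an assumed stand-in for the patent's standardization step.
    return gak_ijr / max(1.0, abs(gak_ijr) / C)
```

Clipping bounds each per-sample gradient so that the normally distributed noise with scale proportional to C can mask any single contribution.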
Preferably, in step S4, the initiator client calculates the latest value of the weight coefficient of the jth feature data from the learning rate μ and the average aggregation gradient corresponding to that weight coefficient, and assigns it to the weight coefficient of the jth feature data contained in its local horizontal federated neural network model, according to the formula f_j = f_j - μ·gm_j, wherein f_j is the weight coefficient of the jth feature data and gm_j is the average aggregation gradient corresponding to the weight coefficient of the jth feature data.
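The update in step S4 is an ordinary gradient-descent step; a one-line sketch (function name assumed):

```python
def update_weight(f_j, gm_j, mu=0.1):
    # S4: f_j <- f_j - mu * gm_j, where gm_j is the average aggregation
    # gradient for the jth weight coefficient and mu is the learning rate.
    return f_j - mu * gm_j
```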
The beneficial effects of the invention are as follows: by adding random noise obeying a normal distribution, the gradients of the initiator and the participant are protected from being deduced in reverse by the other party, protecting data security; at the same time, the gradient computation complexity is essentially the same as for plaintext, so model training is very efficient and large-scale commercial use is easy to achieve.
Drawings
FIG. 1 is a flow chart of an embodiment.
Detailed Description
The technical scheme of the invention is further specifically described by the following embodiments and the accompanying drawings.
Embodiment: a safe and efficient horizontal federated neural network model training method, used for joint risk-control modeling between financial institutions, comprises the following steps, as shown in Fig. 1:
S1: the initiator client and the participant client synchronously initialize their respective horizontal federated neural network models and the weight coefficient of each feature data contained in the models; the model function of the horizontal federated neural network model is L(f, x), where f = [f_1, f_2, …, f_d], x = [x_1, x_2, …, x_d], x_j denotes the jth feature data, f_j denotes the weight coefficient of the jth feature data x_j, 1 ≤ j ≤ d, and d is the number of feature data contained in the horizontal federated neural network model;
S2: the initiator client calculates the batch number m from the number A of its samples used for training the horizontal federated neural network model and its batch size p, m = ⌈A/p⌉, where ⌈·⌉ denotes rounding up; it sends m to the participant client and divides its training samples into m batches according to the batch size p, where the batch size p means that a single batch processes at most p samples;
the participant client calculates its batch size q from the number B of its samples used for training the horizontal federated neural network model and the batch number m, q = ⌈B/m⌉, and divides its training samples into m batches according to the batch size q, where the batch size q means that a single batch processes at most q samples;
S3: the initiator client substitutes its m batches of samples into its horizontal federated neural network model for training, completing m batch processings; the initiator client completes one batch processing by inputting the feature data contained in each sample of the current batch into its horizontal federated neural network model for training;
the participant client substitutes its m batches of samples into its horizontal federated neural network model for training, completing m batch processings; the participant client completes one batch processing by inputting the feature data contained in each sample of the current batch into its horizontal federated neural network model for training;
the initiator client, with the cooperation of the participant client, calculates the aggregation gradient corresponding to the weight coefficient of each feature data contained in the horizontal federated neural network model after each batch processing, obtaining m aggregation gradients for the weight coefficient of each feature data;
the initiator client, with the cooperation of the participant client, calculates the aggregation gradient g_ij corresponding to the weight coefficient of the jth feature data contained in the horizontal federated neural network model after the ith batch processing, where 1 ≤ i ≤ m, through the following steps:
N1: the initiator client calculates the average gradient ga_ij corresponding to the weight coefficient of the jth feature data of its local horizontal federated neural network model after the ith batch processing, and the participant client calculates the average gradient gb_ij corresponding to the weight coefficient of the jth feature data of its local horizontal federated neural network model after the ith batch processing;
N2: the initiator client adds noise N to the average gradient ga_ij, obtaining the noisy gradient gan_ij = ga_ij + N;
the participant client adds noise N to the average gradient gb_ij, obtaining the noisy gradient gbn_ij = gb_ij + N;
the noise N is random noise obeying a normal distribution with expectation 0 and variance (Cσ)², where σ is a standard deviation and C is a noise coefficient;
N3: the participant client sends the noisy gradient gbn_ij to the initiator client, and the initiator client calculates the aggregation gradient g_ij corresponding to the weight coefficient of the jth feature data contained in the horizontal federated neural network model after the ith batch processing as g_ij = (gan_ij + gbn_ij)/2;
S4: the initiator client calculates the mean value of m aggregation gradients corresponding to the weight coefficient of each feature data to obtain an average aggregation gradient corresponding to the weight coefficient of each feature data, calculates the latest value of the weight coefficient of each feature data according to the learning rate mu and the average aggregation gradient corresponding to the weight coefficient of each feature data, and gives the latest value to the weight coefficient of each feature data contained in the lateral Federal neural network model of the local side;
the initiator client side sends the latest value of the weight coefficient of each feature data to the participant client side, and the participant client side gives the latest value to the weight coefficient of each feature data contained in the local lateral federal neural network model;
the initiator client calculates the latest value of the weight coefficient of the jth feature data according to the learning rate mu and the average aggregation gradient corresponding to the weight coefficient of the jth feature data, and gives the latest value to the weight coefficient of the jth feature data included in the local horizontal federated neural network model according to the following formula:
wherein f is j Is the weight coefficient, gm, of the jth feature data j The average aggregation gradient corresponding to the weight coefficient of the jth characteristic data;
S5: steps S3-S4 are repeated until the set iteration number T is reached.
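The whole S1-S5 loop of the embodiment can be sketched end to end. The callables `grad_a` and `grad_b` stand in for each client's per-batch average gradients ga_ij and gb_ij; these names and signatures, like the rest of the sketch, are illustrative assumptions rather than the patent's API:

```python
import random

def train_two_party(grad_a, grad_b, d, m, mu=0.1, sigma=1.0, C=0.1, T=50, seed=0):
    # grad_a(i, j, f) / grad_b(i, j, f): average gradient of batch i for
    # feature j at weights f, on the initiator / participant side.
    rng = random.Random(seed)
    f = [0.0] * d                                  # S1: synchronized initial weights
    for _ in range(T):                             # S5: repeat S3-S4 T times
        gm = [0.0] * d
        for i in range(m):                         # S3: m batches per iteration
            for j in range(d):
                gan = grad_a(i, j, f) + rng.gauss(0, C * sigma)  # N2 (initiator)
                gbn = grad_b(i, j, f) + rng.gauss(0, C * sigma)  # N2 (participant)
                gm[j] += (gan + gbn) / 2           # N3: aggregation gradient g_ij
        for j in range(d):
            f[j] -= mu * gm[j] / m                 # S4: f_j <- f_j - mu * gm_j
    return f
```

With a shared quadratic objective whose gradient at f is 2(f_j - 1), the noiseless run (C = 0) converges to f_j = 1, and lightly noised runs land nearby.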
Step N1 comprises the following steps:
N11: the initiator client calculates the average gradient ga_ij corresponding to the weight coefficient of the jth feature data of its local horizontal federated neural network model after the ith batch processing through the following specific steps:
M1: the initiator client inputs each sample (comprising feature data) of the ith batch processing into its horizontal federated neural network model and calculates the gradient corresponding to the weight coefficient of each feature data;
the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data, calculated after the initiator client inputs the rth sample of the ith batch processing into its horizontal federated neural network model for training, where 1 ≤ r ≤ u, u is the total number of samples contained in the ith batch, and 1 ≤ u ≤ p, is obtained as follows:
the rth sample of the ith batch processing is input into the local horizontal federated neural network model for training, and the partial derivative of the model function L(f, x) of the horizontal federated neural network model with respect to the weight coefficient f_j of the jth feature data x_j is taken, yielding the rth gradient gak_ijr = ∂L(f, x)/∂f_j corresponding to the weight coefficient f_j of the jth feature data x_j;
after the initiator client has input all u samples of the ith batch processing into its horizontal federated neural network model for training, u gradients can be calculated for the weight coefficient of each feature data; the u gradients corresponding to the weight coefficient of the jth feature data are gak_ij1, gak_ij2, …, gak_iju;
M2: the initiator client respectively standardizes u gradients corresponding to the weight coefficient of the jth feature data to obtain u standardized gradients;
the initiator client maps the r gradient gak corresponding to the weight coefficient of the j characteristic data ijr Normalizing to obtain the r normalized gradient gas ijr The calculation formula of (a) is as follows:
G=[gak ij1 ,gak ij2 ,……gak iju ],
M3: the initiator client calculates the average gradient ga_ij corresponding to the weight coefficient of the jth feature data as the mean of the u standardized gradients, ga_ij = (gas_ij1 + gas_ij2 + …… + gas_iju)/u;
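Steps M1-M3 might look as follows in Python. Both the model function (a squared-error linear model standing in for L(f, x)) and the standardization rule (clipping each per-sample gradient to magnitude C) are assumptions, since the filing's exact standardization formula is not reproduced in this text:

```python
import numpy as np

def per_sample_gradients(f, X, y):
    # M1 sketch: partial derivative of a squared-error model function with
    # respect to each weight coefficient, one row of gradients per sample
    # (a stand-in for the patent's model function L(f, x)).
    residual = X @ f - y
    return residual[:, None] * X             # shape (u, d)

def standardize(grads, C=1.0):
    # M2 stand-in: clip each per-sample gradient to magnitude C
    # (an assumed substitute for the filing's standardization formula).
    return np.clip(grads, -C, C)

def average_gradient(f, X, y):
    # M3: mean of the u standardized gradients per weight coefficient
    return standardize(per_sample_gradients(f, X, y)).mean(axis=0)

X = np.array([[1.0, 2.0], [3.0, 4.0]])       # one batch: u = 2 samples, d = 2 features
y = np.array([1.0, 2.0])
ga = average_gradient(np.zeros(2), X, y)     # average gradient per weight coefficient
```

The participant's steps F1-F3 are identical, run on its own samples.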
N12: the participant client calculates the average gradient gb_ij corresponding to the weight coefficient of the jth feature data of the local horizontal federated neural network model after the ith batch; the specific steps are as follows:
F1: the participant client inputs each sample (including its feature data) of the ith batch into its own horizontal federated neural network model for training, then calculates the gradient corresponding to the weight coefficient of each feature data;
the t-th gradient corresponding to the weight coefficient of the jth feature data, calculated after the participant client inputs the t-th sample of the ith batch into its own horizontal federated neural network model for training, is gbk_ijt, wherein 1 ≤ t ≤ v, v is the total number of samples contained in the ith batch, and 1 ≤ v ≤ q;
the t-th sample of the ith batch is input into the local horizontal federated neural network model for training, and the partial derivative of the model function L(f, x) of the horizontal federated neural network model with respect to the weight coefficient f_j of the jth feature data x_j is computed, yielding the t-th gradient gbk_ijt corresponding to the weight coefficient f_j of the jth feature data x_j;
after the participant client inputs all v samples of the ith batch into its own horizontal federated neural network model for training, v gradients corresponding to the weight coefficient of each feature data can be calculated; the v gradients corresponding to the weight coefficient of the jth feature data are gbk_ij1, gbk_ij2, ……, gbk_ijv;
F2: the participant client standardizes the v gradients corresponding to the weight coefficient of the jth feature data, obtaining v standardized gradients;
the t-th standardized gradient obtained by standardizing the t-th gradient gbk_ijt corresponding to the weight coefficient of the jth feature data is gbs_ijt;
F3: the participant client calculates the average gradient gb_ij corresponding to the weight coefficient of the jth feature data as the mean of the v standardized gradients, gb_ij = (gbs_ij1 + gbs_ij2 + …… + gbs_ijv)/v;
In this scheme, the initiator client and the participant client initialize the same horizontal federated neural network model and the same weight coefficient for each of the d feature data it contains; each sample used by the initiator client and the participant client to train the horizontal federated neural network model also contains the same d feature data.
The initiator client and the participant client synchronously set the batch count m, divide the samples used for training the horizontal federated neural network model into m batches, and complete batch alignment. Each then substitutes its m batches of local samples into its own horizontal federated neural network model for training and completes the m batches of processing. After each batch, the initiator client calculates a noisy gradient corresponding to the weight coefficient of each feature data contained in its horizontal federated neural network model, so each weight coefficient corresponds to m noisy gradients gan; likewise, the participant client calculates a noisy gradient per batch, so each weight coefficient corresponds to m noisy gradients gbn. The initiator client aggregates, batch by batch, the noisy gradient gan and the noisy gradient gbn obtained for the weight coefficient of each feature data in the same batch, obtaining an aggregated gradient g corresponding to the weight coefficient of each feature data.
Then, the initiator client calculates the mean of the m aggregated gradients corresponding to the weight coefficient of each feature data, computes the latest value of each weight coefficient in combination with the learning rate μ, and assigns it to the weight coefficients. The initiator client sends the calculated latest value of each weight coefficient to the participant client, and the participant client assigns it to the weight coefficients of its own model. Steps S3-S4 are repeated T times to complete the training of the horizontal federated neural network model.
In calculating the aggregated gradient g, the initiator client first calculates the average gradient ga corresponding to the weight coefficient of each feature data of its local horizontal federated neural network model after the current batch, and the participant client calculates the average gradient gb for the same batch; then, the initiator client adds noise N to each average gradient ga to obtain a noisy gradient gan, and the participant client adds noise N to each average gradient gb to obtain a noisy gradient gbn; finally, the initiator client calculates the mean of the noisy gradients gan and gbn corresponding to the weight coefficient of each feature data, obtaining the aggregated gradient g corresponding to the weight coefficient of each feature data.
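A minimal sketch of this noise-and-aggregate step, assuming the noise is drawn from a normal distribution with expectation 0 and standard deviation σC (the names and numeric values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)
sigma, C = 0.1, 1.0              # standard deviation and noise coefficient (assumed values)

def add_noise(avg_grad, scale=sigma * C):
    # Random noise with expectation 0 and variance (sigma*C)^2, drawn
    # independently for every batch and every weight coefficient.
    return avg_grad + rng.normal(0.0, scale, size=avg_grad.shape)

ga = np.array([0.2, 0.3, 0.25])  # initiator's average gradients (example values)
gb = np.array([0.6, 0.35, 0.4])  # participant's average gradients (example values)
g = (add_noise(ga) + add_noise(gb)) / 2   # aggregated gradient per coefficient
```

Because only the noisy gradients gan and gbn cross the network, neither party sees the other's true averages.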
Since the added noise N is normally distributed random noise with expectation 0 and variance σ²C², when the initiator client finally calculates the mean of the m aggregated gradients corresponding to the weight coefficient of each feature data, the noise N added by the initiator client and the noise N added by the participant client essentially cancel each other out; the calculated mean of the m aggregated gradients deviates only very slightly from the actual gradient mean, so the training accuracy of the horizontal federated neural network model is essentially unaffected.
In this method, while calculating the aggregated gradient g, the initiator client adds noise N to each average gradient ga it calculates, and the participant client adds noise N to each average gradient gb it calculates. Because the noise N is random, the initiator client cannot obtain the actual value of any average gradient gb calculated by the participant client, and the finally calculated mean of the m aggregated gradients deviates only slightly from the actual gradient mean; neither party can therefore reverse-engineer the other party's gradient values, and data security is protected. The method does not encrypt gradients with a cryptographic security protocol, so gradient computation is essentially as cheap as plaintext computation, model training is very efficient, and large-scale commercial use is convenient.
The method can be used for joint wind control modeling among financial institutions, and the characteristic data contained in the transverse federal neural network model can be income, age, monthly telephone charge, monthly repayment amount, debt total amount and the like of a user sample.
For example:
the transverse federal neural network model comprises characteristic data of income, age, monthly telephone charge, monthly repayment amount and arrearage total amount; the characteristic data contained in the sample used for training the transverse federal neural network model by the initiator client is income, age, monthly telephone charge, monthly repayment amount and arrears total amount; the participator client is used for training samples of the transverse federated neural network model to contain characteristic data of income, age, monthly telephone charge, monthly repayment amount and arrearage total amount;
the batch counts of the initiator client and the participant client are both 5;
the method comprises the steps that an initiator client calculates an average gradient corresponding to a weight coefficient of income characteristic data of the local horizontal federated neural network model after each batch processing to obtain an average gradient (0.2,0.3,0.25,0.4,0.6) corresponding to 5 batches of batch processing, the initiator client adds a corresponding noise N to the average gradient corresponding to each batch, the noise to be added to the average gradient corresponding to 5 batches of batch processing is (0.3,0.2,0, -0.1, -0.3), and the noise gradient obtained after the noise N is added to the average gradient corresponding to 5 batches of batch processing is (0.5,0.5,0.25,0.3, 0.3);
the participator client calculates the average gradient corresponding to the weight coefficient of the income characteristic data of the local horizontal federated neural network model after each batch processing to obtain the average gradient (0.6,0.35,0.4,0.2,0.5) corresponding to 5 batches of batch processing, the participator client adds a corresponding noise N to the average gradient corresponding to each batch processing, the noise to be added to the average gradient corresponding to 5 batches of batch processing is (-0.2,0.3,0.1,0.2, -0.3), and the noise gradient (0.4,0.65,0.5,0.4,0.2) is obtained after the noise N is added to the average gradient corresponding to 5 batches of batch processing;
the aggregation gradient corresponding to 5 batches with the addition of noise N is ((0.5,0.5,0.25,0.3,0.3) + (0.4,0.65,0.5,0.4, 0.2))/2 = (0.45,0.575,0.375,0.35,0.25), and the mean value is 0.4, i.e., the average aggregation gradient is 0.4;
the aggregation gradient corresponding to 5 batches without the addition of noise N is ((0.2,0.3,0.25,0.4,0.6) + (0.6,0.35,0.4,0.2, 0.5))/2 = (0.4,0.325,0.325,0.3,0.55), and the mean value is 0.38, i.e., the actual average aggregation gradient is 0.38;
it can be seen that after the gradient of the additive noise N is aggregated, most of the noise is cancelled, and the finally calculated average aggregated gradient value has a small deviation from the actual average aggregated gradient value. The data security is protected, and meanwhile, the gradient calculation complexity is basically consistent with that of a plaintext, so that the model training is very efficient, and the large-scale commercial use is convenient to realize.
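The arithmetic of the worked example can be verified directly (array names are illustrative; the numbers are the ones given above):

```python
import numpy as np

ga = np.array([0.2, 0.3, 0.25, 0.4, 0.6])    # initiator's 5 per-batch average gradients
na = np.array([0.3, 0.2, 0.0, -0.1, -0.3])   # noise the initiator adds per batch
gb = np.array([0.6, 0.35, 0.4, 0.2, 0.5])    # participant's 5 per-batch average gradients
nb = np.array([-0.2, 0.3, 0.1, 0.2, -0.3])   # noise the participant adds per batch

noisy_agg = ((ga + na) + (gb + nb)) / 2      # per-batch aggregated gradients with noise
actual_agg = (ga + gb) / 2                   # per-batch aggregated gradients without noise

print(noisy_agg.mean())    # ≈ 0.4  (average aggregated gradient with noise)
print(actual_agg.mean())   # ≈ 0.38 (actual average aggregated gradient)
```

The two means differ by only 0.02, illustrating how the independently drawn noise largely cancels in the final average.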
Claims (7)
1. A safe and efficient horizontal federated neural network model training method is characterized by comprising the following steps:
s1: the initiator client and the participant client synchronously initialize respective transverse federated neural network models and a weight coefficient of each feature data contained in the transverse federated neural network models;
S2: the initiator client and the participant client synchronously determine the batch count m, and the initiator client and the participant client each divide their samples used for training the horizontal federated neural network model into m batches;
S3: the initiator client completes the m batches of processing, and the participant client completes the m batches of processing;
the initiator client side calculates the aggregation gradient corresponding to the weight coefficient of each feature data contained in the transverse federated neural network model subjected to batch processing each time under the cooperation of the participant client side, and m aggregation gradients corresponding to the weight coefficient of each feature data are obtained;
the initiator client, in cooperation with the participant client, calculates the aggregation gradient g_ij corresponding to the weight coefficient of the jth feature data contained in the horizontal federated neural network model after the ith batch, wherein 1 ≤ i ≤ m, 1 ≤ j ≤ d, and d is the number of feature data contained in the horizontal federated neural network model, by the following steps:
N1: the initiator client calculates the average gradient ga_ij corresponding to the weight coefficient of the jth feature data of the local horizontal federated neural network model after the ith batch, and the participant client calculates the average gradient gb_ij corresponding to the weight coefficient of the jth feature data of the local horizontal federated neural network model after the ith batch;
N2: the initiator client adds noise N to the average gradient ga_ij to obtain the noisy gradient gan_ij, and the participant client adds noise N to the average gradient gb_ij to obtain the noisy gradient gbn_ij;
the noise N is random noise following a normal distribution with expectation 0 and variance σ²C², wherein σ is the standard deviation and C is the noise coefficient;
N3: the participant client sends the noisy gradient gbn_ij to the initiator client, and the initiator client calculates the aggregation gradient g_ij corresponding to the weight coefficient of the jth feature data contained in the horizontal federated neural network model after the ith batch, g_ij = (gan_ij + gbn_ij)/2;
S4: the initiator client calculates the mean of the m aggregation gradients corresponding to the weight coefficient of each feature data to obtain the average aggregation gradient corresponding to the weight coefficient of each feature data, calculates the latest value of the weight coefficient of each feature data according to the learning rate μ and the corresponding average aggregation gradient, and assigns the latest value to the weight coefficient of each feature data contained in the local horizontal federated neural network model;
the initiator client sends the latest value of the weight coefficient of each feature data to the participant client, and the participant client assigns the latest value to the weight coefficient of each feature data contained in the local horizontal federated neural network model;
s5: the steps S3-S4 are repeatedly executed until the set iteration number T is reached.
2. The safe and efficient method for training the horizontal federated neural network model as claimed in claim 1, wherein the step S2 includes the following steps:
the initiator client calculates the batch count m according to the number A of its samples used for training the horizontal federated neural network model and its batch size p, m = ⌈A/p⌉, wherein ⌈·⌉ denotes rounding up, sends m to the participant client, and divides its samples used for training the horizontal federated neural network model into m batches according to the batch size p;
the participant client calculates its batch size q according to the number B of its samples used for training the horizontal federated neural network model and the batch count m, q = ⌈B/m⌉, and divides its samples used for training the horizontal federated neural network model into m batches according to the batch size q.
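The batch-count and batch-size calculations of claim 2 amount to two ceiling divisions; a sketch with assumed sample counts:

```python
import math

A, p = 103, 10               # initiator: sample count A and batch size p (example values)
m = math.ceil(A / p)         # number of batches, m = ceil(A / p) -> 11

B = 87                       # participant: sample count B (example value)
q = math.ceil(B / m)         # participant batch size, q = ceil(B / m) -> 8
```

Both parties end up with the same number of batches m, which is what makes batch-by-batch gradient aggregation possible.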
3. The safe and efficient horizontal federated neural network model training method according to claim 1 or 2, wherein the initiator client completing one batch of processing in step S3 comprises: inputting the feature data contained in each sample of the current batch into the initiator client's horizontal federated neural network model for training;
the participant client completing one batch of processing in step S3 comprises: inputting the feature data contained in each sample of the current batch into the participant client's horizontal federated neural network model for training.
4. The safe and efficient method for training the transverse federated neural network model as claimed in claim 2, wherein the step N1 includes the following steps:
N11: the initiator client calculates the average gradient ga_ij corresponding to the weight coefficient of the jth feature data of the local horizontal federated neural network model after the ith batch; the specific steps are as follows:
M1: the initiator client inputs each sample of the ith batch into its own horizontal federated neural network model and then calculates the gradient corresponding to the weight coefficient of each feature data; the r-th gradient corresponding to the weight coefficient of the jth feature data, calculated after the r-th sample of the ith batch is input into the initiator client's horizontal federated neural network model, is gak_ijr, wherein 1 ≤ r ≤ u, u is the total number of samples contained in the ith batch, and 1 ≤ u ≤ p;
after the initiator client inputs all u samples of the ith batch into its own horizontal federated neural network model for training, u gradients corresponding to the weight coefficient of each feature data can be calculated; the u gradients corresponding to the weight coefficient of the jth feature data are gak_ij1, gak_ij2, ……, gak_iju;
M2: the initiator client respectively standardizes u gradients corresponding to the weight coefficient of the jth feature data to obtain u standardized gradients;
the r-th gradient gak corresponding to the weight coefficient of the j-th feature data ijr The r normalized gradient obtained by the normalization process is gas ijr ;
M3: the initiator client calculates an average gradient ga corresponding to the weight coefficient of the jth feature data ij ,
N12: the participant client calculates the average gradient gb_ij corresponding to the weight coefficient of the jth feature data of the local horizontal federated neural network model after the ith batch; the specific steps are as follows:
F1: the participant client inputs each sample of the ith batch into its own horizontal federated neural network model for training and calculates the gradient corresponding to the weight coefficient of each feature data; the t-th gradient corresponding to the weight coefficient of the jth feature data, calculated after the t-th sample of the ith batch is input into the participant client's horizontal federated neural network model for training, is gbk_ijt, wherein 1 ≤ t ≤ v, v is the total number of samples contained in the ith batch, and 1 ≤ v ≤ q;
after the participant client inputs all v samples of the ith batch into its own horizontal federated neural network model for training, v gradients corresponding to the weight coefficient of each feature data can be calculated; the v gradients corresponding to the weight coefficient of the jth feature data are gbk_ij1, gbk_ij2, ……, gbk_ijv;
F2: the participant client standardizes the v gradients corresponding to the weight coefficient of the jth feature data, obtaining v standardized gradients;
the t-th standardized gradient obtained by standardizing the t-th gradient gbk_ijt corresponding to the weight coefficient of the jth feature data is gbs_ijt;
F3: the participant client calculates the average gradient gb_ij corresponding to the weight coefficient of the jth feature data, gb_ij = (gbs_ij1 + gbs_ij2 + …… + gbs_ijv)/v.
5. The safe and efficient horizontal federated neural network model training method of claim 4, wherein in step M1 the initiator client inputs the r-th sample of the ith batch into its own horizontal federated neural network model for training and calculates the r-th gradient gak_ijr corresponding to the weight coefficient of the jth feature data as follows:
the r-th sample of the ith batch is input into the local horizontal federated neural network model for training, and the partial derivative of the model function of the horizontal federated neural network model with respect to the weight coefficient of the jth feature data is computed, yielding the r-th gradient gak_ijr corresponding to the weight coefficient of the jth feature data;
in step F1 the participant client inputs the t-th sample of the ith batch into its own horizontal federated neural network model for training and calculates the t-th gradient gbk_ijt corresponding to the weight coefficient of the jth feature data as follows:
the t-th sample of the ith batch is input into the local horizontal federated neural network model for training, and the partial derivative of the model function of the horizontal federated neural network model with respect to the weight coefficient of the jth feature data is computed, yielding the t-th gradient gbk_ijt corresponding to the weight coefficient of the jth feature data.
6. The safe and efficient horizontal federated neural network model training method of claim 4, wherein in step M2 the initiator client standardizes the r-th gradient gak_ijr corresponding to the weight coefficient of the jth feature data to obtain the r-th standardized gradient gas_ijr, computed over the vector G = [gak_ij1, gak_ij2, ……, gak_iju].
7. The safe and efficient horizontal federated neural network model training method according to claim 1 or 2, wherein in step S4 the initiator client calculates the latest value of the weight coefficient of the jth feature data according to the learning rate μ and the average aggregation gradient corresponding to the weight coefficient of the jth feature data, and assigns it to the weight coefficient of the jth feature data contained in the local horizontal federated neural network model, according to the following formula:

f_j = f_j − μ · gm_j

wherein f_j is the weight coefficient of the jth feature data, and gm_j is the average aggregation gradient corresponding to the weight coefficient of the jth feature data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210452869.4A CN114548429B (en) | 2022-04-27 | 2022-04-27 | Safe and efficient transverse federated neural network model training method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114548429A CN114548429A (en) | 2022-05-27 |
CN114548429B true CN114548429B (en) | 2022-08-12 |
Family
ID=81666726
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210452869.4A Active CN114548429B (en) | 2022-04-27 | 2022-04-27 | Safe and efficient transverse federated neural network model training method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114548429B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115994161B (en) * | 2023-03-21 | 2023-06-06 | 杭州金智塔科技有限公司 | Data aggregation system and method based on multiparty security calculation |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113723477A (en) * | 2021-08-16 | 2021-11-30 | 同盾科技有限公司 | Cross-feature federal abnormal data detection method based on isolated forest |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111522669A (en) * | 2020-04-29 | 2020-08-11 | 深圳前海微众银行股份有限公司 | Method, device and equipment for optimizing horizontal federated learning system and readable storage medium |
CN111898768A (en) * | 2020-08-06 | 2020-11-06 | 深圳前海微众银行股份有限公司 | Data processing method, device, equipment and medium |
CN112733967B (en) * | 2021-03-30 | 2021-06-29 | 腾讯科技(深圳)有限公司 | Model training method, device, equipment and storage medium for federal learning |
CN113515760B (en) * | 2021-05-28 | 2024-03-15 | 平安国际智慧城市科技股份有限公司 | Horizontal federal learning method, apparatus, computer device, and storage medium |
2022-04-27: CN application CN202210452869.4A, patent CN114548429B, status Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113723477A (en) * | 2021-08-16 | 2021-11-30 | 同盾科技有限公司 | Cross-feature federal abnormal data detection method based on isolated forest |
Also Published As
Publication number | Publication date |
---|---|
CN114548429A (en) | 2022-05-27 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |