CN114548429B - Safe and efficient horizontal federated neural network model training method - Google Patents


Info

Publication number: CN114548429B
Application number: CN202210452869.4A
Authority: CN (China)
Prior art keywords: weight coefficient, neural network, network model, gradient, client
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Other versions: CN114548429A
Inventors: 郭梁, 裴阳, 刘洋, 毛仁歆
Current Assignee: Lanxiang Zhilian Hangzhou Technology Co ltd
Original Assignee: Lanxiang Zhilian Hangzhou Technology Co ltd
Application filed 2022-04-27 by Lanxiang Zhilian Hangzhou Technology Co ltd
Priority to CN202210452869.4A
Publication of CN114548429A: 2022-05-27
Publication of CN114548429B (application granted): 2022-08-12

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a safe and efficient horizontal federated neural network model training method comprising the following steps. S1: the initiator and the participant synchronously initialize their models. S2: the initiator and the participant synchronize the batch count m. S3: the initiator and the participant each complete m batches; with the participant's cooperation, the initiator computes, for each batch, the aggregate gradient corresponding to the weight coefficient of each feature contained in the horizontal federated neural network model, obtaining m aggregate gradients per weight coefficient. S4: the initiator computes the latest value of each weight coefficient and sends it to the participant, and both parties assign the latest values to the weight coefficients of the features contained in their local horizontal federated neural network models. Steps S3-S4 are repeated until the set iteration count T is reached. The invention protects the data security of both the initiator and the participant, trains very efficiently, and facilitates large-scale commercial deployment.

Description

Safe and efficient horizontal federated neural network model training method
Technical Field
The invention relates to the technical field of neural network model training, and in particular to a safe and efficient horizontal federated neural network model training method.
Background
In horizontal federated learning risk-control scenarios, homomorphic encryption and secret sharing are the common security protocols. However, horizontal federated neural network modeling is computationally complex: if the federated neural network algorithm is built on homomorphic-encryption or secret-sharing operators, large-scale commercial deployment is hard to achieve, and training a complex horizontal federated model such as ResNet takes especially long. At the same time, financial risk-control scenarios impose strict security requirements, so a safe and efficient horizontal federated neural network model training method is needed.
Existing horizontal federated neural network modeling methods are mainly based on cryptographic security protocols such as secret sharing or homomorphic encryption, whose computational complexity is several times or even dozens of times that of plaintext computation, falling short of large-scale commercial requirements. Moreover, in a two-party scenario, even when the gradients are aggregated as secret-shared or homomorphically encrypted ciphertexts, each party's plaintext gradient can still be deduced in reverse. For example, if the initiator's gradient is 5, the participant's gradient is 3, and the initiator learns that the final aggregate gradient is 8, the initiator can deduce the participant's gradient as 8 - 5 = 3, regardless of whether the participant used secret sharing or homomorphic encryption.
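This leak does not depend on how the aggregate was computed. A minimal Python sketch, using the hypothetical values from the example above, makes it concrete:

```python
# Two-party leak: knowing one's own gradient and the exact aggregate
# reveals the peer's gradient, regardless of the aggregation protocol.
initiator_grad = 5.0
aggregate = 8.0                                 # learned after secure aggregation
participant_grad = aggregate - initiator_grad   # 3.0, recovered exactly
print(participant_grad)
```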
Disclosure of Invention
To solve these technical problems, the invention provides a safe and efficient horizontal federated neural network model training method. It adds random noise obeying a normal distribution so that the initiator's and the participant's gradients cannot be deduced in reverse by the other party, protecting data security; at the same time, the gradient computation complexity is essentially the same as plaintext computation, so model training is very efficient and large-scale commercial deployment is practical.
To this end, the invention adopts the following technical scheme:
The invention provides a safe and efficient horizontal federated neural network model training method, used for joint risk-control modeling between financial institutions, comprising the following steps:
S1: the initiator client and the participant client synchronously initialize their respective horizontal federated neural network models and the weight coefficient of each feature contained in those models;
S2: the initiator client and the participant client synchronize the batch count m, and each divides its own samples used for training the horizontal federated neural network model into m batches;
S3: the initiator client completes its m batches, and the participant client completes its m batches;
with the participant client's cooperation, the initiator client computes, for each batch, the aggregate gradient corresponding to the weight coefficient of each feature contained in the horizontal federated neural network model, obtaining m aggregate gradients per weight coefficient;
with the participant client's cooperation, the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch, where 1 ≤ i ≤ m, 1 ≤ j ≤ d, and d is the number of features contained in the horizontal federated neural network model, as follows:
N1: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch; the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch;
N2: the initiator client adds noise N to the average gradient ga_ij to obtain the noisy gradient gan_ij; the participant client adds noise N to the average gradient gb_ij to obtain the noisy gradient gbn_ij;
the noise N is normally distributed random noise with expectation 0 and variance $\sigma^2 C^2$, where σ is the standard deviation and C is the noise coefficient;
N3: the participant client sends the noisy gradient gbn_ij to the initiator client, and the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch: $g_{ij} = (gan_{ij} + gbn_{ij})/2$;
S4: the initiator client computes the mean of the m aggregate gradients corresponding to the weight coefficient of each feature to obtain an average aggregate gradient per weight coefficient, computes the latest value of each weight coefficient from the learning rate μ and that average aggregate gradient, and assigns the latest value to the weight coefficient of each feature contained in its local horizontal federated neural network model;
the initiator client sends the latest value of each weight coefficient to the participant client, and the participant client assigns it to the weight coefficient of each feature contained in its local horizontal federated neural network model;
S5: steps S3-S4 are repeated until the set iteration count T is reached.
In this scheme, the initiator client and the participant client initialize the same horizontal federated neural network model and the same weight coefficients for each feature; the model contains d features, and every sample used by either party to train the model contains the same d features.
The initiator client and the participant client synchronize the batch count m, divide their training samples into m batches, and complete batch alignment. Each party then feeds its m batches of samples into its local model for training and completes m batches. After every batch, the initiator client computes a noisy gradient for the weight coefficient of each feature of its local model, so each weight coefficient corresponds to m noisy gradients gan; likewise, the participant client computes m noisy gradients gbn per weight coefficient. For each batch, the initiator client aggregates the noisy gradients gan and gbn obtained from the same batch to produce the aggregate gradient g for the weight coefficient of each feature.
The initiator client then computes the mean of the m aggregate gradients per weight coefficient, combines it with the learning rate μ to compute the latest value of each weight coefficient, and assigns those values to its weight coefficients; it also sends the latest values to the participant client, which assigns them to its own weight coefficients. Steps S3-S4 are repeated T times to complete the training of the horizontal federated neural network model.
In computing the aggregate gradient g: first, the initiator client computes the average gradient ga for the weight coefficient of each feature of its local model after the current batch, and the participant client computes the average gradient gb for the same batch; next, the initiator client adds noise N to each ga to obtain the noisy gradient gan, and the participant client adds noise N to each gb to obtain the noisy gradient gbn; finally, the initiator client computes the mean of gan and gbn for each weight coefficient to obtain the aggregate gradient g.
Because the added noise N is normally distributed random noise with expectation 0 and variance $\sigma^2 C^2$, when the initiator client finally computes the mean of the m aggregate gradients per weight coefficient, the noise terms added by the initiator client and by the participant client essentially cancel out; the computed mean deviates only very slightly from the true gradient mean, so the training accuracy of the horizontal federated neural network model is essentially unaffected.
In this method, while computing the aggregate gradient g, the initiator client adds noise N to its own average gradients ga and the participant client adds noise N to its own average gradients gb. Because N is random noise, the initiator client cannot learn the actual values of the participant client's average gradients gb, and the final mean of the m aggregate gradients deviates slightly from the true gradient mean, so neither party can deduce the other's gradient values in reverse, and data security is protected. Since the method does not encrypt gradients with a cryptographic security protocol, the gradient computation complexity is essentially the same as plaintext computation, so model training is very efficient and large-scale commercial deployment is practical.
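The following is a minimal Python sketch of one batch of this noisy aggregation (steps N2-N3); the function name, the variable names, and the values of sigma and C are illustrative, not taken from the patent:

```python
import numpy as np

def noisy_average_gradient(avg_grad, sigma, C, rng):
    """Add zero-mean Gaussian noise with variance (sigma * C)^2 to an
    average gradient vector (step N2 of the scheme)."""
    return avg_grad + rng.normal(0.0, sigma * C, size=avg_grad.shape)

rng = np.random.default_rng(0)
sigma, C = 0.1, 1.0            # standard deviation and noise coefficient (assumed)

ga_i = np.array([0.20, 0.30, 0.25])   # initiator's average gradients, 3 features
gb_i = np.array([0.60, 0.35, 0.40])   # participant's average gradients

gan_i = noisy_average_gradient(ga_i, sigma, C, rng)   # stays on the initiator
gbn_i = noisy_average_gradient(gb_i, sigma, C, rng)   # sent to the initiator

g_i = (gan_i + gbn_i) / 2.0   # step N3: aggregate gradients for this batch
print(g_i)
```

Only the noisy vector gbn_i ever crosses the network, so the initiator sees the participant's gradients only through random noise.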
The method can be used for joint risk-control modeling between financial institutions; the features contained in the horizontal federated neural network model may be, for example, a user sample's income, age, monthly phone bill, monthly repayment amount, and total debt.
Preferably, step S2 comprises the following steps:
the initiator client computes the batch count m from the number A of its samples used for training the horizontal federated neural network model and its batch size p, $m = \lceil A/p \rceil$, where $\lceil \cdot \rceil$ denotes rounding up, and sends m to the participant client; the initiator client divides its training samples into m batches according to the batch size p;
the participant client computes its batch size q from the number B of its samples used for training the horizontal federated neural network model and the batch count m, $q = \lceil B/m \rceil$, and divides its training samples into m batches according to the batch size q.
The batch size p means a single batch processes at most p samples; the batch size q means a single batch processes at most q samples.
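A minimal sketch of this batch alignment; the sample counts and names are illustrative:

```python
import math

def split_into_batches(samples, batch_size):
    """Split samples into ceil(len(samples) / batch_size) batches of at
    most batch_size samples each (step S2)."""
    m = math.ceil(len(samples) / batch_size)
    return [samples[k * batch_size:(k + 1) * batch_size] for k in range(m)]

A, p = 103, 20                 # initiator: A samples, batch size p
m = math.ceil(A / p)           # m = 6, sent to the participant client

B = 250                        # participant: B samples
q = math.ceil(B / m)           # q = 42, the participant's batch size

initiator_batches = split_into_batches(list(range(A)), p)    # 6 batches
participant_batches = split_into_batches(list(range(B)), q)  # also 6 batches
```

Both parties end up with exactly m batches, which is what lets their per-batch noisy gradients be aggregated pairwise.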
Preferably, the initiator client's completing one batch in step S3 comprises: inputting the features contained in each sample of the current batch into the initiator client's horizontal federated neural network model for training;
and the participant client's completing one batch in step S3 comprises: inputting the features contained in each sample of the current batch into the participant client's horizontal federated neural network model for training.
Preferably, step N1 comprises the following steps:
N11: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
M1: the initiator client inputs each sample (including its features) of the ith batch into its local horizontal federated neural network model and computes the gradient corresponding to the weight coefficient of each feature; the rth gradient for the weight coefficient of the jth feature, computed after the rth sample of the ith batch is input into the model, is gak_ijr, where 1 ≤ r ≤ u, u is the total number of samples in the ith batch, and 1 ≤ u ≤ p;
after the initiator client inputs all u samples of the ith batch into its local model for training, it obtains u gradients per weight coefficient; the u gradients for the weight coefficient of the jth feature are gak_ij1, gak_ij2, ..., gak_iju;
M2: the initiator client normalizes each of the u gradients for the weight coefficient of the jth feature, obtaining u normalized gradients; the rth normalized gradient obtained from the rth gradient gak_ijr is gas_ijr;
M3: the initiator client computes the average gradient for the weight coefficient of the jth feature: $ga_{ij} = \frac{1}{u}\sum_{r=1}^{u} gas_{ijr}$;
N12: the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
F1: the participant client inputs each sample (including its features) of the ith batch into its local horizontal federated neural network model for training and computes the gradient corresponding to the weight coefficient of each feature; the tth gradient for the weight coefficient of the jth feature, computed after the tth sample of the ith batch is input into the model, is gbk_ijt, where 1 ≤ t ≤ v, v is the total number of samples in the ith batch, and 1 ≤ v ≤ q;
after the participant client inputs all v samples of the ith batch into its local model for training, it obtains v gradients per weight coefficient; the v gradients for the weight coefficient of the jth feature are gbk_ij1, gbk_ij2, ..., gbk_ijv;
F2: the participant client normalizes each of the v gradients for the weight coefficient of the jth feature, obtaining v normalized gradients; the tth normalized gradient obtained from the tth gradient gbk_ijt is gbs_ijt;
F3: the participant client computes the average gradient for the weight coefficient of the jth feature: $gb_{ij} = \frac{1}{v}\sum_{t=1}^{v} gbs_{ijt}$.
Preferably, in step M1, the initiator client's computing the rth gradient gak_ijr for the weight coefficient of the jth feature after inputting the rth sample of the ith batch into its local model for training comprises: taking the partial derivative of the model function of the horizontal federated neural network model with respect to the weight coefficient of the jth feature to obtain the rth gradient gak_ijr;
and in step F1, the participant client's computing the tth gradient gbk_ijt for the weight coefficient of the jth feature after inputting the tth sample of the ith batch into its local model for training comprises: taking the partial derivative of the model function of the horizontal federated neural network model with respect to the weight coefficient of the jth feature to obtain the tth gradient gbk_ijt.
Preferably, in step M2, the initiator client normalizes the rth gradient gak_ijr for the weight coefficient of the jth feature into the rth normalized gradient gas_ijr using the formula
$gas_{ijr} = \frac{gak_{ijr}}{\lVert G \rVert_2}$, where $G = [gak_{ij1}, gak_{ij2}, \ldots, gak_{iju}]$,
and $\lVert G \rVert_2$ denotes the 2-norm of the vector G.
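A minimal sketch of this 2-norm normalization and the subsequent averaging (steps M2-M3), with illustrative names:

```python
import numpy as np

def normalize_and_average(per_sample_grads):
    """per_sample_grads: 1-D array [gak_ij1, ..., gak_iju] of per-sample
    gradients for one weight coefficient. Divides each gradient by the
    2-norm of the whole vector (step M2), then averages (step M3)."""
    G = np.asarray(per_sample_grads, dtype=float)
    gas = G / np.linalg.norm(G, 2)   # normalized gradients gas_ij1..gas_iju
    return gas.mean()                # average gradient ga_ij

print(normalize_and_average([0.3, 0.4]))  # norm = 0.5 -> (0.6 + 0.8) / 2 = 0.7
```

Normalizing by the batch's 2-norm bounds the gradients' magnitude, which keeps the fixed-variance noise N added in step N2 proportionate across batches.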
Preferably, in step S4, the initiator client computes the latest value of the weight coefficient of the jth feature from the learning rate μ and the average aggregate gradient corresponding to that weight coefficient, and assigns it to the weight coefficient of the jth feature contained in its local horizontal federated neural network model, according to the formula
$f_j = f_j - \mu \cdot gm_j$,
where f_j is the weight coefficient of the jth feature and gm_j is the average aggregate gradient corresponding to the weight coefficient of the jth feature.
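A sketch of this update, assuming the m aggregate gradients per weight coefficient are stacked into a matrix; names and values are illustrative:

```python
import numpy as np

def update_weights(weights, agg_grads, mu):
    """weights: length-d vector [f_1..f_d]; agg_grads: (m, d) matrix whose
    row i holds the aggregate gradients g_i1..g_id of batch i. Averages
    the m aggregate gradients per coefficient, then takes one descent step
    f_j <- f_j - mu * gm_j (step S4)."""
    gm = np.asarray(agg_grads).mean(axis=0)   # average aggregate gradients gm_j
    return weights - mu * gm

f = np.array([0.5, -0.2])
g = [[0.45, 0.10], [0.35, 0.30]]      # m = 2 batches, d = 2 features
print(update_weights(f, g, mu=0.1))   # [0.46, -0.22]
```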
The beneficial effects of the invention are: adding random noise obeying a normal distribution protects each party's gradients from being deduced in reverse by the other party, protecting data security; at the same time, the gradient computation complexity is essentially the same as plaintext computation, so model training is very efficient and large-scale commercial deployment is practical.
Drawings
FIG. 1 is a flow chart of an embodiment.
Detailed Description
The technical scheme of the invention is further described through the following embodiment and the accompanying drawing.
Embodiment: the safe and efficient horizontal federated neural network model training method is used for joint risk-control modeling between financial institutions and, as shown in FIG. 1, comprises the following steps:
S1: the initiator client and the participant client synchronously initialize their respective horizontal federated neural network models and the weight coefficient of each feature contained in those models; the model function of the horizontal federated neural network model is L(f, x), with $f = [f_1, f_2, \ldots, f_d]$ and $x = [x_1, x_2, \ldots, x_d]$, where x_j denotes the jth feature, f_j denotes the weight coefficient of the jth feature x_j, 1 ≤ j ≤ d, and d is the number of features contained in the model;
S2: the initiator client computes the batch count m from the number A of its samples used for training the horizontal federated neural network model and its batch size p, $m = \lceil A/p \rceil$, where $\lceil \cdot \rceil$ denotes rounding up, and sends m to the participant client; the initiator client divides its training samples into m batches according to the batch size p, where the batch size p means a single batch processes at most p samples;
the participant client computes its batch size q from the number B of its samples used for training the horizontal federated neural network model and the batch count m, $q = \lceil B/m \rceil$, and divides its training samples into m batches according to the batch size q, where the batch size q means a single batch processes at most q samples;
S3: the initiator client feeds its m batches of samples into its local horizontal federated neural network model for training, completing m batches; completing one batch means inputting the features contained in each sample of the current batch into the initiator client's model for training;
the participant client feeds its m batches of samples into its local horizontal federated neural network model for training, completing m batches; completing one batch means inputting the features contained in each sample of the current batch into the participant client's model for training;
with the participant client's cooperation, the initiator client computes, for each batch, the aggregate gradient corresponding to the weight coefficient of each feature contained in the model, obtaining m aggregate gradients per weight coefficient;
with the participant client's cooperation, the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch, where 1 ≤ i ≤ m, as follows:
N1: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch; the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch;
N2: the initiator client adds noise N to the average gradient ga_ij to obtain the noisy gradient gan_ij: $gan_{ij} = ga_{ij} + N$;
the participant client adds noise N to the average gradient gb_ij to obtain the noisy gradient gbn_ij: $gbn_{ij} = gb_{ij} + N$;
the noise N is normally distributed random noise with expectation 0 and variance $\sigma^2 C^2$, i.e. $N \sim \mathcal{N}(0, \sigma^2 C^2)$, where σ is the standard deviation and C is the noise coefficient;
N3: the participant client sends the noisy gradient gbn_ij to the initiator client, and the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch: $g_{ij} = (gan_{ij} + gbn_{ij})/2$;
S4: the initiator client calculates the mean value of m aggregation gradients corresponding to the weight coefficient of each feature data to obtain an average aggregation gradient corresponding to the weight coefficient of each feature data, calculates the latest value of the weight coefficient of each feature data according to the learning rate mu and the average aggregation gradient corresponding to the weight coefficient of each feature data, and gives the latest value to the weight coefficient of each feature data contained in the lateral Federal neural network model of the local side;
the initiator client side sends the latest value of the weight coefficient of each feature data to the participant client side, and the participant client side gives the latest value to the weight coefficient of each feature data contained in the local lateral federal neural network model;
the initiator client calculates the latest value of the weight coefficient of the jth feature data according to the learning rate mu and the average aggregation gradient corresponding to the weight coefficient of the jth feature data, and gives the latest value to the weight coefficient of the jth feature data included in the local horizontal federated neural network model according to the following formula:
Figure 787178DEST_PATH_IMAGE010
wherein f is j Is the weight coefficient, gm, of the jth feature data j The average aggregation gradient corresponding to the weight coefficient of the jth characteristic data;
s5: the steps S3-S4 are repeatedly executed until the set iteration number T is reached.
Step N1 comprises the following steps:
N11: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
M1: the initiator client inputs each sample (including its features) of the ith batch into its local horizontal federated neural network model and computes the gradient corresponding to the weight coefficient of each feature;
the rth gradient gak_ijr for the weight coefficient of the jth feature, computed after the rth sample of the ith batch is input into the model for training, where 1 ≤ r ≤ u, u is the total number of samples in the ith batch, and 1 ≤ u ≤ p, is obtained by taking the partial derivative of the model function L(f, x) with respect to the weight coefficient f_j of the jth feature x_j, evaluated at that sample:
$gak_{ijr} = \frac{\partial L(f, x)}{\partial f_j}$;
after the initiator client inputs all u samples of the ith batch into its local model for training, it obtains u gradients per weight coefficient; the u gradients for the weight coefficient of the jth feature are gak_ij1, gak_ij2, ..., gak_iju;
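The patent leaves the model function L(f, x) generic; the following sketch uses a hypothetical squared-error linear model purely to illustrate what the per-sample partial derivative looks like:

```python
import numpy as np

def per_sample_gradients(f, x, y):
    """Gradients of a hypothetical squared-error model
    L(f, x) = (f . x - y)^2 with respect to each weight f_j:
    dL/df_j = 2 * (f . x - y) * x_j."""
    residual = np.dot(f, x) - y
    return 2.0 * residual * x   # one gak-style gradient per coefficient j

f = np.array([0.5, -1.0])       # weight coefficients f_1, f_2
x = np.array([2.0, 1.0])        # one sample's features x_1, x_2
y = 1.0                         # the sample's label
print(per_sample_gradients(f, x, y))  # residual = -1 -> [-4., -2.]
```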
M2: the initiator client normalizes each of the u gradients for the weight coefficient of the jth feature, obtaining u normalized gradients;
the initiator client normalizes the rth gradient gak_ijr for the weight coefficient of the jth feature into the rth normalized gradient gas_ijr using the formula
$gas_{ijr} = \frac{gak_{ijr}}{\lVert G \rVert_2}$, where $G = [gak_{ij1}, gak_{ij2}, \ldots, gak_{iju}]$,
and $\lVert G \rVert_2$ denotes the 2-norm of the vector G;
M3: the initiator client computes the average gradient for the weight coefficient of the jth feature: $ga_{ij} = \frac{1}{u}\sum_{r=1}^{u} gas_{ijr}$;
N12: the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
F1: the participant client inputs each sample (including its features) of the ith batch into its local horizontal federated neural network model for training and computes the gradient corresponding to the weight coefficient of each feature;
the tth gradient gbk_ijt for the weight coefficient of the jth feature, computed after the tth sample of the ith batch is input into the model for training, where 1 ≤ t ≤ v, v is the total number of samples in the ith batch, and 1 ≤ v ≤ q, is obtained by taking the partial derivative of the model function L(f, x) with respect to the weight coefficient f_j of the jth feature x_j, evaluated at that sample:
$gbk_{ijt} = \frac{\partial L(f, x)}{\partial f_j}$;
after the participant client inputs all v samples of the ith batch into its local model for training, it obtains v gradients per weight coefficient; the v gradients for the weight coefficient of the jth feature are gbk_ij1, gbk_ij2, ..., gbk_ijv;
F2: the participant client normalizes each of the v gradients for the weight coefficient of the jth feature, obtaining v normalized gradients; the tth normalized gradient obtained from the tth gradient gbk_ijt is gbs_ijt;
F3: the participant client computes the average gradient for the weight coefficient of the jth feature: $gb_{ij} = \frac{1}{v}\sum_{t=1}^{v} gbs_{ijt}$.
In this scheme, the initiator client and the participant client initialize the same horizontal federated neural network model and the same weight coefficients for each feature; the model contains d features, and every sample used by either party to train the model contains the same d features.
The initiator client and the participant client synchronize the batch count m, divide their training samples into m batches, and complete batch alignment. Each party then feeds its m batches of samples into its local model for training and completes m batches. After every batch, the initiator client computes a noisy gradient for the weight coefficient of each feature of its local model, so each weight coefficient corresponds to m noisy gradients gan; likewise, the participant client computes m noisy gradients gbn per weight coefficient. For each batch, the initiator client aggregates the noisy gradients gan and gbn obtained from the same batch to produce the aggregate gradient g for the weight coefficient of each feature.
The initiator client then computes the mean of the m aggregate gradients per weight coefficient, combines it with the learning rate μ to compute the latest value of each weight coefficient, and assigns those values to its weight coefficients; it also sends the latest values to the participant client, which assigns them to its own weight coefficients. Steps S3-S4 are repeated T times to complete the training of the horizontal federated neural network model.
In computing the aggregate gradient g: first, the initiator client computes the average gradient ga for the weight coefficient of each feature of its local model after the current batch, and the participant client computes the average gradient gb for the same batch; next, the initiator client adds noise N to each ga to obtain the noisy gradient gan, and the participant client adds noise N to each gb to obtain the noisy gradient gbn; finally, the initiator client computes the mean of gan and gbn for each weight coefficient to obtain the aggregate gradient g.
Because the added noise N is normally distributed random noise with expectation 0 and variance $\sigma^2 C^2$, when the initiator client finally computes the mean of the m aggregate gradients per weight coefficient, the noise terms added by the initiator client and by the participant client essentially cancel out; the computed mean deviates only very slightly from the true gradient mean, so the training accuracy of the horizontal federated neural network model is essentially unaffected.
In this method, while computing the aggregate gradient g, the initiator client adds noise N to its own average gradients ga and the participant client adds noise N to its own average gradients gb. Because N is random noise, the initiator client cannot learn the actual values of the participant client's average gradients gb, and the final mean of the m aggregate gradients deviates slightly from the true gradient mean, so neither party can deduce the other's gradient values in reverse, and data security is protected. Since the method does not encrypt gradients with a cryptographic security protocol, the gradient computation complexity is essentially the same as plaintext computation, so model training is very efficient and large-scale commercial deployment is practical.
The method can be used for joint risk-control modeling between financial institutions; the features contained in the horizontal federated neural network model may be, for example, a user sample's income, age, monthly phone bill, monthly repayment amount, and total debt.
By way of example:
the horizontal federated neural network model contains the features income, age, monthly phone bill, monthly repayment amount, and total debt; each sample used by the initiator client for training contains these same features, and likewise each sample used by the participant client;
the batch counts of the initiator client and the participant client are both 5;
the initiator client computes, after each batch, the average gradient for the weight coefficient of the income feature of its local model, obtaining the per-batch average gradients (0.2, 0.3, 0.25, 0.4, 0.6); it adds a corresponding noise N to each batch's average gradient, the noise being (0.3, 0.2, 0, -0.1, -0.3), which yields the noisy gradients (0.5, 0.5, 0.25, 0.3, 0.3);
the participant client computes, after each batch, the average gradient for the weight coefficient of the income feature of its local model, obtaining the per-batch average gradients (0.6, 0.35, 0.4, 0.2, 0.5); it adds a corresponding noise N to each batch's average gradient, the noise being (-0.2, 0.3, 0.1, 0.2, -0.3), which yields the noisy gradients (0.4, 0.65, 0.5, 0.4, 0.2);
the aggregate gradients for the 5 batches with noise N added are ((0.5, 0.5, 0.25, 0.3, 0.3) + (0.4, 0.65, 0.5, 0.4, 0.2)) / 2 = (0.45, 0.575, 0.375, 0.35, 0.25), whose mean is 0.4; that is, the average aggregate gradient is 0.4;
the aggregate gradients for the 5 batches without noise N would be ((0.2, 0.3, 0.25, 0.4, 0.6) + (0.6, 0.35, 0.4, 0.2, 0.5)) / 2 = (0.4, 0.325, 0.325, 0.3, 0.55), whose mean is 0.38; that is, the true average aggregate gradient is 0.38;
it can be seen that after the noisy gradients are aggregated, most of the noise cancels out, and the final average aggregate gradient deviates only slightly from the true value. Data security is protected while the gradient computation complexity remains essentially the same as plaintext computation, so model training is very efficient and large-scale commercial deployment is practical.

Claims (7)

1. A safe and efficient horizontal federated neural network model training method, characterized by comprising the following steps:
S1: the initiator client and the participant client synchronously initialize their respective horizontal federated neural network models and the weight coefficient of each feature contained in those models;
S2: the initiator client and the participant client synchronize the batch count m, and each divides its own samples used for training the horizontal federated neural network model into m batches;
S3: the initiator client completes its m batches, and the participant client completes its m batches;
with the participant client's cooperation, the initiator client computes, for each batch, the aggregate gradient corresponding to the weight coefficient of each feature contained in the horizontal federated neural network model, obtaining m aggregate gradients per weight coefficient;
with the participant client's cooperation, the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch, where 1 ≤ i ≤ m, 1 ≤ j ≤ d, and d is the number of features contained in the horizontal federated neural network model, as follows:
N1: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch; the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch;
N2: the initiator client adds noise N to the average gradient ga_ij to obtain the noisy gradient gan_ij; the participant client adds noise N to the average gradient gb_ij to obtain the noisy gradient gbn_ij;
the noise N is normally distributed random noise with expectation 0 and variance $\sigma^2 C^2$, where σ is the standard deviation and C is the noise coefficient;
N3: the participant client sends the noisy gradient gbn_ij to the initiator client, and the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch: $g_{ij} = (gan_{ij} + gbn_{ij})/2$;
S4: the initiator client computes the mean of the m aggregate gradients corresponding to the weight coefficient of each feature to obtain an average aggregate gradient per weight coefficient, computes the latest value of each weight coefficient from the learning rate μ and that average aggregate gradient, and assigns the latest value to the weight coefficient of each feature contained in its local horizontal federated neural network model;
the initiator client sends the latest value of each weight coefficient to the participant client, and the participant client assigns it to the weight coefficient of each feature contained in its local horizontal federated neural network model;
S5: steps S3-S4 are repeated until the set iteration count T is reached.
2. The safe and efficient horizontal federated neural network model training method of claim 1, wherein step S2 comprises the following steps:
the initiator client computes the batch count m from the number A of its samples used for training the horizontal federated neural network model and its batch size p, $m = \lceil A/p \rceil$, where $\lceil \cdot \rceil$ denotes rounding up, and sends m to the participant client; the initiator client divides its training samples into m batches according to the batch size p;
the participant client computes its batch size q from the number B of its samples used for training the horizontal federated neural network model and the batch count m, $q = \lceil B/m \rceil$, and divides its training samples into m batches according to the batch size q.
3. The safe and efficient horizontal federated neural network model training method of claim 1 or 2, wherein the initiator client's completing one batch in step S3 comprises: inputting the features contained in each sample of the current batch into the initiator client's horizontal federated neural network model for training;
and the participant client's completing one batch in step S3 comprises: inputting the features contained in each sample of the current batch into the participant client's horizontal federated neural network model for training.
4. The safe and efficient horizontal federated neural network model training method of claim 2, wherein step N1 comprises the following steps:
N11: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
M1: the initiator client inputs each sample of the ith batch into its local horizontal federated neural network model and computes the gradient corresponding to the weight coefficient of each feature; the rth gradient for the weight coefficient of the jth feature, computed after the rth sample of the ith batch is input into the model, is gak_ijr, where 1 ≤ r ≤ u, u is the total number of samples in the ith batch, and 1 ≤ u ≤ p;
after the initiator client inputs all u samples of the ith batch into its local model for training, it obtains u gradients per weight coefficient; the u gradients for the weight coefficient of the jth feature are gak_ij1, gak_ij2, ..., gak_iju;
M2: the initiator client normalizes each of the u gradients for the weight coefficient of the jth feature, obtaining u normalized gradients; the rth normalized gradient obtained from the rth gradient gak_ijr is gas_ijr;
M3: the initiator client computes the average gradient for the weight coefficient of the jth feature: $ga_{ij} = \frac{1}{u}\sum_{r=1}^{u} gas_{ijr}$;
N12: the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
F1: the participant client inputs each sample of the ith batch into its local horizontal federated neural network model for training and computes the gradient corresponding to the weight coefficient of each feature; the tth gradient for the weight coefficient of the jth feature, computed after the tth sample of the ith batch is input into the model for training, is gbk_ijt, where 1 ≤ t ≤ v, v is the total number of samples in the ith batch, and 1 ≤ v ≤ q;
after the participant client inputs all v samples of the ith batch into its local model for training, it obtains v gradients per weight coefficient; the v gradients for the weight coefficient of the jth feature are gbk_ij1, gbk_ij2, ..., gbk_ijv;
F2: the participant client normalizes each of the v gradients for the weight coefficient of the jth feature, obtaining v normalized gradients; the tth normalized gradient obtained from the tth gradient gbk_ijt is gbs_ijt;
F3: the participant client computes the average gradient for the weight coefficient of the jth feature: $gb_{ij} = \frac{1}{v}\sum_{t=1}^{v} gbs_{ijt}$.
5. The safe and efficient transverse federated neural network model training method according to claim 4, wherein in step M1 the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data, calculated after the initiator client inputs the rth sample of the ith batch into its local transverse federated neural network model for training, is obtained as follows:
the rth sample of the ith batch is input into the local transverse federated neural network model for training, and the partial derivative of the model function of the transverse federated neural network model with respect to the weight coefficient of the jth feature data is taken, giving the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data;
in step F1, the tth gradient gbk_ijt corresponding to the weight coefficient of the jth feature data, calculated after the participant client inputs the tth sample of the ith batch into its local transverse federated neural network model for training, is obtained as follows:
the tth sample of the ith batch is input into the local transverse federated neural network model for training, and the partial derivative of the model function of the transverse federated neural network model with respect to the weight coefficient of the jth feature data is taken, giving the tth gradient gbk_ijt corresponding to the weight coefficient of the jth feature data.
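The claims leave the model function itself unspecified. As a hedged illustration, if the per-sample loss were the squared error L = (w·x - y)^2 / 2 of a single linear layer (an assumption, not the patent's model), the partial derivative with respect to the jth weight coefficient has the closed form below (the function name is likewise an assumption):

    def gradient_wrt_weight(w, x, y, j):
        # dL/dw_j for L = 0.5 * (w.x - y)^2 is (w.x - y) * x[j]; one call per
        # sample yields the per-sample gradients gak_ijr (initiator) or
        # gbk_ijt (participant).
        residual = sum(wk * xk for wk, xk in zip(w, x)) - y
        return residual * x[j]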
6. The safe and efficient transverse federated neural network model training method according to claim 4, wherein in step M2 the initiator client normalizes the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data to obtain the rth normalized gradient gas_ijr according to the following formula:

$$ gas_{ijr} = \frac{gak_{ijr}}{\lVert G \rVert_2}, \qquad G = [\, gak_{ij1}, gak_{ij2}, \ldots, gak_{iju} \,], $$

where $\lVert G \rVert_2$ denotes the 2-norm of the vector G.
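As a quick numerical check with illustrative values: for u = 2 and G = [3, 4], $\lVert G \rVert_2 = \sqrt{3^2 + 4^2} = 5$, so gas_ij1 = 3/5 = 0.6 and gas_ij2 = 4/5 = 0.8, and the step-M3 average is ga_ij = (0.6 + 0.8)/2 = 0.7.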
7. The safe and efficient transverse federated neural network model training method according to claim 1 or 2, wherein in step S4 the initiator client calculates the latest value of the weight coefficient of the jth feature data from the learning rate μ and the average aggregation gradient corresponding to that weight coefficient, and assigns it to the weight coefficient of the jth feature data in its local transverse federated neural network model, according to the following formula:

$$ f_j = f_j - \mu \cdot gm_j, $$

where f_j is the weight coefficient of the jth feature data and gm_j is the average aggregation gradient corresponding to the weight coefficient of the jth feature data.
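A minimal sketch of this step-S4 update, with the vectorized form and the function name as assumptions:

    def update_weights(f, gm, mu):
        # Applies f_j = f_j - mu * gm_j to every weight coefficient at once.
        return [fj - mu * gmj for fj, gmj in zip(f, gm)]

    # e.g. update_weights(f=[0.5, -0.2], gm=[0.7, 0.1], mu=0.01) -> [0.493, -0.201]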
Priority Applications (1)

Application Number: CN202210452869.4A
Priority Date / Filing Date: 2022-04-27
Title: Safe and efficient transverse federated neural network model training method
Status: Active

Publications (2)

Publication Number  Publication Date
CN114548429A  2022-05-27
CN114548429B  2022-08-12

Family ID: 81666726


Legal Events

Code  Description
PB01  Publication
SE01  Entry into force of request for substantive examination
GR01  Patent grant