CN114548429B - Safe and efficient horizontal federated neural network model training method - Google Patents


Info

Publication number: CN114548429B
Application number: CN202210452869.4A
Authority: CN (China)
Prior art keywords: weight coefficient, neural network, network model, gradient, client
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Other versions: CN114548429A
Inventors: 郭梁, 裴阳, 刘洋, 毛仁歆
Current Assignee: Lanxiang Zhilian Hangzhou Technology Co ltd
Original Assignee: Lanxiang Zhilian Hangzhou Technology Co ltd
Application filed 2022-04-27 by Lanxiang Zhilian Hangzhou Technology Co ltd
Priority to CN202210452869.4A
Publication of CN114548429A: 2022-05-27
Publication of CN114548429B (application granted): 2022-08-12

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a safe and efficient horizontal federated neural network model training method comprising the following steps. S1: the initiator and the participant synchronously initialize their models. S2: the initiator and the participant synchronize the batch count m. S3: the initiator and the participant each complete m batches; with the participant's cooperation, the initiator computes, for each batch, the aggregate gradient corresponding to the weight coefficient of each feature contained in the horizontal federated neural network model, obtaining m aggregate gradients per weight coefficient. S4: the initiator computes the latest value of each weight coefficient and sends it to the participant, and both parties assign the latest values to the weight coefficients of the features contained in their local horizontal federated neural network models. Steps S3-S4 are repeated until the set iteration count T is reached. The invention protects the data security of both the initiator and the participant, trains very efficiently, and facilitates large-scale commercial deployment.

Description

Safe and efficient horizontal federated neural network model training method
Technical Field
The invention relates to the technical field of neural network model training, and in particular to a safe and efficient horizontal federated neural network model training method.
Background
In horizontal federated learning risk-control scenarios, homomorphic encryption and secret sharing are the common security protocols. However, horizontal federated neural network modeling is computationally complex: if the federated neural network algorithm is built on homomorphic-encryption or secret-sharing operators, large-scale commercial deployment is hard to achieve, and training a complex horizontal federated model such as ResNet takes especially long. At the same time, financial risk-control scenarios impose strict security requirements, so a safe and efficient horizontal federated neural network model training method is needed.
Existing horizontal federated neural network modeling methods are mainly based on cryptographic security protocols such as secret sharing or homomorphic encryption, whose computational complexity is several times or even dozens of times that of plaintext computation, falling short of large-scale commercial requirements. Moreover, in a two-party scenario, even when the gradients are aggregated as secret-shared or homomorphically encrypted ciphertexts, each party's plaintext gradient can still be deduced in reverse. For example, if the initiator's gradient is 5, the participant's gradient is 3, and the initiator learns that the final aggregate gradient is 8, the initiator can deduce the participant's gradient as 8 - 5 = 3, regardless of whether the participant used secret sharing or homomorphic encryption.
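This leak does not depend on how the aggregate was computed. A minimal Python sketch, using the hypothetical values from the example above, makes it concrete:

```python
# Two-party leak: knowing one's own gradient and the exact aggregate
# reveals the peer's gradient, regardless of the aggregation protocol.
initiator_grad = 5.0
aggregate = 8.0                                 # learned after secure aggregation
participant_grad = aggregate - initiator_grad   # 3.0, recovered exactly
print(participant_grad)
```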
Disclosure of Invention
To solve these technical problems, the invention provides a safe and efficient horizontal federated neural network model training method. It adds random noise obeying a normal distribution so that the initiator's and the participant's gradients cannot be deduced in reverse by the other party, protecting data security; at the same time, the gradient computation complexity is essentially the same as plaintext computation, so model training is very efficient and large-scale commercial deployment is practical.
To this end, the invention adopts the following technical scheme:
The invention provides a safe and efficient horizontal federated neural network model training method, used for joint risk-control modeling between financial institutions, comprising the following steps:
S1: the initiator client and the participant client synchronously initialize their respective horizontal federated neural network models and the weight coefficient of each feature contained in those models;
S2: the initiator client and the participant client synchronize the batch count m, and each divides its own samples used for training the horizontal federated neural network model into m batches;
S3: the initiator client completes its m batches, and the participant client completes its m batches;
with the participant client's cooperation, the initiator client computes, for each batch, the aggregate gradient corresponding to the weight coefficient of each feature contained in the horizontal federated neural network model, obtaining m aggregate gradients per weight coefficient;
with the participant client's cooperation, the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch, where 1 ≤ i ≤ m, 1 ≤ j ≤ d, and d is the number of features contained in the horizontal federated neural network model, as follows:
N1: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch; the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch;
N2: the initiator client adds noise N to the average gradient ga_ij to obtain the noisy gradient gan_ij; the participant client adds noise N to the average gradient gb_ij to obtain the noisy gradient gbn_ij;
the noise N is normally distributed random noise with expectation 0 and variance $\sigma^2 C^2$, where σ is the standard deviation and C is the noise coefficient;
N3: the participant client sends the noisy gradient gbn_ij to the initiator client, and the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch: $g_{ij} = (gan_{ij} + gbn_{ij})/2$;
S4: the initiator client computes the mean of the m aggregate gradients corresponding to the weight coefficient of each feature to obtain an average aggregate gradient per weight coefficient, computes the latest value of each weight coefficient from the learning rate μ and that average aggregate gradient, and assigns the latest value to the weight coefficient of each feature contained in its local horizontal federated neural network model;
the initiator client sends the latest value of each weight coefficient to the participant client, and the participant client assigns it to the weight coefficient of each feature contained in its local horizontal federated neural network model;
S5: steps S3-S4 are repeated until the set iteration count T is reached.
In this scheme, the initiator client and the participant client initialize the same horizontal federated neural network model and the same weight coefficients for each feature; the model contains d features, and every sample used by either party to train the model contains the same d features.
The initiator client and the participant client synchronize the batch count m, divide their training samples into m batches, and complete batch alignment. Each party then feeds its m batches of samples into its local model for training and completes m batches. After every batch, the initiator client computes a noisy gradient for the weight coefficient of each feature of its local model, so each weight coefficient corresponds to m noisy gradients gan; likewise, the participant client computes m noisy gradients gbn per weight coefficient. For each batch, the initiator client aggregates the noisy gradients gan and gbn obtained from the same batch to produce the aggregate gradient g for the weight coefficient of each feature.
The initiator client then computes the mean of the m aggregate gradients per weight coefficient, combines it with the learning rate μ to compute the latest value of each weight coefficient, and assigns those values to its weight coefficients; it also sends the latest values to the participant client, which assigns them to its own weight coefficients. Steps S3-S4 are repeated T times to complete the training of the horizontal federated neural network model.
In computing the aggregate gradient g: first, the initiator client computes the average gradient ga for the weight coefficient of each feature of its local model after the current batch, and the participant client computes the average gradient gb for the same batch; next, the initiator client adds noise N to each ga to obtain the noisy gradient gan, and the participant client adds noise N to each gb to obtain the noisy gradient gbn; finally, the initiator client computes the mean of gan and gbn for each weight coefficient to obtain the aggregate gradient g.
Because the added noise N is normally distributed random noise with expectation 0 and variance $\sigma^2 C^2$, when the initiator client finally computes the mean of the m aggregate gradients per weight coefficient, the noise terms added by the initiator client and by the participant client essentially cancel out; the computed mean deviates only very slightly from the true gradient mean, so the training accuracy of the horizontal federated neural network model is essentially unaffected.
In this method, while computing the aggregate gradient g, the initiator client adds noise N to its own average gradients ga and the participant client adds noise N to its own average gradients gb. Because N is random noise, the initiator client cannot learn the actual values of the participant client's average gradients gb, and the final mean of the m aggregate gradients deviates slightly from the true gradient mean, so neither party can deduce the other's gradient values in reverse, and data security is protected. Since the method does not encrypt gradients with a cryptographic security protocol, the gradient computation complexity is essentially the same as plaintext computation, so model training is very efficient and large-scale commercial deployment is practical.
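The following is a minimal Python sketch of one batch of this noisy aggregation (steps N2-N3); the function name, the variable names, and the values of sigma and C are illustrative, not taken from the patent:

```python
import numpy as np

def noisy_average_gradient(avg_grad, sigma, C, rng):
    """Add zero-mean Gaussian noise with variance (sigma * C)^2 to an
    average gradient vector (step N2 of the scheme)."""
    return avg_grad + rng.normal(0.0, sigma * C, size=avg_grad.shape)

rng = np.random.default_rng(0)
sigma, C = 0.1, 1.0            # standard deviation and noise coefficient (assumed)

ga_i = np.array([0.20, 0.30, 0.25])   # initiator's average gradients, 3 features
gb_i = np.array([0.60, 0.35, 0.40])   # participant's average gradients

gan_i = noisy_average_gradient(ga_i, sigma, C, rng)   # stays on the initiator
gbn_i = noisy_average_gradient(gb_i, sigma, C, rng)   # sent to the initiator

g_i = (gan_i + gbn_i) / 2.0   # step N3: aggregate gradients for this batch
print(g_i)
```

Only the noisy vector gbn_i ever crosses the network, so the initiator sees the participant's gradients only through random noise.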
The method can be used for joint risk-control modeling between financial institutions; the features contained in the horizontal federated neural network model may be, for example, a user sample's income, age, monthly phone bill, monthly repayment amount, and total debt.
Preferably, step S2 comprises the following steps:
the initiator client computes the batch count m from the number A of its samples used for training the horizontal federated neural network model and its batch size p, $m = \lceil A/p \rceil$, where $\lceil \cdot \rceil$ denotes rounding up, and sends m to the participant client; the initiator client divides its training samples into m batches according to the batch size p;
the participant client computes its batch size q from the number B of its samples used for training the horizontal federated neural network model and the batch count m, $q = \lceil B/m \rceil$, and divides its training samples into m batches according to the batch size q.
The batch size p means a single batch processes at most p samples; the batch size q means a single batch processes at most q samples.
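A minimal sketch of this batch alignment; the sample counts and names are illustrative:

```python
import math

def split_into_batches(samples, batch_size):
    """Split samples into ceil(len(samples) / batch_size) batches of at
    most batch_size samples each (step S2)."""
    m = math.ceil(len(samples) / batch_size)
    return [samples[k * batch_size:(k + 1) * batch_size] for k in range(m)]

A, p = 103, 20                 # initiator: A samples, batch size p
m = math.ceil(A / p)           # m = 6, sent to the participant client

B = 250                        # participant: B samples
q = math.ceil(B / m)           # q = 42, the participant's batch size

initiator_batches = split_into_batches(list(range(A)), p)    # 6 batches
participant_batches = split_into_batches(list(range(B)), q)  # also 6 batches
```

Both parties end up with exactly m batches, which is what lets their per-batch noisy gradients be aggregated pairwise.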
Preferably, the initiator client's completing one batch in step S3 comprises: inputting the features contained in each sample of the current batch into the initiator client's horizontal federated neural network model for training;
and the participant client's completing one batch in step S3 comprises: inputting the features contained in each sample of the current batch into the participant client's horizontal federated neural network model for training.
Preferably, step N1 comprises the following steps:
N11: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
M1: the initiator client inputs each sample (including its features) of the ith batch into its local horizontal federated neural network model and computes the gradient corresponding to the weight coefficient of each feature; the rth gradient for the weight coefficient of the jth feature, computed after the rth sample of the ith batch is input into the model, is gak_ijr, where 1 ≤ r ≤ u, u is the total number of samples in the ith batch, and 1 ≤ u ≤ p;
after the initiator client inputs all u samples of the ith batch into its local model for training, it obtains u gradients per weight coefficient; the u gradients for the weight coefficient of the jth feature are gak_ij1, gak_ij2, ..., gak_iju;
M2: the initiator client normalizes each of the u gradients for the weight coefficient of the jth feature, obtaining u normalized gradients; the rth normalized gradient obtained from the rth gradient gak_ijr is gas_ijr;
M3: the initiator client computes the average gradient for the weight coefficient of the jth feature: $ga_{ij} = \frac{1}{u}\sum_{r=1}^{u} gas_{ijr}$;
N12: the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
F1: the participant client inputs each sample (including its features) of the ith batch into its local horizontal federated neural network model for training and computes the gradient corresponding to the weight coefficient of each feature; the tth gradient for the weight coefficient of the jth feature, computed after the tth sample of the ith batch is input into the model, is gbk_ijt, where 1 ≤ t ≤ v, v is the total number of samples in the ith batch, and 1 ≤ v ≤ q;
after the participant client inputs all v samples of the ith batch into its local model for training, it obtains v gradients per weight coefficient; the v gradients for the weight coefficient of the jth feature are gbk_ij1, gbk_ij2, ..., gbk_ijv;
F2: the participant client normalizes each of the v gradients for the weight coefficient of the jth feature, obtaining v normalized gradients; the tth normalized gradient obtained from the tth gradient gbk_ijt is gbs_ijt;
F3: the participant client computes the average gradient for the weight coefficient of the jth feature: $gb_{ij} = \frac{1}{v}\sum_{t=1}^{v} gbs_{ijt}$.
Preferably, in step M1, the initiator client's computing the rth gradient gak_ijr for the weight coefficient of the jth feature after inputting the rth sample of the ith batch into its local model for training comprises: taking the partial derivative of the model function of the horizontal federated neural network model with respect to the weight coefficient of the jth feature to obtain the rth gradient gak_ijr;
and in step F1, the participant client's computing the tth gradient gbk_ijt for the weight coefficient of the jth feature after inputting the tth sample of the ith batch into its local model for training comprises: taking the partial derivative of the model function of the horizontal federated neural network model with respect to the weight coefficient of the jth feature to obtain the tth gradient gbk_ijt.
Preferably, in step M2, the initiator client normalizes the rth gradient gak_ijr for the weight coefficient of the jth feature into the rth normalized gradient gas_ijr using the formula
$gas_{ijr} = \frac{gak_{ijr}}{\lVert G \rVert_2}$, where $G = [gak_{ij1}, gak_{ij2}, \ldots, gak_{iju}]$,
and $\lVert G \rVert_2$ denotes the 2-norm of the vector G.
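A minimal sketch of this 2-norm normalization and the subsequent averaging (steps M2-M3), with illustrative names:

```python
import numpy as np

def normalize_and_average(per_sample_grads):
    """per_sample_grads: 1-D array [gak_ij1, ..., gak_iju] of per-sample
    gradients for one weight coefficient. Divides each gradient by the
    2-norm of the whole vector (step M2), then averages (step M3)."""
    G = np.asarray(per_sample_grads, dtype=float)
    gas = G / np.linalg.norm(G, 2)   # normalized gradients gas_ij1..gas_iju
    return gas.mean()                # average gradient ga_ij

print(normalize_and_average([0.3, 0.4]))  # norm = 0.5 -> (0.6 + 0.8) / 2 = 0.7
```

Normalizing by the batch's 2-norm bounds the gradients' magnitude, which keeps the fixed-variance noise N added in step N2 proportionate across batches.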
Preferably, in step S4, the initiator client computes the latest value of the weight coefficient of the jth feature from the learning rate μ and the average aggregate gradient corresponding to that weight coefficient, and assigns it to the weight coefficient of the jth feature contained in its local horizontal federated neural network model, according to the formula
$f_j = f_j - \mu \cdot gm_j$,
where f_j is the weight coefficient of the jth feature and gm_j is the average aggregate gradient corresponding to the weight coefficient of the jth feature.
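A sketch of this update, assuming the m aggregate gradients per weight coefficient are stacked into a matrix; names and values are illustrative:

```python
import numpy as np

def update_weights(weights, agg_grads, mu):
    """weights: length-d vector [f_1..f_d]; agg_grads: (m, d) matrix whose
    row i holds the aggregate gradients g_i1..g_id of batch i. Averages
    the m aggregate gradients per coefficient, then takes one descent step
    f_j <- f_j - mu * gm_j (step S4)."""
    gm = np.asarray(agg_grads).mean(axis=0)   # average aggregate gradients gm_j
    return weights - mu * gm

f = np.array([0.5, -0.2])
g = [[0.45, 0.10], [0.35, 0.30]]      # m = 2 batches, d = 2 features
print(update_weights(f, g, mu=0.1))   # [0.46, -0.22]
```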
The beneficial effects of the invention are: adding random noise obeying a normal distribution protects each party's gradients from being deduced in reverse by the other party, protecting data security; at the same time, the gradient computation complexity is essentially the same as plaintext computation, so model training is very efficient and large-scale commercial deployment is practical.
Drawings
FIG. 1 is a flow chart of an embodiment.
Detailed Description
The technical scheme of the invention is further described through the following embodiment and the accompanying drawing.
Embodiment: the safe and efficient horizontal federated neural network model training method is used for joint risk-control modeling between financial institutions and, as shown in FIG. 1, comprises the following steps:
S1: the initiator client and the participant client synchronously initialize their respective horizontal federated neural network models and the weight coefficient of each feature contained in those models; the model function of the horizontal federated neural network model is L(f, x), with $f = [f_1, f_2, \ldots, f_d]$ and $x = [x_1, x_2, \ldots, x_d]$, where x_j denotes the jth feature, f_j denotes the weight coefficient of the jth feature x_j, 1 ≤ j ≤ d, and d is the number of features contained in the model;
S2: the initiator client computes the batch count m from the number A of its samples used for training the horizontal federated neural network model and its batch size p, $m = \lceil A/p \rceil$, where $\lceil \cdot \rceil$ denotes rounding up, and sends m to the participant client; the initiator client divides its training samples into m batches according to the batch size p, where the batch size p means a single batch processes at most p samples;
the participant client computes its batch size q from the number B of its samples used for training the horizontal federated neural network model and the batch count m, $q = \lceil B/m \rceil$, and divides its training samples into m batches according to the batch size q, where the batch size q means a single batch processes at most q samples;
S3: the initiator client feeds its m batches of samples into its local horizontal federated neural network model for training, completing m batches; completing one batch means inputting the features contained in each sample of the current batch into the initiator client's model for training;
the participant client feeds its m batches of samples into its local horizontal federated neural network model for training, completing m batches; completing one batch means inputting the features contained in each sample of the current batch into the participant client's model for training;
with the participant client's cooperation, the initiator client computes, for each batch, the aggregate gradient corresponding to the weight coefficient of each feature contained in the model, obtaining m aggregate gradients per weight coefficient;
with the participant client's cooperation, the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch, where 1 ≤ i ≤ m, as follows:
N1: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch; the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch;
N2: the initiator client adds noise N to the average gradient ga_ij to obtain the noisy gradient gan_ij: $gan_{ij} = ga_{ij} + N$;
the participant client adds noise N to the average gradient gb_ij to obtain the noisy gradient gbn_ij: $gbn_{ij} = gb_{ij} + N$;
the noise N is normally distributed random noise with expectation 0 and variance $\sigma^2 C^2$, i.e. $N \sim \mathcal{N}(0, \sigma^2 C^2)$, where σ is the standard deviation and C is the noise coefficient;
N3: the participant client sends the noisy gradient gbn_ij to the initiator client, and the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch: $g_{ij} = (gan_{ij} + gbn_{ij})/2$;
S4: the initiator client calculates the mean value of m aggregation gradients corresponding to the weight coefficient of each feature data to obtain an average aggregation gradient corresponding to the weight coefficient of each feature data, calculates the latest value of the weight coefficient of each feature data according to the learning rate mu and the average aggregation gradient corresponding to the weight coefficient of each feature data, and gives the latest value to the weight coefficient of each feature data contained in the lateral Federal neural network model of the local side;
the initiator client side sends the latest value of the weight coefficient of each feature data to the participant client side, and the participant client side gives the latest value to the weight coefficient of each feature data contained in the local lateral federal neural network model;
the initiator client calculates the latest value of the weight coefficient of the jth feature data according to the learning rate mu and the average aggregation gradient corresponding to the weight coefficient of the jth feature data, and gives the latest value to the weight coefficient of the jth feature data included in the local horizontal federated neural network model according to the following formula:
Figure 787178DEST_PATH_IMAGE010
wherein f is j Is the weight coefficient, gm, of the jth feature data j The average aggregation gradient corresponding to the weight coefficient of the jth characteristic data;
s5: the steps S3-S4 are repeatedly executed until the set iteration number T is reached.
Step N1 comprises the following steps:
N11: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
M1: the initiator client inputs each sample (including its features) of the ith batch into its local horizontal federated neural network model and computes the gradient corresponding to the weight coefficient of each feature;
the rth gradient gak_ijr for the weight coefficient of the jth feature, computed after the rth sample of the ith batch is input into the model for training, where 1 ≤ r ≤ u, u is the total number of samples in the ith batch, and 1 ≤ u ≤ p, is obtained by taking the partial derivative of the model function L(f, x) with respect to the weight coefficient f_j of the jth feature x_j, evaluated at that sample:
$gak_{ijr} = \frac{\partial L(f, x)}{\partial f_j}$;
after the initiator client inputs all u samples of the ith batch into its local model for training, it obtains u gradients per weight coefficient; the u gradients for the weight coefficient of the jth feature are gak_ij1, gak_ij2, ..., gak_iju;
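The patent leaves the model function L(f, x) generic; the following sketch uses a hypothetical squared-error linear model purely to illustrate what the per-sample partial derivative looks like:

```python
import numpy as np

def per_sample_gradients(f, x, y):
    """Gradients of a hypothetical squared-error model
    L(f, x) = (f . x - y)^2 with respect to each weight f_j:
    dL/df_j = 2 * (f . x - y) * x_j."""
    residual = np.dot(f, x) - y
    return 2.0 * residual * x   # one gak-style gradient per coefficient j

f = np.array([0.5, -1.0])       # weight coefficients f_1, f_2
x = np.array([2.0, 1.0])        # one sample's features x_1, x_2
y = 1.0                         # the sample's label
print(per_sample_gradients(f, x, y))  # residual = -1 -> [-4., -2.]
```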
M2: the initiator client normalizes each of the u gradients for the weight coefficient of the jth feature, obtaining u normalized gradients;
the initiator client normalizes the rth gradient gak_ijr for the weight coefficient of the jth feature into the rth normalized gradient gas_ijr using the formula
$gas_{ijr} = \frac{gak_{ijr}}{\lVert G \rVert_2}$, where $G = [gak_{ij1}, gak_{ij2}, \ldots, gak_{iju}]$,
and $\lVert G \rVert_2$ denotes the 2-norm of the vector G;
M3: the initiator client computes the average gradient for the weight coefficient of the jth feature: $ga_{ij} = \frac{1}{u}\sum_{r=1}^{u} gas_{ijr}$;
N12: the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
F1: the participant client inputs each sample (including its features) of the ith batch into its local horizontal federated neural network model for training and computes the gradient corresponding to the weight coefficient of each feature;
the tth gradient gbk_ijt for the weight coefficient of the jth feature, computed after the tth sample of the ith batch is input into the model for training, where 1 ≤ t ≤ v, v is the total number of samples in the ith batch, and 1 ≤ v ≤ q, is obtained by taking the partial derivative of the model function L(f, x) with respect to the weight coefficient f_j of the jth feature x_j, evaluated at that sample:
$gbk_{ijt} = \frac{\partial L(f, x)}{\partial f_j}$;
after the participant client inputs all v samples of the ith batch into its local model for training, it obtains v gradients per weight coefficient; the v gradients for the weight coefficient of the jth feature are gbk_ij1, gbk_ij2, ..., gbk_ijv;
F2: the participant client normalizes each of the v gradients for the weight coefficient of the jth feature, obtaining v normalized gradients; the tth normalized gradient obtained from the tth gradient gbk_ijt is gbs_ijt;
F3: the participant client computes the average gradient for the weight coefficient of the jth feature: $gb_{ij} = \frac{1}{v}\sum_{t=1}^{v} gbs_{ijt}$.
In this scheme, the initiator client and the participant client initialize the same horizontal federated neural network model and the same weight coefficients for each feature; the model contains d features, and every sample used by either party to train the model contains the same d features.
The initiator client and the participant client synchronize the batch count m, divide their training samples into m batches, and complete batch alignment. Each party then feeds its m batches of samples into its local model for training and completes m batches. After every batch, the initiator client computes a noisy gradient for the weight coefficient of each feature of its local model, so each weight coefficient corresponds to m noisy gradients gan; likewise, the participant client computes m noisy gradients gbn per weight coefficient. For each batch, the initiator client aggregates the noisy gradients gan and gbn obtained from the same batch to produce the aggregate gradient g for the weight coefficient of each feature.
The initiator client then computes the mean of the m aggregate gradients per weight coefficient, combines it with the learning rate μ to compute the latest value of each weight coefficient, and assigns those values to its weight coefficients; it also sends the latest values to the participant client, which assigns them to its own weight coefficients. Steps S3-S4 are repeated T times to complete the training of the horizontal federated neural network model.
In computing the aggregate gradient g: first, the initiator client computes the average gradient ga for the weight coefficient of each feature of its local model after the current batch, and the participant client computes the average gradient gb for the same batch; next, the initiator client adds noise N to each ga to obtain the noisy gradient gan, and the participant client adds noise N to each gb to obtain the noisy gradient gbn; finally, the initiator client computes the mean of gan and gbn for each weight coefficient to obtain the aggregate gradient g.
Because the added noise N is normally distributed random noise with expectation 0 and variance $\sigma^2 C^2$, when the initiator client finally computes the mean of the m aggregate gradients per weight coefficient, the noise terms added by the initiator client and by the participant client essentially cancel out; the computed mean deviates only very slightly from the true gradient mean, so the training accuracy of the horizontal federated neural network model is essentially unaffected.
In this method, while computing the aggregate gradient g, the initiator client adds noise N to its own average gradients ga and the participant client adds noise N to its own average gradients gb. Because N is random noise, the initiator client cannot learn the actual values of the participant client's average gradients gb, and the final mean of the m aggregate gradients deviates slightly from the true gradient mean, so neither party can deduce the other's gradient values in reverse, and data security is protected. Since the method does not encrypt gradients with a cryptographic security protocol, the gradient computation complexity is essentially the same as plaintext computation, so model training is very efficient and large-scale commercial deployment is practical.
The method can be used for joint risk-control modeling between financial institutions; the features contained in the horizontal federated neural network model may be, for example, a user sample's income, age, monthly phone bill, monthly repayment amount, and total debt.
By way of example:
the horizontal federated neural network model contains the features income, age, monthly phone bill, monthly repayment amount, and total debt; each sample used by the initiator client for training contains these same features, and likewise each sample used by the participant client;
the batch counts of the initiator client and the participant client are both 5;
the initiator client computes, after each batch, the average gradient for the weight coefficient of the income feature of its local model, obtaining the per-batch average gradients (0.2, 0.3, 0.25, 0.4, 0.6); it adds a corresponding noise N to each batch's average gradient, the noise being (0.3, 0.2, 0, -0.1, -0.3), which yields the noisy gradients (0.5, 0.5, 0.25, 0.3, 0.3);
the participant client computes, after each batch, the average gradient for the weight coefficient of the income feature of its local model, obtaining the per-batch average gradients (0.6, 0.35, 0.4, 0.2, 0.5); it adds a corresponding noise N to each batch's average gradient, the noise being (-0.2, 0.3, 0.1, 0.2, -0.3), which yields the noisy gradients (0.4, 0.65, 0.5, 0.4, 0.2);
the aggregate gradients for the 5 batches with noise N added are ((0.5, 0.5, 0.25, 0.3, 0.3) + (0.4, 0.65, 0.5, 0.4, 0.2)) / 2 = (0.45, 0.575, 0.375, 0.35, 0.25), whose mean is 0.4; that is, the average aggregate gradient is 0.4;
the aggregate gradients for the 5 batches without noise N would be ((0.2, 0.3, 0.25, 0.4, 0.6) + (0.6, 0.35, 0.4, 0.2, 0.5)) / 2 = (0.4, 0.325, 0.325, 0.3, 0.55), whose mean is 0.38; that is, the true average aggregate gradient is 0.38;
it can be seen that after the noisy gradients are aggregated, most of the noise cancels out, and the final average aggregate gradient deviates only slightly from the true value. Data security is protected while the gradient computation complexity remains essentially the same as plaintext computation, so model training is very efficient and large-scale commercial deployment is practical.

Claims (7)

1. A safe and efficient horizontal federated neural network model training method, characterized by comprising the following steps:
S1: the initiator client and the participant client synchronously initialize their respective horizontal federated neural network models and the weight coefficient of each feature contained in those models;
S2: the initiator client and the participant client synchronize the batch count m, and each divides its own samples used for training the horizontal federated neural network model into m batches;
S3: the initiator client completes its m batches, and the participant client completes its m batches;
with the participant client's cooperation, the initiator client computes, for each batch, the aggregate gradient corresponding to the weight coefficient of each feature contained in the horizontal federated neural network model, obtaining m aggregate gradients per weight coefficient;
with the participant client's cooperation, the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch, where 1 ≤ i ≤ m, 1 ≤ j ≤ d, and d is the number of features contained in the horizontal federated neural network model, as follows:
N1: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch; the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch;
N2: the initiator client adds noise N to the average gradient ga_ij to obtain the noisy gradient gan_ij; the participant client adds noise N to the average gradient gb_ij to obtain the noisy gradient gbn_ij;
the noise N is normally distributed random noise with expectation 0 and variance $\sigma^2 C^2$, where σ is the standard deviation and C is the noise coefficient;
N3: the participant client sends the noisy gradient gbn_ij to the initiator client, and the initiator client computes the aggregate gradient g_ij corresponding to the weight coefficient of the jth feature contained in the model after the ith batch: $g_{ij} = (gan_{ij} + gbn_{ij})/2$;
S4: the initiator client computes the mean of the m aggregate gradients corresponding to the weight coefficient of each feature to obtain an average aggregate gradient per weight coefficient, computes the latest value of each weight coefficient from the learning rate μ and that average aggregate gradient, and assigns the latest value to the weight coefficient of each feature contained in its local horizontal federated neural network model;
the initiator client sends the latest value of each weight coefficient to the participant client, and the participant client assigns it to the weight coefficient of each feature contained in its local horizontal federated neural network model;
S5: steps S3-S4 are repeated until the set iteration count T is reached.
2. The safe and efficient horizontal federated neural network model training method of claim 1, wherein step S2 comprises the following steps:
the initiator client computes the batch count m from the number A of its samples used for training the horizontal federated neural network model and its batch size p, $m = \lceil A/p \rceil$, where $\lceil \cdot \rceil$ denotes rounding up, and sends m to the participant client; the initiator client divides its training samples into m batches according to the batch size p;
the participant client computes its batch size q from the number B of its samples used for training the horizontal federated neural network model and the batch count m, $q = \lceil B/m \rceil$, and divides its training samples into m batches according to the batch size q.
3. The safe and efficient horizontal federated neural network model training method of claim 1 or 2, wherein the initiator client's completing one batch in step S3 comprises: inputting the features contained in each sample of the current batch into the initiator client's horizontal federated neural network model for training;
and the participant client's completing one batch in step S3 comprises: inputting the features contained in each sample of the current batch into the participant client's horizontal federated neural network model for training.
4. The safe and efficient horizontal federated neural network model training method of claim 2, wherein step N1 comprises the following steps:
N11: the initiator client computes the average gradient ga_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
M1: the initiator client inputs each sample of the ith batch into its local horizontal federated neural network model and computes the gradient corresponding to the weight coefficient of each feature; the rth gradient for the weight coefficient of the jth feature, computed after the rth sample of the ith batch is input into the model, is gak_ijr, where 1 ≤ r ≤ u, u is the total number of samples in the ith batch, and 1 ≤ u ≤ p;
after the initiator client inputs all u samples of the ith batch into its local model for training, it obtains u gradients per weight coefficient; the u gradients for the weight coefficient of the jth feature are gak_ij1, gak_ij2, ..., gak_iju;
M2: the initiator client normalizes each of the u gradients for the weight coefficient of the jth feature, obtaining u normalized gradients; the rth normalized gradient obtained from the rth gradient gak_ijr is gas_ijr;
M3: the initiator client computes the average gradient for the weight coefficient of the jth feature: $ga_{ij} = \frac{1}{u}\sum_{r=1}^{u} gas_{ijr}$;
N12: the participant client computes the average gradient gb_ij corresponding to the weight coefficient of the jth feature of its local model after the ith batch, specifically:
F1: the participant client inputs each sample of the ith batch into its local horizontal federated neural network model for training and computes the gradient corresponding to the weight coefficient of each feature; the tth gradient for the weight coefficient of the jth feature, computed after the tth sample of the ith batch is input into the model for training, is gbk_ijt, where 1 ≤ t ≤ v, v is the total number of samples in the ith batch, and 1 ≤ v ≤ q;
after the participant client inputs all v samples of the ith batch into its local model for training, it obtains v gradients per weight coefficient; the v gradients for the weight coefficient of the jth feature are gbk_ij1, gbk_ij2, ..., gbk_ijv;
F2: the participant client normalizes each of the v gradients for the weight coefficient of the jth feature, obtaining v normalized gradients; the tth normalized gradient obtained from the tth gradient gbk_ijt is gbs_ijt;
F3: the participant client computes the average gradient for the weight coefficient of the jth feature: $gb_{ij} = \frac{1}{v}\sum_{t=1}^{v} gbs_{ijt}$.
5. The safe and efficient transverse federated neural network model training method according to claim 4, wherein in step M1 the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data, calculated after the initiator client inputs the rth sample of the ith batch into its local transverse federated neural network model for training, is obtained as follows:
the rth sample of the ith batch is input into the local transverse federated neural network model for training, and the partial derivative of the model function of the transverse federated neural network model with respect to the weight coefficient of the jth feature data is taken, giving the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data;
in step F1, the tth gradient gbk_ijt corresponding to the weight coefficient of the jth feature data, calculated after the participant client inputs the tth sample of the ith batch into its local transverse federated neural network model for training, is obtained as follows:
the tth sample of the ith batch is input into the local transverse federated neural network model for training, and the partial derivative of the model function of the transverse federated neural network model with respect to the weight coefficient of the jth feature data is taken, giving the tth gradient gbk_ijt corresponding to the weight coefficient of the jth feature data.
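The claims leave the model function itself unspecified. As a hedged illustration, if the per-sample loss were the squared error L = (w·x - y)^2 / 2 of a single linear layer (an assumption, not the patent's model), the partial derivative with respect to the jth weight coefficient has the closed form below (the function name is likewise an assumption):

    def gradient_wrt_weight(w, x, y, j):
        # dL/dw_j for L = 0.5 * (w.x - y)^2 is (w.x - y) * x[j]; one call per
        # sample yields the per-sample gradients gak_ijr (initiator) or
        # gbk_ijt (participant).
        residual = sum(wk * xk for wk, xk in zip(w, x)) - y
        return residual * x[j]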
6. The safe and efficient transverse federated neural network model training method according to claim 4, wherein in step M2 the initiator client normalizes the rth gradient gak_ijr corresponding to the weight coefficient of the jth feature data to obtain the rth normalized gradient gas_ijr according to the following formula:

$$ gas_{ijr} = \frac{gak_{ijr}}{\lVert G \rVert_2}, \qquad G = [\, gak_{ij1}, gak_{ij2}, \ldots, gak_{iju} \,], $$

where $\lVert G \rVert_2$ denotes the 2-norm of the vector G.
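As a quick numerical check with illustrative values: for u = 2 and G = [3, 4], $\lVert G \rVert_2 = \sqrt{3^2 + 4^2} = 5$, so gas_ij1 = 3/5 = 0.6 and gas_ij2 = 4/5 = 0.8, and the step-M3 average is ga_ij = (0.6 + 0.8)/2 = 0.7.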
7. The safe and efficient transverse federated neural network model training method according to claim 1 or 2, wherein in step S4 the initiator client calculates the latest value of the weight coefficient of the jth feature data from the learning rate μ and the average aggregation gradient corresponding to that weight coefficient, and assigns it to the weight coefficient of the jth feature data in its local transverse federated neural network model, according to the following formula:

$$ f_j = f_j - \mu \cdot gm_j, $$

where f_j is the weight coefficient of the jth feature data and gm_j is the average aggregation gradient corresponding to the weight coefficient of the jth feature data.
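A minimal sketch of this step-S4 update, with the vectorized form and the function name as assumptions:

    def update_weights(f, gm, mu):
        # Applies f_j = f_j - mu * gm_j to every weight coefficient at once.
        return [fj - mu * gmj for fj, gmj in zip(f, gm)]

    # e.g. update_weights(f=[0.5, -0.2], gm=[0.7, 0.1], mu=0.01) -> [0.493, -0.201]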
Priority Applications (1)

Application Number: CN202210452869.4A
Priority Date / Filing Date: 2022-04-27
Title: Safe and efficient transverse federated neural network model training method
Status: Active

Publications (2)

Publication Number  Publication Date
CN114548429A  2022-05-27
CN114548429B  2022-08-12

Family ID: 81666726


Legal Events

Code  Description
PB01  Publication
SE01  Entry into force of request for substantive examination
GR01  Patent grant