CN114358307A - Federal learning method and device based on differential privacy law - Google Patents

Federal learning method and device based on differential privacy law

Info

Publication number
CN114358307A
Authority
CN
China
Prior art keywords
training
model
federal learning
stage
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111447012.5A
Other languages
Chinese (zh)
Inventor
高志鹏
杨杨
芮兰兰
段应文
赵晨
莫梓嘉
林怡静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202111447012.5A priority Critical patent/CN114358307A/en
Publication of CN114358307A publication Critical patent/CN114358307A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the application provides a federal learning method and device based on a differential privacy law. The federal learning method based on the differential privacy law comprises the following steps: starting to monitor signal reception condition data in a target period based on a stage completion signal corresponding to any one of a plurality of currently received sets of training data for federal learning; and, if the monitored signal reception condition data in the target period meet a preset semi-synchronous training rule, acquiring the stage training result data corresponding to each stage completion signal received in the target period and training a target model based on a differential privacy method and the stage training result data. On the premise of providing differential privacy protection for the federal learning process of the target model, the method and device effectively avoid interference from the straggler (fall-behind) effect and the waste of computing resources, can accelerate the convergence of the target model, and can thereby effectively improve the training efficiency and application timeliness of the target model.

Description

Federal learning method and device based on differential privacy law
Technical Field
The application relates to the technical field of data processing, in particular to a federal learning method and a device based on a differential privacy law.
Background
In federal learning, after the server obtains a target model by aggregating the local models sent by the participating nodes, it sends the updated target model and parameters back to the participating nodes. In this process, the gradient information of the model can be intercepted by a malicious user, and the private data of the participating nodes can be inferred from it. In order to prevent the leakage of gradient information and further improve the security of federal learning, a differential privacy technology can be added to the federal learning process. Differential privacy protection is a privacy protection technology based on data distortion: noise is added to the data to obscure it and mask sensitive information, so that the original data cannot be restored.
Currently, the federal learning process based on the differential privacy technology is as follows: all participating nodes communicate synchronously and execute the same number of local training rounds, and after the server receives the local models sent by all participating nodes, it provides differential privacy protection for the whole federal system by introducing Gaussian noise, so that intrusions by external attackers, such as background knowledge attacks, can be resisted.
However, synchronous communication imposes obvious limitations on the resource utilization and configuration of federal learning based on the differential privacy technology. When model aggregation in the federal learning process is realized through synchronous communication, system differences among the participating nodes, particularly in a heterogeneous network environment, mean that the convergence speed of the aggregated model is greatly affected by the straggler effect. As a result, the existing federal learning process based on the differential privacy technology suffers from long waiting times at the participating nodes and wasted computing resources, which in turn lead to a low convergence speed and poor training efficiency of the target model in federal learning.
Disclosure of Invention
Aiming at the problems in the prior art, the application provides a federal learning method and device based on a differential privacy law which, on the premise of providing differential privacy protection for the federal learning process of a target model, can effectively avoid interference from the straggler effect, avoid wasting computing resources, improve the convergence rate of the target model, and thereby effectively improve the training efficiency and application timeliness of the target model.
In order to solve the technical problem, the application provides the following technical scheme:
in a first aspect, the present application provides a federal learning method based on a differential privacy law, including:
starting to monitor signal receiving condition data in a target period based on a stage completion signal corresponding to any one of a plurality of currently received training data for federal learning;
if the signal receiving condition data in the target period meet a preset semi-synchronous training rule, acquiring stage training result data corresponding to each stage completion signal received in the target period, and training a target model based on a differential privacy method and the stage training result data.
Further, the obtaining of the stage training result data corresponding to each of the stage completion signals received in the target period, and training the target model based on each of the stage training result data includes:
acquiring respective stage training result data of each stage completion signal received in the target period, wherein the stage training result data comprises a local model and a model update value;
respectively acquiring the relative old value of each local model according to the training result data of each stage of the target cycle, wherein the relative old value is the ratio of the total step length of the corresponding local model in the target cycle to the step length difference of the target model, and the step length difference of the target model is the difference between the total step length of the target model in the target cycle and the total step length of the previous cycle;
and training the target model based on a differential privacy method, the corresponding relative old value and the model update value of each local model.
Further, before the monitoring of the signal reception condition data in the target period is started based on the phase completion signal corresponding to any one of the currently received multiple sets of training data for federal learning, the method further includes:
and respectively sending the same stage training time threshold to each federal learning participation node which is set in a distributed mode, so that each federal learning participation node sends a corresponding stage completion signal and carries out local model training again based on the stage training time threshold after the times for training a local model based on local training data of each federal learning participation node meet the stage training time threshold.
Further, the starting of monitoring the signal reception condition data in the target period based on the phase completion signal corresponding to any one of the currently received multiple pieces of training data for federal learning includes:
if a stage completion signal sent by any one of the federal learning participation nodes in distributed arrangement is received, taking the stage completion signal as a first signal of the current target period and starting to monitor signal receiving condition data in the target period;
judging whether the signal receiving condition data in the target period meet a preset semi-synchronous training rule in real time, wherein the semi-synchronous training rule comprises the following steps: the total number of the phase complete signals received within the target period has reached a total number of signals threshold and/or the duration of the target period has reached a period duration threshold.
Further, the obtaining of the stage training result data of each stage completion signal received in the target period includes:
ending the monitoring for the current target period;
respectively sending a stage training result request to the federal learning participation nodes which send out each stage completion signal in the target period based on a preset signal loss prevention communication protocol, so that each federal learning participation node which receives the stage training result request respectively sends a local model corresponding to each stage completion signal based on the signal loss prevention communication protocol;
and receiving each local model, and acquiring a model update value corresponding to each local model.
Further, the training the target model based on the differential privacy method, the relative old value and the model update value corresponding to each local model includes:
acquiring a second-order norm corresponding to each local model;
respectively generating global sensitivities corresponding to the local models according to the relative aging values and the second-order norms corresponding to the local models;
respectively cutting model update values corresponding to the local models based on the global sensitivities corresponding to the local models, and aggregating the cut model update values to obtain model update aggregate values;
and setting Gaussian noise in the process of training the target model according to the model update aggregation value and each local model based on a differential privacy method, and stopping training the target model when the privacy budget corresponding to the Gaussian noise is 0.
Further, after training the target model based on the differential privacy method and the training result data of each stage, the method further includes:
judging whether the target model meets the preset convergence requirement or not, if not, re-receiving a stage completion signal corresponding to any one of the multiple training data sets, starting monitoring for signal receiving condition data in a new target period, if the signal receiving condition data in the new target period meets the preset semi-synchronous training rule, acquiring stage training result data corresponding to each stage completion signal received in the new target period, and re-training the target model based on a differential privacy method and each stage training result data.
In a second aspect, the present application provides a federal learning device based on a differential privacy law, including:
the signal monitoring module is used for starting to monitor signal receiving condition data in a target period based on a stage completion signal corresponding to any one of the currently received multiple pieces of training data for federal learning;
and the semi-synchronous training module is used for acquiring stage training result data corresponding to each stage completion signal received in the target period if the signal receiving condition data in the target period is monitored to meet a preset semi-synchronous training rule, and training a target model based on a differential privacy method and the stage training result data.
In a third aspect, the present application provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the differential privacy laws-based federal learning method when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the differential privacy laws-based federated learning method described herein.
In a fifth aspect, the present application provides a federated learning system, comprising: the system comprises a server and all federal learning nodes which are in communication connection with the server respectively;
the server is used for executing the differential privacy law-based federal learning method;
and the federated learning node is used for sending a corresponding stage completion signal to the server and carrying out local model training again based on the stage training time threshold after the times of training the local model based on the local training data respectively meet the corresponding stage training time threshold.
According to the technical scheme, the federal learning method based on the differential privacy law provided by the application starts to monitor signal reception condition data in a target period based on a stage completion signal corresponding to any one of a plurality of currently received sets of training data for federal learning; if the signal reception condition data in the target period satisfy the preset semi-synchronous training rule, the stage training result data corresponding to each stage completion signal received in the target period are acquired, and the target model is trained based on a differential privacy method and the stage training result data. By performing semi-synchronous model training with only the faster participating nodes that reflect the normal speed of the current network environment, interference from the straggler effect can be effectively avoided, waste of the participating nodes' computing resources is avoided, the convergence speed of the target model can be improved, and the training efficiency and application timeliness of the target model can be effectively improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic structural diagram of a federal learning system in an embodiment of the present application.
Fig. 2 is a first flowchart of a federal learning method based on a differential privacy law in an embodiment of the present application.
Fig. 3 is a second flowchart of the federal learning method based on the differential privacy law in the embodiment of the present application.
Fig. 4 is a third flowchart of the federal learning method based on the differential privacy law in the embodiment of the present application.
Fig. 5 is a schematic structural diagram of a federal learning device based on a differential privacy law in an embodiment of the present application.
Fig. 6 is a schematic logic diagram illustrating an example of a differential privacy federated learning framework based on semi-synchronous communication provided in an application example of the present application.
Fig. 7 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Federal learning is essentially a distributed machine learning framework that meets privacy protection and data security and supervision requirements. Through parameter exchange under an encryption mechanism, it realizes joint modeling and the establishment of a virtual common model on the basis of ensuring data privacy, security and legal compliance, thereby improving the effect of the AI model. The working steps of federal learning can be divided into three parts. First, model selection: the central server pre-trains a model, initializes its parameters, and then distributes the model and parameters to each client participating in federal learning. Second, local model training: after receiving the model and parameters, each client builds the model and trains it using local data. Third, model aggregation: after each client has trained locally for a certain number of rounds, it sends the trained model parameters to the server; the server receives the parameters from each client, processes them (for example, by averaging), updates the server-side model with the processed parameters, and finally sends the updated global model and parameters to each client.
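As a concrete illustration of the three steps described above, the following is a minimal Python sketch that is not taken from the patent itself: the toy linear model, learning rate, number of rounds and client data are illustrative assumptions, and the aggregation is a plain parameter average.
```python
import numpy as np

def local_train(global_weights, local_data, epochs=5, lr=0.1):
    """Client side: start from the global weights and run a few local gradient steps."""
    w = global_weights.copy()
    for _ in range(epochs):
        for x, y in local_data:
            grad = 2.0 * x * (np.dot(w, x) - y)   # squared-error gradient of a toy linear model
            w -= lr * grad
    return w

def federated_round(global_weights, clients_data):
    """Server side: collect every client's locally trained weights and average them."""
    local_models = [local_train(global_weights, data) for data in clients_data]
    return np.mean(local_models, axis=0)          # simple parameter averaging

rng = np.random.default_rng(0)
clients = [[(rng.normal(size=2), rng.normal()) for _ in range(10)] for _ in range(3)]
w = np.zeros(2)
for _ in range(5):                                # five federal rounds
    w = federated_round(w, clients)
```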
In the process of the server sending the global model and parameters to the clients, the gradient information of the model can be intercepted by a malicious user and the user's private data can be inferred from it. To prevent the leakage of gradient information and further improve the security of federal learning, many existing works have incorporated differential privacy techniques into the federal learning system. Differential privacy is a privacy definition proposed in 2006 for the privacy disclosure problem of databases. Differential privacy protection is a privacy protection technology based on data distortion: noise is added to the data to obscure it and mask sensitive information, so that the original data cannot be restored. Differential privacy protection theory reduces the risk of data leakage, is simple to implement, and is not limited by the size of the data. By using random noise, it ensures that the results of query requests over publicly visible information cannot reveal individual private information; that is, it maximizes the accuracy of data queries while minimizing the chance of identifying individual records when querying a statistical database. Simply put, it removes individual characteristics while preserving statistical characteristics, thereby protecting user privacy.
The basic principle of differential privacy is as follows: when a user (possibly a hidden attacker) submits a query request to a data provider, privacy disclosure may result if the data provider directly returns accurate query results, because the user may infer private information from them. To avoid this problem, a differential privacy system extracts an intermediate result from the database and injects an appropriate amount of noise into it using a specially designed random algorithm, obtaining a noisy intermediate result; the noisy query result is then derived from this noisy intermediate result and returned to the user. Therefore, even if an attacker can work back from the noisy result to the noisy intermediate result, the attacker cannot accurately recover the noise-free intermediate result, let alone the original database, and the purpose of protecting privacy is achieved.
The core idea of differential privacy is as follows: for two data sets that differ by a single record, the probability of obtaining the same query result from each of them is governed by the privacy budget (a parameter by which the degree of privacy protection is measured). The smaller the privacy budget, the more similar the probability distributions of the query results returned by the differential privacy algorithm on a pair of adjacent data sets, the harder it is for an attacker to distinguish the pair of adjacent data sets, and the higher the degree of protection. When the privacy budget is 0, the attacker cannot distinguish the pair of adjacent data sets at all, which means the degree of protection is the highest.
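The toy sketch below illustrates the role of the privacy budget described above, using the classical Laplace mechanism on a counting query for simplicity (the patent itself later uses Gaussian noise); the database, query and epsilon values are illustrative assumptions.
```python
import numpy as np

def noisy_count(database, predicate, epsilon, rng):
    """Answer a counting query with Laplace noise scaled to sensitivity/epsilon."""
    true_count = sum(1 for row in database if predicate(row))
    sensitivity = 1.0          # adding or removing one record changes a count by at most 1
    return true_count + rng.laplace(scale=sensitivity / epsilon)

rng = np.random.default_rng(42)
ages = [25, 31, 47, 52, 38]                               # toy records
over_40 = lambda age: age > 40
print(noisy_count(ages, over_40, epsilon=1.0, rng=rng))   # mild noise, weaker protection
print(noisy_count(ages, over_40, epsilon=0.1, rng=rng))   # strong noise, stronger protection
```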
Compared with the traditional federal learning system, the differential privacy federal learning system mainly modifies the model aggregation step. The modified steps are as follows: after each client has trained locally for a certain number of rounds, it sends the trained model parameters and gradient update values to the server. After receiving the gradient update values from the clients, the server first performs gradient clipping, bounding the second-order norm of every user's gradient update value; it then aggregates the clipped gradient update values (for example by averaging or weighted aggregation); the server then adds noise to the aggregated model to provide differential privacy protection. Finally, the updated model and parameters are sent to each client.
In the model aggregation part, a synchronous communication mode is adopted: after all the clients are trained locally for the same number of times (the specific training number is determined by the server), the trained model parameters are uniformly sent to the server.
The differential privacy federal learning system provides differential privacy protection for the whole federal system by introducing Gaussian noise, so that intrusions by external attackers, such as background knowledge attacks, can be resisted. However, existing experiments have shown that the accuracy and convergence rate of the whole federal model drop significantly after Gaussian noise is added. Moreover, traditional federal learning is based on synchronous communication, where all participants perform the same number of local training rounds. Due to heterogeneous external conditions such as hardware devices, network connections and device power, the characteristics of the nodes in the federal system differ significantly. These system differences lead to large gaps between participants in the time needed to perform model training and to upload the model to the server, and under real-world conditions a heterogeneous network environment amplifies the straggler effect: faster participants have to wait for slower ones (the slowest participant may never upload its local model parameters because of network instability, and a participant may quit at any time). The faster participants (often the participants with rich computing and network resources) do not train the model during the waiting period, which wastes a huge amount of computing resources. Synchronous communication imposes significant restrictions on the resource utilization and configuration of federal learning, and these restrictions are even stronger for differential privacy federal learning because of the privacy budget. Therefore, the federal learning systems at the present stage, which adopt synchronous communication for model aggregation, are easily disturbed by the straggler effect in a heterogeneous network environment, which greatly affects the convergence speed of the aggregated model. How to accelerate the convergence of the federal model while providing differential privacy protection, how to provide a protection method that meets scenario requirements given the tension between privacy protection and data availability, how to minimize the risk of leaking user privacy in machine learning, and how to improve the training efficiency and reliability of the federal model have therefore been long-standing challenges.
Based on this, the application realizes semi-synchronous communication among the participating nodes by adopting a brand-new method for determining the semi-synchronous time node, and then performs at least one stage of semi-synchronous training on the model through this semi-synchronous communication mode. By performing semi-synchronous model training with only the faster participating nodes that reflect the normal speed of the current network environment, interference from the straggler effect can be effectively avoided, waste of computing resources is avoided, the convergence speed of the target model can be improved, and the training efficiency and application timeliness of the target model can be effectively improved.
Based on the above, the present application further provides a federal learning device based on the differential privacy law for implementing the federal learning method based on the differential privacy law provided in one or more embodiments of the present application. The federal learning device based on the differential privacy law may be a server. As shown in fig. 1, the device may be communicatively connected to each client device, either directly or through a third-party server. The client devices are the federal learning participation nodes, and together with the federal learning device based on the differential privacy law they form a federal learning system. The device may receive the stage completion signals sent by the client devices after the number of times each of them has trained its local model on local training data reaches the stage training count threshold, and then start to monitor the signal reception condition data in a target period based on the stage completion signal corresponding to any one of the currently received sets of training data for federal learning. If the signal reception condition data in the target period satisfy a preset semi-synchronous training rule, it acquires the stage training result data corresponding to each stage completion signal received in the target period and trains the target model based on a differential privacy method and the stage training result data. The federal learning device based on the differential privacy law may then judge whether the target model currently meets the preset convergence requirement. If not, it again receives a stage completion signal corresponding to any one of the sets of training data, starts monitoring the signal reception condition data in a new target period, and, if the signal reception condition data in the new target period satisfy the preset semi-synchronous training rule, acquires the stage training result data corresponding to each stage completion signal received in the new target period and retrains the target model based on the differential privacy method and the stage training result data, until the target model meets the preset convergence requirement. Training then ends, and the finally obtained trained model is sent to the users for practical application.
It can be understood that the target model may be a machine learning model for data prediction, recognition, classification or the like, for example a deep neural network model for face recognition. In that case, on the premise that the federal learning method based on the differential privacy law provides differential privacy protection for the federal learning process of the target model, interference from the straggler effect is effectively avoided, waste of computing resources is avoided, the convergence rate of the target model can be improved, and the training efficiency of the deep neural network model and the timeliness of its application to face recognition can be effectively improved.
The federal learning based on the differential privacy law performed by the foregoing federal learning device based on the differential privacy law may be executed in the server as described above, but in another practical application scenario all operations may be performed in the client device. The choice may depend on the processing capability of the client device, the limitations of the user's usage scenario, and the like; this is not limited in the present application. If all operations are completed in the client device, the client device may further include a processor for performing the specific processing of the federal learning based on the differential privacy law.
It is understood that the client device may include any mobile device capable of running applications, such as a smart phone, a tablet electronic device, a network set-top box, a portable computer, a personal digital assistant (PDA), a vehicle-mounted device, a smart wearable device, and the like. The smart wearable device may include smart glasses, a smart watch, a smart bracelet, and the like.
The client device may have a communication module (i.e., a communication unit), and may be communicatively connected to a remote server to implement data transmission with the server. The server may include a server on the task scheduling center side, and in other implementation scenarios, the server may also include a server on an intermediate platform, for example, a server on a third-party server platform that is communicatively linked to the task scheduling center server. The server may include a single computer device, or may include a server cluster formed by a plurality of servers, or a server structure of a distributed apparatus.
The server and the client device may communicate using any suitable network protocol, including network protocols not yet developed at the filing date of the present application. The network protocol may include, for example, the TCP/IP protocol, the UDP/IP protocol, the HTTP protocol, the HTTPS protocol, or the like. Of course, the network protocol may also include, for example, an RPC (Remote Procedure Call) protocol, a REST (Representational State Transfer) protocol, and the like used on top of the above protocols.
The following embodiments and application examples are specifically and individually described in detail.
In order to solve the problems in the existing federal learning process based on the differential privacy technology, such as long waiting times of participating nodes and wasted computing resources, which in turn cause the low convergence rate and poor training efficiency of the target model in federal learning, the application provides an embodiment of a federal learning method based on the differential privacy law. Referring to fig. 2, the federal learning method based on the differential privacy law, executed by the federal learning device based on the differential privacy law, specifically includes the following contents:
step 100: and starting to monitor signal receiving condition data in a target period based on a stage completion signal corresponding to any one of the currently received multiple pieces of training data for federal learning.
In step 100, after the number of times each federal learning participation node in the federal learning system has trained its local model on local training data meets the stage training count threshold, the node sends a corresponding stage completion signal to the federal learning device based on the differential privacy law. When the federal learning device based on the differential privacy law receives the first stage completion signal, it opens a new target period for signal monitoring and starts to monitor the signal reception condition data within this newly established target period.
It is understood that the signal reception condition data refer to the source and total number of the stage completion signals received by the federal learning device based on the differential privacy law after monitoring starts; the duration of the target period may also be recorded, so that it can subsequently be determined whether the preset semi-synchronous training rule is met.
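A minimal sketch of how such signal reception condition data could be recorded is shown below; the class and field names are illustrative assumptions rather than part of the patent.
```python
import time
from dataclasses import dataclass, field

@dataclass
class ReceptionConditionData:
    """What the server tracks for the current target period."""
    period_start: float = field(default_factory=time.monotonic)
    senders: list = field(default_factory=list)      # source node of each stage completion signal

    def record_signal(self, node_id):
        self.senders.append(node_id)

    @property
    def total_signals(self) -> int:
        return len(self.senders)

    @property
    def period_duration(self) -> float:
        return time.monotonic() - self.period_start
```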
Step 200: if the signal receiving condition data in the target period meet a preset semi-synchronous training rule, acquiring stage training result data corresponding to each stage completion signal received in the target period, and training a target model based on a differential privacy method and the stage training result data.
It can be understood that the semi-synchronous training rule is a rule for semi-synchronous communication among all federal learning participating nodes: stage training of the target model can start using the stage training result data of only some of the federal learning participating nodes. In this way, semi-synchronous model training is performed with the faster participating nodes that reflect the normal speed of the current network environment, waste of computing resources is avoided, and the convergence speed of the target model can be improved.
It can be understood that the stage training result data refer to the local models and related data obtained by each federal learning participating node by training, with its local training data, the model parameters previously sent by the federal learning device based on the differential privacy law. The local models of the participating nodes are finally aggregated by the federal learning device based on the differential privacy law to generate the target model, so the target model may also be called the global model, in contrast to the local models.
From the above description, it can be seen that in the federal learning method based on the differential privacy law provided by the embodiments of the present application, when the stage completion signal reception condition data in the target period are monitored to satisfy the preset semi-synchronous training rule, the stage training result data corresponding to each stage completion signal received in the target period are acquired, and the target model is trained based on a preset differential privacy method and the stage training result data. Semi-synchronous model training is carried out with the faster participating nodes that reflect the normal speed of the current network environment, so that, on the premise of providing differential privacy protection for the federal learning process of the target model, interference from the straggler effect can be effectively avoided, waste of computing resources is avoided, the convergence speed of the target model can be improved, and the training efficiency and application timeliness of the target model can be effectively improved.
In addition, because the number of local training rounds differs among participants, there is a certain gap between the local model generated by each node and the ideal model. For example, in an extreme case, node A may have trained only once locally while node B has trained E_max times locally; the local models trained by node A and node B will then necessarily deviate from the ideal model.
That is to say, although directly adding a noise perturbation algorithm to the aggregated model can resist model inference attacks, model extraction attacks, backdoor attacks and the like and improve the differential privacy protection of the federal learning system, it seriously reduces the accuracy of the federal aggregation model, leaving the final aggregated model in an unusable state. The federal system then loses the balance between data availability and privacy protection, which limits the large-scale use of differential privacy federal learning systems in enterprises.
Based on this, after solving the problems of the prior-art federal learning process based on the differential privacy technology, such as long waiting times of participating nodes and wasted computing resources, which in turn cause the low convergence rate and poor training efficiency of the target model in federal learning, the application also addresses the prior-art problem that the different numbers of local training rounds at each participating node lead to large differences in local model quality and poor accuracy of the target model. For this purpose the application introduces a new concept: the relative staleness value (relative old value). The global sensitivity is limited by the relative staleness, so that the range of gradient clipping is bounded more accurately, ensuring that the whole federal learning system generates a high-precision federal aggregation model on the premise of providing differential privacy protection.
In view of the above, in order to solve the prior-art problem that the different numbers of local training rounds at each participating node lead to large differences in local model quality and thus poor accuracy of the target model, in an embodiment of the federal learning method based on the differential privacy law provided by the present application, referring to fig. 3, step 200 of the federal learning method based on the differential privacy law specifically includes the following contents:
step 210: and acquiring the stage training result data of each stage completion signal received in the target period, wherein the stage training result data comprises a local model and a model updating value.
Step 220: and respectively acquiring the relative old value of each local model according to the training result data of each stage of the target cycle, wherein the relative old value is the ratio of the total step length of the corresponding local model in the target cycle to the step length difference of the target model, and the step length difference of the target model is the difference between the total step length of the target model in the target cycle and the total step length of the previous cycle.
Specifically, the present application proposes a relative staleness value RS_i to characterize this gap. Let s^t denote the total step length of the target model at round t, let s^(t+1) denote the total step length of the target model at round t+1, and let s_i^(t+1) denote the total step length of client i in round t+1. Then:
RS_i = s_i^(t+1) / (s^(t+1) - s^t)
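The small helper below restates this definition in code; the argument names are illustrative assumptions based on the notation above.
```python
def relative_staleness(client_steps: float, global_steps_now: float, global_steps_prev: float) -> float:
    """RS_i = (client i's total steps this round) / (growth of the target model's total steps)."""
    step_gap = global_steps_now - global_steps_prev
    return client_steps / step_gap

rs_fast = relative_staleness(15, 130, 100)   # 0.5  - node ran 15 steps while the global total grew by 30
rs_slow = relative_staleness(5, 130, 100)    # ~0.17 - a node lagging further behind
```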
step 230: and training the target model based on a differential privacy method, the corresponding relative old value and the model update value of each local model.
From the above description, it can be seen that the federal learning method based on the differential privacy law provided in the embodiment of the present application limits the global sensitivity by means of the relative staleness value, which solves the prior-art problem that the different numbers of local training rounds at each participating node lead to large differences in local model quality and poor accuracy of the target model. The range of gradient clipping is thereby bounded more accurately, ensuring that the whole federal learning system generates a high-precision federal aggregation model on the premise of providing differential privacy protection. On the basis of improving the training efficiency and application timeliness of the target model, the training precision, application accuracy and effectiveness of the target model can thus be further improved.
In order to further address the problem that the local model training results differ greatly because each participating node trains locally a different number of times, in an embodiment of the federal learning method based on the differential privacy law provided by the present application, referring to fig. 4, the following contents are further included before step 100 of the federal learning method based on the differential privacy law:
step 010: and respectively sending the same stage training time threshold to each federal learning participation node which is set in a distributed mode, so that each federal learning participation node sends a corresponding stage completion signal and carries out local model training again based on the stage training time threshold after the times for training a local model based on local training data of each federal learning participation node meet the stage training time threshold.
In step 010, the stage training count is specified in advance, which alleviates the problem that the local model training results differ greatly because each participating node trains locally a different number of times.
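A minimal client-side sketch of this behaviour is given below; the helper names (train_one_epoch, send_completion_signal) are illustrative assumptions and are passed in as callables so the sketch stays self-contained.
```python
import threading

def run_participant(node_id, stage_threshold, train_one_epoch, send_completion_signal,
                    stop_event: threading.Event):
    """Train locally; after every `stage_threshold` epochs, report a finished stage and keep going."""
    local_epochs = 0
    while not stop_event.is_set():
        train_one_epoch()                       # one local pass over this node's own data
        local_epochs += 1
        if local_epochs % stage_threshold == 0:
            send_completion_signal(node_id)     # tell the server another stage is done
            # no waiting for slower nodes: local training resumes immediately
```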
In this application, the step 010 and the step 210-.
In order to improve the reliability and effectiveness of monitoring the signal reception condition data in the target period, in an embodiment of the federal learning method based on the differential privacy law provided in the present application, step 100 of the federal learning method based on the differential privacy law specifically includes the following contents:
step 110: and if a phase completion signal sent by any one of the distributed federal learning participation nodes is received, taking the phase completion signal as a first signal of the current target period and starting to monitor signal receiving condition data in the target period.
Step 120: judging whether the signal receiving condition data in the target period meet a preset semi-synchronous training rule in real time, wherein the semi-synchronous training rule comprises the following steps: the total number of the phase complete signals received within the target period has reached a total number of signals threshold and/or the duration of the target period has reached a period duration threshold.
Specifically, all participants first train locally for a base number of rounds (for example, the base is set to five rounds). A participant that has completed this training sends a completion signal to the central administrator, indicating that the base number of rounds has been completed, and then continues training locally. The central administrator starts a timer after receiving the first completion signal (the specific time setting is related to the average completion speed of the nodes in the actual federal system). The central administrator keeps receiving completion signals from the participants until the number of completion signals reaches a target value or the timer reaches its time limit (i.e., the maximum tolerable waiting time). The central administrator then stops receiving any further completion signals, sends participation signals (indicating that the node is allowed to participate in this round of federal aggregation) to the participating nodes that have already sent completion signals, and waits for those participants to upload their respective local models.
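A minimal sketch of this monitoring rule is shown below, assuming the completion signals arrive on a thread-safe queue; the function and parameter names are illustrative.
```python
import queue
import time

def collect_period(signal_queue, count_threshold, period_timeout):
    """Return the nodes whose stage completion signals arrived before the period closed."""
    senders = []
    deadline = None
    while True:
        try:
            node_id = signal_queue.get(timeout=0.1)
        except queue.Empty:
            node_id = None
        if node_id is not None:
            if deadline is None:                              # first signal opens the target period
                deadline = time.monotonic() + period_timeout
            senders.append(node_id)
        if deadline is not None and (
                len(senders) >= count_threshold or time.monotonic() >= deadline):
            return senders                                    # these nodes join this round's aggregation
```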
In order to improve the reliability and effectiveness of acquiring the stage training result data for each stage completion signal received in the target period, in an embodiment of the federal learning method based on the differential privacy law provided in the present application, step 210 of the federal learning method based on the differential privacy law specifically includes the following contents:
step 211: ending the monitoring for the current target period;
step 212: respectively sending a stage training result request to the federal learning participation nodes which send out each stage completion signal in the target period based on a preset signal loss prevention communication protocol, so that each federal learning participation node which receives the stage training result request respectively sends a local model corresponding to each stage completion signal based on the signal loss prevention communication protocol;
it is to be understood that the loss of signal communication protocol may employ the TCP/IP protocol or the like. In order to provide delivery guarantees (e.g., retransmission may occur after loss of the transmission signal) and order guarantees, it is proposed to use the TCP/IP protocol to perform the above-described communication procedures.
Step 213: and receiving each local model, and acquiring a model update value corresponding to each local model.
In order to further improve the accuracy of the differential privacy training model, in an embodiment of the federal learning method based on differential privacy laws provided in the present application, step 230 of the federal learning method based on differential privacy laws specifically includes the following contents:
step 231: and acquiring a second-order norm corresponding to each local model.
Step 232: and respectively generating the global sensitivities respectively corresponding to the local models according to the relative old values and the second-order norms respectively corresponding to the local models.
Step 233: and respectively cutting model updating values corresponding to the local models based on the global sensitivities corresponding to the local models, and aggregating the cut model updating values to obtain model updating aggregation values.
Step 234: and setting Gaussian noise in the process of training the target model according to the model update aggregation value and each local model based on a differential privacy method, and stopping training the target model when the privacy budget corresponding to the Gaussian noise is 0.
For all selected participants, i.e. the individual clients, their respective model update values Δw_i^(t+1) and second-order norms δ_i are first calculated. Because the model update matrix is large, the application uses a clipping technique based on the global sensitivity S to reduce the range of the model update values, so that the norm of each model update value Δw_i^(t+1) is bounded by the global sensitivity. The application uses the relative staleness values of the nodes participating in each round to determine the global sensitivity S of that round; this ensures that S is neither too large nor too small, so that the range of gradient clipping can be controlled more accurately. The aggregated model update value Δw^(t+1) is obtained by summing the clipped model update values of the selected nodes weighted by their contributions, where p_i = RS_i·η denotes the contribution of node i, Z_t denotes the set of nodes selected in round t, δ_i denotes the second-order norm of the model update value of node i, Δw_i^(t+1) denotes the model update value of the i-th node, and Δw^(t+1) denotes the aggregated model update value.
Finally, Gaussian noise is added to the aggregated federal model to provide differential privacy protection. The upper privacy limit is set to Q; a certain amount of privacy budget is consumed every time Gaussian noise is added, and when the privacy budget is exhausted, the aggregation and noise-adding process of the whole federal system stops. The target model of round t+1 is obtained by adding to the current target model the average, over the m_t participating nodes, of the aggregated clipped updates plus Gaussian noise N(0, S²σ²), where m_t denotes the number of nodes participating in round t, S is the global sensitivity, and σ² is the variance of the Gaussian noise.
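A minimal sketch of this aggregation step is given below. It follows the standard clip-then-add-noise pattern; since the exact expressions for the per-round global sensitivity S and the weights p_i appear only as equation images in the original, the concrete choices here (S passed in as a parameter, weights RS_i * eta) are illustrative assumptions.
```python
import numpy as np

def dp_semi_sync_aggregate(updates, staleness, S, sigma, eta=1.0, rng=None):
    """updates: node_id -> update vector; staleness: node_id -> RS_i; returns the noised aggregate."""
    rng = rng or np.random.default_rng()
    m_t = len(updates)
    clipped_sum = np.zeros_like(next(iter(updates.values())))
    for i, delta_w in updates.items():
        norm = np.linalg.norm(delta_w)                  # second-order norm delta_i
        clipped = delta_w / max(1.0, norm / S)          # bound each update's norm by S
        clipped_sum += staleness[i] * eta * clipped     # staleness-weighted contribution p_i = RS_i * eta
    noise = rng.normal(0.0, S * sigma, size=clipped_sum.shape)   # Gaussian noise N(0, S^2 sigma^2)
    return (clipped_sum + noise) / m_t
```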
In addition, because of the privacy budget limitation in the differential privacy law, each round of federal aggregation consumes a certain amount of privacy budget, and when the privacy budget is consumed down to 0 the training process has to stop in order to maintain differential privacy protection. As a result, under privacy budget constraints the federal model has typically not yet converged when the training of the federal learning system is cut short, which severely impacts the final accuracy of the model.
That is to say, although directly adding a noise perturbation algorithm to the aggregated model can resist model inference attacks, model extraction attacks, backdoor attacks and the like and improve the differential privacy protection of the federal learning system, it seriously reduces the accuracy of the federal aggregation model, leaving the final aggregated model in an unusable state. The federal system then loses the balance between data availability and privacy protection, which limits the large-scale use of differential privacy federal learning systems in enterprises.
Based on this, after solving the prior-art problems of long waiting times of participating nodes and wasted computing resources in the federal learning process based on the differential privacy technology, which in turn cause the low convergence rate and poor training efficiency of the target model in federal learning, and in order to further address the prior-art problem that the federal model has not yet converged when training of the federal learning system is stopped early, which severely affects the final accuracy of the model, in an embodiment of the federal learning method based on the differential privacy law provided by the application, the federal learning method based on the differential privacy law further specifically includes the following steps after step 200:
step 300: judging whether the target model currently meets a preset convergence requirement or not;
If not, return to steps 100 and 200, namely: re-receive a stage completion signal corresponding to any one of the sets of training data, start monitoring the signal reception condition data in a new target period, and, if the signal reception condition data in the new target period are monitored to satisfy the preset semi-synchronous training rule, acquire the stage training result data corresponding to each stage completion signal received in the new target period and retrain the target model based on the differential privacy method and the stage training result data.
If so, finish the training and send the finally obtained trained model to the users for practical application.
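A minimal sketch of the resulting outer loop is given below, reusing the collect_period() and dp_semi_sync_aggregate() sketches above; fetch_updates() and has_converged() are hypothetical placeholders for requesting the local models over the signal loss prevention protocol and for the convergence test.
```python
def train_global_model(global_model, privacy_budget_q, cost_per_round,
                       signal_queue, count_threshold, period_timeout, S, sigma,
                       fetch_updates, has_converged):
    """Repeat semi-synchronous rounds until the model converges or the privacy budget runs out."""
    remaining_budget = privacy_budget_q
    while remaining_budget > 0:                       # privacy budget exhausted -> stop
        senders = collect_period(signal_queue, count_threshold, period_timeout)
        updates, staleness = fetch_updates(senders)   # hypothetical: request each node's local update
        global_model = global_model + dp_semi_sync_aggregate(updates, staleness, S, sigma)
        remaining_budget -= cost_per_round            # every noisy aggregation spends privacy budget
        if has_converged(global_model):               # hypothetical convergence check (step 300)
            break
    return global_model
```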
Based on the above, the present application further provides a federal learning device based on the differential privacy law for implementing the federal learning method based on the differential privacy law provided in one or more embodiments of the present application. A specific implementation of the federal learning device based on the differential privacy law may be a client. In a specific example, referring to fig. 5, the federal learning device based on the differential privacy law specifically includes the following contents:
the signal monitoring module 10 is configured to start to monitor signal reception condition data in a target period based on a stage completion signal corresponding to any one of currently received multiple sets of training data for federal learning;
and a semi-synchronous training module 20, configured to, if it is monitored that the signal reception condition data in the target period meets a preset semi-synchronous training rule, obtain stage training result data corresponding to each stage completion signal received in the target period, and train a target model based on a differential privacy method and the stage training result data.
The embodiment of the federal learning device based on the differential privacy law provided in the present application may be specifically used to execute the processing flow of the embodiment of the federal learning method based on the differential privacy law in the above embodiments, and its functions are not described again here; reference may be made to the detailed description of the embodiment of the federal learning method based on the differential privacy law.
It can be seen from the above description that the federal learning device based on the differential privacy law provided in the embodiment of the present application, upon monitoring that the stage completion signal reception condition data in the target period satisfy the preset semi-synchronous training rule, acquires the stage training result data corresponding to each stage completion signal received in the target period and trains the target model based on a preset differential privacy method and the stage training result data. By performing semi-synchronous model training with the faster participating nodes that reflect the normal speed of the current network environment, it can effectively avoid interference from the straggler effect, avoid wasting computing resources, improve the convergence speed of the target model, and thereby effectively improve the training efficiency and application timeliness of the target model.
In order to further explain the scheme, the application also provides a differential privacy federal learning framework based on semi-synchronous communication. This application example aims to provide, on the basis of the existing federal learning system, a differential privacy federal learning framework based on semi-synchronous communication. A brand-new method for determining the semi-synchronous time node is proposed to realize a federal learning system based on semi-synchronous communication, a fair hierarchical weighted random client selection algorithm is proposed to select suitable clients to participate in federal aggregation, and the relative staleness is used to limit the global sensitivity so as to bound the range of gradient clipping more accurately, thereby ensuring that the whole federal learning system generates a high-precision federal aggregation model on the premise of providing differential privacy protection.
The application example performs model training on top of a traditional horizontal federated learning framework and a deep neural network framework (such as PyTorch or TensorFlow), aggregates the federal models with relative-staleness weighting, and adds appropriate Gaussian noise to the aggregated model to provide differential privacy protection, thereby forming a differential privacy federal learning system.
First, for the semi-synchronous communication part, the application example provides a brand-new method for determining the semi-synchronous time node and introduces the new concept of the relative staleness value for the subsequent federal weighted aggregation and Gaussian noise addition.
Referring to fig. 6, the application example of the present application proposes a brand-new method for determining the semi-synchronous time node to realize a federal learning system based on semi-synchronous communication. It can be understood that past research on semi-synchronous communication for federal learning has neglected or bypassed the determination of the semi-synchronous time point of the federal nodes, whereas the application example provides a completely new way to determine it. All participants first train locally for a base number of rounds (for example, the base is set to five rounds); a participant that has completed this training sends a completion signal to the central administrator, indicating that the base number of rounds has been completed, and then continues training locally. The central administrator starts a timer after receiving the first completion signal (the specific time setting is related to the average completion speed of the nodes in the actual federal system). The central administrator keeps receiving completion signals from the participants until the number of completion signals reaches a target value or the timer reaches its time limit (i.e., the maximum tolerable waiting time); it then stops receiving any further completion signals, sends participation signals (indicating that the node is allowed to participate in this round of federal aggregation) to the participating nodes that have already sent completion signals, and waits for those participants to upload their respective local models. In order to provide delivery guarantees (for example, retransmission after a transmitted signal is lost) and ordering guarantees, the application example suggests using the TCP/IP protocol for the above communication procedure.
Because each participant trains locally a different number of times, the local model generated by each node differs to some extent from the ideal model. For example, in an extreme case, node A may train locally only once while node B trains locally E_max times; the local models trained by node A and node B will then necessarily deviate from the ideal model.
The application proposes a relative staleness value RS_i to characterize this difference. Let E^t denote the total step length of the target model in round t, E^{t+1} denote the total step length of the target model in round t+1, and E_i^{t+1} denote the total step length of client i in round t+1. Therefore:

$$RS_i = \frac{E_i^{t+1}}{E^{t+1} - E^{t}}$$
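As a concrete illustration of this definition, here is a minimal sketch that computes RS_i for the selected clients, assuming the step counts are tracked as plain integers on the central server; the helper name and data layout are illustrative, not part of the original disclosure.

    def relative_staleness(client_steps, model_steps_prev, model_steps_curr):
        """Compute the relative staleness value RS_i for each selected client i.

        client_steps:     dict mapping client id -> total step length of client i in round t+1
        model_steps_prev: total step length of the target model in round t
        model_steps_curr: total step length of the target model in round t+1
        """
        step_gap = model_steps_curr - model_steps_prev  # step length difference of the target model
        return {i: steps / step_gap for i, steps in client_steps.items()}

    # Example: the target model advanced by 100 steps in round t+1;
    # a client that contributed 20 of those steps has RS_i = 0.2.
    rs = relative_staleness({"client_a": 20, "client_b": 35}, 400, 500)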
Next, the aggregation process of the central server is described.
For all selected participants, i.e., the individual clients, their respective model update values Δw_i^{t+1} and the corresponding second-order norms δ_i are first calculated. Because the model update matrix can be very large, the method uses a clipping technique to restrict the range of the model update values with the global sensitivity S, so that each model update value Δw_i^{t+1} is confined within the global sensitivity. The application uses the relative staleness values to determine the global sensitivity of each round; let:
$$S = \frac{\sum_{i \in Z_t} RS_i\,\delta_i}{\sum_{i \in Z_t} RS_i}$$
Therefore, because the relative staleness of the nodes participating in each round is used for this determination, the global sensitivity S is guaranteed to be neither too large nor too small, and the range of gradient clipping can be controlled more accurately. The aggregated model update value Δw^{t+1} is:
$$\Delta w^{t+1} = \sum_{i \in Z_t} p_i \cdot \frac{\Delta w_i^{t+1}}{\max\left(1,\ \delta_i / S\right)}$$
where p_i = RS_i · η denotes the contribution of node i, Z_t denotes the set of nodes selected in round t, δ_i denotes the second-order norm of the model update value of node i, Δw_i^{t+1} denotes the model update value of the i-th node, and Δw^{t+1} denotes the aggregated model update value.
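The following is a minimal sketch of the sensitivity calculation, clipping, and staleness-weighted aggregation described above, assuming the client updates are NumPy arrays; the exact normalization of the global sensitivity and the scaling factor eta (corresponding to η in p_i = RS_i · η) are assumptions rather than the patent's exact formulas.

    import numpy as np

    def aggregate_updates(updates, staleness, eta):
        """Clip each client's model update by the staleness-weighted global sensitivity
        and aggregate the clipped updates with contribution weights p_i = RS_i * eta.

        updates:   dict client id -> flattened model update vector (np.ndarray)
        staleness: dict client id -> relative staleness value RS_i
        eta:       scaling factor in the contribution weight p_i
        """
        norms = {i: np.linalg.norm(u) for i, u in updates.items()}  # second-order norms delta_i
        total_rs = sum(staleness.values())
        # Staleness-weighted global sensitivity (assumed form of the weighting).
        sensitivity = sum(staleness[i] * norms[i] for i in updates) / total_rs

        aggregated = np.zeros_like(next(iter(updates.values())))
        for i, u in updates.items():
            clipped = u / max(1.0, norms[i] / sensitivity)  # confine the update within S
            aggregated += staleness[i] * eta * clipped      # weight by the node's contribution
        return aggregated, sensitivity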
Finally, Gaussian noise is added to the aggregated federal model to provide differential privacy protection. The privacy upper bound is set to Q; each addition of Gaussian noise consumes part of the privacy budget, and once the privacy budget is exhausted, the aggregation and noise-adding process of the whole federal system stops.
$$w^{t+1} = w^{t} + \frac{1}{m_t}\left(\Delta w^{t+1} + \mathcal{N}\left(0,\ S^2\sigma^2\right)\right)$$
where m_t denotes the number of nodes participating in round t, N(0, S²σ²) denotes the Gaussian noise, S is the global sensitivity, and σ² is the variance of the Gaussian noise.
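A minimal sketch of this noise-addition and budget-accounting step is shown below. It assumes a simple additive privacy accountant in which each noisy release consumes a fixed per-round cost, and the 1/m_t scaling follows the formula reconstructed above; both the accountant and the helper names are assumptions rather than the original disclosure.

    import numpy as np

    class PrivacyBudget:
        """Simple additive privacy accountant with upper limit Q (assumed accounting rule)."""
        def __init__(self, q_total, q_per_round):
            self.remaining = q_total
            self.q_per_round = q_per_round

        def spend(self):
            self.remaining -= self.q_per_round
            return self.remaining > 0  # False once the budget is exhausted

    def noisy_global_update(global_weights, aggregated_update, m_t, sensitivity, sigma, budget):
        """Apply the aggregated update with Gaussian noise N(0, S^2 * sigma^2), scaled by 1/m_t."""
        if not budget.spend():
            return None  # budget exhausted: stop the aggregation and noise-adding process
        noise = np.random.normal(0.0, sensitivity * sigma, size=global_weights.shape)
        return global_weights + (aggregated_update + noise) / m_t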
To realize a differential privacy federal learning framework based on semi-synchronous communication, accelerate the convergence of the federal model while satisfying data availability and privacy protection, and thereby improve the final accuracy of the federal model, the application example provides the following improvements:
1. A new method for determining the semi-synchronous time node is provided to realize a federal learning system based on semi-synchronous communication. After the semi-synchronous time node is determined, the federal system achieves semi-synchronous communication: nodes rich in computing resources are allowed to perform more training locally, while nodes with limited computing resources only need to complete the base number of training rounds. Meanwhile, the relative staleness value is used to characterize the gap between nodes, so that the subsequent weighted aggregation in the federal aggregation process is carried out according to the relative staleness values, controlling the proportion of each node's contribution.
2. A novel method for determining the global sensitivity is provided: the global sensitivity of the current round is determined by weighting the relative staleness values of the participating nodes, ensuring that the model update values are confined within the range of the global sensitivity. This keeps the global sensitivity S from being too large or too small and controls the range of gradient clipping more accurately.
3. An efficient federal aggregation and noise perturbation method is provided: the sum of the model update values of all nodes is first scaled under a Gaussian mechanism, and the aggregated model update value is then weighted by the contribution degree of each node, so that an approximation of the true average of all user updates is obtained without leaking related personal information. Gaussian noise is then added to the aggregated model, and the variance of the Gaussian noise is controlled using the global average value, so that the noise of each round is controlled more accurately.
Based on this method, compared with a synchronous communication method, the semi-synchronous communication method makes full use of the idle computing resources of high-computing-power nodes, keeping every node in an efficient computing state, fundamentally improving resource utilization, strengthening the aggregation process of federal learning, accelerating the convergence of the aggregation model, and improving the robustness of the federal system. Meanwhile, the application example provides a new method for determining the global sensitivity: the global sensitivity is determined from the relative staleness of each round, so that the gradient clipping range of each round is controlled more accurately and the effect of federal aggregation is improved. Because the convergence rate of the model is improved and its aggregation effect enhanced, the accuracy the model can finally reach within a limited number of communication rounds (limited by the privacy budget) is greatly improved, and the training capability of the model is remarkably improved.
From a hardware aspect, the present application provides an embodiment of an electronic device for implementing all or part of the content in the differential privacy law-based federal learning method or model training method, where the electronic device specifically includes the following contents:
fig. 7 is a schematic block diagram of a system configuration of an electronic device 9600 according to an embodiment of the present application. As shown in fig. 7, the electronic device 9600 can include a central processor 9100 and a memory 9140; the memory 9140 is coupled to the central processor 9100. Notably, this fig. 7 is exemplary; other types of structures may also be used in addition to or in place of the structure to implement telecommunications or other functions.
In one embodiment, the federated learning functionality based on differential privacy laws may be integrated into a central processor. Wherein the central processor may be configured to control:
step 100: and starting to monitor signal receiving condition data in a target period based on a stage completion signal corresponding to any one of the currently received multiple pieces of training data for federal learning.
Step 200: if the signal receiving condition data in the target period meet a preset semi-synchronous training rule, acquiring stage training result data corresponding to each stage completion signal received in the target period, and training a target model based on a differential privacy method and the stage training result data.
As can be seen from the above description, the electronic device provided in the embodiment of the present application may be a central server. When it is monitored that the stage completion signal reception condition data within the target period satisfies the preset semi-synchronous training rule, the device acquires the stage training result data corresponding to each stage completion signal received within the target period, and trains the target model based on the preset differential privacy method and the stage training result data. By controlling the faster participating nodes whose current network environment is normal to perform semi-synchronous model training, the interference of the straggler effect is effectively avoided, the waste of computing resources is avoided, the convergence rate of the target model is improved, and the training efficiency and application timeliness of the target model are therefore effectively improved.
In another embodiment, the differential privacy law based federal learning or model training device may be configured separately from the central processor 9100, for example, the differential privacy law based federal learning or model training device may be configured as a chip connected to the central processor 9100, and the differential privacy law based federal learning or model training function is realized by the control of the central processor.
As shown in fig. 7, the electronic device 9600 may further include: a communication module 9110, an input unit 9120, an audio processor 9130, a display 9160, and a power supply 9170. It is noted that the electronic device 9600 also does not necessarily include all of the components shown in fig. 7; further, the electronic device 9600 may further include components not shown in fig. 7, which may be referred to in the art.
As shown in fig. 7, the central processor 9100, sometimes referred to as a controller or operation controller, can include a microprocessor or other processor device and/or logic device; the central processor 9100 receives input and controls the operation of the various components of the electronic device 9600.
The memory 9140 can be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or other suitable devices. It may store information related to failures as well as programs for processing such information, and the central processor 9100 can execute the programs stored in the memory 9140 to realize information storage, processing, or the like.
The input unit 9120 provides input to the central processor 9100. The input unit 9120 is, for example, a key or a touch input device. Power supply 9170 is used to provide power to electronic device 9600. The display 9160 is used for displaying display objects such as images and characters. The display may be, for example, an LCD display, but is not limited thereto.
The memory 9140 can be a solid-state memory, e.g., read-only memory (ROM), random access memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when powered off, can be selectively erased, and can be supplied with additional data; an example of such a memory is sometimes called an EPROM or the like. The memory 9140 could also be some other type of device. The memory 9140 includes a buffer memory 9141 (sometimes referred to as a buffer) and may include an application/function storage portion 9142, which is used for storing application programs and function programs or for the flow of operations of the electronic device 9600 executed by the central processor 9100.
The memory 9140 can also include a data store 9143, the data store 9143 being used to store data, such as contacts, digital data, pictures, sounds, and/or any other data used by an electronic device. The driver storage portion 9144 of the memory 9140 may include various drivers for the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, contact book applications, etc.).
The communication module 9110 is a transmitter/receiver 9110 that transmits and receives signals via an antenna 9111. The communication module (transmitter/receiver) 9110 is coupled to the central processor 9100 to provide input signals and receive output signals, which may be the same as in the case of a conventional mobile communication terminal.
Based on different communication technologies, a plurality of communication modules 9110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, may be provided in the same electronic device. The communication module (transmitter/receiver) 9110 is also coupled to a speaker 9131 and a microphone 9132 via an audio processor 9130 to provide audio output via the speaker 9131 and receive audio input from the microphone 9132, thereby implementing ordinary telecommunications functions. The audio processor 9130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 9130 is also coupled to the central processor 9100, thereby enabling recording locally through the microphone 9132 and enabling locally stored sounds to be played through the speaker 9131.
An embodiment of the present application further provides a computer-readable storage medium capable of implementing all steps of the differential privacy law-based federal learning method or the model training method in the foregoing embodiments. The computer-readable storage medium stores a computer program that, when executed by a processor, implements all steps of the method in the foregoing embodiments; for example, when the processor executes the computer program, it implements the following steps:
step 100: and starting to monitor signal receiving condition data in a target period based on a stage completion signal corresponding to any one of the currently received multiple pieces of training data for federal learning.
Step 200: if the signal receiving condition data in the target period meet a preset semi-synchronous training rule, acquiring stage training result data corresponding to each stage completion signal received in the target period, and training a target model based on a differential privacy method and the stage training result data.
As can be seen from the above description, when the computer-readable storage medium provided in the embodiment of the present application monitors that the stage completion signal reception condition data within the target period satisfies the preset semi-synchronous training rule, it acquires the stage training result data corresponding to each stage completion signal received within the target period and trains the target model based on the preset differential privacy method and the stage training result data. By controlling the faster participating nodes whose current network environment is normal to perform semi-synchronous model training, the interference of the straggler effect is effectively avoided, the waste of computing resources is avoided, and the convergence rate of the target model is improved, thereby effectively improving the training efficiency and application timeliness of the target model.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principle and the implementation mode of the invention are explained by applying specific embodiments in the invention, and the description of the embodiments is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A federal learning method based on a differential privacy law is characterized by comprising the following steps:
starting to monitor signal receiving condition data in a target period based on a stage completion signal corresponding to any one of a plurality of currently received training data for federal learning;
if the signal receiving condition data in the target period meet a preset semi-synchronous training rule, acquiring stage training result data corresponding to each stage completion signal received in the target period, and training a target model based on a differential privacy method and the stage training result data.
2. The federal learning method based on differential privacy laws according to claim 1, wherein the obtaining of phase training result data corresponding to each phase completion signal received in the target period and training of the target model based on each phase training result data includes:
acquiring respective stage training result data of each stage completion signal received in the target period, wherein the stage training result data comprises a local model and a model update value;
respectively acquiring the relative old value of each local model according to the training result data of each stage of the target cycle, wherein the relative old value is the ratio of the total step length of the corresponding local model in the target cycle to the step length difference of the target model, and the step length difference of the target model is the difference between the total step length of the target model in the target cycle and the total step length of the previous cycle;
and training the target model based on a differential privacy method, the corresponding relative old value and the model update value of each local model.
3. The federal learning method based on differential privacy laws according to claim 1, wherein before the monitoring of the signal reception condition data in the target period is started based on the phase completion signal corresponding to any one of the currently received multiple sets of training data for federal learning, the method further comprises:
and respectively sending the same stage training time threshold to each federal learning participation node which is set in a distributed mode, so that each federal learning participation node sends a corresponding stage completion signal and carries out local model training again based on the stage training time threshold after the times for training a local model based on local training data of each federal learning participation node meet the stage training time threshold.
4. The federal learning method based on differential privacy laws according to claim 1, wherein the monitoring of the signal reception condition data in the target period is started based on a phase completion signal corresponding to any one of the currently received multiple sets of training data for federal learning, includes:
if a stage completion signal sent by any one of the federal learning participation nodes in distributed arrangement is received, taking the stage completion signal as a first signal of the current target period and starting to monitor signal receiving condition data in the target period;
judging in real time whether the signal receiving condition data in the target period meet a preset semi-synchronous training rule, wherein the semi-synchronous training rule comprises: the total number of the stage completion signals received within the target period has reached a total number of signals threshold and/or the duration of the target period has reached a period duration threshold.
5. The federal learning method based on differential privacy laws as claimed in claim 2, wherein the obtaining of the respective phase training result data of each phase completion signal received in the target period, wherein the phase training result data includes a local model and a model update value, comprises:
ending the monitoring for the current target period;
respectively sending a stage training result request to the federal learning participation nodes which send out each stage completion signal in the target period based on a preset signal loss prevention communication protocol, so that each federal learning participation node which receives the stage training result request respectively sends a local model corresponding to each stage completion signal based on the signal loss prevention communication protocol;
and receiving each local model, and acquiring a model update value corresponding to each local model.
6. The differential privacy laws-based federal learning method as claimed in claim 2, wherein the training of the target model based on the differential privacy laws, the relative aging values and model update values corresponding to the local models respectively comprises:
acquiring a second-order norm corresponding to each local model;
respectively generating global sensitivities corresponding to the local models according to the relative aging values and the second-order norms corresponding to the local models;
respectively cutting model update values corresponding to the local models based on the global sensitivities corresponding to the local models, and aggregating the cut model update values to obtain model update aggregate values;
and setting Gaussian noise in the process of training the target model according to the model update aggregation value and each local model based on a differential privacy method, and stopping training the target model when the privacy budget corresponding to the Gaussian noise is 0.
7. The federal learning method based on differential privacy laws as claimed in any one of claims 1-6, wherein after training a target model based on differential privacy laws and training result data of each stage, further comprising:
judging whether the target model meets the preset convergence requirement or not, if not, re-receiving a stage completion signal corresponding to any one of the multiple training data sets, starting monitoring for signal receiving condition data in a new target period, if the signal receiving condition data in the new target period meets the preset semi-synchronous training rule, acquiring stage training result data corresponding to each stage completion signal received in the new target period, and re-training the target model based on a differential privacy method and each stage training result data.
8. A federal learning device based on a differential privacy law, comprising:
the signal monitoring module is used for starting to monitor signal receiving condition data in a target period based on a stage completion signal corresponding to any one of the currently received multiple pieces of training data for federal learning;
and the semi-synchronous training module is used for acquiring stage training result data corresponding to each stage completion signal received in the target period if the signal receiving condition data in the target period is monitored to meet a preset semi-synchronous training rule, and training a target model based on a differential privacy method and the stage training result data.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the differential privacy laws-based federal learning method of any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium on which a computer program is stored, the computer program, when being executed by a processor, implementing the differential privacy laws-based federal learning method as claimed in any one of claims 1 to 7.
CN202111447012.5A 2021-11-30 2021-11-30 Federal learning method and device based on differential privacy law Pending CN114358307A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111447012.5A CN114358307A (en) 2021-11-30 2021-11-30 Federal learning method and device based on differential privacy law

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111447012.5A CN114358307A (en) 2021-11-30 2021-11-30 Federal learning method and device based on differential privacy law

Publications (1)

Publication Number Publication Date
CN114358307A true CN114358307A (en) 2022-04-15

Family

ID=81096501

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111447012.5A Pending CN114358307A (en) 2021-11-30 2021-11-30 Federal learning method and device based on differential privacy law

Country Status (1)

Country Link
CN (1) CN114358307A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115829064A (en) * 2023-02-17 2023-03-21 山东浪潮科学研究院有限公司 Method, device and equipment for accelerating federated learning and storage medium
CN115829064B (en) * 2023-02-17 2023-05-05 山东浪潮科学研究院有限公司 Federal learning acceleration method, device, equipment and storage medium
CN116307138A (en) * 2023-03-01 2023-06-23 中国科学院上海微系统与信息技术研究所 Electrical fire early warning method, device, terminal and medium based on federal learning
CN115994384A (en) * 2023-03-20 2023-04-21 杭州海康威视数字技术股份有限公司 Decision federation-based device privacy protection method, system and device
CN115994384B (en) * 2023-03-20 2023-06-27 杭州海康威视数字技术股份有限公司 Decision federation-based device privacy protection method, system and device

Similar Documents

Publication Publication Date Title
CN114358307A (en) Federal learning method and device based on differential privacy law
CN112532451B (en) Layered federal learning method and device based on asynchronous communication, terminal equipment and storage medium
US10853456B1 (en) Authenticating media data based on steganographic and blockchain techniques
EP2949102B1 (en) Web ticket based upon a symmetric key for authenticating a client of a unified communications application
WO2016045284A1 (en) Terminal equipment control method, terminal equipment and system, computer storage media
CN111970277B (en) Flow identification method and device based on federal learning
WO2014183427A1 (en) Method and apparatus for displaying webcast rooms
CN106713495B (en) The method for uploading and access method in IP geographical position, device and access system
CN113079139B (en) Block chain-based consensus group master node determination method, device and system
CN110913501A (en) Consensus method suitable for wireless block chain network
CN111353892A (en) Transaction risk monitoring method and device
CN115841133A (en) Method, device and equipment for federated learning and storage medium
CN108804244A (en) Data transmission method for uplink, device and storage medium
US10003617B2 (en) Terminal and application synchronization method thereof
CN111767558B (en) Data access monitoring method, device and system
CN114124962A (en) Multi-machine room message load balancing processing method and device
CN115643105B (en) Federal learning method and device based on homomorphic encryption and depth gradient compression
CN116806038A (en) Decentralizing computer data sharing method and device
CN115510472B (en) Multi-difference privacy protection method and system for cloud edge aggregation system
CN109582901B (en) Control method, terminal, server and storage medium
CN107241788A (en) The power consumption control method and device of wearable device
CN112087365A (en) Instant messaging method and device applied to group, electronic equipment and storage medium
CN114172956B (en) Intelligent information pushing method and system
CN113966602A (en) Distributed storage of blocks in a blockchain
CN103716650A (en) Server device and information processing method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination