CN114417420A - Privacy protection method, system and terminal based on centerless streaming federated learning - Google Patents

Privacy protection method, system and terminal based on centerless streaming federated learning

Info

Publication number
CN114417420A
CN114417420A CN202210085813.XA
Authority
CN
China
Prior art keywords
parameters
sampling
edge node
privacy protection
model parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210085813.XA
Other languages
Chinese (zh)
Inventor
杨树森
任雪斌
赵鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Cumulus Technology Co ltd
Original Assignee
Hangzhou Cumulus Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Cumulus Technology Co ltd filed Critical Hangzhou Cumulus Technology Co ltd
Priority to CN202210085813.XA priority Critical patent/CN114417420A/en
Publication of CN114417420A publication Critical patent/CN114417420A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules, to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a privacy protection method, system and terminal based on centerless streaming federated learning. Edge nodes perform online learning on their local real-time data streams, and the timing of inter-node communication is determined adaptively from the change of the local model parameters. At each communication round, the model parameters are protected with a Laplace-mechanism-based privacy scheme and then shared with neighboring nodes by broadcast; at non-communication rounds no parameters are transmitted, which reduces both the communication overhead and the consumption of the privacy budget. In this way the edge nodes cooperatively train and update a dynamic model over the global data stream while preserving privacy. The method performs well in practical privacy protection scenarios of large-scale distributed collaborative online machine learning, and can be applied to data privacy protection in applications such as intelligent driving in the Internet of Vehicles, mobile social networking and online recommendation.

Description

Privacy protection method, system and terminal based on centerless streaming federated learning
Technical Field
The invention belongs to the field of artificial intelligence, and particularly relates to a privacy protection method, system and terminal based on centerless streaming federated learning.
Background
With the spread of big data technology and growing public awareness of privacy protection, data privacy protection has become one of the core bottlenecks hindering the development of big data. Data typically contain a large amount of sensitive information and belong to different owners, and are therefore highly fragmented, which ultimately leads to the now ubiquitous data silo problem. The continuous deepening of privacy awareness increasingly restricts every link of data collection, circulation and analysis, and data collection and application are also subject to a growing number of regional laws and regulations. Therefore, as digital-economy demands and data analysis technologies represented by deep learning develop further, the problems of data privacy protection and data silos become more prominent and form a serious obstacle to the development of big data and artificial intelligence.
To address the privacy problem in sensitive data, machine learning protected by differential privacy introduces random noise into the machine learning model, which limits the risk that the model indirectly leaks the privacy of its training data. Although differential privacy has a rigorous mathematical foundation and flexible implementation mechanisms, the flexibility of the data analysis tasks it supports and the utility of the resulting analysis remain quite limited. To address the problem of data that cannot be shared directly, federated learning has emerged as a new distributed machine learning paradigm: its main idea is that users do not upload data to a server for centralized training, but realize distributed model training by repeatedly exchanging intermediate information such as model parameters or gradients. While federated learning guarantees direct privacy protection because raw data are neither exposed nor exchanged, it cannot guarantee indirect privacy protection during the interaction of intermediate parameters. It follows that combining differential-privacy-based machine learning with federated learning is an effective way to address data privacy and data silos simultaneously.
In existing differential-privacy federated learning, scenarios based on a cloud center server still require model aggregation on the cloud server; the only difference lies in whether the central server is trusted. In practical scenarios, edge nodes are often not affiliated with one another and form an equal, cooperative relationship. In this case federated learning has no centrally coordinating cloud server and instead becomes centerless federated learning in the Peer-to-Peer (P2P) manner. Because of the mutual privacy-trust problem among the nodes, the interacted model parameters still need to be protected with a differential privacy mechanism. Meanwhile, most existing privacy-preserving federated learning methods target static training on batch data and lack efficient support for complex tasks and scenarios such as real-time online training on streaming data. For example, in an Internet of Vehicles scenario, an intelligent vehicle needs to acquire rich driving and road-condition information in real time and perform cooperative interaction and intelligent decision-making with neighboring vehicles, so as to achieve a series of intelligent driving functions such as path planning and real-time obstacle avoidance. On the one hand, the related driving behavior, vehicle position and vehicle environment data may contain private information about the drivers and passengers and require real-time privacy protection; on the other hand, the volume of data generated in real time by automated driving is very large, and the overhead or communication delay caused by continuous real-time communication and interaction would affect driving safety.
Disclosure of Invention
The invention aims to solve the above problems in the prior art and provides a privacy protection method, system and terminal based on centerless streaming federated learning, which effectively address the communication overhead and privacy protection problems of federated learning on streaming data in a centerless scenario.
In order to achieve this purpose, the invention adopts the following technical scheme:
a privacy protection method based on centerless streaming federal learning comprises the following steps:
Step 1: update the local model based on the edge node's randomly initialized model parameters and the prior model parameters of the current round, which each edge node predicts from the final model parameters of the previous round;
Step 2: add a noise vector calibrated to the sensitivity of the local model parameters, thereby privacy-protecting the updated local model parameters;
Step 3: each edge node shares its privacy-protected parameters with its adjacent edge nodes and simultaneously receives the parameters shared by all adjacent edge nodes;
Step 4: each edge node updates its own model parameters according to the parameters received from all adjacent edge nodes to obtain the posterior model parameters; after several updates, all edge nodes gradually converge to the same model parameters;
Step 5: perform node parameter interaction in an intermittent fashion, dividing training rounds into sampling rounds and non-sampling rounds; in a sampling round each edge node repeats steps 1 to 4, while in a non-sampling round each node performs only step 1 without broadcast communication; from round 1 onwards, in each sampling round every edge node adaptively adjusts the interval to the next sampling round according to how much the model parameters change between the two most recent sampling rounds, thereby reducing the communication frequency between devices and the consumption of the privacy budget.
The invention is further improved in that:
Updating the local model based on the edge node's randomly initialized model parameters and the prior model parameters of the current round, which the edge node predicts from the final model parameters of the previous round, specifically comprises:
At the initial moment, each edge node randomly initializes its model parameters $\mathbf{w}_i^0$. Thereafter, at each time t of each training round, the edge node uses the final model parameters $\mathbf{w}_i^{t-1}$ of the previous round t-1 to predict the prior model parameters of the current round, computes the instantaneous loss function value, and updates the local model parameters by gradient descent.
Adding a noise vector based on the calibrated sensitivity of the local model parameters, thereby privacy-protecting the updated local model parameters, specifically comprises:
estimating an upper bound on the gradient of the model parameters, computing the sensitivity of the model parameters from this bound, and adding, according to the sensitivity, a noise vector that satisfies a given differential privacy level, i.e.
$\bar{\mathbf{w}}_i^t = \tilde{\mathbf{w}}_i^t + \boldsymbol{\mu}_i^t,$
where $\tilde{\mathbf{w}}_i^t$ are the local model parameters at time t and $\boldsymbol{\mu}_i^t$ is a differential privacy noise vector, calibrated to the model parameter sensitivity, that satisfies the privacy budget $\epsilon_t$; the noise is Laplacian or Gaussian.
Each edge node sharing its privacy-protected parameters with its adjacent edge nodes while simultaneously receiving the parameters shared by all adjacent edge nodes specifically comprises:
each edge node j broadcasts its privacy-protected parameters $\bar{\mathbf{w}}_j^t$ to all its neighboring nodes $N_j$, while simultaneously receiving the parameters shared by its neighbor nodes.
Each edge node updating its own model parameters according to the parameters received from all adjacent edge nodes to obtain the posterior model parameters, with all edge nodes gradually converging to the same model parameters after several updates, specifically comprises:
after edge node i receives the parameters of all adjacent edge nodes, it cooperatively updates its model parameters according to all received parameters to obtain the posterior model parameters
$\mathbf{w}_i^t = \sum_{j \in N_i} a_{ij}\, \bar{\mathbf{w}}_j^t,$
where $a_{ij}$ are the entries of the adjacency matrix $A = [a_{ij}]$, $i, j = 1, \dots, N$;
after several rounds of collaborative training, through continuous iteration and interaction the edge nodes fuse to consistent model parameters.
Dividing training rounds into sampling rounds and non-sampling rounds based on the intermittent interaction mode, repeating steps 1 to 4 at each edge node in a sampling round, performing only step 1 without broadcast communication in a non-sampling round, and, from round 1 onwards, letting each edge node adaptively adjust the interval to the next sampling round in each sampling round according to how much the model parameters change between the two most recent sampling rounds, thereby reducing the communication frequency between devices and the consumption of the privacy budget, specifically comprises:
Initially, privacy protection is performed at the first time t = 1, and the initial sampling interval is set to I = 1; for each sampling time $t_n$, where n denotes the n-th sampling instant, the sampling interval is computed: a feedback error
$e_n = \big\| \mathbf{w}_i^{t_n} - \hat{\mathbf{w}}_i^{t_n} \big\|$
is calculated from the prior estimate and the posterior result to measure the change of the data, where non-sampling points have no defined error;
a control gain is computed from the error by PID control,
$\delta_n = C_p e_n + C_i \frac{1}{T_i} \sum_{m=n-T_i+1}^{n} e_m + C_d (e_n - e_{n-1}),$
where $C_p$, $C_i$ and $C_d$ are the proportional, integral and differential control gains, all positive and summing to 1, and $T_i$ is the integration time of the integral control;
the update of the current sampling interval $I'$ is obtained by adjusting the last sampling interval $I$ according to the control gain $\delta_n$ and the sampling-interval adjustment parameters $\theta$ and $\xi$; the next sampling time is then set to $t + I'$, thereby realizing adaptive updating of the sampling frequency.
A privacy protection system based on centerless streaming federated learning, comprising:
the first updating module updates the local model based on the edge nodes' randomly initialized model parameters and the prior model parameters of the current round, which each edge node predicts from the final model parameters of the previous round;
the noise adding module is used for adding a noise vector based on a calibration result of the local model parameter sensitivity and carrying out privacy protection on the updated local model parameter;
the data sharing module is used for sharing the parameters after the privacy protection of each edge node to the adjacent edge nodes by each edge node and simultaneously receiving the parameters shared by all the adjacent edge nodes;
the second updating module is used for updating the model parameters of each edge node according to the received parameters of all the adjacent edge nodes to obtain posterior model parameters; after a plurality of times of updating, each edge node gradually converges to obtain the same model parameter;
and the self-adaptive adjusting module is used for reducing the communication frequency and the privacy budget consumption among the devices in a self-adaptive sampling mode according to the parameter change condition in real time on the basis of each edge node.
A terminal device comprising a memory, a processor and a computer program stored in said memory and executable on said processor, said processor implementing the steps of the above method when executing said computer program.
A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
Compared with the prior art, the invention has the following beneficial effects:
according to the method, each edge node is used for training and updating the local model, the model parameters are added with differential privacy protection and then are interacted with other edge nodes, the model is updated cooperatively, and finally cooperative federal learning and model training with privacy protection of large-scale nodes in a centerless scene are achieved. According to the invention, the time correlation of the communication data stream is analyzed in real time, and the communication frequency and the consumption of privacy budget among the devices are greatly reduced in a self-adaptive sampling mode according to the data change condition.
Drawings
In order to explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and should therefore not be regarded as limiting the scope; those skilled in the art can derive other related drawings from them without inventive effort.
Fig. 1 is a flowchart of a privacy protection method based on centerless streaming federated learning according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a privacy protection system based on centerless streaming federated learning according to an embodiment of the present invention;
Fig. 3 is a block diagram of a privacy protection system based on centerless streaming federated learning according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the embodiments of the present invention, it should be noted that if terms such as "upper", "lower", "horizontal" and "inner" indicate an orientation or positional relationship, this is based on the orientation or positional relationship shown in the drawings, or on the orientation or positional relationship in which the product of the invention is usually placed when used, and is given merely for convenience and simplicity of description; it does not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and therefore cannot be understood as limiting the present invention. Furthermore, the terms "first", "second" and the like are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Furthermore, the term "horizontal", if present, does not mean that the component is required to be absolutely horizontal, but may be slightly inclined. For example, "horizontal" merely means that the direction is more horizontal than "vertical" and does not mean that the structure must be perfectly horizontal, but may be slightly inclined.
In the description of the embodiments of the present invention, it should be further noted that, unless otherwise explicitly stated or limited, the terms "disposed", "mounted" and "connected" should be interpreted broadly: a connection may, for example, be a fixed connection, a detachable connection or an integral connection; it may be mechanical or electrical; it may be direct, indirect through an intermediate medium, or an internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific situation.
The invention is described in further detail below with reference to the accompanying drawings:
the P2P communication network formed by a plurality of edge nodes is modeled as a unicom structure, denoted as G (V, E). WhereinV ═ {1, …, N } denotes a set of edge nodes, and
Figure BDA0003487812130000071
representing all edge sets. (i, j) ∈ E then means that the edge node (i, j) ∈ V can communicate directly and exchange information, i.e. is adjacent, so that an N × N adjacency matrix a ═ a may be usedij]And i, j is 1, …, N to indicate the communication mode of the centerless network. Considering symmetry, assume that adjacency-matrix A is a dual-random matrix aij=aji. Further, NjRepresenting all neighbor nodes that can communicate directly with node j, including j itself.
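The patent only assumes that A is symmetric and doubly stochastic and does not say how A is constructed. As one possible illustration, the following sketch builds such a mixing matrix from an undirected topology with Metropolis-Hastings weights, a common choice in decentralized learning; the helper name metropolis_weights is an assumption of this description, not part of the invention.

```python
import numpy as np

def metropolis_weights(adjacency):
    """Build a symmetric, doubly stochastic mixing matrix A = [a_ij] from a 0/1
    undirected adjacency matrix (no self-loops) using Metropolis-Hastings weights."""
    n = adjacency.shape[0]
    deg = adjacency.sum(axis=1)                      # node degrees
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i, j]:
                A[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        A[i, i] = 1.0 - A[i].sum()                   # self-weight so every row sums to one
    return A
```

For a connected topology the resulting matrix satisfies $a_{ij} = a_{ji}$ with rows and columns summing to one, matching the doubly stochastic assumption above.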
The N edge nodes perform centerless online federated learning. At each training round $t = 1, \dots, T$, each edge node i performs an independent learning task (e.g. a classifier) $f_i: X \to Y$, i.e. based on the data $\mathbf{x}_i^t$ received from the convex set X it gives a prediction result $\hat{y}_i^t = f_i(\mathbf{x}_i^t; \mathbf{w}_i^t)$. To measure the accuracy of the prediction, it is compared with the true result $y_i^t$ and a loss function $\ell_i^t(\mathbf{w}_i^t) = \ell\big(f_i(\mathbf{x}_i^t; \mathbf{w}_i^t), y_i^t\big)$ is calculated. For all nodes, the federated learning training goal is to use local information and exchange privacy-protected information with direct neighbors so as to minimize the global loss function over the time span T,
$\min_{\{\mathbf{w}_i^t\}} \sum_{t=1}^{T} \sum_{i=1}^{N} \ell_i^t(\mathbf{w}_i^t).$
Referring to fig. 1, the invention discloses a privacy protection method based on centerless streaming federated learning, which comprises the following steps:
referring to fig. 2, there are a total of several (say m) distributed edge servers (Di)stributed Server) to participate in the online federal learning training process of streaming data. At any time t, the local data received by the edge server i is
Figure BDA0003487812130000077
Each server carries out local model training based on local data, then adds differential privacy noise to model parameters to realize local privacy protection of the model, and it needs to be noted that the privacy protection of the model is dynamically determined according to self-adaptive sampling, the server does not need to interact with other servers at non-sampling time, and does not need to carry out model privacy protection, and the privacy protection and parameter interactive communication of the model are carried out only at sampling time. Then, on one hand, the server firstly carries out local prediction on the model parameters based on the historical information to obtain estimated model parameters, and on the other hand, the server sends the updated model parameters to the neighbor nodes through broadcasting. Meanwhile, the server receives messages sent by other neighbor nodes in a broadcast mode together with the local model of the server to obtain updated model parameters. And estimating the dynamic of the current training data stream of the node according to the change condition between the updated model parameters and the estimation model parameters, thereby adaptively adjusting the next sampling time. Eventually, the nodes converge consistently to the global model due to cooperative communication and interaction between the nodes.
S101: update the local model based on the edge node's randomly initialized model parameters and the prior model parameters of the current round, which the edge node predicts from the final model parameters of the previous round.
At the initial moment, each edge node randomly initializes its model parameters $\mathbf{w}_i^0$. Thereafter, at each time t of each training round, each edge node uses the final model parameters of the previous round t-1 (the posterior model parameters of round t-1), $\mathbf{w}_i^{t-1}$, to predict the prior model parameters of the current round. For example, considering the temporal correlation of the data stream, one may set $\hat{\mathbf{w}}_i^t = \mathbf{w}_i^{t-1}$. The edge node then computes the instantaneous loss function value $\ell_i^t(\hat{\mathbf{w}}_i^t)$ and updates the local model parameters by gradient descent,
$\tilde{\mathbf{w}}_i^t = \Pi_{\mathcal{W}}\big(\hat{\mathbf{w}}_i^t - \eta_t \nabla \ell_i^t(\hat{\mathbf{w}}_i^t)\big),$
where the updated parameters are mapped onto the constraint set $\mathcal{W}$ by the projection operation $\Pi_{\mathcal{W}}(\cdot)$.
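As an illustration of S101, a minimal sketch is given below, assuming the simple prior prediction $\hat{\mathbf{w}}_i^t = \mathbf{w}_i^{t-1}$, a user-supplied loss gradient, and an L2 ball as the constraint set; the names grad_loss, lr and radius are assumptions of this sketch rather than quantities defined by the patent.

```python
import numpy as np

def local_update(w_prev, x_t, y_t, grad_loss, lr=0.1, radius=10.0):
    """One online round of S101: predict the prior from the previous round's final
    parameters, take one gradient step on the new sample, and project back onto
    the constraint set (here assumed to be an L2 ball of the given radius)."""
    w_prior = w_prev.copy()                       # prior prediction: hat(w)_i^t = w_i^{t-1}
    g = grad_loss(w_prior, x_t, y_t)              # gradient of the instantaneous loss
    w_local = w_prior - lr * g                    # gradient descent step
    norm = np.linalg.norm(w_local)
    if norm > radius:                             # projection onto the constraint set
        w_local = w_local * (radius / norm)
    return w_prior, w_local
```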
S102: add a noise vector based on the calibrated sensitivity of the local model parameters, and privacy-protect the updated local model parameters.
An upper bound on the gradient of the model parameters is estimated, the sensitivity of the model parameters is computed from this bound, and a noise vector satisfying a given differential privacy level is added according to the sensitivity to privacy-protect the updated local model parameters,
$\bar{\mathbf{w}}_i^t = \tilde{\mathbf{w}}_i^t + \boldsymbol{\mu}_i^t,$
where $\tilde{\mathbf{w}}_i^t$ are the local model parameters at time t and $\boldsymbol{\mu}_i^t$ is a differential privacy noise vector, calibrated to the model parameter sensitivity, that satisfies the privacy budget $\epsilon_t$; the noise may be Laplacian or Gaussian. To make the whole time horizon of length T satisfy $\epsilon$-differential privacy, the per-round budgets are composed so that $\sum_{t} \epsilon_t = \epsilon$, with the sum taken over the sampling times.
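A minimal sketch of S102 with the Laplace mechanism follows. The concrete sensitivity formula (2 * lr * grad_bound, i.e. the largest possible change of one gradient step when a single sample differs) and the per-round budget eps_t are illustrative assumptions; the patent only states that the sensitivity is derived from an estimated upper bound on the parameter gradient.

```python
import numpy as np

def add_laplace_noise(w_local, grad_bound, lr, eps_t, rng=None):
    """S102 sketch: calibrate Laplace noise to an assumed sensitivity of one
    gradient step and add it to the updated local parameters."""
    rng = rng or np.random.default_rng()
    sensitivity = 2.0 * lr * grad_bound               # assumed sensitivity of the update
    scale = sensitivity / eps_t                        # Laplace scale b = sensitivity / eps_t
    noise = rng.laplace(0.0, scale, size=w_local.shape)
    return w_local + noise
```

Under sequential composition the per-round budgets of the sampling rounds add up, which is why skipping communication in non-sampling rounds directly saves privacy budget.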
S103: each edge node shares its privacy-protected parameters with its adjacent edge nodes and simultaneously receives the parameters shared by all adjacent edge nodes.
To ensure that the edge nodes cooperate effectively without the coordination of a central server, each edge node j broadcasts its privacy-protected parameters $\bar{\mathbf{w}}_j^t$ to all its neighboring nodes $N_j$, while simultaneously receiving the parameters shared by its neighbor nodes. Because of the differential privacy protection in S102, the broadcast parameter information is already privacy-protected, which effectively limits the node privacy that can be leaked.
S104: each edge node updates its own model parameters according to the parameters received from all adjacent edge nodes to obtain the posterior model parameters; after several updates, all edge nodes gradually converge to the same model parameters.
After receiving the parameters of all adjacent edge nodes, edge node i cooperatively updates its model parameters according to all received parameters (i.e. performs a linear combination) to obtain the posterior model parameters
$\mathbf{w}_i^t = \sum_{j \in N_i} a_{ij}\, \bar{\mathbf{w}}_j^t,$
where $a_{ij}$ are the entries of the adjacency matrix $A = [a_{ij}]$, $i, j = 1, \dots, N$.
After several rounds of collaborative training, through continuous iteration and interaction the edge nodes fuse to consistent model parameters, i.e. distributed global model training is achieved without relying on a central server.
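S104 is a plain weighted combination over the neighbourhood; a minimal sketch could look as follows, where the dictionary keys are node identifiers and weights[j] plays the role of a_ij (the data layout is an assumption of this sketch).

```python
def aggregate(received, weights):
    """S104 sketch: linear combination of the noisy parameters received from the
    neighbourhood N_i (including the node's own noisy parameters) with weights a_ij."""
    return sum(weights[j] * w_j for j, w_j in received.items())
```

With a doubly stochastic mixing matrix A, repeating this combination round after round drives all nodes toward consistent parameters, which is the convergence behaviour described above.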
S105: perform node parameter interaction in an intermittent fashion, dividing training rounds into sampling rounds and non-sampling rounds; in a sampling round each edge node repeats S101 to S104, while in a non-sampling round each node performs only S101 without broadcast communication; from round 1 onwards, in each sampling round every edge node adaptively adjusts the interval to the next sampling round according to how much the model parameters change between the two most recent sampling rounds, reducing the communication frequency between devices and the consumption of the privacy budget.
In online learning, the privacy budget and the communication overhead both accumulate over time. Therefore an adaptive sampling method is used to reduce the state-update frequency of the online learning process: the temporal correlation of the data stream is analyzed in real time, and the communication frequency between devices and the consumption of the privacy budget are reduced by adaptive sampling according to how the parameters change. In the following, an adaptive asynchronous acquisition process based on PID (proportional, integral and derivative) control is adopted. Initially, privacy protection is performed at the first time t = 1, so the initial sampling interval is set to I = 1. Thereafter, for each sampling time $t_n$, where n denotes the n-th sampling instant, the sampling interval is computed. A feedback error
$e_n = \big\| \mathbf{w}_i^{t_n} - \hat{\mathbf{w}}_i^{t_n} \big\|$
is calculated from the prior estimate and the posterior result to measure the change of the data; non-sampling points have no defined error. A control gain is then computed from the error by PID control,
$\delta_n = C_p e_n + C_i \frac{1}{T_i} \sum_{m=n-T_i+1}^{n} e_m + C_d (e_n - e_{n-1}),$
where $C_p$, $C_i$ and $C_d$ are the proportional, integral and differential control gains, all positive and summing to 1, and $T_i$ is the integration time of the integral control. The update of the current sampling interval $I'$ is obtained by adjusting the last sampling interval $I$ according to the control gain $\delta_n$ and the sampling-interval adjustment parameters $\theta$ and $\xi$, which can be tuned manually. According to this sampling interval, the next sampling time is set to $t + I'$, so that the sampling frequency is updated adaptively.
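The following sketch illustrates the PID-style interval adaptation of S105. The feedback error history, the discrete PID terms and in particular the interval-update rule I' = max(1, I + theta * (1 - (delta / xi) ** 2)) are reconstructions in the spirit of PID-based adaptive sampling, not formulas quoted from the patent.

```python
def pid_next_interval(errors, last_interval, Cp=0.6, Ci=0.3, Cd=0.1,
                      Ti=5, theta=10.0, xi=0.1):
    """S105 sketch: compute the next sampling interval from the history of
    feedback errors (posterior vs. prior parameter change) with PID control.
    Cp + Ci + Cd = 1; Ti is the integration window; theta and xi are tuning knobs."""
    e = errors[-1]                                            # proportional term
    window = errors[-Ti:]
    integral = sum(window) / len(window)                      # integral term over the last Ti errors
    derivative = errors[-1] - errors[-2] if len(errors) > 1 else 0.0
    delta = Cp * e + Ci * integral + Cd * derivative          # PID control gain
    # large change (delta > xi) shortens the interval, small change lengthens it
    return max(1, round(last_interval + theta * (1.0 - (delta / xi) ** 2)))
```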
Referring to fig. 3, the present invention discloses a privacy protection system based on centerless streaming federated learning, which includes:
the first updating module updates the local model based on the edge nodes' randomly initialized model parameters and the prior model parameters of the current round, which each edge node predicts from the final model parameters of the previous round;
the noise adding module is used for adding a noise vector based on a calibration result of the local model parameter sensitivity and carrying out privacy protection on the updated local model parameter;
the data sharing module is used for sharing the parameters after the privacy protection of each edge node to the adjacent edge nodes by each edge node and simultaneously receiving the parameters shared by all the adjacent edge nodes;
the second updating module is used for updating the model parameters of each edge node according to the received parameters of all the adjacent edge nodes to obtain posterior model parameters; after a plurality of times of updating, each edge node gradually converges to obtain the same model parameter;
and the self-adaptive adjusting module is used for reducing the communication frequency and the privacy budget consumption among the devices in a self-adaptive sampling mode according to the parameter change condition in real time on the basis of each edge node.
An embodiment of the invention further provides a terminal device. The terminal device of this embodiment includes a processor, a memory, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor realizes the steps of the above method embodiments; alternatively, the processor implements the functions of the modules/units in the above device embodiments.
The computer program may be partitioned into one or more modules/units that are stored in the memory and executed by the processor to implement the invention.
The terminal device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The terminal device may include, but is not limited to, a processor, a memory.
The processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc.
The memory may be used for storing the computer programs and/or modules, and the processor may implement various functions of the terminal device by executing or executing the computer programs and/or modules stored in the memory and calling data stored in the memory.
If the integrated modules/units of the terminal device are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments are implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunication signals.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A privacy protection method based on centerless streaming federated learning, characterized by comprising the following steps:
Step 1: update the local model based on the edge node's randomly initialized model parameters and the prior model parameters of the current round, which each edge node predicts from the final model parameters of the previous round;
Step 2: add a noise vector calibrated to the sensitivity of the local model parameters, thereby privacy-protecting the updated local model parameters;
Step 3: each edge node shares its privacy-protected parameters with its adjacent edge nodes and simultaneously receives the parameters shared by all adjacent edge nodes;
Step 4: each edge node updates its own model parameters according to the parameters received from all adjacent edge nodes to obtain the posterior model parameters; after several updates, all edge nodes gradually converge to the same model parameters;
Step 5: perform node parameter interaction in an intermittent fashion, dividing training rounds into sampling rounds and non-sampling rounds; in a sampling round each edge node repeats steps 1 to 4, while in a non-sampling round each node performs only step 1 without broadcast communication; from round 1 onwards, in each sampling round every edge node adaptively adjusts the interval to the next sampling round according to how much the model parameters change between the two most recent sampling rounds, thereby reducing the communication frequency between devices and the consumption of the privacy budget.
2. The privacy protection method based on centerless streaming federated learning of claim 1, wherein updating the local model based on the edge node's randomly initialized model parameters and the prior model parameters of the current round, which the edge node predicts from the final model parameters of the previous round, specifically comprises:
at the initial moment, each edge node randomly initializes its model parameters $\mathbf{w}_i^0$; thereafter, at each time t of each training round, the edge node uses the final model parameters $\mathbf{w}_i^{t-1}$ of the previous round t-1 to predict the prior model parameters of the current round, computes the instantaneous loss function value, and updates the local model parameters by gradient descent.
3. The privacy protection method based on centerless streaming federated learning of claim 1, wherein adding a noise vector based on the calibrated sensitivity of the local model parameters to privacy-protect the updated local model parameters specifically comprises:
estimating an upper bound on the gradient of the model parameters, computing the sensitivity of the model parameters from this bound, and adding, according to the sensitivity, a noise vector that satisfies a given differential privacy level, thereby privacy-protecting the updated local model parameters, i.e.
$\bar{\mathbf{w}}_i^t = \tilde{\mathbf{w}}_i^t + \boldsymbol{\mu}_i^t,$
where $\tilde{\mathbf{w}}_i^t$ are the local model parameters at time t and $\boldsymbol{\mu}_i^t$ is a differential privacy noise vector, calibrated to the model parameter sensitivity, that satisfies the privacy budget $\epsilon_t$; the noise is Laplacian or Gaussian.
4. The privacy protection method based on centerless streaming federated learning of claim 1, wherein each edge node sharing its privacy-protected parameters with its adjacent edge nodes while simultaneously receiving the parameters shared by all adjacent edge nodes specifically comprises:
each edge node j broadcasts its privacy-protected parameters $\bar{\mathbf{w}}_j^t$ to all its neighboring nodes $N_j$, while simultaneously receiving the parameters shared by its neighbor nodes.
5. The privacy protection method based on centerless streaming federated learning of claim 1, wherein each edge node updates its own model parameters according to the parameters received from all adjacent edge nodes to obtain the posterior model parameters, and after several updates all edge nodes gradually converge to the same model parameters, specifically comprising:
after edge node i receives the parameters of all adjacent edge nodes, it cooperatively updates its model parameters according to all received parameters to obtain the posterior model parameters
$\mathbf{w}_i^t = \sum_{j \in N_i} a_{ij}\, \bar{\mathbf{w}}_j^t,$
where $a_{ij}$ are the entries of the adjacency matrix $A = [a_{ij}]$, $i, j = 1, \dots, N$;
after several rounds of collaborative training, through continuous iteration and interaction the edge nodes fuse to consistent model parameters.
6. The privacy protection method based on centerless streaming federated learning as claimed in claim 1, wherein training rounds are divided into sampling rounds and non-sampling rounds based on the intermittent interaction mode, steps 1 to 4 are repeated at each edge node in a sampling round, each node performs only step 1 without broadcast communication in a non-sampling round, and, from round 1 onwards, each edge node adaptively adjusts the interval to the next sampling round in each sampling round according to how much the model parameters change between the two most recent sampling rounds, reducing the communication frequency between devices and the consumption of the privacy budget, specifically comprising:
initially, privacy protection is performed at the first time t = 1, and the initial sampling interval is set to I = 1; for each sampling time $t_n$, where n denotes the n-th sampling instant, the sampling interval is computed: a feedback error
$e_n = \big\| \mathbf{w}_i^{t_n} - \hat{\mathbf{w}}_i^{t_n} \big\|$
is calculated from the prior estimate and the posterior result to measure the change of the data, where non-sampling points have no defined error;
a control gain is computed from the error by PID control,
$\delta_n = C_p e_n + C_i \frac{1}{T_i} \sum_{m=n-T_i+1}^{n} e_m + C_d (e_n - e_{n-1}),$
where $C_p$, $C_i$ and $C_d$ are the proportional, integral and differential control gains, all positive and summing to 1, and $T_i$ is the integration time of the integral control;
the update of the current sampling interval $I'$ is obtained by adjusting the last sampling interval $I$ according to the control gain $\delta_n$ and the sampling-interval adjustment parameters $\theta$ and $\xi$; the next sampling time is then set to $t + I'$, thereby realizing adaptive updating of the sampling frequency.
7. A privacy protection system based on centerless streaming federated learning, comprising:
the first updating module updates the local model based on the edge nodes' randomly initialized model parameters and the prior model parameters of the current round, which each edge node predicts from the final model parameters of the previous round;
the noise adding module is used for adding a noise vector based on a calibration result of the local model parameter sensitivity and carrying out privacy protection on the updated local model parameter;
the data sharing module is used for sharing the parameters after the privacy protection of each edge node to the adjacent edge nodes by each edge node and simultaneously receiving the parameters shared by all the adjacent edge nodes;
the second updating module is used for updating the model parameters of each edge node according to the received parameters of all the adjacent edge nodes to obtain posterior model parameters; after a plurality of times of updating, each edge node gradually converges to obtain the same model parameter;
and the self-adaptive adjusting module is used for reducing the communication frequency and the privacy budget consumption among the devices in a self-adaptive sampling mode according to the parameter change condition in real time on the basis of each edge node.
8. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1-6 when executing the computer program.
9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202210085813.XA 2022-01-25 2022-01-25 Privacy protection method, system and terminal based on centerless flow type federal learning Pending CN114417420A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210085813.XA CN114417420A (en) 2022-01-25 2022-01-25 Privacy protection method, system and terminal based on centerless flow type federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210085813.XA CN114417420A (en) 2022-01-25 2022-01-25 Privacy protection method, system and terminal based on centerless flow type federal learning

Publications (1)

Publication Number Publication Date
CN114417420A true CN114417420A (en) 2022-04-29

Family

ID=81277317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210085813.XA Pending CN114417420A (en) 2022-01-25 2022-01-25 Privacy protection method, system and terminal based on centerless flow type federal learning

Country Status (1)

Country Link
CN (1) CN114417420A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114968404A (en) * 2022-05-24 2022-08-30 武汉大学 Distributed unloading method for computing task with position privacy protection
CN114968404B (en) * 2022-05-24 2023-11-17 武汉大学 Distributed unloading method for computing tasks of location privacy protection

Similar Documents

Publication Publication Date Title
Li et al. NOMA-enabled cooperative computation offloading for blockchain-empowered Internet of Things: A learning approach
CN110968426B (en) Edge cloud collaborative k-means clustering model optimization method based on online learning
CN115455471A (en) Federal recommendation method, device, equipment and storage medium for improving privacy and robustness
Arkian et al. FcVcA: A fuzzy clustering-based vehicular cloud architecture
CN113315978A (en) Collaborative online video edge caching method based on federal learning
CN115686846B (en) Container cluster online deployment method integrating graph neural network and reinforcement learning in edge calculation
CN114417420A (en) Privacy protection method, system and terminal based on centerless flow type federal learning
Wang et al. Online service migration in mobile edge with incomplete system information: A deep recurrent actor-critic learning approach
Karschau et al. Renormalization group theory for percolation in time-varying networks
Lei et al. Oes-fed: a federated learning framework in vehicular network based on noise data filtering
Chiang et al. Deep Q-learning-based dynamic network slicing and task offloading in edge network
Zheng et al. Data synchronization in vehicular digital twin network: A game theoretic approach
CN116541106B (en) Computing task unloading method, computing device and storage medium
Binh et al. Reinforcement Learning for Optimizing Delay-Sensitive Task Offloading in Vehicular Edge-Cloud Computing
Gao et al. A deep learning framework with spatial-temporal attention mechanism for cellular traffic prediction
Dangi et al. 5G network traffic control: a temporal analysis and forecasting of cumulative network activity using machine learning and deep learning technologies
Kang et al. Time efficient offloading optimization in automotive multi-access edge computing networks using mean-field games
US10454776B2 (en) Dynamic computer network classification using machine learning
CN113747450A (en) Service deployment method and device in mobile network and electronic equipment
Huang et al. A hierarchical pseudonyms management approach for software-defined vehicular networks
Lovén et al. A dark and stormy night: Reallocation storms in edge computing
Shinkuma et al. Data assessment and prioritization in mobile networks for real-time prediction of spatial information using machine learning
Duan et al. Sensor scheduling design for complex networks under a distributed state estimation framework
Azzaoui et al. A survey on data dissemination in internet of vehicles networks
Henna et al. Distributed and collaborative high-speed inference deep learning for mobile edge with topological dependencies

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination