CN115277696A - Cross-network federated learning system and method - Google Patents

Cross-network federated learning system and method

Info

Publication number
CN115277696A
CN115277696A (application CN202210823096.6A)
Authority
CN
China
Prior art keywords
network
external network
participant
intranet
gradient
Prior art date
Legal status
Granted
Application number
CN202210823096.6A
Other languages
Chinese (zh)
Other versions
CN115277696B (English)
Inventor
王济平
黎刚
汤克云
周健雄
杨劲业
高俊杰
Current Assignee
Jingxin Data Technology Co ltd
Original Assignee
Jingxin Data Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Jingxin Data Technology Co ltd
Priority to CN202210823096.6A
Publication of CN115277696A
Application granted
Publication of CN115277696B
Legal status: Active


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/10 - Protocols in which an application is distributed across nodes in the network
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/08 - Configuration management of networks or network elements
    • H04L 41/0803 - Configuration setting
    • H04L 41/0806 - Configuration setting for initial configuration or provisioning, e.g. plug-and-play
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/14 - Network analysis or design
    • H04L 41/145 - Network analysis or design involving simulating, designing, planning or modelling of a network
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/12 - Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/14 - Session management
    • H04L 67/141 - Setup of application sessions

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer And Data Communications (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a cross-network federated learning system and method comprising at least two completely physically isolated networks and an offline transmission module. The extranet initiator and the intranet participant together complete a federated learning joint modeling task. The intranet participant provides local data, accepts the invitation of the extranet initiator, and participates in the federated learning joint modeling task. The extranet coordinator performs aggregation optimization on the intermediate parameters produced while the extranet initiator and the intranet participant execute federated learning iterative computation, generating new intermediate parameters. The extranet transmission monitoring module reads intermediate parameters from the offline transmission module and sends them to the extranet coordinator, and writes intermediate parameters into the offline transmission module after receiving them from the extranet coordinator. The offline transmission module interactively transmits the encrypted intermediate parameters between the two networks. The invention realizes effective transmission of the intermediate data of a federated learning computation task across isolated networks, thereby completing the cross-network federated learning computation task.

Description

Cross-network federated learning system and method
Technical Field
The invention relates to a federated learning system, and in particular to a cross-network federated learning system and method.
Background
Federated learning is defined as a mode in which multiple participants collaboratively complete a machine learning task on the premise that each party's original private data does not cross the privacy boundary defined by that data party. Federated learning mainly involves three roles: participant, central coordinator, and initiator. The participant and the initiator jointly build a machine learning model to perform joint computation tasks, the coordinator transmits the intermediate parameters, and multi-party data fusion analysis applications are realized without data leaving its local domain.
In the prior art, federated learning joint modeling and computation can be realized only when the networks of all data resource participants are mutually reachable; it cannot handle the case where two or more data resource participants, separated by network isolation, wish to realize data fusion applications through federated learning computation. For example, the government affairs field is divided into a government extranet and a government intranet that are completely physically isolated, so the existing mainstream technology cannot establish a federated learning task across these networks and cannot meet application requirements.
Disclosure of Invention
The technical problem to be solved by the invention is, in view of the above defects of the prior art, to provide a system and method that realize effective transmission of the intermediate data of a federated learning computation task across isolated networks, thereby completing the cross-network federated learning computation task.
In order to solve the technical problems, the invention adopts the following technical scheme.
The invention provides a cross-network federated learning system, which includes at least two completely physically isolated networks and an offline transmission module connected between the two networks. The two networks are defined as an extranet and an intranet respectively, and each network includes an initiator, a participant, a coordinator, and a transmission monitoring module, wherein: the extranet initiator is used for completing a federated learning joint modeling task together with the intranet participant; the intranet participant is used for providing local data, accepting the invitation of the extranet initiator, and participating in the federated learning joint modeling task; the extranet coordinator is used for performing aggregation optimization on the intermediate parameters produced while the extranet initiator and the intranet participant execute federated learning iterative computation, generating new intermediate parameters; the extranet transmission monitoring module is used for reading intermediate parameters from the offline transmission module and sending them to the extranet coordinator, and for writing intermediate parameters into the offline transmission module after receiving them from the extranet coordinator; the offline transmission module is used for interactively transmitting the encrypted intermediate parameters between the two networks.
Preferably, the offline transmission module is further configured to store intermediate parameters interactively transmitted between the two networks.
A cross-network federated learning method is implemented based on such a system, which includes at least two completely physically isolated networks and an offline transmission module connected between them; the two networks are defined as an extranet and an intranet respectively, and each network includes an initiator, a participant, a coordinator, and a transmission monitoring module. The method includes the following steps: step S10, initializing the transmission monitoring modules; step S20, the extranet initiator and the intranet participant together complete the creation of a federated learning model, and the extranet coordinator performs environment initialization and starts the first batch iterative computation task; step S30, the extranet coordinator sends the two parties' single-side gradients, iteratively optimized in step S20, to the extranet initiator and the extranet transmission monitoring module respectively, sets the data interval for executing the next iteration batch, and starts the next round of iterative batch computation; step S40, repeating step S30 iteratively until the federated learning task is finished.
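The four steps above can be sketched as a simple driver loop. The callback names below are hypothetical stand-ins for the per-step logic described in the embodiments, not identifiers from the patent.

```python
def run_cross_network_task(init_monitors, first_batch, run_batch, finished):
    """Orchestrate steps S10-S40 of the cross-network method.

    init_monitors -- S10: initialize both transmission monitoring modules
    first_batch   -- S20: create the model and run the first batch iteration
    run_batch     -- S30: one round of gradient exchange and iteration
    finished      -- S40: predicate deciding when the task ends
    """
    init_monitors()                # step S10
    state = first_batch()          # step S20
    while not finished(state):     # step S40: repeat S30 until done
        state = run_batch(state)   # step S30
    return state
```

Each callback would wrap the offline-transmission exchanges detailed in embodiments one to three.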
In the cross-network federated learning system disclosed by the invention, after initialization of the transmission monitoring modules is completed, the extranet initiator and the intranet participant together complete the creation of a federated learning model; the extranet coordinator performs environment initialization, starts the first batch iterative computation task, and sends the two parties' iteratively optimized single-side gradients to the extranet initiator and the extranet transmission monitoring module respectively; it then sets the data interval for executing the next iteration batch, starts the next round of iterative batch computation, and repeats the iterative computation until the federated learning task is finished. Compared with the prior art, the invention enables any data node to create a federated learning task with the federated learning computation nodes of the isolated party's network, even under network isolation. The invention provides a cross-network federated learning offline transmission module, which realizes offline transmission of intermediate parameters during the iteration of a federated learning task and completes the interaction of intermediate parameters in cross-network federated learning task computation.
Drawings
FIG. 1 is a block diagram of the components of the cross-network federated learning system of the present invention;
FIG. 2 is a flow chart of a cross-network federated learning method of the present invention;
FIG. 3 is a flow chart of a first embodiment of the present invention;
FIG. 4 is a flow chart of a second embodiment of the present invention;
FIG. 5 is a first flowchart illustrating a third embodiment of the present invention;
FIG. 6 is a second flowchart illustrating a third embodiment of the present invention.
Detailed Description
The invention is described in more detail below with reference to the figures and examples.
The invention discloses a cross-network federated learning system. Referring to fig. 1, it comprises at least two completely physically isolated networks 1 and an offline transmission module 2 connected between the two networks 1; the two networks 1 are defined as an extranet and an intranet respectively, and each network includes an initiator 10, a participant 11, a coordinator 12 and a transmission monitoring module 13, wherein:
the extranet initiator 10 is used for completing a federated learning joint modeling task together with the intranet participant 11;
the intranet participant 11 is used for providing local data, accepting the invitation of the extranet initiator 10, and participating in the federated learning joint modeling task;
the extranet coordinator 12 is used for performing aggregation optimization on the intermediate parameters produced while the extranet initiator 10 and the intranet participant 11 execute federated learning iterative computation, generating new intermediate parameters;
the external network transmission monitoring module 13 is configured to read the intermediate parameter from the offline transmission module 2, send the intermediate parameter to the external network coordinator 12, receive the intermediate parameter from the external network coordinator 12, and write the intermediate parameter into the offline transmission module 2;
the offline transmission module 2 is configured to interactively transmit the encrypted intermediate parameters between the two networks 1. In addition, the offline transmission module 2 is further configured to store intermediate parameters interactively transmitted between the two networks 1.
In the above system, after initialization of the transmission monitoring modules 13 is completed, the extranet initiator 10 and the intranet participant 11 together complete the creation of the federated learning model; the extranet coordinator 12 performs environment initialization, starts the first batch iterative computation task, and sends the two parties' iteratively optimized single-side gradients to the extranet initiator 10 and the extranet transmission monitoring module 13 respectively; it then sets the data interval for executing the next iteration batch, starts the next round of iterative batch computation, and repeats the iterative computation until the federated learning task is finished. Compared with the prior art, under network isolation either data node can create a federated learning task with the federated learning computation nodes of the isolated party's network: for example, when an intranet node acts as the initiator and an extranet node as the participant, the intranet central side acts as the federated learning coordination node; when an extranet node acts as the initiator and an intranet node as the participant, the extranet central side acts as the federated learning coordination node. The invention provides a cross-network federated learning offline transmission module, which realizes offline transmission of intermediate parameters during the iteration of a federated learning task and completes the interaction of intermediate parameters in cross-network federated learning task computation.
On the basis of the above system, the invention further relates to a cross-network federated learning method, implemented based on the system shown in fig. 1 and fig. 2, wherein the system includes at least two completely physically isolated networks 1 and an offline transmission module 2 connected between the two networks 1; the two networks 1 are defined as an extranet and an intranet respectively, and each network includes an initiator 10, a participant 11, a coordinator 12 and a transmission monitoring module 13. The method includes the following steps:
step S10, initializing the transmission monitoring module 13;
step S20, the extranet initiator 10 and the intranet participant 11 together complete the creation of a federated learning model, and the extranet coordinator 12 performs environment initialization and starts the first batch iterative computation task;
step S30, the extranet coordinator 12 sends the two parties' single-side gradients, iteratively optimized in step S20, to the extranet initiator 10 and the extranet transmission monitoring module 13 respectively, sets the data interval for executing the next iteration batch, and starts the next round of iterative batch computation;
step S40, repeating step S30 iteratively until the federated learning task is finished.
In the above method, under the completely physically isolated environment of the two networks, federated learning clusters are deployed in the respective network environments, with an initiator, a participant, a central node and a transmission monitoring module deployed in each. Transmission of the intermediate data of the federated learning computation task across the isolated networks is realized through the offline transmission module, finally completing a cross-network federated learning computation task. Each module is defined as follows:
the initiator: the method refers to a data application party of the federated learning calculation task, and jointly completes the federated learning joint modeling task under the cooperation of data provided by participants;
the participation method comprises the following steps: the system comprises a data provider of a federated learning calculation task, an assistance initiator, a data processing module and a data processing module, wherein the assistance initiator provides local data, adds the federated learning task and assists the initiator to jointly complete the federated learning joint modeling task;
a central node: the system refers to a coordinator of a federated learning calculation task, and is used for aggregating and optimizing intermediate parameters in the iterative calculation process of an initiator and a participant to generate new intermediate parameters;
intermediate parameters: the method refers to intermediate factors such as a model gradient, a loss value and the like generated in the iterative calculation process of the federal learning;
the transmission monitoring module means: the intermediate parameters are used for receiving or sending the Federal learning iterative computation process to the middle coordinator, and writing the intermediate parameters into the offline transmission module or reading the intermediate parameters from the offline transmission module.
The offline transmission module refers to: the device is used for storing and transmitting the encrypted intermediate parameters calculated by each federal learning iteration, and the offline transmission module comprises but is not limited to a mobile U disk, a gate, a gatekeeper and the like, and can be used for offline data transmission media in a heterogeneous network.
The method comprises the steps of constructing a transmission monitoring module, monitoring intermediate parameters of a federation learning task iteration process of a coordinator under an initiator network environment, writing the intermediate parameters into an offline transmission module, verifying the legality of the offline transmission module, reading the intermediate parameters of the offline transmission module in a participant network environment, and sending the intermediate parameters to a participant for model updating.
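As a minimal sketch of this monitoring workflow, the snippet below uses a plain dict to stand in for the offline medium and trivial reversible callables for encryption; all names and the path layout are illustrative, taken from the storage-path convention described later.

```python
import json

def monitor_write(medium, node_id, n, payload, encrypt):
    """Initiator-side monitor: serialize and encrypt the round-n
    intermediate parameters, then write them at the agreed path on
    the offline medium (a dict stands in for the physical device)."""
    path = f"/fldir/{node_id}/task/{n:02d}/"
    medium[path] = encrypt(json.dumps(payload))
    return path

def monitor_read(medium, node_id, n, decrypt):
    """Participant-side monitor: scan the agreed path, decrypt, and
    return the intermediate parameters, or None if nothing is present."""
    blob = medium.get(f"/fldir/{node_id}/task/{n:02d}/")
    return None if blob is None else json.loads(decrypt(blob))
```

In the patent's setting, `encrypt`/`decrypt` would be the asymmetric operations of steps S102-S103, and `medium` a mounted USB device or gatekeeper staging area.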
Please refer to the first embodiment to the third embodiment for the specific implementation process of the method of the present invention.
Example one
Referring to fig. 3, this embodiment is a further explanation of the process of initializing the transmission monitoring module 13 in step S10. In this embodiment, the step S10 includes the following steps:
Step S101, starting the transmission monitoring modules 13 of the intranet and the extranet respectively. Specifically, in step S101 the transmission monitoring modules 13 of the intranet and the extranet each establish a secure transmission channel based on the TLCP cryptographic protocol, join their own party's federated learning cluster network environment, and establish secure connections with each computing node and the central node of that party's federated learning cluster using Netty, so as to store and transmit the intermediate parameters of the federated learning batch iteration process.
Step S102, the transmission monitoring modules 13 of the intranet and the extranet each adopt the RSA or SM2 asymmetric algorithm and, based on the identifiers of all participating nodes of their own party's federated learning cluster network, generate a key pair set S and a public key set X.
Further, in step S102 the node identifier set is defined as U = {h_1, h_2, h_3, ..., h_i}, where U is the set of node identifiers in the network environment, h_i denotes a federated learning node identifier, and i is the number of nodes in the federated learning cluster. Based on these node identifiers, the RSA or SM2 encryption algorithm generates a public-private key pair and a public key corresponding to each node identifier. The key pair set S is defined as:
S = {h_1: (p_1, k_1), h_2: (p_2, k_2), ..., h_i: (p_i, k_i)}
and the public key set X is defined as:
X = {h_1: p_1, h_2: p_2, ..., h_i: p_i}
where h_i denotes a federated learning node identifier, p_i denotes the public key of node h_i, and k_i denotes the private key of node h_i.
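The construction of S and X can be sketched as follows. A tiny textbook RSA keypair stands in for the production-grade RSA or SM2 keys a real deployment would generate; the function names are ours.

```python
def toy_rsa_keypair():
    """Textbook RSA with tiny fixed primes -- for illustration only."""
    p, q = 61, 53
    n, phi = p * q, (p - 1) * (q - 1)
    e = 17                    # public exponent, coprime with phi
    d = pow(e, -1, phi)       # private exponent (modular inverse)
    return (e, n), (d, n)

def build_key_sets(node_ids):
    """Per step S102: for every node identifier h_i in U, generate a
    keypair, collecting the key-pair set S and public-key set X."""
    S, X = {}, {}
    for h in node_ids:
        pub, priv = toy_rsa_keypair()
        S[h] = (pub, priv)    # S maps h_i -> (p_i, k_i)
        X[h] = pub            # X maps h_i -> p_i
    return S, X
```

The set X is what step S103 exchanges across the air gap, while S stays inside its own network.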
Step S103, the transmission monitoring modules 13 of the intranet and the extranet each load the public key set X generated by the other party's network in step S102 and mark it as X_0. For example, the intranet transmission monitoring module loads the public key set X generated by the extranet transmission monitoring module and marks it as X_0, in order to distinguish it from the set X generated locally in step S102; likewise, the extranet transmission monitoring module loads the public key set X generated by the intranet transmission monitoring module and marks it as X_0 for the same reason.
Step S104, starting the scanning and monitoring process of the offline transmission module 2, monitoring in real time the reading and writing of the offline federated learning iteration-batch intermediate parameters on the offline transmission module 2.
Further, in step S104 the offline transmission rule of the offline transmission module 2 combines an intermediate-parameter ciphertext data storage path with intermediate-parameter ciphertext data verification. Both parties define the storage path rule as:
[drive#:][/]fldir/h_i/task/n/
where fldir denotes the device root directory, h_i denotes the participant node identifier, task is a fixed folder name, and n denotes the round of task iterative computation, increasing sequentially as 01, 02, and so on. Using the node public key p_i, data decryption and signature verification are performed on the intermediate parameters read from the offline transmission module, judging whether they decrypt normally. If verification shows that the accessed offline transmission module has been tampered with, the task is stopped; this guarantees the legality of the transmission device and avoids the risk of data leakage caused by an illegal participant accessing and intercepting the intermediate parameters.
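The storage-path rule can be made concrete with a small helper; the zero-padding of the round number follows the 01, 02, ... convention above, and the function and parameter names are ours.

```python
def batch_path(node_id, n, fldir="fldir", drive=""):
    """Build the agreed path [drive#:][/]fldir/h_i/task/n/ for
    iteration round n, zero-padding the round number as 01, 02, ..."""
    prefix = f"{drive}#:/" if drive else "/"
    return f"{prefix}{fldir}/{node_id}/task/{n:02d}/"
```

Both monitors derive the same path independently, so no path metadata needs to cross the air gap.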
Example two
In this embodiment, please refer to fig. 4, where the step S20 includes the following steps:
Step S201, the extranet coordinator 12 generates an initialized public-private key pair based on the paillier algorithm, and sends the public key PubKey and the intranet participant node identifier h_i to the extranet initiator 10 and the extranet transmission monitoring module 13 respectively, for encrypting the intermediate parameters provided by the extranet initiator and the intranet participant in the iterative batch; the extranet coordinator 12 stores the private key PrivKey for decrypting the intermediate parameters after receiving an iterative batch, and sets the data interval with which the current iterative batch participates in training;
Step S202, the extranet initiator 10 receives the PubKey, sets the training batch with which local data participates each time, creates a paillier encryption processor, initializes the model and starts the first iterative computation, marking the single-side gradient generated in the current iteration round as dt_{j1}^n and the loss value as loss_{j1}^n to obtain the intermediate parameters, where n is the round of task iterative computation and j1 is the parameter identifier belonging to the extranet initiator; the intermediate parameters (dt_{j1}^n, loss_{j1}^n) are encrypted with PubKey through the paillier encryption processor to generate ciphertext data CT0_n, which is sent to the extranet coordinator 12;
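Steps S201-S202 rely on the paillier cryptosystem, whose additive homomorphism lets a coordinator combine encrypted intermediate parameters. Below is a minimal, insecure textbook implementation (small keys, simplified generator g = n + 1), intended only to show how such ciphertexts are produced, combined, and decrypted; none of these identifiers come from the patent.

```python
import math
import random

def _probable_prime(bits, rounds=25):
    """Random probable prime via Miller-Rabin (demo-grade only)."""
    while True:
        n = random.getrandbits(bits) | 1 | (1 << (bits - 1))
        d, r = n - 1, 0
        while d % 2 == 0:
            d //= 2
            r += 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(r - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                break          # composite witness found
        else:
            return n           # all rounds passed

def paillier_keygen(bits=128):
    """Return (public key n, private key (n, lam, mu)); g = n + 1."""
    p = _probable_prime(bits // 2)
    q = _probable_prime(bits // 2)
    while q == p:
        q = _probable_prime(bits // 2)
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)       # valid because g = n + 1
    return n, (n, lam, mu)

def paillier_encrypt(n, m):
    """Encrypt integer 0 <= m < n under public key n."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2

def paillier_decrypt(priv, c):
    n, lam, mu = priv
    n2 = n * n
    return (pow(c, lam, n2) - 1) // n * mu % n

def paillier_add(n, c1, c2):
    """Homomorphic addition: Dec(c1 * c2 mod n^2) = m1 + m2 mod n."""
    return c1 * c2 % (n * n)
```

In practice, gradients would be fixed-point encoded into integers before encryption, and a vetted library would replace this sketch.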
Step S203, the offline transmission module 2 is connected to the extranet federated learning cluster network; the extranet transmission monitoring module 13 receives the PubKey transmitted in step S201 and the node identifier h_i of the intranet participant 11, encrypts (PubKey, h_i) with the public key p_i from the intranet public key set X_0 received in step S103, generates an encrypted file priFile, and writes it into the offline transmission module 2 at the file path [drive#:][/]fldir/h_i/task/n/; the current task is the first batch iteration, so n takes the value 01;
s204, the off-line transmission module 2 is disconnected from the outer network and is accessed into the inner network federal learning cluster network, the inner network transmission monitoring module 13 scans the file path of the off-line transmission module 2 and reads the encrypted file priFile generated in the step S203, and the node identifier h is used for collecting S through the key pair generated in the step S102iPrivate key k ofiDecrypting the encrypted file to obtain the public key PubKey of the coordinator, and sending the obtained PukKey to the intranet participant 11 (node identification h)i);
Step S205, the intranet participant 11 receives the public key PubKey decrypted in step S204, synchronously sets the training batch with which local data participates in each iteration, creates a paillier encryption processor, initializes the model, and starts the first iterative computation, marking the single-side gradient generated by the current iteration as dt_{j2}^n and the loss value as loss_{j2}^n to obtain the intermediate parameters, where n is the round of task iterative computation and j2 is the parameter identifier belonging to the intranet participant 11; the intermediate parameters (dt_{j2}^n, loss_{j2}^n) are encrypted with PubKey through the paillier encryption processor to generate intermediate-parameter ciphertext data CT1_n, which is sent to the intranet transmission monitoring module 13;
Step S206, the intranet transmission monitoring module 13 receives the ciphertext data CT1_n from step S205 and, based on the key pair set S generated in step S102, signs CT1_n with the private key k_i corresponding to the participant node identifier h_i to generate a signature file CTFile_n, which is written into the offline transmission module 2 at the file path [drive#:][/]fldir/h_i/task/n/CTFile_n;
Step S207, after step S206 is completed, the offline transmission module 2 is disconnected from the intranet and connected to the extranet federated learning cluster network; the extranet transmission monitoring module 13 scans the file path of the offline transmission module 2, reads the signature file CTFile_n generated in step S206, and verifies its signature with the public key p_i corresponding to participant identifier h_i in the intranet public key set X_0 received in step S103. If verification succeeds, the ciphertext data CT1_n generated in step S205 is obtained and sent to the extranet coordinator 12; if verification fails, the offline transmission module 2 is at risk of having been tampered with, and the privacy computation task is terminated;
Step S208, the extranet coordinator 12 receives the intermediate-parameter ciphertext data CT1_n from step S207 and CT0_n from step S202, and decrypts both parties' intermediate parameters with the private key PrivKey generated in step S201; after decryption, the extranet coordinator 12 obtains the intermediate-parameter plaintext data composed of the single-side gradient dt_{j1}^n and loss value loss_{j1}^n of the extranet initiator 10 and the single-side gradient dt_{j2}^n and loss value loss_{j2}^n of the intranet participant 11 in the first round of iterative computation;
Step S209, the extranet coordinator 12 optimizes and aggregates the two parties' gradients obtained in step S208 to obtain the total gradient after the first iterative optimization, marked total_dt_n, and splits the optimized total gradient to obtain the optimized single-side gradient of the extranet initiator 10, marked w_{j1}^n, and the optimized single-side gradient of the intranet participant 11, marked w_{j2}^n; the optimized gradients are sent to the extranet initiator 10 and the extranet transmission monitoring module 13 respectively, while the two parties' loss values obtained in step S208 are aggregated into the loss value after the first iteration, marked iter_loss_n.
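Step S209's aggregate-then-split of the two single-side gradients can be sketched as below. The patent leaves the optimizer unspecified, so a plain learning-rate scaling stands in for it, and the layout (initiator segment followed by participant segment) is our assumption.

```python
def aggregate_and_split(grad_initiator, grad_participant,
                        loss_initiator, loss_participant, lr=0.5):
    """Concatenate the two single-side gradients into total_dt_n,
    apply a stand-in optimizer step (scaling by lr), split the result
    back into per-party segments, and average the loss into iter_loss_n."""
    total_dt = grad_initiator + grad_participant     # concatenation
    optimized = [lr * g for g in total_dt]
    cut = len(grad_initiator)
    w_j1, w_j2 = optimized[:cut], optimized[cut:]    # per-party segments
    iter_loss = (loss_initiator + loss_participant) / 2
    return w_j1, w_j2, iter_loss
```

Only the party-specific segment leaves the coordinator, which is what step S301 then routes to each side.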
Example three
This embodiment further explains step S30. As shown in fig. 5 and fig. 6, step S30 includes:
step S301, the external network coordinator sends the optimized unilateral gradient dt_j1_n to the external network initiator node, and sends the optimized unilateral gradient dt_j2_n together with the intranet participant node identifier h_i to the external network transmission monitoring module; the external network initiator node receives the unilateral gradient dt_j1_n, updates the local model parameters according to the optimized gradient, performs the next iteration batch calculation, marks the unilateral gradient generated in the current iteration round as grad_j1_{n+1} and the loss value as loss_j1_{n+1} as intermediate parameters, and encrypts the intermediate parameters (grad_j1_{n+1}, loss_j1_{n+1}) with PubKey through the paillier encryption processor to generate ciphertext data CT0_{n+1};
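The initiator's part of step S301 — apply the optimized gradient, run the next batch, produce new intermediates — can be sketched as below. A toy least-squares objective and plain gradient descent are assumed, since the patent leaves the model unspecified; the resulting grad/loss pair is what would then be Paillier-encrypted with PubKey into CT0_{n+1}.

```python
def initiator_update(w, dt_j1):
    # Step S301 sketch: apply the optimized unilateral gradient dt_j1_n
    # to the local model parameters (plain gradient descent assumed).
    return [wi - gi for wi, gi in zip(w, dt_j1)]

def next_batch(w, batch):
    # Next iteration batch on a toy least-squares objective: produces the
    # new unilateral gradient grad_j1_{n+1} and loss value loss_j1_{n+1}.
    x, y = batch
    pred = sum(wi * xi for wi, xi in zip(w, x))
    err = pred - y
    grad = [2 * err * xi for xi in x]   # d(err^2)/dw_i
    loss = err * err
    return grad, loss

w = initiator_update([1.0, 1.0], [0.2, -0.1])
grad_next, loss_next = next_batch(w, ([1.0, 2.0], 3.0))
```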
Step S302, the offline transmission module accesses the external network federated learning cluster network; the external network transmission monitoring module receives the optimized unilateral gradient dt_j2_n and participant node identifier h_i transmitted in step S301, encrypts dt_j2_n and h_i with the public key P_i in the intranet public key set X_0 received in step S103 to generate an encrypted file marked wFile_n, and writes the encrypted file into the offline transmission module at the file path: [drive#:][/]fldir/h_i/task/n/wFile_n;
step S303, after step S302 is completed, the offline transmission module disconnects from the external network and accesses the intranet federated learning cluster network; the intranet transmission monitoring module scans the file path of the offline transmission module, reads the encrypted file wFile_n generated in step S302, decrypts the encrypted file with the private key k_i corresponding to node identifier h_i from the key pair set S generated in step S102 to obtain the decrypted unilateral gradient dt_j2_n, and sends the obtained gradient dt_j2_n to the intranet participant;
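The write/scan/decrypt roundtrip of steps S302/S303 through the offline module can be sketched as a filesystem exchange under the agreed path rule. A byte-wise XOR is used here as a deliberately trivial stand-in for the P_i/k_i public-key encryption, just to show the path layout and the scan-read-decrypt flow; paths and names follow the [drive#:][/]fldir/h_i/task/n/ convention.

```python
import os
import tempfile

KEY = 0x5A  # toy symmetric stand-in for the P_i / k_i keypair of step S102

def xor_bytes(data: bytes) -> bytes:
    return bytes(b ^ KEY for b in data)

def write_wfile(root, h_i, n, payload):
    # Step S302 sketch: write the encrypted file wFile_n under the
    # agreed path rule [drive#:][/]fldir/h_i/task/n/wFile_n
    d = os.path.join(root, "fldir", h_i, "task", f"{n:02d}")
    os.makedirs(d, exist_ok=True)
    path = os.path.join(d, f"wFile{n}")
    with open(path, "wb") as f:
        f.write(xor_bytes(payload))
    return path

def scan_and_decrypt(root, h_i, n):
    # Step S303 sketch: the intranet monitor scans the same path and
    # decrypts with the matching key to recover the unilateral gradient
    d = os.path.join(root, "fldir", h_i, "task", f"{n:02d}")
    with open(os.path.join(d, f"wFile{n}"), "rb") as f:
        return xor_bytes(f.read())

root = tempfile.mkdtemp()
write_wfile(root, "h1", 1, b"dt_j2_n gradient bytes")
recovered = scan_and_decrypt(root, "h1", 1)
```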
step S304, the intranet participant receives the unilateral gradient dt_j2_n decrypted in step S303, updates the local model parameters according to the received unilateral gradient, performs the next iteration batch calculation, marks the unilateral gradient generated in the current iteration round as grad_j2_{n+1} and the loss value as loss_j2_{n+1} as intermediate parameters, encrypts the intermediate parameters (grad_j2_{n+1}, loss_j2_{n+1}) with PubKey through the paillier encryption processor to generate intermediate parameter ciphertext data CT1_{n+1}, and sends the ciphertext data to the intranet transmission monitoring module;
step S305, the intranet transmission monitoring module receives the intermediate parameter ciphertext data CT1_{n+1} of step S304, signs the ciphertext data CT1_{n+1} with the private key k_i corresponding to the intranet participant node identifier h_i in the key pair set S generated in step S102 to generate a signature file CTFile_{n+1}, and writes the signature file into the offline transmission module at the file path:
[drive#:][/]fldir/h_i/task/n/CTFile_{n+1};
step S306, after step S305 is completed, the offline transmission module disconnects from the intranet and accesses the external network federated learning cluster network;
step S307, the external network transmission monitoring module scans the file path of the offline transmission module, reads the signature file CTFile_{n+1} generated in step S305, and verifies its signature with the public key P_i corresponding to participant identifier h_i in the intranet public key set X_0 received in step S103; after the signature is verified successfully, the ciphertext data CT1_{n+1} generated in step S304 is obtained and sent to the external network coordinator; if signature verification fails, the offline transmission module is at risk of having been tampered with, and the privacy computation task is terminated;
step S308, the external network coordinator receives the intermediate parameter ciphertext data CT1_{n+1} and the intermediate parameter ciphertext data CT0_{n+1} respectively, and decrypts both parties' intermediate parameters with the private key PrivKey generated in step S201; after decryption, the external network coordinator obtains the intermediate parameter plaintext data composed of the unilateral gradient grad_j1_{n+1} and loss value loss_j1_{n+1} of the external network initiator and the unilateral gradient grad_j2_{n+1} and loss value loss_j2_{n+1} of the intranet participant from the current round of iterative computation;
and the external network coordinator optimizes and aggregates the obtained gradients of the two parties into a total gradient after the current round of iterative optimization, marked Total_dt_{n+1}, and segments the optimized total gradient into the optimized unilateral gradient dt_j1_{n+1} of the external network initiator and the optimized unilateral gradient dt_j2_{n+1} of the intranet participant; the optimized gradients are sent respectively to the external network initiator and the external network transmission monitoring module. At the same time, the obtained loss values of the two parties are aggregated into the loss value after the current iteration round, marked iter_loss_{n+1}; the external network coordinator then computes a convergence threshold σ² over the loss values iter_loss_i of all iteration batches, for example as their variance:

σ² = (1/n) · Σ_{i=1}^{n} (iter_loss_i − (1/n) · Σ_{i=1}^{n} iter_loss_i)²

where n is the iteration round, i is the current-round variable, and σ² is the convergence threshold; whether the model has converged is judged by whether σ² reaches a preset threshold.
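One possible reading of the σ² computation — the patent does not spell the formula out — is the variance of the per-batch aggregated loss values, which shrinks as the losses stabilize. A minimal sketch under that assumption:

```python
def convergence_sigma2(iter_losses):
    # One reading of sigma^2: the variance of the per-batch loss values
    # iter_loss_1 .. iter_loss_n; training is judged converged when
    # sigma^2 falls to (or below) a preset threshold.
    n = len(iter_losses)
    mean = sum(iter_losses) / n
    return sum((l - mean) ** 2 for l in iter_losses) / n

sigma2 = convergence_sigma2([0.9, 0.5, 0.3, 0.2])
converged = sigma2 <= 0.08   # hypothetical preset threshold
```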
In this embodiment, in federated learning cluster networks under two or more isolated networks, the central node of each cluster network may act as coordinator depending on the network environment of the federated learning initiator. Referring to fig. 5, when an extranet node acts as initiator and the intranet acts as participant, the extranet central node acts as coordinator of the current task; similarly, referring to fig. 6, when an intranet node acts as initiator and the extranet acts as participant, the intranet central node acts as coordinator of the current task. The specific process is realized according to step S30, and the calculation flow can be iteratively trained in batches according to steps S10 to S40.
The above description is only a preferred embodiment of the present invention and should not be taken as limiting the invention, and any modification, equivalent replacement or improvement made within the technical scope of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. The cross-network federal learning system is characterized by comprising at least two completely physically isolated networks and an offline transmission module connected between the two networks, wherein the two networks are respectively defined as an extranet and an intranet, and each network comprises an initiator, a participant, a coordinator and a transmission monitoring module, wherein:
the outer network initiator is used for completing a federated learning joint modeling task together with the inner network participant;
the intranet participant is used for providing local data, accepting the invitation of the extranet initiator and participating in the federated learning joint modeling task;
the outer network coordinator is used for performing aggregation optimization on intermediate parameters in the process of executing federated learning iterative computation by the outer network initiator and the inner network participant and generating new intermediate parameters;
the outer network transmission monitoring module is used for reading the intermediate parameters from the off-line transmission module and then sending the intermediate parameters to the outer network coordinator, and receiving the intermediate parameters from the outer network coordinator and then writing the intermediate parameters into the off-line transmission module;
the off-line transmission module is used for interactively transmitting the encrypted intermediate parameters between the two networks.
2. The cross-network federated learning system of claim 1, wherein the offline transmission module is further configured to store intermediate parameters for inter-transmission between two networks.
3. A cross-network federated learning method is characterized in that the method is realized based on a system, the system comprises at least two completely physically isolated networks and an offline transmission module connected between the two networks, the two networks are respectively defined as an extranet and an intranet, each network comprises an initiator, a participant, a coordinator and a transmission monitoring module, and the method comprises the following steps:
step S10, initializing the transmission monitoring module;
step S20, the external network initiator and the internal network participant complete the creation of a federal learning model together, the external network coordinator performs environment initialization and starts a first batch iterative computation task;
step S30, the external network coordinator respectively sends the two-party unilateral gradient after the iterative optimization in the step S20 to an external network initiator and an external network transmission monitoring module, sets a data interval for executing a next iteration batch, and starts the next iteration batch calculation;
and step S40, repeatedly and iteratively executing the step S30 until the federal learning task is finished.
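The overall control flow of steps S10 to S40 can be sketched as a small orchestration skeleton; the callables and the convergence signal are placeholders for the concrete steps defined in the dependent claims.

```python
def run_cross_network_task(init_monitors, first_batch, iterate, max_rounds=100):
    # Claim 3 sketch: S10 initialize the transmission monitors, S20 run the
    # first iteration batch, then repeat S30 until convergence (S40).
    init_monitors()                 # step S10
    state = first_batch()           # step S20
    for _ in range(max_rounds):     # steps S30 / S40
        state, done = iterate(state)
        if done:
            break
    return state

log = []
result = run_cross_network_task(
    lambda: log.append("S10"),
    lambda: (log.append("S20") or 0),      # initial state 0
    lambda s: (s + 1, s + 1 >= 3),         # toy iterate: stop after 3 rounds
)
```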
4. A cross-network federal learning method as claimed in claim 3, wherein said step S10 comprises the following procedures:
step S101, respectively starting transmission monitoring modules of an internal network and an external network;
step S102, adopting RSA or SM2 asymmetric algorithm by transmission monitoring modules of the internal network and the external network respectively, and generating a key pair set S and a public key set X based on each participating node identification of the local federal learning cluster network;
step S103, the transmission monitoring modules of the internal network and the external network respectively load the public key set X generated by the other side's network in step S102 and mark it as X_0;
Step S104, starting the scanning monitoring process of the offline transmission module, and monitoring in real time the reading and writing of offline federated learning iteration-batch intermediate parameters on the offline transmission module.
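The scanning monitoring process of step S104 can be sketched as a simple polling scan of the offline module's directory; the watch directory and file names are illustrative, and a real monitor would loop this with a sleep interval.

```python
import os
import tempfile

def scan_once(watch_dir, seen):
    # One pass of the S104 scanning process: report files not yet handled.
    new = []
    for name in sorted(os.listdir(watch_dir)):
        if name not in seen:
            seen.add(name)
            new.append(name)
    return new

watch = tempfile.mkdtemp()
seen = set()
open(os.path.join(watch, "CTFile1"), "wb").close()   # simulate a written file
first = scan_once(watch, seen)
second = scan_once(watch, seen)   # nothing new on the second pass
```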
5. The cross-network federated learning method of claim 4, wherein in the step S101, the transmission monitoring modules of the internal network and the external network establish secure transmission channels based on a TLCP cryptographic protocol, respectively join each party 'S federated learning cluster network environment, and adopt a Netty protocol to establish secure connections with each computing node and a central node of the party' S federated learning cluster network, for storing and transmitting intermediate parameters in the federated learning computing batch iterative process.
6. The cross-network federated learning method of claim 5, wherein in the step S102, the node identifiers are defined as: U = {h_1, h_2, h_3, ..., h_i}, where U is the node identifier set in the network environment, h_i is a federated learning node identifier, and i represents the number of federated learning cluster nodes; based on the node identifiers, a public-private key set and a public key set corresponding to each node identifier are generated with the RSA or SM2 encryption algorithm, wherein the key pair set S is defined as:

S = {h_1: (P_1, k_1), h_2: (P_2, k_2), h_3: (P_3, k_3), ..., h_i: (P_i, k_i)}

and the public key set X is defined as:

X = {h_1: P_1, h_2: P_2, h_3: P_3, ..., h_i: P_i}

wherein P_i represents the public key of federated learning node identifier h_i, and k_i represents the private key of federated learning node identifier h_i.
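Building S and X from the node identifier set U can be sketched as follows, using textbook RSA with toy primes as a stand-in for the RSA/SM2 key generation named in the claim; the prime choices are purely illustrative.

```python
def toy_rsa_keypair(p, q, e=17):
    # Textbook RSA on toy primes; a stand-in for RSA/SM2 generation in S102.
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # modular inverse of e mod φ(n)
    return (e, n), (d, n)               # (public key P_i, private key k_i)

U = ["h1", "h2", "h3"]                  # node identifier set U of the claim
PRIMES = [(61, 53), (67, 71), (73, 79)] # toy primes, one pair per node
S = {}                                  # key pair set S: h_i -> (P_i, k_i)
X = {}                                  # public key set X: h_i -> P_i
for h, pq in zip(U, PRIMES):
    P_i, k_i = toy_rsa_keypair(*pq)
    S[h] = (P_i, k_i)
    X[h] = P_i
```

X is what each side exports to the peer network in step S103, while S stays local for decryption and signing.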
7. The cross-network federated learning method of claim 6, wherein in step S104, the offline transmission rule of the offline transmission module combines an intermediate parameter ciphertext data storage path with intermediate parameter ciphertext data verification, and both parties define the storage path rule as:

[drive#:][/]fldir/h_i/task/n/;

where fldir denotes the device root directory, h_i represents the participant node identifier, task is a fixed folder name, and n represents the round of task iterative computation, increasing sequentially from 01, 02, ...; the transmission monitoring modules of both parties use the private key k_i and the node public key to decrypt and verify the signature of the intermediate parameters read from the offline transmission module, and judge whether decryption is normal.
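The storage path rule above can be expressed as a small helper; the zero-padded round number follows the 01, 02, ... convention the claim describes, and the drive prefix is left empty by default since [drive#:] is optional.

```python
def storage_path(h_i: str, n: int, filename: str, drive: str = "") -> str:
    # Claim 7's path rule: [drive#:][/]fldir/h_i/task/n/<file>, with the
    # round number n zero-padded (01, 02, ...).
    return f"{drive}/fldir/{h_i}/task/{n:02d}/{filename}"

p = storage_path("h3", 1, "CTFile1")
```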
8. The cross-network federal learning method as claimed in claim 7, wherein said step S20 comprises the following procedures:
step S201, the external network coordinator generates an initialized public-private key pair based on the paillier algorithm, and sends the public key PubKey and the intranet participant node identifier h_i to the external network initiator and the external network transmission monitoring module respectively, for encrypting the iteration-batch intermediate parameters provided by the external network initiator and the intranet participant; the external network coordinator keeps the private key PrivKey for decrypting the received iteration-batch intermediate parameters, and at the same time sets the data interval of the current iteration batch participating in training;
step S202, the external network initiator receives PubKey, sets the training batch in which local data participates in each iteration, creates a paillier encryption processor, initializes the model and starts the first iteration calculation, marking the unilateral gradient generated in the current iteration round as grad_j1_n and the loss value as loss_j1_n to obtain the intermediate parameters, where n is the round of task iterative computation and j1 is the parameter identifier belonging to the external network initiator; the paillier encryption processor encrypts the intermediate parameters (grad_j1_n, loss_j1_n) with PubKey to generate ciphertext data CT0_n, which is sent to the external network coordinator;
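The paillier encryption processor of steps S201/S202 can be sketched with a minimal textbook Paillier implementation. The primes here are toy-sized for illustration (real deployments use moduli of 2048 bits or more), and only the encrypt/decrypt primitives the patent relies on are shown; Paillier's additive homomorphism (ciphertext product decrypts to plaintext sum) is what makes it a common choice for federated intermediate parameters.

```python
import math
import random

def keygen(p=2011, q=2003):
    # Toy primes for illustration; real Paillier uses >= 2048-bit moduli.
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    n2 = n * n
    mu = pow((pow(g, lam, n2) - 1) // n, -1, n)
    return (n, g), (lam, mu, n)        # (PubKey, PrivKey)

def encrypt(pub, m):
    # c = g^m * r^n mod n^2, with random r coprime to n
    n, g = pub
    n2 = n * n
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(priv, c):
    # m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n
    lam, mu, n = priv
    n2 = n * n
    return ((pow(c, lam, n2) - 1) // n) * mu % n

pub, priv = keygen()
ct0 = encrypt(pub, 123)   # e.g. one quantized intermediate-parameter value
```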
step S203, the off-line transmission module accesses the external network federated learning cluster network; the external network transmission monitoring module receives PubKey and the intranet participant node identifier h_i transmitted in step S201, encrypts the coordinator's PubKey and h_i with the public key P_i in the intranet public key set X_0 received in step S103 to generate an encrypted file priFile, and writes the encrypted file into the off-line transmission module at the file path: [drive#:][/]fldir/h_i/task/n/priFile; the current task is the first batch iteration task, and the value of n is 01;
step S204, the off-line transmission module disconnects from the external network and accesses the intranet federated learning cluster network; the intranet transmission monitoring module scans the file path of the off-line transmission module, reads the encrypted file priFile generated in step S203, decrypts the encrypted file with the private key k_i corresponding to node identifier h_i from the key pair set S generated in step S102 to obtain the coordinator's public key PubKey, and sends the obtained PubKey to the intranet participant;
Step S205, the intranet participant receives the public key PubKey decrypted in step S204, synchronously sets the training batch in which local data participates in each iteration, creates a paillier encryption processor, initializes the model and starts the first iteration calculation, marking the unilateral gradient generated in the current iteration round as grad_j2_n and the loss value as loss_j2_n to obtain the intermediate parameters, where n is the round of task iterative computation and j2 is the parameter identifier belonging to the intranet participant; the paillier encryption processor encrypts the intermediate parameters (grad_j2_n, loss_j2_n) with PubKey to generate intermediate parameter ciphertext data CT1_n, which is sent to the intranet transmission monitoring module;
step S206, the intranet transmission monitoring module receives the ciphertext data CT1_n of step S205, signs the ciphertext data CT1_n with the private key k_i corresponding to participant node identifier h_i in the key pair set S generated in step S102 to generate a signature file CTFile_n, and writes the signature file into the off-line transmission module at the file path: [drive#:][/]fldir/h_i/task/n/CTFile_n.
9. the cross-network federal learning method as claimed in claim 8, wherein said step S20 further comprises:
step S207, after step S203, the off-line transmission module disconnects from the intranet and accesses the external network federated learning cluster network; the external network transmission monitoring module scans the file path of the off-line transmission module, reads the signature file CTFile_n generated in step S206, and verifies its signature with the public key P_i corresponding to participant identifier h_i in the intranet public key set X_0 received in step S103; after the signature is verified successfully, the ciphertext data CT1_n generated in step S205 is obtained and sent to the coordinator; if signature verification fails, the off-line transmission module is at risk of having been tampered with, and the privacy computation task is terminated;
step S208, the external network coordinator receives the intermediate parameter ciphertext data CT1_n of step S207 and the intermediate parameter ciphertext data CT0_n of step S202 respectively, and decrypts both parties' intermediate parameters with the private key PrivKey generated in step S201; after decryption, the external network coordinator obtains the intermediate parameter plaintext data composed of the unilateral gradient grad_j1_n and loss value loss_j1_n of the external network initiator and the unilateral gradient grad_j2_n and loss value loss_j2_n of the intranet participant from the first round of iterative computation;
step S209, the external network coordinator optimizes and aggregates the two parties' gradients obtained in step S208 into a total gradient after the first iteration optimization, marked Total_dt_n, and segments the optimized total gradient into the optimized unilateral gradient dt_j1_n of the external network initiator and the optimized unilateral gradient dt_j2_n of the intranet participant; the optimized gradients are sent respectively to the external network initiator and the external network transmission monitoring module, and at the same time the two parties' loss values obtained in step S208 are aggregated into the loss value after the first iteration, marked iter_loss_n.
10. The cross-network federal learning method as claimed in claim 9, wherein said step S30 includes:
step S301, the external network coordinator sends the optimized unilateral gradient dt_j1_n to the external network initiator node, and sends the optimized unilateral gradient dt_j2_n together with the intranet participant node identifier h_i to the external network transmission monitoring module; the external network initiator node receives the unilateral gradient dt_j1_n, updates the local model parameters according to the optimized gradient, performs the next iteration batch calculation, marks the unilateral gradient generated in the current iteration round as grad_j1_{n+1} and the loss value as loss_j1_{n+1} as intermediate parameters, and encrypts the intermediate parameters (grad_j1_{n+1}, loss_j1_{n+1}) with PubKey through the paillier encryption processor to generate ciphertext data CT0_{n+1};
Step S302, the offline transmission module accesses the external network federated learning cluster network; the external network transmission monitoring module receives the optimized unilateral gradient dt_j2_n and participant node identifier h_i transmitted in step S301, encrypts dt_j2_n and h_i with the public key P_i in the intranet public key set X_0 received in step S103 to generate an encrypted file marked wFile_n, and writes the encrypted file into the offline transmission module at the file path: [drive#:][/]fldir/h_i/task/n/wFile_n;
step S303, after step S302 is completed, the offline transmission module disconnects from the external network and accesses the intranet federated learning cluster network; the intranet transmission monitoring module scans the file path of the offline transmission module, reads the encrypted file wFile_n generated in step S302, decrypts the encrypted file with the private key k_i corresponding to node identifier h_i from the key pair set S generated in step S102 to obtain the decrypted unilateral gradient dt_j2_n, and sends the obtained gradient dt_j2_n to the intranet participant;
step S304, the intranet participant receives the unilateral gradient dt_j2_n decrypted in step S303, updates the local model parameters according to the received unilateral gradient, performs the next iteration batch calculation, marks the unilateral gradient generated in the current iteration round as grad_j2_{n+1} and the loss value as loss_j2_{n+1} as intermediate parameters, encrypts the intermediate parameters (grad_j2_{n+1}, loss_j2_{n+1}) with PubKey through the paillier encryption processor to generate intermediate parameter ciphertext data CT1_{n+1}, and sends the ciphertext data to the intranet transmission monitoring module;
step S305, the intranet transmission monitoring module receives the intermediate parameter ciphertext data CT1_{n+1} of step S304, signs the ciphertext data CT1_{n+1} with the private key k_i corresponding to the intranet participant node identifier h_i in the key pair set S generated in step S102 to generate a signature file CTFile_{n+1}, and writes the signature file into the offline transmission module at the file path:
[drive#:][/]fldir/h_i/task/n/CTFile_{n+1};
step S306, after step S305 is completed, the offline transmission module disconnects from the intranet and accesses the external network federated learning cluster network;
step S307, the external network transmission monitoring module scans the file path of the offline transmission module, reads the signature file CTFile_{n+1} generated in step S305, and verifies its signature with the public key P_i corresponding to participant identifier h_i in the intranet public key set X_0 received in step S103; after the signature is verified successfully, the ciphertext data CT1_{n+1} generated in step S304 is obtained and sent to the external network coordinator; if signature verification fails, the offline transmission module is at risk of having been tampered with, and the privacy computation task is terminated;
step S308, the external network coordinator receives the intermediate parameter ciphertext data CT1_{n+1} and the intermediate parameter ciphertext data CT0_{n+1} respectively, and decrypts both parties' intermediate parameters with the private key PrivKey generated in step S201; after decryption, the external network coordinator obtains the intermediate parameter plaintext data composed of the unilateral gradient grad_j1_{n+1} and loss value loss_j1_{n+1} of the external network initiator and the unilateral gradient grad_j2_{n+1} and loss value loss_j2_{n+1} of the intranet participant from the current round of iterative computation;
and the external network coordinator optimizes and aggregates the obtained gradients of the two parties into a total gradient after the current round of iterative optimization, marked Total_dt_{n+1}, and segments the optimized total gradient into the optimized unilateral gradient dt_j1_{n+1} of the external network initiator and the optimized unilateral gradient dt_j2_{n+1} of the intranet participant; the optimized gradients are sent respectively to the external network initiator and the external network transmission monitoring module. At the same time, the obtained loss values of the two parties are aggregated into the loss value after the current iteration round, marked iter_loss_{n+1}; the external network coordinator computes a convergence threshold σ² over the loss values iter_loss_i of all iteration batches, for example as their variance:

σ² = (1/n) · Σ_{i=1}^{n} (iter_loss_i − (1/n) · Σ_{i=1}^{n} iter_loss_i)²

where n is the iteration round, i is the current-round variable, and σ² is the convergence threshold; whether the model has converged is judged by whether σ² reaches a preset threshold.
CN202210823096.6A 2022-07-13 2022-07-13 Cross-network federal learning system and method Active CN115277696B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210823096.6A CN115277696B (en) 2022-07-13 2022-07-13 Cross-network federal learning system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210823096.6A CN115277696B (en) 2022-07-13 2022-07-13 Cross-network federal learning system and method

Publications (2)

Publication Number Publication Date
CN115277696A true CN115277696A (en) 2022-11-01
CN115277696B CN115277696B (en) 2023-04-18

Family

ID=83765984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210823096.6A Active CN115277696B (en) 2022-07-13 2022-07-13 Cross-network federal learning system and method

Country Status (1)

Country Link
CN (1) CN115277696B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115766189A (en) * 2022-11-10 2023-03-07 贵州电网有限责任公司 Multi-channel isolation safety protection method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487042A (en) * 2021-06-28 2021-10-08 海光信息技术股份有限公司 Federated learning method and device and federated learning system
CN114003950A (en) * 2021-10-19 2022-02-01 南京三眼精灵信息技术有限公司 Federal machine learning method, device, equipment and medium based on safety calculation
CN114139722A (en) * 2021-11-29 2022-03-04 广发银行股份有限公司 Block chain-based federal learning task scheduling method, system, device and medium
WO2022060264A1 (en) * 2020-09-18 2022-03-24 Telefonaktiebolaget Lm Ericsson (Publ) Methods and systems for updating machine learning models
CN114490704A (en) * 2020-11-13 2022-05-13 深圳前海微众银行股份有限公司 Data processing method, device, equipment and storage medium
CN114640498A (en) * 2022-01-27 2022-06-17 天津理工大学 Network intrusion cooperative detection method based on federal learning


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115766189A (en) * 2022-11-10 2023-03-07 贵州电网有限责任公司 Multi-channel isolation safety protection method and system
CN115766189B (en) * 2022-11-10 2024-05-03 贵州电网有限责任公司 Multichannel isolation safety protection method and system

Also Published As

Publication number Publication date
CN115277696B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
Cui et al. RSMA: Reputation system-based lightweight message authentication framework and protocol for 5G-enabled vehicular networks
Khan et al. An efficient and provably secure certificateless key-encapsulated signcryption scheme for flying ad-hoc network
US20210336792A1 (en) Leveraging multiple devices to enhance security of biometric authentication
CN113194469A (en) 5G unmanned aerial vehicle cross-domain identity authentication method, system and terminal based on block chain
RU2009120689A (en) DISTRIBUTED CANCELLATION OF AUTHORITY OF DEVICES
CN110113334A (en) Contract processing method, equipment and storage medium based on block chain
CN115277696B (en) Cross-network federal learning system and method
CN116957064A (en) Knowledge distillation-based federal learning privacy protection model training method and system
CN115935438A (en) Data privacy intersection system and method
Zhao et al. Fuzzy identity-based dynamic auditing of big data on cloud storage
CN116011014A (en) Privacy computing method and privacy computing system
CN105610872A (en) Internet of Things terminal encryption method and Internet of Things terminal encryption device
CN111709053B (en) Operation method and operation device based on loose coupling transaction network
CN114124347A (en) Safe multi-party computing method and system based on block chain
CN113328854A (en) Service processing method and system based on block chain
Zhou et al. VDFChain: Secure and verifiable decentralized federated learning via committee-based blockchain
CN114257419B (en) Device authentication method, device, computer device and storage medium
Zeng et al. Concurrently Deniable Group Key Agreement and Its Application to Privacy‐Preserving VANETs
Kiefer et al. Universally composable two-server PAKE
CN115361196A (en) Service interaction method based on block chain network
Ayad et al. An efficient authenticated group key agreement protocol for dynamic UAV fleets in untrusted environments
He et al. Efficient group key management for secure big data in predictable large‐scale networks
CN114172742A (en) Layered authentication method for power internet of things terminal equipment based on node map and edge authentication
CN116614273B (en) Federal learning data sharing system and model construction method in peer-to-peer network based on CP-ABE
Tian et al. A new construction for linkable secret handshake

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant