CN112232527A - Secure distributed federated deep learning method - Google Patents

Secure distributed federated deep learning method

Info

Publication number
CN112232527A
CN112232527A (application CN202010996487.9A)
Authority
CN
China
Prior art keywords
model
training
aggregation
node
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010996487.9A
Other languages
Chinese (zh)
Other versions
CN112232527B (en)
Inventor
黄小红
李丹丹
刘国智
钱叶魁
闪德胜
丛群
杨瑞朋
黄浩
夏军波
雒朝峰
李建华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202010996487.9A priority Critical patent/CN112232527B/en
Publication of CN112232527A publication Critical patent/CN112232527A/en
Application granted granted Critical
Publication of CN112232527B publication Critical patent/CN112232527B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

One or more embodiments of the present specification provide a secure distributed federated deep learning method that protects each participant's raw data during federated deep learning while preventing the parameters of the learning model from leaking that raw data. After identity authentication, each participant joins a Hyperledger blockchain, and one node provides an initial model and initial parameters. A smart contract divides the participants into aggregation nodes and ordinary nodes. An ordinary node trains the model it receives, encrypts the training result, and transmits it to an aggregation node; the aggregation node aggregates the encrypted models it receives and returns the result to the ordinary nodes; an ordinary node decrypts the aggregation result and continues training on it. After decryption, each ordinary node also verifies the model's performance and votes through the smart contract on whether to terminate learning. The method does not assume the existence of a semi-honest central server, which improves the security of the algorithm and better matches real-world deployment.

Description

Secure distributed federated deep learning method
Technical Field
One or more embodiments of the present disclosure relate to the technical field at the intersection of blockchain and artificial intelligence, and in particular to a secure distributed federated deep learning method.
Background
In each round of federated deep-learning training, every participant receives the same initial model and trains it on local data; the trained models are then sent to an aggregator, which aggregates them and returns the aggregated model to the participants for the next round. Training data is therefore never shared among participants during federated learning.
Differential privacy is a cryptographic technique and a widely accepted, rigorous privacy-protection model. Adding noise to the gradients produced when training a deep neural network realizes differential privacy: an attacker cannot determine whether a specific training example was used to train the network, and thus cannot recover the original training data from the published gradients.
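As a hedged illustration of the noise-adding step, the common clip-then-noise recipe (as in DP-SGD) can be sketched as follows. The function and parameter names are our own, and the patent does not fix a specific mechanism:

```python
import numpy as np

def dp_sanitize(grad, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip a gradient vector to a fixed L2 norm, then add Gaussian noise.

    Illustrative DP-SGD-style sanitization; names and defaults are assumptions,
    not values taken from the patent.
    """
    rng = rng or np.random.default_rng(0)
    norm = np.linalg.norm(grad)
    clipped = grad / max(1.0, norm / clip_norm)   # L2 clipping to clip_norm
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise
```

With `noise_multiplier=0` the function reduces to pure clipping, which makes the bound on the output's norm easy to check.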
A federated learning algorithm without a security configuration risks leaking private information when gradients are shared. The conventional approach is to add differential-privacy noise to the raw gradients and to let a semi-honest central server aggregate the models. This approach faces the following problems:
1. A semi-honest server does not necessarily exist in a real environment, and even if one does, it remains subject to single-point failure or to dishonest behavior induced by attacks.
2. For the mainstream implementations of differential privacy for deep neural networks, such as DP-SGD, it has been shown that repeatedly adding differential-privacy noise to the same training data weakens the protection, and that the degradation can be estimated; exposing noised gradients to the semi-honest server therefore still risks data leakage.
In short, prior-art federated learning has two problems: the semi-honest central server is unreliable in a real working environment, and the data-transfer process carries privacy risks.
Disclosure of Invention
In view of the above, an object of one or more embodiments of the present disclosure is to provide a secure distributed federated deep learning method that addresses the two prior-art problems of an unreliable semi-honest server and privacy risk during data transfer.
In view of the above, one or more embodiments of the present specification provide a secure distributed federated deep learning method, including:
all participants join the blockchain;
one participant is selected as the learning organizer, which generates an initial model and initial parameters and distributes them to the other participants;
the participants are divided into aggregation nodes and ordinary nodes according to a smart contract, the initial model is trained with the distributed federated deep learning method, and a model meeting the requirements is obtained when training finishes.
In the first training round, an ordinary node trains the received initial model; in subsequent rounds it trains the received global model, which the aggregation nodes obtain by aggregating the training results sent by all ordinary nodes in the previous round.
Each ordinary node evaluates the received global model; if its performance meets expectations, training ends and the required model is obtained.
If the number of training rounds reaches a set threshold while the global model received by the ordinary nodes still falls short of expectations, the participants are re-divided into new ordinary nodes and new aggregation nodes, and the global model is handed to the new ordinary nodes for further training.
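The aggregation step above can be sketched as a size-weighted average of the ordinary nodes' updates (a FedAvg-style rule). This is only an illustrative assumption: the patent leaves the exact rule to the aggregation chaincode.

```python
def aggregate(updates, sizes):
    """Weighted average of per-node weight vectors (FedAvg-style sketch).

    updates: list of equal-length lists of floats, one per ordinary node
    sizes:   local training-set size of each node, used as its weight
    """
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * s for u, s in zip(updates, sizes)) / total
            for i in range(dim)]
```

For example, two nodes with equal-sized data sets contribute equally, while a node with three times the data pulls the average three times as hard.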
Based on the same inventive concept, one or more embodiments of the present specification further provide an apparatus implementing the secure distributed federated deep learning method, including:
a model-initialization module responsible for generating the initial model and initial parameters and distributing them;
a node-division module responsible for dividing the participants into ordinary nodes and aggregation nodes;
and a model-training module responsible for training the initial model until the global model meets the expected requirements, then outputting that model.
Based on the same inventive concept, one or more embodiments of the present specification further provide an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements any of the methods described above.
As can be seen from the above description, the secure distributed federated deep learning method provided by one or more embodiments of the present specification has the following beneficial effects:
Building on the conventional scheme, the method uses a blockchain to maintain, for all participants in the learning process, a distributed database that is eventually consistent, tamper-proof, and supports dynamic joining and leaving.
The private data structure of the Hyperledger ledger is used to balance data privacy against verifiability, and to ensure that most nodes in the network do not incur excessive time and space costs from the introduction of the blockchain.
A smart contract automatically controls the learning process.
Data already protected with differential-privacy noise is encrypted again with semi-homomorphic encryption, further improving security.
Unlike existing methods, this method does not assume that a semi-honest third party exists in the system, making it a safer and more practical federated learning system.
Drawings
In order to more clearly illustrate one or more embodiments or prior art solutions of the present specification, the drawings that are needed in the description of the embodiments or prior art will be briefly described below, and it is obvious that the drawings in the following description are only one or more embodiments of the present specification, and that other drawings may be obtained by those skilled in the art without inventive effort from these drawings.
FIG. 1 is a schematic workflow diagram of the secure distributed federated deep learning method described in this specification;
FIG. 2 is a flow diagram of model training in accordance with one or more embodiments of the present disclosure;
FIG. 3 is a flow diagram illustrating the operation of a generic node in one or more embodiments of the present disclosure;
FIG. 4 is a flow diagram illustrating operation of an aggregation node in one or more embodiments of the present disclosure;
FIG. 5 is a schematic structural diagram of an apparatus implementing the secure distributed federated deep learning method in one or more embodiments of the present specification;
fig. 6 is a schematic structural diagram of an electronic device according to one or more embodiments of the present disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It is to be noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present specification should have the ordinary meaning as understood by those of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and similar terms in one or more embodiments of the specification is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items.
As discussed in the Background section, existing federated deep learning methods are problematic. In the course of implementing the present disclosure, the applicant found that conventional federated deep learning aggregates models at a semi-honest central server and applies only differential-privacy noise to the raw gradient data. Such a scheme faces the following problems: a semi-honest server does not necessarily exist in a real environment, and may suffer single-point failure or be driven to dishonest behavior by attacks; and for the mainstream deep-neural-network implementations of differential privacy, it has been shown that repeatedly adding differential-privacy noise to the same training data weakens the protection by an estimable amount, so exposing noised gradients to a semi-honest server still risks data leakage. In other words, the conventional scheme's protection of private data is not strong enough.
In view of this, one or more embodiments of the present disclosure provide a secure distributed federated deep learning method whose overall steps are shown in FIG. 1. Specifically, all participants in the federated deep learning join a Hyperledger ledger, a widely used blockchain system. One participant on the chain is then selected to perform model initialization, producing an initial model and initial parameters that are distributed to the remaining participants on the chain. Next, a smart contract divides all participants into aggregation nodes and ordinary nodes: the ordinary nodes train the initial model (and, later, the global model) and transmit their training results to the aggregation nodes, which aggregate those results into a global model and return it to the ordinary nodes. After receiving the global model, an ordinary node verifies its performance; if the preset requirement is met, training stops and the qualifying model is output as the result; otherwise training continues, with ordinary and aggregation nodes re-divided after a certain number of rounds.
It can be seen that the secure distributed federated learning method of one or more embodiments of this specification, based on blockchain and smart-contract technology, places the aggregation of training results on distributed aggregation nodes. Because the aggregation nodes are not fixed participants but rotate periodically as training proceeds, the drawbacks of a semi-honest central server are effectively avoided and the protection of private data improves.
The secure distributed federated deep learning method described in one or more embodiments of this specification is applied to training an abnormal-traffic recognition model in a cross-domain abnormal-traffic recognition scenario.
In abnormal-traffic identification, publicly released training data sets appear infrequently and omit common attacks such as ransomware, worms, and Trojans, so a model trained only on such data struggles against the complex attack traffic of a real network. No single network domain's administrator can label enough high-quality training data, and the traffic collected in any single domain cannot reliably reveal unseen attacks. Moreover, because network traffic can expose sensitive information and user data within a domain, the data sets cannot simply be pooled across domains. Federated learning allows multiple parties to train a model jointly without exchanging raw data.
The technical solutions of one or more embodiments of the present specification are described in detail below with reference to specific embodiments.
Referring to FIG. 2, the secure distributed federated deep learning method of one or more embodiments of this specification includes the following steps:
Step S201: participants join the blockchain.
In this step, the blockchain the participants join is a Hyperledger ledger built on consortium-chain technology. The ledger has a preset admission mechanism that vets the qualifications of any new participant wishing to join the federated deep learning; only participants who pass the admission mechanism take part in subsequent steps.
In this embodiment, the participants in the federated deep learning are several independent network domains. When a new participant wants to join the blockchain, the existing participants verify its identity online, confirming that the identity is valid, that it is not already participating in the federated learning system under another identity, and that it can produce reliable labeled data for training the network model. Existing participants then make security endorsements for the new participant and record the endorsement results on the ledger. Once more than 10% of the participants in the system have made a security endorsement, or approved the endorsements made by others, the CA node of the ledger issues a certificate to the new participant, which uses the certificate to create a ledger node and join.
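The admission tally above (a certificate issued once more than 10% of participants endorse) can be sketched as a simple vote count. The helper below is hypothetical and omits Hyperledger Fabric's actual endorsement-policy and CA machinery:

```python
def certificate_granted(endorsers, total_participants, threshold=0.10):
    """Return True once the share of distinct endorsing participants exceeds
    the admission threshold (the embodiment above uses 10%); at that point
    the CA node would issue the certificate. Simplified illustrative tally.
    """
    return len(set(endorsers)) / total_participants > threshold
```

In a system of 10 participants, two endorsers (20%) clear the 10% bar, while a single endorser (exactly 10%) does not, since the text requires "more than 10%".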
Step S202: one participant is selected as the learning organizer, which generates the initial model and initial parameters and distributes them to each participant.
In this step, one of the network domains that joined the ledger in step S201 is selected as the learning organizer. The organizer generates an initial model N and the model's initial parameters, including the learning rate l, the learning-rate decay l_dec, the L2 regularization term l2, the dropout rate r_dropout, and so on. It uploads the hash value of the initial model and the initial parameters to the ledger's public data set, then distributes the model itself to the remaining network domains peer-to-peer, off the blockchain. After receiving the initial model, each network domain computes its hash, compares it with the hash retained in the public data set to verify the model's integrity, and publishes a 'ready' message on the ledger once verification succeeds. When more than 50% of the nodes in the network have published this message, the next step can proceed.
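The off-chain distribution with an on-chain hash can be sketched as below. SHA-256 is an assumption on our part, since the patent only says "hash value":

```python
import hashlib

def model_digest(model_bytes: bytes) -> str:
    """Digest of the serialized model, as published to the public data set."""
    return hashlib.sha256(model_bytes).hexdigest()

def verify_integrity(received: bytes, on_chain_digest: str) -> bool:
    """Recompute the hash of the model received peer-to-peer and compare it
    with the digest the organizer recorded on-chain."""
    return model_digest(received) == on_chain_digest
```

A receiving domain would publish its 'ready' message only when `verify_integrity` returns True.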
Step S203: the participants are divided into aggregation nodes and ordinary nodes according to a smart contract, and the federated deep learning begins.
In this step, after all participants (including the learning organizer) are divided into aggregation nodes and ordinary nodes, the ordinary nodes carve out in the ledger a first private data set that is readable and writable only by ordinary nodes; they negotiate a homomorphic-encryption public/private key pair, store the keys in the first private data set, and store the keys' hash in the public data set. The aggregation nodes likewise carve out a second private data set readable and writable only by aggregation nodes, upload the aggregation chaincode to it, and store its hash in the public data set. The ordinary nodes then start the first round of federated deep learning with the received initial model and initial parameters.
Any ordinary node may generate the round's homomorphic-encryption key pair, write it into the first private data set, and upload the hash of the key data to the public data set for record-keeping.
The other ordinary nodes confirm the validity of the key and vote; the first valid key confirmed by more than 50% of the online ordinary nodes becomes the round's homomorphic-encryption key.
Step S204: the ordinary nodes train the initial model or the global model and transmit their training results to the aggregation nodes; the aggregation nodes aggregate the received results into a global model and transmit it back to the ordinary nodes.
In the first round, an ordinary node trains the initial model on local data, adds differential-privacy noise to the training result, encrypts it with the homomorphic-encryption public key, and transmits it to an aggregation node. On receipt, the aggregation node stores the training result in the second private data set and stores its hash in the public data set. The aggregation node then aggregates the training results according to the aggregation chaincode to obtain the global model, which is likewise stored in the second private data set with its hash in the public data set. Finally, the aggregation node transmits the global model to the ordinary nodes over a non-blockchain channel.
The global models and training results stored in the second private data set are kept only for a limited number of rounds and then deleted, while the attestation records in the public data set are kept permanently; this preserves the data's verifiability while introducing only a small storage burden.
In subsequent rounds, an ordinary node decrypts the received global model with the homomorphic-encryption private key and proceeds with the decrypted model.
The local data an ordinary node uses to train the model is the traffic data collected within its network domain by that domain's administrator.
Homomorphic encryption is a concept from cryptography: performing a mathematical operation on ciphertexts and then decrypting yields the same result as performing that operation directly on the plaintexts. If a homomorphic scheme supports addition, subtraction, multiplication, and division, it is called fully homomorphic; if it supports only some operations, such as addition and multiplication, it is called semi-homomorphic.
In general, fully homomorphic encryption is more time-consuming than semi-homomorphic encryption and inflates the encrypted data more. The homomorphic encryption referred to herein is a lightweight, semi-homomorphic scheme supporting only addition and multiplication.
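As one concrete example of such a semi-homomorphic scheme, a toy Paillier cryptosystem (additively homomorphic: multiplying two ciphertexts adds the underlying plaintexts) can be sketched in a few lines. The primes below are deliberately tiny and insecure, chosen only to illustrate the homomorphic property; the patent does not name a specific scheme:

```python
import math
import random

def keygen(p=293, q=433):
    """Toy Paillier key pair from fixed small primes (demo only, NOT secure)."""
    n = p * q
    n2 = n * n
    g = n + 1
    lam = math.lcm(p - 1, q - 1)
    # mu = (L(g^lam mod n^2))^-1 mod n, where L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n2) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m, rng=None):
    n, g = pub
    n2 = n * n
    rng = rng or random.Random(7)      # fixed seed keeps the demo deterministic
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:         # blinding factor must be invertible mod n
        r = rng.randrange(1, n)
    return pow(g, m, n2) * pow(r, n, n2) % n2

def hadd(pub, c1, c2):
    """Homomorphic addition: the product of ciphertexts decrypts to the sum."""
    n, _ = pub
    return c1 * c2 % (n * n)

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n
```

An aggregation node can therefore sum the ordinary nodes' encrypted updates with `hadd` without ever seeing the plaintext values; only holders of the private key (the ordinary nodes) can decrypt the aggregate.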
Step S205: the ordinary nodes verify the performance of the global model and decide whether to terminate learning.
In this step, an ordinary node computes the hash of the received global model and compares it with the hash stored in the public data set to verify the model's integrity. After verification, it decrypts the global model with the homomorphic-encryption private key, uses local data to check whether the decrypted model meets the preset requirement, and decides the next step accordingly.
Step S206: learning ends and the result is output.
This step is reached when, in step S205, an ordinary node finds through local validation that the decrypted model meets the preset requirement. The federated deep learning task is then complete: training ends and the qualifying model is output as the result of the federated deep learning, which here is the abnormal-traffic recognition model.
Step S207: determine whether the number of learning rounds has reached the threshold.
This step is reached when the ordinary node's local validation in step S205 shows that the decrypted model does not meet the preset requirement. The smart contract then checks whether the number of learning rounds has reached the set threshold. If so, the method returns to step S203, re-divides ordinary and aggregation nodes per the smart contract, and continues training; if not, it returns to step S204 and the ordinary nodes continue training on local data.
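Steps S204 through S207 form a control loop, which can be sketched abstractly as follows. Encryption, hashing, and on-chain bookkeeping are elided, and all callbacks are placeholders supplied by the caller:

```python
def federated_rounds(train, aggregate, evaluate, init_model,
                     num_nodes=4, round_threshold=5, max_rounds=50):
    """Control loop over steps S204-S207: train, aggregate, evaluate;
    stop when the global model meets the target, and re-divide node roles
    every `round_threshold` rounds. Illustrative sketch only."""
    model = init_model
    for rnd in range(1, max_rounds + 1):
        updates = [train(node, model) for node in range(num_nodes)]  # S204
        model = aggregate(updates)                                   # S204
        if evaluate(model):                                          # S205/S206
            return model, rnd
        if rnd % round_threshold == 0:                               # S207 -> S203
            pass  # re-elect aggregation vs. ordinary nodes here
    return model, max_rounds
```

A toy run with nodes that each nudge a scalar model upward shows the loop terminating as soon as the evaluation target is reached.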
As an alternative embodiment, referring to FIG. 3, an ordinary node in step S204 of the foregoing embodiment may also follow these working steps:
S301: the ordinary node receives the initial model and trains it on local data;
S302: the ordinary node adds differential-privacy noise to the training result, encrypts it with the homomorphic-encryption public key, and transmits it to the aggregation node;
S303: the ordinary node receives the global model aggregated by the aggregation nodes;
S304: the ordinary node verifies the global model's integrity, decrypts it with the homomorphic-encryption private key, and checks whether the decrypted result meets the requirement;
S305: if the decrypted result meets the requirement, the qualifying model is output; otherwise training continues from the decrypted result.
As an alternative embodiment, referring to FIG. 4, an aggregation node in step S204 of the foregoing embodiment may also follow these working steps:
S401: the aggregation node receives the training results from all ordinary nodes;
S402: the aggregation node stores the training results in the second private data set and their hashes in the public data set;
S403: the aggregation node performs the aggregation operation to obtain the global model;
S404: the aggregation node stores the global model in the second private data set and its hash in the public data set;
S405: the aggregation node transmits the global model to the ordinary nodes.
Throughout the federated deep learning process, the mechanisms by which ordinary nodes transmit training results to aggregation nodes, and aggregation nodes transmit the global model back, deliberately record only hashes on-chain and transfer the data itself off-chain. This avoids both the blockchain bloat that the chain's immutability would cause if models were sent on-chain, and the privacy leakage that the chain's openness would entail.
It should be noted that the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene and completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the devices may perform only one or more steps of the method of one or more embodiments of the present disclosure, and the devices may interact with each other to complete the method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Based on the same inventive concept, one or more embodiments of the invention also provide an apparatus implementing the secure distributed federated deep learning method. Referring to FIG. 5, the apparatus includes:
a model-initialization module 501, which, after the participants join the ledger, selects one participant as the learning organizer; the organizer generates the initial model and initial parameters and distributes them to the other participants;
a node-division module 502, which divides all participants into ordinary nodes and aggregation nodes before the federated deep learning starts, and re-divides them after the number of training rounds reaches the set threshold;
a model-training module 503, which trains the initial model until a model meeting the preset requirements is obtained.
As an alternative embodiment, the model-training module 503 comprises the aggregation nodes and the ordinary nodes. In the first round, an ordinary node trains the initial model on local data, adds differential-privacy noise to the training result, homomorphically encrypts it, and transmits it to an aggregation node; the aggregation node aggregates the received training results according to the aggregation chaincode into a global model and transmits it to the ordinary nodes. After receiving the global model, an ordinary node decrypts it with the homomorphic-encryption private key, validates the decrypted model on local data, and decides whether to terminate training based on whether it meets the preset requirement. If it does, training terminates and the decrypted model is output as the qualifying model; if not, the module checks whether the number of learning rounds has reached the preset threshold, re-dividing aggregation and ordinary nodes and continuing with the new ordinary nodes if it has, or simply continuing training if it has not.
For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. Of course, the functionality of the modules may be implemented in the same one or more software and/or hardware implementations in implementing one or more embodiments of the present description.
The apparatus of the foregoing embodiment is used to implement the corresponding method in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Based on the same inventive concept, one or more embodiments of the present specification further provide an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the secure distributed federated deep learning method described in any of the above embodiments.
Fig. 6 is a schematic diagram illustrating a more specific hardware structure of an electronic device according to this embodiment, where the electronic device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs so as to implement the technical solutions provided in the embodiments of the present specification.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The input/output module may be configured as a component within the device (not shown in the figure) or externally connected to the device to provide the corresponding functions. Input devices may include a keyboard, mouse, touch screen, microphone, and various sensors; output devices may include a display, speaker, vibrator, and indicator lights.
The communication interface 1040 is used for connecting a communication module (not shown in the figure) to implement communication between this device and other devices. The communication module may communicate in a wired manner (e.g., USB or network cable) or wirelessly (e.g., mobile network, Wi-Fi, or Bluetooth).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The electronic device of the foregoing embodiment is used to implement the corresponding method in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Those of ordinary skill in the art will understand that: the discussion of any embodiment above is meant to be exemplary only, and is not intended to intimate that the scope of the disclosure, including the claims, is limited to these examples; within the spirit of the present disclosure, features from the above embodiments or from different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of different aspects of one or more embodiments of the present description as described above, which are not provided in detail for the sake of brevity.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, the discussed embodiments may be used with other memory architectures (e.g., dynamic RAM (DRAM)).
It is intended that the one or more embodiments of the present specification embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of one or more embodiments of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (10)

1. A secure distributed federated deep learning method, comprising:
all participants join the blockchain;
selecting one participant as a learning organizer, enabling the learning organizer to generate an initial model and initial parameters, and distributing the initial model and the initial parameters to the participants;
dividing the participants into aggregation nodes and common nodes according to a smart contract, training the initial model using a distributed federated deep learning method, and obtaining a model meeting the requirements when the training finishes;
in the first round of training, the common node trains the received initial model; in subsequent training rounds, the common node trains the received global model, the global model being obtained by the aggregation node aggregating the training results sent by all the common nodes in the previous training round.
2. The method of claim 1, wherein in the subsequent training, before the ordinary node trains the received global model, the method further comprises:
and the common node evaluates the effect of the received global model, and if the effect reaches the expectation, the training is finished and the model meeting the requirements is obtained.
3. The method according to claim 2, wherein the training of the initial model using a distributed federated deep learning method specifically comprises:
and when the number of training rounds reaches a set threshold value and the effect of the global model received by the common node does not reach the expectation, the participants are divided into new common nodes and new aggregation nodes again, and the global model is transmitted to the new common nodes for training.
4. The method of claim 1, wherein the blockchain is a Hyperledger based on a consortium chain;
the Hyperledger comprises a public data set, a first private data set, and a second private data set;
the public data set is readable and writable by all the participants, the first private data set is readable and writable only by the common nodes, and the second private data set is readable and writable only by the aggregation nodes.
5. The method of claim 4, wherein:
the common nodes negotiate a homomorphic encryption public/private key pair, the public key being used to encrypt the training results and the private key to decrypt the global model;
and the aggregation node generates an aggregation chain code, and the aggregation node aggregates the training result according to the aggregation chain code to obtain the global model.
6. The method of claim 5, wherein the initial model is transmitted to the common node, and wherein the training results and the global model are transmitted between the common node and the aggregation node without passing through the blockchain;
the common nodes use local data of each common node to train the received initial model or the global model;
the common node encrypts its training result with the homomorphic encryption public key and sends the ciphertext to the aggregation node; the aggregation node aggregates the ciphertexts to obtain the global model, which is itself a homomorphically encrypted model;
and the common node decrypts the global model with the corresponding homomorphic encryption private key, evaluates the effect of the decrypted model obtained by the decryption operation, and judges whether to terminate training.
7. The method of claim 6, wherein:
the homomorphic encrypted public and private keys are stored in the first private data set, and the training results received by the aggregation node, the global model obtained by aggregation and the aggregation chain code are stored in the second private data set;
the public data set stores the initial model and the initial parameters, and hash values of data in the first private data set and the second private data set;
and the training results and global models stored in the second private data set are, after being retained for a finite number of rounds, deleted in rotation by the common nodes and the aggregation nodes, while the hash values stored in the public data set are always retained.
8. An apparatus for implementing a secure distributed federated deep learning method, comprising:
the model initialization module is responsible for generating the initial model and the initial parameters and distributing the initial model and the initial parameters;
the node dividing module is responsible for dividing the participants into the common nodes and the aggregation nodes;
and the model training module is responsible for training the initial model until the global model effect reaches the expected requirement, and outputting the model meeting the requirement.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any one of claims 1 to 7 when executing the program.
10. The secure distributed federated deep learning method, implementation apparatus, or electronic device according to any one of claims 1 to 9, applied to the training of an abnormal traffic recognition model.
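A compact way to see how claims 4 and 7 fit together is the following Python sketch, which stands in for the Hyperledger private data collections with plain dictionaries; all collection and key names are hypothetical. The design mirrors the claimed scheme: private records stay off the public data set, but their hashes go on it, so any later disclosure can be audited even after the private copies are deleted in rotation.

```python
import hashlib
import json

def sha256(obj) -> str:
    # Deterministic hash of a JSON-serializable record.
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

# Hypothetical stand-ins for the three data sets of claim 4.
public_data_set = {}   # readable/writable by all participants
first_private = {}     # common nodes only: homomorphic key pair
second_private = {}    # aggregation nodes only: training results, global model

def put_private(collection: dict, key: str, record) -> None:
    # Store the record privately; only its hash reaches the public data
    # set, which is what survives after the rotation deletion of claim 7.
    collection[key] = record
    public_data_set[key] = sha256(record)

put_private(second_private, "round1/global_model", {"w": [1, 2, 3]})

def verify(key: str, record) -> bool:
    # Any participant can audit a disclosed record against the chain.
    return public_data_set.get(key) == sha256(record)

print(verify("round1/global_model", {"w": [1, 2, 3]}))  # True
print(verify("round1/global_model", {"w": [1, 2, 9]}))  # tampered -> False
```

Hyperledger Fabric's private data collections follow a similar spirit, anchoring hashes of private data on the shared ledger, though the mechanism there is part of the platform rather than application code.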
CN202010996487.9A 2020-09-21 2020-09-21 Safe distributed federal deep learning method Active CN112232527B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010996487.9A CN112232527B (en) 2020-09-21 2020-09-21 Safe distributed federal deep learning method

Publications (2)

Publication Number Publication Date
CN112232527A true CN112232527A (en) 2021-01-15
CN112232527B CN112232527B (en) 2024-01-23

Family

ID=74108816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010996487.9A Active CN112232527B (en) 2020-09-21 2020-09-21 Safe distributed federal deep learning method

Country Status (1)

Country Link
CN (1) CN112232527B (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801307A (en) * 2021-04-13 2021-05-14 深圳索信达数据技术有限公司 Block chain-based federal learning method and device and computer equipment
CN112967812A (en) * 2021-04-20 2021-06-15 钟爱健康科技(广东)有限公司 Anti-theft attack medical diagnosis model protection method based on federal learning
CN113094761A (en) * 2021-04-25 2021-07-09 中山大学 Method for monitoring federated learning data tamper-proofing and related device
CN113204787A (en) * 2021-05-06 2021-08-03 广州大学 Block chain-based federated learning privacy protection method, system, device and medium
CN113240127A (en) * 2021-04-07 2021-08-10 睿蜂群(北京)科技有限公司 Federal learning-based training method and device, electronic equipment and storage medium
CN113408746A (en) * 2021-06-22 2021-09-17 深圳大学 Block chain-based distributed federal learning method and device and terminal equipment
CN113515760A (en) * 2021-05-28 2021-10-19 平安国际智慧城市科技股份有限公司 Horizontal federal learning method, device, computer equipment and storage medium
CN113609508A (en) * 2021-08-24 2021-11-05 上海点融信息科技有限责任公司 Block chain-based federal learning method, device, equipment and storage medium
CN113612598A (en) * 2021-08-02 2021-11-05 北京邮电大学 Internet of vehicles data sharing system and method based on secret sharing and federal learning
CN113807537A (en) * 2021-04-06 2021-12-17 京东科技控股股份有限公司 Data processing method and device for multi-source data, electronic equipment and storage medium
CN113852955A (en) * 2021-09-23 2021-12-28 北京邮电大学 Method for secure data transmission and legal node authentication in wireless sensing network
CN114612408A (en) * 2022-03-04 2022-06-10 拓微摹心数据科技(南京)有限公司 Heart image processing method based on federal deep learning
WO2022174533A1 (en) * 2021-02-20 2022-08-25 平安科技(深圳)有限公司 Federated learning method and apparatus based on self-organized cluster, device, and storage medium
WO2023138152A1 (en) * 2022-01-20 2023-07-27 广州广电运通金融电子股份有限公司 Federated learning method and system based on blockchain
CN116720594A (en) * 2023-08-09 2023-09-08 中国科学技术大学 Decentralized hierarchical federal learning method
CN117171814A (en) * 2023-09-28 2023-12-05 数力聚(北京)科技有限公司 Federal learning model integrity verification method, system, equipment and medium based on differential privacy
CN117371025A (en) * 2023-09-18 2024-01-09 泉城省实验室 Method and system for training decentralised machine learning model
CN117714217A (en) * 2024-02-06 2024-03-15 河北数云堂智能科技有限公司 Method and device for trusted federal intelligent security computing platform

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110572253A (en) * 2019-09-16 2019-12-13 济南大学 Method and system for enhancing privacy of federated learning training data
CN111125779A (en) * 2019-12-17 2020-05-08 山东浪潮人工智能研究院有限公司 Block chain-based federal learning method and device
CN111212110A (en) * 2019-12-13 2020-05-29 清华大学深圳国际研究生院 Block chain-based federal learning system and method
CN111552986A (en) * 2020-07-10 2020-08-18 鹏城实验室 Block chain-based federal modeling method, device, equipment and storage medium
CN111600707A (en) * 2020-05-15 2020-08-28 华南师范大学 Decentralized federal machine learning method under privacy protection
CN111611610A (en) * 2020-04-12 2020-09-01 西安电子科技大学 Federal learning information processing method, system, storage medium, program, and terminal


Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022174533A1 (en) * 2021-02-20 2022-08-25 平安科技(深圳)有限公司 Federated learning method and apparatus based on self-organized cluster, device, and storage medium
CN113807537B (en) * 2021-04-06 2023-12-05 京东科技控股股份有限公司 Data processing method and device for multi-source data, electronic equipment and storage medium
CN113807537A (en) * 2021-04-06 2021-12-17 京东科技控股股份有限公司 Data processing method and device for multi-source data, electronic equipment and storage medium
CN113240127A (en) * 2021-04-07 2021-08-10 睿蜂群(北京)科技有限公司 Federal learning-based training method and device, electronic equipment and storage medium
CN112801307A (en) * 2021-04-13 2021-05-14 深圳索信达数据技术有限公司 Block chain-based federal learning method and device and computer equipment
CN112967812A (en) * 2021-04-20 2021-06-15 钟爱健康科技(广东)有限公司 Anti-theft attack medical diagnosis model protection method based on federal learning
CN113094761A (en) * 2021-04-25 2021-07-09 中山大学 Method for monitoring federated learning data tamper-proofing and related device
CN113204787A (en) * 2021-05-06 2021-08-03 广州大学 Block chain-based federated learning privacy protection method, system, device and medium
CN113204787B (en) * 2021-05-06 2022-05-31 广州大学 Block chain-based federated learning privacy protection method, system, device and medium
CN113515760A (en) * 2021-05-28 2021-10-19 平安国际智慧城市科技股份有限公司 Horizontal federal learning method, device, computer equipment and storage medium
CN113515760B (en) * 2021-05-28 2024-03-15 平安国际智慧城市科技股份有限公司 Horizontal federal learning method, apparatus, computer device, and storage medium
CN113408746A (en) * 2021-06-22 2021-09-17 深圳大学 Block chain-based distributed federal learning method and device and terminal equipment
CN113612598A (en) * 2021-08-02 2021-11-05 北京邮电大学 Internet of vehicles data sharing system and method based on secret sharing and federal learning
CN113612598B (en) * 2021-08-02 2024-02-23 北京邮电大学 Internet of vehicles data sharing system and method based on secret sharing and federal learning
CN113609508A (en) * 2021-08-24 2021-11-05 上海点融信息科技有限责任公司 Block chain-based federal learning method, device, equipment and storage medium
CN113609508B (en) * 2021-08-24 2023-09-26 上海点融信息科技有限责任公司 Federal learning method, device, equipment and storage medium based on blockchain
CN113852955B (en) * 2021-09-23 2024-04-05 北京邮电大学 Method for secure data transmission and legal node authentication in wireless sensing network
CN113852955A (en) * 2021-09-23 2021-12-28 北京邮电大学 Method for secure data transmission and legal node authentication in wireless sensing network
WO2023138152A1 (en) * 2022-01-20 2023-07-27 广州广电运通金融电子股份有限公司 Federated learning method and system based on blockchain
CN114612408B (en) * 2022-03-04 2023-06-06 拓微摹心数据科技(南京)有限公司 Cardiac image processing method based on federal deep learning
CN114612408A (en) * 2022-03-04 2022-06-10 拓微摹心数据科技(南京)有限公司 Heart image processing method based on federal deep learning
CN116720594B (en) * 2023-08-09 2023-11-28 中国科学技术大学 Decentralized hierarchical federal learning method
CN116720594A (en) * 2023-08-09 2023-09-08 中国科学技术大学 Decentralized hierarchical federal learning method
CN117371025A (en) * 2023-09-18 2024-01-09 泉城省实验室 Method and system for training decentralised machine learning model
CN117371025B (en) * 2023-09-18 2024-04-16 泉城省实验室 Method and system for training decentralised machine learning model
CN117171814A (en) * 2023-09-28 2023-12-05 数力聚(北京)科技有限公司 Federal learning model integrity verification method, system, equipment and medium based on differential privacy
CN117171814B (en) * 2023-09-28 2024-06-04 数力聚(北京)科技有限公司 Federal learning model integrity verification method, system, equipment and medium based on differential privacy
CN117714217A (en) * 2024-02-06 2024-03-15 河北数云堂智能科技有限公司 Method and device for trusted federal intelligent security computing platform
CN117714217B (en) * 2024-02-06 2024-05-28 河北数云堂智能科技有限公司 Method and device for trusted federal intelligent security computing platform

Also Published As

Publication number Publication date
CN112232527B (en) 2024-01-23

Similar Documents

Publication Publication Date Title
CN112232527B (en) Safe distributed federal deep learning method
US11836616B2 (en) Auditable privacy protection deep learning platform construction method based on block chain incentive mechanism
JP7482982B2 (en) Computer-implemented method, system, and storage medium for blockchain
JP7289298B2 (en) Computer-implemented system and method for authorizing blockchain transactions using low-entropy passwords
US11228447B2 (en) Secure dynamic threshold signature scheme employing trusted hardware
JP6871380B2 (en) Information protection systems and methods
CN109255247A (en) Secure calculation method and device, electronic equipment
JP2019528592A (en) Method and system implemented by blockchain
Liu et al. An efficient method to enhance Bitcoin wallet security
CN112597542B (en) Aggregation method and device of target asset data, storage medium and electronic device
CN115795518B (en) Block chain-based federal learning privacy protection method
CN115037477A (en) Block chain-based federated learning privacy protection method
US20220374544A1 (en) Secure aggregation of information using federated learning
CN111738857B (en) Generation and verification method and device of concealed payment certificate applied to block chain
CN112231769A (en) Block chain-based numerical verification method and device, computer equipment and medium
CN113420886B (en) Training method, device, equipment and storage medium for longitudinal federal learning model
CN109660344A (en) Anti- quantum calculation block chain method of commerce and system based on unsymmetrical key pond route device
Dar et al. Blockchain based secure data exchange between cloud networks and smart hand-held devices for use in smart cities
CN113365264A (en) Block chain wireless network data transmission method, device and system
CN115118462B (en) Data privacy protection method based on convolution enhancement chain
CN110809000A (en) Service interaction method, device, equipment and storage medium based on block chain network
Lou et al. Blockchain-based privacy-preserving data-sharing framework using proxy re-encryption scheme and interplanetary file system
Maram Bitcoin generation using Blockchain technology
Sengupta et al. Blockchain-Enabled Verifiable Collaborative Learning for Industrial IoT
Zhu Privacy Preservation & Security Solutions in Blockchain Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant