CN114091651A - Method, device and system for multi-party joint training of a graph neural network - Google Patents


Info

Publication number
CN114091651A
CN114091651A (application CN202111297665.XA; granted publication CN114091651B)
Authority
CN
China
Prior art keywords
party
noise
data
gradient
encryption
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111297665.XA
Other languages
Chinese (zh)
Other versions
CN114091651B (en)
Inventor
倪翔
吕灵娟
许小龙
孟昌华
王维强
吕乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202111297665.XA priority Critical patent/CN114091651B/en
Publication of CN114091651A publication Critical patent/CN114091651A/en
Application granted granted Critical
Publication of CN114091651B publication Critical patent/CN114091651B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of this specification provide a method, device and system for multi-party joint training of a graph neural network that protects private data. The method comprises the following steps: the first party processes a first feature part of a sample object using a first parameter part of the graph neural network to obtain a first processing result; performs homomorphic encryption on the first processing result using a target public key of the controller to obtain a first encryption result; receives a second encryption result from the second party; obtains a first gradient ciphertext through homomorphic operations based on the first encryption result, the second encryption result and a preset loss function; adds a first noise ciphertext, obtained by encrypting first noise, to the first gradient ciphertext to obtain first encrypted noisy data and sends it to the controller; receives from the controller the first noisy data obtained by decrypting the first encrypted noisy data, and removes the first noise from it to obtain a first gradient plaintext; and updates the first parameter part based on the first gradient plaintext.

Description

Method, device and system for multi-party joint training of a graph neural network
Technical Field
The present disclosure relates to the field of data security technologies, and in particular to a method, an apparatus, and a system for multi-party joint training of a graph neural network that protects private data.
Background
The data required for machine learning often involves multiple domains. For example, in a machine-learning-based user classification scenario, an electronic payment platform owns a user's transaction flow data, a social platform owns the user's friend-contact data, and a banking institution owns the user's loan data. Such data often exists in the form of isolated islands. Owing to industry competition, data security and user privacy concerns, data integration faces great resistance, and gathering the data scattered across platforms to train a machine learning model is difficult to realize. Jointly training a machine learning model with multi-party data, on the premise that no data is leaked, has therefore become a major challenge.
Graph neural networks are widely used machine learning models. Compared with traditional neural networks, a graph neural network can capture not only the features of individual nodes but also the features of the association relationships between nodes, so it performs excellently in many machine learning tasks. Faced with the data-island phenomenon, however, how to combine multi-party data and safely train a graph neural network jointly among multiple parties remains a problem to be solved.
Disclosure of Invention
One or more embodiments of the present disclosure provide a method, an apparatus, and a system for multi-party joint training of a graph neural network that protects private data, so as to realize safe and effective joint training of a graph neural network among multiple parties.
According to a first aspect, there is provided a method for multi-party joint training of a graph neural network that protects private data, the multiple parties including a first party, a second party and a controller, the method being performed by the first party and comprising:
processing a first characteristic part of the sample object by using a first parameter part of the graph neural network to obtain a first processing result;
performing homomorphic encryption on the first processing result by using a target public key of the controller to obtain a first encryption result;
receiving a second encryption result from the second party, wherein the second encryption result is obtained by the second party performing homomorphic encryption on a second processing result by using the target public key, and the second processing result is obtained by the second party processing a second characteristic part of the sample object by using a second parameter part of the graph neural network;
obtaining a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function, wherein the loss function is in the form of an orthogonal polynomial approximating an activation function in the graph neural network;
adding a first noise ciphertext encrypted by first noise to the first gradient ciphertext to obtain first encrypted noisy data; sending the first encrypted noisy data to the controller;
receiving, from the controller, first noisy data obtained by decrypting the first encrypted noisy data, and removing the first noise from the first noisy data to obtain a first gradient plaintext;
and updating the first parameter part according to the first gradient plaintext.
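The encrypt–mask–decrypt–unmask round in the steps above can be sketched with a toy additively homomorphic scheme. The code below implements a minimal, insecure Paillier cryptosystem purely for illustration; all function names, parameter sizes and the integer "gradient" are assumptions, not the patent's API, and the graph-neural-network forward/backward computation is elided.

```python
import math
import random

def _probable_prime(bits):
    # Fermat test with fixed small bases -- adequate for an illustrative toy.
    while True:
        c = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if all(pow(a, c - 1, c) == 1 for a in (2, 3, 5, 7, 11, 13)):
            return c

def keygen(bits=256):
    # Toy Paillier key generation; parameter sizes are NOT secure.
    p, q = _probable_prime(bits // 2), _probable_prime(bits // 2)
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    g = n + 1
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)  # needs Python 3.8+
    return (n, g), (lam, mu, n)

def encrypt(pk, m):
    n, g = pk
    r = random.randrange(1, n)
    return (pow(g, m % n, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(sk, c):
    lam, mu, n = sk
    return (((pow(c, lam, n * n) - 1) // n) * mu) % n

def he_add(pk, c1, c2):
    # Additive homomorphism: Enc(a) * Enc(b) mod n^2 decrypts to a + b.
    n, _ = pk
    return (c1 * c2) % (n * n)

# Controller C generates the target key pair; the parties receive only pk.
pk, sk = keygen()

# First party A holds a gradient ciphertext [GA]pk (GA is a toy integer
# gradient here) and masks it with freshly drawn noise eps1 before sending.
GA, eps1 = 12345, random.randrange(1, 10 ** 6)
enc_noisy = he_add(pk, encrypt(pk, GA), encrypt(pk, eps1))

# Controller C decrypts and sees only the noisy gradient GA + eps1, never GA.
noisy_plain = decrypt(sk, enc_noisy)

# A removes its own noise to recover the gradient plaintext and update WA.
GA_recovered = noisy_plain - eps1
print(GA_recovered)  # -> 12345
```

The design point the sketch shows: the controller holds the only decryption key, yet never learns the true gradient, because the party masks it with noise only that party can remove.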
In one implementation, the sample objects are nodes or edges in a relational network graph corresponding to the graph neural network; the node represents one of the following business objects: users, merchants, articles; the edges represent associations between the business objects.
In one embodiment, the obtaining the first processing result includes:
multiplying the first parameter part by the first characteristic part to obtain the first processing result.
In an embodiment, before the obtaining the first processing result, the method further includes:
aggregating the first original characteristic part of the sample object with the first original characteristic parts of its neighbor objects to obtain the first characteristic part.
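A minimal sketch of this aggregation step, under the assumption of GraphSage-style mean aggregation (the embodiment does not fix the aggregator, and all names here are illustrative):

```python
def aggregate_feature(node, raw_features, neighbors):
    """Mean-aggregate a node's raw feature vector with its neighbors' vectors."""
    vectors = [raw_features[node]] + [raw_features[v] for v in neighbors[node]]
    dim = len(vectors[0])
    # Element-wise mean over the node itself and its neighborhood.
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# Toy relational network graph: u1 is connected to u2 and u3.
raw = {"u1": [1.0, 0.0], "u2": [0.0, 2.0], "u3": [2.0, 4.0]}
adj = {"u1": ["u2", "u3"]}
print(aggregate_feature("u1", raw, adj))  # -> [1.0, 2.0]
```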
In an embodiment, the obtaining, by a homomorphic operation, a first gradient ciphertext corresponding to the first parameter portion based on the first encryption result and the second encryption result and a preset loss function includes:
determining a loss value ciphertext through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function;
determining the first gradient ciphertext based on the loss value ciphertext and the first parameter portion.
In one embodiment, the method further comprises:
sending the loss value ciphertext to the controller;
the receiving, from the controller, of the first noisy data obtained by decrypting the first encrypted noisy data includes:
receiving the first noisy data sent by the controller when it is determined that the loss value plaintext corresponding to the loss value ciphertext is not lower than a preset loss threshold.
In one embodiment, the orthogonal polynomial is a second order orthogonal polynomial.
In one embodiment, the loss function is defined as an orthogonal polynomial with the sum of the first and second processing results as a variable.
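The idea behind these embodiments — replacing a nonlinear activation with a low-degree polynomial so that the loss can be evaluated with only additions and multiplications, which homomorphic schemes support — can be illustrated as below. The sigmoid activation, the fitting interval and the least-squares fit are illustrative assumptions; the patent specifies only a second-order orthogonal polynomial approximating the activation function.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_quadratic(f, lo=-4.0, hi=4.0, n=200):
    """Least-squares fit of c0 + c1*x + c2*x^2 to f over [lo, hi]."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ys = [f(x) for x in xs]
    # Power sums S_k = sum(x^k) build the 3x3 normal-equation matrix.
    S = [sum(x ** k for x in xs) for k in range(5)]
    A = [[S[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Solve A c = b by Gauss-Jordan elimination (pivots are nonzero here).
    for i in range(3):
        piv = A[i][i]
        A[i] = [a / piv for a in A[i]]
        b[i] /= piv
        for j in range(3):
            if j != i:
                factor = A[j][i]
                A[j] = [a - factor * ai for a, ai in zip(A[j], A[i])]
                b[j] -= factor * b[i]
    return b  # coefficients [c0, c1, c2]

c0, c1, c2 = fit_quadratic(sigmoid)

def approx(x):
    # Polynomial surrogate: evaluable with additions/multiplications only,
    # hence compatible with homomorphic operations on ciphertexts.
    return c0 + c1 * x + c2 * x * x

print(abs(approx(0.5) - sigmoid(0.5)) < 0.1)  # -> True
```

Over a symmetric interval the fitted quadratic is nearly linear (c2 ≈ 0), which matches the document's remark that the nonlinear computation is approximated linearly.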
In one embodiment, the method further comprises:
the target public key is received from the controller and stored.
According to a second aspect, there is provided a method of protecting a neural network of a multi-party joint training graph of private data, the method comprising:
the first party utilizes the first parameter part of the graph neural network to process the first characteristic part of the sample object to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller to obtain a first encryption result;
the second party processes the second characteristic part of the sample object by using the second parameter part of the graph neural network to obtain a second processing result; using the target public key to perform homomorphic encryption on the second processing result to obtain a second encryption result, and sending the second encryption result to the first party;
the first party obtains a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function; adding a first noise ciphertext encrypted by first noise to the first gradient ciphertext to obtain first encrypted noisy data; sending the first encrypted noisy data to the controller; wherein the loss function is in the form of an orthogonal polynomial approximating an activation function in the graph neural network;
after receiving the first encrypted noisy data, the controller decrypts it using a target private key corresponding to the target public key to obtain first noisy data, and sends the first noisy data to the first party;
and the first party receives the first noisy data, removes the first noise from it to obtain a first gradient plaintext, and updates the first parameter part using the first gradient plaintext.
According to a third aspect, there is provided an apparatus for protecting a neural network of a multi-party joint training graph of private data, the multi-party including a first party, a second party and a controller, the apparatus being deployed at the first party, the apparatus comprising:
a first processing module configured to process a first characteristic portion of the sample object by using a first parameter portion of the graph neural network to obtain a first processing result;
the homomorphic encryption module is configured to homomorphic encrypt the first processing result by using the target public key of the controller to obtain a first encryption result;
a first receiving module configured to receive a second encryption result from the second party, where the second encryption result is obtained by the second party performing homomorphic encryption on a second processing result by using the target public key, and the second processing result is obtained by the second party processing a second feature portion of the sample object by using a second parameter portion of the graph neural network;
a gradient ciphertext determination module configured to obtain a first gradient ciphertext corresponding to the first parameter portion through homomorphic operation based on the first encryption result, the second encryption result, and a preset loss function, wherein the loss function is in a form of an orthogonal polynomial approximating an activation function in the graph neural network;
a noise-adding and sending module configured to add a first noise ciphertext, obtained by encrypting first noise, to the first gradient ciphertext to obtain first encrypted noisy data, and to send the first encrypted noisy data to the controller;
a second receiving module configured to receive, from the controller, the first noisy data obtained by decrypting the first encrypted noisy data, and to remove the first noise from the first noisy data to obtain a first gradient plaintext; and
an updating module configured to update the first parameter part according to the first gradient plaintext.
According to a fourth aspect, there is provided a system of a multi-party joint training graph neural network for protecting private data, the system comprising a first party, a second party and a controller, wherein,
the first party is used for processing a first characteristic part of the sample object by utilizing a first parameter part of the graph neural network to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller to obtain a first encryption result;
the second party is used for processing a second characteristic part of the sample object by using a second parameter part of the graph neural network to obtain a second processing result; using the target public key to perform homomorphic encryption on the second processing result to obtain a second encryption result, and sending the second encryption result to the first party;
the first party is further used for obtaining a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function; adding a first noise ciphertext encrypted by first noise to the first gradient ciphertext to obtain first encrypted noisy data; sending the first encrypted noisy data to the controller; the loss function is in the form of an orthogonal polynomial approximating an activation function in the graph neural network;
the controller is used for decrypting the first encrypted noisy data by using a target private key corresponding to the target public key after receiving the first encrypted noisy data to obtain first noisy data; sending the first noisy data to the first party;
the first party is further configured to receive the first noisy data, remove the first noise from it to obtain a first gradient plaintext, and update the first parameter portion using the first gradient plaintext.
According to a fifth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first or second aspect.
According to a sixth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory has stored therein executable code, which when executed by the processor, implements the method of the first or second aspect.
According to the method, device and system provided by the embodiments of this specification, data is exchanged among the multiple parties in homomorphically encrypted form, which avoids leaking any party's data. Moreover, when the first party computes the first gradient ciphertext corresponding to its first parameter part based on the parties' encryption results, the loss function takes the form of an orthogonal polynomial approximating the activation function of the graph neural network, so the nonlinear computation is approximated linearly to support homomorphic operations while preserving the accuracy of the gradient ciphertext to a certain extent.
Drawings
To illustrate the technical solutions of the embodiments more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a framework for implementing one embodiment disclosed herein;
FIG. 2 is a schematic flowchart of a method for multi-party joint training graph neural network for protecting privacy data according to an embodiment;
FIG. 3 is another schematic flowchart of a method for multi-party joint training graph neural network for protecting privacy data according to an embodiment;
FIG. 4 is a schematic block diagram of an apparatus for a multi-party joint training graph neural network for protecting privacy data according to an embodiment;
fig. 5 is a schematic block diagram of a system for a multi-party joint training graph neural network for protecting privacy data according to an embodiment.
Detailed Description
The technical solutions of the embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
The embodiments of this specification disclose a method, device and system for multi-party joint training of a graph neural network that protects private data. The application scenario and inventive concept of the method are introduced first, as follows.
data often exists in an isolated island form, and due to the problems of industry competition, data security, user privacy and the like, data integration faces great resistance. On the premise of ensuring that data is not leaked, the joint training of the machine learning model by using multi-party data becomes a great challenge at present.
Graph neural networks are widely used machine learning models. Compared with the traditional neural network, the graph neural network not only can capture the characteristics of the nodes, but also can depict the incidence relation characteristics among the nodes, so that the graph neural network has an excellent effect in a plurality of machine learning tasks. When the isolated island phenomenon of data is faced, how to synthesize multi-party data and safely carry out multi-party joint training of the neural network of the graph becomes a problem to be solved.
In view of this, an embodiment of the present disclosure provides a method for multi-party joint training of a graph neural network that protects private data, in which the parties use homomorphic encryption to achieve secure collaborative training. Specifically, fig. 1 is a schematic view of an implementation scenario of an embodiment disclosed in this specification. As shown in fig. 1, the parties involved in jointly training a graph neural network while protecting private data may include a first party A, a second party B, and a controller C. Each participant may be implemented by any device, platform, server, or device cluster having computing and processing capabilities. The three parties jointly train a graph neural network while protecting data privacy. In one implementation, the graph neural network may be a graph convolution network based on the GraphSage algorithm.
The first party A stores a first parameter part WA of the graph neural network to be trained and the relational network graph corresponding to the graph neural network. The second party B stores a second parameter part WB of the graph neural network to be trained, and the same relational network graph. The first party A stores part of the features of the n sample objects in the training dataset, referred to as the first feature part XA; the second party B stores the second feature parts XB of the n sample objects. A sample object is a node or an edge in the relational network graph; a node may represent a business object such as a user, a merchant, or an article, and an edge represents an association between business objects. The second party also stores the label values of the n sample objects, which constitute a label vector Y. It can be understood that, depending on the parties participating in the joint training, the nodes may represent different business objects, which may further include residences, companies, and the like.
In one case, the first party A and the second party B store relational network graphs of the same structure, constructed in advance based on the association relationships among the n sample objects stored by the first party A and the second party B, respectively.
For example, in one exemplary scenario, the first party A and the second party B are an electronic payment platform and a banking institution, and the two parties need to jointly train a graph neural network for evaluating users' credit ratings. Here the sample objects are users: the electronic payment platform stores part of each user's features (e.g., payment-related features), and the banking institution stores another part (e.g., credit-record-related features). Both parties store the relational network graph related to the users, constructed based on the association relationships between users and/or between users and other objects (such as merchants and articles) that each party holds. Through its stored partial features of the users and the relational network graph, the electronic payment platform can determine the first feature part XA corresponding to each user, and the banking institution can likewise determine the second feature part XB.
In another exemplary scenario, the first party A and the second party B are an e-commerce platform and an electronic payment platform, and the two parties need to jointly train a graph neural network for evaluating merchants' fraud risk. Here the sample objects are merchants: the e-commerce platform stores merchants' sales data as part of the features, and the electronic payment platform holds merchants' transaction flow data as another part. Both platforms store the relational network graph related to the merchants, constructed based on the association relationships between merchants and other objects (articles and users) that each platform holds. Through its stored partial features of the merchants and the relational network graph, the e-commerce platform can determine the first feature part XA corresponding to each merchant, and the electronic payment platform can determine the second feature part XB.
In other scenario examples, the business object may also be other objects to be evaluated, such as a good, an interaction event (e.g., a transaction event, a login event, a click event, a purchase event), and so forth. Accordingly, the participants may be different business parties that maintain different characteristic portions of the business object. The graph neural network may be a network that performs classification prediction or regression prediction for the corresponding business object.
It is to be understood that the business-object features held by each participant are private data, and during joint training they cannot be exchanged in plaintext, so as to protect their security. Ultimately, the first party A wishes to obtain through training the model parameter part used for processing the first feature part XA, i.e. the first parameter part WA, and the second party wishes to obtain the second parameter part WB used for processing the second feature part XB; together these constitute the graph neural network.
In order to perform joint model training without revealing private data, according to an embodiment of the present specification, as shown in fig. 1, the graph neural network training proceeds as follows. The first party A processes the first feature part XA with the first parameter part WA to obtain a first processing result MA, and homomorphically encrypts MA with the target public key PK of the controller C to obtain a first encryption result [MA]PK. The second party B processes the second feature part XB with the second parameter part WB to obtain a second processing result MB, and homomorphically encrypts MB with PK to obtain a second encryption result [MB]PK. The two parties then send their encryption results to each other ([MA]PK from the first party A, [MB]PK from the second party B). After receiving the other party's encryption result, each party uses both parties' encryption results [M]PK and a preset loss function to obtain its corresponding gradient ciphertext [G]PK through homomorphic operations (the first party A obtains a first gradient ciphertext [GA]PK, and the second party B obtains a second gradient ciphertext [GB]PK). Thereafter, each party adds a noise ciphertext [ε]PK to its gradient ciphertext ([GA]PK for A, [GB]PK for B) and sends the result to the controller C. Here [·] denotes encryption, and the subscript indicates the key used for encryption.
The noise ciphertext [ε1]PK that the first party A adds to its first gradient ciphertext [GA]PK, and the noise ciphertext [ε2]PK that the second party B adds to its second gradient ciphertext [GB]PK, are each obtained by homomorphically encrypting randomly generated noise; the two noises may or may not be the same.
The activation function in a graph neural network is generally nonlinear. To enable the training process to support homomorphic operations, the loss function takes the form of an orthogonal polynomial approximating the activation function; approximating the activation function with an orthogonal polynomial also preserves the accuracy of the gradient ciphertext to a certain extent.
After the controller C receives the noise-masked gradient ciphertexts [G]PK + [ε]PK sent by the first party A and the second party B, it decrypts them with the target private key SK corresponding to the target public key PK to obtain the noisy gradients G + ε of the two parties (GA + ε1 for the first party A and GB + ε2 for the second party B), then feeds GA + ε1 back to the first party A and GB + ε2 back to the second party B.
After the first party A obtains its noisy gradient GA + ε1, it removes the noise ε1 to obtain the corresponding gradient plaintext GA; after the second party B obtains GB + ε2, it removes the noise ε2 to obtain the gradient plaintext GB. The first party A and the second party B then update their respective parameter parts of the graph neural network according to the obtained gradient plaintexts, thereby realizing multi-party joint training of the graph neural network.
Throughout the training process, no party exchanges data in plaintext: all communicated data is either encrypted or noise-masked, so private data is not leaked during joint training and data security is enhanced. In addition, to support homomorphic operations, the loss function takes the form of an orthogonal polynomial approximating the activation function of the graph neural network, turning the nonlinear computation into a linear approximation while preserving the accuracy of the gradient ciphertext to a certain extent. A specific implementation of the above scheme is described below.
FIG. 2 is a flow diagram of a method for multi-party joint training of a graph neural network that protects private data in one embodiment of the present description, the multiple parties including a first party A, a second party B, and a controller C. It is to be understood that before iterative model training (i.e., iteratively training the graph neural network), an initialization phase is performed first. In this phase, the controller C generates an asymmetric key pair for homomorphic encryption, i.e., a target public key PK and a target private key SK, and sends the target public key PK to the first party A and the second party B, each of which stores it. The controller C keeps the target private key SK private.
In addition, the first party A and the second party B also initialize their respectively stored parameter portions of the graph neural network. Specifically, the first party A initializes a first parameter portion WA for processing the first feature portion of a sample object, and the second party B initializes a second parameter portion WB for processing the second feature portion of the sample object. In one implementation, the first parameter portion WA and the second parameter portion WB may be initialized by random generation.
Then, the iterative training process of the model shown in fig. 2 is entered. The following describes a method flow of a multi-party joint training graph neural network for protecting private data from the perspective of a first party a. The first party a may be implemented by any means, device, platform, cluster of devices, etc. having computing, processing capabilities. It will be appreciated that the second party B and the controller C may also be implemented by any means, device, platform, cluster of devices, etc. having computing, processing capabilities.
Accordingly, the method is performed by a first party a, comprising the following steps S210-S270:
s210: processing the first feature portion XA of the sample object using the first parameter portion WA of the graph neural network to obtain a first processing result MA. The first party A stores the first parameter portion of the graph neural network, and also stores a relationship network graph corresponding to the graph neural network; the relationship network graph may include a plurality of nodes and edges, where a node may represent one of the following business objects: users, merchants, articles; and edges may represent associations between the business objects. The first party A also stores a part of the features of each of the n sample objects in the training data set for training the graph neural network, referred to as the first feature portion XA. The first party A processes the first feature portion XA of the sample object using the first parameter portion WA of the graph neural network to obtain the first processing result MA. In one implementation, step S210 may specifically include: multiplying the first parameter portion WA by the first feature portion XA to obtain the first processing result MA.
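As a minimal sketch, assuming the processing in step S210 is a plain matrix-vector product of the parameter portion and the feature portion (the names and values below are illustrative only):

```python
# Minimal sketch of step S210: MA = WA * XA as a matrix-vector product.
# The shapes and values of wa, xa are illustrative assumptions.

def matvec(w, x):
    """Multiply a parameter matrix w (list of rows) by a feature vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

# First party A: a 2x3 parameter portion WA applied to a 3-dim feature part XA.
wa = [[0.5, -1.0, 2.0],
      [1.0,  0.0, 0.5]]
xa = [1.0, 2.0, 3.0]

ma = matvec(wa, xa)  # first processing result MA
```

The second party B would compute MB = WB * XB over its own feature dimensions in exactly the same way.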
In one case, the relationship network graph stored by the first party a and the relationship network graph stored by the second party B are the same in structure, and the relationship network graph is constructed in advance based on the association relationship with the sample object stored by each of the first party a and the second party B.
Correspondingly, in order to avoid leakage of private data in the process of multi-party joint training of the graph neural network while keeping the training process effective, the first party A needs to homomorphically encrypt the first processing result and then send the homomorphically encrypted first processing result to the second party B for data interaction. Accordingly, after obtaining the first processing result, the first party A performs step S220: homomorphically encrypting the first processing result MA with the target public key PK of the controller C to obtain a first encryption result [MA]PK.
Thereafter, the first party A sends the first encryption result [MA]PK to the second party B. On the second party B side, in the process of model iterative training, the second party B processes the second feature portion XB of the sample object using the second parameter portion WB of the graph neural network to obtain a second processing result MB. Thereafter, in order to avoid leakage of the private data (the second processing result MB), the second party B homomorphically encrypts the second processing result MB using the target public key PK of the controller C to obtain a second encryption result [MB]PK. On the one hand, the second party B sends the second encryption result [MB]PK to the first party A. After that, the first party A performs step S230: receiving the second encryption result [MB]PK from the second party B.
On the other hand, after obtaining the first encryption result [MA]PK and the second encryption result [MB]PK, the second party B obtains, through homomorphic operation, a second gradient ciphertext [GB]PK corresponding to the second parameter portion based on the first encryption result [MA]PK, the second encryption result [MB]PK and the preset loss function; adds, to the second gradient ciphertext [GB]PK, a second noise ciphertext [ε2]PK obtained by encrypting a second noise ε2, to obtain second encrypted noisy data [GB]PK+[ε2]PK; and sends the second encrypted noisy data [GB]PK+[ε2]PK to the controller C. The process of the second party B obtaining the second processing result MB may be: multiplying the second parameter portion WB by the second feature portion XB to obtain the second processing result MB. The process of the second party B obtaining the second gradient ciphertext [GB]PK may be: obtaining a loss value ciphertext through homomorphic operation based on the first encryption result [MA]PK and the second encryption result [MB]PK; and determining the second gradient ciphertext [GB]PK using the loss value ciphertext and the second parameter portion WB. For the specific process, reference may be made to the following process of the first party A determining the first gradient ciphertext [GA]PK.
After receiving the second encryption result [MB]PK, the first party A performs step S240: obtaining, through homomorphic operation, a first gradient ciphertext [GA]PK corresponding to the first parameter portion WA based on the first encryption result [MA]PK, the second encryption result [MB]PK and the preset loss function. It is understood that the activation function in a graph neural network is generally a nonlinear function, while homomorphic operations are generally linear operations. In view of this, in order to support homomorphic operation and ensure the accuracy of the gradient obtained by the operation, an orthogonal polynomial is used to approximate the activation function in the graph neural network; correspondingly, the loss function takes the form of an orthogonal polynomial approximating the activation function in the graph neural network. The activation functions approximated by orthogonal polynomials may include the hidden-layer activation function (the ReLU function) and the output-layer activation function (the softmax function) of the graph neural network.
In one possible embodiment, step S240 may include the following steps 11-12:
step 11: based on the first encryption result MA]PKAnd a second encryption result [ MB ]]PKAnd a predetermined loss function, determining a loss value ciphertext [ L ] by homomorphic operation]PK
In one implementation, the orthogonal polynomial is a second-order orthogonal polynomial, which can be represented by the following formula (1):

p(x)=c2*x^2+c1*x+c0;(1)

where the coefficients c2, c1 and c0 are determined by a range value a (the exact coefficient expressions are given in the original formula image, which is not reproduced here), a is determined in advance based on the training data (including the first feature portions and the second feature portions of the sample objects) in the training data set, and x represents the independent variable.
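The exact coefficients of the second-order polynomial are given in an image not reproduced above. As an illustration of how such a polynomial can depend on the range value a, the sketch below uses the degree-2 Legendre least-squares fit of the ReLU activation on [-a, a], p(x) = 15x^2/(32a) + x/2 + 3a/32 — an assumed construction for illustration, not necessarily the patent's exact formula:

```python
# Illustrative sketch only: one standard orthogonal-polynomial (Legendre)
# degree-2 least-squares fit of ReLU over [-a, a]:
#     p(x) = 15*x**2/(32*a) + x/2 + 3*a/32
# Both the choice of ReLU and these coefficients are assumptions.

def relu(x):
    return max(x, 0.0)

def p(x, a):
    """Degree-2 orthogonal-polynomial approximation of ReLU on [-a, a]."""
    return 15.0 * x * x / (32.0 * a) + x / 2.0 + 3.0 * a / 32.0

def rms_error(a, steps=1000):
    """Root-mean-square gap between p and ReLU sampled over [-a, a]."""
    pts = [-a + 2.0 * a * i / steps for i in range(steps + 1)]
    return (sum((p(x, a) - relu(x)) ** 2 for x in pts) / len(pts)) ** 0.5

a = 4.0
err = rms_error(a)  # stays small relative to the range value a
```

The key point is that p is quadratic, so both p and its derivative can be evaluated with additions and multiplications alone, which is what makes the homomorphic evaluation below possible.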
In one implementation, the penalty function may be defined as an orthogonal polynomial with the sum of the first processing result and the second processing result as a variable. Accordingly, the loss function can be expressed by the following formula (2):
L(WA,WB)=p(WA*XA+WB*XB);(2)
where WA × XA indicates the first processing result, and WB × XB indicates the second processing result.
Based on the above formula (1) and formula (2), the loss value plaintext L corresponding to the loss value ciphertext [L]PK can be represented by the following formula (3):

L(WA,WB)=c2*(WA*XA+WB*XB)^2+c1*(WA*XA+WB*XB)+c0;(3)

where c2, c1 and c0 are the coefficients of the second-order polynomial p.
After applying the homomorphic encryption operator, the loss value ciphertext [L]PK obtained based on the first encryption result, the second encryption result and the preset loss function can be expressed by the following formula (4):

[L(WA,WB)]PK=[p(WA*XA+WB*XB)]PK;(4)

Accordingly, based on the above formula (3), formula (4) can be rewritten as formula (5):

[L(WA,WB)]PK=[c2*(WA*XA)^2+c1*(WA*XA)+c0+c2*(WB*XB)^2+c1*(WB*XB)+2*c2*(WA*XA)*(WB*XB)]PK;(5)

where c2, c1 and c0 are the coefficients of the second-order polynomial p. Further, the above formula (5) can be rewritten as formula (6):

[L(WA,WB)]PK=[c2*(WA*XA)^2+c1*(WA*XA)+c0]PK⊕[c2*(WB*XB)^2+c1*(WB*XB)]PK⊕[2*c2*(WA*XA)*(WB*XB)]PK;(6)

where ⊕ denotes the homomorphic addition operation. As can be appreciated, the first encryption result [MA]PK (i.e., [WA*XA]PK) and the second encryption result [MB]PK (i.e., [WB*XB]PK) are both available to the first party A. In view of this, the above formula (6) can be abbreviated as the following formula (7):

[L(WA,WB)]PK=[LA]PK+[LB]PK+[LAB]PK;(7)

wherein,

LA=c2*(WA*XA)^2+c1*(WA*XA)+c0;
LB=c2*(WB*XB)^2+c1*(WB*XB);
LAB=2*c2*(WA*XA)*(WB*XB).
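The split of the loss ciphertext in formula (7) rests on a plaintext identity: for any second-order polynomial p, p(u+v) separates into a u-only part, a v-only part and a cross term 2*c2*u*v. A minimal numeric check of that identity, with assumed coefficient values:

```python
# Numeric check of the plaintext identity behind formula (7): for
# p(x) = c2*x**2 + c1*x + c0, the loss L = p(u + v) splits into a u-only
# part LA, a v-only part LB and a cross term LAB = 2*c2*u*v.
# The coefficient values are arbitrary illustrative assumptions.

c2, c1, c0 = 0.3, 0.5, 0.1

def p(x):
    return c2 * x * x + c1 * x + c0

def split_loss(u, v):
    """Return (LA, LB, LAB) such that p(u + v) == LA + LB + LAB."""
    la = c2 * u * u + c1 * u + c0
    lb = c2 * v * v + c1 * v
    lab = 2.0 * c2 * u * v
    return la, lb, lab

u, v = 1.7, -0.4              # stand-ins for WA*XA and WB*XB
la, lb, lab = split_loss(u, v)
total = la + lb + lab
```

Because each of the three parts can be evaluated under encryption from [MA]PK and [MB]PK using homomorphic operations alone, the first party A never needs the plaintext MB.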
the homomorphism of a homomorphic encryption algorithm is utilized, namely, after the plaintext is operated, encryption is carried out again, and corresponding operation is carried out on the ciphertext after encryption, and the result is equivalent. For example, encrypting v1 and v2 with the same public key pk yields Epk(v1) and Epk(v2), if:
Figure BDA0003336701170000115
then it is assumed that the homomorphic encryption algorithm satisfies the additive homomorphism, where
Figure BDA0003336701170000116
The corresponding homomorphic addition operation is performed. In the practice of the method, the raw material,
Figure BDA0003336701170000117
the operations may correspond to conventional addition, multiplication, etc.
Another example is: encryption of v1 and v2 with the same public key pk yields Epk(v1) and Epk(v2), if:
Figure BDA0003336701170000118
then it is assumed that the homomorphic encryption algorithm satisfies the multiplicative homomorphism, where
Figure BDA0003336701170000119
Is the corresponding homomorphic multiply operation.
Using the homomorphism, the homomorphic addition operation and the homomorphic multiplication operation in formula (7) are performed to obtain the loss value ciphertext [L]PK.
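The additive homomorphism described above can be demonstrated with a toy Paillier cryptosystem. The tiny primes below are for illustration only and offer no security; a real deployment would use keys of 2048 bits or more:

```python
import math
import random

# Toy Paillier cryptosystem (insecure key size, illustration only).
p_prime, q_prime = 47, 59
n = p_prime * q_prime                       # public modulus
n2 = n * n
lam = ((p_prime - 1) * (q_prime - 1)
       // math.gcd(p_prime - 1, q_prime - 1))  # lcm(p-1, q-1)
mu = pow(lam, -1, n)                        # valid because g = n + 1

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n) * mu % n

def he_add(c1, c2):
    # Homomorphic addition: multiply ciphertexts modulo n^2.
    return (c1 * c2) % n2

def he_scale(c, k):
    # Homomorphic multiplication by a plaintext scalar: exponentiation.
    return pow(c, k, n2)

v1, v2 = 123, 456
c_sum = he_add(encrypt(v1), encrypt(v2))     # decrypts to v1 + v2
c_scaled = he_scale(encrypt(v1), 7)          # decrypts to 7 * v1
```

Ciphertext addition and plaintext-scalar multiplication are exactly the operations the parties need in order to evaluate the quadratic loss and gradients, and to add the noise ciphertext [ε1]PK to [GA]PK.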
Step 12: determining the first gradient ciphertext [GA]PK based on the loss value ciphertext [L]PK and the first parameter portion WA. In the case of using maximum likelihood probability and stochastic gradient descent, the specific process of determining the first gradient ciphertext [GA]PK may be: determining the derivative of the loss value ciphertext [L]PK with respect to the first parameter portion WA as the first gradient ciphertext [GA]PK.
It is understood that, on the first party a side, the resulting gradient (plaintext) can be represented by the following equation (8):
Figure BDA00033367011700001110
on the second party B side, the resulting gradient (plaintext) can be represented by the following equation (9):
Figure BDA0003336701170000121
after applying homomorphic cryptographic operators, i.e. first gradient cipher text [ GA ]]PKCan be expressed by the following formula (10):
Figure BDA0003336701170000122
second gradient cipher text [ GB]PKCan be represented by the following formula (11):
Figure BDA0003336701170000123
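At the plaintext level, the gradient with respect to WA is the chain rule applied to L = p(WA*XA + WB*XB). A quick finite-difference check of that derivative, with assumed scalar values and polynomial coefficients:

```python
# Plaintext-level check of the gradient GA = (2*c2*M + c1) * XA,
# where M = WA*XA + WB*XB, against a central finite difference.
# All scalar values and coefficients are illustrative assumptions.

c2, c1, c0 = 0.3, 0.5, 0.1

def loss(wa, wb, xa, xb):
    m = wa * xa + wb * xb
    return c2 * m * m + c1 * m + c0

def grad_wa(wa, wb, xa, xb):
    """Analytic derivative of the quadratic loss with respect to WA."""
    m = wa * xa + wb * xb
    return (2.0 * c2 * m + c1) * xa

wa, wb, xa, xb = 0.7, -0.2, 1.5, 2.5
h = 1e-6
fd = (loss(wa + h, wb, xa, xb) - loss(wa - h, wb, xa, xb)) / (2 * h)
analytic = grad_wa(wa, wb, xa, xb)
```

Since the analytic gradient is linear in M = WA*XA + WB*XB, it can be evaluated under an additively homomorphic scheme from [MA]PK and [MB]PK alone.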
using the homomorphism, in the above equations (10) and (11), a homomorphic addition operation and a homomorphic multiplication operation are performed to obtain a first gradient ciphertext [ GA]PKAnd a second gradient cipher text [ GB]PK
S250: adding, to the first gradient ciphertext [GA]PK, a first noise ciphertext [ε1]PK obtained by encrypting a first noise ε1, to obtain first encrypted noisy data [GA]PK+[ε1]PK; and sending the first encrypted noisy data [GA]PK+[ε1]PK to the controller C.
In this step, in order to avoid leakage of the first gradient on the controller C side, the first party A adds noise to the first gradient ciphertext [GA]PK before sending it to the controller C, so that even after decryption the controller C cannot obtain the real gradient, thereby protecting the gradient on the first party A side. In one implementation, after obtaining the first gradient ciphertext [GA]PK, the first party A generates a first noise ε1, homomorphically encrypts the first noise ε1 using the target public key to obtain a first noise ciphertext [ε1]PK, and adds the first noise ciphertext to the first gradient ciphertext, i.e., obtains first encrypted noisy data [GA]PK+[ε1]PK based on the first noise ciphertext [ε1]PK and the first gradient ciphertext [GA]PK, and then sends the first encrypted noisy data [GA]PK+[ε1]PK to the controller C.
After receiving the first encrypted noisy data [GA]PK+[ε1]PK, the controller C decrypts the first encrypted noisy data [GA]PK+[ε1]PK using the target private key to obtain first noisy data GA+ε1, and sends the first noisy data GA+ε1 to the first party A. Accordingly, the first party A performs the subsequent step S260.
S260: receiving, from the controller C, the first noisy data GA+ε1 obtained by decrypting the first encrypted noisy data [GA]PK+[ε1]PK, and removing the first noise ε1 from the first noisy data GA+ε1 to obtain a first gradient plaintext GA.
S270: the first parameter portion WA is updated based on the first gradient plain text GA.
After the first party A removes the first noise ε1 from the first noisy data GA+ε1 to obtain the first gradient plaintext GA, it determines an updated value of the first parameter portion WA using the first gradient plaintext GA and the current value of the first parameter portion WA, and updates the current value of the first parameter portion WA to the updated value, thereby updating the first parameter portion WA. The process of determining the updated value of the first parameter portion WA aims at minimizing the loss value plaintext.
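The update in step S270 can be sketched as one plain gradient-descent step; the learning-rate name and value below are assumptions not specified in the source:

```python
# Minimal sketch of step S270: update WA with the recovered gradient
# plaintext GA via one gradient-descent step.  The learning rate `lr`
# is an assumed hyperparameter.

def update_params(wa, ga, lr=0.1):
    """Return the updated parameter portion WA' = WA - lr * GA."""
    return [w - lr * g for w, g in zip(wa, ga)]

wa = [0.5, -1.0, 2.0]   # current first parameter portion WA
ga = [0.2, -0.4, 1.0]   # first gradient plaintext GA

wa_new = update_params(wa, ga)
```

The second party B applies the same rule to WB with its own gradient plaintext GB.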
Similarly, after receiving the second encrypted noisy data [GB]PK+[ε2]PK sent by the second party B, the controller C decrypts the second encrypted noisy data [GB]PK+[ε2]PK using the target private key to obtain second noisy data GB+ε2, and sends the second noisy data GB+ε2 to the second party B. After receiving the second noisy data GB+ε2, the second party B removes the second noise ε2 from the second noisy data GB+ε2 to obtain a second gradient plaintext GB, and updates the second parameter portion WB according to the second gradient plaintext GB.
It is to be understood that the first noise and the second noise are both randomly generated noise, which may be the same or different.
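The noise add/remove round trip can be sketched with the homomorphic layer abstracted away, leaving only the masking arithmetic; all names and values are illustrative:

```python
import random

# Sketch of the gradient-masking round trip, with encryption abstracted
# away: a party masks its gradient with fresh random noise before the
# controller sees it, so the controller's decrypted view GA + eps1
# does not reveal the true gradient GA.

def mask(gradient):
    """Party side: add fresh random noise to each gradient component."""
    noise = [random.uniform(-100.0, 100.0) for _ in gradient]
    return [g + e for g, e in zip(gradient, noise)], noise

def unmask(noisy, noise):
    """Party side: remove the stored noise to recover the true gradient."""
    return [m - e for m, e in zip(noisy, noise)]

ga = [0.25, -1.5, 3.0]        # true first gradient plaintext GA
noisy_ga, eps1 = mask(ga)     # what the controller sees after decryption
recovered = unmask(noisy_ga, eps1)
```

Only the party that generated the noise can remove it, which is why the first noise and second noise stay local to the first party and the second party respectively.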
Steps S210 to S270 constitute one iteration of model training. In order to train a better graph neural network, the above process may be iterated multiple times; that is, after the first parameter portion WA is updated in step S270, the flow returns to step S210.
The stopping condition of the model iterative training process may include: the number of iterations reaching a preset count threshold, the training duration reaching a preset duration, the loss value being smaller than a set loss threshold, and the like.
In this embodiment, throughout the training process of the graph neural network, no party exchanges data in plaintext; all communicated data are either encrypted data or noise-added data, so that private data are not leaked during the joint training and data security is enhanced. In addition, in order to support homomorphic operation, the loss function takes the form of an orthogonal polynomial approximating the activation function in the graph neural network; the originally nonlinear computation is thereby approximated by a linear one, and approximating the activation function by an orthogonal polynomial ensures the accuracy of the gradient ciphertext to a certain extent.
In this embodiment, the graph neural network is jointly trained using feature portions of different dimensions of the sample objects stored by the first party and the second party, thereby realizing longitudinal (vertical) federated training of the graph neural network. In the case that the graph neural network is a graph convolution network based on the GraphSAGE algorithm, the algorithm on which the method provided in this embodiment depends may be referred to as the FedGraphSAGE algorithm.
Referring back to the execution process of steps S210 to S270, the above embodiment is described by taking one sample object as an example. In another embodiment, steps S210 to S260 may be performed on a batch of sample objects, i.e., a plurality of sample objects, to obtain a first gradient plaintext corresponding to each sample object; an average gradient plaintext is then determined based on the first gradient plaintexts corresponding to the plurality of sample objects, and the first parameter portion is adjusted based on the average gradient plaintext. In this way, the number of times the first parameter portion is adjusted can be reduced, making the training process easier to carry out.
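The mini-batch variant described above reduces to a component-wise average of per-sample gradients before a single parameter update; a minimal sketch with illustrative values:

```python
# Sketch of the mini-batch variant: average the per-sample gradient
# plaintexts before one parameter update.  Values are illustrative.

def average_gradients(per_sample_grads):
    """Component-wise mean over a batch of per-sample gradient vectors."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] for g in per_sample_grads) / n for i in range(dim)]

batch = [[1.0, 2.0],
         [3.0, 4.0],
         [5.0, 6.0]]
avg = average_gradients(batch)  # one averaged gradient for the whole batch
```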
In one implementation, the first party A initially stores a part of the original features of each sample object, referred to as a first original feature portion. The first original feature portion includes, but is not limited to, part of the attribute information of the sample object and association relationship information with other sample objects neighboring the sample object; the other sample objects neighboring the sample object may be referred to as neighbor objects. Other sample objects connected to the sample object through one or more edges in the relationship network graph are neighbor objects of the sample object, where other sample objects connected to the sample object through one edge in the relationship network graph are one-hop neighbor objects of the sample object; other sample objects connected to the sample object through two edges (with one other sample object in between) are two-hop neighbor objects of the sample object, and so on.
The first feature portion XA is the feature vector embedding of the sample object. The first feature XA may be obtained by aggregating the original features of the sample object itself and the original features of its neighboring objects. Specifically, the first party a may first determine, for each sample object, based on the relationship network graph, a neighbor object that participates in calculation and corresponds to the sample object, and then aggregate the first original feature portion of the neighbor object of the sample object and the first original feature portion of the sample object to obtain the first feature portion XA of the sample object. The second party B determines the second characteristic portion XB of each sample object by using the stored second original characteristics of each sample object, which may refer to the process of the first party a determining the first characteristic portion XA of each sample object, and is not described herein again.
It will be appreciated that the first party a and the second party B store raw features of different dimensions of the sample object. The first party a and the second party B may each determine a respective feature portion based on their stored raw features of the sample object. For example, the storing of the first original feature portion of the sample object S by the first party a includes: original features S1, original features S2 and original features S3, the first original feature part of the neighboring object Si corresponding to the sample object S includes: raw feature Si1, raw feature Si2, and raw feature Si 3. The second party B stores the second original feature portion of the sample object S including: original features S4 and original features S5, the second original feature part of the neighboring object Si corresponding to the sample object S includes: primitive features Si4 and primitive features Si 5. Where Si denotes the ith neighbor object of the sample object S.
Accordingly, the first party A aggregates the original features S1, S2 and S3 of the sample object S with the original features Si1, Si2 and Si3 of the neighboring objects Si to obtain the first feature portion XA of the sample object S. Similarly, the second party B aggregates the original features S4 and S5 of the sample object S with the original features Si4 and Si5 of the neighboring objects Si to obtain the second feature portion XB of the sample object S.
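The aggregation of a sample object's own original features with those of its neighbors can be sketched as a mean aggregation (one common choice; the source does not fix the aggregator). The graph and feature values below are illustrative:

```python
# Sketch of first-party feature aggregation: combine a sample object's own
# raw features with those of its neighbors.  Mean aggregation is an assumed
# choice; the adjacency lists and feature values are illustrative.

graph = {"S": ["S1", "S2"], "S1": ["S"], "S2": ["S"]}   # relationship graph
raw = {                                                 # first original features
    "S":  [1.0, 2.0, 3.0],
    "S1": [2.0, 4.0, 6.0],
    "S2": [3.0, 6.0, 9.0],
}

def aggregate(node):
    """Mean of the node's own raw features and its neighbors' raw features."""
    rows = [raw[node]] + [raw[nb] for nb in graph[node]]
    return [sum(col) / len(rows) for col in zip(*rows)]

xa_s = aggregate("S")   # first feature portion XA of sample object S
```

The second party B would run the same aggregation over its own stored feature dimensions to obtain XB.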
In one possible embodiment, the method further comprises the following step 21: sending the loss value ciphertext [L]PK to the controller C;

and the step S260 includes: receiving the first noisy data GA+ε1 sent by the controller C in the case that the controller C determines that the loss value plaintext L corresponding to the loss value ciphertext [L]PK is not lower than a preset loss threshold.
In this embodiment, after determining the loss value ciphertext [L]PK, the first party A sends the loss value ciphertext [L]PK to the controller C. The controller C decrypts the loss value ciphertext [L]PK using the target private key SK to obtain the loss value plaintext L. Furthermore, the controller C judges whether the loss value plaintext L is lower than a preset loss threshold; in the case that the loss value plaintext L is not lower than the preset loss threshold, it may be determined that the graph neural network has not reached the convergence state, and accordingly the controller C sends the decrypted first noisy data GA+ε1 to the first party A. After receiving the first noisy data GA+ε1, the first party A removes the first noise ε1 from the first noisy data GA+ε1 to obtain the first gradient plaintext GA, and updates the first parameter portion WA based on the first gradient plaintext GA.
Likewise, in the case that the controller C judges that the loss value plaintext L is not lower than the preset loss threshold, the second noisy data obtained by decrypting the second encrypted noisy data is sent to the second party B. After receiving the second noisy data, the second party B removes the second noise from the second noisy data to obtain the second gradient plaintext, and updates the second parameter portion WB based on the second gradient plaintext.
In another implementation, the controller C may determine that the graph neural network reaches the convergence state when the loss value plaintext L is determined to be lower than the preset loss threshold, and thus, may determine that the graph neural network has been trained, and correspondingly, the controller C may send information indicating that the model training is completed to the first party a and the second party B, so that the first party a and the second party B determine that the graph neural network has been jointly trained.
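The controller's decision logic described above amounts to a simple threshold check on the decrypted loss; a minimal sketch, with assumed names and threshold value:

```python
# Sketch of the controller's convergence check: return the noisy gradient
# when the loss is still at or above the threshold, otherwise signal that
# training is complete.  Names and the threshold value are illustrative.

def controller_step(loss_plain, noisy_gradient, loss_threshold=0.01):
    """Decide whether to continue training or report convergence."""
    if loss_plain >= loss_threshold:
        return {"status": "continue", "noisy_gradient": noisy_gradient}
    return {"status": "trained"}

r1 = controller_step(0.5, [1.0, 2.0])     # loss still high: keep training
r2 = controller_step(0.001, [1.0, 2.0])   # loss low enough: converged
```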
Corresponding to the above method embodiment, this specification embodiment further provides a method for protecting a neural network of a multi-party joint training graph of private data, where, as shown in fig. 3, the method may include:
the first party 310 processes the first characteristic part of the sample object by using the first parameter part of the graph neural network to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller to obtain a first encryption result;
the second party 320 processes the second characteristic part of the sample object by using the second parameter part of the graph neural network to obtain a second processing result; using the target public key to perform homomorphic encryption on the second processing result to obtain a second encryption result, and sending the second encryption result to the first party 310;
the first party 310 obtains a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function; adding a first noise ciphertext encrypted by first noise to the first gradient ciphertext to obtain first encrypted noisy data; sending the first encrypted noisy data to the controller 330; wherein the loss function is in the form of an orthogonal polynomial approximating an activation function in the graph neural network;
after receiving the first encrypted noisy data, the controller 330 decrypts the first encrypted noisy data by using a target private key corresponding to the target public key to obtain first noisy data, and sends the first noisy data to the first party 310;
the first party 310 receives the first noise-added data, removes the first noise from the first noise-added data to obtain a first gradient plaintext, and updates the first parameter portion by using the first gradient plaintext.
In one implementation, the sample objects are nodes or edges in a relational network graph corresponding to the graph neural network; the node represents one of the following business objects: users, merchants, articles; the edges represent associations between the business objects.
In an implementation manner, the first party 310 is specifically configured to multiply the first parameter portion and the first characteristic portion to obtain the first processing result in the process of obtaining the first processing result.
In an implementation manner, before the obtaining of the first processing result, the first party 310 is further configured to aggregate the first original feature portion of a neighboring object of the sample object and the first original feature portion corresponding to the sample object to obtain the first feature portion;
in an implementation manner, the first party 310 is specifically configured to determine a loss value ciphertext through a homomorphic operation based on the first encryption result, the second encryption result, and a preset loss function in the process of obtaining the first gradient ciphertext corresponding to the first parameter portion through the homomorphic operation based on the first encryption result, the second encryption result, and the preset loss function;
determining the first gradient cipher text based on the loss value cipher text and the first parameter portion.
In one possible implementation, the first party 310 is further configured to send the loss value ciphertext to the controller;
correspondingly, the controller 330 is further configured to decrypt the loss value ciphertext by using the target private key to obtain a loss value plaintext; judge whether the loss value plaintext is not lower than a preset loss threshold; and send the first noisy data to the first party 310 in the case that the loss value plaintext is not lower than the preset loss threshold.
In one embodiment, the orthogonal polynomial is a second order orthogonal polynomial.
In one embodiment, the loss function is defined as an orthogonal polynomial with the sum of the first and second processing results as a variable.
In one possible implementation, the first party 310 is further configured to receive the target public key from the controller and store the target public key.
In the embodiment, in the training process of the whole graph neural network, each party does not exchange data in the plaintext, and all communication data are encrypted data or mixed data, so that the privacy data are not leaked in the joint training process, and the data security is enhanced. In addition, in order to support homomorphic operation, the loss function adopts the form of orthogonal polynomial approximating the activation function in the neural network of the graph, the obtained result is approximated to be linear from nonlinear, and the accuracy of the gradient ciphertext is ensured to a certain extent by approximating the activation function by the orthogonal polynomial.
The foregoing describes certain embodiments of the present specification, and other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily have to be in the particular order shown or in sequential order to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
In accordance with the above method embodiments, the present specification provides an apparatus 400 for multi-party joint training of a graph neural network for protecting private data, a schematic block diagram of which is shown in fig. 4, wherein the multiple parties include a first party, a second party and a controller, and the apparatus is applied to the first party. The apparatus includes:
a first processing module 410 configured to process a first characteristic portion of the sample object using a first parameter portion of the graph neural network to obtain a first processing result;
a homomorphic encryption module 420 configured to homomorphic encrypt the first processing result by using the target public key of the controller to obtain a first encrypted result;
a first receiving module 430, configured to receive a second encryption result from the second party, where the second encryption result is obtained by the second party performing homomorphic encryption on a second processing result by using the target public key, and the second processing result is obtained by the second party processing a second feature portion of the sample object by using a second parameter portion of the graph neural network;
a gradient ciphertext determining module 440, configured to obtain, based on the first encryption result, the second encryption result, and a preset loss function, a first gradient ciphertext corresponding to the first parameter portion through homomorphic operation, where the loss function is in a form of an orthogonal polynomial approximating an activation function in the graph neural network;
an adding and sending module 450, configured to add a first noise ciphertext encrypted with first noise to the first gradient ciphertext to obtain first encrypted noisy data; sending the first encrypted noisy data to the controller;
a second receiving module 460, configured to receive the first noisy data after decrypting the first encrypted noisy data from the controller, and remove the first noise from the first noisy data to obtain a first gradient plaintext;
an updating module 470 configured to update the first parameter portion according to the first gradient plaintext.
In one implementation, the sample objects are nodes or edges in a relational network graph corresponding to the graph neural network; the node represents one of the following business objects: users, merchants, articles; the edges represent associations between the business objects.
In an implementation manner, the first processing module is specifically configured to multiply the first parameter portion and the first characteristic portion to obtain the first processing result.
In one embodiment, the method further comprises:
an aggregation module (not shown in the figure), configured to, before the obtaining of the first processing result, aggregate the first original feature portion of a neighboring object of the sample object and the first original feature portion corresponding to the sample object to obtain the first feature portion;
in an implementation manner, the gradient ciphertext determining module 440 is specifically configured to determine a loss value ciphertext through a homomorphic operation based on the first encryption result, the second encryption result, and a preset loss function;
determining the first gradient cipher text based on the loss value cipher text and the first parameter portion.
In one embodiment, the apparatus further comprises:
a transmission module (not shown in the figure) configured to transmit the loss value ciphertext to the controller;
the second receiving module 460 is specifically configured to receive the first noisy data sent by the controller when it is determined that the loss value plaintext corresponding to the loss value ciphertext is not lower than a preset loss threshold.
In one embodiment, the orthogonal polynomial is a second order orthogonal polynomial.
In one embodiment, the loss function is defined as an orthogonal polynomial with the sum of the first and second processing results as a variable.
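The specification does not give the polynomial explicitly. As a sketch of the idea, the following fits a second-order polynomial to the sigmoid activation by least squares on a grid symmetric about zero, where the odd and even coefficients decouple just as they would in an orthogonal-polynomial expansion. The choice of sigmoid, the interval [-2, 2] and the plain least-squares fit are assumptions for illustration. Note that, because the sigmoid is point-symmetric about (0, 0.5), the quadratic coefficient comes out numerically zero:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_degree2(xs, ys):
    """Least-squares fit of c0 + c1*x + c2*x**2 on a grid symmetric about 0.
    On such a grid the odd power sums vanish, so the normal equations split
    into an odd part (c1 alone) and an even 2x2 part (c0, c2)."""
    n = len(xs)
    s2 = sum(x * x for x in xs)
    s4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))
    c1 = sxy / s2                   # odd part
    det = n * s4 - s2 * s2          # even part: solve the 2x2 system
    c0 = (sy * s4 - s2 * sx2y) / det
    c2 = (n * sx2y - s2 * sy) / det
    return c0, c1, c2

xs = [(i - 200) / 100 for i in range(401)]  # symmetric grid on [-2, 2]
ys = [sigmoid(x) for x in xs]
c0, c1, c2 = fit_degree2(xs, ys)
max_err = max(abs(sigmoid(x) - (c0 + c1 * x + c2 * x * x)) for x in xs)
```

The payoff is that a polynomial of the model outputs can be evaluated with only additions and multiplications, which is exactly what an (additively or fully) homomorphic scheme supports, whereas the exponential inside the sigmoid cannot be computed under encryption directly.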
In one embodiment, the apparatus further comprises:
a receiving storage module (not shown in the figure) configured to receive the target public key from the controller and store the target public key.
In correspondence with the above method embodiments, the present specification further provides a system 500 for multi-party joint training of a graph neural network that protects private data. The system includes a first party 510, a second party 520 and a controller 530; a schematic block diagram of the system is shown in fig. 5, wherein:
the first party 510 is configured to process a first feature portion of the sample object using a first parameter portion of the graph neural network to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller to obtain a first encryption result;
the second party 520 is configured to process a second feature portion of the sample object by using a second parameter portion of the graph neural network to obtain a second processing result; homomorphic encrypting the second processing result by using the target public key to obtain a second encryption result, and sending the second encryption result to the first party 510;
the first party 510 is further configured to obtain a first gradient ciphertext corresponding to the first parameter portion through homomorphic operation based on the first encryption result, the second encryption result, and a preset loss function; adding a first noise ciphertext encrypted by first noise to the first gradient ciphertext to obtain first encrypted noisy data; sending the first encrypted noisy data to the controller 530; the loss function is in the form of an orthogonal polynomial approximating an activation function in the graph neural network;
the controller 530 is configured to decrypt the first encrypted noisy data by using a target private key corresponding to the target public key after receiving the first encrypted noisy data, so as to obtain first noisy data; sending the first noisy data to the first party 510;
the first party 510 is further configured to receive the first noisy data, remove the first noise from the first noisy data to obtain a first gradient plaintext, and update the first parameter portion with the first gradient plaintext.
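The blinding exchange between the first party 510 and the controller 530 can be sketched end to end with a toy Paillier scheme, which is additively homomorphic. The specification does not prescribe a particular homomorphic scheme; Paillier, the tiny fixed primes and the integer encoding of a single gradient entry below are illustrative assumptions, and the parameters are cryptographically insecure as written (requires Python 3.9+):

```python
import math
import random

# Toy Paillier (g = n + 1 variant). Fixed small primes: insecure, illustration only.
P, Q = 104723, 104729

def keygen():
    n = P * Q
    lam = math.lcm(P - 1, Q - 1)
    mu = pow(lam, -1, n)        # valid because g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(n, sk, c):
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

def he_add(n, c1, c2):
    # Product of ciphertexts decrypts to the sum of the plaintexts.
    return c1 * c2 % (n * n)

# Protocol sketch: the first party masks its gradient ciphertext with encrypted
# noise, the controller decrypts only the masked value, and the first party
# removes the noise locally to recover the gradient plaintext.
n, sk = keygen()                              # controller's key pair
gradient = 1234                               # stand-in integer-encoded gradient entry
noise = random.randrange(1, 10 ** 6)          # first noise, known only to the first party
c_masked = he_add(n, encrypt(n, gradient), encrypt(n, noise))  # first encrypted noisy data
noisy_plain = decrypt(n, sk, c_masked)        # controller: first noisy data
recovered = (noisy_plain - noise) % n         # first party: first gradient plaintext
```

The controller only ever sees `gradient + noise`, so the true gradient stays hidden from it, while homomorphic addition lets the masking itself happen under encryption so the first party never handles the decryption key.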
In one possible implementation, the controller 530 is further configured to obtain second encrypted noisy data sent by the second party; decrypt the second encrypted noisy data by using the target private key to obtain second noisy data; and send the second noisy data to the second party 520, where the second encrypted noisy data is obtained by the second party 520 adding, to a second gradient ciphertext, a second noise ciphertext obtained by encrypting second noise; the second gradient ciphertext is obtained by the second party 520 through homomorphic operation based on the first encryption result, the second encryption result, a preset loss function and the second parameter portion;
the second party 520 is configured to receive the second noisy data, remove the second noise from the second noisy data to obtain a second gradient plaintext, and update the second parameter portion according to the second gradient plaintext.
The device and system embodiments correspond to the method embodiments and achieve the same technical effects; for specific descriptions, refer to the corresponding method embodiments, which are not repeated here.
The present specification also provides a computer-readable storage medium, on which a computer program is stored, which, when executed in a computer, causes the computer to perform the method for multi-party joint training graph neural network for protecting privacy data provided in the present specification.
The present specification also provides a computing device, including a memory and a processor, where the memory stores executable code, and the processor executes the executable code to implement the method for the multi-party joint training graph neural network for protecting privacy data provided by the present specification.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the storage medium and the computing device embodiments, since they are substantially similar to the method embodiments, they are described relatively simply, and reference may be made to some descriptions of the method embodiments for relevant points.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in connection with the embodiments of the invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above embodiments describe the objectives, technical solutions and advantages of the embodiments of the present invention in further detail. It should be understood that the above description provides only examples of embodiments of the present invention and is not intended to limit the scope of the present invention; any modification, equivalent replacement or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of the present invention.

Claims (21)

1. A method for multi-party joint training of a graph neural network with protection of private data, the parties including a first party, a second party and a controller, the method being performed by the first party and comprising:
processing a first characteristic part of the sample object by using a first parameter part of the graph neural network to obtain a first processing result;
homomorphic encryption is carried out on the first processing result by utilizing the target public key of the controller to obtain a first encryption result;
receiving a second encryption result from the second party, wherein the second encryption result is obtained by the second party performing homomorphic encryption on a second processing result by using the target public key, and the second processing result is obtained by the second party processing a second characteristic part of the sample object by using a second parameter part of the graph neural network;
obtaining a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function, wherein the loss function is in the form of an orthogonal polynomial approximating an activation function in the graph neural network;
adding a first noise ciphertext encrypted by first noise to the first gradient ciphertext to obtain first encrypted noisy data; sending the first encrypted noisy data to the controller;
receiving, from the controller, first noisy data obtained by decrypting the first encrypted noisy data, and removing the first noise from the first noisy data to obtain a first gradient plaintext;
and updating the first parameter part according to the first gradient plaintext.
2. The method of claim 1, wherein the sample objects are nodes or edges in a relational network graph corresponding to the graph neural network; a node represents one of the following business objects: a user, a merchant, an article; an edge represents an association between business objects.
3. The method of claim 1, wherein the obtaining a first processing result comprises:
and multiplying the first parameter part and the first characteristic part to obtain the first processing result.
4. The method of claim 1, further comprising, prior to said obtaining a first processing result:
aggregating a first original feature portion of a neighbor object of the sample object with a first original feature portion corresponding to the sample object, to obtain the first feature portion.
5. The method of claim 1, wherein the obtaining a first gradient ciphertext corresponding to the first parameter portion through a homomorphic operation based on the first and second encryption results and a preset loss function comprises:
determining a loss value ciphertext through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function;
determining the first gradient cipher text based on the loss value cipher text and the first parameter portion.
6. The method of claim 5, further comprising:
sending the loss value ciphertext to the controller;
the receiving, from the controller, first noisy data after decrypting the first encrypted noisy data, comprising:
receiving the first noisy data sent by the controller when it is determined that a loss value plaintext corresponding to the loss value ciphertext is not lower than a preset loss threshold.
7. The method of claim 1, wherein the orthogonal polynomial is a second-order orthogonal polynomial.
8. The method of claim 1, wherein the loss function is defined as an orthogonal polynomial with a sum of the first and second processing results as a variable.
9. The method of claim 1, further comprising:
the target public key is received from the controller and stored.
10. A method of multi-party joint training of graph neural networks for protecting private data, wherein the method comprises:
the first party utilizes the first parameter part of the graph neural network to process the first characteristic part of the sample object to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller to obtain a first encryption result;
the second party processes the second characteristic part of the sample object by using the second parameter part of the graph neural network to obtain a second processing result; using the target public key to perform homomorphic encryption on the second processing result to obtain a second encryption result, and sending the second encryption result to the first party;
the first party obtains a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function; adding a first noise ciphertext encrypted by first noise to the first gradient ciphertext to obtain first encrypted noisy data; sending the first encrypted noisy data to the controller; wherein the loss function is in the form of an orthogonal polynomial approximating an activation function in the graph neural network;
after the controller receives the first encrypted noisy data, decrypting the first encrypted noisy data by using a target private key corresponding to the target public key to obtain first noisy data; sending the first noisy data to the first party;
and the first party receives the first noisy data, removes the first noise from the first noisy data to obtain a first gradient plaintext, and updates the first parameter portion with the first gradient plaintext.
11. An apparatus for multi-party joint training of a graph neural network with protection of private data, the parties including a first party, a second party and a controller, the apparatus being deployed at the first party and comprising:
a first processing module configured to process a first characteristic portion of the sample object by using a first parameter portion of the graph neural network to obtain a first processing result;
the homomorphic encryption module is configured to homomorphic encrypt the first processing result by using the target public key of the controller to obtain a first encryption result;
a first receiving module configured to receive a second encryption result from the second party, where the second encryption result is obtained by the second party performing homomorphic encryption on a second processing result by using the target public key, and the second processing result is obtained by the second party processing a second feature portion of the sample object by using a second parameter portion of the graph neural network;
a gradient ciphertext determination module configured to obtain a first gradient ciphertext corresponding to the first parameter portion through homomorphic operation based on the first encryption result, the second encryption result, and a preset loss function, wherein the loss function is in a form of an orthogonal polynomial approximating an activation function in the graph neural network;
the adding and sending module is configured to add a first noise ciphertext encrypted by first noise to the first gradient ciphertext to obtain first encrypted noisy data; sending the first encrypted noisy data to the controller;
a second receiving module, configured to receive, from the controller, first noisy data obtained by decrypting the first encrypted noisy data, and to remove the first noise from the first noisy data to obtain a first gradient plaintext;
and the updating module is configured to update the first parameter part according to the first gradient plaintext.
12. The apparatus of claim 11, wherein the sample object is a node or an edge in a relational network graph corresponding to the graph neural network; a node represents one of the following business objects: a user, a merchant, an article; an edge represents an association between business objects.
13. The apparatus according to claim 11, wherein the first processing module is specifically configured to multiply the first parameter portion and the first characteristic portion to obtain the first processing result.
14. The apparatus of claim 11, further comprising:
and the aggregation module is configured to aggregate, before the first processing result is obtained, a first original feature portion of a neighbor object of the sample object with a first original feature portion corresponding to the sample object, to obtain the first feature portion.
15. The apparatus according to claim 11, wherein the gradient ciphertext determination module is specifically configured to determine the loss value ciphertext through a homomorphic operation based on the first and second encryption results and a preset loss function;
determining the first gradient cipher text based on the loss value cipher text and the first parameter portion.
16. The apparatus of claim 15, further comprising:
a transmission module configured to transmit the loss value ciphertext to the controller;
the second receiving module is specifically configured to receive the first noisy data sent by the controller when it is determined that a loss value plaintext corresponding to the loss value ciphertext is not lower than a preset loss threshold.
17. The apparatus of claim 11, wherein the orthogonal polynomial is a second-order orthogonal polynomial.
18. The apparatus of claim 11, wherein the loss function is defined as an orthogonal polynomial with a sum of the first and second processing results as a variable.
19. The apparatus of claim 11, further comprising:
a receiving storage module configured to receive the target public key from the controller and store the target public key.
20. A system for multi-party joint training of neural networks for protecting private data, the system comprising a first party, a second party and a controller,
the first party is used for processing a first characteristic part of the sample object by utilizing a first parameter part of the graph neural network to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller to obtain a first encryption result;
the second party is used for processing a second characteristic part of the sample object by using a second parameter part of the graph neural network to obtain a second processing result; using the target public key to perform homomorphic encryption on the second processing result to obtain a second encryption result, and sending the second encryption result to the first party;
the first party is further used for obtaining a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function; adding a first noise ciphertext encrypted by first noise to the first gradient ciphertext to obtain first encrypted noisy data; sending the first encrypted noisy data to the controller; the loss function is in the form of an orthogonal polynomial approximating an activation function in the graph neural network;
the controller is used for decrypting the first encrypted noisy data by using a target private key corresponding to the target public key after receiving the first encrypted noisy data to obtain first noisy data; sending the first noisy data to the first party;
the first party is further configured to receive the first noisy data, remove the first noise from the first noisy data to obtain a first gradient plaintext, and update the first parameter portion with the first gradient plaintext.
21. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that when executed by the processor implements the method of any of claims 1-9.
CN202111297665.XA 2021-11-03 2021-11-03 Method, device and system for multi-party combined training of graph neural network Active CN114091651B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111297665.XA CN114091651B (en) 2021-11-03 2021-11-03 Method, device and system for multi-party combined training of graph neural network

Publications (2)

Publication Number Publication Date
CN114091651A true CN114091651A (en) 2022-02-25
CN114091651B CN114091651B (en) 2024-05-24


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190372754A1 (en) * 2017-02-15 2019-12-05 Lg Electronics Inc. Apparatus and method for generating ciphertext data with maintained structure for analytics capability
CN111160573A (en) * 2020-04-01 2020-05-15 支付宝(杭州)信息技术有限公司 Method and device for protecting business prediction model of data privacy joint training by two parties
CN112199702A (en) * 2020-10-16 2021-01-08 鹏城实验室 Privacy protection method, storage medium and system based on federal learning
US20210019395A1 (en) * 2019-07-19 2021-01-21 Siemens Healthcare Gmbh Securely performing parameter data updates
CN112966298A (en) * 2021-03-01 2021-06-15 广州大学 Composite privacy protection method, system, computer equipment and storage medium
US20210224586A1 (en) * 2017-10-09 2021-07-22 Harbin Institute Of Technology Shenzhen Graduate School Image privacy perception method based on deep learning
CN113221153A (en) * 2021-05-31 2021-08-06 平安科技(深圳)有限公司 Graph neural network training method and device, computing equipment and storage medium
CN113505882A (en) * 2021-05-14 2021-10-15 深圳市腾讯计算机***有限公司 Data processing method based on federal neural network model, related equipment and medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YANG, Ye: "Research on Privacy-Preserving Data Cleaning and Joint Learning over Multiple Data Sources", China Master's Theses Full-text Database, Information Science and Technology, no. 02, 15 February 2020 (2020-02-15) *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant