CN112507323A - Model training method and device based on unidirectional network and computing equipment


Info

Publication number: CN112507323A
Application number: CN202110133496.XA
Authority: CN (China)
Prior art keywords: party, model, random number, intermediate result, training
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 周亚顺, 赵原, 尹栋
Current Assignee: Alipay Hangzhou Information Technology Co Ltd
Original Assignee: Alipay Hangzhou Information Technology Co Ltd
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202110133496.XA
Publication of CN112507323A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/45 Structures or tools for the administration of authentication
    • G06F21/60 Protecting data
    • G06F21/604 Tools and structures for managing or administering access control systems

Abstract

The embodiment of the specification discloses a model training method and device based on a unidirectional network, and a computing device. The unidirectional network comprises a first party, a second party and a third party; in the unidirectional network, the first party and the third party cannot actively access the second party. The method may be applied to the first party and comprises: receiving a first random number set sent by the third party; and jointly training a model with the second party according to the first random number set based on a secret sharing algorithm. In the process of jointly training the model, the first party receives an intermediate result sent by the second party when the secret is shared, and the sharding of the secret is determined according to the intermediate result and the first random number set. The embodiment of the specification can jointly train a model based on a secret sharing algorithm, with privacy protection, even when one of the multiple parties of the joint modeling is not allowed to be actively accessed.

Description

Model training method and device based on unidirectional network and computing equipment
Technical Field
The embodiment of the specification relates to the field of computer technology, and in particular to a model training method and device based on a unidirectional network, and a computing device.
Background
In the big data era, data silos are widespread. Data is often scattered across different enterprises, and, owing to competitive relationships and privacy concerns, enterprises do not fully trust one another.
In some cases, joint modeling is needed between enterprises, so that a mathematical model can be trained collaboratively with the data of all parties while each enterprise's data privacy is sufficiently protected. Because the data used to train the mathematical model is scattered among the parties of the joint modeling, how to protect each modeling party's data privacy during model training is a technical problem that urgently needs to be solved.
Disclosure of Invention
The embodiment of the specification provides a model training method, a model training device and a computing device based on a unidirectional network, so that data privacy of all modeling parties is protected in the model training process. The technical scheme of the embodiment of the specification is as follows.
In a first aspect of embodiments of the present specification, a model training method based on a unidirectional network is provided, where the unidirectional network includes a first party, a second party, and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the method is applied to a first party and comprises the following steps:
receiving a first random number set sent by a third party;
jointly training a model with a second party according to a first random number set based on a secret sharing algorithm; in the process of jointly training the model, a first party receives an intermediate result sent by a second party when the secret is shared, and the sharding of the secret is determined according to the intermediate result and the first random number set.
In a second aspect of embodiments of the present specification, there is provided a model training method based on a unidirectional network, the unidirectional network including a first party, a second party, and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the method is applied to a second party and comprises the following steps:
sending a first request to a third party;
receiving a second random number set fed back by a third party;
based on a secret sharing algorithm, jointly training a model with the first party according to the second random number set; in the process of jointly training the model, the second party sends a second request to the first party, receives an intermediate result fed back by the first party when the secret is shared, and determines the secret fragmentation according to the intermediate result and the second random number set.
In a third aspect of the embodiments of the present specification, a model training method based on a unidirectional network is provided, where the unidirectional network includes a first party, a second party, and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the method is applied to a third party and comprises the following steps:
receiving a first request sent by a second party;
in response to the first request, sending a first set of random numbers to a first party and a second set of random numbers to a second party; so that the first party jointly trains the model according to the first random number set and the second party jointly trains the model according to the second random number set based on the secret sharing algorithm.
In a fourth aspect of embodiments of the present specification, there is provided a model training apparatus based on a unidirectional network, the unidirectional network including a first party, a second party, and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the apparatus, applied to a first party, comprises:
the receiving unit is used for receiving a first random number set sent by a third party;
the training unit is used for training a model jointly with a second party according to the first random number set based on a secret sharing algorithm; in the process of jointly training the model, a first party receives an intermediate result sent by a second party when the secret is shared, and the sharding of the secret is determined according to the intermediate result and the first random number set.
In a fifth aspect of embodiments of the present specification, there is provided a model training apparatus based on a unidirectional network, the unidirectional network including a first party, a second party, and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the apparatus, applied to a second party, comprises:
a sending unit, configured to send a first request to a third party;
the receiving unit is used for receiving a second random number set fed back by a third party;
the training unit is used for training the model with the first party in a combined manner according to the second random number set based on the secret sharing algorithm; in the process of jointly training the model, the second party sends a second request to the first party, receives an intermediate result fed back by the first party when the secret is shared, and determines the secret fragmentation according to the intermediate result and the second random number set.
In a sixth aspect of embodiments of the present specification, there is provided a model training apparatus based on a unidirectional network, the unidirectional network including a first party, a second party, and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the device is applied to a third party and comprises:
a receiving unit, configured to receive a first request sent by a second party;
a sending unit, configured to send a first random number set to a first party and a second random number set to a second party in response to the first request; so that the first party jointly trains the model according to the first random number set and the second party jointly trains the model according to the second random number set based on the secret sharing algorithm.
A seventh aspect of embodiments of the present specification provides a computing device, comprising:
at least one processor;
a memory storing program instructions configured to be suitable for execution by the at least one processor, the program instructions comprising instructions for performing the method of the first, second or third aspect.
According to the technical scheme provided by the embodiment of the specification, even when one of the multiple parties of the joint modeling is not allowed to be actively accessed, a model can be jointly trained based on a secret sharing algorithm with privacy protection.
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are only some embodiments described in the present specification; for those skilled in the art, other drawings can be obtained from them without creative effort.
FIG. 1 is a schematic diagram of a secret sharing process in an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of a model training method in an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a model training process in an embodiment of the present disclosure;
FIG. 4 is a schematic flow chart of a model training method in an embodiment of the present disclosure;
FIG. 5 is a schematic flow chart of a model training method in an embodiment of the present disclosure;
FIG. 6 is a schematic flow chart of a model training method in an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
fig. 10 is a schematic structural diagram of an electronic device in an embodiment of the present specification.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step should fall within the scope of protection of the present specification.
Secure Multi-Party Computation (MPC) is an algorithm that protects data privacy and security. Multi-party secure computation ensures that multiple data parties can perform cooperative computation without leaking their own data.
Secret Sharing (SS) is an algorithm for protecting data privacy and security. Using a secret sharing algorithm, multiple data parties can perform cooperative computation without leaking their own data, sharing secret information; each data party obtains one share of the secret. Please refer to FIG. 1. For example, suppose there are a data party P1, a data party P2, and a Trusted Third Party (TTP). P1 holds business data x1, and P2 holds business data x2. Using a secret sharing algorithm, P1 and P2 can cooperatively compute and share the secret information y: P1 obtains the share y1 of y, P2 obtains the share y2 of y, and y = y1 + y2 = x1·x2. Specifically, the trusted third party may generate random numbers U, Z1, V, and Z2 satisfying the relation Z1 + Z2 = U·V; it may issue U and Z1 to P1, and V and Z2 to P2. P1 may receive U and Z1, calculate E = x1 - U, and send E to P2. P2 may receive V and Z2, calculate F = x2 - V, and send F to P1. P1 may then receive F and calculate its share of the secret information, y1 = U·F + Z1. P2 may receive E and calculate its share, y2 = E·x2 + Z2.
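As a minimal sketch of the multiplication protocol just described (illustrative only: it runs over plain Python integers for readability, whereas a real deployment would share values in a finite ring so that individual shares reveal nothing, and all identifiers are ours rather than the patent's):

```python
import random

def ttp_generate_triple(bits=32):
    """Trusted third party: random U, V and shares Z1, Z2 with Z1 + Z2 == U * V."""
    U = random.randrange(1 << bits)
    V = random.randrange(1 << bits)
    Z1 = random.randrange(1 << bits)
    Z2 = U * V - Z1
    return (U, Z1), (V, Z2)

def share_product(x1, x2):
    """Run the exchange above over plain integers."""
    (U, Z1), (V, Z2) = ttp_generate_triple()
    E = x1 - U           # computed by P1 and sent to P2
    F = x2 - V           # computed by P2 and sent to P1
    y1 = U * F + Z1      # P1's share of y
    y2 = E * x2 + Z2     # P2's share of y
    assert y1 + y2 == x1 * x2   # the reconstruction property y = x1 * x2
    return y1, y2
```

The assertion checks exactly the property the protocol guarantees: the two shares sum to the product, while neither share alone equals it.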
An excitation function (also known as an activation function) may be used to construct a mathematical model. The excitation function defines the output at a given input and is a non-linear function; through the excitation function, non-linear factors can be added to the mathematical model, improving its expressive power. Excitation functions may include the Sigmoid function, the Tanh function, the ReLU function, and the like. A loss function may be used to measure the degree of inconsistency between the predicted values and the true values of a mathematical model; the smaller the value of the loss function, the more robust the mathematical model. Loss functions include, but are not limited to, the logarithmic loss function and the square loss function. The mathematical model may include a logistic regression model, a neural network model, and the like.
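To make the two roles concrete, a small sketch of one common choice of each (a Sigmoid excitation function and a logarithmic loss), assuming a binary label; purely illustrative:

```python
import math

def sigmoid(z):
    """Sigmoid excitation function: maps a linear score to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)          # numerically stable branch for z < 0
    return e / (1.0 + e)

def log_loss(y_true, y_pred, eps=1e-12):
    """Logarithmic loss for one sample; smaller means a better fit."""
    y_pred = min(max(y_pred, eps), 1.0 - eps)  # clamp away from 0 and 1
    return -(y_true * math.log(y_pred) + (1.0 - y_true) * math.log(1.0 - y_pred))
```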
In the related art, each data party of the joint modeling can open its own communication port to the outside, so that each data party can be actively accessed by the other data parties. That a data party can be actively accessed by other data parties can be understood as follows: other data parties can actively establish a communication connection with that data party (and, of course, that data party can also actively establish communication connections with other data parties). For example, data party P1 and data party P2 are the two parties of a joint modeling. Both can open their communication ports to the outside, so that P1 can be actively accessed by P2, and P2 can also be actively accessed by P1. In some cases, however, one or more data parties of the joint modeling need extremely strong privacy protection, so that they may not open their communication ports to the outside; such a data party then cannot be actively accessed by the other data parties. That a data party cannot be actively accessed by other data parties can be understood as follows: other data parties cannot actively establish a communication connection with that data party (but that data party can actively establish communication connections with other data parties). For example, data party P1 can open its own communication port, so that P1 can be actively accessed by P2; data party P2 may not open its own communication port, so that P2 cannot be actively accessed by P1.
Embodiments of the present description provide a unidirectional network with which a mathematical model may be trained.
In some embodiments, the unidirectional network may include a first party, a second party, and a third party.
The first party, the second party, and the third party may be a single server, a server cluster composed of a plurality of servers, or a server deployed in the cloud. The first party and the second party can be two parties of joint modeling, and the third party is used for providing random numbers required in the joint modeling process for the first party and the second party.
In the unidirectional network, the first party and the third party are allowed to be actively accessed, while the second party is not. Thus, the first party can actively establish a communication connection with the third party but cannot actively establish one with the second party; the second party can actively establish communication connections with both the first party and the third party; and the third party can actively establish a communication connection with the first party but cannot actively establish one with the second party.
In some embodiments, the training samples are scattered between the first party and the second party. Specifically, the first party may hold the feature data of a training sample, and the second party may hold the label data of the training sample. For example, the first party may be a big data company that holds feature data such as the amount of a user's loan, the user's social security contribution base, whether the user is married, and whether the user owns property. The big data company may open its own communication port, so that it is allowed to be actively accessed. The second party may be a credit investigation institution that holds label data for the user, the label data indicating the user's credit status. Since the credit status concerns the user's personal privacy, extremely strong privacy protection is required; the credit investigation institution may not open its own communication port, so that it is not allowed to be actively accessed. In addition, there may be multiple training samples. The first party may hold the feature data of the training samples together with the identifications of the training samples, and the second party may hold the label data together with the same identifications. Thus, using the identifications, the first party can select the feature data of one or more training samples and the second party can select the label data of the same training samples to jointly train the mathematical model. Of course, in practical applications, the first party may instead hold the label data of the training samples and the second party the feature data.
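As an illustration of the identification-based sample selection described above, the sketch below joins the two holdings by sample identification. It collapses both parties into a single process purely to show the join logic; in a real deployment the feature data would stay with the first party, the label data with the second party, and the intersection of identifications would itself be computed with privacy protection:

```python
def align_samples(features_by_id, labels_by_id):
    """Select the training samples present on both sides, keyed by sample ID.

    Illustrative only: here both dictionaries live in one process, while in
    practice each party keeps its own data and only the common IDs are agreed.
    """
    common_ids = sorted(set(features_by_id) & set(labels_by_id))
    X = [features_by_id[i] for i in common_ids]  # first party's inputs
    y = [labels_by_id[i] for i in common_ids]    # second party's inputs
    return common_ids, X, y
```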
In some embodiments, the second party may send a first request to the third party. The third party may receive the first request and obtain a first random number set and a second random number set. The third party may issue the first random number set to the first party, and the first party may receive it. In response to the first request, the third party may feed back the second random number set to the second party.
The first random number set and the second random number set may each include at least one random number, and the random numbers in the first random number set and the random numbers in the second random number set satisfy a preset condition. Specifically, the first random number set may include a first sub random number set and a second sub random number set, and the second random number set may include a third sub random number set and a fourth sub random number set; the random numbers in the first, second, third, and fourth sub random number sets satisfy the preset condition. For example, the first random number set may be represented as {(U1, U2, …, Ui, …, Un), (Z11, Z12, …, Z1i, …, Z1n)} and the second random number set as {(V1, V2, …, Vi, …, Vn), (Z21, Z22, …, Z2i, …, Z2n)}, where (U1, U2, …, Ui, …, Un) is the first sub random number set, (Z11, Z12, …, Z1i, …, Z1n) is the second sub random number set, (V1, V2, …, Vi, …, Vn) is the third sub random number set, and (Z21, Z22, …, Z2i, …, Z2n) is the fourth sub random number set. The preset condition may be Z1i + Z2i = Ui·Vi.
The third party may generate the first set of random numbers and the second set of random numbers after receiving the first request from the second party. Alternatively, the second party may send a separate random number generation request to the third party. The third party may receive a random number generation request; the first set of random numbers and the second set of random numbers may be generated; the first set of random numbers and the second set of random numbers may be stored. The third party may thus read the stored first and second random number sets after receiving the first request from the second party.
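A sketch of what the third party's generation step might look like, under the preset condition Z1i + Z2i = Ui·Vi from above (hypothetical helper names; integers instead of ring elements for readability):

```python
import random

def generate_random_number_sets(n, bits=32):
    """Third party: first set {(Ui, Z1i)} and second set {(Vi, Z2i)}
    with Z1i + Z2i == Ui * Vi for every i (the preset condition)."""
    first_set, second_set = [], []
    for _ in range(n):
        U = random.randrange(1 << bits)
        V = random.randrange(1 << bits)
        Z1 = random.randrange(1 << bits)
        first_set.append((U, Z1))           # issued to the first party
        second_set.append((V, U * V - Z1))  # issued to the second party
    return first_set, second_set
```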
In some embodiments, based on a secret sharing algorithm, the first party may jointly train the model according to the first random number set, and the second party may jointly train the model according to the second random number set. In the course of jointly training the model, the first party may determine a first intermediate result in sharing the secret, and the second party may determine a second intermediate result in sharing the secret. Since the second party can actively access the first party, the second party may send the second intermediate result to the first party; the first party may receive it and determine a shard of the secret based on the second intermediate result and the first random number set. In addition, since the first party cannot actively access the second party, the second party may send a second request to the first party, and the first party, in response to the second request, may feed back the first intermediate result to the second party; the second party may receive it and determine another shard of the secret based on the first intermediate result and the second random number set. Thus, even though the second party is not allowed to be actively accessed, the model can be jointly trained based on the secret sharing algorithm with privacy protection.
The secret may comprise a secret of the first party and the second party in the process of jointly training the model, and may differ depending on the method the two parties use. For example, if the first party and the second party jointly train the model using the gradient descent method based on a secret sharing algorithm, the secret may comprise at least one of: the product between the feature data and the model parameters, the value of the excitation function, and the gradient of the loss function. As another example, if the first party and the second party jointly train the model using Newton's method based on a secret sharing algorithm, the secret may comprise at least one of: the product between the feature data and the model parameters, the value of the excitation function, the gradient of the loss function, the Hessian matrix, and the inverse of the Hessian matrix. The Hessian matrix is a square matrix formed by the second-order partial derivatives of the loss function and represents the local curvature of the loss function.
An example scenario for a first-party and a second-party joint training model in the embodiments of the present disclosure is described below.
In this scenario example, the first party may hold the feature data of a training sample and a first slice of the model parameters, and the second party may hold the label data of the training sample and a second slice of the model parameters. Based on the secret sharing algorithm, the first party can jointly train the model using the gradient descent method according to the feature data, the first slice of the model parameters, and the first random number set, and the second party can do so according to the label data, the second slice of the model parameters, and the second random number set.
It should be noted that this scenario example describes the process of jointly training the model with the first party holding the feature data of the training sample and the second party holding the label data of the training sample; practical applications are not limited to this. For example, the first party may instead hold the label data of the training sample and the second party the feature data. In addition, although this scenario example describes the joint training process using the gradient descent method, practical applications are not limited to this either; for example, the first party and the second party may also jointly train the model using Newton's method.
The joint training model includes a plurality of rounds of iterative processes. Referring to fig. 2 and 3, each iteration process includes the following steps.
Step S101: the first party determines a first intermediate result in sharing the product according to the first random number set and the feature data.
Step S103: the second party determines a second intermediate result in sharing the product according to the second random number set and the second piece of the model parameter.
In some embodiments, the product may comprise the product between the feature data and the model parameters. For example, if the feature data is expressed as X and the model parameters as W, the product can be expressed as WX = X·W.
In some embodiments, the first party and the second party may each hold a slice of the model parameters: the first party holds a first slice <W>0 and the second party holds a second slice <W>1, and the sum of the two slices equals the model parameters, <W>0 + <W>1 = W.
If the iteration process of the current round is the first iteration process, the model parameters may be initial model parameters of the mathematical model. The initial model parameters may be empirical values or random values, etc. In practical application, the third party or other trusted computing equipment can split the initial model parameters of the mathematical model to obtain a first fragment and a second fragment of the initial model parameters; a first slice of initial model parameters may be sent to the first party; a second slice of initial model parameters may be sent to the second party. The first party may receive a first slice of initial model parameters. The second party may receive a second slice of initial model parameters. If the iteration process of the current round is a non-initial round, the first party can obtain a first fragment of the model parameter through the iteration process of the previous round, and the second party can obtain a second fragment of the model parameter.
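Where the third party or another trusted computing device splits the initial model parameters, the splitting can be an additive share as sketched below (illustrative only: plain floats are used here, while a secure system would share fixed-point encodings of the parameters in a finite ring):

```python
import random

def split_initial_parameters(W):
    """Trusted setup: additively split each initial parameter into two slices."""
    W0 = [random.uniform(-1.0, 1.0) for _ in W]  # first party's slices
    W1 = [w - r for w, r in zip(W, W0)]          # second party's slices
    return W0, W1                                # W0[i] + W1[i] == W[i]
```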
In some embodiments, the first intermediate result and the second intermediate result are intermediate results generated by the first party and the second party, respectively, in the process of secretly sharing the product. Specifically, the first party may determine the first intermediate result according to the random numbers in the first random number set and the feature data, and the second party may determine the second intermediate result according to the random numbers in the second random number set and the second slice of the model parameters. Continuing with the previous example, the first random number set may be represented as {(U1, Z11), (U2, Z12), …, (Ui, Z1i), …, (Un, Z1n)} and the second random number set as {(V1, Z21), (V2, Z22), …, (Vi, Z2i), …, (Vn, Z2n)}, with Z1i + Z2i = Ui·Vi. The first party may calculate the first intermediate result E1 = X - U1 according to the random number U1 and the feature data X. The second party may calculate the second intermediate result F1 = <W>1 - V1 according to the random number V1 and the second slice of the model parameters <W>1.
Step S105: the second party sends the second intermediate result and the first request to the first party.
In some embodiments, the second party may send a second intermediate result to the first party. In addition, the first party is not able to actively send the first intermediate result to the second party since the second party is not allowed to be actively accessed. The second party may thus send a first request to the first party requesting that a first intermediate result be obtained.
In practical applications, the second party may send the second intermediate result to the first party first, and then send the first request to the first party. Or, the second party may also send a first request to the first party first, and send a second intermediate result to the first party. Alternatively, the second party may also send a second intermediate result and the first request in parallel to the first party.
Step S107: the first party determines a first fragment of the product according to the second intermediate result, the first fragment of the model parameter and the first random number set; and feeding back the first intermediate result to the second party in response to a first request sent by the second party.
Step S109: the second party determines a second segment of the product based on the first intermediate result and the second set of random numbers.
In some embodiments, the first party may receive a second intermediate result; a first slice of the product may be determined based on the second intermediate result, the first slice of the model parameters, and the random numbers in the first random number set.
Continuing with the previous example, the first party may receive the second intermediate result F1 = <W>1 - V1; according to F1 and the random numbers U1 and Z11, it may calculate <[X<W>1]>0 = U1·F1 + Z11; and according to <[X<W>1]>0 and the first slice of the model parameters <W>0, it may further calculate X·<W>0 + <[X<W>1]>0 as the first slice <WX>0 of the product WX.
In some embodiments, the first party may receive the first request; the first intermediate result may be fed back to the second party as a response to the first request. The second party may receive the first intermediate result; a second slice of the product may be determined based on the first intermediate result and the random numbers in the second random number set.
The sum of the second slice of the product and the first slice of the product is equal to the product.
Continuing with the previous example, the second party may receive the first intermediate result E1 = X - U1; according to E1 and the random number Z21, it may calculate <[X<W>1]>1 = E1·<W>1 + Z21 as the second slice <WX>1 of the product WX. Here <[X<W>1]>1 + <[X<W>1]>0 = X·<W>1, and <WX>1 + <WX>0 = WX.
It is noted that the random number used in determining the first intermediate result, the random number used in determining the second intermediate result, the random number used in determining the first slice of the product, and the random number used in determining the second slice of the product may constitute a random array. Continuing with the previous example, (U1, Z11, V1, Z21) may constitute a random array such that Z11 + Z21 = U1·V1.
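Pulling steps S101 through S109 together, the following sketch runs one product-sharing round for a single scalar feature over plain integers, so that the final assertion holds exactly (names are ours; this is an illustration, not the patent's reference implementation):

```python
def share_WX(X, W0, W1, U1, Z11, V1, Z21):
    """Steps S101-S109 for one scalar sample; requires Z11 + Z21 == U1 * V1.
    Returns the two slices <WX>0 and <WX>1 of the product W*X."""
    E1 = X - U1        # step S101: first intermediate result (first party)
    F1 = W1 - V1       # step S103: second intermediate result (second party)
    # Steps S105-S109: after exchanging E1 and F1, each side finishes locally.
    WX0 = X * W0 + (U1 * F1 + Z11)   # first party's slice of the product
    WX1 = E1 * W1 + Z21              # second party's slice of the product
    assert WX0 + WX1 == (W0 + W1) * X   # holds exactly over integers
    return WX0, WX1
```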
Step S111: the first party determines a third intermediate result when the value of the incentive function is shared according to the first random number set and the first fragment of the product.
Step S113: the second party determines a fourth intermediate result in sharing the value of the incentive function according to the second random number set and the second piece of the product.
In some embodiments, the third intermediate result and the fourth intermediate result are intermediate results generated by the first party and the second party, respectively, in the process of secretly sharing the value of the excitation function. Specifically, the first party may determine the third intermediate result according to the random numbers in the first random number set and the first slice of the product, and the second party may determine the fourth intermediate result according to the random numbers in the second random number set and the second slice of the product. In practical applications, obtaining the value of an excitation function usually involves non-linear operations (for example, logarithmic, exponential, or trigonometric operations), which are difficult to compute directly by secret sharing. Therefore, a polynomial can be used to fit the excitation function, and the value of the polynomial, determined by secret sharing, is used as the value of the excitation function. Specifically, the first party may determine the third intermediate result in sharing the polynomial value according to the random numbers in the first random number set and the first slice of the product, and the second party may determine the fourth intermediate result in sharing the polynomial value according to the random numbers in the second random number set and the second slice of the product.
Continuing with the previous example, the excitation function may be a Sigmoid function, and the polynomial may be expressed as

a = a0 + a1(WX) + a2(WX)^3
  = a0 + a1(<WX>1 + <WX>0) + a2(<WX>1^3 + 3<WX>1^2·<WX>0 + 3<WX>1·<WX>0^2 + <WX>0^3).
Then, the first party may calculate E2 = 3a2·<WX>0 - U2 and E3 = 3a2·<WX>0^2 - U3 as the third intermediate result, according to the random numbers U2 and U3 and the first slice <WX>0 of the product. The second party may calculate F2 = <WX>1^2 - V2 and F3 = <WX>1 - V3 as the fourth intermediate result, according to the random numbers V2 and V3 and the second slice <WX>1 of the product.
Step S115: the second party sends the fourth intermediate result and the second request to the first party.
In some embodiments, the second party may send a fourth intermediate result to the first party. Additionally, the first party is unable to actively send a third intermediate result to the second party because the second party is not allowed to be actively accessed. The second party may thus send a second request to the first party requesting a third intermediate result.
In practical applications, the second party may send the fourth intermediate result to the first party first, and then send the second request to the first party. Or, the second party may also send a second request to the first party first, and then send a fourth intermediate result to the first party. Alternatively, the second party may also send a fourth intermediate result and a second request in parallel to the first party.
Step S117: the first party determines a first fragment of the value of the excitation function according to the fourth intermediate result and the first random number set; and feeding back the third intermediate result to the second party in response to a second request from the second party.
Step S119: and the second party determines a second fragment of the value of the excitation function according to the third intermediate result and the second random number set.
In some embodiments, the first party may receive a fourth intermediate result; the first segment of the excitation function value may be determined based on the fourth intermediate result and the random numbers in the first random number set.
In practical applications, a polynomial may be used to fit the excitation function. Thus, the first party may determine, as the first segment of the excitation function value, the first segment of the polynomial value according to the fourth intermediate result and the random numbers in the first random number set.
Continuing with the previous example, the first party may receive the fourth intermediate result F2 = <WX>1^2 - V2 and F3 = <WX>1 - V3; according to F2 and the random number Z12, it may calculate <[3a2<WX>1^2·<WX>0]>0 = U2·F2 + Z12; according to F3 and the random number Z13, it may calculate <[3a2<WX>1·<WX>0^2]>0 = U3·F3 + Z13; and it may further calculate a0 + a1·<WX>0 + <[3a2<WX>1^2·<WX>0]>0 + <[3a2<WX>1·<WX>0^2]>0 + a2·<WX>0^3 as the first slice <a>0 of the polynomial value a.
In some embodiments, the first party may receive the second request; the third intermediate result may be fed back to the second party as a response to the second request. The second party may receive the third intermediate result; the second segment of the excitation function value may be determined according to the third intermediate result and the random numbers in the second random number set.
In practical applications, a polynomial may be used to fit the excitation function. Thus, the second party may determine, as the second segment of the excitation function value, the second segment of the polynomial value according to the third intermediate result and the random numbers in the second random number set.
The sum of the first slice of the excitation function value and the second slice of the excitation function value is equal to the value of the excitation function. The sum of the first piece of the polynomial value and the second piece of the polynomial value equals the value of the polynomial.
Continuing with the previous example, the second party may receive the third intermediate result E2 = 3a2·<WX>0 - U2 and E3 = 3a2·<WX>0^2 - U3; according to E2 and the random number Z22, it may calculate <[3a2<WX>1^2·<WX>0]>1 = E2·<WX>1^2 + Z22; according to E3 and the random number Z23, it may calculate <[3a2<WX>1·<WX>0^2]>1 = E3·<WX>1 + Z23; and it may further calculate a1·<WX>1 + <[3a2<WX>1^2·<WX>0]>1 + <[3a2<WX>1·<WX>0^2]>1 + a2·<WX>1^3 as the second slice <a>1 of the polynomial value a. Here <[3a2<WX>1^2·<WX>0]>0 + <[3a2<WX>1^2·<WX>0]>1 = 3a2<WX>1^2·<WX>0, <[3a2<WX>1·<WX>0^2]>0 + <[3a2<WX>1·<WX>0^2]>1 = 3a2<WX>1·<WX>0^2, and <a>0 + <a>1 = a.
It is worth mentioning that the random number used when determining the third intermediate result, the random number used when determining the fourth intermediate result, the random number used when determining the first slice of the excitation function value, and the random number used when determining the second slice of the excitation function value may constitute a random array. Continuing with the previous example, (U2, Z12, V2, Z22) may constitute a random array such that Z12 + Z22 = U2·V2, and (U3, Z13, V3, Z23) may constitute a random array such that Z13 + Z23 = U3·V3.
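Steps S111 through S119 can be sketched the same way. The function below shares the polynomial a = a0 + a1·s + a2·s^3 for s = <WX>0 + <WX>1, using the two random arrays identified above (illustrative, over plain integers; our names):

```python
def share_activation(s0, s1, coeffs, t2, t3):
    """Steps S111-S119: share a = a0 + a1*s + a2*s**3 with s = s0 + s1.
    t2 = (U2, Z12, V2, Z22) and t3 = (U3, Z13, V3, Z23) are random arrays
    with Z12 + Z22 == U2 * V2 and Z13 + Z23 == U3 * V3."""
    a0, a1, a2 = coeffs
    U2, Z12, V2, Z22 = t2
    U3, Z13, V3, Z23 = t3
    E2 = 3 * a2 * s0 - U2        # third intermediate result (first party)
    E3 = 3 * a2 * s0**2 - U3
    F2 = s1**2 - V2              # fourth intermediate result (second party)
    F3 = s1 - V3
    # After exchanging (E2, E3) and (F2, F3), each side finishes locally:
    A0 = a0 + a1 * s0 + (U2 * F2 + Z12) + (U3 * F3 + Z13) + a2 * s0**3
    A1 = a1 * s1 + (E2 * s1**2 + Z22) + (E3 * s1 + Z23) + a2 * s1**3
    assert A0 + A1 == a0 + a1 * (s0 + s1) + a2 * (s0 + s1)**3
    return A0, A1
```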
Step S121: the first party determines a fifth intermediate result when the loss function gradient is shared according to the first random number set, the feature data and the first fragment of the excitation function value.
Step S123: and the second party determines a sixth intermediate result when the loss function gradient is shared according to the second random number set, the tag data and the second fragment of the excitation function value.
In some embodiments, the fifth intermediate result and the sixth intermediate result are intermediate results generated by the first party and the second party, respectively, in the process of secretly sharing the gradient of the loss function. Specifically, the first party may determine the fifth intermediate result according to the random numbers in the first random number set, the feature data, and the first slice of the excitation function value. The second party may determine the sixth intermediate result according to the random numbers in the second random number set, the label data, and the second slice of the excitation function value.
Continuing with the previous example, the first party may calculate E4 = X^T - U4 and E5 = X^T - U5 as the fifth intermediate result, according to the random numbers U4 and U5 and the feature data X. The second party may calculate F4 = <a>1 - V4 and F5 = y - V5 as the sixth intermediate result, according to the random numbers V4 and V5, the second slice <a>1 of the excitation function value, and the label data y.
Step S125: the second party sends a sixth intermediate result and a third request to the first party.
In some embodiments, the second party may send a sixth intermediate result to the first party. Additionally, the first party is unable to actively send a fifth intermediate result to the second party because the second party is not allowed to be actively accessed. The second party may thus send a third request to the first party requesting a fifth intermediate result.
In practical applications, the second party may send the sixth intermediate result to the first party first, and then send the third request to the first party. Or, the second party may also send a third request to the first party first, and then send a sixth intermediate result to the first party. Alternatively, the second party may also send a sixth intermediate result and a third request in parallel to the first party.
Step S127: the first party determines a first segment of the loss function gradient according to the sixth intermediate result and the first random number set; and feeding back a fifth intermediate result to the second party in response to a third request from the second party.
Step S129: the second party determines a second patch of the gradient of the loss function based on the fifth intermediate result and the second set of random numbers.
In some embodiments, the first party may receive a sixth intermediate result; the first slice of the gradient of the penalty function may be determined based on the sixth intermediate result, the random numbers in the first random number set.
Continuing with the previous example, the first party may receive the sixth intermediate result F4 = <a>1 - V4 and F5 = y - V5; according to F4 and the random number Z14, it may calculate <[X^T<a>1]>0 = U4·F4 + Z14; according to F5 and the random number Z15, it may calculate <[X^T·y]>0 = U5·F5 + Z15; and it may further calculate X^T·<a>0 + <[X^T<a>1]>0 - <[X^T·y]>0 as the first slice <dW>0 of the loss function gradient dW = X^T(a - y) (the minus sign follows from the form of the gradient).
In some embodiments, the first party may receive the third request; the fifth intermediate result may be fed back to the second party as a response to the third request. The second party may receive the fifth intermediate result; a second slice of the gradient of the loss function may be determined based on the fifth intermediate result and the random numbers in the second random number set.
The sum of the first patch of the loss function gradient and the second patch of the loss function gradient is equal to the gradient of the loss function.
Continuing with the previous example, the second party may receive the fifth intermediate result E4 = X^T - U4 and E5 = X^T - U5; according to E4 and the random number Z24, it may calculate <[X^T<a>1]>1 = E4·<a>1 + Z24; according to E5 and the random number Z25, it may calculate <[X^T·y]>1 = E5·y + Z25; and it may further calculate <[X^T<a>1]>1 - <[X^T·y]>1 as the second slice <dW>1 of the loss function gradient dW = X^T(a - y), with <dW>0 + <dW>1 = dW.
It should be noted that the random number used for determining the fifth intermediate result, the random number used for determining the sixth intermediate result, the random number used for determining the first slice of the loss function gradient, and the random number used for determining the second slice of the loss function gradient may constitute a random array. Continuing with the previous example, (U4, Z14, V4, Z24) may constitute a random array such that Z14 + Z24 = U4·V4, and (U5, Z15, V5, Z25) may constitute a random array such that Z15 + Z25 = U5·V5.
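Steps S121 through S129, for a single scalar sample (so X^T is just a number), can be sketched as follows, with the minus signs following the gradient dW = X^T(a - y) as reconstructed above; again an illustration over plain integers with our own names:

```python
def share_gradient(Xt, A0, A1, y, t4, t5):
    """Steps S121-S129: share dW = X^T * (a - y) for one scalar sample.
    t4 = (U4, Z14, V4, Z24) and t5 = (U5, Z15, V5, Z25) are random arrays
    with Z14 + Z24 == U4 * V4 and Z15 + Z25 == U5 * V5."""
    U4, Z14, V4, Z24 = t4
    U5, Z15, V5, Z25 = t5
    E4, E5 = Xt - U4, Xt - U5   # fifth intermediate result (first party)
    F4, F5 = A1 - V4, y - V5    # sixth intermediate result (second party)
    # After exchanging the intermediate results, each side finishes locally:
    dW0 = Xt * A0 + (U4 * F4 + Z14) - (U5 * F5 + Z15)  # first slice
    dW1 = (E4 * A1 + Z24) - (E5 * y + Z25)             # second slice
    assert dW0 + dW1 == Xt * ((A0 + A1) - y)
    return dW0, dW1
```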
Step S131: and the first party determines a first fragment of a new model parameter according to the first fragment of the loss function gradient, the first fragment of the model parameter and the preset step length.
Step S133: and the second party determines a new second fragment of the model parameter according to the second fragment of the loss function gradient, the second fragment of the model parameter and the preset step length.
In some embodiments, the preset step size may be used to control the iteration speed of the gradient descent method, and it can be set flexibly according to actual needs. When the preset step size is too large, the iteration speed is too fast and the optimal model parameters may not be obtained; when it is too small, the iteration speed is too slow and training takes a long time. The preset step size may be an empirical value, may be obtained by machine learning, or may be obtained in other ways. Both the first party and the second party hold the preset step size.
In some embodiments, the first party may multiply the first slice of the loss function gradient by the preset step size and subtract the result from the first slice of the model parameters to obtain the first slice of the new model parameters. The second party may multiply the second slice of the loss function gradient by the preset step size and subtract the result from the second slice of the model parameters to obtain the second slice of the new model parameters.
The sum of the first slice of the new model parameters and the second slice of the new model parameters is equal to the new model parameters.
Continuing with the previous example, the first party may calculate <W'>0 = <W>0 - L·<dW>0 as the first slice of the new model parameters, and the second party may calculate <W'>1 = <W>1 - L·<dW>1 as the second slice of the new model parameters, where L is the preset step size, <W>0 + <W>1 = W, and <W'>0 + <W'>1 = W'.
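Finally, steps S131 and S133 are purely local; a sketch (our names, illustrative):

```python
def update_parameter_slices(W0, W1, dW0, dW1, L):
    """Steps S131/S133: each party updates its own slice with step size L,
    so the (never reconstructed) parameters satisfy W' = W - L * dW."""
    W0_new = W0 - L * dW0   # first party, from its own slices only
    W1_new = W1 - L * dW1   # second party, from its own slices only
    return W0_new, W1_new
```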
The embodiment of the specification provides a model training method based on a unidirectional network. The unidirectional network may include a first party, a second party, and a third party. In the unidirectional network, the first party and the third party cannot actively access the second party. Referring to fig. 4, the model training method may be applied to the first party, and specifically may include the following steps.
Step S41: a first set of random numbers from a third party is received.
Step S43: jointly training a model with a second party according to a first random number set based on a secret sharing algorithm; in the process of jointly training the model, the first party receives an intermediate result sent by the second party when the secret is shared, and the fragment of the secret is determined according to the intermediate result and the first random number set.
The model training method in the embodiment of the specification can jointly train a model based on a secret sharing algorithm with privacy protection, even when one of the multiple parties of the joint modeling is not allowed to be actively accessed.
The embodiment of the specification also provides another model training method based on the unidirectional network. The unidirectional network may include a first party, a second party, and a third party. In the unidirectional network, the first party and the third party cannot actively access the second party. Referring to fig. 5, the model training method may be applied to the second party, and specifically may include the following steps.
Step S51: a first request is sent to a third party.
Step S53: and receiving a second random number set fed back by a third party.
Step S55: based on a secret sharing algorithm, jointly training a model with the first party according to the second random number set; in the process of jointly training the model, the second party sends a second request to the first party, receives an intermediate result fed back by the first party when the secret is shared, and determines the secret fragmentation according to the intermediate result and the second random number set.
The model training method in the embodiment of the specification can jointly train a model based on a secret sharing algorithm with privacy protection, even when one of the multiple parties of the joint modeling is not allowed to be actively accessed.
The embodiment of the specification also provides another model training method based on the unidirectional network. The unidirectional network may include a first party, a second party, and a third party. In the unidirectional network, the first party and the third party cannot actively access the second party. Referring to fig. 6, the model training method may be applied to a third party, and specifically includes the following steps.
Step S61: a first request from a second party is received.
Step S63: in response to the first request, sending a first set of random numbers to a first party and a second set of random numbers to a second party; so that the first party jointly trains the model according to the first random number set and the second party jointly trains the model according to the second random number set based on the secret sharing algorithm.
The model training method in the embodiment of the specification can jointly train a model based on a secret sharing algorithm with privacy protection, even when one of the multiple parties of the joint modeling is not allowed to be actively accessed.
The embodiment of the specification also provides a model training device based on the unidirectional network. The unidirectional network may include a first party, a second party, and a third party. In the unidirectional network, the first party and the third party cannot actively access the second party. Referring to fig. 7, the model training apparatus may be applied to a first party, and specifically may include the following units.
A receiving unit 71, configured to receive a first random number set sent by a third party;
a training unit 73, configured to jointly train a model with a second party according to the first random number set based on a secret sharing algorithm; in the process of jointly training the model, the first party receives an intermediate result sent by the second party when the secret is shared, and the fragment of the secret is determined according to the intermediate result and the first random number set.
The embodiment of the specification also provides another model training device based on the unidirectional network. The unidirectional network may include a first party, a second party, and a third party. In the unidirectional network, the first party and the third party cannot actively access the second party. Referring to fig. 8, the model training apparatus may be applied to a second party, and specifically may include the following units.
A sending unit 81, configured to send a first request to a third party;
a receiving unit 83, configured to receive a second random number set fed back by a third party;
a training unit 85, configured to jointly train a model with the first party according to the second random number set based on the secret sharing algorithm; in the process of jointly training the model, the second party sends a second request to the first party, receives an intermediate result fed back by the first party when the secret is shared, and determines the secret fragmentation according to the intermediate result and the second random number set.
The embodiment of the specification also provides another model training device based on the unidirectional network. The unidirectional network may include a first party, a second party, and a third party. In the unidirectional network, the first party and the third party cannot actively access the second party. Referring to fig. 9, the model training apparatus may be applied to a third party, and specifically may include the following units.
A receiving unit 91, configured to receive a first request sent by a second party;
a sending unit 93, configured to send, in response to the first request, a first random number set to a first party, and send a second random number set to a second party; so that the first party jointly trains the model according to the first random number set and the second party jointly trains the model according to the second random number set based on the secret sharing algorithm.
It should be noted that, in the present specification, each embodiment is described in a progressive manner, and the same or similar parts in each embodiment may be referred to each other, and each embodiment focuses on differences from other embodiments.
An embodiment of an electronic device of the present description is described below. Fig. 10 is a schematic diagram of a hardware configuration of the electronic apparatus in this embodiment. As shown in fig. 10, the electronic device may include one or more processors (only one of which is shown), memory, and a transmission module. Of course, it is understood by those skilled in the art that the hardware structure shown in fig. 10 is only an illustration, and does not limit the hardware structure of the electronic device. In practice the electronic device may also comprise more or fewer component elements than those shown in fig. 10; or have a different configuration than that shown in fig. 10.
The memory may comprise high-speed random access memory; alternatively, it may also comprise non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. Of course, the memory may also comprise remotely located network storage, which may be connected to the electronic device through a network such as the internet, an intranet, a local area network, or a mobile communications network. The memory may be used to store program instructions or modules of application software, which may be used to implement the model training method in the embodiments corresponding to FIG. 4, FIG. 5, or FIG. 6 of this specification.
The processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or a processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and the like. The processor may read and execute the program instructions or modules in the memory.
The transmission module may be used for data transmission via a network, such as the internet, an intranet, a local area network, or a mobile communication network.
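Because the first and third parties cannot actively access the second party, every exchange must be initiated by the second party over its own outbound connection. The following sketch illustrates that constraint with HTTP as an assumed transport; the endpoint path, payload keys, and use of the requests library are inventions for illustration, not prescribed by this specification.

```python
# Illustrative only: the unidirectional constraint means the second
# party opens every connection itself; peers merely answer.
import requests  # assumed to be available

def pull_second_random_number_set(third_party_url: str):
    # The "first request": the second party pulls its own random number
    # set; the third party separately pushes the first random number set
    # to the first party, which it *can* reach.
    resp = requests.post(f"{third_party_url}/first-request",
                         json={"from": "second_party"})
    resp.raise_for_status()
    return resp.json()["second_random_number_set"]
```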
This specification also provides an embodiment of a computer storage medium. The computer storage medium includes, but is not limited to, random access memory (RAM), read-only memory (ROM), cache, a hard disk drive (HDD), a memory card, and the like. The computer storage medium stores computer program instructions which, when executed, implement the model training method in the embodiments corresponding to fig. 4, fig. 5, or fig. 6 of this specification.
In the 1990s, an improvement in a technology could clearly be classified as an improvement in hardware (e.g., an improvement in a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement in a method flow). However, as technology has advanced, many of today's method-flow improvements can be regarded as direct improvements in hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement in a method flow cannot be realized with hardware entity modules. For example, a programmable logic device (PLD) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually making integrated circuit chips, such programming is now mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development, while the source code to be compiled must be written in a particular programming language called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can easily be obtained merely by slightly programming the method flow in one of the above hardware description languages into an integrated circuit.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
From the above description of the embodiments, it is clear to those skilled in the art that this specification can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of this specification, in essence or in the part contributing to the prior art, may be embodied in the form of a software product, which may be stored in a storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in some parts of the embodiments, of this specification.
The description is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
While this specification has been described by way of embodiments, those of ordinary skill in the art will appreciate that numerous variations and permutations are possible without departing from the spirit of this specification, and it is intended that the appended claims cover such variations and modifications.

Claims (14)

1. A model training method based on a unidirectional network, wherein the unidirectional network comprises a first party, a second party and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the method is applied to a first party and comprises the following steps:
receiving a first random number set sent by a third party;
jointly training a model with the second party according to the first random number set based on a secret sharing algorithm; in the process of jointly training the model, the first party receives an intermediate result sent by the second party during secret sharing, and determines a share of the secret according to the intermediate result and the first random number set.
2. The method of claim 1, the jointly training a model with a second party from a first set of random numbers, comprising:
jointly training the model with the second party using a gradient descent method or a Newton's method, based on the secret sharing algorithm.
3. The method of claim 1, wherein the first party holds feature data of a sample and the second party holds label data of the sample; alternatively, the first party holds label data of the sample and the second party holds feature data of the sample.
4. A model training method based on a unidirectional network, wherein the unidirectional network comprises a first party, a second party and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the method is applied to a second party and comprises the following steps:
sending a first request to a third party;
receiving a second random number set fed back by a third party;
based on a secret sharing algorithm, jointly training a model with the first party according to the second random number set; in the process of jointly training the model, the second party sends a second request to the first party, receives, during secret sharing, an intermediate result fed back by the first party, and determines a share of the secret according to the intermediate result and the second random number set.
5. The method of claim 4, the jointly training a model with a first party from a second set of random numbers, comprising:
jointly training the model with the first party using a gradient descent method or a Newton's method, based on the secret sharing algorithm.
6. The method of claim 4, wherein the first party holds feature data of a sample and the second party holds label data of the sample; alternatively, the first party holds label data of the sample and the second party holds feature data of the sample.
7. A model training method based on a unidirectional network, wherein the unidirectional network comprises a first party, a second party and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the method is applied to a third party and comprises the following steps:
receiving a first request sent by a second party;
in response to the first request, sending a first random number set to the first party and a second random number set to the second party, so that, based on the secret sharing algorithm, the first party jointly trains the model according to the first random number set and the second party jointly trains the model according to the second random number set.
8. The method of claim 7, the joint training model based on the secret sharing algorithm, comprising:
jointly training the model using a gradient descent method or a Newton's method, based on the secret sharing algorithm.
9. The method of claim 7, the joint training model based on the secret sharing algorithm, comprising:
the first party determines a first intermediate result during secret sharing;
the second party determines a second intermediate result during secret sharing;
the second party sends the second intermediate result and a second request to the first party;
the first party receives the second intermediate result, determines one share of the secret according to the second intermediate result and the first random number set, and feeds back the first intermediate result to the second party in response to the second request;
the second party receives the first intermediate result and determines the other share of the secret according to the first intermediate result and the second random number set.
10. The method of claim 7, wherein the first party holds feature data of a sample and the second party holds label data of the sample; alternatively, the first party holds label data of the sample and the second party holds feature data of the sample.
11. A model training device based on a unidirectional network, wherein the unidirectional network comprises a first party, a second party and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the apparatus, applied to a first party, comprises:
the receiving unit is used for receiving a first random number set sent by a third party;
the training unit is used for jointly training a model with the second party according to the first random number set based on a secret sharing algorithm; in the process of jointly training the model, the first party receives an intermediate result sent by the second party during secret sharing, and determines a share of the secret according to the intermediate result and the first random number set.
12. A model training device based on a unidirectional network, wherein the unidirectional network comprises a first party, a second party and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the apparatus, applied to a second party, comprises:
a sending unit, configured to send a first request to a third party;
the receiving unit is used for receiving a second random number set fed back by a third party;
the training unit is used for jointly training the model with the first party according to the second random number set based on the secret sharing algorithm; in the process of jointly training the model, the second party sends a second request to the first party, receives, during secret sharing, an intermediate result fed back by the first party, and determines a share of the secret according to the intermediate result and the second random number set.
13. A model training device based on a unidirectional network, wherein the unidirectional network comprises a first party, a second party and a third party; in the unidirectional network, a first party and a third party cannot actively access a second party; the device is applied to a third party and comprises:
a receiving unit, configured to receive a first request sent by a second party;
a sending unit, configured to send, in response to the first request, a first random number set to the first party and a second random number set to the second party, so that, based on the secret sharing algorithm, the first party jointly trains the model according to the first random number set and the second party jointly trains the model according to the second random number set.
14. A computing device, comprising:
at least one processor;
a memory storing program instructions configured for execution by the at least one processor, the program instructions comprising instructions for performing the method of any of claims 1-10.
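For readers who want to sanity-check the flow recited in claim 9, the following self-contained toy run (illustrative only, assuming additive secret sharing of Beaver triples over a prime field; all names are invented) deals a triple, performs the intermediate-result exchange, and verifies that the two shares recombine to the product.

```python
# End-to-end toy run of the claim-9 flow, in a single process for clarity.
import secrets

FIELD = 2**61 - 1

def split(v):
    """Additively share v between the two parties."""
    r = secrets.randbelow(FIELD)
    return r, (v - r) % FIELD

# Third party: one triple, split into the first and second random number sets.
a, b = secrets.randbelow(FIELD), secrets.randbelow(FIELD)
(a0, a1), (b0, b1), (c0, c1) = split(a), split(b), split((a * b) % FIELD)

# The parties hold additive shares of private operands x and y.
x, y = 12345, 67890
(x0, x1), (y0, y1) = split(x), split(y)

# First and second intermediate results (masked operand shares).
e0, f0 = (x0 - a0) % FIELD, (y0 - b0) % FIELD   # first party
e1, f1 = (x1 - a1) % FIELD, (y1 - b1) % FIELD   # second party

# The second request carries (e1, f1); the reply carries (e0, f0).
e, f = (e0 + e1) % FIELD, (f0 + f1) % FIELD

z0 = (e * b0 + f * a0 + c0) % FIELD              # first party's share
z1 = (e * f + e * b1 + f * a1 + c1) % FIELD      # second party's share

assert (z0 + z1) % FIELD == (x * y) % FIELD      # shares recombine to x*y
```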
CN202110133496.XA 2021-02-01 2021-02-01 Model training method and device based on unidirectional network and computing equipment Pending CN112507323A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110133496.XA CN112507323A (en) 2021-02-01 2021-02-01 Model training method and device based on unidirectional network and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110133496.XA CN112507323A (en) 2021-02-01 2021-02-01 Model training method and device based on unidirectional network and computing equipment

Publications (1)

Publication Number Publication Date
CN112507323A true CN112507323A (en) 2021-03-16

Family

ID=74952622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110133496.XA Pending CN112507323A (en) 2021-02-01 2021-02-01 Model training method and device based on unidirectional network and computing equipment

Country Status (1)

Country Link
CN (1) CN112507323A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101197715A (en) * 2007-12-29 2008-06-11 ***通信集团湖北有限公司 Method for centrally capturing mobile data service condition
CN110032893A (en) * 2019-03-12 2019-07-19 阿里巴巴集团控股有限公司 Security model prediction technique and device based on secret sharing
CN111861099A (en) * 2020-06-02 2020-10-30 光之树(北京)科技有限公司 Model evaluation method and device of federal learning model
CN111967035A (en) * 2020-10-23 2020-11-20 支付宝(杭州)信息技术有限公司 Model training method and device and electronic equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158232A (en) * 2021-03-26 2021-07-23 北京融数联智科技有限公司 Private data calculation method and device and computer equipment
CN114742233A (en) * 2022-04-02 2022-07-12 支付宝(杭州)信息技术有限公司 Method and device for joint training of logistic regression model

Similar Documents

Publication Publication Date Title
CN110457912B (en) Data processing method and device and electronic equipment
TWI682304B (en) Abnormal account prevention and control method, device and equipment based on graph structure model
CN110569227B (en) Model parameter determination method and device and electronic equipment
CN110555525B (en) Model parameter determination method and device and electronic equipment
CN110414567B (en) Data processing method and device and electronic equipment
CN110555315B (en) Model parameter updating method and device based on secret sharing algorithm and electronic equipment
CN110427969B (en) Data processing method and device and electronic equipment
CN110457936B (en) Data interaction method and device and electronic equipment
WO2021027259A1 (en) Method and apparatus for determining model parameters, and electronic device
CN110580410A (en) Model parameter determination method and device and electronic equipment
CN110580409A (en) model parameter determination method and device and electronic equipment
CN111144576A (en) Model training method and device and electronic equipment
WO2020233137A1 (en) Method and apparatus for determining value of loss function, and electronic device
CN111967035B (en) Model training method and device and electronic equipment
US10936960B1 (en) Determining model parameters using secret sharing
CN113221183A (en) Method, device and system for realizing privacy protection of multi-party collaborative update model
CN112507323A (en) Model training method and device based on unidirectional network and computing equipment
CN112511361B (en) Model training method and device and computing equipment
Kolokoltsov et al. Stochastic duality of Markov processes: a study via generators
CN113011459B (en) Model training method, device and computing equipment
CN113111254B (en) Training method, fitting method and device of recommendation model and electronic equipment
CN113011459A (en) Model training method and device and computing equipment
CN114363363B (en) Data storage method, device, equipment and medium based on multiple chains
CN115131161A (en) Opinion leader identification method, opinion leader identification device, electronic equipment and storage medium
CN117076715A (en) Risk assessment method, risk assessment device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20210316)