CN112183759A - Model training method, device and system - Google Patents

Model training method, device and system

Info

Publication number
CN112183759A
Authority
CN
China
Prior art keywords
training
initiator
model
cooperator
feature
Prior art date
Legal status
Granted
Application number
CN201910607608.3A
Other languages
Chinese (zh)
Other versions
CN112183759B (en)
Inventor
陈超超
李梁
王力
周俊
Current Assignee
Advanced New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd filed Critical Advanced New Technologies Co Ltd
Priority to CN201910607608.3A
Publication of CN112183759A
Application granted
Publication of CN112183759B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides methods and apparatus for training logistic regression models. In the method, at a training initiator, a feature sample set and a set of label values are divided into a first number of feature sample subsets and a first number of partial label values, respectively, and a second number of the feature sample subsets and of the partial label values are each sent to the corresponding training cooperator. Then, at each training participant, secret-shared matrix multiplication is used to obtain the matrix product of the logistic regression model and that training participant's feature sample subset. Each training participant determines a respective predicted value and prediction difference, and determines a respective model update amount based on the feature sample set and its prediction difference. Then, at each training participant, the respective sub-model is updated based on the respective current sub-model and the corresponding model update amount. The above process is executed cyclically until a loop-end condition is satisfied.

Description

Model training method, device and system
Technical Field
The present disclosure relates generally to the field of machine learning, and more particularly, to a method, apparatus, and system for collaborative training of logistic regression models via multiple training participants using a horizontally-segmented training set.
Background
Logistic regression models are widely used regression/classification models in the field of machine learning. In many cases, multiple model training participants (e.g., an e-commerce company, a courier company, and a bank) each possess different portions of the data of the feature samples used to train a logistic regression model. These model training participants generally want to use each other's data jointly to train a logistic regression model, but do not want to provide their respective data to the other model training participants, in order to prevent their own data from being leaked.
In view of this situation, machine learning methods capable of protecting data security have been proposed, which can train a logistic regression model for use by the multiple model training participants in cooperation with those participants, while ensuring the data security of each of them. However, the model training efficiency of existing machine learning methods capable of protecting data security is low.
Disclosure of Invention
In view of the above, the present disclosure provides a method, an apparatus, and a system for collaborative training of a logistic regression model via a plurality of training participants, which can improve the efficiency of model training while ensuring the security of respective data of the plurality of training participants.
According to an aspect of the present disclosure, there is provided a method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model comprising a first number of sub-models, each training participant having one sub-model, the first number being equal to the number of training participants, the training participants comprising a training initiator and a second number of training cooperators, training sample data of the training initiator having a set of feature samples and a set of label values, the training sample data being obtained by horizontal segmentation, the second number being equal to the first number minus one, the method being performed by the training initiator, the method comprising: executing the following loop process until a loop-end condition is satisfied: dividing the feature sample set into the first number of feature sample subsets, and respectively sending each of a second number of the feature sample subsets to the corresponding training cooperator; obtaining a matrix product between the logistic regression model and the feature sample subset of the training initiator using secret-shared matrix multiplication; dividing the label values into the first number of partial label values, and respectively sending each of a second number of the partial label values to the corresponding training cooperator; determining a current predicted value at the training initiator based on the matrix product at the training initiator; determining a prediction difference between the current predicted value of the training initiator and the corresponding partial label values; determining a model update amount at the training initiator based on the feature sample set and the prediction difference at the training initiator; and updating the sub-model of the training initiator based on the current sub-model of the training initiator and the corresponding model update amount, wherein, when the loop process has not ended, the updated sub-models of the respective training participants are used as the current sub-models of the next loop process.
According to another aspect of the present disclosure, there is provided a method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model comprising a first number of sub-models, each training participant having one sub-model, the first number being equal to the number of training participants, the training participants comprising a training initiator and a second number of training cooperators, training sample data of the training initiator having a set of feature samples and a set of label values, the training sample data being obtained by horizontal segmentation, the second number being equal to the first number minus one, the method being performed by a training cooperator, the method comprising: executing the following loop process until a loop-end condition is satisfied: receiving a corresponding feature sample subset from the training initiator, the feature sample subset being one of the first number of feature sample subsets resulting from segmenting the feature sample set at the training initiator; obtaining a matrix product between the logistic regression model and the feature sample subset of the training cooperator using secret-shared matrix multiplication; receiving corresponding partial label values from the training initiator, the partial label values being one of the first number of partial label values resulting from segmenting the label values at the training initiator; determining a current predicted value at the training cooperator based on the matrix product at the training cooperator; determining a prediction difference at the training cooperator using the current predicted value of the training cooperator and the received partial label values; obtaining a model update amount of the training cooperator using secret-shared matrix multiplication, based on the feature sample set and the prediction difference of the training cooperator; and updating the sub-model of the training cooperator based on the current sub-model of the training cooperator and the corresponding model update amount, wherein, when the loop process has not ended, the updated sub-models of the respective training participants are used as the current sub-models of the next loop process.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model comprising a first number of sub-models, each training participant having one sub-model, the first number being equal to the number of training participants, the training participants comprising a training initiator and a second number of training cooperators, training sample data of the training initiator having a set of feature samples and a set of label values, the training sample data being obtained by horizontal segmentation, the second number being equal to the first number minus one, the apparatus being located on the training initiator side, the apparatus comprising: a sample division unit configured to divide the feature sample set into the first number of feature sample subsets; a sample sending unit configured to respectively send each of a second number of the feature sample subsets to the corresponding training cooperator; a matrix product obtaining unit configured to obtain a matrix product between the logistic regression model and the feature sample subset of the training initiator using secret-shared matrix multiplication; a label value division unit configured to divide the label values into the first number of partial label values; a label value sending unit configured to respectively send each of a second number of the partial label values to the corresponding training cooperator; a predicted value determination unit configured to determine a current predicted value at the training initiator based on the matrix product at the training initiator; a prediction difference determination unit configured to determine a prediction difference between the current predicted value of the training initiator and the corresponding partial label values; a model update amount determination unit configured to determine a model update amount at the training initiator based on the feature sample set and the prediction difference at the training initiator; and a model update unit configured to update the sub-model of the training initiator based on the current sub-model of the training initiator and the corresponding model update amount, wherein the sample division unit, the sample sending unit, the matrix product obtaining unit, the label value division unit, the label value sending unit, the predicted value determination unit, the prediction difference determination unit, the model update amount determination unit, and the model update unit are configured to perform their operations cyclically until a loop-end condition is satisfied, and wherein, when the loop process has not ended, the updated sub-models of the respective training participants are used as the current sub-models of the next loop process.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model comprising a first number of sub-models, each training participant having one sub-model, the first number being equal to the number of training participants, the training participants comprising a training initiator and a second number of training cooperators, training sample data of the training initiator having a feature sample set and label values, the training sample data being obtained by horizontal segmentation, the second number being equal to the first number minus one, the apparatus being located on a training cooperator side, the apparatus comprising: a sample receiving unit configured to receive a corresponding feature sample subset from the training initiator, the feature sample subset being one of the first number of feature sample subsets resulting from segmenting the feature sample set at the training initiator; a matrix product obtaining unit configured to obtain a matrix product between the logistic regression model and the feature sample subset of the training cooperator using secret-shared matrix multiplication; a label value receiving unit configured to receive corresponding partial label values from the training initiator, the partial label values being one of the first number of partial label values resulting from segmenting the label values at the training initiator; a predicted value determination unit configured to determine a current predicted value at the training cooperator based on the matrix product at the training cooperator; a prediction difference determination unit configured to determine a prediction difference at the training cooperator using the current predicted value of the training cooperator and the received partial label values; a model update amount determination unit configured to obtain a model update amount of the training cooperator using secret-shared matrix multiplication, based on the feature sample set and the prediction difference of the training cooperator; and a model update unit configured to update the sub-model of the training cooperator based on the current sub-model of the training cooperator and the corresponding model update amount, wherein the sample receiving unit, the matrix product obtaining unit, the label value receiving unit, the predicted value determination unit, the prediction difference determination unit, the model update amount determination unit, and the model update unit are configured to perform their operations cyclically until a loop-end condition is satisfied, and wherein, when the loop process has not ended, the updated sub-models of the respective training participants are used as the current sub-models of the next loop process.
According to another aspect of the present disclosure, there is provided a system for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model comprising a first number of sub-models, the system comprising: a training initiator device comprising the apparatus for performing training on the training initiator side as described above; and a second number of training cooperator devices, each training cooperator device comprising the apparatus for performing training on the training cooperator side as described above, wherein the first number is equal to the number of training participants, each training participant has one sub-model, training sample data of the training initiator has a set of feature samples and label values, the training sample data is obtained by horizontal segmentation, and the second number is equal to the first number minus one.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a training method performed on a training initiator side as described above.
According to another aspect of the present disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause at least one processor to perform the training method performed on the training initiator side as described above.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a training method performed on a training cooperator side as described above.
According to another aspect of the present disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause at least one processor to perform the training method performed on the training cooperator side as described above.
By using the solution of the embodiments of the present disclosure, the model parameters of the logistic regression model can be obtained by training without leaking the secret data of the training participants, and the workload of model training grows only linearly, rather than exponentially, with the number of feature samples used for training. Therefore, compared with the prior art, the solution of the embodiments of the present disclosure can improve the efficiency of model training while ensuring the security of each training participant's data.
Drawings
A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.
FIG. 1 shows a schematic diagram of an example of horizontally sliced data according to an embodiment of the present disclosure;
FIG. 2 illustrates an architectural diagram showing a system for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 3 illustrates a flow diagram of a method for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 4 shows a flowchart of a process of performing trusted-initializer secret-shared matrix multiplication on the current sub-models of the respective training participants and the feature sample set of the training initiator, according to an embodiment of the disclosure;
FIG. 5 shows a flowchart of a process of performing untrusted-initializer secret-shared matrix multiplication on the current sub-models of the respective training participants and the feature sample set of the training initiator, in accordance with an embodiment of the disclosure;
FIG. 6 shows a flowchart of one example of untrusted-initializer secret-shared matrix multiplication according to an embodiment of the present disclosure;
FIG. 7 illustrates a block diagram of an apparatus for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 8 illustrates a block diagram of an apparatus for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 9 illustrates a schematic diagram of a computing device for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 10 illustrates a schematic diagram of a computing device for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and thereby implement the subject matter described herein, and are not intended to limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as needed. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "include" and its variants are open-ended terms in the sense of "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. The definition of a term is consistent throughout the specification unless the context clearly dictates otherwise.
The secret sharing method is a cryptographic technique for storing a secret in a split manner. It divides a secret into a plurality of secret shares in an appropriate manner; each secret share is owned and managed by one of a plurality of parties; a single party cannot recover the complete secret, and only when the plurality of parties cooperate can the complete secret be recovered. The secret sharing method aims to prevent the secret from being too concentrated, so as to disperse risk and tolerate intrusion.
Secret sharing methods can be roughly divided into two categories: trusted-initializer secret sharing methods and untrusted-initializer secret sharing methods. In a trusted-initializer secret sharing method, the trusted initializer is required to perform parameter initialization (often generating random numbers that meet certain conditions) for each participant in the multi-party secure computation. After initialization is completed, the trusted initializer destroys the data and exits; it is not needed in the subsequent multi-party secure computation process.
Trusted-initializer secret-shared matrix multiplication is applicable to the following situation: the complete secret data is the product of a first set of secret shares and a second set of secret shares, and each participant has one of the first set of secret shares and one of the second set of secret shares. Through trusted-initializer secret-shared matrix multiplication, each of the multiple participants obtains a partial value of the complete secret data, the sum of the partial values obtained by the participants is the complete secret data, and each participant discloses its partial value to the remaining participants. In this way, every participant can obtain the complete secret data without disclosing the secret shares it owns, thereby ensuring the security of each participant's data.
Untrusted-initializer secret-shared matrix multiplication is another of the secret sharing methods. It is applicable to the case where the complete secret is the product of a first secret share and a second secret share, owned by two parties respectively. Through untrusted-initializer secret-shared matrix multiplication, each of the two parties generates and discloses data that differs from the secret share it owns, but the sum of the data the two parties disclose is equal to the product of their secret shares (i.e., the complete secret). Therefore, the two parties can recover the complete secret through cooperative computation without a trusted initializer and without disclosing the secret shares they own, which guarantees the data security of both parties.
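To make the additive-sharing idea concrete, here is a minimal NumPy sketch (ours, not from the patent; the names split_shares and recover are illustrative): a secret array is split into shares that individually look random but sum back to the secret.

    import numpy as np

    def split_shares(secret, n_parties, rng=None):
        # Split a secret array into n additive shares that sum to the secret.
        rng = rng or np.random.default_rng()
        shares = [rng.normal(size=secret.shape) for _ in range(n_parties - 1)]
        shares.append(secret - sum(shares))  # last share makes the sum exact
        return shares

    def recover(shares):
        # Summing ALL shares reveals the secret; one share alone does not determine it.
        return sum(shares)

    secret = np.array([1.0, 2.0, 3.0])
    shares = split_shares(secret, 3)
    assert np.allclose(recover(shares), secret)

The same splitting pattern is what the training initiator applies below to its feature sample set and label values.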
In the present disclosure, the training sample set used in the logistic regression model training scheme is a horizontally partitioned training sample set. The term "horizontally partitioning the training sample set" refers to dividing the training samples in the training sample set into a plurality of training sample subsets according to some rule on some field. Each training sample subset contains a part of the training samples, and every training sample it contains is a complete training sample, i.e., it includes all of that sample's field data and the corresponding label value. In the present disclosure, assuming there are three data parties Alice, Bob, and Charlie, local samples are collected at each data party to form a local sample set, where every sample in the local sample set is a complete sample. The local sample sets obtained by the three data parties Alice, Bob, and Charlie then together constitute the training sample set for training the logistic regression model, and each local sample set serves as a training sample subset of that training sample set.
Suppose a sample x of attribute values described by d attributes (also called features) is given, x^T = (x_1; x_2; …; x_d), where x_i is the value of x on the i-th attribute and T denotes transposition. The logistic regression model is Y = 1/(1 + e^(-Wx)), where Y is the predicted value and W is the model parameter of the logistic regression model (i.e., the model described in this disclosure), with

W = Σ_P W_P,

where W_P refers to the sub-model at each training participant P in the present disclosure. In this disclosure, attribute value samples are also referred to as feature data samples.
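As a concrete reading of the notation above, the following small NumPy sketch (ours, not the patent's) computes a predicted value from the summed sub-models:

    import numpy as np

    def predict(sub_models, x):
        # Y = 1 / (1 + exp(-W x)), with W the sum of per-participant sub-models W_P
        W = sum(sub_models)                  # e.g., W = W_A + W_B + W_C
        return 1.0 / (1.0 + np.exp(-W @ x))

    rng = np.random.default_rng(0)
    W_A, W_B, W_C = rng.normal(size=(3, 4))  # three sub-models for d = 4 features
    x = np.array([0.5, -1.2, 0.3, 2.0])      # one attribute-value sample
    y_hat = predict([W_A, W_B, W_C], x)      # a value in (0, 1)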
In the present disclosure, each training participant owns a different portion of the data of the training samples used to train the logistic regression model. For example, taking two training participants as an example, assume that the training sample set includes 100 training samples, each of which contains a plurality of feature values and a label value; the data owned by the first participant may then be the first 30 training samples in the training sample set, and the data owned by the second participant may be the last 70 training samples.
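Under the 30/70 split just described, the horizontal partition could look like the following sketch (array names are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    X_all = rng.normal(size=(100, 5))      # 100 complete samples, 5 features each
    y_all = rng.integers(0, 2, size=100)   # a label value for every sample

    # Horizontal partition: each participant holds complete rows (samples).
    X_first, y_first = X_all[:30], y_all[:30]    # first participant: 30 samples
    X_second, y_second = X_all[30:], y_all[30:]  # second participant: 70 samples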
For any matrix multiplication described in this disclosure, it must be determined, as the case may be, whether one or more of the matrices participating in the multiplication should be transposed, so that the matrix multiplication rule is satisfied and the computation can be completed.
Embodiments of a method, apparatus, and system for collaborative training of a logistic regression model via multiple training participants according to the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of an example of a horizontally partitioned training sample set according to an embodiment of the present disclosure. In fig. 1, 2 data parties Alice and Bob are shown; the case of more data parties is similar. Each training sample in the training sample subset owned by each of the data parties Alice and Bob is complete, i.e., each training sample includes complete feature data (x) and label data (y). For example, Alice possesses a complete training sample (x0, y0).
Fig. 2 shows an architectural diagram illustrating a system 1 for collaborative training of a logistic regression model via multiple training participants (hereinafter referred to as model training system 1) according to an embodiment of the present disclosure.
As shown in fig. 2, the model training system 1 includes a training initiator device 10 and at least one training cooperator device 20. In fig. 2, 2 training cooperator devices 20 are shown. In other embodiments of the present disclosure, one training cooperator device 20 or more than 2 training cooperator devices 20 may be included. The training initiator device 10 and the at least one training cooperator device 20 may communicate with each other via a network 30, such as, but not limited to, the internet or a local area network. In the present disclosure, the training initiator device 10 and the at least one training cooperator device 20 are collectively referred to as training participant devices.
In the present disclosure, the trained logistic regression model is partitioned into a first number of sub-models. Here, the first number is equal to the number of training participant devices participating in model training. Assume the number of training participant devices is N; the logistic regression model is accordingly partitioned into N sub-models, one for each training participant device. The training sample set used for model training is located at the training initiator device 10; it is a horizontally partitioned training sample set as described above, and it includes a feature data set and corresponding label values, i.e., x0 and y0 shown in fig. 1. The sub-model and the corresponding training samples owned by each training participant are a secret of that participant, and cannot be learned, or cannot be completely learned, by the other training participants.
In the present disclosure, the training initiator device 10 and the at least one training cooperator device 20 together use a set of training samples at the training initiator device 10 and respective sub-models to cooperatively train a logistic regression model. The specific training process for the model will be described in detail below with reference to fig. 3 to 6.
In the present disclosure, the training initiator device 10 and the training cooperator device 20 may be any suitable computing device having computing capabilities. The computing devices include, but are not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, Personal Digital Assistants (PDAs), handheld devices, messaging devices, wearable computing devices, consumer electronics, and so forth.
FIG. 3 illustrates a flow diagram of a method for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure. In fig. 3, a training initiator Alice and 2 training cooperators Bob and Charlie are taken as an example for illustration. The training initiator Alice has a feature sample set X and label values Y, and the sub-models of the training initiator Alice and the training cooperators Bob and Charlie are W_A, W_B and W_C, respectively, with the logistic regression model W = W_A + W_B + W_C.
As shown in FIG. 3, first, at block 301, the training initiator Alice and the training cooperators Bob and Charlie initialize the parameters of their sub-models, i.e., the weight sub-vectors W_A, W_B and W_C, to obtain initial values of the sub-model parameters, and initialize the number of executed training cycles t to zero. Here, it is assumed that the end condition of the loop process is that a predetermined number of training cycles has been performed, for example, T training cycles.
After initialization as above, at block 302, the training initiator Alice performs segmentation processing on the feature sample set X to obtain feature sample subsets X_A, X_B and X_C. For example, assume that the feature sample set X includes two feature samples S1 and S2, each of which contains 3 attribute values, where S1 = [a_1^(1), a_2^(1), a_3^(1)] and S2 = [a_1^(2), a_2^(2), a_3^(2)]. After the feature sample set X is divided into 3 feature sample subsets X_A, X_B and X_C, the first feature sample subset X_A comprises the feature sub-samples [a_11^(1), a_21^(1), a_31^(1)] and [a_11^(2), a_21^(2), a_31^(2)], the second feature sample subset X_B comprises the feature sub-samples [a_12^(1), a_22^(1), a_32^(1)] and [a_12^(2), a_22^(2), a_32^(2)], and the third feature sample subset X_C comprises the feature sub-samples [a_13^(1), a_23^(1), a_33^(1)] and [a_13^(2), a_23^(2), a_33^(2)], where a_11^(1) + a_12^(1) + a_13^(1) = a_1^(1), a_21^(1) + a_22^(1) + a_23^(1) = a_2^(1), a_31^(1) + a_32^(1) + a_33^(1) = a_3^(1), a_11^(2) + a_12^(2) + a_13^(2) = a_1^(2), a_21^(2) + a_22^(2) + a_23^(2) = a_2^(2), and a_31^(2) + a_32^(2) + a_33^(2) = a_3^(2).
Then, at block 303, the training initiator Alice sends the feature sample subset X_B to the training cooperator Bob and the feature sample subset X_C to the training cooperator Charlie, while Alice keeps the feature sample subset X_A as its own feature sample subset.
Next, at block 304, the matrix products between the logistic regression model W and the feature sample subsets X_A, X_B and X_C of the respective training participants are obtained using secret-shared matrix multiplication, i.e., the matrix product W·X_A at Alice, the matrix product W·X_B at Bob, and the matrix product W·X_C at Charlie.
In one example of the disclosure, using secret-shared matrix multiplication to obtain the matrix products between the logistic regression model W and the feature sample subsets X_A, X_B and X_C of the respective training participants may include: obtaining these matrix products using trusted-initializer secret-shared matrix multiplication. How trusted-initializer secret-shared matrix multiplication is used to obtain the current predicted values at the respective training participants will be explained below with reference to fig. 4.
In another example of the present disclosure, using secret-shared matrix multiplication to obtain the matrix products between the logistic regression model W and the feature sample subsets X_A, X_B and X_C of the respective training participants may include: obtaining these matrix products using untrusted-initializer secret-shared matrix multiplication. How untrusted-initializer secret-shared matrix multiplication is used to obtain the current predicted values at the respective training participants will be explained below with reference to figs. 5-6.
After the matrix product of each training participant is obtained as described above, at block 305, the label values Y are segmented at the training initiator Alice to obtain 3 partial label values Y_A, Y_B and Y_C. The segmentation process for the label values Y is the same as the segmentation process described above for the feature sample set X and is not repeated here.
Next, at block 306, the training initiator Alice sends the partial label values Y_B to the training cooperator Bob and the partial label values Y_C to the training cooperator Charlie, while Alice keeps the partial label values Y_A as its own partial label values.
Then, at block 307, at each training participant, the current predicted value at that training participant is determined based on its matrix product. For example, the current predicted value at each training participant may be obtained using the formula

Ŷ_i = 1/(1 + e^(-W·X_i)),

where Ŷ_i is the predicted value at training participant i, W = W_A + W_B + W_C is the logistic regression model, and X_i is the feature sample subset at training participant i.
In addition, the formula Ŷ_i = 1/(1 + e^(-W·X_i)) can be expanded using the Taylor formula, that is,

Ŷ_i ≈ 1/2 + (W·X_i)/4 - (W·X_i)^3/48 + …,

so that, based on the Taylor expansion, the matrix product W·X_i of each training participant can be used to calculate that participant's current predicted value. The order to which the Taylor expansion is carried can be determined based on the accuracy required by the application scenario.
At block 308, at each training participant, a prediction difference at that participant is determined based on the participant's current predicted value and its partial label values, i.e., the prediction difference e_A = Ŷ_A - Y_A at Alice, the prediction difference e_B = Ŷ_B - Y_B at Bob, and the prediction difference e_C = Ŷ_C - Y_C at Charlie, where e is a column vector, Y is a column vector representing the label values of the training samples X, and Ŷ is a column vector representing the current predicted values for the training samples X. If the training sample X contains only a single training sample, then e, Y and Ŷ are column vectors having only a single element. If the training sample X contains multiple training samples, then e, Y and Ŷ are column vectors having multiple elements, where each element in Ŷ is the current predicted value of the corresponding training sample, each element in Y is the label value of the corresponding training sample, and each element in e is the difference between the current predicted value and the label value of the corresponding training sample. It is to be noted that, in the above description, e_A, e_B and e_C are collectively referred to as e, and Y_A, Y_B and Y_C are collectively referred to as Y.
Then, at block 309, the model update amounts TMP_A, TMP_B and TMP_C of the respective training participants are determined based on the feature sample set X of the training initiator Alice and the prediction differences e_A, e_B and e_C of the respective training participants. Specifically, the model update amount at Alice is TMP_A = X·e_A, the model update amount at Bob is TMP_B = X·e_B, and the model update amount at Charlie is TMP_C = X·e_C. Here, the model update amounts TMP_B at Bob and TMP_C at Charlie are obtained using secret-shared matrix multiplication.
Next, at block 310, at each training participant, the sub-model at that participant is updated based on the participant's current sub-model and the corresponding model update amount. For example, the training initiator Alice uses the current sub-model W_A and the corresponding model update amount TMP_A to update its sub-model, the training cooperator Bob uses the current sub-model W_B and the corresponding model update amount TMP_B to update its sub-model, and the training cooperator Charlie uses the current sub-model W_C and the corresponding model update amount TMP_C to update its sub-model.
In one example of the present disclosure, updating the current sub-model at a training participant based on that participant's current sub-model and the corresponding model update amount may be performed according to the equation W_(n+1) = W_n - α·TMP_i = W_n - α·X·e_i, where W_(n+1) represents the updated sub-model at the training participant, W_n represents the current sub-model at the training participant, α represents the learning rate, X represents the feature sample set, and e_i represents the prediction difference at the training participant. When the training participant is the training initiator, the updated sub-model can be calculated locally at the training initiator. When the training participant is a training cooperator, X·e_i is obtained at the training cooperator using secret-shared matrix multiplication, which may be performed using a process similar to that shown in fig. 4 or figs. 5-6, except that X here corresponds to W in fig. 4, and e_i here corresponds to X in fig. 4. It is to be noted here that, when X is a single feature sample, X is a feature vector (a column vector or a row vector) composed of multiple attributes, and e_i is a single prediction difference. When X consists of multiple feature samples, X is a feature matrix in which the attributes of each feature sample constitute one column/one row of the feature matrix X, and e_i is a prediction difference vector. When calculating X·e_i, each element of e_i is multiplied with the row of the matrix X whose elements are the values of one feature across the samples. For example, if e_i is a column vector, each multiplication multiplies e_i with one row of the matrix X, the elements of that row representing the values of one feature for each of the samples.
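As a plain, non-secret-shared sketch of this update rule (ours; shapes follow the description above, with one row of X per feature and one column per sample):

    import numpy as np

    def update_submodel(W_n, X, e_i, alpha=0.01):
        # W_{n+1} = W_n - alpha * X @ e_i
        #   W_n : (d,)   current sub-model weights
        #   X   : (d, m) feature matrix, row j = values of feature j over m samples
        #   e_i : (m,)   prediction differences for the m samples
        return W_n - alpha * (X @ e_i)

    rng = np.random.default_rng(2)
    W_n = rng.normal(size=4)
    X = rng.normal(size=(4, 8))   # 4 features, 8 samples
    e = rng.normal(size=8)
    W_next = update_submodel(W_n, X, e)

At a training cooperator, the product X @ e_i in this sketch would instead be produced by the secret-shared matrix multiplication, as described above.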
After the respective sub-model updates are completed at the training participants as described above, at block 311 it is determined whether the predetermined number of cycles has been reached, i.e., whether the loop-end condition has been reached. If the predetermined number of cycles has been reached, each training participant stores the current updated values of its sub-model parameters as the final values of those parameters, thereby obtaining its trained sub-model, and the process ends. If the predetermined number of cycles has not been reached, the flow returns to the operation of block 302 to perform the next training cycle, in which the updated sub-model obtained by each training participant in the current cycle is used as the current sub-model for the next cycle.
It is to be noted here that, in the above example, the end condition of the training loop process is that the predetermined number of cycles has been reached. In other examples of the disclosure, the end condition of the training loop process may also be that the determined prediction differences are within a predetermined range, i.e., every element e_i of the prediction differences e_A, e_B and e_C is within a predetermined range, for example, every element e_i of the prediction difference e is less than a predetermined threshold. Accordingly, the operations of block 311 in fig. 3 may be performed after the operations of block 307.
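A sketch of such an end-of-loop test (ours; the threshold parameter is illustrative):

    import numpy as np

    def loop_should_end(t, T, e_parts, tol=None):
        # End after T cycles or, alternatively, once every element of the
        # prediction differences e_A, e_B, e_C is below a predetermined threshold.
        if t >= T:
            return True
        if tol is not None:
            return all(np.all(np.abs(e) < tol) for e in e_parts)
        return False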
Figure 4 shows a flow diagram of one example of the trusted-initializer secret-shared matrix multiplication process. In fig. 4, the calculation of the current predicted value at the training initiator Alice is taken as an example. The current predicted value calculation processes for the training participants Bob and Charlie are similar to that of Alice; it is only necessary to treat Bob or Charlie, respectively, as the training initiator. When trusted-initializer secret-shared matrix multiplication is used, the model training system 1 shown in fig. 2 further comprises a trusted initializer device 30.
As shown in fig. 4, first, at the trusted initializer 30, a first number of random weight vectors, a first number of random feature matrices, and a first number of random label value vectors are generated, such that the product of the sum of the first number of random weight vectors and the sum of the first number of random feature matrices is equal to the sum of the first number of random label value vectors. Here, the first number is equal to the number of training participants.
For example, as shown in FIG. 4, the trusted initializer 30 generates 3 random weight vectors W_{R,1}, W_{R,2} and W_{R,3}, 3 random feature matrices X_{R,1}, X_{R,2} and X_{R,3}, and 3 random label value vectors Y_{R,1}, Y_{R,2} and Y_{R,3}, where

(W_{R,1} + W_{R,2} + W_{R,3})·(X_{R,1} + X_{R,2} + X_{R,3}) = Y_{R,1} + Y_{R,2} + Y_{R,3}.

Here, the dimension of each random weight vector is the same as the dimension of each model training participant's sub-model, the dimension of each random feature matrix is the same as the dimension of the training sample set, and the dimension of each random label value vector is the same as the dimension of the label value vector.
Then, at block 401, the trusted initializer 30 sends the generated W_{R,1}, X_{R,1} and Y_{R,1} to the training initiator Alice; at block 402, sends the generated W_{R,2}, X_{R,2} and Y_{R,2} to the training cooperator Bob; and, at block 403, sends the generated W_{R,3}, X_{R,3} and Y_{R,3} to the training cooperator Charlie.
Next, at block 404, at the training initiator Alice, the feature sample subset X_A (hereinafter referred to as the feature matrix X_A) is divided into a first number of feature sub-matrices, e.g., into 3 feature sub-matrices X_{A1}, X_{A2} and X_{A3} as shown in fig. 4.
Then, the training initiator Alice sends each of a second number of the divided feature sub-matrices to the corresponding training cooperator, where the second number is equal to the first number minus one. For example, at blocks 405 and 406, the 2 feature sub-matrices X_{A2} and X_{A3} are sent to the training cooperators Bob and Charlie, respectively.
Then, at each training participant, a weight sub-vector difference E and a feature sub-matrix difference D at that participant are determined based on the participant's weight sub-vector, the corresponding feature sub-matrix, and the received random weight vector and random feature matrix. For example, at block 407, the training initiator Alice determines its weight sub-vector difference E1 = W_A - W_{R,1} and its feature sub-matrix difference D1 = X_{A1} - X_{R,1}. At block 408, the training cooperator Bob determines its weight sub-vector difference E2 = W_B - W_{R,2} and its feature sub-matrix difference D2 = X_{A2} - X_{R,2}. At block 409, the training cooperator Charlie determines its weight sub-vector difference E3 = W_C - W_{R,3} and its feature sub-matrix difference D3 = X_{A3} - X_{R,3}.
After determining its weight sub-vector difference E_i and feature sub-matrix difference D_i, each training participant discloses them to the remaining training participants. For example, at blocks 410 and 411, the training initiator Alice sends D1 and E1 to the training cooperators Bob and Charlie, respectively. At blocks 412 and 413, the training cooperator Bob sends D2 and E2 to the training initiator Alice and the training cooperator Charlie, respectively. At blocks 414 and 415, the training cooperator Charlie sends D3 and E3 to the training initiator Alice and the training cooperator Bob, respectively.
Then, at block 416, at each training participant, the weight sub-vector differences and the feature sub-matrix differences of all training participants are summed to obtain the total weight sub-vector difference E and the total feature sub-matrix difference D, respectively. For example, as shown in fig. 4, D = D1 + D2 + D3 and E = E1 + E2 + E3.
Then, at each training participant, the corresponding predicted value vector Z_i is calculated based on the received random weight vector W_{R,i}, random feature matrix X_{R,i} and random label value vector Y_{R,i}, together with the total weight sub-vector difference E and the total feature sub-matrix difference D.
In one example of the present disclosure, at each training participant, the participant's random label value vector, the product of the total weight sub-vector difference and the participant's random feature matrix, and the product of the total feature sub-matrix difference and the participant's random weight vector may be summed to obtain the corresponding predicted value vector (the first calculation). Alternatively, the participant's random label value vector, the product of the total weight sub-vector difference and the participant's random feature matrix, the product of the total feature sub-matrix difference and the participant's random weight vector, and the product of the total weight sub-vector difference and the total feature sub-matrix difference may be summed to obtain the corresponding predicted value vector (the second calculation).
It should be noted here that, among the predicted value vector calculations at the training participants, exactly one participant's predicted value vector includes the product of the total weight sub-vector difference and the total feature sub-matrix difference. In other words, among all training participants, exactly one calculates its predicted value vector using the second calculation, while the remaining participants calculate their corresponding predicted value vectors using the first calculation.
For example, at block 417, the training initiator Alice calculates the corresponding predicted value vector Z1 = Y_{R,1} + E·X_{R,1} + D·W_{R,1} + D·E. At block 418, the training cooperator Bob calculates the corresponding predicted value vector Z2 = Y_{R,2} + E·X_{R,2} + D·W_{R,2}. At block 419, the training cooperator Charlie calculates the corresponding predicted value vector Z3 = Y_{R,3} + E·X_{R,3} + D·W_{R,3}.
It is noted here that, in fig. 4, D·E is shown as contained in Z1 calculated at the training initiator Alice. In other examples of the present disclosure, D·E may instead be included in the Z_i calculated by either of the training cooperators Bob and Charlie, in which case D·E is not included in Z1 calculated at the training initiator Alice. In other words, exactly one of the Z_i calculated at the training participants contains D·E.
Each training participant then discloses its calculated predicted value vector to the remaining training participants. For example, at blocks 420 and 421, the training initiator Alice sends the predicted value vector Z1 to the training cooperators Bob and Charlie, respectively. At blocks 422 and 423, the training cooperator Bob sends the predicted value vector Z2 to the training initiator Alice and the training cooperator Charlie, respectively. At blocks 424 and 425, the training cooperator Charlie sends the predicted value vector Z3 to the training initiator Alice and the training cooperator Bob, respectively.
Then, at blocks 426, 427, and 428, each training participant sums the predicted value vectors of the respective training participants, Z = Z1 + Z2 + Z3, to obtain the current predicted value of the logistic regression model for the feature sample set.
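The following minimal NumPy simulation of the fig. 4 protocol (ours; all parties run in one process, so the messages of blocks 401-425 become variable passing) checks that Z1 + Z2 + Z3 equals the matrix product W·X_A even though no party reveals its own W_i or X_{Ai}. The product order W_{R,i}·D is chosen so the dimensions conform, per the transposition note earlier.

    import numpy as np

    rng = np.random.default_rng(3)
    d, m = 4, 6                                            # features, samples

    # Private inputs: sub-models W_A, W_B, W_C and Alice's split of X_A.
    W_parts = [rng.normal(size=d) for _ in range(3)]
    X_parts = [rng.normal(size=(d, m)) for _ in range(3)]  # X_A1, X_A2, X_A3
    W, X_A = sum(W_parts), sum(X_parts)

    # Trusted initializer: random W_R,i and X_R,i, plus Y_R,i summing to
    # (sum of W_R,i) @ (sum of X_R,i), as required.
    W_R = [rng.normal(size=d) for _ in range(3)]
    X_R = [rng.normal(size=(d, m)) for _ in range(3)]
    Y_total = sum(W_R) @ sum(X_R)
    Y_R = [rng.normal(size=m) for _ in range(2)]
    Y_R.append(Y_total - Y_R[0] - Y_R[1])

    # Blocks 407-416: only the differences E_i, D_i are published and summed.
    E = sum(W_parts[i] - W_R[i] for i in range(3))
    D = sum(X_parts[i] - X_R[i] for i in range(3))

    # Blocks 417-419: exactly one party (here Alice) adds the E @ D term.
    Z1 = Y_R[0] + E @ X_R[0] + W_R[0] @ D + E @ D
    Z2 = Y_R[1] + E @ X_R[1] + W_R[1] @ D
    Z3 = Y_R[2] + E @ X_R[2] + W_R[2] @ D

    assert np.allclose(Z1 + Z2 + Z3, W @ X_A)              # blocks 426-428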
Fig. 5 illustrates a flowchart of a process for obtaining the current predicted values of the respective training participants using untrusted-initializer secret-shared matrix multiplication, based on the current sub-models of the respective training participants and the feature sample subsets of the respective training participants, according to an embodiment of the present disclosure. The following description takes the calculation of the current predicted value at Alice as an example. The current predicted value calculation processes for the training participants Bob and Charlie are similar; it is only necessary to treat Bob or Charlie, respectively, as the training initiator.
As shown in FIG. 5, first, at block 510, at the training initiator Alice, the product of the training initiator's first weight sub-matrix W_A and the first feature matrix X_A is calculated to obtain the partial predicted value Y_{A1} = W_A·X_A corresponding to the sub-model of the training initiator Alice.
Next, at block 520, the products of the first weight sub-matrices of the respective training cooperators (e.g., W_B of Bob and W_C of Charlie) and the first feature matrix X_A are calculated using untrusted-initializer secret-shared matrix multiplication, to obtain the partial predicted values corresponding to the sub-models of the respective training cooperators (Y_{A2} = W_B·X_A and Y_{A3} = W_C·X_A). Here, the partial predicted value corresponding to each training cooperator's sub-model is obtained by untrusted-initializer secret-shared matrix multiplication between that training cooperator and the training initiator, based on the training initiator's feature sample subset (e.g., X_A at Alice). How the partial predicted values at the training cooperators are computed using untrusted-initializer secret-shared matrix multiplication will be described in detail below with reference to fig. 6.
Then, at the training initiator Alice, the obtained partial predicted values corresponding to the sub-models of the respective training participants (e.g., Y_{A1}, Y_{A2} and Y_{A3}) are summed to obtain the current predicted value Y_A = Y_{A1} + Y_{A2} + Y_{A3}.
Fig. 6 shows a flow diagram of one example of the untrusted-initializer secret-shared matrix multiplication process of fig. 5. Fig. 6 takes the training initiator Alice and the training cooperator Bob as an example and shows the calculation process of Y_{A2} = W_B·X_A.
As shown in FIG. 6, first, at block 601, if the number of rows of the first feature matrix X_A at the training initiator Alice is not even, and/or the number of columns of the current sub-model parameters W_B at the training cooperator Bob (hereinafter referred to as the first weight sub-matrix W_B) is not even, dimension-completion processing is performed on the first feature matrix X_A and/or the first weight sub-matrix W_B, so that X_A has an even number of rows and/or W_B has an even number of columns. For example, the dimension-completion processing may append one row of 0 values at the end of the first feature matrix X_A and/or one column of 0 values at the end of the first weight sub-matrix W_B. In the following description, it is assumed that the first weight sub-matrix W_B has dimension I×J and the first feature matrix X_A has dimension J×K, where J is an even number.
The operations of blocks 602 through 604 are then performed at the training initiator Alice to obtain a random feature matrix X1 and second and third feature matrices X2 and X3. Specifically, at block 602, the random feature matrix X1 is generated. Here, the dimension of the random feature matrix X1 is the same as that of the first feature matrix X_A, i.e., X1 has dimension J×K. At block 603, the random feature matrix X1 is subtracted from the first feature matrix X_A to obtain the second feature matrix X2 = X_A - X1, with dimension J×K. At block 604, the even-row sub-matrix X1_e of the random feature matrix X1 is subtracted from the odd-row sub-matrix X1_o of the random feature matrix X1 to obtain the third feature matrix X3 = X1_o - X1_e, with dimension J*×K, where J* = J/2.
Further, the operations of blocks 605 to 607 are performed at the training cooperator Bob to obtain a random weight sub-matrix W_{B1} and second and third weight sub-matrices W_{B2} and W_{B3}. Specifically, at block 605, the random weight sub-matrix W_{B1} is generated. Here, the dimension of W_{B1} is the same as that of the first weight sub-matrix W_B, i.e., W_{B1} has dimension I×J. At block 606, the first weight sub-matrix W_B and the random weight sub-matrix W_{B1} are summed to obtain the second weight sub-matrix W_{B2} = W_B + W_{B1}, with dimension I×J. At block 607, the odd-column sub-matrix W_{B1_o} of the random weight sub-matrix W_{B1} is added to the even-column sub-matrix W_{B1_e} of W_{B1} to obtain the third weight sub-matrix W_{B3} = W_{B1_o} + W_{B1_e}, with dimension I×J*, where J* = J/2.
Then, at block 608, the training initiator Alice sends the generated second feature matrix X2 and third feature matrix X3 to the training cooperator Bob, and at block 609, the training cooperator Bob sends the second weight sub-matrix W_{B2} and the third weight sub-matrix W_{B3} to the training initiator Alice.
Next, at block 610, at the training initiator Alice, the first matrix product Y1 is calculated based on the equation Y1 = W_{B2}·(2·X_A - X1) - W_{B3}·(X3 + X1_e), and at block 612, the first matrix product Y1 is sent to the training cooperator Bob.
At block 611, at the training cooperator Bob, the second matrix product Y2 is calculated based on the equation Y2 = (W_B + 2·W_{B1})·X2 + (W_{B3} + W_{B1_o})·X3, and at block 613, the second matrix product Y2 is sent to the training initiator Alice.
Then, at blocks 614 and 615, the first matrix product Y1 and the second matrix product Y2 are summed at the training initiator Alice and the training cooperator Bob, respectively, to obtain the partial predicted value Y_A2 = Y1 + Y2 corresponding to the sub-model of the training cooperator Bob.
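To make the FIG. 6 exchange concrete, the following sketch plays out the protocol in NumPy. It is a reconstruction rather than the patent's reference code: the translated equations are partly garbled in this text, so the signs follow the convention used in the rewritten description above, chosen so that the two locally computed shares provably sum to W_B · X_A (the assertion at the end checks this). A production implementation would sample the masks uniformly over a finite ring rather than using floating-point randomness, and the function name is ours.

```python
import numpy as np

def untrusted_init_secret_matmul(W_B, X_A, rng):
    """Sketch of the FIG. 6 protocol. Bob holds W_B (I x J), Alice holds X_A (J x K),
    with J even (block 601 pads a zero row of X_A / zero column of W_B otherwise).
    Returns Alice's share Y1 and Bob's share Y2 with Y1 + Y2 == W_B @ X_A."""
    I, J = W_B.shape
    K = X_A.shape[1]
    assert X_A.shape[0] == J and J % 2 == 0

    # Alice, blocks 602-604: random mask X1 and the matrices derived from it.
    X1 = rng.standard_normal((J, K))
    X2 = X1 - X_A                            # second feature matrix
    X1_o, X1_e = X1[0::2], X1[1::2]          # odd / even rows (1-indexed)
    X3 = X1_o - X1_e                         # third feature matrix, (J/2) x K

    # Bob, blocks 605-607: random mask W_B1 and the matrices derived from it.
    W_B1 = rng.standard_normal((I, J))
    W_B2 = W_B + W_B1                        # second weight submatrix
    W_B1_o, W_B1_e = W_B1[:, 0::2], W_B1[:, 1::2]
    W_B3 = W_B1_o + W_B1_e                   # third weight submatrix, I x (J/2)

    # Blocks 608-609: Alice sends (X2, X3) to Bob; Bob sends (W_B2, W_B3) to Alice.
    # Block 610: Alice's share, computed from her data plus Bob's masked matrices.
    Y1 = W_B2 @ (2 * X_A - X1) - W_B3 @ (X3 + X1_e)
    # Block 611: Bob's share, computed from his data plus Alice's masked matrices.
    Y2 = (W_B + 2 * W_B1) @ X2 + (W_B3 - W_B1_o) @ X3
    return Y1, Y2

rng = np.random.default_rng(42)
W_B = rng.standard_normal((3, 4))            # Bob's current sub-model parameters
X_A = rng.standard_normal((4, 5))            # Alice's feature matrix
Y1, Y2 = untrusted_init_secret_matmul(W_B, X_A, rng)
assert np.allclose(Y1 + Y2, W_B @ X_A)       # blocks 614-615: shares sum to the product
```

Note that every matrix actually exchanged (X2, X3, W_B2, W_B3) is masked by the fresh randomness X1 or W_B1, which is what keeps X_A and W_B hidden from the other party.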
It is noted here that, in the model training process shown in figs. 3-6, Alice acts as the training initiator that initiates the current round of loop training, i.e., the round is performed using the training data at Alice. In other examples of the present disclosure, the training data used in each round of loop training may be training data located at any one of the training participants. Accordingly, the method described in fig. 3 may further include: at each round of loop training, the training participants negotiate to determine which training participant acts as the training initiator, i.e., which training participant's training data is used to perform that round of training. Each training participant then performs the corresponding operations shown in figs. 3-6 according to the determined training role.
Further, it is noted that figs. 3-6 show a model training scheme with 1 training initiator and 2 training cooperators; in other examples of the present disclosure, 1 training cooperator may be included, or more than 2 training cooperators may be included.
By using the logistic regression model training method disclosed in figs. 3-6, the model parameters of the logistic regression model can be obtained by training without leaking the secret data of the training participants, and the workload of model training grows only linearly, rather than exponentially, with the number of feature samples used for training, so that the efficiency of model training can be improved while ensuring the security of each training participant's data.
FIG. 7 shows a schematic diagram of an apparatus for collaboratively training a logistic regression model via multiple training participants (hereinafter referred to as model training apparatus 700) according to an embodiment of the present disclosure. In this embodiment, the logistic regression model includes a first number of sub-models, each training participant has one sub-model, and the first number is equal to the number of training participants. The training participants include a training initiator and a second number of training cooperators, the training initiator has a feature sample set and label values, and the feature sample set is a data set obtained by horizontal segmentation. The second number is equal to the first number minus one. The model training apparatus 700 is located on the training initiator side.
As shown in FIG. 7, the model training apparatus 700 includes a sample division unit 710, a sample transmission unit 720, a matrix product acquisition unit 730, a label value division unit 740, a label value transmission unit 750, a predicted value determination unit 760, a prediction difference determination unit 770, a model update amount determination unit 780, and a model update unit 790.

In performing model training, the sample division unit 710, the sample transmission unit 720, the matrix product acquisition unit 730, the label value division unit 740, the label value transmission unit 750, the predicted value determination unit 760, the prediction difference determination unit 770, the model update amount determination unit 780, and the model update unit 790 are configured to perform operations cyclically until a loop-end condition is satisfied. The loop-end condition may include: reaching a predetermined number of cycles; or the determined prediction difference being within a predetermined range. When the loop process is not finished, the updated sub-models of the training participants are used as the current sub-models of the next loop iteration.
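As a minimal sketch of this control flow (the apparatus object and its run_one_round method are hypothetical stand-ins for one pass through units 710 to 790):

```python
def train(apparatus, max_rounds=100, tol=1e-4):
    # Loop-end condition 1: a predetermined number of cycles is reached.
    for _ in range(max_rounds):
        e = apparatus.run_one_round()  # hypothetical: one pass through units 710-790
        # Loop-end condition 2: the prediction difference falls within a predetermined range.
        if max(abs(v) for v in e) < tol:
            break
        # Otherwise the updated sub-models become the current sub-models of the next round.
```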
In particular, during each cycle, the sample division unit 710 is configured to divide the feature sample set into a first number of feature sample subsets. The sample transmission unit 720 is configured to send each of the second number of feature sample subsets to the corresponding training cooperator.
The matrix product acquisition unit 730 is configured to obtain a matrix product between the logistic regression model and the feature sample subset of the training initiator using secret sharing matrix multiplication.
The label value division unit 740 is configured to divide the label values into a first number of partial label values. The label value transmission unit 750 is configured to send each of the second number of partial label values to the corresponding training cooperator.
The predicted value determination unit 760 is configured to determine a current predicted value at the training initiator based on the matrix product at the training initiator. In one example of the present disclosure, the predicted value determination unit 760 is configured to determine the current predicted value at the training initiator based on the matrix product at the training initiator according to a Taylor expansion formula.
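The patent does not spell out the expansion order at this point. A common reason a Taylor (Maclaurin) expansion is used in secret-sharing settings is that the logistic function sigma(z) = 1/(1 + e^(−z)) is replaced by the low-order polynomial 1/2 + z/4 − z³/48 + ..., whose terms can be evaluated on shares of the matrix product; the sketch below shows the first- and third-order variants under that assumption.

```python
import numpy as np

def sigmoid_taylor(z, order=1):
    """Maclaurin approximation of the logistic function:
    sigma(z) = 1/2 + z/4 - z**3/48 + O(z**5)."""
    y = 0.5 + z / 4.0
    if order >= 3:
        y -= z ** 3 / 48.0
    return y

z = np.array([-1.0, 0.0, 0.5, 1.0])
print(sigmoid_taylor(z))         # [0.25  0.5   0.625 0.75 ]
print(1.0 / (1.0 + np.exp(-z)))  # [0.269 0.5   0.622 0.731] (exact)
```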
The prediction difference determination unit 770 is configured to determine a prediction difference between the current predicted value of the training initiator and the corresponding partial label value.
The model update amount determination unit 780 is configured to determine a model update amount at the training initiator based on the feature sample set and the prediction difference at the training initiator. For example, the model update amount determination unit 780 may calculate the product of the feature sample set and the prediction difference at the training initiator to obtain the model update amount at the training initiator.
The model update unit 790 is configured to update the sub-model of the training initiator based on the current sub-model of the training initiator and the corresponding model update amount.
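Combined with the update equation W_{n+1} = W_n − α·X·e_i given in the claims below, the work of units 780 and 790 reduces to a single gradient-style step. The sketch below assumes a samples-by-features orientation for X (so the product is taken as X^T·e); the shapes are illustrative only.

```python
import numpy as np

def update_submodel(W, X, e, alpha=0.05):
    """One update step, W <- W - alpha * X * e, with X: n x d feature samples,
    e: n x 1 prediction differences, W: d x 1 sub-model weights."""
    return W - alpha * (X.T @ e)
```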
In one example of the present disclosure, the matrix product acquisition unit 730 may be configured to: obtain the matrix product between the logistic regression model and the feature sample subset of the training initiator using trusted initializer secret sharing matrix multiplication. For the operations of the matrix product acquisition unit 730, reference may be made to the operations performed at the training initiator Alice described above with reference to fig. 4.
In another example of the present disclosure, the matrix product acquisition unit 730 may be configured to: obtain the matrix product between the logistic regression model and the feature sample subset of the training initiator using untrusted initializer secret sharing matrix multiplication. For the operations of the matrix product acquisition unit 730, reference may be made to the operations performed at the training initiator Alice described above with reference to figs. 5-6.
Furthermore, in other examples of the present disclosure, the model training apparatus 700 may further include a negotiation unit (not shown) configured to negotiate between a plurality of training participants to determine the training initiator and the training cooperator.
FIG. 8 illustrates a block diagram of an apparatus for collaboratively training a logistic regression model via a plurality of training participants (hereinafter referred to as model training apparatus 800) according to an embodiment of the present disclosure. In this embodiment, the logistic regression model includes a first number of sub-models, each training participant has one sub-model, and the first number is equal to the number of training participants. The training participants include a training initiator and a second number of training cooperators, the training initiator has a feature sample set and label values, and the feature sample set is a data set obtained by horizontal segmentation. The second number is equal to the first number minus one. The model training apparatus 800 is located on the training cooperator side.
As shown in FIG. 8, the model training apparatus 800 includes a sample receiving unit 810, a matrix product acquisition unit 820, a label value receiving unit 830, a predicted value determination unit 840, a prediction difference determination unit 850, a model update amount determination unit 860, and a model update unit 870.

In performing model training, the sample receiving unit 810, the matrix product acquisition unit 820, the label value receiving unit 830, the predicted value determination unit 840, the prediction difference determination unit 850, the model update amount determination unit 860, and the model update unit 870 are configured to perform operations cyclically until a loop-end condition is satisfied. The loop-end condition may include: reaching a predetermined number of cycles; or the determined prediction difference being within a predetermined range.
In particular, during each cycle, the sample receiving unit 810 is configured to receive a corresponding subset of feature samples from the training initiator, which subset of feature samples is one of a first number of subsets of feature samples resulting from a segmentation of the set of feature samples at the training initiator.
The matrix product acquisition unit 820 is configured to obtain a matrix product between the logistic regression model and the feature sample subset of the training cooperator using secret sharing matrix multiplication.
The label value receiving unit 830 is configured to receive a corresponding partial label value from the training initiator, the partial label value being one of a first number of partial label values obtained by dividing the label values at the training initiator.
The predicted value determination unit 840 is configured to determine a current predicted value at the training cooperator based on the matrix product at the training cooperator. In one example of the present disclosure, the predicted value determination unit 840 is configured to determine the current predicted value at the training cooperator based on the matrix product at the training cooperator according to a Taylor expansion formula.
The prediction difference determination unit 850 is configured to determine the prediction difference at the training cooperator using the current predicted value of the training cooperator and the received partial label value.
The model update amount determination unit 860 is configured to obtain the model update amount of the training cooperator using secret sharing matrix multiplication based on the feature sample set and the prediction difference of the training cooperator.
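The reason secret sharing matrix multiplication reappears here is that the feature sample set resides at the training initiator while the prediction difference resides at the training cooperator, so the product X^T·e is itself a two-party computation. Conceptually, the FIG. 6 sketch given earlier can be reused with the roles swapped; the orientation below (e^T in the W_B role, X in the X_A role) is our assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))  # initiator's feature samples (n=4, even, d=3)
e = rng.standard_normal((4, 1))  # cooperator's prediction differences

# Reusing untrusted_init_secret_matmul from the FIG. 6 sketch: the parties end up
# with additive shares G1, G2 of the update amount, G1 + G2 == e.T @ X
# (the transpose of X.T @ e). n must be even; pad with a zero sample otherwise.
G1, G2 = untrusted_init_secret_matmul(e.T, X, rng)
assert np.allclose(G1 + G2, e.T @ X)
```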
The model update unit 870 is configured to update the sub-model of the training cooperator based on the current sub-model of the training cooperator and the corresponding model update amount.
In one example of the present disclosure, the matrix product acquisition unit 820 may be configured to: obtain the matrix product between the logistic regression model and the feature sample subset of the training cooperator using trusted initializer secret sharing matrix multiplication. For the operations of the matrix product acquisition unit 820, reference may be made to the operations performed at the training cooperator described above with reference to fig. 4.
In another example of the present disclosure, the matrix product acquisition unit 820 may be configured to: obtain the matrix product between the logistic regression model and the feature sample subset of the training cooperator using untrusted initializer secret sharing matrix multiplication. For the operations of the matrix product acquisition unit 820, reference may be made to the operations performed at the training cooperator described above with reference to figs. 5-6.
In one example of the present disclosure, the model update amount determination unit 860 may be configured to: obtain the model update amount of the training cooperator using trusted initializer secret sharing matrix multiplication based on the feature sample set and the prediction difference of the training cooperator.

In another example of the present disclosure, the model update amount determination unit 860 may be configured to: obtain the model update amount of the training cooperator using untrusted initializer secret sharing matrix multiplication based on the feature sample set and the prediction difference of the training cooperator.
Furthermore, in other examples of the present disclosure, the model training apparatus 800 may further include a negotiation unit (not shown) configured to negotiate between a plurality of training participants to determine the training initiator and the training cooperator.
Embodiments of a model training method, apparatus, and system according to the present disclosure have been described above with reference to figs. 1 through 8. The above model training apparatus may be implemented in hardware, in software, or in a combination of hardware and software.
FIG. 9 illustrates a hardware block diagram of a computing device 900 for implementing collaborative training of a logistic regression model via multiple training participants, according to an embodiment of the present disclosure. As shown in fig. 9, computing device 900 may include at least one processor 910, storage (e.g., non-volatile storage) 920, memory 930, and a communication interface 940, and the at least one processor 910, storage 920, memory 930, and communication interface 940 are connected together via a bus 960. The at least one processor 910 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 910 to: execute the following loop process until a loop-end condition is satisfied: dividing the feature sample set into a first number of feature sample subsets, and sending each of a second number of feature sample subsets to the corresponding training cooperator; obtaining a matrix product between the logistic regression model and the feature sample subset of the training initiator using secret sharing matrix multiplication; dividing the label value into a first number of partial label values, and sending each of the second number of partial label values to the corresponding training cooperator; determining a current predicted value at the training initiator based on the matrix product at the training initiator; determining a prediction difference between the current predicted value of the training initiator and the corresponding partial label value; determining a model update amount at the training initiator based on the feature sample set and the prediction difference at the training initiator; and updating the sub-model of the training initiator based on the current sub-model of the training initiator and the corresponding model update amount, wherein, when the loop process is not finished, the updated sub-models of the training participants are used as the current sub-models of the next loop iteration.
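This passage does not pin down how the feature sample set and the label value are divided. A natural reading, consistent with the secret-sharing machinery above, is an additive split in which the subsets and partial label values are random shares that sum back to the originals; the sketch below makes that assumption explicit.

```python
import numpy as np

def additive_split(M, n_parties, rng):
    """Assumed additive split: n_parties random shares that sum to M."""
    shares = [rng.standard_normal(M.shape) for _ in range(n_parties - 1)]
    shares.append(M - sum(shares))
    return shares

rng = np.random.default_rng(7)
X = np.arange(6.0).reshape(3, 2)      # feature sample set (3 samples, 2 features)
y = np.array([[1.0], [0.0], [1.0]])   # label values
X_shares = additive_split(X, 3, rng)  # initiator keeps one share, sends the others
y_shares = additive_split(y, 3, rng)
assert np.allclose(sum(X_shares), X) and np.allclose(sum(y_shares), y)
```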
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 910 to perform the various operations and functions described above in connection with fig. 1-8 in the various embodiments of the present disclosure.
FIG. 10 illustrates a hardware block diagram of a computing device 1000 for implementing collaborative training of a logistic regression model via multiple training participants, according to an embodiment of the present disclosure. As shown in fig. 10, the computing device 1000 may include at least one processor 1010, storage (e.g., non-volatile storage) 1020, memory 1030, and a communication interface 1040, and the at least one processor 1010, storage 1020, memory 1030, and communication interface 1040 are connected together via a bus 1060. The at least one processor 1010 executes at least one computer-readable instruction (i.e., an element described above as being implemented in software) stored or encoded in memory.
In one embodiment, computer-executable instructions are stored in the memory 1020 that, when executed, cause the at least one processor 1010 to: execute the following loop process until a loop-end condition is satisfied: receiving a corresponding feature sample subset from the training initiator, the feature sample subset being one of a first number of feature sample subsets obtained by dividing the feature sample set at the training initiator; obtaining a matrix product between the logistic regression model and the feature sample subset of the training cooperator using secret sharing matrix multiplication; receiving a corresponding partial label value from the training initiator, the partial label value being one of a first number of partial label values obtained by dividing the label value at the training initiator; determining a current predicted value at the training cooperator based on the matrix product at the training cooperator; determining a prediction difference at the training cooperator using the current predicted value of the training cooperator and the received partial label value; obtaining a model update amount of the training cooperator using secret sharing matrix multiplication based on the feature sample set and the prediction difference of the training cooperator; and updating the sub-model of the training cooperator based on the current sub-model of the training cooperator and the corresponding model update amount, wherein, when the loop process is not finished, the updated sub-models of the training participants are used as the current sub-models of the next loop iteration.
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1010 to perform the various operations and functions described above in connection with fig. 1-8 in the various embodiments of the present disclosure.
According to one embodiment, a program product, such as a machine-readable medium (e.g., a non-transitory machine-readable medium), is provided. The machine-readable medium may have instructions (i.e., the elements described above as being implemented in software) that, when executed by a machine, cause the machine to perform the various operations and functions described above in connection with figs. 1-8 in the various embodiments of the present disclosure. Specifically, a system or apparatus equipped with a readable storage medium on which software program code implementing the functions of any of the above embodiments is stored may be provided, and a computer or processor of the system or apparatus may read out and execute the instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can realize the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of the readable storage medium include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or from the cloud via a communications network.
It will be understood by those skilled in the art that various changes and modifications may be made in the above-disclosed embodiments without departing from the spirit of the invention. Accordingly, the scope of the invention should be determined from the following claims.
It should be noted that not all steps and units in the above flows and system structure diagrams are necessary, and some steps or units may be omitted according to actual needs. The execution order of the steps is not fixed, and can be determined as required. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by a plurality of physical entities, or some units may be implemented by some components in a plurality of independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware units or processors may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments but does not represent all embodiments that may be practiced or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (26)

1. A method for collaboratively training a logistic regression model via a plurality of training participants, the logistic regression model comprising a first number of sub-models, each training participant having one sub-model, the first number being equal to the number of training participants, the training participants comprising a training initiator and a second number of training cooperators, training sample data of the training initiator having a feature sample set and label values, the training sample data being obtained by horizontal segmentation, the second number being equal to the first number minus one, the method being performed by the training initiator, the method comprising:
the following loop process is executed until a loop end condition is satisfied:
dividing the feature sample set into the first number of feature sample subsets, and respectively sending each of the second number of feature sample subsets to a corresponding training cooperative party;
obtaining a matrix product between the logistic regression model and the subset of feature samples of the training initiator using secret-shared matrix multiplication;
dividing the label value into the first number of partial label values, and sending each of the second number of partial label values to a corresponding training cooperator;
determining a current predictor at the training initiator based on a matrix product at the training initiator;
determining a prediction difference value between a current prediction value of the training initiator and a corresponding partial mark value;
determining a model update quantity at the training initiator based on the feature sample set and the prediction difference at the training initiator; and
updating the sub-model of the training initiator based on the current sub-model of the training initiator and the corresponding model update amount, wherein when the cycle process is not finished, the updated sub-model of each training participant is used as the current sub-model of the next cycle process.
2. The method of claim 1, wherein obtaining a matrix product between the logistic regression model and the training initiator's feature sample subset using secret sharing matrix multiplication comprises:
obtaining the matrix product between the logistic regression model and the feature sample subset of the training initiator using trusted initializer secret sharing matrix multiplication; or
obtaining the matrix product between the logistic regression model and the feature sample subset of the training initiator using untrusted initializer secret sharing matrix multiplication.
3. The method of claim 1, wherein determining a current predictor at the training initiator based on a matrix product at the training initiator comprises:
determining a current predictor at the training initiator based on a matrix product at the training initiator according to a Taylor expansion formula.
4. The method of claim 1, wherein updating the sub-model of the training initiator based on the current sub-model of the training initiator and the corresponding model update amount comprises: updating the sub-model at the training initiator according to the following equation:

W_{n+1} = W_n − α·X·e_i,

wherein W_{n+1} represents the updated sub-model at the training initiator, W_n represents the current sub-model at the training initiator, α represents a learning rate, X represents the feature sample set at the training initiator, and e_i represents the prediction difference at the training initiator.
5. The method of claim 1, wherein the training initiator and the training cooperator are determined by negotiation of the plurality of training participants.
6. The method of any of claims 1 to 5, wherein the end-of-loop condition comprises:
a predetermined number of cycles; or
The determined prediction difference is within a predetermined range.
7. A method for collaboratively training a logistic regression model via a plurality of training participants, the logistic regression model comprising a first number of sub-models, each training participant having one sub-model, the first number being equal to the number of training participants, the training participants comprising a training initiator and a second number of training cooperators, training sample data of the training initiator having a feature sample set and label values, the training sample data being obtained by horizontal segmentation, the second number being equal to the first number minus one, the method being performed by a training cooperator, the method comprising:
the following loop process is executed until a loop end condition is satisfied:
receiving a corresponding feature sample subset from the training initiator, the feature sample subset being one of the first number of feature sample subsets resulting from segmenting the feature sample set at the training initiator;
obtaining a matrix product between the logistic regression model and a subset of feature samples of the training cooperator using secret sharing matrix multiplication;
receiving a corresponding partial label value from the training initiator, the partial label value being one of the first number of partial label values resulting from dividing the label values at the training initiator;
determining a current predictor at the training cooperator based on a matrix product at the training cooperator;
determining a prediction difference at the training cooperator using the current prediction value of the training cooperator and the received partial label value;
obtaining a model update quantity of the training cooperator by using secret shared matrix multiplication based on the feature sample set and the prediction difference value of the training cooperator; and
updating the sub-model of the training cooperator based on the current sub-model of the training cooperator and the corresponding model update amount, wherein when the cycle process is not finished, the updated sub-model of each training participant is used as the current sub-model of the next cycle process.
8. The method of claim 7, wherein obtaining a matrix product between the logistic regression model and the subset of feature samples of the training cooperator using secret sharing matrix multiplication comprises:
obtaining a matrix product between the logistic regression model and a subset of feature samples of the training cooperator using a trusted initializer secret sharing matrix multiplication; or
Obtaining a matrix product between the logistic regression model and the subset of feature samples of the training cooperator using untrusted initializer secret shared matrix multiplication.
9. The method of claim 7, wherein obtaining a model update quantity for the training cooperator using secret sharing matrix multiplication based on the set of feature samples and the predicted difference values for the training cooperator comprises:
obtaining the model update amount of the training cooperator using trusted initializer secret sharing matrix multiplication based on the feature sample set and the prediction difference of the training cooperator; or
obtaining the model update amount of the training cooperator using untrusted initializer secret sharing matrix multiplication based on the feature sample set and the prediction difference of the training cooperator.
10. The method of claim 7, wherein determining a current predictor at the training cooperator based on a matrix product at the training cooperator comprises:
determining the current predictor at the training cooperator based on the matrix product at the training cooperator in accordance with a Taylor expansion formula.
11. The method of claim 7, wherein updating the sub-model of the training cooperator based on the current sub-model of the training cooperator and the corresponding model update amount comprises: updating the sub-model at the training cooperator according to the following equation:

W_{n+1} = W_n − α·X·e_i,

wherein W_{n+1} represents the updated sub-model at the training cooperator, W_n represents the current sub-model at the training cooperator, α represents the learning rate, X represents the feature sample set at the training initiator, and e_i represents the prediction difference at the training cooperator.
12. The method of claim 7, wherein the training initiator and the training cooperator are determined by negotiation of the plurality of training participants.
13. An apparatus for collaboratively training a logistic regression model via a plurality of training participants, the logistic regression model comprising a first number of sub-models, each training participant having one sub-model, the first number being equal to the number of training participants, the training participants comprising a training initiator and a second number of training cooperators, training sample data of the training initiator having a feature sample set and label values, the training sample data being obtained by horizontal segmentation, the second number being equal to the first number minus one, the apparatus being located on a training initiator side, the apparatus comprising:
a sample segmentation unit configured to segment the feature sample set into the first number of feature sample subsets;
a sample sending unit configured to send each of the second number of feature sample subsets to a corresponding training cooperator, respectively;
a matrix product obtaining unit configured to obtain a matrix product between the logistic regression model and the subset of feature samples of the training initiator using secret shared matrix multiplication;
a label value dividing unit configured to divide the label values into the first number of partial label values;

a label value transmitting unit configured to send each of the second number of partial label values to the corresponding training cooperator, respectively;
a predictor determination unit configured to determine a current predictor at the training initiator based on a matrix product at the training initiator;
a prediction difference determination unit configured to determine a prediction difference between a current prediction value of the training initiator and a corresponding partial marker value;
a model update amount determination unit configured to determine a model update amount at the training initiator based on the feature sample set and the prediction difference value at the training initiator; and
a model updating unit configured to update the sub-model of the training initiator based on a current sub-model of the training initiator and a corresponding model update amount,
wherein the sample segmentation unit, the sample sending unit, the matrix product obtaining unit, the label value dividing unit, the label value transmitting unit, the predictor determination unit, the prediction difference determination unit, the model update amount determination unit, and the model updating unit are configured to cyclically perform operations until a loop-end condition is satisfied,
wherein, when the cycle process is not finished, the updated sub-model of each training participant is used as the current sub-model of the next cycle process.
14. The apparatus of claim 13, wherein the matrix product acquisition unit is configured to:
obtaining the matrix product between the logistic regression model and the feature sample subset of the training initiator using trusted initializer secret sharing matrix multiplication; or
obtaining the matrix product between the logistic regression model and the feature sample subset of the training initiator using untrusted initializer secret sharing matrix multiplication.
15. The apparatus of claim 13, wherein the model updating unit is configured to: update the sub-model at the training initiator according to the following equation:

W_{n+1} = W_n − α·X·e_i,

wherein W_{n+1} represents the updated sub-model at the training initiator, W_n represents the current sub-model at the training initiator, α represents the learning rate, X represents the feature sample set at the training initiator, and e_i represents the prediction difference at the training initiator.
16. The apparatus of any of claims 13 to 15, further comprising:
a negotiation unit configured to negotiate and determine the training initiator and the training cooperator between the plurality of training participants.
17. An apparatus for collaboratively training a logistic regression model via a plurality of training participants, the logistic regression model comprising a first number of sub-models, each training participant having one sub-model, the first number being equal to the number of training participants, the training participants comprising a training initiator and a second number of training cooperators, training sample data of the training initiator having a feature sample set and label values, the training sample data being obtained by horizontal segmentation, the second number being equal to the first number minus one, the apparatus being located on a training cooperator side, the apparatus comprising:
a sample receiving unit configured to receive a corresponding feature sample subset from the training initiator, the feature sample subset being one of the first number of feature sample subsets resulting from segmentation of the feature sample set at the training initiator;
a matrix product obtaining unit configured to obtain a matrix product between the logistic regression model and the subset of feature samples of the training cooperator using secret shared matrix multiplication;
a label value receiving unit configured to receive a corresponding partial label value from the training initiator, the partial label value being one of the first number of partial label values resulting from dividing the label values at the training initiator;
a predictor determination unit configured to determine a current predictor at the training cooperator based on a matrix product at the training cooperator;
a prediction difference determination unit configured to determine a prediction difference at the training cooperator using a current prediction value of the training cooperator and the received partial label value;
a model update amount determination unit configured to obtain a model update amount of the training cooperator using secret sharing matrix multiplication based on the feature sample set and the prediction difference of the training cooperator; and
a model updating unit configured to update the submodel of the training cooperator based on a current submodel of the training cooperator and a corresponding model update amount,
wherein the sample receiving unit, the matrix product obtaining unit, the label value receiving unit, the predictor determination unit, the prediction difference determination unit, the model update amount determination unit, and the model updating unit are configured to cyclically perform operations until a loop-end condition is satisfied,
wherein, when the cycle process is not finished, the updated sub-model of each training participant is used as the current sub-model of the next cycle process.
18. The apparatus of claim 17, wherein the matrix product acquisition unit is configured to:
obtaining a matrix product between the logistic regression model and a subset of feature samples of the training cooperator using a trusted initializer secret sharing matrix multiplication; or
Obtaining a matrix product between the logistic regression model and the subset of feature samples of the training cooperator using untrusted initializer secret shared matrix multiplication.
19. The apparatus of claim 17, wherein the model update amount determination unit is configured to:
obtaining the model update amount of the training cooperator using trusted initializer secret sharing matrix multiplication based on the feature sample set and the prediction difference of the training cooperator; or
obtaining the model update amount of the training cooperator using untrusted initializer secret sharing matrix multiplication based on the feature sample set and the prediction difference of the training cooperator.
20. The apparatus of claim 17, wherein the model updating unit is configured to: update the sub-model at the training cooperator according to the following equation:

W_{n+1} = W_n − α·X·e_i,

wherein W_{n+1} represents the updated sub-model at the training cooperator, W_n represents the current sub-model at the training cooperator, α represents the learning rate, X represents the feature sample set at the training initiator, and e_i represents the prediction difference at the training cooperator.
21. The apparatus of any of claims 17 to 20, further comprising:
a negotiation unit configured to negotiate and determine the training initiator and the training cooperator between the plurality of training participants.
22. A system for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model comprising a first number of sub-models, the system comprising:
a training initiator device comprising the apparatus of any one of claims 13 to 16; and
a second number of training cooperator apparatuses, each training cooperator apparatus comprising an apparatus as claimed in any one of claims 17 to 20,
wherein the first number is equal to the number of the training participants, each training participant has one sub-model, training sample data of the training initiator has a feature sample set and label values, the training sample data is obtained by horizontal segmentation, and the second number is equal to the first number minus one.
23. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 1-6.
24. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any of claims 1 to 6.
25. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 7 to 12.
26. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any of claims 7 to 12.
CN201910607608.3A 2019-07-04 2019-07-04 Model training method, device and system Active CN112183759B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910607608.3A CN112183759B (en) 2019-07-04 2019-07-04 Model training method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910607608.3A CN112183759B (en) 2019-07-04 2019-07-04 Model training method, device and system

Publications (2)

Publication Number Publication Date
CN112183759A true CN112183759A (en) 2021-01-05
CN112183759B CN112183759B (en) 2024-02-13

Family

ID=73919728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910607608.3A Active CN112183759B (en) 2019-07-04 2019-07-04 Model training method, device and system

Country Status (1)

Country Link
CN (1) CN112183759B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113591970A (en) * 2021-07-28 2021-11-02 百融云创科技股份有限公司 Training method and device of logistic regression model, electronic equipment and storage medium
CN114362948A (en) * 2022-03-17 2022-04-15 蓝象智联(杭州)科技有限公司 Efficient federal derivative feature logistic regression modeling method


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165683A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Sample predictions method, apparatus and storage medium based on federation's training
CN109299728A (en) * 2018-08-10 2019-02-01 深圳前海微众银行股份有限公司 Federal learning method, system and readable storage medium storing program for executing
CN109583468A (en) * 2018-10-12 2019-04-05 阿里巴巴集团控股有限公司 Training sample acquisition methods, sample predictions method and corresponding intrument
CN109492420A (en) * 2018-12-28 2019-03-19 深圳前海微众银行股份有限公司 Model parameter training method, terminal, system and medium based on federation's study
CN109784392A (en) * 2019-01-07 2019-05-21 华南理工大学 A kind of high spectrum image semisupervised classification method based on comprehensive confidence
WO2019072315A2 (en) * 2019-01-11 2019-04-18 Alibaba Group Holding Limited A logistic regression modeling scheme using secrete sharing
CN109831460A (en) * 2019-03-27 2019-05-31 杭州师范大学 A kind of Web attack detection method based on coorinated training

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QIANG YANG et al.: "Federated Machine Learning: Concept and Applications", 《HTTPS://ARXIV.ORG/PDF/1902.04885.PDF》 *


Also Published As

Publication number Publication date
CN112183759B (en) 2024-02-13

Similar Documents

Publication Publication Date Title
CN111523673B (en) Model training method, device and system
US11816226B2 (en) Secure data processing transactions
CN110942147B (en) Neural network model training and predicting method and device based on multi-party safety calculation
CN111062487B (en) Machine learning model feature screening method and device based on data privacy protection
CN111079939B (en) Machine learning model feature screening method and device based on data privacy protection
CN111523556B (en) Model training method, device and system
CN112052942B (en) Neural network model training method, device and system
CN110851785A (en) Longitudinal federated learning optimization method, device, equipment and storage medium
CN112132270B (en) Neural network model training method, device and system based on privacy protection
CN111738438B (en) Method, device and system for training neural network model
CN113542228A (en) Data transmission method and device based on federal learning and readable storage medium
CN111523674B (en) Model training method, device and system
CN111523134B (en) Homomorphic encryption-based model training method, device and system
CN110929887B (en) Logistic regression model training method, device and system
CN112183759B (en) Model training method, device and system
CN114186256A (en) Neural network model training method, device, equipment and storage medium
CN112464155A (en) Data processing method, multi-party security computing system and electronic equipment
CN112183757B (en) Model training method, device and system
CN111523675B (en) Model training method, device and system
CN110874481B (en) GBDT model-based prediction method and GBDT model-based prediction device
CN114492850A (en) Model training method, device, medium, and program product based on federal learning
CN111737756B (en) XGB model prediction method, device and system performed through two data owners
CN112183566B (en) Model training method, device and system
CN111738453B (en) Business model training method, device and system based on sample weighting
CN112183564B (en) Model training method, device and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40044585

Country of ref document: HK

GR01 Patent grant
GR01 Patent grant