CN112183757B - Model training method, device and system

Model training method, device and system

Info

Publication number
CN112183757B
CN112183757B (application CN201910599381.2A)
Authority
CN
China
Legal status
Active
Application number
CN201910599381.2A
Other languages
Chinese (zh)
Other versions
CN112183757A (en)
Inventor
陈超超
李梁
王力
周俊
Current Assignee
Advanced New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd
Priority to CN201910599381.2A
Publication of CN112183757A
Application granted
Publication of CN112183757B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning


Abstract

The present disclosure provides methods and apparatus for training a linear/logistic regression model in which feature sample sets are subjected to vertical-to-horizontal segmentation transformations to obtain transformed feature sample subsets for individual training participants. The current predicted value is obtained based on the current conversion sub-model and the conversion feature sample subset of each training participant. At a first training participant, a prediction difference and a first model update amount are determined, the first model update amount is decomposed and a first partial model update amount is sent to a second training participant. At the second training participant, a second model update amount is obtained and decomposed based on the prediction difference and the corresponding subset of conversion feature samples, and the second partial model update amount is sent to the first training participant. At each training participant, a respective conversion sub-model is updated based on a respective partial model update amount. When the loop end condition is satisfied, a respective sub-model is determined based on the conversion sub-model of the respective training participant.

Description

Model training method, device and system
Technical Field
The present disclosure relates generally to the field of machine learning, and more particularly, to methods, apparatus, and systems for collaborative training of a linear/logistic regression model via multiple training participants using a vertically segmented training set.
Background
The linear regression model and the logistic regression model are regression/classification models widely used in the field of machine learning. In many cases, multiple model training participants (e.g., e-commerce companies, courier companies, and banks) each have different pieces of data of the feature samples used to train the linear/logistic regression model. The multiple model training participants typically want to collectively use each other's data to train a linear/logistic regression model, but do not want to provide their respective data to the other individual model training participants to prevent their own data from being compromised.
In view of this situation, machine learning methods that protect data security have been proposed, which allow a linear/logistic regression model to be trained cooperatively by multiple model training participants for their joint use while keeping each participant's data secure. However, existing machine learning methods that protect data security are relatively inefficient at model training.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a method, apparatus, and system for collaborative training of a linear/logistic regression model via multiple training participants, which can improve the efficiency of model training while ensuring the security of the respective data of the multiple training participants.
According to one aspect of the present disclosure, there is provided a method for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first feature sample subset and a marker value, the second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing the feature sample set, the method being performed by the first training participant, the method comprising: performing model conversion processing on the sub-models of all training participants to obtain conversion sub-models of all training participants; the following loop process is performed until the loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining a current prediction value for a feature sample set using secret sharing matrix multiplication based on a current conversion sub-model and a conversion feature sample subset of each training participant; determining a prediction difference between the current predicted value and a corresponding marking value; determining a first model update amount using the prediction difference and a subset of conversion feature samples at the first training participant; decomposing the first model updating amount into two first partial model updating amounts, and transmitting one first partial model updating amount to the second training participant; and receiving a second partial model update from the second training participant, the second partial model update resulting from decomposing a second model update at the second training participant, the second model update resulting from performing a secret sharing matrix multiplication on the prediction difference and a subset of the conversion feature samples at the second training participant; updating a current conversion sub-model at the first training participant based on the remaining first partial model update amount and the received second partial model update amount, wherein, when the cyclic process is not over, the updated conversion sub-model of each training participant is used as the current conversion sub-model for the next cyclic process; determining a sub-model of the first training participant based on the conversion sub-model of the first training participant and the second training participant when the cycle end condition is satisfied.
According to another aspect of the present disclosure, there is provided a method for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first feature sample subset and a marker value, the second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing the feature sample set, the method being performed by the second training participant, the method comprising: performing model conversion processing on the sub-models of all training participants to obtain conversion sub-models of all training participants; the following loop process is performed until the loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining a current prediction value for a feature sample set using secret sharing matrix multiplication based on a current conversion sub-model and a conversion feature sample subset of each training participant; receiving a first partial model update from the first training participant, the first partial model update resulting from decomposing a first model update at the first training participant, the first model update determined at the first training participant using a predicted difference value and a subset of conversion feature samples at the first training participant, wherein the predicted difference value is a difference value between the current predicted value and a corresponding marker value; performing a secret sharing matrix multiplication on the prediction difference and a subset of conversion feature samples at the second training participant to obtain a second model update amount; decomposing the second model updating amount into two second partial model updating amounts, and transmitting one second partial model updating amount to the first training participant; and updating the current conversion sub-model of the second training participant based on the remaining second partial model update amount and the received first partial model update amount, wherein, when the cyclic process is not over, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next cyclic process; determining a sub-model of the second training participant based on the conversion sub-model of the first training participant and the second training participant when the cycle end condition is satisfied.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first feature sample subset and a marker value, the second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing the feature sample set, the apparatus being located on the first training participant side, the apparatus comprising: the model conversion unit is configured to perform model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; a sample conversion unit configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain converted feature sample subsets at each training participant; a predicted value acquisition unit configured to obtain a current predicted value for a feature sample set using secret sharing matrix multiplication based on a current conversion sub-model and a conversion feature sample subset of each training participant; a prediction difference value determining unit configured to determine a prediction difference value between the current prediction value and a corresponding flag value; a model update amount determination unit configured to determine a first model update amount using the prediction difference value and the first conversion feature sample subset; a model update amount decomposition unit configured to decompose the first model update amount into two first partial model update amounts; a model update amount transmitting/receiving unit configured to transmit a first partial model update amount to the second training participant, and to receive a second partial model update amount from the second training participant, the second partial model update amount being obtained by decomposing a second model update amount at the second training participant, the second model update amount being obtained by performing a secret sharing matrix multiplication on the prediction difference value and the second conversion feature sample subset; a model updating unit configured to update a current conversion sub-model at the first training participant based on the remaining first partial model update amount and the received second partial model update amount; and a model determination unit configured to determine a sub-model of the first training participant based on the conversion sub-model of the first training participant and the second training participant when the cycle end condition is satisfied, wherein the sample conversion unit, the predicted value acquisition unit, the predicted difference determination unit, the model update amount decomposition unit, the model update amount transmission/reception unit, and the model update unit cyclically execute operations until the cycle end condition is satisfied, wherein when a cycle process is not ended, the updated conversion sub-model of each training participant is used as a current conversion sub-model of a next cycle process.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first feature sample subset and a marker value, the second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing the feature sample set, the apparatus being located on the second training participant side, the apparatus comprising: the model conversion unit is configured to perform model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; a sample conversion unit configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain converted feature sample subsets at each training participant; a predicted value acquisition unit configured to obtain a current predicted value for a feature sample set using secret sharing matrix multiplication based on a current conversion sub-model and a conversion feature sample subset of each training participant; a model update amount receiving unit configured to receive a first partial model update amount from the first training participant, the first partial model update amount being obtained by decomposing a first model update amount at the first training participant, the first model update amount being determined at the first training participant using a prediction difference value and a subset of conversion feature samples at the first training participant, wherein the prediction difference value is a difference value between the current prediction value and a corresponding flag value; a second model update amount determination unit configured to perform a secret sharing matrix multiplication on the prediction difference value and a subset of conversion feature samples at the second training participant to obtain a second model update amount; a model update amount decomposition unit configured to decompose the second model update amount into two second partial model update amounts; a model update amount transmitting unit configured to transmit a second partial model update amount to the first training participant; a model updating unit configured to update a current sub-model of the second training participant based on the remaining second partial model update amount and the received first partial model update amount; and a model determination unit configured to determine a sub-model of the second training participant based on the conversion sub-model of the first training participant and the second training participant when the cycle end condition is satisfied, wherein the sample conversion unit, the predicted value acquisition unit, the model update amount reception unit, the model update amount determination unit, the model update amount decomposition unit, the model update amount transmission unit, and the model update unit cyclically execute operations until the cycle end condition is satisfied, wherein when a cycle process is not ended, the updated conversion sub-model of each training participant is used as a current conversion sub-model of a next cycle process.
According to another aspect of the present disclosure, there is provided a system for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first feature sample subset and a marker value, the second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing the feature sample set, the system comprising: a first training participant device comprising means as described above for co-training a linear/logistic regression model via the first and second training participants; and a second training participant device comprising means for co-training the linear/logistic regression model via the first and second training participants as described above.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method performed on the first training participant side as described above.
According to another aspect of the disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause at least one processor to perform the training method performed on the first training participant side as described above.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the training method performed on the second training participant side as described above.
According to another aspect of the disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause at least one processor to perform the training method performed on the second training participant side as described above.
By utilizing the scheme of the embodiments of the present disclosure, the model parameters of the linear/logistic regression model can be obtained by training without leaking the secret data of the training participants, and the workload of model training is only linearly, rather than exponentially, related to the number of feature samples used for training.
Drawings
A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.
FIG. 1 illustrates a schematic diagram of an example of vertically sliced data in accordance with an embodiment of the present disclosure;
FIG. 2 illustrates an architectural diagram showing a system for co-training a linear/logistic regression model via two training participants according to an embodiment of the present disclosure;
FIG. 3 illustrates a flow chart of a method for co-training a linear/logistic regression model via two training participants according to an embodiment of the present disclosure;
FIG. 4 illustrates a flowchart of one example of a model conversion process according to an embodiment of the present disclosure;
FIG. 5 illustrates a flowchart of one example of a feature sample set conversion process according to an embodiment of the present disclosure;
FIG. 6 shows a flowchart of a predicted value acquisition process according to an embodiment of the present disclosure;
FIG. 7 illustrates a flow chart of one example of a trusted initializer secret sharing matrix multiplication according to an embodiment of the present disclosure;
FIG. 8 illustrates a flow chart of one example of a non-trusted initializer secret sharing matrix multiplication according to an embodiment of the present disclosure;
FIG. 9 illustrates a block diagram of an apparatus for co-training a linear/logistic regression model via two training participants, according to an embodiment of the present disclosure;
FIG. 10 shows a block diagram of one example of a predicted value acquisition unit according to an embodiment of the present disclosure;
FIG. 11 illustrates a block diagram of an apparatus for co-training a linear/logistic regression model via two training participants, according to an embodiment of the present disclosure;
FIG. 12 shows a schematic diagram of a computing device for co-training a linear/logistic regression model via two training participants, according to an embodiment of the present disclosure;
FIG. 13 shows a schematic diagram of a computing device for co-training a linear/logistic regression model via two training participants, according to an embodiment of the present disclosure.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be appreciated that these embodiments are discussed only to enable a person skilled in the art to better understand and thereby practice the subject matter described herein, and are not limiting of the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, replace, or add various procedures or components as desired. For example, the described methods may be performed in a different order than described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may be combined in other examples as well.
As used herein, the term "comprising" and variations thereof are open-ended terms meaning "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. Unless the context clearly indicates otherwise, the definition of a term is consistent throughout this specification.
Secret sharing is a cryptographic technique that decomposes a secret into multiple secret shares, each owned and managed by one of multiple parties; a single party cannot recover the complete secret, which can only be recovered when several parties cooperate. The aim of secret sharing is to prevent the secret from being held in one place, thereby dispersing risk and tolerating intrusion.
Secret sharing methods can be broadly divided into two categories: secret sharing with a trusted initializer and secret sharing without a trusted initializer. In the method with a trusted initializer, the trusted initializer performs parameter initialization (often generating random numbers satisfying certain conditions) for each participant in the multi-party secure computation. After initialization is completed, the trusted initializer destroys the data and withdraws; it is not needed in the subsequent multi-party secure computation.
Trusted-initializer secret sharing matrix multiplication applies to the following situation: the complete secret data is the product of a first set of secret shares and a second set of secret shares, and each participant holds one of the first set of secret shares and one of the second set of secret shares. Through trusted-initializer secret sharing matrix multiplication, each of the parties obtains a partial value of the complete secret data such that the sum of the partial values obtained by all parties equals the complete secret data. Each party then discloses its partial value to the other parties, so every party can obtain the complete secret data without disclosing the secret shares it owns, which ensures the security of each party's data.
Untrusted-initializer secret sharing matrix multiplication is another secret sharing method. It applies when the complete secret is the product of a first secret share and a second secret share and the two parties hold the first secret share and the second secret share, respectively. Through secret sharing matrix multiplication without a trusted initializer, each of the two parties generates and discloses data that differs from the secret share it owns, yet the sum of the data disclosed by the two parties equals the product of their secret shares (i.e., the complete secret). Thus the parties can cooperatively recover the complete secret without a trusted initializer and without disclosing the secret shares they own, which ensures the security of each party's data.
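Both variants build on additive sharing: a value is recovered only by summing all of its shares. The following is a minimal Python/numpy sketch, illustrative only, using real-valued arrays rather than the finite-field arithmetic a production implementation would typically use; the function names are not from the patent.

```python
import numpy as np

def split_into_shares(secret: np.ndarray, num_parties: int = 2):
    """Additively decompose a secret matrix into `num_parties` random shares."""
    shares = [np.random.randn(*secret.shape) for _ in range(num_parties - 1)]
    # The last share is chosen so that all shares sum back to the secret.
    shares.append(secret - sum(shares))
    return shares

def recover(shares):
    """Only the sum of *all* shares reveals the secret; any strict subset looks random."""
    return sum(shares)

secret = np.arange(6.0).reshape(2, 3)
s1, s2 = split_into_shares(secret)
assert np.allclose(recover([s1, s2]), secret)
```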
In the present disclosure, the training sample set used in the linear/logistic regression model training scheme is a vertically sliced training sample set. The term "vertically slicing a training sample set" refers to slicing the training sample set into multiple training sample subsets according to module/function (or some specified rule), each training sample subset containing part of the training sub-samples of every training sample in the training sample set, with all the training sub-samples contained in the subsets together making up the complete training sample. In one example, assume a training sample includes a label y0 and attributes x1, x2, ..., xd; after vertical slicing, the training participant Alice owns y0 and a part of the attributes of the training sample, and the training participant Bob owns the remaining attributes. In another example, the attributes may be partitioned between Alice and Bob in a different way, with Alice again owning y0. Besides these two examples, other partitions are possible and are not listed here.
Assume a sample described by d attributes (also called features) is given as x^T = (x1; x2; ...; xd), where xi is the value of x on the i-th attribute and T denotes the transpose. The linear regression model is then Y = Wx, and the logistic regression model is Y = 1/(1 + e^(-Wx)), where Y is the predicted value and W is the model parameter of the linear/logistic regression model (i.e., the model described in this disclosure), and W_P refers to the sub-model at each training participant P in this disclosure. In this disclosure, attribute value samples are also referred to as feature data samples.
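For reference, the two model forms above can be written directly as code. This is a plain, non-private computation of the predicted value (an illustration only, not part of the patented training scheme), with W a length-d parameter vector and x a length-d feature vector:

```python
import numpy as np

def linear_regression_predict(W: np.ndarray, x: np.ndarray) -> float:
    # Y = W x
    return float(W @ x)

def logistic_regression_predict(W: np.ndarray, x: np.ndarray) -> float:
    # Y = 1 / (1 + e^(-W x))
    return float(1.0 / (1.0 + np.exp(-(W @ x))))
```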
In this disclosure, each training participant owns a different portion of the data of the training samples used to train the linear/logistic regression model. For example, assume a training sample set includes 100 training samples, each containing multiple feature values and a labeled actual value; the data owned by a first participant may then be part of the feature values and the labeled actual value of each of the 100 training samples, and the data owned by a second participant may be the remaining feature values of each of the 100 training samples.
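A small sketch of such a vertical split, with hypothetical feature counts (three feature columns plus the label at the first participant, two feature columns at the second):

```python
import numpy as np

n_samples, n_features_alice, n_features_bob = 100, 3, 2
X = np.random.randn(n_samples, n_features_alice + n_features_bob)  # full feature set
Y = np.random.randint(0, 2, size=(n_samples, 1)).astype(float)     # marker (label) values

# Vertical slicing: each party holds all rows but only its own columns.
X_A = X[:, :n_features_alice]   # Alice's feature sample subset
X_B = X[:, n_features_alice:]   # Bob's feature sample subset
# Alice additionally holds the marker values Y; Bob holds no labels.
```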
For any matrix multiplication described in the present disclosure, it should be determined, as the case requires, whether one or more of the matrices involved need to be transposed so that the matrix multiplication rule is satisfied and the computation can be completed.
Embodiments of methods, apparatuses, and systems for co-training a linear/logistic regression model via two training participants according to the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of an example of a vertically sliced training sample set according to an embodiment of the present disclosure. In FIG. 1, two data parties Alice and Bob are shown; the case of more data parties is similar. For each training sample, the partial training sub-samples owned by the data parties Alice and Bob combine to form the complete content of the training sample. For example, assume the content of a certain training sample includes a label (hereinafter referred to as the "marker value") y0 and attribute features (hereinafter referred to as "feature samples") x1, ..., xd; after vertical slicing, the training participant Alice owns y0 and a part of the feature samples of the training sample, and the training participant Bob owns the remaining feature samples.
Fig. 2 shows an architecture diagram illustrating a system 1 (hereinafter referred to as model training system 1) for co-training a linear/logistic regression model via two training participants according to an embodiment of the present disclosure.
As shown in FIG. 2, the model training system 1 includes a first training participant device 10 and a second training participant device 20. The first training participant device 10 and the second training participant device 20 may communicate with each other over a network 30 such as, but not limited to, the Internet or a local area network. In this disclosure, the first training participant device 10 and the second training participant device 20 are collectively referred to as training participant devices. The first training participant device 10 possesses the marker values, while the second training participant device 20 does not.
In this disclosure, the linear/logistic regression model to be trained is decomposed into two sub-models, one for each training participant device. A training sample set for model training is located at the first training participant device 10 and the second training participant device 20; the training sample set is a vertically sliced training sample set as described above, and it comprises a feature data set and corresponding marker values, i.e., X0 and y0 shown in FIG. 1. The sub-model owned by each training participant, and the corresponding training samples, are secret to that training participant and cannot be learned, or cannot be completely learned, by the other training participant.
In this disclosure, the linear/logistic regression model and the sub-models of the individual training participants are represented using a weight vector W and weight sub-vectors Wi, respectively, where i denotes the sequence number or identity of a training participant (e.g., A and B). The feature data set is represented using a feature matrix X, and the predicted values and marker values are represented using a predicted value vector Ŷ and a marker value vector Y, respectively.
In performing model training, the first training participant device 10 and the second training participant device 20 cooperatively train the linear/logistic regression model using respective subsets of training samples and respective sub-models to perform secret sharing matrix multiplication to obtain predictions for the training sample sets. The specific training process for the model will be described in detail below with reference to fig. 3 to 8.
In this disclosure, the first training participant device 10 and the second training participant device 20 may be any suitable computing devices having computing capabilities. The computing device includes, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, personal Digital Assistants (PDAs), handsets, messaging devices, wearable computing devices, consumer electronic devices, and the like.
Fig. 3 illustrates a flowchart of a method for co-training a linear/logistic regression model via two training participants according to an embodiment of the present disclosure. In the training method shown in FIG. 3, the first training participant Alice has a sub-model W_A of the linear/logistic regression model, and the second training participant Bob has a sub-model W_B of the linear/logistic regression model. The first training participant Alice has a first feature sample subset X_A and the marker value Y, and the second training participant Bob has a second feature sample subset X_B; the first feature sample subset X_A and the second feature sample subset X_B are obtained by vertically slicing the feature sample set X used for model training.
As shown in FIG. 3, first, at block 301, the first training participant Alice and the second training participant Bob initialize their sub-model parameters, i.e., the weight sub-models W_A and W_B, to obtain initial values of the sub-model parameters, and initialize the number of training cycles performed, t, to zero. Here, it is assumed that the end condition of the loop process is that a predetermined number of training cycles has been performed, for example T training cycles.
After the above initialization, at block 302, at Alice and Bob, model transformation processes are performed on the respective initial sub-models, respectively, to obtain transformed sub-models.
FIG. 4 illustrates a flowchart of one example of a model conversion process according to an embodiment of the present disclosure.
As shown in FIG. 4, at Alice, at block 410, the sub-model W_A owned by Alice is decomposed into W_A1 and W_A2. Here, in the decomposition of the sub-model W_A, the attribute value of each element of W_A is decomposed into two partial attribute values, and two new elements are obtained from the decomposed partial attribute values. The two new elements obtained are then assigned to W_A1 and W_A2, respectively, thereby obtaining W_A1 and W_A2.
Next, at block 420, at Bob, the sub-model W_B owned by Bob is decomposed into W_B1 and W_B2.
Then, at block 430, Alice sends W_A2 to Bob, and at block 440, Bob sends W_B1 to Alice.
Next, at block 450, at Alice, W_A1 and W_B1 are spliced together to obtain a conversion sub-model W_A'. The dimension of the resulting conversion sub-model W_A' is equal to the dimension of the feature sample set used for model training. At block 460, at Bob, W_A2 and W_B2 are spliced together to obtain a conversion sub-model W_B'. Likewise, the dimension of the resulting conversion sub-model W_B' is equal to the dimension of the feature sample set used for model training.
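A minimal numpy sketch of this model conversion, with hypothetical dimensions and a random additive decomposition standing in for the element-wise decomposition described above: each party splits its sub-model, swaps one piece, and concatenates, so that each conversion sub-model has the full model dimension and the two conversion sub-models are additive shares of the complete model.

```python
import numpy as np

d_A, d_B = 3, 2                 # feature dimensions held by Alice and Bob (hypothetical)
W_A = np.random.randn(d_A)      # Alice's sub-model
W_B = np.random.randn(d_B)      # Bob's sub-model

# Blocks 410/420: decompose each sub-model into two additive parts.
W_A1 = np.random.randn(d_A); W_A2 = W_A - W_A1
W_B1 = np.random.randn(d_B); W_B2 = W_B - W_B1

# Blocks 430/440: Alice sends W_A2 to Bob, Bob sends W_B1 to Alice.
# Blocks 450/460: each party concatenates, yielding full-dimension conversion sub-models.
W_A_conv = np.concatenate([W_A1, W_B1])   # held by Alice
W_B_conv = np.concatenate([W_A2, W_B2])   # held by Bob

# The two conversion sub-models are additive shares of the full model [W_A; W_B].
assert np.allclose(W_A_conv + W_B_conv, np.concatenate([W_A, W_B]))
```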
Returning to FIG. 3, after model conversion is completed as above, at block 303, the first training participant Alice and the second training participant Bob cooperate to perform vertical-to-horizontal segmentation conversion on the first feature sample subset X_A and the second feature sample subset X_B to obtain a first converted feature sample subset X_A' and a second converted feature sample subset X_B'. Each feature sample in the resulting first converted feature sample subset X_A' and second converted feature sample subset X_B' has the complete feature content of a training sample, i.e., similar to a feature sample subset obtained by horizontally slicing the feature sample set.
Fig. 5 shows a flowchart of a feature sample set conversion process according to an embodiment of the present disclosure.
As shown in FIG. 5, at block 510, at Alice, the first feature sample subset X_A is decomposed into X_A1 and X_A2. At block 520, at Bob, the second feature sample subset X_B is decomposed into X_B1 and X_B2. The decomposition process for the feature sample subsets X_A and X_B is exactly the same as that for the sub-model W_A. Then, at block 530, Alice sends X_A2 to Bob, and at block 540, Bob sends X_B1 to Alice.
Next, at block 550, at Alice, X_A1 and X_B1 are stitched together to obtain the first converted feature sample subset X_A'. The dimension of the resulting first converted feature sample subset X_A' is equal to the dimension of the feature sample set X used for model training. At block 560, at Bob, X_A2 and X_B2 are stitched together to obtain the second converted feature sample subset X_B'. The dimension of the converted feature sample subset X_B' is likewise the same as that of the feature sample set X.
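The feature-side conversion follows the same split-swap-stitch pattern, sketched below with hypothetical shapes; after the conversion each party holds a matrix of the full feature width, and the two converted subsets are additive shares of the complete feature matrix X.

```python
import numpy as np

n, d_A, d_B = 4, 3, 2
X_A = np.random.randn(n, d_A)   # Alice's vertically sliced feature subset
X_B = np.random.randn(n, d_B)   # Bob's vertically sliced feature subset

# Blocks 510/520: decompose each subset into two additive parts.
X_A1 = np.random.randn(n, d_A); X_A2 = X_A - X_A1
X_B1 = np.random.randn(n, d_B); X_B2 = X_B - X_B1

# Blocks 530/540: Alice sends X_A2 to Bob, Bob sends X_B1 to Alice.
# Blocks 550/560: column-wise stitching gives full-width converted subsets.
X_A_conv = np.hstack([X_A1, X_B1])   # at Alice, shape (n, d_A + d_B)
X_B_conv = np.hstack([X_A2, X_B2])   # at Bob, shape (n, d_A + d_B)

assert np.allclose(X_A_conv + X_B_conv, np.hstack([X_A, X_B]))
```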
After the vertical-to-horizontal segmentation conversion of the first feature sample subset X_A and the second feature sample subset X_B as above, the operations of blocks 304 through 314 are performed in a loop until the loop end condition is satisfied.
Specifically, at block 304, based on the current conversion sub-models W_A' and W_B' and the converted feature sample subsets X_A' and X_B' of the training participants, the current predicted value Ŷ of the linear/logistic regression model to be trained for the feature sample set X is obtained using secret sharing matrix multiplication. How the current predicted value Ŷ is obtained using secret sharing matrix multiplication is described below with reference to FIGS. 6 to 8.
After the current predicted value Ŷ is obtained, at block 305, at the first training participant Alice, the prediction difference E = Ŷ - Y between the current predicted value Ŷ and the corresponding marker value Y is determined. Here, E is a column vector, Y is a column vector of the marker values of the training samples X, and Ŷ is a column vector of the current predicted values of the training samples X. If X contains only a single training sample, then E, Y, and Ŷ are each column vectors having only a single element. If X contains multiple training samples, then E, Y, and Ŷ are each column vectors having multiple elements, where each element in Y is the marker value of the corresponding training sample, each element in Ŷ is the current predicted value of the corresponding training sample, and each element in E is the difference between the current predicted value and the marker value of the corresponding training sample.
Then, at block 306, at Alice, a first model update amount TMP1 = X_A' * E is determined using the prediction difference E and the first converted feature sample subset X_A'. Then, at block 307, at Alice, the first model update amount TMP1 is decomposed into TMP1 = TMP1_A + TMP1_B. Here, the decomposition process for TMP1 is the same as the decomposition processes described above and is not repeated here. Subsequently, at block 308, Alice sends TMP1_B to Bob.
Then, at block 309, Alice and Bob perform secret sharing matrix multiplication on the prediction difference E and the second converted feature sample subset X_B' to calculate a second model update amount TMP2 = X_B' * E. Then, at block 310, at Bob, the second model update amount TMP2 is decomposed into TMP2 = TMP2_A + TMP2_B. Subsequently, at block 311, Bob sends TMP2_A to Alice.
Next, at block 312, at Alice, the current conversion sub-model W_A' at Alice is updated based on TMP1_A and TMP2_A. Specifically, first TMP_A = TMP1_A + TMP2_A is calculated, and then TMP_A is used to update the current conversion sub-model W_A'. For example, the following equation (1) may be used to perform the sub-model update:

W_A'(n+1) = W_A'(n) - (α/S) * TMP_A    (1)

where W_A'(n) is the current conversion sub-model at Alice, W_A'(n+1) is the updated conversion sub-model at Alice, α is the learning rate, and S is the number of training samples used in this round of the model training process, i.e., the batch size of this round of the model training process.
At block 313, at Bob, the current conversion sub-model W_B' at Bob is updated based on TMP1_B and TMP2_B. Specifically, first TMP_B = TMP1_B + TMP2_B is calculated, and then TMP_B is used to update the current conversion sub-model W_B'. For example, the following equation (2) may be used to perform the sub-model update:

W_B'(n+1) = W_B'(n) - (α/S) * TMP_B    (2)

where W_B'(n) is the current conversion sub-model at Bob, W_B'(n+1) is the updated conversion sub-model at Bob, α is the learning rate, and S is the number of training samples used in this round of the model training process, i.e., the batch size of this round of the model training process.
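To see what blocks 305 through 313 accomplish together, the following sketch (hypothetical data, computed in the clear rather than via secret sharing, transposes applied as required by the matrix multiplication rule, and using the learning-rate/batch-size scaling of equations (1) and (2) as reconstructed above) checks that after both parties apply their partial model update amounts, the sum of the two conversion sub-models moves exactly as one centralized gradient-descent step on the full data would move the full model.

```python
import numpy as np

n, d = 4, 5
alpha, S = 0.1, n
X_A_conv = np.random.randn(n, d); X_B_conv = np.random.randn(n, d)   # shares of X
W_A_conv = np.random.randn(d);    W_B_conv = np.random.randn(d)      # shares of W
Y = np.random.randn(n)
X = X_A_conv + X_B_conv
W = W_A_conv + W_B_conv

E = X @ W - Y                      # block 305: prediction difference (linear model)
TMP1 = X_A_conv.T @ E              # block 306: first model update amount (at Alice)
TMP2 = X_B_conv.T @ E              # block 309: second model update amount (secret shared)

# Blocks 307/310: decompose each update amount; blocks 308/311: exchange one part each.
TMP1_A = np.random.randn(d); TMP1_B = TMP1 - TMP1_A
TMP2_A = np.random.randn(d); TMP2_B = TMP2 - TMP2_A

# Blocks 312/313: each party updates its conversion sub-model with its two parts.
W_A_new = W_A_conv - alpha / S * (TMP1_A + TMP2_A)
W_B_new = W_B_conv - alpha / S * (TMP1_B + TMP2_B)

# Combined, this equals one plain gradient-descent step on the full data.
assert np.allclose(W_A_new + W_B_new, W - alpha / S * (X.T @ E))
```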
Then, at block 314, it is determined whether a predetermined number of cycles has been reached, i.e., whether a cycle end condition has been reached. If a predetermined number of loops (e.g., T) is reached, block 315 is entered. If the predetermined number of cycles has not been reached, flow returns to the operation of block 302 to perform the next training cycle in which the updated sub-model obtained by each training participant in the current cycle is used as the current sub-model for the next training cycle.
At block 315, sub-models (i.e., trained sub-models) at Alice and Bob are determined based on the updated conversion sub-models of Alice and Bob, respectively.
Specifically, after W_A' and W_B' are trained as above, Alice sends W_A'[|A|:] to Bob, and Bob sends W_B'[0:|A|] to Alice. Here, W_A'[|A|:] refers to the vector components of W_A' from dimension |A| onward, and W_B'[0:|A|] refers to the vector components of W_B' before dimension |A|, i.e., components 0 through |A|. For example, assume W = [0, 1, 2, 3, 4] and |A| is 2; then W[0:|A|] = [0, 1] and W[|A|:] = [2, 3, 4]. Next, at Alice, W_A = W_A'[0:|A|] + W_B'[0:|A|] is calculated, and at Bob, W_B = W_A'[|A|:] + W_B'[|A|:] is calculated, thereby obtaining the trained sub-models W_A and W_B at Alice and Bob.
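A short sketch of this final reconstruction (hypothetical dimensions, with |A| the number of features owned by Alice); because the two conversion sub-models are additive shares of the full model, each party ends up with exactly the slice of the trained model corresponding to its own features.

```python
import numpy as np

d_A, d_B = 2, 3
W_A_conv = np.random.randn(d_A + d_B)   # trained conversion sub-model at Alice
W_B_conv = np.random.randn(d_A + d_B)   # trained conversion sub-model at Bob

# Alice sends W_A_conv[|A|:] to Bob; Bob sends W_B_conv[0:|A|] to Alice.
W_A = W_A_conv[:d_A] + W_B_conv[:d_A]   # computed at Alice
W_B = W_A_conv[d_A:] + W_B_conv[d_A:]   # computed at Bob

full_model = W_A_conv + W_B_conv
assert np.allclose(np.concatenate([W_A, W_B]), full_model)
```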
Here, it should be noted that, in the above example, the end condition of the training loop process is that the predetermined number of cycles is reached. In other examples of the present disclosure, the end condition of the training loop may also be that the determined prediction difference lies within a predetermined range, i.e., each element e_i of the prediction difference E is within a predetermined range, for example, each element e_i of the prediction difference E is less than a predetermined threshold, or the average of the elements of the prediction difference E is less than a predetermined threshold. Accordingly, the operations of block 314 in FIG. 3 may be performed after the operations of block 305.
It should also be noted here that when X_i is a single feature sample, X_i is a feature vector (column vector or row vector) made up of multiple attributes and E is a single prediction difference; when X_i comprises multiple feature samples, X_i is a feature matrix in which the attributes of each feature sample form one column/row of X_i, and E is a prediction difference vector. In calculating X_i * E, each element of E is multiplied by the feature values in X_i that correspond to a certain feature of the respective samples. For example, assuming E is a column vector, E is multiplied in turn by each row of the matrix X_i, where the elements of that row represent the values of one feature for the respective samples.
Fig. 6 shows a flowchart of the predicted value acquisition process according to an embodiment of the present disclosure.
As shown in FIG. 6, first, at block 601, at Alice, Z_A1 = X_A' * W_A' is calculated using the first converted feature sample subset X_A' and the current conversion sub-model W_A'. At block 602, at Bob, Z_B1 = X_B' * W_B' is calculated using the second converted feature sample subset X_B' and the current conversion sub-model W_B'.
Then, at block 603, Alice and Bob use secret sharing matrix multiplication to calculate Z2 = X_A' * W_B' and Z3 = X_B' * W_A'. Here, the secret sharing matrix multiplication may be either the trusted-initializer secret sharing matrix multiplication or the untrusted-initializer secret sharing matrix multiplication; these are described below with reference to FIGS. 7 and 8, respectively.
Next, at block 604, at Alice, Z2 is decomposed into Z_A2 and Z_B2. At block 605, at Bob, Z3 is decomposed into Z_A3 and Z_B3. Here, the decomposition process for Z2 and Z3 is the same as the decomposition process described above for the feature sample subsets and is not repeated here.
Then, at block 606, Alice sends Z_B2 to Bob, and at block 607, Bob sends Z_A3 to Alice.
Next, at block 608, at Alice, Z_A = Z_A1 + Z_A2 + Z_A3 is calculated. At block 609, at Bob, Z_B = Z_B1 + Z_B2 + Z_B3 is calculated. Then, at block 610, Bob sends Z_B to Alice, and at block 611, Alice sends Z_A to Bob.
After receiving Z_B and Z_A respectively, at block 612, at Alice and Bob, the predicted value Ŷ = Z_A + Z_B is obtained.
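The algebra behind FIG. 6 is that Z_A1, Z_B1, Z2, and Z3 are the four cross products of the additive shares, so their sum equals the product of the full feature matrix and the full model. A plain, non-private sanity check of that identity (the two cross terms are computed directly here instead of by secret sharing matrix multiplication):

```python
import numpy as np

n, d = 4, 5
X_A_conv = np.random.randn(n, d); X_B_conv = np.random.randn(n, d)
W_A_conv = np.random.randn(d);    W_B_conv = np.random.randn(d)

Z_A1 = X_A_conv @ W_A_conv   # block 601, computed locally at Alice
Z_B1 = X_B_conv @ W_B_conv   # block 602, computed locally at Bob
Z2   = X_A_conv @ W_B_conv   # block 603, via secret sharing matrix multiplication
Z3   = X_B_conv @ W_A_conv   # block 603, via secret sharing matrix multiplication

Z = Z_A1 + Z_B1 + Z2 + Z3
X = X_A_conv + X_B_conv
W = W_A_conv + W_B_conv
assert np.allclose(Z, X @ W)  # equals the linear-model prediction for the full data
```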
Fig. 7 illustrates a flowchart of one example of trusted-initializer secret sharing matrix multiplication according to an embodiment of the present disclosure. The trusted-initializer secret sharing matrix multiplication shown in FIG. 7 is illustrated by taking the calculation of Z2 = X_A' * W_B' as an example, where X_A' is the converted sample subset at Alice (hereinafter referred to as the feature matrix) and W_B' is the conversion sub-model at Bob (hereinafter referred to as the weight vector).
As shown in FIG. 7, first, at the trusted initializer 30, two random weight vectors W_R,1 and W_R,2, two random feature matrices X_R,1 and X_R,2, and two random marker value vectors Y_R,1 and Y_R,2 are generated, where Y_R,1 + Y_R,2 = (X_R,1 + X_R,2) * (W_R,1 + W_R,2). Here, the dimension of the random weight vectors is the same as that of the conversion sub-model (weight vector) of each training participant, the dimension of the random feature matrices is the same as that of the converted sample subset (feature matrix), and the dimension of the random marker value vectors is the same as that of the marker value vector.
Then, at block 701, the trusted initializer 30 sends the generated W_R,1, X_R,1, and Y_R,1 to Alice, and at block 702 sends the generated W_R,2, X_R,2, and Y_R,2 to Bob.
Next, at block 703, at Alice, the feature matrix X_A' is decomposed into two feature sub-matrices, i.e., feature sub-matrices X_A1' and X_A2'.
For example, assume the feature matrix X_A' comprises two feature samples S1 and S2, each containing 3 attribute values, where S1 = [a1^(1), a2^(1), a3^(1)] and S2 = [a1^(2), a2^(2), a3^(2)]. Then, after the feature matrix X_A' is decomposed into the two feature sub-matrices X_A1' and X_A2', the first feature sub-matrix X_A1' includes the feature sub-samples [a11^(1), a21^(1), a31^(1)] and [a11^(2), a21^(2), a31^(2)], and the second feature sub-matrix X_A2' includes the feature sub-samples [a12^(1), a22^(1), a32^(1)] and [a12^(2), a22^(2), a32^(2)], where a11^(1) + a12^(1) = a1^(1), a21^(1) + a22^(1) = a2^(1), a31^(1) + a32^(1) = a3^(1), a11^(2) + a12^(2) = a1^(2), a21^(2) + a22^(2) = a2^(2), and a31^(2) + a32^(2) = a3^(2).
Then, at block 704, Alice sends the decomposed feature sub-matrix X_A2' to Bob.
At block 705, at Bob, the weight vector W_B' is decomposed into two weight sub-vectors W_B1' and W_B2'. The decomposition process for the weight vector is the same as the decomposition process described above. At block 706, Bob sends the weight sub-vector W_B1' to Alice.
Then, at each training participant, the weight sub-vector difference E and the feature sub-matrix difference D at that training participant are determined based on its weight sub-vector, the corresponding feature sub-matrix, and the random weight vector and random feature matrix it received. For example, at block 707, at Alice, its weight sub-vector difference E1 = W_B1' - W_R,1 and feature sub-matrix difference D1 = X_A1' - X_R,1 are determined. At block 708, at Bob, its weight sub-vector difference E2 = W_B2' - W_R,2 and feature sub-matrix difference D2 = X_A2' - X_R,2 are determined.
After each training participant determines its weight sub-vector difference Ei and feature sub-matrix difference Di, Alice sends D1 and E1 to the training partner Bob at block 709. At block 710, the training partner Bob sends D2 and E2 to Alice.
Then, at block 711, at each training participant, the weight sub-vector differences and the feature sub-matrix differences of both training participants are summed to obtain a weight sub-vector total difference E and a feature sub-matrix total difference D, respectively. For example, as shown in FIG. 7, D = D1 + D2 and E = E1 + E2.
Then, at each training participant, the corresponding predicted value vector Zi is calculated based on the received random weight vector W_R,i, random feature matrix X_R,i, random marker value vector Y_R,i, the weight sub-vector total difference E, and the feature sub-matrix total difference D.
In one example of the present disclosure, at each training participant, the random marker value vector of that training participant, the product of the weight sub-vector total difference and the random feature matrix of that training participant, and the product of the feature sub-matrix total difference and the random weight vector of that training participant may be summed to obtain the corresponding predicted value vector (first calculation mode). Alternatively, the random marker value vector of that training participant, the product of the weight sub-vector total difference and the random feature matrix of that training participant, the product of the feature sub-matrix total difference and the random weight vector of that training participant, and the product of the weight sub-vector total difference and the feature sub-matrix total difference may be summed to obtain the corresponding predicted value vector (second calculation mode).
It should be noted here that, among the predicted value vectors calculated at the training participants, only one contains the product of the weight sub-vector total difference and the feature sub-matrix total difference. In other words, only one training participant calculates its predicted value vector according to the second calculation mode, while the remaining training participants calculate their corresponding predicted value vectors according to the first calculation mode.
For example, at block 712, at Alice, the corresponding predicted value vector Z1 = Y_R,1 + E * X_R,1 + D * W_R,1 + D * E is calculated. At block 713, at Bob, the corresponding predicted value vector Z2 = Y_R,2 + E * X_R,2 + D * W_R,2 is calculated.
Here, FIG. 7 shows that Z1 calculated at Alice contains D * E. In other examples of the present disclosure, D * E may instead be included in Z2 calculated by Bob, in which case D * E is not included in Z1 calculated at Alice. In other words, only one of the Zi calculated at the respective training participants contains D * E.
Alice then sends Z1 to Bob at block 714. At block 715, Bob sends Z2 to Alice.
Then, at blocks 716 and 717, the respective training participants sum Z = Z1 + Z2 to obtain the secret sharing matrix multiplication result.
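The FIG. 7 flow can be checked numerically. The sketch below (hypothetical shapes, with X_A' an n-by-d matrix and W_B' a length-d vector, and with the products ordered as X_R * E and D * W_R so that the matrix multiplication rule is satisfied, per the transpose note earlier) verifies that the two locally computed predicted value shares sum to X_A' * W_B'.

```python
import numpy as np

n, d = 4, 3
X_Ap = np.random.randn(n, d)    # Alice's converted feature subset X_A'
W_Bp = np.random.randn(d)       # Bob's conversion sub-model W_B'

# Trusted initializer: random matrices/vectors satisfying
#   Y_R1 + Y_R2 = (X_R1 + X_R2) @ (W_R1 + W_R2)
X_R1, X_R2 = np.random.randn(n, d), np.random.randn(n, d)
W_R1, W_R2 = np.random.randn(d), np.random.randn(d)
Y_R = (X_R1 + X_R2) @ (W_R1 + W_R2)
Y_R1 = np.random.randn(n); Y_R2 = Y_R - Y_R1

# Blocks 703-706: decompose the inputs and exchange one share each.
X_A1 = np.random.randn(n, d); X_A2 = X_Ap - X_A1   # X_A2 goes to Bob
W_B1 = np.random.randn(d);    W_B2 = W_Bp - W_B1   # W_B1 goes to Alice

# Blocks 707-711: masked differences, published and summed by both parties.
E = (W_B1 - W_R1) + (W_B2 - W_R2)      # = W_B' - (W_R1 + W_R2)
D = (X_A1 - X_R1) + (X_A2 - X_R2)      # = X_A' - (X_R1 + X_R2)

# Blocks 712-713: local predicted value shares (only Z1 carries the D*E term).
Z1 = Y_R1 + X_R1 @ E + D @ W_R1 + D @ E    # at Alice
Z2 = Y_R2 + X_R2 @ E + D @ W_R2            # at Bob

# Blocks 714-717: exchanging and summing the shares recovers X_A' * W_B'.
assert np.allclose(Z1 + Z2, X_Ap @ W_Bp)
```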
Fig. 8 illustrates a flowchart of one example of untrusted-initializer secret sharing matrix multiplication according to an embodiment of the present disclosure. In FIG. 8, the calculation of X_A' * W_B' between the training participants Alice and Bob is used as an example.
As shown in FIG. 8, first, at block 801, if the number of rows of X_A' at Alice (hereinafter referred to as the first feature matrix) is not even, and/or the number of columns of the current sub-model parameter W_B' at Bob (hereinafter referred to as the first weight sub-matrix) is not even, dimension-padding processing is performed on the first feature matrix X_A' and/or the first weight sub-matrix W_B' so that the number of rows of the first feature matrix X_A' is even and/or the number of columns of the first weight sub-matrix W_B' is even. For example, a row of 0 values is appended to the end of the first feature matrix X_A' and/or a column of 0 values is appended to the end of the first weight sub-matrix W_B' to perform the dimension padding. In the following description, it is assumed that the first weight sub-matrix W_B' has dimension I*J and the first feature matrix X_A' has dimension J*K, where J is an even number.
The operations of blocks 802 through 804 are then performed at Alice to obtain a random feature matrix X1 and second and third feature matrices X2 and X3. Specifically, at block 802, the random feature matrix X1 is generated. Here, the dimension of the random feature matrix X1 is the same as that of the first feature matrix X_A', i.e., the dimension of X1 is J*K. At block 803, the first feature matrix X_A' is subtracted from the random feature matrix X1 to obtain the second feature matrix X2. The dimension of the second feature matrix X2 is J*K. At block 804, the odd-row sub-matrix X1_o of the random feature matrix X1 is subtracted from the even-row sub-matrix X1_e of the random feature matrix X1 to obtain the third feature matrix X3. The dimension of the third feature matrix X3 is j*K, where j = J/2.
In addition, the operations of blocks 805 through 807 are performed at Bob to obtain a random weight sub-matrix W_B1 and second and third weight sub-matrices W_B2 and W_B3. Specifically, at block 805, the random weight sub-matrix W_B1 is generated. Here, the dimension of the random weight sub-matrix W_B1 is the same as that of the first weight sub-matrix W_B', i.e., the dimension of W_B1 is I*J. At block 806, the first weight sub-matrix W_B' and the random weight sub-matrix W_B1 are summed to obtain the second weight sub-matrix W_B2. The dimension of the second weight sub-matrix W_B2 is I*J. At block 807, the odd-column sub-matrix W_B1_o of the random weight sub-matrix W_B1 is added to the even-column sub-matrix W_B1_e of the random weight sub-matrix W_B1 to obtain the third weight sub-matrix W_B3. The dimension of the third weight sub-matrix W_B3 is I*j, where j = J/2.
Alice then sends the generated second and third feature matrices X2 and X3 to Bob at block 808, and at block 809 Bob sends the second weight sub-matrix W_B2 and the third weight sub-matrix W_B3 to Alice.
Next, at block 810, at Alice, a first matrix product Y1 is calculated based on the equation Y1 = W_B2 * (2 * X_A' - X1) - W_B3 * (X3 + X1_e), and at block 812, the first matrix product Y1 is sent to Bob.
At block 811, at Bob, a second matrix product Y2 is calculated based on the equation Y2 = (W_B' + 2 * W_B1) * X2 + (W_B3 + W_B1_o) * X3, and at block 813, the second matrix product Y2 is sent to Alice.
Then, at blocks 814 and 815, the first and second matrix products Y1 and Y2 are summed at Alice and Bob, respectively, to obtain X_A' * W_B' = Y_B = Y1 + Y2.
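The FIG. 8 flow can likewise be checked numerically. The sketch below assumes one consistent reading of the notation above: the even/odd sub-matrices pair the even-indexed columns of the weight matrices with the even-indexed rows of the feature matrices, X2 = X1 - X_A', and the shared product is W_B' * X_A' (with W_B' of dimension I*J and X_A' of dimension J*K, J even after padding), matrices being ordered per the transpose note earlier.

```python
import numpy as np

I, J, K = 2, 4, 3                      # J must be even (after dimension padding)
W_Bp = np.random.randn(I, J)           # first weight sub-matrix W_B' at Bob
X_Ap = np.random.randn(J, K)           # first feature matrix X_A' at Alice

# At Alice (blocks 802-804):
X1 = np.random.randn(J, K)             # random feature matrix
X2 = X1 - X_Ap                         # second feature matrix
X3 = X1[0::2, :] - X1[1::2, :]         # even-row minus odd-row sub-matrix of X1

# At Bob (blocks 805-807):
W_B1 = np.random.randn(I, J)           # random weight sub-matrix
W_B2 = W_Bp + W_B1                     # second weight sub-matrix
W_B3 = W_B1[:, 1::2] + W_B1[:, 0::2]   # odd-column plus even-column sub-matrix of W_B1

# Blocks 808-809: Alice sends X2, X3 to Bob; Bob sends W_B2, W_B3 to Alice.
# Block 810 (at Alice) and block 811 (at Bob):
Y1 = W_B2 @ (2 * X_Ap - X1) - W_B3 @ (X3 + X1[0::2, :])
Y2 = (W_Bp + 2 * W_B1) @ X2 + (W_B3 + W_B1[:, 1::2]) @ X3

# Blocks 814-815: the two matrix products sum to the desired product.
assert np.allclose(Y1 + Y2, W_Bp @ X_Ap)
```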
Here, FIGS. 6 to 8 show the calculation process of the current predicted value Y = W * X for the linear regression model. In the case of a logistic regression model, W * X may first be determined according to the procedure shown in FIGS. 6 to 8, and the determined W * X is then substituted into the logistic regression model Y = 1/(1 + e^(-W*X)) to calculate the current predicted value.
By using the linear/logistic regression model training method disclosed in FIGS. 3 to 8, the model parameters of the linear/logistic regression model can be trained without leaking the secret data of the training participants. Moreover, the model-training workload is only linearly, rather than exponentially, related to the number of feature samples used for training, so the efficiency of model training can be improved while the security of each training participant's data is ensured.
Fig. 9 shows a schematic diagram of an apparatus (hereinafter referred to as a model training apparatus) 900 for co-training a linear/logistic regression model via two training participants according to an embodiment of the present disclosure. Each training participant has a sub-model of the linear/logistic regression model; the first training participant (Alice) has a first feature sample subset and a marker value, and the second training participant (Bob) has a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing the feature sample set used for model training. The model training apparatus 900 is located on the first training participant side.
As shown in fig. 9, the model training apparatus 900 includes a model conversion unit 910, a sample conversion unit 920, a predicted value acquisition unit 930, a prediction difference determination unit 940, a model update amount determination unit 950, a model update amount decomposition unit 960, a model update amount transmission/reception unit 970, a model update unit 980, and a model determination unit 990.
The model conversion unit 910 is configured to perform model conversion processing on the sub-models of the respective training participants to obtain conversion sub-models of the respective training participants. The operation of the model conversion unit 910 may refer to the operation of the block 302 described above with reference to fig. 3 and the operation described with reference to fig. 4.
During model training, the sample conversion unit 920, the predicted value acquisition unit 930, the prediction difference determination unit 940, the model update amount determination unit 950, the model update amount decomposition unit 960, the model update amount transmission/reception unit 970, and the model update unit 980 are configured to perform operations in a loop until a loop end condition is satisfied. The loop end condition may include: reaching a predetermined number of loops; or the determined prediction difference being within a predetermined range. At the end of each loop iteration, if the loop is not finished, the updated conversion sub-model of each training participant is used as the current conversion sub-model for the next loop iteration.
Specifically, during each iteration, the sample conversion unit 920 is configured to perform a vertical-to-horizontal segmentation conversion on the feature sample set to obtain converted feature sample subsets at the respective training participants. The operation of the sample conversion unit 920 may refer to the procedure described above with reference to fig. 5.
The predictor obtaining unit 930 is configured to obtain a current predictor for the feature sample set using secret sharing matrix multiplication based on the current conversion sub-model and the conversion feature sample subset of the respective training participants. The operation of the predictor obtaining unit 930 may refer to the operation of the block 304 described above with reference to fig. 3 and the operations described with reference to fig. 6 to 8.
The prediction difference determination unit 940 is configured to determine a prediction difference between the current predicted value and the corresponding marker value. The operation of the prediction difference determination unit 940 may refer to the operation of block 305 described above with reference to fig. 3.
The model update amount determination unit 950 is configured to determine a first model update amount using the prediction difference value and the conversion feature sample subset at the first training participant. The operation of the model update amount determination unit 950 may refer to the operation of block 306 described above with reference to fig. 3.
The model update amount decomposition unit 960 is configured to decompose a first model update amount into two first partial model update amounts. The operation of the model update amount decomposition unit 960 may refer to the operation of block 307 described above with reference to fig. 3.
The model update amount transmitting/receiving unit 970 is configured to transmit a first partial model update amount to the second training participant, and to receive a second partial model update amount from the second training participant, the second partial model update amount being obtained by decomposing the second model update amount at the second training participant, the second model update amount being obtained by performing secret sharing matrix multiplication on the prediction difference and the conversion feature sample subset at the second training participant. The operation of the model update amount transmission/reception unit 970 may refer to the operation of blocks 308/311 described above with reference to fig. 3.
The model update unit 980 is configured to update the current conversion sub-model at the first training participant based on the remaining first partial model update quantity and the received second partial model update quantity. The operation of the model updating unit 980 may refer to the operation of block 312 described above with reference to fig. 3.
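A minimal sketch of this update step follows, assuming a plain gradient-descent style rule with learning rate alpha; the patent only states that the current conversion sub-model is updated from the retained and received partial model update amounts (block 312), so the exact scaling is an assumption.

```python
def update_conversion_submodel(w_conv, kept_partial_update, received_partial_update, alpha=0.01):
    """Combine the retained partial model update amount with the partial update
    amount received from the peer, then apply it to the current conversion
    sub-model. The learning rate alpha and the subtraction form are assumptions."""
    return w_conv - alpha * (kept_partial_update + received_partial_update)
```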
The model determination unit 990 is configured to determine a sub-model of the first training participant based on the conversion sub-model of the first training participant and the second training participant when the loop end condition is satisfied. The operation of the model determination unit 990 may refer to the operation of block 315 described above with reference to fig. 3.
In one example of the present disclosure, the sample conversion unit 920 may include a sample decomposition module (not shown), a sample transmission/reception module (not shown), and a sample splicing module (not shown). The sample decomposition module is configured to decompose the first feature sample subset into two first partial feature sample subsets. The sample transmitting/receiving module is configured to transmit a first partial feature sample subset to a second training participant and to receive a second partial feature sample subset from the second training participant, the second partial feature sample subset being obtained by decomposing the feature sample subset at the second training participant. The sample stitching module is configured to stitch the remaining first partial feature sample subset and the received second partial feature sample subset to obtain a first converted feature sample subset.
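The decompose/transmit/splice flow of the sample conversion unit can be pictured with the short sketch below. It assumes the decomposition is an additive split (the two partial subsets sum to the original subset, mirroring the sub-model decomposition) and that splicing is concatenation along the feature dimension; both points, and all names, are our reading rather than a verbatim reproduction of the process of FIG. 5.

```python
import numpy as np

def decompose_feature_subset(own_subset, rng=None):
    """Split a vertically partitioned feature sample subset into two partial
    subsets that sum to the original; one is kept, the other is sent to the peer."""
    rng = rng or np.random.default_rng()
    kept = rng.standard_normal(own_subset.shape)
    to_send = own_subset - kept
    return kept, to_send

def splice_conversion_subset(kept_partial, received_partial):
    """Splice the retained partial subset with the peer's partial subset to form
    the conversion feature sample subset (feature axis assumed to be axis 1)."""
    return np.concatenate([kept_partial, received_partial], axis=1)
```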
Fig. 10 shows a block diagram of one example of a predicted value acquisition unit (hereinafter referred to as "predicted value acquisition unit 1000") according to an embodiment of the present disclosure. As shown in fig. 10, the predicted value acquisition unit 1000 may include a first calculation module 1010, a second calculation module 1020, a matrix product decomposition module 1030, a matrix product transmission/reception module 1040, a first summation module 1050, a sum-value transmission/reception module 1060, and a second summation module 1070.
The first calculation module 1010 is configured to calculate a first matrix product of the conversion sub-model (W_A') of the first training participant and the conversion feature sample subset (X_A') of the first training participant. The operation of the first calculation module 1010 may refer to the operation of block 601 described above with reference to fig. 6.
The second calculation module 1020 is configured to calculate, using secret sharing matrix multiplication, a second matrix product of the conversion sub-model (W_B') of the second training participant and the conversion feature sample subset (X_A') of the first training participant, and a third matrix product of the conversion sub-model (W_A') of the first training participant and the conversion feature sample subset (X_B') of the second training participant. The operation of the second calculation module 1020 may refer to the operation of block 603 described above with reference to fig. 6 and the operations described with reference to figs. 7 to 8.
The matrix product decomposition module 1030 is configured to decompose the calculated second matrix product to obtain two second partial matrix products. The operation of the matrix product decomposition module 1030 may refer to the operation of block 604 described above with reference to fig. 6.
The matrix product transmission/reception module 1040 is configured to transmit one second partial matrix product to the second training participant and to receive a third partial matrix product from the second training participant. The third partial matrix product is obtained by decomposing the third matrix product at the second training participant, the third matrix product being the matrix product of the conversion sub-model (W_A') of the first training participant and the conversion feature sample subset (X_B') of the second training participant. The operation of the matrix product transmission/reception module 1040 may refer to the operations of blocks 606 and 607 described above with reference to fig. 6.
The first summing module 1050 is configured to sum the first matrix product, the second partial matrix product, and the third partial matrix product to obtain a first matrix product sum value at the first training participant. The operation of the first summing module 1050 may refer to the operation of block 608 described above with reference to fig. 6.
The sum-value transmission/reception module 1060 is configured to receive a second matrix product-sum value (Z_B) from the second training participant and to transmit the first matrix product-sum value (Z_A) obtained at the first training participant to the second training participant. The operation of the sum-value transmission/reception module 1060 may refer to the operations of blocks 610/611 described above with reference to fig. 6.
The second summing module 1070 is configured to sum the resulting first and second matrix product-sum values to obtain a current predicted value of the linear/logistic regression model for the set of feature samples. The operation of the second summing module 1070 may refer to the operation of block 612 described above with reference to fig. 6.
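Putting the modules 1010 to 1070 together, the data flow at the first training participant can be sketched as follows. The helper `secret_share_product` stands in for the FIG. 7/FIG. 8 sub-protocol and simply returns two additive shares of the true product, which is enough to show how the shares recombine into W*X; every name here is illustrative and the single-process layout is a simulation, not the disclosed apparatus.

```python
import numpy as np

def secret_share_product(w, x, rng):
    """Stand-in for the secret sharing matrix multiplication of FIGS. 7-8:
    return two additive shares of w @ x, one per participant."""
    share = rng.standard_normal((w.shape[0], x.shape[1]))
    return share, w @ x - share

def predicted_value_at_alice(W_A, X_A, W_B, X_B, rng=None):
    """Sketch of the FIG. 10 pipeline (modules 1010-1070), simulated locally."""
    rng = rng or np.random.default_rng()
    # module 1010: local first matrix product W_A' @ X_A'
    y_local_A = W_A @ X_A
    # modules 1020-1040: shares of W_B' @ X_A' and of W_A' @ X_B'
    share_BA_alice, share_BA_bob = secret_share_product(W_B, X_A, rng)
    share_AB_bob, share_AB_alice = secret_share_product(W_A, X_B, rng)
    # module 1050: Alice's matrix product-sum value Z_A
    Z_A = y_local_A + share_BA_alice + share_AB_alice
    # Bob's symmetric value Z_B (computed at Bob in the real protocol)
    Z_B = W_B @ X_B + share_BA_bob + share_AB_bob
    # modules 1060/1070: exchange Z_A and Z_B, then sum them into the current predicted value
    return Z_A + Z_B
```

Because the conversion sub-models and conversion feature sample subsets are additive shares, the returned value equals (W_A' + W_B') * (X_A' + X_B'), i.e., the product of the full model and the full feature sample set.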
In one example of the present disclosure, the second calculation module 1020 may be configured to calculate, using trusted initializer secret sharing matrix multiplication, the matrix product of the conversion sub-model (W_B') of the second training participant and the conversion feature sample subset (X_A') of the first training participant, and the matrix product of the conversion sub-model (W_A') of the first training participant and the conversion feature sample subset (X_B') of the second training participant. The operation of the second calculation module 1020 may refer to the operations performed at the first training participant (Alice) described above with reference to fig. 7.
In another example of the present disclosure, the second calculation module 1020 may be configured to calculate, using untrusted initializer secret sharing matrix multiplication, the matrix product of the conversion sub-model (W_B') of the second training participant and the conversion feature sample subset (X_A') of the first training participant, and the matrix product of the conversion sub-model (W_A') of the first training participant and the conversion feature sample subset (X_B') of the second training participant. The operation of the second calculation module 1020 may refer to the operations performed at the first training participant (Alice) described above with reference to fig. 8.
Fig. 11 shows a schematic diagram of an apparatus (hereinafter referred to as a model training apparatus) 1100 for co-training a linear/logistic regression model via two training participants according to an embodiment of the present disclosure. Each training participant has a sub-model of the linear/logistic regression model; the first training participant (Alice) has a first feature sample subset and a marker value, and the second training participant (Bob) has a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing the feature sample set used for model training. The model training apparatus 1100 is located on the second training participant side.
As shown in fig. 11, the model training apparatus 1100 includes a model conversion unit 1110, a sample conversion unit 1120, a predicted value acquisition unit 1130, a model update amount reception unit 1140, a model update amount determination unit 1150, a model update amount decomposition unit 1160, a model update amount transmission unit 1170, a model update unit 1180, and a model determination unit 1190.
The model conversion unit 1110 is configured to perform model conversion processing on the sub-models of the respective training participants to obtain conversion sub-models of the respective training participants. The operation of the model conversion unit 1110 may refer to the operation of the block 302 described above with reference to fig. 3 and the operation described with reference to fig. 4.
During model training, the sample conversion unit 1120, the predicted value acquisition unit 1130, the model update amount reception unit 1140, the model update amount determination unit 1150, the model update amount decomposition unit 1160, the model update amount transmission unit 1170, and the model update unit 1180 are configured to perform operations in a loop until a loop end condition is satisfied. The loop end condition may include: reaching a predetermined number of loops; or the determined prediction difference being within a predetermined range. At the end of each loop iteration, if the loop is not finished, the updated conversion sub-model of each training participant is used as the current conversion sub-model for the next loop iteration.
Specifically, during each iteration, the sample conversion unit 1120 is configured to perform a vertical-to-horizontal segmentation conversion on the feature sample set to obtain converted feature sample subsets at the respective training participants. The operation of the sample conversion unit 1120 may refer to the process described above with reference to fig. 5. Further, the sample conversion unit 1120 may have the same structure as the sample conversion unit 920.
The predicted value acquisition unit 1130 is configured to obtain the current predicted value for the feature sample set using secret sharing matrix multiplication based on the current conversion sub-model and the conversion feature sample subset of each training participant. Here, the predicted value acquisition unit 1130 may be configured to obtain the current predicted value for the feature sample set using trusted initializer secret sharing matrix multiplication or untrusted initializer secret sharing matrix multiplication. The operation of the predicted value acquisition unit 1130 may refer to the operation of block 304 described above with reference to fig. 3. The predicted value acquisition unit 1130 may employ the same structure as the predicted value acquisition unit 930 (i.e., the structure shown in fig. 10). Accordingly, the second calculation module in the predicted value acquisition unit 1130 is configured to calculate the matrix product of the conversion sub-model (W_B') of the second training participant and the conversion feature sample subset (X_A') of the first training participant, and the matrix product of the conversion sub-model (W_A') of the first training participant and the conversion feature sample subset (X_B') of the second training participant.
The model update amount receiving unit 1140 is configured to receive a first partial model update amount from a first training participant, the first partial model update amount being obtained by decomposing the first model update amount at the first training participant, the first model update amount being determined at the first training participant using a prediction difference value and a subset of conversion feature samples at the first training participant, wherein the prediction difference value is a difference value between a current prediction value and a corresponding marker value. The operation of the model update amount receiving unit 1140 may refer to the operation of block 308 described above with reference to fig. 3.
The model update amount determination unit 1150 is configured to perform secret sharing matrix multiplication on the prediction difference and the conversion feature sample subset at the second training participant to obtain a second model update amount. The operation of the model update amount determination unit 1150 may refer to the operation of block 309 described above with reference to fig. 3. Here, the model update amount determination unit 1150 may be implemented using the second calculation module 1020 described in fig. 10. That is, the model update amount determination unit 1150 may be configured to perform trusted initializer secret sharing matrix multiplication or untrusted initializer secret sharing matrix multiplication on the prediction difference and the conversion feature sample subset at the second training participant to obtain the second model update amount.
The model update amount decomposition unit 1160 is configured to decompose the second model update amount into two second partial model update amounts. The operation of the model update amount decomposition unit 1160 may refer to the operation of block 310 described above with reference to fig. 3.
The model update amount transmitting unit 1170 is configured to transmit a second partial model update amount to the first training participant. The operation of the model update amount transmission unit 1170 may refer to the operation of the block 311 described above with reference to fig. 3.
The model update unit 1180 is configured to update the current conversion sub-model of the second training participant based on the remaining second partial model update amount and the received first partial model update amount. The operation of the model update unit 1180 may refer to the operation of block 313 described above with reference to fig. 3.
The model determination unit 1190 is configured to determine a sub-model of the second training participant based on the conversion sub-models of the first training participant and the second training participant when the loop end condition is satisfied. The operation of the model determination unit 1190 may refer to the operation of block 315 described above with reference to fig. 3.
Embodiments of model training methods, apparatus, and systems according to the present disclosure are described above with reference to fig. 1-11. The above model training apparatus may be implemented in hardware, or may be implemented in software or a combination of hardware and software.
Fig. 12 illustrates a hardware block diagram of a computing device 1200 for implementing co-training of a linear/logistic regression model via two training participants, according to an embodiment of the disclosure. As shown in fig. 12, the computing device 1200 may include at least one processor 1210, a memory (e.g., a non-volatile memory) 1220, an internal memory 1230, and a communication interface 1240, and the at least one processor 1210, the memory 1220, the internal memory 1230, and the communication interface 1240 are connected together via a bus 1260. The at least one processor 1210 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in the memory.
In one embodiment, computer-executable instructions are stored in memory that, when executed, cause the at least one processor 1210 to: performing model conversion processing on the sub-models of all training participants to obtain conversion sub-models of all training participants; the following loop process is performed until the loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a converted feature sample subset at each training participant; obtaining a current predicted value for the feature sample set using secret sharing matrix multiplication based on the current conversion sub-model and the conversion feature sample subset of each training participant; determining a prediction difference value between the current prediction value and the corresponding marking value; determining a first model update amount using the prediction difference and a subset of conversion feature samples at the first training partner; decomposing the first model updating quantity into two first partial model updating quantities, and transmitting one first partial model updating quantity to a second training participant; and receiving a second partial model update from the second training participant, the second partial model update resulting from decomposing the second model update at the second training participant, the second model update resulting from performing a secret sharing matrix multiplication on the prediction difference and the subset of conversion feature samples at the second training participant; updating the current conversion sub-model at the first training participant based on the remaining first partial model update amount and the received second partial model update amount, wherein, when the cyclic process is not over, the updated conversion sub-model of the respective training participant is used as the current conversion sub-model of the next cyclic process; when the loop end condition is satisfied, a sub-model of the first training participant is determined based on the conversion sub-models of the first training participant and the second training participant.
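To make the sequence of these stored instructions concrete, the sketch below simulates the whole loop in one process, with additive shares standing in for every exchange. It collapses the two per-participant model update amounts into a single shared gradient and assumes a squared-error gradient and a fixed learning rate, none of which is spelled out in the paragraph above, so treat it as an illustration of the control flow rather than the disclosed implementation.

```python
import numpy as np

def simulate_two_party_training(X_A, X_B, y, rounds=200, lr=0.05, seed=0):
    """Single-process sketch of the training loop: X_A and X_B are the vertically
    split feature subsets (samples x features), y is the marker value vector held
    by the first training participant. Additive shares stand in for the
    secret-sharing exchanges; reconstruction happens only inside the simulation."""
    rng = np.random.default_rng(seed)
    X = np.concatenate([X_A, X_B], axis=1)       # full feature matrix, rebuilt only for this simulation
    n, d = X.shape
    # conversion sub-models: additive shares of the full weight vector
    w_share_1 = rng.standard_normal(d) * 0.01
    w_share_2 = rng.standard_normal(d) * 0.01
    for _ in range(rounds):
        w = w_share_1 + w_share_2                # current model, reconstructed only for simulation
        prediction = X @ w                       # current predicted value (linear regression case)
        diff = prediction - y                    # prediction difference
        update = X.T @ diff / n                  # model update amount (assumed gradient form)
        # decompose the update into two partial update amounts, one per participant
        update_share_1 = rng.standard_normal(d)
        update_share_2 = update - update_share_1
        w_share_1 -= lr * update_share_1
        w_share_2 -= lr * update_share_2
    return w_share_1 + w_share_2                 # sum of the final conversion sub-models
```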
It should be appreciated that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1210 to perform the various operations and functions described above in connection with fig. 1-11 in various embodiments of the present disclosure.
Fig. 13 illustrates a hardware block diagram of a computing device 1300 for implementing co-training of a linear/logistic regression model via two training participants, according to an embodiment of the disclosure. As shown in fig. 13, the computing device 1300 may include at least one processor 1310, a memory (e.g., a non-volatile memory) 1320, an internal memory 1330, and a communication interface 1340, and the at least one processor 1310, the memory 1320, the internal memory 1330, and the communication interface 1340 are connected together via a bus 1360. The at least one processor 1310 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in the memory.
In one embodiment, computer-executable instructions are stored in memory that, when executed, cause the at least one processor 1310 to: performing model conversion processing on the sub-models of all training participants to obtain conversion sub-models of all training participants; the following loop process is performed until the loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a converted feature sample subset at each training participant; obtaining a current predicted value for the feature sample set using secret sharing matrix multiplication based on the current conversion sub-model and the conversion feature sample subset of each training participant; receiving a first partial model update from a first training participant, the first partial model update resulting from decomposing the first model update at the first training participant, the first model update determined at the first training participant using a prediction difference and a subset of conversion feature samples at the first training participant, wherein the prediction difference is a difference between a current prediction value and a corresponding marker value; performing a secret sharing matrix multiplication on the prediction difference and the subset of conversion feature samples at the second training partner to obtain a second model update amount; decomposing the second model updating quantity into two second partial model updating quantities, and transmitting one second partial model updating quantity to the first training participant; and updating the current conversion sub-model of the second training participant based on the remaining second partial model update amount and the received first partial model update amount, wherein, when the cyclic process is not over, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next cyclic process; and determining a sub-model of the second training party based on the conversion sub-model of the first training party and the second training party when the cycle end condition is satisfied.
It should be appreciated that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1310 to perform the various operations and functions described above in connection with fig. 1-11 in various embodiments of the present disclosure.
According to one embodiment, a program product such as a machine-readable medium (e.g., a non-transitory machine-readable medium) is provided. The machine-readable medium may have instructions (i.e., the elements described above implemented in software) that, when executed by a machine, cause the machine to perform the various operations and functions described above in connection with fig. 1-11 in various embodiments of the disclosure. In particular, a system or apparatus provided with a readable storage medium having stored thereon software program code implementing the functions of any of the above embodiments may be provided, and a computer or processor of the system or apparatus may be caused to read out and execute instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium may implement the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or cloud by a communications network.
It will be appreciated by those skilled in the art that various changes and modifications can be made to the embodiments disclosed above without departing from the spirit of the invention. Accordingly, the scope of the invention should be limited only by the attached claims.
It should be noted that not all the steps and units in the above flowcharts and the system configuration diagrams are necessary, and some steps or units may be omitted according to actual needs. The order of execution of the steps is not fixed and may be determined as desired. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by multiple physical entities, or may be implemented jointly by some components in multiple independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may include permanently dedicated circuitry or logic (e.g., a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware unit or processor may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The particular implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments, but does not represent all embodiments that may be implemented or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (18)

1. A method for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first feature sample subset and a marker value, the second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set, the method being performed by the first training participant, the method comprising:
carrying out model conversion processing on the sub-models of all training participants together with the second training participants to obtain conversion sub-models of all training participants;
the following loop process is performed until the loop end condition is satisfied:
performing vertical-horizontal segmentation conversion on the feature sample subsets with the second training participants to obtain converted feature sample subsets at the training participants;
along with the second training participants, obtaining current predictions for the feature sample sets using secret sharing matrix multiplication based on the current conversion sub-model and the conversion feature sample subsets of the respective training participants;
determining a prediction difference between the current predicted value and a corresponding marker value;
determining a first model update amount using the prediction difference and a subset of conversion feature samples at the first training participant;
decomposing the first model updating amount into two first partial model updating amounts, and transmitting one first partial model updating amount to the second training participant; and
receiving a second partial model update from the second training participant, the second partial model update resulting from decomposing a second model update at the second training participant, the second model update resulting from performing a secret sharing matrix multiplication on the prediction difference and a subset of the conversion feature samples at the second training participant;
updating a current conversion sub-model at the first training participant based on the remaining first partial model update amount and the received second partial model update amount, wherein, when the cyclic process is not over, the updated conversion sub-model of each training participant is used as the current conversion sub-model for the next cyclic process;
determining a sub-model of the first training participant based on the conversion sub-model of the first training participant and the second training participant when the cycle end condition is satisfied,
Wherein, with the second training participants, performing model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants includes:
decomposing the sub-model of the first training participant into two first partial sub-models;
transmitting a first partial sub-model to the second training participant and receiving a second partial sub-model from the second training participant, the second partial sub-model being obtained by decomposing the sub-model at the second training participant; and
stitching the remaining first part sub-model and the received second part sub-model to obtain a conversion sub-model at the first training partner,
wherein performing vertical-to-horizontal segmentation conversion on the respective feature sample subsets with the second training participants to obtain converted feature sample subsets at the respective training participants comprises:
decomposing the first feature sample subset into two first partial feature sample subsets;
transmitting a first subset of partial feature samples to the second training participant;
receiving a second subset of partial feature samples from the second training participant, the second subset of partial feature samples resulting from decomposing the second subset of feature samples at the second training participant; and
The remaining first partial feature sample subset and the received second partial feature sample subset are stitched to obtain a first converted feature sample subset at the first training participant.
2. The method of claim 1, wherein, with the second training participant, using secret shared matrix multiplication to obtain current predictions for the feature sample set based on the current conversion sub-model and the conversion feature sample subset of the respective training participant comprises:
performing matrix product calculation by using the current conversion sub-model and the first conversion characteristic sample subset to obtain a local matrix product at the first training participant;
performing, with the second training participant, a secret sharing matrix multiplication calculation using the first subset of conversion feature samples and a current conversion sub-model at the second training participant, resulting in a secret sharing matrix product at the first training participant;
decomposing the secret sharing matrix product at the first training participant into two first partial secret sharing matrix products, and transmitting one first partial secret sharing matrix product to the second training participant;
Receiving a second partial secret sharing matrix product from the second training participant, the second partial secret sharing matrix product being obtained by decomposing the resulting secret sharing matrix product at the second training participant, the secret sharing matrix product at the second training participant being obtained by performing a secret sharing matrix multiplication calculation using a second subset of conversion feature samples at the second training participant with the current conversion sub-model at the first training participant;
summing the local matrix product at the first training participant, the remaining first partial secret sharing matrix product and the received second partial secret sharing matrix product to obtain a partial current predicted value at the first training participant;
transmitting a portion of the current predicted value at the first training participant to the second training participant, and receiving a portion of the current predicted value at the second training participant from the second training participant, the portion of the current predicted value at the second training participant being obtained by summing, at the second training participant, a local matrix product at the second training participant, a remaining second partial secret shared matrix product, and the received first partial secret shared matrix product, the local matrix product at the second training participant being obtained by matrix product calculation at the second training participant using a current conversion sub-model at the second training participant and the second conversion feature sample subset; and
And summing the partial current predicted value at the first training participant and the received partial current predicted value at the second training participant to obtain the current predicted value for the characteristic sample set.
3. The method of claim 2, wherein the first subset of conversion feature samples is characterized as a feature matrix, the current conversion sub-model at the second training participant is characterized as a weight vector,
performing, with the second training participant, a secret sharing matrix multiplication calculation using the first subset of conversion feature samples and a current conversion sub-model at the second training participant, the deriving a secret sharing matrix product at the first training participant comprising:
each training participant receives the random weight vector, the random feature matrix and the random marker value vector from the trusted initializer;
decomposing the feature matrix into two feature submatrices, and transmitting one feature submatrix to the second training participant;
receiving a weight sub-vector from the second training participant, the weight sub-vector being obtained by decomposing the weight vector at the second training participant;
At each training participant, determining a weight sub-vector difference value and a feature sub-matrix difference value at the training participant based on the weight sub-vector, the corresponding feature sub-matrix, and the received random weight vector and random feature matrix of each training participant;
sharing the respectively determined weight sub-vector differences and feature sub-matrix differences between the first and second training participants;
summing the weight sub-vector difference value and the feature sub-matrix difference value at each training participant to obtain a weight sub-vector total difference value and a feature sub-matrix total difference value, and calculating respective predicted value vectors based on the received random weight vector, random feature matrix, random marker value vector, weight sub-vector total difference value and feature sub-matrix total difference value;
receiving, from the second training participant, a vector of predicted values determined at the second training participant; and
and summing the predicted value vectors determined by the first training participant and the second training participant to obtain a secret sharing matrix product at the first training participant.
4. The method of claim 2, wherein the first subset of conversion feature samples is characterized as a first feature matrix X, the current conversion sub-model at the second training participant is characterized as a first weight sub-matrix Wi,
Performing, with the second training participant, a secret sharing matrix multiplication calculation using the first subset of conversion feature samples and a current conversion sub-model at the second training participant, the deriving a secret sharing matrix product at the first training participant comprising:
if the number of rows of the first feature matrix X is not even, performing dimension filling processing on the first feature matrix X at the first training participant so that the number of rows of the first feature matrix X after dimension filling is even, and/or if the number of columns of the first weight sub-matrix Wi is not even, performing dimension filling processing on the first weight sub-matrix Wi at the second training participant so that the number of columns of the first weight sub-matrix Wi after dimension filling is even;
generating a random feature matrix X1;
subtracting the first feature matrix X from the random feature matrix X1 to obtain a second feature matrix X2;
subtracting the odd-row sub-matrix X1_o of the random feature matrix X1 from the even-row sub-matrix X1_e of the random feature matrix X1 to obtain a third feature matrix X3;

transmitting the generated second and third feature matrices X2 and X3 to the second training participant, and receiving a second weight sub-matrix W2 and a third weight sub-matrix W3 from the second training participant, wherein the second weight sub-matrix W2 is obtained by summing the first weight sub-matrix Wi and a random weight sub-matrix W1 generated at the second training participant, and the third weight sub-matrix W3 is obtained by adding the odd-column sub-matrix W1_o of the random weight sub-matrix W1 to the even-column sub-matrix of the random weight sub-matrix W1;

performing matrix calculation based on the equation Y1 = W2*(2*X - X1) - W3*(X3 + X1_e) to obtain a first matrix product Y1, and transmitting the first matrix product Y1 to the second training participant;

receiving a second matrix product Y2 from the second training participant, the second matrix product Y2 being obtained by performing matrix calculation at the second training participant based on the equation Y2 = (Wi + 2*W1)*X2 + (W3 + W1_o)*X3, wherein W1_o is the odd-column sub-matrix of the random weight sub-matrix W1; and

summing the first matrix product Y1 and the second matrix product Y2 to obtain the secret sharing matrix product at the first training participant.
5. The method of any one of claims 1 to 4, wherein the cycle end condition comprises:
a predetermined number of cycles; or alternatively
The predicted difference is within a predetermined range.
6. A method for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first feature sample subset and a marker value, the second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set, the method being performed by the second training participant, the method comprising:
carrying out model conversion processing on the sub-models of all training participants together with the first training participants to obtain conversion sub-models of all training participants;
the following loop process is performed until the loop end condition is satisfied:
performing vertical-horizontal segmentation conversion on the feature sample subsets with the first training participants to obtain converted feature sample subsets at the training participants;
along with the first training participant, obtaining a current prediction value for a feature sample set using secret sharing matrix multiplication based on a current conversion sub-model and a conversion feature sample subset of each training participant;
Receiving a first partial model update from the first training participant, the first partial model update resulting from decomposing a first model update at the first training participant, the first model update determined at the first training participant using a predicted difference value and a subset of conversion feature samples at the first training participant, wherein the predicted difference value is a difference value between the current predicted value and a corresponding marker value;
performing a secret sharing matrix multiplication on the prediction difference and a subset of conversion feature samples at the second training participant to obtain a second model update amount;
decomposing the second model updating amount into two second partial model updating amounts, and transmitting one second partial model updating amount to the first training participant; and
updating the current conversion sub-model of the second training participant based on the remaining second partial model update amount and the received first partial model update amount, wherein, when the cyclic process is not over, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next cyclic process;
Determining a sub-model of the second training party based on the conversion sub-model of the first training party and the second training party when the cycle end condition is satisfied,
the model conversion processing is performed on the sub-model of each training participant together with the first training participant, so as to obtain a conversion sub-model of each training participant, which comprises the following steps:
decomposing the sub-model of the second training participant into two second partial sub-models;
transmitting a second part of the sub-model to the first training participant and receiving a first part of the sub-model from the first training participant, the first part of the sub-model being obtained by decomposing the sub-model at the first training participant; and
stitching the remaining second part sub-model and the received first part sub-model to obtain a conversion sub-model at the second training partner,
wherein performing vertical-to-horizontal segmentation conversion on the respective feature sample subsets with the first training participant to obtain converted feature sample subsets at the respective training participants comprises:
Decomposing the second feature sample subset into two second partial feature sample subsets;
transmitting a second subset of partial feature samples to the first training participant;
receiving a first subset of partial feature samples from the first training participant, the first subset of partial feature samples resulting from decomposing the subset of feature samples at the first training participant; and
and splicing the remaining second partial feature sample subset and the received first partial feature sample subset to obtain a second conversion feature sample subset at the second training participant.
7. The method of claim 6, wherein, with the first training participant, using secret shared matrix multiplication to obtain current predictions for the feature sample set based on the current conversion sub-model and the conversion feature sample subset of the respective training participant comprises:
performing matrix product calculation by using the current conversion submodel of the second training participant and the second conversion characteristic sample subset to obtain a local matrix product at the second training participant;
performing, with the first training participant, a secret sharing matrix multiplication calculation using the second subset of conversion feature samples and a current conversion sub-model at the first training participant, resulting in a secret sharing matrix product at the second training participant;
Decomposing the secret sharing matrix product at the second training participant into two second partial secret sharing matrix products, and transmitting one second partial secret sharing matrix product to the first training participant;
receiving a first partial secret sharing matrix product from the first training participant, the first partial secret sharing matrix product being obtained by decomposing the resulting secret sharing matrix product at the first training participant, the secret sharing matrix product at the first training participant being obtained by performing a secret sharing matrix multiplication calculation using a first subset of conversion feature samples at the first training participant and a current conversion sub-model at the second training participant;
summing the local matrix product at the second training participant, the remaining second partial secret sharing matrix product and the received first partial secret sharing matrix product to obtain a partial current predicted value at the second training participant;
transmitting a portion of the current predicted value at the second training participant to the first training participant, and receiving a portion of the current predicted value at the first training participant from the first training participant, the portion of the current predicted value at the first training participant being obtained by summing, at the first training participant, a local matrix product at the first training participant, a remaining first partial secret shared matrix product, and the received second partial secret shared matrix product, the local matrix product at the first training participant being obtained by matrix product calculation at the first training participant using a current conversion sub-model at the first training participant and the first conversion feature sample subset; and
And summing the partial current predicted value at the second training party and the received partial current predicted value at the first training party to obtain the current predicted value aiming at the characteristic sample set.
8. The method of claim 7, wherein the second subset of conversion feature samples is characterized as a feature matrix, the current conversion sub-model at the first training participant is characterized as a weight vector,
performing, with the first training participant, a secret sharing matrix multiplication computation using the second subset of conversion feature samples and a current conversion sub-model at the first training participant, the deriving a secret sharing matrix product at the second training participant comprising:
each training participant receives the random weight vector, the random feature matrix and the random marker value vector from the trusted initializer;
decomposing the feature matrix into two feature submatrices, and transmitting one feature submatrix to the first training participant;
receiving a weight sub-vector from the first training participant, the weight sub-vector being obtained by decomposing the weight vector at the first training participant;
At each training participant, determining a weight sub-vector difference value and a feature sub-matrix difference value at the training participant based on the weight sub-vector, the corresponding feature sub-matrix, and the received random weight vector and random feature matrix of each training participant;
sharing the respectively determined weight sub-vector differences and feature sub-matrix differences between the first and second training participants;
summing the weight sub-vector difference value and the feature sub-matrix difference value at each training participant to obtain a weight sub-vector total difference value and a feature sub-matrix total difference value, and calculating respective predicted value vectors based on the received random weight vector, random feature matrix, random marker value vector, weight sub-vector total difference value and feature sub-matrix total difference value;
receiving, from the first training participant, a vector of predictors determined at the first training participant; and
and summing the predicted value vectors determined by the first training party and the second training party to obtain a secret sharing matrix product at the second training party.
9. The method of claim 7, wherein the second subset of conversion feature samples is characterized as a first feature matrix X, the current conversion sub-model at the first training participant is characterized as a first weight sub-matrix Wi,
Performing, with the first training participant, a secret sharing matrix multiplication computation using the second subset of conversion feature samples and a current conversion sub-model at the first training participant, the deriving a secret sharing matrix product at the second training participant comprising:
if the number of rows of the first feature matrix X is not even, performing dimension filling processing on the first feature matrix X at the second training participant so that the number of rows of the first feature matrix X after dimension filling is even, and/or if the number of columns of the first weight sub-matrix Wi is not even, performing dimension filling processing on the first weight sub-matrix Wi at the first training participant so that the number of columns of the first weight sub-matrix Wi after dimension filling is even;
generating a random feature matrix X1;
subtracting the first feature matrix X from the random feature matrix X1 to obtain a second feature matrix X2;
subtracting the odd-row sub-matrix X1_o of the random feature matrix X1 from the even-row sub-matrix X1_e of the random feature matrix X1 to obtain a third feature matrix X3;

transmitting the generated second and third feature matrices X2 and X3 to the first training participant, and receiving a second weight sub-matrix W2 and a third weight sub-matrix W3 from the first training participant, wherein the second weight sub-matrix W2 is obtained by summing the first weight sub-matrix Wi and a random weight sub-matrix W1 generated at the first training participant, and the third weight sub-matrix W3 is obtained by adding the odd-column sub-matrix W1_o of the random weight sub-matrix W1 to the even-column sub-matrix of the random weight sub-matrix W1;

performing matrix calculation based on the equation Y1 = W2*(2*X - X1) - W3*(X3 + X1_e) to obtain a first matrix product Y1, and transmitting the first matrix product Y1 to the first training participant;

receiving a second matrix product Y2 from the first training participant, the second matrix product Y2 being obtained by performing matrix calculation at the first training participant based on the equation Y2 = (Wi + 2*W1)*X2 + (W3 + W1_o)*X3, wherein W1_o is the odd-column sub-matrix of the random weight sub-matrix W1; and

summing the first matrix product Y1 and the second matrix product Y2 to obtain the secret sharing matrix product at the second training participant.
10. An apparatus for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first feature sample subset and a marker value, the second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set, the apparatus being located on the first training participant side, the apparatus comprising:
The model conversion unit is configured to perform model conversion processing on the sub-models of the training participants together with the second training participants so as to obtain conversion sub-models of the training participants;
a sample conversion unit configured to perform vertical-horizontal segmentation conversion on respective feature sample subsets together with the second training participants to obtain converted feature sample subsets at the respective training participants;
a predicted value acquisition unit configured to obtain, with the second training participants, current predicted values for the feature sample sets using secret sharing matrix multiplication based on the current conversion sub-model and the conversion feature sample subsets of the respective training participants;
a prediction difference value determination unit configured to determine a prediction difference value between the current predicted value and a corresponding marker value;
a model update amount determination unit configured to determine a first model update amount using the prediction difference value and a subset of conversion feature samples at a first training participant;
a model update amount decomposition unit configured to decompose the first model update amount into two first partial model update amounts;
a model update amount transmitting/receiving unit configured to transmit a first partial model update amount to the second training participant, and to receive a second partial model update amount from the second training participant, the second partial model update amount being obtained by decomposing a second model update amount at the second training participant, the second model update amount being obtained by performing a secret sharing matrix multiplication on the prediction difference value and a conversion feature sample subset of the second training participant;
A model updating unit configured to update a current conversion sub-model at the first training participant based on the remaining first partial model update amount and the received second partial model update amount; and
a model determination unit configured to determine a sub-model of the first training participant based on the conversion sub-model of the first training participant and the second training participant when a cycle end condition is satisfied,
wherein the sample conversion unit, the predicted value acquisition unit, the predicted difference value determination unit, the model update amount decomposition unit, the model update amount transmission/reception unit, and the model update unit cyclically execute operations until the cycle end condition is satisfied,
wherein, when the cyclic process is not finished, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next cyclic process,
wherein the model conversion unit is configured to:
decomposing the sub-model of the first training participant into two first partial sub-models;
transmitting a first partial sub-model to the second training participant and receiving a second partial sub-model from the second training participant, the second partial sub-model being obtained by decomposing the sub-model at the second training participant; and
stitching the remaining first partial sub-model and the received second partial sub-model to obtain the conversion sub-model at the first training participant,
wherein the sample conversion unit includes:
a sample decomposition module configured to decompose the first feature sample subset into two first partial feature sample subsets;
a sample transmitting/receiving module configured to transmit one first partial feature sample subset to the second training participant and to receive a second partial feature sample subset from the second training participant, the second partial feature sample subset being obtained by decomposing the second feature sample subset at the second training participant; and
a sample stitching module configured to stitch the remaining first partial feature sample subset and the received second partial feature sample subset to obtain a first conversion feature sample subset at the first training participant.
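Claim 10's sample conversion sub-units (and its model conversion unit) follow the same decompose / exchange / stitch pattern. Under the assumption, not spelled out in the claim, that the decomposition is a random additive split and the stitching is column-wise concatenation, the following NumPy sketch shows how the two conversion feature sample subsets together represent the virtually joined feature matrix while neither party ever sees the other's raw columns; shapes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def split(M):
    """Decompose a matrix into two random additive parts (an assumed reading of
    'decompose ... into two partial feature sample subsets')."""
    R = rng.normal(size=M.shape)
    return M - R, R

# Vertically sliced data: party 1 holds feature columns XA, party 2 holds XB,
# for the same rows/samples (hypothetical shapes).
XA = rng.normal(size=(6, 3))
XB = rng.normal(size=(6, 2))

# Each party decomposes its slice and transmits one part to the other party.
XA1, XA2 = split(XA)   # party 1 keeps XA1, sends XA2
XB1, XB2 = split(XB)   # party 2 keeps XB2, sends XB1

# Each party stitches its remaining part with the part it received.
Z1 = np.hstack([XA1, XB1])   # conversion feature sample subset at party 1
Z2 = np.hstack([XA2, XB2])   # conversion feature sample subset at party 2

# Jointly the two conversion subsets represent the full, virtually joined matrix.
assert np.allclose(Z1 + Z2, np.hstack([XA, XB]))
```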
11. The apparatus of claim 10, wherein the predicted value acquisition unit is configured to:
performing matrix product calculation using the current conversion sub-model and the first conversion feature sample subset to obtain a local matrix product at the first training participant;
performing, with the second training participant, a secret sharing matrix multiplication calculation using the first conversion feature sample subset and the current conversion sub-model at the second training participant to obtain a secret sharing matrix product at the first training participant;
decomposing the secret sharing matrix product at the first training participant into two first partial secret sharing matrix products, and transmitting one first partial secret sharing matrix product to the second training participant;
receiving a second partial secret sharing matrix product from the second training participant, the second partial secret sharing matrix product being obtained by decomposing the secret sharing matrix product obtained at the second training participant, the secret sharing matrix product at the second training participant being obtained by performing a secret sharing matrix multiplication calculation using the second conversion feature sample subset at the second training participant and the current conversion sub-model at the first training participant;
summing the local matrix product at the first training participant, the remaining first partial secret sharing matrix product and the received second partial secret sharing matrix product to obtain a partial current predicted value at the first training participant;
transmitting the partial current predicted value at the first training participant to the second training participant, and receiving the partial current predicted value at the second training participant from the second training participant, the partial current predicted value at the second training participant being obtained by summing, at the second training participant, the local matrix product at the second training participant, the remaining second partial secret sharing matrix product, and the received first partial secret sharing matrix product, the local matrix product at the second training participant being obtained by matrix product calculation at the second training participant using the current conversion sub-model at the second training participant and the second conversion feature sample subset; and
summing the partial current predicted value at the first training participant and the received partial current predicted value at the second training participant to obtain the current predicted value for the feature sample set.
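Claim 11 assembles the current predicted value from a local matrix product, two secret-shared cross products that are decomposed and exchanged, and a final exchange of partial predicted values. The sketch below follows that data flow in NumPy; the secret sharing matrix multiplication sub-protocol itself is not re-implemented, so the cross products are simply computed in the clear and re-split into shares to keep the example short. Shapes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def split(M):
    """Split a matrix into two random additive shares."""
    R = rng.normal(size=M.shape)
    return M - R, R

# Conversion feature sample subsets Z1, Z2 and conversion sub-models W1, W2,
# as produced by the conversion steps (hypothetical shapes).
Z1 = rng.normal(size=(6, 5)); Z2 = rng.normal(size=(6, 5))
W1 = rng.normal(size=(5, 1)); W2 = rng.normal(size=(5, 1))

# Local matrix products at each participant.
local1 = Z1 @ W1
local2 = Z2 @ W2

# Cross terms would come out of the secret sharing matrix multiplication; here
# they are computed directly and re-split so each party holds one share of each.
c12_1, c12_2 = split(Z1 @ W2)
c21_1, c21_2 = split(Z2 @ W1)

# Each participant sums its local product with its two cross-term shares ...
partial1 = local1 + c12_1 + c21_1   # partial current predicted value at party 1
partial2 = local2 + c12_2 + c21_2   # partial current predicted value at party 2

# ... the partial predicted values are exchanged and summed at each participant.
y_hat = partial1 + partial2
assert np.allclose(y_hat, (Z1 + Z2) @ (W1 + W2))
```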
12. An apparatus for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first feature sample subset and label values, the second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing the feature sample set, the apparatus being located on the second training participant side, the apparatus comprising:
a model conversion unit configured to perform, together with the first training participant, model conversion processing on the sub-models of the respective training participants to obtain conversion sub-models of the respective training participants;
a sample conversion unit configured to perform, together with the first training participant, vertical-to-horizontal segmentation conversion on the respective feature sample subsets to obtain conversion feature sample subsets at the respective training participants;
a predicted value acquisition unit configured to obtain, together with the first training participant, a current predicted value for the feature sample set using secret sharing matrix multiplication based on the current conversion sub-models and the conversion feature sample subsets of the respective training participants;
a model update amount receiving unit configured to receive a first partial model update amount from the first training participant, the first partial model update amount being obtained by decomposing a first model update amount at the first training participant, the first model update amount being determined at the first training participant using a prediction difference value and the conversion feature sample subset at the first training participant, wherein the prediction difference value is the difference value between the current predicted value and the corresponding label value;
a second model update amount determination unit configured to perform a secret sharing matrix multiplication on the prediction difference value and the conversion feature sample subset at the second training participant to obtain a second model update amount;
a model update amount decomposition unit configured to decompose the second model update amount into two second partial model update amounts;
a model update amount transmitting unit configured to transmit a second partial model update amount to the first training participant;
a model updating unit configured to update a current conversion sub-model of the second training participant based on the remaining second partial model update amount and the received first partial model update amount; and
a model determination unit configured to determine a sub-model of the second training participant based on the conversion sub-models of the first training participant and the second training participant when a cycle end condition is satisfied,
wherein the sample conversion unit, the predicted value acquisition unit, the model update amount receiving unit, the second model update amount determination unit, the model update amount decomposition unit, the model update amount transmitting unit, and the model updating unit cyclically execute operations until the cycle end condition is satisfied,
wherein, when the cyclic process is not finished, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next cyclic process,
wherein the model conversion unit is configured to:
decomposing the sub-model of the second training participant into two second partial sub-models;
transmitting a second partial sub-model to the first training participant and receiving a first partial sub-model from the first training participant, the first partial sub-model being obtained by decomposing the sub-model at the first training participant; and
stitching the remaining second partial sub-model and the received first partial sub-model to obtain the conversion sub-model at the second training participant,
wherein the sample conversion unit includes:
a sample decomposition module configured to decompose the second subset of feature samples into two second partial subsets of feature samples;
a sample transmitting/receiving module configured to transmit one second partial feature sample subset to the first training participant and to receive a first partial feature sample subset from the first training participant, the first partial feature sample subset being obtained by decomposing the first feature sample subset at the first training participant; and
a sample stitching module configured to stitch the remaining second partial feature sample subset and the received first partial feature sample subset to obtain a second conversion feature sample subset at the second training participant.
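Claim 12's update units describe a symmetric exchange of decomposed model update amounts: each participant keeps one part of its own update amount, sends the other part, and updates its conversion sub-model with the kept part plus the received part. Under the same additive-sharing reading as above, the combined model then moves exactly like an ordinary gradient step on the virtually joined data, as the NumPy sketch below checks; the learning rate, shapes, and the in-clear computation of the prediction difference are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

def split(M):
    """Split a matrix into two random additive parts."""
    R = rng.normal(size=M.shape)
    return M - R, R

lr = 0.1                                                      # assumed learning rate
Z1 = rng.normal(size=(6, 5)); Z2 = rng.normal(size=(6, 5))    # conversion feature shares
W1 = rng.normal(size=(5, 1)); W2 = rng.normal(size=(5, 1))    # conversion sub-models
y  = rng.normal(size=(6, 1))                                  # label values

# Prediction difference (obtained via the prediction protocol; computed in the
# clear here only to keep the sketch short).
e = (Z1 + Z2) @ (W1 + W2) - y

# First model update amount at participant 1; second at participant 2 (the latter
# uses the secret sharing matrix multiplication in the protocol itself).
G1 = Z1.T @ e
G2 = Z2.T @ e

# Each participant decomposes its update amount and transmits one part.
G1_keep, G1_send = split(G1)
G2_keep, G2_send = split(G2)

# Each participant updates its conversion sub-model with the remaining part plus
# the part received from the other participant.
W1_new = W1 - lr * (G1_keep + G2_send)
W2_new = W2 - lr * (G2_keep + G1_send)

# The combined model takes exactly one plain gradient step on the joined data.
assert np.allclose(W1_new + W2_new, (W1 + W2) - lr * (Z1 + Z2).T @ e)
```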
13. The apparatus of claim 12, wherein the predicted value acquisition unit is configured to:
performing matrix product calculation using the current conversion sub-model of the second training participant and the second conversion feature sample subset to obtain a local matrix product at the second training participant;
performing, with the first training participant, a secret sharing matrix multiplication calculation using the second conversion feature sample subset and the current conversion sub-model at the first training participant to obtain a secret sharing matrix product at the second training participant;
decomposing the secret sharing matrix product at the second training participant into two second partial secret sharing matrix products, and transmitting one second partial secret sharing matrix product to the first training participant;
receiving a first partial secret sharing matrix product from the first training participant, the first partial secret sharing matrix product being obtained by decomposing the secret sharing matrix product obtained at the first training participant, the secret sharing matrix product at the first training participant being obtained by performing a secret sharing matrix multiplication calculation using the first conversion feature sample subset at the first training participant and the current conversion sub-model at the second training participant;
summing the local matrix product at the second training participant, the remaining second partial secret sharing matrix product, and the received first partial secret sharing matrix product to obtain a partial current predicted value at the second training participant;
transmitting the partial current predicted value at the second training participant to the first training participant, and receiving the partial current predicted value at the first training participant from the first training participant, the partial current predicted value at the first training participant being obtained by summing, at the first training participant, the local matrix product at the first training participant, the remaining first partial secret sharing matrix product, and the received second partial secret sharing matrix product, the local matrix product at the first training participant being obtained by matrix product calculation at the first training participant using the current conversion sub-model at the first training participant and the first conversion feature sample subset; and
summing the partial current predicted value at the second training participant and the received partial current predicted value at the first training participant to obtain the current predicted value for the feature sample set.
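When the cycle end condition is met, the model determination units derive each participant's sub-model from both participants' conversion sub-models. The claims do not spell out this step in detail; one plausible reading, consistent with the additive-sharing sketches above, is that each participant receives the other's share of the block corresponding to its own features and sums the two blocks, as in the illustrative NumPy fragment below.

```python
import numpy as np

rng = np.random.default_rng(3)

dA, dB = 3, 2                        # assumed feature widths of participants 1 and 2
W1 = rng.normal(size=(dA + dB, 1))   # final conversion sub-model at participant 1
W2 = rng.normal(size=(dA + dB, 1))   # final conversion sub-model at participant 2

# Each participant sends the other participant's feature block of its own share;
# each then sums the two blocks that correspond to its own features.
WA = W1[:dA] + W2[:dA]   # sub-model determined by participant 1
WB = W1[dA:] + W2[dA:]   # sub-model determined by participant 2

# Stacked together, the determined sub-models equal the combined converted model.
assert np.allclose(np.vstack([WA, WB]), W1 + W2)
```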
14. A system for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first feature sample subset and label values, the second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing the feature sample set, the system comprising:
a first training participant device comprising the apparatus of claim 10 or 11; and
a second training participant device comprising the apparatus of claim 12 or 13.
15. A computing device, comprising:
at least one processor, and
a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 1-5.
16. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 1 to 5.
17. A computing device, comprising:
at least one processor, and
a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 6 to 9.
18. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any of claims 6 to 9.
CN201910599381.2A 2019-07-04 2019-07-04 Model training method, device and system Active CN112183757B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910599381.2A CN112183757B (en) 2019-07-04 2019-07-04 Model training method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910599381.2A CN112183757B (en) 2019-07-04 2019-07-04 Model training method, device and system

Publications (2)

Publication Number Publication Date
CN112183757A CN112183757A (en) 2021-01-05
CN112183757B true CN112183757B (en) 2023-10-27

Family

ID=73915881

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910599381.2A Active CN112183757B (en) 2019-07-04 2019-07-04 Model training method, device and system

Country Status (1)

Country Link
CN (1) CN112183757B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114662156B (en) * 2022-05-25 2022-09-06 蓝象智联(杭州)科技有限公司 Longitudinal logistic regression modeling method based on anonymized data
CN115618966A (en) * 2022-10-30 2023-01-17 抖音视界有限公司 Method, apparatus, device and medium for training machine learning model

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9870537B2 (en) * 2014-01-06 2018-01-16 Cisco Technology, Inc. Distributed learning in a computer network
US9563854B2 (en) * 2014-01-06 2017-02-07 Cisco Technology, Inc. Distributed model training
US10229493B2 (en) * 2016-03-16 2019-03-12 International Business Machines Corporation Joint segmentation and characteristics estimation in medical images
US10755172B2 (en) * 2016-06-22 2020-08-25 Massachusetts Institute Of Technology Secure training of multi-party deep neural network
US10270599B2 (en) * 2017-04-27 2019-04-23 Factom, Inc. Data reproducibility using blockchains
WO2019005946A2 (en) * 2017-06-27 2019-01-03 Leighton Bonnie Berger Secure genome crowdsourcing for large-scale association studies
CN109388661B (en) * 2017-08-02 2020-04-21 创新先进技术有限公司 Model training method and device based on shared data
US11436471B2 (en) * 2017-10-13 2022-09-06 Panasonic Intellectual Property Corporation Of America Prediction model sharing method and prediction model sharing system

Patent Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101815081A (en) * 2008-11-27 2010-08-25 北京大学 Distributed calculation logic comparison method
CN102135989A (en) * 2011-03-09 2011-07-27 北京航空航天大学 Normalized matrix-factorization-based incremental collaborative filtering recommending method
KR20130067345A (en) * 2011-12-13 2013-06-24 한양대학교 산학협력단 Method for learning task skill and robot using thereof
US10152676B1 (en) * 2013-11-22 2018-12-11 Amazon Technologies, Inc. Distributed training of models using stochastic gradient descent
CN105450394A (en) * 2015-12-30 2016-03-30 中国农业大学 Share updating method and device based on threshold secret sharing
CN107025205A (en) * 2016-01-30 2017-08-08 华为技术有限公司 A kind of method and apparatus of training pattern in distributed system
CN109214404A (en) * 2017-07-07 2019-01-15 阿里巴巴集团控股有限公司 Training sample generation method and device based on secret protection
CN109754060A (en) * 2017-11-06 2019-05-14 阿里巴巴集团控股有限公司 A kind of training method and device of neural network machine learning model
WO2019100724A1 (en) * 2017-11-24 2019-05-31 华为技术有限公司 Method and device for training multi-label classification model
KR20190072770A (en) * 2017-12-18 2019-06-26 경희대학교 산학협력단 Method of performing encryption and decryption based on reinforced learning and client and server system performing thereof
CN109165515A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Model parameter acquisition methods, system and readable storage medium storing program for executing based on federation's study
CN109255247A (en) * 2018-08-14 2019-01-22 阿里巴巴集团控股有限公司 Secure calculation method and device, electronic equipment
CN109214436A (en) * 2018-08-22 2019-01-15 阿里巴巴集团控股有限公司 A kind of prediction model training method and device for target scene
CN109409125A (en) * 2018-10-12 2019-03-01 南京邮电大学 It is a kind of provide secret protection data acquisition and regression analysis
CN109583468A (en) * 2018-10-12 2019-04-05 阿里巴巴集团控股有限公司 Training sample acquisition methods, sample predictions method and corresponding intrument
CN109299161A (en) * 2018-10-31 2019-02-01 阿里巴巴集团控股有限公司 A kind of data selecting method and device
CN109635462A (en) * 2018-12-17 2019-04-16 深圳前海微众银行股份有限公司 Model parameter training method, device, equipment and medium based on federation's study
CN109640095A (en) * 2018-12-28 2019-04-16 中国科学技术大学 A kind of video encryption system of binding capacity quantum key distribution
CN109840588A (en) * 2019-01-04 2019-06-04 平安科技(深圳)有限公司 Neural network model training method, device, computer equipment and storage medium
WO2019072316A2 (en) * 2019-01-11 2019-04-18 Alibaba Group Holding Limited A distributed multi-party security model training framework for privacy protection
WO2019072315A2 (en) * 2019-01-11 2019-04-18 Alibaba Group Holding Limited A logistic regression modeling scheme using secrete sharing
CN109871702A (en) * 2019-02-18 2019-06-11 深圳前海微众银行股份有限公司 Federal model training method, system, equipment and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multi-secret sharing for general access structures using neural networks; 周洪伟; 徐松林; 原锦辉; Computer Engineering and Design, No. 20; full text *

Also Published As

Publication number Publication date
CN112183757A (en) 2021-01-05

Similar Documents

Publication Publication Date Title
CN111523673B (en) Model training method, device and system
CN111062487B (en) Machine learning model feature screening method and device based on data privacy protection
CN110942147B (en) Neural network model training and predicting method and device based on multi-party safety calculation
CN110929870B (en) Method, device and system for training neural network model
CN111079939B (en) Machine learning model feature screening method and device based on data privacy protection
CN111061963B (en) Machine learning model training and predicting method and device based on multi-party safety calculation
CN111723404B (en) Method and device for jointly training business model
CN111523556B (en) Model training method, device and system
CN112052942B (en) Neural network model training method, device and system
CN110782044A (en) Method and device for multi-party joint training of neural network of graph
CN111738438B (en) Method, device and system for training neural network model
CN112132270B (en) Neural network model training method, device and system based on privacy protection
CN110929887B (en) Logistic regression model training method, device and system
CN111523674B (en) Model training method, device and system
CN112183757B (en) Model training method, device and system
CN114186256B (en) Training method, device, equipment and storage medium of neural network model
CN115730333A (en) Security tree model construction method and device based on secret sharing and homomorphic encryption
CN111523134A (en) Homomorphic encryption-based model training method, device and system
CN112183759B (en) Model training method, device and system
CN111523675B (en) Model training method, device and system
CN111737756B (en) XGB model prediction method, device and system performed through two data owners
CN114492850A (en) Model training method, device, medium, and program product based on federal learning
CN111738453B (en) Business model training method, device and system based on sample weighting
CN112183565B (en) Model training method, device and system
CN112183566B (en) Model training method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40044587

Country of ref document: HK

GR01 Patent grant