CN112183565B - Model training method, device and system - Google Patents

Model training method, device and system

Info

Publication number
CN112183565B
CN112183565B (application CN201910600908.9A)
Authority
CN
China
Prior art keywords
training
model
sub
training participant
participant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910600908.9A
Other languages
Chinese (zh)
Other versions
CN112183565A (en)
Inventor
陈超超 (Chen Chaochao)
李梁 (Li Liang)
王力 (Wang Li)
周俊 (Zhou Jun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd
Priority to CN201910600908.9A
Publication of CN112183565A
Application granted
Publication of CN112183565B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/085Secret sharing or secret splitting, e.g. threshold schemes

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides methods and apparatus for training a logistic regression model. In the method, the sub-model of each training participant is subjected to model conversion processing to obtain a corresponding conversion sub-model. The following loop process is then performed until a loop end condition is satisfied: vertical-horizontal segmentation conversion is performed on the feature sample set to obtain a conversion feature sample subset for each training participant; the matrix product between the converted logistic regression model and each conversion feature sample subset is calculated, and the current predicted value of each training participant is obtained based on that matrix product. The marker value at the first training participant is decomposed into a first number of partial marker values, and one partial marker value is sent to each second training participant. At each training participant, the respective prediction difference and model update amount are determined and the conversion sub-model is updated. When the loop end condition is satisfied, the sub-model of each training participant is determined based on the conversion sub-models of the individual training participants.

Description

Model training method, device and system
Technical Field
The present disclosure relates generally to the field of machine learning, and more particularly, to methods, apparatus, and systems for collaborative training of a logistic regression model via multiple training participants using a vertically segmented training set.
Background
The logistic regression model is a regression/classification model widely used in the field of machine learning. In many cases, multiple model training participants (e.g., an e-commerce company, a courier company, and a bank) each possess a different portion of the feature data used to train a logistic regression model. The model training participants typically want to use each other's data together to train the logistic regression model, but do not want to provide their respective data to the other model training participants, to prevent their own data from being leaked.
In view of this, machine learning methods capable of protecting data security have been proposed, which enable a plurality of model training participants to cooperatively train a logistic regression model for their common use while guaranteeing the security of each participant's data. However, existing machine learning methods capable of protecting data security are relatively inefficient at model training.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a method, apparatus, and system for collaborative training of a logistic regression model via multiple training participants, which can improve the efficiency of model training while ensuring the security of the respective data of the multiple training participants.
According to one aspect of the present disclosure, there is provided a method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a marker value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set for model training, the second number being equal to the first number minus one, the method being performed by the first training participant, the method comprising: performing model conversion processing on the sub-models of all training participants to obtain conversion sub-models of all training participants; performing the following loop process until a loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining a matrix product between the model-converted logistic regression model and the conversion feature sample subset at the first training participant using secret sharing matrix multiplication; decomposing the marker value into the first number of partial marker values, and sending each of the second number of partial marker values to the corresponding second training participant; determining a current predicted value at the first training participant based on the matrix product at the first training participant; determining a prediction difference between the current predicted value of the first training participant and the corresponding partial marker value; determining a model update amount at the first training participant based on the converted feature sample set and the prediction difference at the first training participant; and updating the conversion sub-model of the first training participant based on the current conversion sub-model of the first training participant and the corresponding model update amount, wherein when the loop process has not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop; and determining the sub-model of the first training participant based on the conversion sub-models of the individual training participants when the loop end condition is satisfied.
According to another aspect of the present disclosure, there is provided a method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a marker value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set for model training, the second number being equal to the first number minus one, the method being performed by the second training participant, the method comprising: performing model conversion processing on the sub-models of all training participants to obtain conversion sub-models of all training participants; performing the following loop process until a loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining a matrix product between the model-converted logistic regression model and the conversion feature sample subset at the second training participant using secret sharing matrix multiplication; receiving a corresponding partial marker value from the first training participant, the partial marker value being one of the first number of partial marker values resulting from decomposition of the marker value at the first training participant; determining a current predicted value at the second training participant based on the matrix product at the second training participant; determining a prediction difference at the second training participant using the current predicted value of the second training participant and the received partial marker value; obtaining a model update amount at the second training participant using secret sharing matrix multiplication, based on the converted feature sample set and the prediction difference of the second training participant; and updating the conversion sub-model of the second training participant based on the current conversion sub-model of the second training participant and the corresponding model update amount, wherein when the loop process has not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop; and determining the sub-model of the second training participant based on the conversion sub-models of the individual training participants when the loop end condition is satisfied.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a marker value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set for model training, the second number being equal to the first number minus one, the apparatus being located on the first training participant side, the apparatus comprising: a model conversion unit configured to perform model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; a sample conversion unit configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain the conversion feature sample subsets at the individual training participants; a matrix product acquisition unit configured to obtain the matrix product between the model-converted logistic regression model and the conversion feature sample subset at the first training participant using secret sharing matrix multiplication; a marker value decomposition unit configured to decompose the marker value into the first number of partial marker values; a marker value sending unit configured to send each of the second number of partial marker values to the corresponding second training participant; a predicted value determination unit configured to determine a current predicted value at the first training participant based on the matrix product at the first training participant; a prediction difference determination unit configured to determine a prediction difference between the current predicted value of the first training participant and the corresponding partial marker value; a model update amount determination unit configured to determine a model update amount at the first training participant based on the converted feature sample set and the prediction difference at the first training participant; a model updating unit configured to update the conversion sub-model of the first training participant based on the current conversion sub-model of the first training participant and the corresponding model update amount; and a model determination unit configured to determine the sub-model of the first training participant based on the conversion sub-models of the individual training participants when the loop end condition is satisfied, wherein the sample conversion unit, the matrix product acquisition unit, the marker value decomposition unit, the marker value sending unit, the predicted value determination unit, the prediction difference determination unit, the model update amount determination unit, and the model updating unit are configured to execute operations in a loop until the loop end condition is satisfied, and wherein, when the loop process has not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a marker value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set for model training, the second number being equal to the first number minus one, the apparatus being located on the second training participant side, the apparatus comprising: a model conversion unit configured to perform model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; a sample conversion unit configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain the conversion feature sample subsets at the individual training participants; a matrix product acquisition unit configured to obtain the matrix product between the model-converted logistic regression model and the conversion feature sample subset at the second training participant using secret sharing matrix multiplication; a marker value receiving unit configured to receive a corresponding partial marker value from the first training participant, the partial marker value being one of the first number of partial marker values resulting from decomposition of the marker value at the first training participant; a predicted value determination unit configured to determine a current predicted value at the second training participant based on the matrix product at the second training participant; a prediction difference determination unit configured to determine a prediction difference at the second training participant using the current predicted value of the second training participant and the received partial marker value; a model update amount determination unit configured to obtain a model update amount of the second training participant using secret sharing matrix multiplication, based on the converted feature sample set and the prediction difference of the second training participant; a model updating unit configured to update the conversion sub-model of the second training participant based on the current conversion sub-model of the second training participant and the corresponding model update amount; and a model determination unit configured to determine the sub-model of the second training participant based on the conversion sub-models of the individual training participants when the loop end condition is satisfied, wherein the sample conversion unit, the matrix product acquisition unit, the marker value receiving unit, the predicted value determination unit, the prediction difference determination unit, the model update amount determination unit, and the model updating unit are configured to execute operations in a loop until the loop end condition is satisfied, and wherein, when the loop process has not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop.
According to another aspect of the present disclosure, there is provided a system for co-training a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the system comprising: a first training participant device comprising means as described above on the first training participant side; and a second number of second training participant devices, each second training participant device comprising means as described above on the second training participant side, wherein each training participant has a sub-model, the first training participant has a first subset of feature samples and a marker value, each second training participant has a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing the set of feature samples for model training, the second number being equal to the first number minus one.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the training method performed on the first training participant side as described above.
According to another aspect of the disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause at least one processor to perform the training method performed on the first training participant side as described above.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the training method performed on the second training participant side as described above.
According to another aspect of the disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause at least one processor to perform the training method performed on the second training participant side as described above.
With the solution of the embodiments of the present disclosure, the model parameters of the logistic regression model can be trained without leaking the secret data of the plurality of training participants, and the workload of model training grows only linearly, rather than exponentially, with the number of feature samples used for training.
Drawings
A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.
FIG. 1 illustrates a schematic diagram of an example of vertically sliced data in accordance with an embodiment of the present disclosure;
FIG. 2 illustrates an architectural diagram showing a system for co-training a logistic regression model via a plurality of training participants, according to an embodiment of the present disclosure;
FIG. 3 illustrates a flowchart of a method for co-training a logistic regression model via a plurality of training participants, according to an embodiment of the disclosure;
FIG. 4 illustrates a flow chart of a model conversion process according to an embodiment of the present disclosure;
FIG. 5 illustrates a flow chart of a feature sample set conversion process according to an embodiment of the present disclosure;
FIG. 6 illustrates a flow chart of a process of performing trusted-initializer secret sharing matrix multiplication on the current sub-model of each training participant and the conversion feature sample subset of that training participant, in accordance with an embodiment of the present disclosure;
FIG. 7 illustrates a flow chart of a process of performing untrusted-initializer secret sharing matrix multiplication on the current sub-model of each training participant and the conversion feature sample subset of the training initiator, according to an embodiment of the present disclosure;
FIG. 8 illustrates a flow chart of one example of untrusted-initializer secret sharing matrix multiplication according to an embodiment of the present disclosure;
FIG. 9 illustrates a block diagram of an apparatus for co-training a logistic regression model via a plurality of training participants, according to an embodiment of the disclosure;
FIG. 10 illustrates a block diagram of an apparatus for co-training a logistic regression model via a plurality of training participants, according to an embodiment of the disclosure;
FIG. 11 illustrates a schematic diagram of a computing device for co-training a logistic regression model via a plurality of training participants, according to an embodiment of the disclosure;
FIG. 12 shows a schematic diagram of a computing device for co-training a logistic regression model via a plurality of training participants, according to an embodiment of the disclosure.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be appreciated that these embodiments are discussed only to enable a person skilled in the art to better understand and thereby practice the subject matter described herein, and are not limiting of the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, replace, or add various procedures or components as desired. For example, the described methods may be performed in a different order than described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may be combined in other examples as well.
As used herein, the term "comprising" and variations thereof are open-ended terms meaning "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. Unless the context clearly indicates otherwise, the definition of a term is consistent throughout this specification.
The secret sharing method is a cryptographic technique that decomposes a secret into multiple secret shares, each owned and managed by one of multiple parties; a single party cannot recover the complete secret, which can be recovered only when several parties cooperate. The secret sharing method aims to prevent the secret from being overly concentrated, so as to disperse risk and tolerate intrusion.
Secret sharing methods can be broadly divided into two categories: the trusted initializer secret sharing method and the untrusted initializer secret sharing method. In the trusted initializer secret sharing method, a trusted initializer is required to perform parameter initialization (often generating random numbers satisfying certain conditions) for each party participating in the multiparty secure computation. After the initialization is completed, the trusted initializer destroys the generated data and exits; its data is not needed in the subsequent multiparty secure computation.
Trusted-initializer secret sharing matrix multiplication is applicable in the following situation: the complete secret data is the product of the sum of a first set of secret shares and the sum of a second set of secret shares, and each party has one share from the first set and one share from the second set. Through trusted-initializer secret sharing matrix multiplication, each of the parties obtains a portion of the complete secret data such that the sum of the portions obtained by all parties equals the complete secret data; each party discloses its portion to the other parties, so that every party can obtain the complete secret data without disclosing the secret shares it owns, which ensures the security of each party's data.
Untrusted-initializer secret sharing matrix multiplication is another secret sharing method. It is applicable when the complete secret is the product of a first secret share and a second secret share held by two parties, respectively. Through untrusted-initializer secret sharing matrix multiplication, each of the two parties generates and discloses data that differs from the secret share it owns, but the sum of the data disclosed by the two parties equals the product of the secret shares they own (i.e., the complete secret). The parties can thus recover the complete secret by cooperating, without a trusted initializer and without disclosing the secret shares they own, which ensures the security of both parties' data.
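By way of illustration (this example is not part of the patent), additive secret sharing can be sketched in a few lines of Python; floating-point arithmetic is used for readability, whereas real protocols share values over a finite ring:

```python
# A minimal sketch of additive secret sharing: decompose a secret matrix
# into three shares so that no single share reveals the secret, while the
# sum of all shares recovers it exactly.
import numpy as np

rng = np.random.default_rng(42)

def share(secret: np.ndarray, n_parties: int = 3) -> list:
    """Decompose `secret` into n_parties additive shares."""
    shares = [rng.normal(size=secret.shape) for _ in range(n_parties - 1)]
    shares.append(secret - sum(shares))  # last share makes the sum exact
    return shares

secret = np.arange(6.0).reshape(2, 3)
s1, s2, s3 = share(secret)
assert np.allclose(s1 + s2 + s3, secret)  # cooperation recovers the secret
```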
In the present disclosure, the training sample set used in the logistic regression model training scheme is a vertically segmented training sample set. The term "vertically slicing a training sample set" refers to slicing the training sample set into a plurality of training sample subsets according to a module/function (or some specified rule), each training sample subset containing a portion of the training subsamples of each training sample in the training sample set, such that the training subsamples across all subsets together make up the complete training sample. In one example, assume that a training sample includes a label y_0 and attributes x_1, ..., x_d; then after vertical segmentation, the training participant Alice owns y_0 and a first part of the attributes (e.g., x_1, ..., x_k), and the training participant Bob owns the remaining attributes (x_{k+1}, ..., x_d). In another example, the attributes are divided into more groups, with Alice owning y_0 and one group of the attributes and Bob owning the remaining groups. Besides these two examples, other partitions are possible and are not listed here.
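For concreteness, a minimal Python sketch of vertical segmentation follows (the sample count and attribute split are arbitrary illustrative choices, not taken from the patent):

```python
# Vertical segmentation splits a training sample set by attribute (column),
# not by sample (row); the label stays with one participant.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))      # 100 training samples, 6 attributes each
y = rng.integers(0, 2, size=100)   # marker values (labels)

X_alice, X_bob = X[:, :3], X[:, 3:]  # Alice holds attributes 0-2, Bob 3-5
# Alice also keeps the marker values y; Bob holds feature data only.
assert np.array_equal(np.hstack([X_alice, X_bob]), X)
```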
Assume that a sample x of attribute values described by d attributes (also called features) is given, x^T = (x_1; x_2; ...; x_d), where x_i is the value of x on the i-th attribute and T denotes the transpose. The logistic regression model is then Y = 1/(1 + e^(-WX)), where Y is the predicted value and W is the model parameter of the logistic regression model (i.e., the model described in this disclosure), composed of the sub-models W_P at the individual training participants P. In this disclosure, attribute value samples are also referred to as feature data samples.
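A minimal plaintext sketch of this prediction formula (illustrative values only; the protocol described below computes the same quantity under secret sharing):

```python
# Logistic regression prediction Y = 1 / (1 + e^(-W·x)).
import numpy as np

def predict(W: np.ndarray, x: np.ndarray) -> float:
    """W: model parameter vector (d,); x: attribute value sample (d,)."""
    return 1.0 / (1.0 + np.exp(-(W @ x)))

W = np.array([0.5, -1.2, 0.3])
x = np.array([1.0, 0.0, 2.0])
print(predict(W, x))  # predicted value Y in (0, 1)
```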
In this disclosure, each training participant owns a different portion of the data of the training samples used to train the logistic regression model. For example, assume a training sample set includes 100 training samples, each containing a plurality of feature values and a labeled actual value; the data possessed by a first participant may then be some of the feature values and the labeled actual value of each of the 100 training samples, and the data possessed by a second participant may be the other feature values (e.g., the remaining feature values) of each of the 100 training samples.
For any matrix multiplication described in this disclosure, whether one or more of the matrices involved need to be transposed is determined as the case may be, so that the matrix multiplication rule is satisfied and the computation can be completed.
Embodiments of methods, apparatuses, and systems for collaborative training of logistic regression models via multiple training participants according to the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of an example of a vertically sliced training sample set in accordance with an embodiment of the present disclosure. In fig. 1, two data parties Alice and Bob are shown; there may likewise be more data parties. For each training sample, the partial training subsamples owned by the data parties Alice and Bob combine to form the complete content of that training sample. For example, assume that the content of a certain training sample includes a label (hereinafter referred to as the "marker value") y_0 and attribute features (hereinafter referred to as the "feature sample") x; then after vertical segmentation, the training participant Alice owns y_0 and a part of the feature sample x, and the training participant Bob owns the remaining part of the feature sample x.
Fig. 2 shows a schematic architecture diagram illustrating a system 1 (hereinafter referred to as model training system 1) for co-training a logistic regression model via a plurality of training participants according to an embodiment of the present disclosure.
As shown in fig. 2, the model training system 1 includes a training initiator device 10 and at least one training partner device 20. In fig. 2, 2 training partner devices 20 are shown. In other embodiments of the present disclosure, one training partner device 20 may be included or more than 2 training partner devices 20 may be included. The training initiator device 10 and the at least one training partner device 20 may communicate with each other via a network 30 such as, but not limited to, the internet or a local area network. In this disclosure, the training initiator device 10 and the at least one training partner device 20 are collectively referred to as training participant devices.
In the present disclosure, the trained logistic regression model is decomposed into a first number of sub-models. Here, the first number is equal to the number of training participant devices participating in the model training; assuming that the number of training participant devices is N, the logistic regression model is decomposed into N sub-models, one for each training participant device. The training sample set used for model training is vertically partitioned as described above; the training initiator device 10 holds a feature data subset and the corresponding marker values, i.e., x_0 and y_0 shown in fig. 1. The sub-model owned by each training participant, together with its training sample subset, is a secret of that training participant and cannot be learned, or at least not completely learned, by the other training participants.
In the present disclosure, the training initiator device 10 and the at least one training partner device 20 cooperate to train the logistic regression model using a training sample set at the training initiator device 10 and the respective sub-models. The specific training process for the model will be described in detail below with reference to fig. 3 to 8.
In this disclosure, training initiator device 10 and training partner device 20 may be any suitable computing devices having computing capabilities. The computing device includes, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, personal Digital Assistants (PDAs), handsets, messaging devices, wearable computing devices, consumer electronic devices, and the like.
Fig. 3 illustrates a flowchart of a method for co-training a logistic regression model via a plurality of training participants according to an embodiment of the present disclosure. In fig. 3, a first training participant Alice and two second training participants Bob and Charlie are used as an example. The first training participant Alice has the sub-model W_A of the logistic regression model, the second training participant Bob has the sub-model W_B of the logistic regression model, and the second training participant Charlie has the sub-model W_C of the logistic regression model. The first training participant Alice has a first feature sample subset X_A and the marker value Y, the second training participant Bob has a second feature sample subset X_B, and the second training participant Charlie has a third feature sample subset X_C. The first feature sample subset X_A, the second feature sample subset X_B, and the third feature sample subset X_C are obtained by vertically slicing the feature sample set X used for model training. The sub-models W_A, W_B and W_C together constitute the logistic regression model W.
As shown in fig. 3, first, at block 301, the first training participant Alice and the second training participants Bob and Charlie initialize the parameters of their sub-models, i.e., the weight sub-models W_A, W_B and W_C, to obtain initial values, and initialize the number of executed training cycles t to zero. Here, it is assumed that the loop end condition is that a predetermined number of training cycles has been performed, for example T training cycles.

After the initialization above, at block 302, Alice, Bob and Charlie perform model conversion processing on their respective initial sub-models W_A, W_B and W_C to obtain the conversion sub-models W_A', W_B' and W_C'.
FIG. 4 illustrates a flowchart of one example of a model conversion process according to an embodiment of the present disclosure.
As shown in fig. 4, at block 410, at Alice, the sub-model W_A owned by Alice is decomposed into W_A1, W_A2 and W_A3. In the decomposition, the attribute value of each element of the sub-model W_A is decomposed into three partial attribute values, from which three new elements are obtained; the three new elements are then assigned to W_A1, W_A2 and W_A3, respectively, thereby obtaining W_A1, W_A2 and W_A3. Likewise, at Bob, the sub-model W_B owned by Bob is decomposed into W_B1, W_B2 and W_B3, and at Charlie, the sub-model W_C owned by Charlie is decomposed into W_C1, W_C2 and W_C3.

Then, at block 420, Alice sends W_A2 to Bob and W_A3 to Charlie. At block 430, Bob sends W_B1 to Alice and W_B3 to Charlie. At block 440, Charlie sends W_C1 to Alice and W_C2 to Bob.

Next, at block 450, at Alice, W_A1, W_B1 and W_C1 are spliced to obtain the conversion sub-model W_A'. The resulting conversion sub-model W_A' has the same dimension as the feature sample set used for model training. At Bob, W_A2, W_B2 and W_C2 are spliced to obtain the conversion sub-model W_B'. At Charlie, W_A3, W_B3 and W_C3 are spliced to obtain the conversion sub-model W_C'. Likewise, the conversion sub-models W_B' and W_C' have the same dimension as the feature sample set used for model training.
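Assuming, consistently with the description above, that "decompose" denotes additive sharing and "splice" denotes concatenation, the model conversion of FIG. 4 can be sketched as follows (the feature sample conversion of FIG. 5 follows the same pattern); the dimensions are arbitrary illustrative choices:

```python
# Sketch of FIG. 4: each party additively shares its sub-model, distributes
# two shares, and splices the shares it holds into a conversion sub-model
# of full model dimension. The conversion sub-models sum to the full model.
import numpy as np

rng = np.random.default_rng(7)

def decompose(v: np.ndarray, n: int = 3) -> list:
    parts = [rng.normal(size=v.shape) for _ in range(n - 1)]
    parts.append(v - sum(parts))
    return parts

W_A = rng.normal(size=2)  # Alice's sub-model (2 features)
W_B = rng.normal(size=3)  # Bob's sub-model (3 features)
W_C = rng.normal(size=1)  # Charlie's sub-model (1 feature)

W_A1, W_A2, W_A3 = decompose(W_A)  # block 410 (and its analogues)
W_B1, W_B2, W_B3 = decompose(W_B)
W_C1, W_C2, W_C3 = decompose(W_C)

# Blocks 420-450: after the exchange, each party splices what it holds.
W_Ap = np.concatenate([W_A1, W_B1, W_C1])  # Alice's W_A'
W_Bp = np.concatenate([W_A2, W_B2, W_C2])  # Bob's W_B'
W_Cp = np.concatenate([W_A3, W_B3, W_C3])  # Charlie's W_C'

# Each conversion sub-model has full model dimension, and their sum is
# the full model W = [W_A, W_B, W_C].
W = np.concatenate([W_A, W_B, W_C])
assert np.allclose(W_Ap + W_Bp + W_Cp, W)
```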
Returning to fig. 3, after the model conversion is completed as above, the operations of blocks 303 to 311 are performed in a loop until the loop end condition is satisfied.
Specifically, at block 303, the first training participant Alice and the second training participants Bob and Charlie cooperate to perform vertical-horizontal segmentation conversion on the first feature sample subset X_A, the second feature sample subset X_B and the third feature sample subset X_C (i.e., the feature sample set), obtaining a first conversion feature sample subset X_A', a second conversion feature sample subset X_B' and a third conversion feature sample subset X_C'. Every feature sample in the resulting conversion feature sample subsets X_A', X_B' and X_C' has the complete feature content of its training sample, i.e., each subset is similar to a feature sample subset obtained by horizontally slicing the feature sample set.
Fig. 5 shows a flowchart of a feature sample set conversion process according to an embodiment of the present disclosure.
As shown in fig. 5, at block 510, at Alice, the first feature sample subset X_A is decomposed into X_A1, X_A2 and X_A3. At Bob, the second feature sample subset X_B is decomposed into X_B1, X_B2 and X_B3. At Charlie, the third feature sample subset X_C is decomposed into X_C1, X_C2 and X_C3. The decomposition of the feature sample subsets X_A, X_B and X_C is exactly the same as the decomposition of the sub-model W_A described above. Then, at block 520, Alice sends X_A2 to Bob and X_A3 to Charlie. At block 530, Bob sends X_B1 to Alice and X_B3 to Charlie. At block 540, Charlie sends X_C1 to Alice and X_C2 to Bob.

Next, at block 550, at Alice, X_A1, X_B1 and X_C1 are spliced to obtain the first conversion feature sample subset X_A'. At Bob, X_A2, X_B2 and X_C2 are spliced to obtain the second conversion feature sample subset X_B'. At Charlie, X_A3, X_B3 and X_C3 are spliced to obtain the third conversion feature sample subset X_C'. The resulting conversion feature sample subsets X_A', X_B' and X_C' have the same dimensions as the training sample set.
After the vertical-horizontal segmentation conversion is performed on the first feature sample subset X_A, the second feature sample subset X_B and the third feature sample subset X_C as above, at block 304, secret sharing matrix multiplication is performed on the model-converted logistic regression model (i.e., W = W_A' + W_B' + W_C') and the conversion feature sample subsets X_A', X_B' and X_C' of the individual training participants to obtain the corresponding matrix products, i.e., the matrix product W·X_A' at Alice, the matrix product W·X_B' at Bob, and the matrix product W·X_C' at Charlie.
In one example of the present disclosure, using secret sharing matrix multiplication to obtain the matrix products between the model-converted logistic regression model W and the conversion feature sample subsets X_A', X_B' and X_C' of the individual training participants may include: obtaining the matrix products using trusted-initializer secret sharing matrix multiplication. How a matrix product is obtained using trusted-initializer secret sharing matrix multiplication is described below with reference to fig. 6.

In another example of the present disclosure, using secret sharing matrix multiplication to obtain the matrix products between the model-converted logistic regression model W and the conversion feature sample subsets X_A', X_B' and X_C' of the individual training participants may include: obtaining the matrix products using untrusted-initializer secret sharing matrix multiplication. How a matrix product is obtained using untrusted-initializer secret sharing matrix multiplication is described below with reference to figs. 7-8.
After the matrix product of each training participant is obtained as above, at block 305, the marker value Y is decomposed at the first training participant Alice to obtain three partial marker values Y_A, Y_B and Y_C. The decomposition process for the marker value Y is the same as the decomposition process for the feature sample set X above and is not repeated here.

Next, at block 306, the first training participant Alice sends the partial marker value Y_B to the second training participant Bob and the partial marker value Y_C to the second training participant Charlie, while Alice retains the partial marker value Y_A as its own partial marker value.
Then, at block 307, at each training participant, the current predicted value at that training participant is determined based on the matrix product of that training participant. For example, the current predicted value at each training participant may be obtained using the formula Ŷ_i = 1/(1 + e^(-W·X_i')), where Ŷ_i is the predicted value at training participant i, W = W_A' + W_B' + W_C' is the logistic regression model, and X_i' is the conversion feature sample subset at that training participant.

In addition, the formula Ŷ = 1/(1 + e^(-W·X)) can be expanded with the Taylor formula, that is, 1/(1 + e^(-W·X)) ≈ 1/2 + (W·X)/4 - (W·X)^3/48 + ..., so that, based on the Taylor expansion, the matrix product W·X_i' of each training participant can be used to calculate that participant's current predicted value. The number of terms to which the Taylor expansion is carried can be determined based on the accuracy required by the application scenario.
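The truncation below is a three-term illustration of this expansion (the patent leaves the number of terms to the required accuracy):

```python
# Maclaurin expansion of the sigmoid: 1/(1+e^-z) ≈ 1/2 + z/4 - z^3/48 + ...
# Useful under secret sharing because it replaces the exponential with
# additions and multiplications.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_taylor(z):
    return 0.5 + z / 4.0 - z**3 / 48.0

z = np.linspace(-1.0, 1.0, 5)  # stands in for the matrix product W·X_i'
print(sigmoid(z))
print(sigmoid_taylor(z))       # close to the exact values for small |z|
```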
At block 308, at each training participant, the prediction difference at that training participant is determined based on its current predicted value and its partial marker value, i.e., the prediction difference at Alice is e_A = Ŷ_A - Y_A, the prediction difference at Bob is e_B = Ŷ_B - Y_B, and the prediction difference at Charlie is e_C = Ŷ_C - Y_C. Here, e is a column vector, Y is a column vector representing the marker values of the training samples X, and Ŷ is a column vector representing the current predicted values of the training samples X. If X contains only a single training sample, then e, Y and Ŷ are column vectors having only a single element. If X contains multiple training samples, then e, Y and Ŷ are column vectors with multiple elements, where each element in Ŷ is the current predicted value of the corresponding training sample, each element in Y is the marker value of the corresponding training sample, and each element in e is the difference between the two for the corresponding training sample. Note that e_A, e_B and e_C are collectively referred to as e, and Y_A, Y_B and Y_C are collectively referred to as Y.
Then, at block 309, the model update amounts TMP_A, TMP_B and TMP_C of the individual training participants are determined based on the converted feature sample set X (X = X_A' + X_B' + X_C') and the prediction differences e_A, e_B and e_C of the individual training participants. Specifically, the model update amount at Alice is TMP_A = X·e_A, the model update amount at Bob is TMP_B = X·e_B, and the model update amount at Charlie is TMP_C = X·e_C. Here, the model update amounts TMP_B at Bob and TMP_C at Charlie are obtained using secret sharing matrix multiplication.
Next, at block 310, at each training participant, the conversion sub-model at that training participant is updated based on its current conversion sub-model and the corresponding model update amount. For example, the training initiator Alice uses the current conversion sub-model W_A' and the corresponding model update amount TMP_A to update the conversion sub-model at Alice, the training partner Bob uses the current conversion sub-model W_B' and the corresponding model update amount TMP_B to update the conversion sub-model at Bob, and the training partner Charlie uses the current conversion sub-model W_C' and the corresponding model update amount TMP_C to update the conversion sub-model at Charlie.
In one example of the present disclosure, updating the conversion sub-model at a training participant based on its current conversion sub-model and the corresponding model update amount may be performed according to the equation W_{n+1} = W_n - α·TMP_i = W_n - α·X·e_i, where W_{n+1} is the updated conversion sub-model at the training participant, W_n is the current conversion sub-model at the training participant, α is the learning rate, X is the feature sample set, and e_i is the prediction difference at the training participant. When the training participant is the first training participant Alice, the updated conversion sub-model can be calculated locally at Alice. When the training participant is a second training participant, X·e_i is obtained at that second training participant using secret sharing matrix multiplication, which may be performed using a procedure similar to that shown in fig. 6 or figs. 7-8, except that X takes the place of W and e_i takes the place of X. Note that when X is a single feature sample, X is a feature vector (a column or row vector) composed of a plurality of attributes and e_i is a single prediction difference; when X comprises multiple feature samples, X is a feature matrix whose columns/rows are the attributes of the individual feature samples, and e_i is a prediction difference vector. In the calculation of X·e_i, the elements of X multiplied by e_i are the values of a given feature across the samples: for example, if e_i is a column vector, each multiplication multiplies e_i by one row of the matrix X, the elements of which represent the values of a certain feature for the respective samples.
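A plaintext sketch of this update rule (dimensions and learning rate are arbitrary illustrative choices; under secret sharing, X·e_i would be obtained as just described):

```python
# W_{n+1} = W_n - alpha * X * e_i
import numpy as np

rng = np.random.default_rng(1)
d, m = 4, 10
W_n = rng.normal(size=d)          # current conversion sub-model
X = rng.normal(size=(d, m))       # feature sample set, one column per sample
e_i = rng.normal(size=m)          # prediction difference at this participant
alpha = 0.01                      # learning rate

W_next = W_n - alpha * (X @ e_i)  # updated conversion sub-model
```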
After the conversion sub-model updates are completed at the individual training participants as described above, it is determined at block 311 whether the predetermined number of cycles has been reached, i.e., whether the loop end condition is satisfied. If the predetermined number of cycles has been reached, flow proceeds to block 312. If not, flow returns to block 303 to perform the next training cycle, in which the updated conversion sub-model obtained by each training participant in the current cycle is used as the current conversion sub-model of the next cycle.
At block 312, the sub-models at Alice, Bob and Charlie (i.e., the trained sub-models) are determined based on the updated conversion sub-models of Alice, Bob and Charlie.
Specifically, after W_A', W_B' and W_C' are trained as above, Alice sends W_A'[|A|:|B|] to Bob and W_A'[|B|:] to Charlie; Bob sends W_B'[0:|A|] to Alice and W_B'[|B|:] to Charlie; and Charlie sends W_C'[0:|A|] to Alice and W_C'[|A|:|B|] to Bob. Here, [|B|:] refers to the vector components of a matrix after dimension |B|, [0:|A|] refers to the vector components before dimension |A| (i.e., the components from 0 to |A|), and [|A|:|B|] refers to the vector components after dimension |A| and before dimension |B|. For example, assume W = [0,1,2,3,4,5,6] with |A| = 2 and |B| = 4; then W[0:|A|] = [0,1], W[|A|:|B|] = [2,3], and W[|B|:] = [4,5,6].
Next, at Alice, W_A = W_A'[0:|A|] + W_B'[0:|A|] + W_C'[0:|A|] is calculated; at Bob, W_B = W_A'[|A|:|B|] + W_B'[|A|:|B|] + W_C'[|A|:|B|] is calculated; and at Charlie, W_C = W_A'[|B|:] + W_B'[|B|:] + W_C'[|B|:] is calculated, thereby obtaining the trained sub-models W_A, W_B and W_C at Alice, Bob and Charlie.
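A short sketch of this recombination, reusing the boundaries |A| = 2 and |B| = 4 from the example above, with conversion sub-models constructed to sum to W:

```python
# Slice the three conversion sub-models at the feature boundaries and sum
# the slices; the recombined pieces partition the full model W.
import numpy as np

d_A, d_B = 2, 4  # boundary indices |A| and |B|

def recombine(W_Ap, W_Bp, W_Cp):
    W_A = W_Ap[0:d_A] + W_Bp[0:d_A] + W_Cp[0:d_A]        # at Alice
    W_B = W_Ap[d_A:d_B] + W_Bp[d_A:d_B] + W_Cp[d_A:d_B]  # at Bob
    W_C = W_Ap[d_B:] + W_Bp[d_B:] + W_Cp[d_B:]           # at Charlie
    return W_A, W_B, W_C

rng = np.random.default_rng(3)
W = np.array([0., 1., 2., 3., 4., 5., 6.])
W_Ap, W_Bp = rng.normal(size=7), rng.normal(size=7)
W_Cp = W - W_Ap - W_Bp  # the three conversion sub-models sum to W

W_A, W_B, W_C = recombine(W_Ap, W_Bp, W_Cp)
assert np.allclose(np.concatenate([W_A, W_B, W_C]), W)
```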
Here, it is to be noted that in the above example, the end condition of the training loop process is that the predetermined number of cycles is reached. In other examples of the present disclosure, the end condition may instead be that the determined prediction differences lie within a predetermined range, e.g., that each element e_i of the prediction differences e_A, e_B and e_C is less than a predetermined threshold, or that the mean of the prediction differences is less than a predetermined threshold. In that case, the operations of block 311 in fig. 3 are performed after the operations of block 308.
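Such a test can be sketched in one line (the threshold is an arbitrary illustrative choice):

```python
# Alternative loop-end test on the prediction difference e.
import numpy as np

def converged(e: np.ndarray, eps: float = 1e-3) -> bool:
    return bool(np.all(np.abs(e) < eps)) or float(np.abs(e).mean()) < eps
```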
FIG. 6 illustrates a flowchart of one example of a secret sharing matrix multiplication process with a trusted initializer, using the calculation of the matrix product W·X_A' as an example. In the case of trusted-initializer secret sharing matrix multiplication, the model training system 1 shown in fig. 2 further comprises a trusted initializer device 30.
As shown in fig. 6, first, at the trusted initializer 30, a first number of random weight vectors, a first number of random feature matrices, and a first number of random marker value vectors are generated such that the product of the sum of the first number of random weight vectors and the sum of the first number of random feature matrices is equal to the sum of the first number of random marker value vectors. Here, the first number is equal to the number of training participants.
For example, as shown in fig. 6, the trusted initializer generates three random weight vectors W_{R,1}, W_{R,2} and W_{R,3}, three random feature matrices X_{R,1}, X_{R,2} and X_{R,3}, and three random marker value vectors Y_{R,1}, Y_{R,2} and Y_{R,3}, where (W_{R,1} + W_{R,2} + W_{R,3})·(X_{R,1} + X_{R,2} + X_{R,3}) = Y_{R,1} + Y_{R,2} + Y_{R,3}. Here, the dimensions of the random weight vectors are the same as those of the conversion sub-models of the model training participants, the dimensions of the random feature matrices are the same as those of the conversion feature sample subsets, and the dimensions of the random marker value vectors are the same as those of the marker value vector.
Then, at block 601, the generated W_{R,1}, X_{R,1} and Y_{R,1} are sent to the first training participant Alice; at block 602, the generated W_{R,2}, X_{R,2} and Y_{R,2} are sent to the second training participant Bob; and at block 603, the generated W_{R,3}, X_{R,3} and Y_{R,3} are sent to the second training participant Charlie.
Next, at block 604, at Alice, the feature sample subset X_A' (hereinafter referred to as the feature matrix X_A') is decomposed into the first number of feature sub-matrices, e.g., into the three feature sub-matrices X_A1', X_A2' and X_A3' shown in fig. 6. Alice then sends each of the second number (equal to the first number minus one) of the decomposed feature sub-matrices to the corresponding second training participant: at blocks 605 and 606, the two feature sub-matrices X_A2' and X_A3' are sent to Bob and Charlie, respectively.
Then, at each training participant, a weight sub-vector difference E_i and a feature sub-matrix difference D_i are determined based on that participant's weight sub-vector (i.e., the conversion sub-model W_A', W_B' or W_C'), the corresponding feature sub-matrix X_A1', X_A2' or X_A3', and the received random weight vector and random feature matrix. For example, at block 607, at Alice, the weight sub-vector difference E1 = W_A' - W_{R,1} and the feature sub-matrix difference D1 = X_A1' - X_{R,1} are determined. At block 608, at Bob, E2 = W_B' - W_{R,2} and D2 = X_A2' - X_{R,2} are determined. At block 609, at Charlie, E3 = W_C' - W_{R,3} and D3 = X_A3' - X_{R,3} are determined.
After determining its weight sub-vector difference E_i and feature sub-matrix difference D_i, each training participant discloses them to the remaining training participants. For example, at blocks 610 and 611, Alice sends D1 and E1 to Bob and Charlie, respectively; at blocks 612 and 613, Bob sends D2 and E2 to Alice and Charlie, respectively; and at blocks 614 and 615, Charlie sends D3 and E3 to Alice and Bob, respectively.
Then, at block 616, the weight sub-vector differences and feature sub-matrix differences of the individual training participants are summed at each training participant to obtain the weight sub-vector total difference E and the feature sub-matrix total difference D, respectively: D = D1 + D2 + D3 and E = E1 + E2 + E3.
Then, at each training participant, the corresponding predicted value vector Z_i is calculated based on the received random weight vector W_{R,i}, random feature matrix X_{R,i} and random marker value vector Y_{R,i}, together with the weight sub-vector total difference E and the feature sub-matrix total difference D.
In one example of the present disclosure, at each training participant, the product of the training participant's random marker value vector, the total difference of the weight sub-vectors, and the training participant's random feature matrix, and the product of the total difference of the feature sub-matrices, and the training participant's random weight vector, may be summed to obtain a corresponding predictor vector (first calculation). Alternatively, the random flag value vector of the training participant, the product of the total difference of the weight sub-vectors and the random feature matrix of the training participant, the product of the total difference of the feature sub-matrices and the random weight vector of the training participant, and the product of the total difference of the weight sub-vectors and the total difference of the feature sub-matrices may be summed to obtain the corresponding prediction value matrix (second calculation mode).
It is to be noted here that, among the predicted value vectors calculated at the respective training participants, only one contains the product of the weight sub-vector total difference and the feature sub-matrix total difference. In other words, only one training participant calculates its predicted value vector according to the second calculation mode, while the remaining training participants calculate their corresponding predicted value vectors according to the first calculation mode.
For example, at block 617, at Alice, the corresponding predicted value vector Z1 = Y_R,1 + E*X_R,1 + D*W_R,1 + D*E is calculated. At block 618, at Bob, the corresponding predicted value vector Z2 = Y_R,2 + E*X_R,2 + D*W_R,2 is calculated. At block 619, at Charlie, the corresponding predicted value vector Z3 = Y_R,3 + E*X_R,3 + D*W_R,3 is calculated.
Here, FIG. 6 shows that Z1 calculated at Alice contains D*E. In other examples of the present disclosure, D*E may instead be included in the Z_i calculated by Bob or by Charlie, and accordingly D*E is then not included in Z1 calculated at Alice. In other words, exactly one of the Z_i calculated at the respective training participants contains D*E.
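Why exactly one share must carry D*E can be seen by summing the shares. The following derivation is a sketch in the notation above; it assumes the trusted initializer generates the Y_R,i as additive shares of X_R·W_R (the Beaver-triple condition implied by the reconstruction), and writes each product in the order that makes the dimensions conform:

$$
\begin{aligned}
\sum_i Z_i &= \sum_i Y_{R,i} + X_R E + D\,W_R + D E \\
&= X_R W_R + X_R (W' - W_R) + (X - X_R)\,W_R + (X - X_R)(W' - W_R) \\
&= X\,W',
\end{aligned}
$$

where $W' = \sum_i W_i'$ is the converted model, $X = \sum_i X_{Ai}' = X_A'$, $W_R = \sum_i W_{R,i}$, $X_R = \sum_i X_{R,i}$, $E = W' - W_R$ and $D = X - X_R$. Since D and E are public after blocks 610-616, the single D*E term may be assigned to any one participant, but including it more than once would break the identity.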
Each training participant then discloses its calculated predicted value vector to the remaining training participants. For example, at blocks 620 and 621, Alice sends the predicted value vector Z1 to Bob and Charlie, respectively. At blocks 622 and 623, Bob sends the predicted value vector Z2 to Alice and Charlie, respectively. At blocks 624 and 625, Charlie sends the predicted value vector Z3 to Alice and Bob, respectively.
Then, at blocks 626, 627 and 628, each training participant sums the predicted value vectors of all training participants, Z = Z1 + Z2 + Z3, to obtain the corresponding matrix product result.
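Putting blocks 601 through 628 together, the following runnable numpy sketch reproduces the whole exchange for vector-valued sub-models and checks that the reconstructed Z equals the product of the converted model with the feature set. The shapes, the triple-generation rule for the Y_R,i, and the conformable operand order are the assumptions flagged above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, P = 5, 4, 3                          # samples, features, participants

# Private inputs: participant i holds a model share W_i (d,) and a feature
# share X_i (n, d), with W = sum(W_i) and X = sum(X_i) (blocks 604-606).
W_shares = [rng.standard_normal(d) for _ in range(P)]
X_shares = [rng.standard_normal((n, d)) for _ in range(P)]
W, X = sum(W_shares), sum(X_shares)

# Trusted initializer (blocks 601-603): random W_R,i and X_R,i, plus Y_R,i
# that additively share Y_R = X_R @ W_R (assumed triple condition).
W_R = [rng.standard_normal(d) for _ in range(P)]
X_R = [rng.standard_normal((n, d)) for _ in range(P)]
Y_R = [rng.standard_normal(n) for _ in range(P - 1)]
Y_R.append(sum(X_R) @ sum(W_R) - sum(Y_R))

# Blocks 607-616: local differences, disclosed and summed by everyone.
E = sum(W_shares[i] - W_R[i] for i in range(P))    # E = W - sum(W_R,i)
D = sum(X_shares[i] - X_R[i] for i in range(P))    # D = X - sum(X_R,i)

# Blocks 617-619: only participant 0 adds the D @ E term.
Z = [Y_R[i] + X_R[i] @ E + D @ W_R[i] for i in range(P)]
Z[0] = Z[0] + D @ E

# Blocks 620-628: disclose the shares and sum them.
assert np.allclose(sum(Z), X @ W)   # the matrix product, inputs never revealed
```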
FIG. 7 illustrates a flow chart of a process for obtaining the matrix product at each training participant using untrusted initializer secret sharing matrix multiplication, based on the current sub-model of each training participant and the conversion feature sample subset of each training participant, according to an embodiment of the present disclosure. The following description takes the calculation of the matrix product at Alice as an example. The matrix product calculation process at the training participants Bob and Charlie is similar to that at Alice.
As shown in FIG. 7, first, at block 710, at Alice, the matrix product Y_A1 = W_A' * X_A' of the first weight sub-matrix (i.e., the conversion sub-model) W_A' at Alice and the first feature matrix (the first conversion feature sample subset) X_A' is calculated.
Next, at block 720, the matrix products of the first weight sub-matrices of the respective second training participants (e.g., W_B' of Bob and W_C' of Charlie) with the first feature matrix X_A' (Y_A2 = W_B' * X_A' and Y_A3 = W_C' * X_A') are calculated using untrusted initializer secret sharing matrix multiplication. How the matrix product is calculated using untrusted initializer secret sharing matrix multiplication will be described in detail below with reference to FIG. 8.
Then, at Alice, the resulting matrix products (e.g., Y_A1, Y_A2 and Y_A3) are summed to obtain the matrix product Y_A = Y_A1 + Y_A2 + Y_A3 at Alice.
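In outline, Alice's side of FIG. 7 is one local product plus one two-party run per second training participant. In the sketch below the pairwise protocol of FIG. 8 is replaced by a plaintext stand-in so the snippet stays self-contained; only the aggregation logic is the point here, and all names are illustrative:

```python
import numpy as np

def pairwise_secret_product(W_peer, X_A):
    # Stand-in for the FIG. 8 two-party protocol sketched further below;
    # in the real protocol only the product is disclosed, not the inputs.
    return W_peer @ X_A

def matrix_product_at_alice(W_A, X_A, peer_models):
    Y = W_A @ X_A                                   # block 710: local W_A' * X_A'
    for W_peer in peer_models:                      # block 720: one run per peer
        Y = Y + pairwise_secret_product(W_peer, X_A)
    return Y                                        # Y_A = Y_A1 + Y_A2 + Y_A3

rng = np.random.default_rng(2)
I, J, K = 1, 4, 6
W_A, W_B, W_C = (rng.standard_normal((I, J)) for _ in range(3))
X_A = rng.standard_normal((J, K))
assert np.allclose(matrix_product_at_alice(W_A, X_A, [W_B, W_C]),
                   (W_A + W_B + W_C) @ X_A)
```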
FIG. 8 shows a flowchart of one example of an untrusted initializer secret sharing matrix multiplication process according to an embodiment of the present disclosure. In FIG. 8, the calculation of the matrix product Y_A2 = W_B' * X_A' between the training participants Alice and Bob is taken as an example.
As shown in FIG. 8, first, at block 801, if the number of rows of X_A' at Alice (hereinafter referred to as the first feature matrix) is not even, and/or the number of columns of the current sub-model parameter W_B' at Bob (hereinafter referred to as the first weight sub-matrix) is not even, dimension-padding processing is performed on the first feature matrix X_A' and/or the first weight sub-matrix W_B', so that the number of rows of the first feature matrix X_A' is even and/or the number of columns of the first weight sub-matrix W_B' is even. For example, a row of 0 values is appended at the end of the first feature matrix X_A' and/or a column of 0 values is appended at the end of the first weight sub-matrix W_B' to perform the dimension-padding processing. In the following description, it is assumed that the first weight sub-matrix W_B' has dimension I×J and the first feature matrix X_A' has dimension J×K, where J is an even number.
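A minimal numpy sketch of this padding step (shapes illustrative). Padding a zero row onto X_A' together with a matching zero column onto W_B' leaves the product W_B' * X_A' unchanged:

```python
import numpy as np

def pad_to_even(W_B: np.ndarray, X_A: np.ndarray):
    """Block 801: zero-pad the shared dimension J so it becomes even."""
    if X_A.shape[0] % 2 == 1:                              # J = rows of X_A'
        X_A = np.vstack([X_A, np.zeros((1, X_A.shape[1]))])  # extra zero row
        W_B = np.hstack([W_B, np.zeros((W_B.shape[0], 1))])  # matching zero column
    return W_B, X_A

W_B, X_A = np.ones((2, 3)), np.ones((3, 5))                # J = 3, odd
W_Bp, X_Ap = pad_to_even(W_B, X_A)
assert np.allclose(W_Bp @ X_Ap, W_B @ X_A)                 # product preserved
```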
The operations of blocks 802 through 804 are then performed at Alice to obtain a random feature matrix X1 and second and third feature matrices X2 and X3. Specifically, at block 802, the random feature matrix X1 is generated; the dimension of the random feature matrix X1 is the same as that of the first feature matrix X_A', i.e., J×K. At block 803, the first feature matrix X_A' is subtracted from the random feature matrix X1 to obtain the second feature matrix X2 = X1 - X_A', of dimension J×K. At block 804, the odd-row sub-matrix X1_o of the random feature matrix X1 is subtracted from the even-row sub-matrix X1_e to obtain the third feature matrix X3 = X1_e - X1_o, of dimension j×K, where j = J/2.
In addition, the operations of blocks 805 through 807 are performed at Bob to obtain a random weight sub-matrix W_B1 and second and third weight sub-matrices W_B2 and W_B3. Specifically, at block 805, the random weight sub-matrix W_B1 is generated; the dimension of the random weight sub-matrix W_B1 is the same as that of the first weight sub-matrix W_B', i.e., I×J. At block 806, the first weight sub-matrix W_B' and the random weight sub-matrix W_B1 are summed to obtain the second weight sub-matrix W_B2 = W_B' + W_B1, of dimension I×J. At block 807, the odd-column sub-matrix W_B1_o of the random weight sub-matrix W_B1 is added to the even-column sub-matrix W_B1_e to obtain the third weight sub-matrix W_B3 = W_B1_o + W_B1_e, of dimension I×j, where j = J/2.
Then, at block 808, Alice sends the generated second and third feature matrices X2 and X3 to Bob, and at block 809, Bob sends the second weight sub-matrix W_B2 and the third weight sub-matrix W_B3 to Alice.
Next, at block 810, at Alice, the first matrix product Y1 = W_B2 * (2*X_A' - X1) - W_B3 * (X3 + X1_e) is calculated, and at block 812, the first matrix product Y1 is sent to Bob.
At block 811, at Bob, the second matrix product Y2 = (W_B' + 2*W_B1) * X2 + (W_B3 + W_B1_o) * X3 is calculated, and at block 813, the second matrix product Y2 is sent to Alice.
Then, at blocks 814 and 815, the first and second matrix products Y1 and Y2 are summed at Alice and Bob, respectively, to obtain W_B' * X_A' = Y_A2 = Y1 + Y2.
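The following self-contained numpy sketch walks through blocks 801 through 815 and checks the reconstruction. Two reading notes, both assumptions rather than statements of the disclosure: J is taken as already even (block 801), and block 804 is read as X3 = X1_e - X1_o, the subtraction direction under which the formulas printed for blocks 810 and 811 make Y1 + Y2 reconstruct W_B' * X_A' exactly:

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, K = 2, 4, 5                      # J even (after block 801's padding)

W_B = rng.standard_normal((I, J))      # Bob's first weight sub-matrix W_B'
X_A = rng.standard_normal((J, K))      # Alice's first feature matrix X_A'

odd, even = slice(0, J, 2), slice(1, J, 2)   # 1-based odd/even rows or columns

# Alice, blocks 802-804
X1 = rng.standard_normal((J, K))       # random feature matrix
X2 = X1 - X_A                          # second feature matrix
X3 = X1[even] - X1[odd]                # third feature matrix (see note above)

# Bob, blocks 805-807
W_B1 = rng.standard_normal((I, J))     # random weight sub-matrix
W_B2 = W_B + W_B1                      # second weight sub-matrix
W_B3 = W_B1[:, odd] + W_B1[:, even]    # third weight sub-matrix

# Blocks 808-809: Alice sends X2, X3 to Bob; Bob sends W_B2, W_B3 to Alice.

Y1 = W_B2 @ (2 * X_A - X1) - W_B3 @ (X3 + X1[even])      # block 810, Alice
Y2 = (W_B + 2 * W_B1) @ X2 + (W_B3 + W_B1[:, odd]) @ X3  # block 811, Bob

# Blocks 812-815: exchange Y1 and Y2 and sum.
assert np.allclose(Y1 + Y2, W_B @ X_A)  # Y_A2 recovered, inputs never revealed
```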
Furthermore, it is to be noted that FIGS. 3-8 illustrate a model training scenario with 1 first training participant and 2 second training participants; in other examples of the present disclosure, 1 second training participant or more than 2 second training participants may also be included.
With the logistic regression model training method disclosed in FIGS. 3-8, the model parameters of the logistic regression model can be obtained by training without leaking the secret data of the plurality of training participants, and the model training workload depends only linearly, rather than exponentially, on the number of feature samples used for training, so that the efficiency of model training can be improved while the security of each training participant's data is ensured.
FIG. 9 shows a schematic diagram of an apparatus (hereinafter referred to as a model training apparatus) 900 for co-training a logistic regression model via a plurality of training participants according to an embodiment of the present disclosure. In this embodiment, the logistic regression model includes a first number of sub-models, the first number being equal to the number of training participants, and each training participant has one sub-model. The training participants include a first training participant and a second number of second training participants. The first training participant has a first feature sample subset and a marker value, and each second training participant has a second feature sample subset; the first and second feature sample subsets are obtained by vertically slicing the feature sample set used for model training, and the second number is equal to the first number minus one. The model training apparatus 900 is located on the first training participant side.
As shown in FIG. 9, the model training apparatus 900 includes a model conversion unit 901, a sample conversion unit 902, a matrix product acquisition unit 903, a marker value decomposition unit 904, a marker value transmitting unit 905, a predicted value determination unit 906, a prediction difference determination unit 907, a model update amount determination unit 908, a model update unit 909 and a model determination unit 910.
The model conversion unit 901 is configured to perform model conversion processing on the sub-models of the respective training participants to obtain conversion sub-models of the respective training participants. The operation of the model conversion unit 901 may refer to the operation of the block 302 described above with reference to fig. 3 and the operation described with reference to fig. 4.
During model training, the sample conversion unit 902, the matrix product acquisition unit 903, the marker value decomposition unit 904, the marker value transmitting unit 905, the predicted value determination unit 906, the prediction difference determination unit 907, the model update amount determination unit 908 and the model update unit 909 are configured to execute operations cyclically until a cycle end condition is satisfied. The cycle end condition may include: reaching a predetermined number of cycles; or the determined prediction difference being within a predetermined range. When the cycle end condition is not satisfied, the updated conversion sub-model of each training participant is used as the current conversion sub-model for the next cycle.
Specifically, during each cycle, the sample conversion unit 902 is configured to perform a vertical-to-horizontal segmentation conversion on the feature sample set to obtain converted feature sample subsets at the respective training participants. The operation of the sample conversion unit 902 may refer to the operation of block 303 described above with reference to fig. 3 and to the process described with reference to fig. 5.
The matrix product acquisition unit 903 is configured to obtain the matrix product between the model-converted logistic regression model and the conversion feature sample subset at the first training participant using secret sharing matrix multiplication. The operation of the matrix product acquisition unit 903 may refer to the operation of block 304 described above with reference to FIG. 3 and the operations described with reference to FIGS. 6-8.
The marker value decomposition unit 904 is configured to decompose the marker value into a first number of partial marker values. The operation of the marker value decomposition unit 904 may refer to the operation of block 305 described above with reference to FIG. 3.
The marker value transmitting unit 905 is configured to transmit each of the second number of partial marker values to the corresponding second training participant. The operation of the marker value transmitting unit 905 may refer to the operation of block 306 described above with reference to FIG. 3.
The predicted value determination unit 906 is configured to determine the current predicted value at the first training participant based on the matrix product at the first training participant. The operation of the predicted value determination unit 906 may refer to the operation of block 307 described above with reference to FIG. 3.
The prediction difference determination unit 907 is configured to determine a prediction difference between the current prediction value of the first training participant and the corresponding partial marker value. The operation of the prediction difference determination unit 907 may refer to the operation of block 308 described above with reference to fig. 3.
The model update amount determination unit 908 is configured to determine an amount of model update at the first training participant based on the feature sample set and the predicted difference value at the first training participant. The operation of the model update amount determination unit 908 may refer to the operation of block 309 described above with reference to fig. 3.
The model update unit 909 is configured to update the current conversion sub-model of the first training participant based on the current conversion sub-model of the first training participant and the corresponding model update amount.
The model determination unit 910 is configured to determine a sub-model of the first training participant based on the conversion sub-model of the respective training participant when the loop end condition is satisfied.
In one example of the present disclosure, the matrix product acquisition unit 903 may be configured to obtain the matrix product between the model-converted logistic regression model and the conversion feature sample subset of the first training participant using trusted initializer secret sharing matrix multiplication. The operation of the matrix product acquisition unit 903 may refer to the operations performed at Alice described above with reference to FIG. 6.
In another example of the present disclosure, the matrix product acquisition unit 903 may be configured to obtain the matrix product between the model-converted logistic regression model and the conversion feature sample subset of the first training participant using untrusted initializer secret sharing matrix multiplication. The operation of the matrix product acquisition unit 903 may refer to the operations performed at Alice described above with reference to FIGS. 7-8.
In one example of the present disclosure, the sample conversion unit 902 may include a sample decomposition module (not shown), a sample transmission/reception module (not shown), and a sample splicing module (not shown). The sample decomposition module is configured to decompose a first subset of feature samples into a first number of first partial subsets of feature samples. The sample transmitting/receiving module is configured to transmit each of a second number of first partial feature sample subsets to a corresponding second training participant, and to receive a corresponding second partial feature sample subset from each second training participant, each second partial feature sample subset received being one of the first number of second partial feature sample subsets obtained by decomposing the second feature sample subset at each second training participant. The sample stitching module is configured to stitch the remaining first partial feature sample subset and the received respective second partial feature sample subsets to obtain a converted feature sample subset at the first training participant.
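The three modules can be pictured as follows. The sketch assumes, mirroring the sharing used elsewhere in the disclosure, that the decomposition is additive (the scheme is not restated here, so this is an assumption), and it abstracts communication into plain variables. Stitching is then a horizontal concatenation of matching shares, which makes the three converted subsets additive shares of the full feature sample set:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d_A, d_B, d_C = 6, 2, 3, 1           # samples; feature widths per participant

def share(m, k, rng):
    """Decompose m into k additive shares (assumed sharing scheme)."""
    s = [rng.standard_normal(m.shape) for _ in range(k - 1)]
    return s + [m - sum(s)]

X_A, X_B, X_C = (rng.standard_normal((n, d)) for d in (d_A, d_B, d_C))
A_sh, B_sh, C_sh = share(X_A, 3, rng), share(X_B, 3, rng), share(X_C, 3, rng)

# Sample transmitting/receiving: Alice keeps A_sh[0], receives B_sh[0], C_sh[0].
# Sample stitching: concatenate the matching shares along the feature axis.
X_conv_A = np.hstack([A_sh[0], B_sh[0], C_sh[0]])   # converted subset at Alice
X_conv_B = np.hstack([A_sh[1], B_sh[1], C_sh[1]])   # likewise at Bob
X_conv_C = np.hstack([A_sh[2], B_sh[2], C_sh[2]])   # and at Charlie

X_full = np.hstack([X_A, X_B, X_C])                 # the vertically split set
assert np.allclose(X_conv_A + X_conv_B + X_conv_C, X_full)
```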
FIG. 10 shows a block diagram of an apparatus for co-training a logistic regression model via a plurality of training participants (hereinafter referred to as model training apparatus 1000) according to an embodiment of the present disclosure. In this embodiment, the logistic regression model includes a first number of sub-models, the first number being equal to the number of training participants, and each training participant has one sub-model. The training participants include a first training participant and a second number of second training participants. The first training participant has a first feature sample subset and a marker value, and each second training participant has a second feature sample subset; the first and second feature sample subsets are obtained by vertically slicing the feature sample set used for model training, and the second number is equal to the first number minus one. The model training apparatus 1000 is located on the second training participant side.
As shown in FIG. 10, the model training apparatus 1000 includes a model conversion unit 1010, a sample conversion unit 1020, a matrix product acquisition unit 1030, a marker value receiving unit 1040, a predicted value determination unit 1050, a prediction difference determination unit 1060, a model update amount determination unit 1070, a model update unit 1080 and a model determination unit 1090.
The model conversion unit 1010 is configured to perform model conversion processing on the sub-models of the respective training participants to obtain conversion sub-models of the respective training participants. The operation of the model conversion unit 1010 may refer to the operation of the block 302 described above with reference to fig. 3 and the operation described with reference to fig. 4.
During model training, the sample conversion unit 1020, the matrix product acquisition unit 1030, the marker value receiving unit 1040, the predicted value determination unit 1050, the prediction difference determination unit 1060, the model update amount determination unit 1070 and the model update unit 1080 are configured to execute operations cyclically until a cycle end condition is satisfied. The cycle end condition may include: reaching a predetermined number of cycles; or the determined prediction difference being within a predetermined range. When the cycle end condition is not satisfied, the updated conversion sub-model of each training participant is used as the current conversion sub-model for the next cycle.
Specifically, during each cycle, the sample conversion unit 1020 is configured to perform a vertical-to-horizontal segmentation conversion on the feature sample set to obtain converted feature sample subsets at the respective training participants. The operation of the sample conversion unit 1020 may refer to the operation of the block 303 described above with reference to fig. 3 and the operation described with reference to fig. 5.
The matrix product acquisition unit 1030 is configured to obtain the matrix product between the model-converted logistic regression model and the conversion feature sample subset at the second training participant using secret sharing matrix multiplication. The operation of the matrix product acquisition unit 1030 may refer to the operation of block 304 described above with reference to FIG. 3 and to the processes described with reference to FIGS. 6-8.
The marker value receiving unit 1040 is configured to receive, from the first training participant, a corresponding partial marker value that is one of a first number of partial marker values resulting from decomposition of the marker value at the first training participant. The operation of the tag value receiving unit 1040 may refer to the operation of block 306 described above with reference to fig. 3.
The predicted value determination unit 1050 is configured to determine the current predicted value at the second training participant based on the matrix product at the second training participant. The operation of the predicted value determination unit 1050 may refer to the operation of block 307 described above with reference to FIG. 3.
The prediction difference determination unit 1060 is configured to determine a prediction difference at the second training participant using the current prediction value and the received partial marker value of the second training participant. The operation of the prediction difference determination unit 1060 may refer to the operation of block 308 described above with reference to fig. 3.
The model update amount determination unit 1070 is configured to obtain the model update amount of the second training participant using secret sharing matrix multiplication, based on the feature sample set and the predicted difference value of the second training participant. The operation of the model update amount determination unit 1070 may refer to the operation of block 309 described above with reference to FIG. 3.
The model update unit 1080 is configured to update the current conversion sub-model of the second training participant based on the current conversion sub-model of the second training participant and the corresponding model update amounts. The operation of model update unit 1080 may refer to the operation of block 310 described above with reference to fig. 3.
The model determination unit 1090 is configured to determine a sub-model of the second training participant based on the conversion sub-model of the respective training participants when the cycle end condition is satisfied. The operation of the model determination unit 1090 may refer to the operation of block 312 described above with reference to fig. 3.
In one example of the present disclosure, the sample conversion unit 1020 may include a sample decomposition module (not shown), a sample transmission/reception module (not shown), and a sample splicing module (not shown). The sample decomposition module is configured to decompose the second subset of feature samples into a first number of second partial subsets of feature samples. The sample transmitting/receiving module is configured to transmit each of a second number of second partial feature sample subsets to the first training participant and the remaining second training participants, and to receive a first partial feature sample subset from the first training participant and a second partial feature sample subset from each of the remaining second training participants, the first partial feature sample subset being one of a first number of first partial feature sample subsets obtained by decomposing the feature sample subset at the first training participant, each of the received second partial feature sample subsets being one of a first number of second partial feature sample subsets obtained by decomposing the respective second feature sample subset at each of the remaining second training participants. The sample stitching module is configured to stitch the remaining second partial feature sample subset, the received first and second partial feature sample subsets to obtain a converted feature sample subset at the second training participant.
In one example of the present disclosure, the matrix product acquisition unit 1030 may be configured to obtain the matrix product between the model-converted logistic regression model and the conversion feature sample subset of the second training participant using trusted initializer secret sharing matrix multiplication. The operation of the matrix product acquisition unit 1030 may refer to the operations performed at the second training participant described above with reference to FIG. 6.
In another example of the present disclosure, the matrix product acquisition unit 1030 may be configured to obtain the matrix product between the model-converted logistic regression model and the conversion feature sample subset of the second training participant using untrusted initializer secret sharing matrix multiplication. The operation of the matrix product acquisition unit 1030 may refer to the operations performed at the second training participant described above with reference to FIGS. 7-8.
In one example of the present disclosure, the model update amount determination unit 1070 may be configured to obtain the model update amount of the second training participant using trusted initializer secret sharing matrix multiplication, based on the feature sample set and the predicted difference value of the second training participant.
In another example of the present disclosure, the model update amount determination unit 1070 may be configured to obtain the model update amount of the second training participant using untrusted initializer secret sharing matrix multiplication, based on the feature sample set and the predicted difference value of the second training participant.
Embodiments of model training methods, apparatus, and systems according to the present disclosure are described above with reference to fig. 1-10. The above model training apparatus may be implemented in hardware, or may be implemented in software or a combination of hardware and software.
FIG. 11 illustrates a hardware structural diagram of a computing device 1100 for implementing collaborative training of a logistic regression model via a plurality of training participants according to an embodiment of the present disclosure. As shown in FIG. 11, the computing device 1100 may include at least one processor 1110, a storage (e.g., a non-volatile storage) 1120, a memory 1130 and a communication interface 1140, and the at least one processor 1110, the storage 1120, the memory 1130 and the communication interface 1140 are connected together via a bus 1160. The at least one processor 1110 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in the storage.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 1110 to: perform model conversion processing on the sub-models of the respective training participants to obtain the conversion sub-models of the respective training participants; execute the following cyclic process until a cycle end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a converted feature sample subset at each training participant; obtaining a matrix product between the model-converted logistic regression model and the conversion feature sample subset at the first training participant using secret sharing matrix multiplication; decomposing the marker value into a first number of partial marker values and respectively transmitting each of the second number of partial marker values to the corresponding second training participant; determining a current predicted value at the first training participant based on the matrix product at the first training participant; determining a prediction difference between the current predicted value of the first training participant and the corresponding partial marker value; determining a model update amount at the first training participant based on the feature sample set and the prediction difference at the first training participant; and updating the conversion sub-model of the first training participant based on the current conversion sub-model of the first training participant and the corresponding model update amount, wherein, when the cycle end condition is not satisfied, the updated conversion sub-model of each training participant is used as the current conversion sub-model for the next cycle; and determine the sub-model of the first training participant based on the conversion sub-models of the respective training participants when the cycle end condition is satisfied.
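For orientation, here is a toy plaintext simulation of that cycle. It is not the protocol: every secret-sharing step is replaced by the value it would reconstruct, the sigmoid is replaced by a first-order Taylor approximation (claim 3 below refers to a Taylor expansion formula), and all names and constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, P = 32, 6, 3
X = rng.standard_normal((n, d))              # converted feature sample set
y = (rng.random(n) < 0.5).astype(float)      # marker values held by Alice
W = [0.01 * rng.standard_normal(d) for _ in range(P)]   # conversion sub-models
lr, cycles = 0.1, 50                         # cycle end: fixed cycle count

for _ in range(cycles):
    z = X @ sum(W)                           # matrix product over all sub-models
    y_hat = 0.5 + z / 4                      # Taylor approximation of sigmoid(z)
    diff = y_hat - y                         # prediction difference
    grad = X.T @ diff / n                    # model update amount
    W = [w - lr * grad / P for w in W]       # each share absorbs 1/P of the update

model = sum(W)                               # determine the final model
```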
It should be appreciated that computer-executable instructions stored in memory, when executed, cause the at least one processor 1110 to perform the various operations and functions described above in connection with fig. 1-10 in various embodiments of the present disclosure.
FIG. 12 illustrates a hardware structural diagram of a computing device 1200 for implementing collaborative training of a logistic regression model via a plurality of training participants according to an embodiment of the present disclosure. As shown in FIG. 12, the computing device 1200 may include at least one processor 1210, a storage (e.g., a non-volatile storage) 1220, a memory 1230 and a communication interface 1240, and the at least one processor 1210, the storage 1220, the memory 1230 and the communication interface 1240 are connected together via a bus 1260. The at least one processor 1210 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in the storage.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 1210 to: perform model conversion processing on the sub-models of the respective training participants to obtain the conversion sub-models of the respective training participants; execute the following cyclic process until a cycle end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a converted feature sample subset at each training participant; obtaining a matrix product between the model-converted logistic regression model and the conversion feature sample subset at the second training participant using secret sharing matrix multiplication; receiving a corresponding partial marker value from the first training participant, the partial marker value being one of the first number of partial marker values obtained by decomposing the marker value at the first training participant; determining a current predicted value at the second training participant based on the matrix product at the second training participant; determining a prediction difference at the second training participant using the current predicted value of the second training participant and the received partial marker value; obtaining a model update amount at the second training participant using secret sharing matrix multiplication based on the feature sample set and the prediction difference of the second training participant; and updating the conversion sub-model of the second training participant based on the current conversion sub-model of the second training participant and the corresponding model update amount, wherein, when the cycle end condition is not satisfied, the updated conversion sub-model of each training participant is used as the current conversion sub-model for the next cycle; and determine the sub-model of the second training participant based on the conversion sub-models of the respective training participants when the cycle end condition is satisfied.
It should be appreciated that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1210 to perform the various operations and functions described above in connection with fig. 1-10 in various embodiments of the present disclosure.
According to one embodiment, a program product such as a machine-readable medium (e.g., a non-transitory machine-readable medium) is provided. The machine-readable medium may have instructions (i.e., the elements described above implemented in software) that, when executed by a machine, cause the machine to perform the various operations and functions described above in connection with fig. 1-10 in various embodiments of the disclosure. In particular, a system or apparatus provided with a readable storage medium having stored thereon software program code implementing the functions of any of the above embodiments may be provided, and a computer or processor of the system or apparatus may be caused to read out and execute instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium may implement the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or cloud by a communications network.
It will be appreciated by those skilled in the art that various changes and modifications can be made to the embodiments disclosed above without departing from the spirit of the invention. Accordingly, the scope of the invention should be limited only by the attached claims.
It should be noted that not all the steps and units in the above flowcharts and the system configuration diagrams are necessary, and some steps or units may be omitted according to actual needs. The order of execution of the steps is not fixed and may be determined as desired. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by multiple physical entities, or may be implemented jointly by some components in multiple independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may include permanently dedicated circuitry or logic (e.g., a dedicated processor, an FPGA or an ASIC) to perform the corresponding operations. The hardware unit or processor may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The particular implementation manner (mechanical, dedicated permanent circuitry, or temporarily configured circuitry) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments, but does not represent all embodiments that may be implemented or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (17)

1. A method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a marker value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set for model training, the first number being not less than 2, the second number being equal to the first number minus one, the method performed by the first training participant, the method comprising:
performing model conversion processing on the sub-model of the first training participant in cooperation with each second training participant to obtain a conversion sub-model at the first training participant;
the following loop process is performed until the loop end condition is satisfied:
performing vertical-horizontal segmentation conversion on the feature sample set in cooperation with each second training participant to obtain a converted feature sample subset at the first training participant;
obtaining a matrix product between the model-converted logistic regression model and the subset of conversion feature samples at the first training participant using secret sharing matrix multiplication;
decomposing the marker value into the first number of partial marker values, and respectively transmitting each of the second number of partial marker values to a corresponding second training participant;
determining a current predicted value at the first training participant based on a matrix product at the first training participant;
determining a predicted difference between the current predicted value of the first training participant and the corresponding partial marker value;
determining a model update amount at the first training participant based on the converted feature sample set and a prediction difference value at the first training participant;
updating the conversion sub-model of the first training participant based on the current conversion sub-model of the first training participant and a corresponding model update amount, wherein when the cycle end condition is not satisfied, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next cycle process; and
determining a sub-model of the first training participant based on the conversion sub-model of the respective training participant when the cycle end condition is satisfied,
wherein the model conversion processing of the sub-model of the first training participant in cooperation with the respective second training participants to obtain a converted sub-model at the first training participant comprises:
model decomposing the sub-model of the first training participant to obtain the first number of sub-model components;
respectively transmitting one of the obtained second number of sub-model components to each second training participant, and receiving, from each second training participant, one sub-model component obtained by model decomposition of the sub-model at that second training participant; and
model stitching the locally retained sub-model components and the sub-model components received from the respective second training participants to obtain a transformed sub-model at the first training participant,
wherein cooperating with each second training participant to perform a vertical-to-horizontal segmentation conversion on the feature sample set to obtain a converted feature sample subset at the first training participant comprises:
decomposing the first subset of feature samples into the first number of first partial subsets of feature samples;
transmitting each of the second number of first partial feature sample subsets to a corresponding second training participant;
receiving a second partial feature sample subset from each second training participant, each received second partial feature sample subset being one of the first number of second partial feature sample subsets obtained by decomposing the second feature sample subset at that second training participant; and
stitching the remaining first partial feature sample subset and the received second partial feature sample subsets to obtain the converted feature sample subset at the first training participant.
2. The method of claim 1, wherein using secret sharing matrix multiplication to obtain a matrix product between the model-converted logistic regression model and the subset of conversion feature samples of the first training participant comprises:
obtaining a matrix product between the model-converted logistic regression model and the subset of conversion feature samples of the first training participant using a trusted initializer secret sharing matrix multiplication; or
obtaining a matrix product between the model-converted logistic regression model and the subset of conversion feature samples of the first training participant using a non-trusted initializer secret sharing matrix multiplication.
3. The method of claim 1, wherein determining the current predicted value at the first training participant based on the matrix product at the first training participant comprises:
determining the current predicted value at the first training participant based on the matrix product at the first training participant according to a Taylor expansion formula.
4. A method according to any one of claims 1 to 3, wherein the cycle end condition comprises:
reaching a predetermined number of cycles; or
the determined prediction difference being within a predetermined range.
5. A method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a marker value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set for model training, the first number being not less than 2, the second number being equal to the first number minus one, the method performed by the second training participant, the method comprising:
Performing model conversion processing on the sub-model of the second training participant in cooperation with the rest of the training participants to obtain a conversion sub-model at the second training participant;
the following loop process is performed until the loop end condition is satisfied:
performing vertical-to-horizontal segmentation conversion on the feature sample set in cooperation with the remaining training participants to obtain a converted feature sample subset at the second training participant;
obtaining a matrix product between the model-converted logistic regression model and the subset of conversion feature samples at the second training participant using secret sharing matrix multiplication;
receiving a corresponding partial marker value from the first training participant, the partial marker value being one of the first number of partial marker values resulting from decomposition of the marker value at the first training participant;
determining a current predicted value at the second training participant based on a matrix product at the second training participant;
determining a predicted difference value at the second training participant using the current predicted value and the received partial marker value of the second training participant;
obtaining a model update amount at the second training participant using secret sharing matrix multiplication based on the converted feature sample set and a predicted difference value of the second training participant;
updating the conversion sub-model of the second training participant based on the current conversion sub-model of the second training participant and a corresponding model update amount, wherein when the cycle end condition is not satisfied, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next cycle process; and
determining a sub-model of the second training participant based on the conversion sub-model of the respective training participant when the cycle end condition is satisfied,
wherein performing model conversion processing on the sub-model of the second training participant in cooperation with the remaining training participants to obtain a converted sub-model at the second training participant comprises:
model decomposing the sub-model of the second training participant to obtain the first number of sub-model components;
respectively transmitting one of the obtained second number of sub-model components to each of the remaining training participants, and receiving, from each of the remaining training participants, one sub-model component obtained by model decomposition of the sub-model at that training participant; and
model stitching the locally retained sub-model component and the sub-model components received from the respective remaining training participants to obtain a transformed sub-model at the second training participant,
wherein cooperating with the remaining training participants to perform a vertical-to-horizontal segmentation conversion on the feature sample set to obtain a converted feature sample subset at the second training participant comprises:
decomposing the second subset of feature samples into the first number of second partial subsets of feature samples;
transmitting each of the second number of second partial feature sample subsets to the first training participant and to the other second training participants;
receiving a first partial feature sample subset from the first training participant and a second partial feature sample subset from each of the remaining second training participants, the first partial feature sample subset being one of a first number of first partial feature sample subsets obtained by decomposing the feature sample subset at the first training participant, each second partial feature sample subset received being one of a first number of second partial feature sample subsets obtained by decomposing the respective second feature sample subset at each of the remaining second training participants; and
stitching the remaining second partial feature sample subset and the received first and second partial feature sample subsets to obtain the converted feature sample subset at the second training participant.
6. The method of claim 5, wherein using secret sharing matrix multiplication to obtain a matrix product between the model-converted logistic regression model and the subset of conversion feature samples of the second training participant comprises:
obtaining a matrix product between the model-converted logistic regression model and the subset of conversion feature samples of the second training participant using a trusted initializer secret sharing matrix multiplication; or
obtaining a matrix product between the model-converted logistic regression model and the subset of conversion feature samples of the second training participant using a non-trusted initializer secret sharing matrix multiplication.
7. The method of claim 5, wherein obtaining the model update amount of the second training participant using secret sharing matrix multiplication based on the converted feature sample set and the predicted difference value of the second training participant comprises:
obtaining the model update amount of the second training participant using a trusted initializer secret sharing matrix multiplication based on the converted feature sample set and the predicted difference value of the second training participant; or
obtaining the model update amount of the second training participant using a non-trusted initializer secret sharing matrix multiplication based on the converted feature sample set and the predicted difference value of the second training participant.
8. An apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first subset of feature samples and a marker value, each second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing a set of feature samples for model training, the first number being not less than 2, the second number being equal to the first number minus one, the apparatus being on the first training participant side, the apparatus comprising:
a model conversion unit configured to perform model conversion processing on the sub-model of the first training participant in cooperation with each second training participant to obtain a converted sub-model at the first training participant;
A sample conversion unit configured to cooperate with each second training participant to perform a vertical-to-horizontal segmentation conversion on the feature sample set to obtain a converted feature sample subset at the first training participant;
a matrix product acquisition unit configured to obtain a matrix product between the model-converted logistic regression model and the subset of conversion feature samples at the first training participant using secret sharing matrix multiplication;
a marker value decomposition unit configured to decompose the marker value into the first number of partial marker values;
a marker value transmitting unit configured to transmit each of the second number of partial marker values to a corresponding second training participant, respectively;
a predictor determination unit configured to determine a current predictor at the first training participant based on a matrix product at the first training participant;
a prediction difference determining unit configured to determine a prediction difference between a current predicted value of the first training participant and a corresponding partial marker value;
a model update amount determination unit configured to determine a model update amount at the first training participant based on the converted feature sample set and a prediction difference value at the first training participant;
A model updating unit configured to update a conversion sub-model of the first training participant based on a current conversion sub-model of the first training participant and a corresponding model update amount; and
a model determination unit configured to determine a sub-model of the first training participant based on the conversion sub-model of the respective training participant when the cycle end condition is satisfied,
wherein the sample conversion unit, the matrix product acquisition unit, the marker value decomposition unit, the marker value transmitting unit, the predicted value determination unit, the prediction difference determination unit, the model update amount determination unit and the model update unit are configured to cyclically execute operations until the cycle end condition is satisfied,
wherein, when the cycle end condition is not satisfied, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next cycle process,
wherein the model conversion unit is configured to:
model decomposing the sub-model of the first training participant to obtain the first number of sub-model components;
respectively transmitting one of the obtained second number of sub-model components to each of the remaining training participants, and receiving, from each of the remaining training participants, one sub-model component obtained by model decomposition of the sub-model at that training participant; and
model stitching the locally retained sub-model component and the sub-model components received from the respective remaining training participants to obtain a transformed sub-model at the first training participant,
wherein the sample conversion unit includes:
a sample decomposition module configured to decompose the first subset of feature samples into the first number of first partial subsets of feature samples;
a sample transmitting/receiving module configured to transmit each of the second number of first partial feature sample subsets to a corresponding second training participant, and to receive a corresponding second partial feature sample subset from each second training participant, each second partial feature sample subset received being one of the first number of second partial feature sample subsets obtained by decomposing the second feature sample subset at each second training participant; and
a sample stitching module configured to stitch the remaining first partial feature sample subset and the received second number of second partial feature sample subsets to obtain a converted feature sample subset at the first training participant.
9. The apparatus of claim 8, wherein the matrix product acquisition unit is configured to:
obtain a matrix product between the model-converted logistic regression model and the subset of conversion feature samples at the first training participant using a trusted initializer secret sharing matrix multiplication; or
obtain a matrix product between the model-converted logistic regression model and the subset of conversion feature samples at the first training participant using a non-trusted initializer secret sharing matrix multiplication.
10. An apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first subset of feature samples and a marker value, each second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing a set of feature samples for model training, the first number being not less than 2, the second number being equal to the first number minus one, the apparatus being on the second training participant side, the apparatus comprising:
A model conversion unit configured to perform model conversion processing on the sub-model of the second training participant in cooperation with the rest of the training participants to obtain a converted sub-model at the second training participant;
a sample conversion unit configured to cooperate with the remaining training participants to perform a vertical-to-horizontal segmentation conversion on the feature sample set to obtain a converted feature sample subset at the second training participant;
a matrix product acquisition unit configured to obtain a matrix product between the model-converted logistic regression model and the subset of conversion feature samples at the second training participant using secret sharing matrix multiplication;
a marker value receiving unit configured to receive a corresponding partial marker value from the first training participant, the partial marker value being one of the first number of partial marker values resulting from decomposition of the marker value at the first training participant;
a predictor determination unit configured to determine a current predictor at the second training participant based on a matrix product at the second training participant;
a prediction difference determination unit configured to determine a prediction difference at the second training participant using the current prediction value of the second training participant and the received partial marker value;
A model update amount determination unit configured to obtain a model update amount of the second training participant using secret sharing matrix multiplication based on the converted feature sample set and a predicted difference value of the second training participant;
a model updating unit configured to update a conversion sub-model of the second training participant based on a current conversion sub-model of the second training participant and a corresponding model update amount; and
a model determination unit configured to determine a sub-model of the second training participant based on the conversion sub-model of the respective training participant when the cycle end condition is satisfied,
wherein the sample conversion unit, the matrix product acquisition unit, the marker value receiving unit, the predicted value determination unit, the prediction difference determination unit, the model update amount determination unit and the model update unit are configured to cyclically execute operations until the cycle end condition is satisfied,
wherein, when the cycle end condition is not satisfied, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next cycle process,
wherein the model conversion unit is configured to:
Model decomposing the sub-model of the second training participant to obtain the first number of sub-model components;
respectively transmitting one of the obtained second number of sub-model components to each of the remaining training participants, and receiving, from each of the remaining training participants, one sub-model component obtained by model decomposition of the sub-model at that training participant; and
model stitching the locally retained sub-model component and the sub-model components received from each of the remaining training participants to obtain the conversion sub-model at the second training participant,
wherein the sample conversion unit includes:
a sample decomposition module configured to decompose the second feature sample subset into the first number of second partial feature sample subsets;
a sample transmitting/receiving module configured to transmit each of the second number of the second partial feature sample subsets to a respective one of the first training participant and the remaining second training participants, and to receive a first partial feature sample subset from the first training participant and a second partial feature sample subset from each of the remaining second training participants, the first partial feature sample subset being one of the first number of first partial feature sample subsets obtained by decomposing the first feature sample subset at the first training participant, and each received second partial feature sample subset being one of the first number of second partial feature sample subsets obtained by decomposing the respective second feature sample subset at each of the remaining second training participants; and
a sample stitching module configured to stitch the locally retained second partial feature sample subset, the received first partial feature sample subset, and the received second partial feature sample subsets to obtain the conversion feature sample subset at the second training participant.
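By way of illustration, the share-and-stitch operations of claim 10 can be sketched in Python/NumPy. This is a hypothetical reading of the claim, not code from the patent: shares are drawn over floating-point matrices purely for readability (a deployable scheme would use a finite field or fixed-point ring), the network exchange between participants is simulated by indexing a shared list, and every function name below is an assumption.

import numpy as np

def decompose(mat, n_parties, rng):
    # Split a matrix into n_parties additive shares that sum back to mat.
    shares = [rng.standard_normal(mat.shape) for _ in range(n_parties - 1)]
    shares.append(mat - sum(shares))
    return shares

def convert_models(sub_models, rng):
    # Model conversion: each participant decomposes its sub-model, keeps one
    # component, and stitches it with one component received from every other
    # participant; the stitched results are additive shares of the full model.
    n = len(sub_models)
    all_shares = [decompose(w, n, rng) for w in sub_models]
    return [np.concatenate([all_shares[i][j] for i in range(n)]) for j in range(n)]

def convert_samples(feature_subsets, rng):
    # Vertical-to-horizontal conversion: each vertical slice is decomposed,
    # and participant j stitches the j-th share of every slice column-wise,
    # yielding additive shares of the full (horizontally laid out) sample matrix.
    n = len(feature_subsets)
    all_shares = [decompose(x, n, rng) for x in feature_subsets]
    return [np.concatenate([all_shares[i][j] for i in range(n)], axis=1)
            for j in range(n)]

rng = np.random.default_rng(0)
X1, X2 = rng.standard_normal((4, 3)), rng.standard_normal((4, 2))  # vertical slices
converted = convert_samples([X1, X2], rng)
assert np.allclose(sum(converted), np.concatenate([X1, X2], axis=1))

Stitching in the same participant order everywhere matters: it keeps the converted sub-models and converted feature sample subsets aligned feature-wise, so their products are shares of the true model-times-sample product.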
11. The apparatus of claim 10, wherein the matrix product acquisition unit is configured to:
obtaining the matrix product between the model-converted logistic regression model and the conversion feature sample subset of the second training participant using trusted initializer secret sharing matrix multiplication; or
obtaining the matrix product between the model-converted logistic regression model and the conversion feature sample subset of the second training participant using non-trusted initializer secret sharing matrix multiplication.
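The trusted-initializer variant of claim 11 is commonly realized with Beaver-style multiplication triples. The two-party sketch below is an assumption about the construction, since the claim itself does not fix a protocol; the initializer's dealing and the opening of masked matrices (network messages in practice) are simulated locally, and all names are illustrative.

import numpy as np

rng = np.random.default_rng(1)

def split(mat):
    # Two additive shares of mat.
    s0 = rng.standard_normal(mat.shape)
    return s0, mat - s0

def deal_triple(shape_a, shape_b):
    # Offline phase: the trusted initializer deals shares of random U, V
    # and of their product Z = U @ V.
    U, V = rng.standard_normal(shape_a), rng.standard_normal(shape_b)
    return list(zip(split(U), split(V), split(U @ V)))

def secret_shared_matmul(a_shares, b_shares, triples):
    # Online phase: parties holding shares of A and B obtain shares of A @ B.
    # Only D = A - U and E = B - V are opened; A and B themselves stay hidden.
    D = sum(a - u for a, (u, _, _) in zip(a_shares, triples))
    E = sum(b - v for b, (_, v, _) in zip(b_shares, triples))
    out = []
    for i, (u, v, z) in enumerate(triples):
        c = z + D @ v + u @ E  # share of U@V + D@V + U@E
        if i == 0:
            c = c + D @ E      # exactly one party adds the public term
        out.append(c)
    return out  # valid because A@B = D@E + D@V + U@E + U@V

A, B = rng.standard_normal((2, 3)), rng.standard_normal((3, 4))
triples = deal_triple(A.shape, B.shape)
c_shares = secret_shared_matmul(split(A), split(B), triples)
assert np.allclose(sum(c_shares), A @ B)

The non-trusted-initializer alternative replaces the dealer with a triple-generation protocol (for example, one based on homomorphic encryption) in which no single party ever holds U, V, or Z in full.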
12. The apparatus of claim 10, wherein the model update amount determination unit is configured to:
obtaining the model update amount of the second training participant using trusted initializer secret sharing matrix multiplication based on the converted feature sample set and the prediction difference of the second training participant; or
obtaining the model update amount of the second training participant using non-trusted initializer secret sharing matrix multiplication based on the converted feature sample set and the prediction difference of the second training participant.
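In the clear, the quantities named in claims 10 and 12 are the familiar logistic-regression gradient step. The sketch below opens every secret-shared value to show the arithmetic; in the protocol the matrix products are computed with secret sharing matrix multiplication, the sigmoid over a shared value is itself typically approximated (a detail outside this sketch), and the learning rate alpha is an assumed hyperparameter.

import numpy as np

def one_cycle_in_the_clear(W, X, y, alpha=0.01):
    # One pass of the training loop with all shares opened.
    y_hat = 1.0 / (1.0 + np.exp(-(X @ W)))  # current predicted value
    e = y_hat - y                           # prediction difference
    grad = X.T @ e / len(y)                 # model update amount
    return W - alpha * grad                 # updated sub-model for the next cycle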
13. A system for co-training a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the system comprising:
a first training participant device comprising the apparatus of claim 8 or 9; and
a second number of second training participant devices, each second training participant device comprising an apparatus as claimed in any one of claims 10 to 12,
wherein each training participant has a sub-model, the first training participant has a first subset of feature samples and a marker value, and each second training participant has a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing the feature sample set for model training, the first number being not less than 2, the second number being equal to the first number minus one.
14. A computing device, comprising:
at least one processor; and
a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 4.
15. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 1 to 4.
16. A computing device, comprising:
at least one processor; and
a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 5 to 7.
17. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 5 to 7.
CN201910600908.9A (priority date 2019-07-04, filing date 2019-07-04): Model training method, device and system. Status: Active. Granted publication: CN112183565B (en).

Priority Applications (1)

Application Number: CN201910600908.9A | Priority Date: 2019-07-04 | Filing Date: 2019-07-04 | Title: Model training method, device and system

Publications (2)

Publication Number | Publication Date
CN112183565A (en) | 2021-01-05
CN112183565B (en) | 2023-07-14 (grant)

Family

Family ID: 73915148

Family Applications (1)

Application Number: CN201910600908.9A | Publication: CN112183565B (en) | Status: Active

Country Status (1)

Country: CN | Document: CN112183565B (en)

Citations (1)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN109413087A * | 2018-11-16 | 2019-03-01 | Jingdong City (Nanjing) Technology Co., Ltd. | Data sharing method, device, digital gateway and computer readable storage medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN105656692B * | 2016-03-14 | 2019-05-24 | Nanjing University of Posts and Telecommunications | Area monitoring method based on multi-instance multi-label learning in wireless sensor networks
CN110537191A * | 2017-03-22 | 2019-12-03 | Visa International Service Association | Privacy-preserving machine learning
US10270599B2 * | 2017-04-27 | 2019-04-23 | Factom, Inc. | Data reproducibility using blockchains
CN109241749A * | 2017-07-04 | 2019-01-18 | Alibaba Group Holding Ltd. | Data encryption and machine learning model training method, device and electronic equipment
CN109327421A * | 2017-08-01 | 2019-02-12 | Alibaba Group Holding Ltd. | Data encryption and machine learning model training method, device and electronic equipment
CN111543025A * | 2017-08-30 | 2020-08-14 | Inpher, Inc. | High precision privacy preserving real valued function evaluation
CN108520303A * | 2018-03-02 | 2018-09-11 | Alibaba Group Holding Ltd. | Recommendation system construction method and device
CN108921358B * | 2018-07-16 | 2021-10-01 | Guangdong University of Technology | Prediction method, prediction system and related device for power load characteristics
CN109446430B * | 2018-11-29 | 2021-10-01 | Xidian University | Product recommendation method and device, computer equipment and readable storage medium
EP3602379B1 * | 2019-01-11 | 2021-03-10 | Advanced New Technologies Co., Ltd. | A distributed multi-party security model training framework for privacy protection
CN110709863B * | 2019-01-11 | 2024-02-06 | Advanced New Technologies Co., Ltd. | Logistic regression modeling method, storage medium, and system using secret sharing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party

Title
Qiang Yang et al., "Federated Machine Learning: Concept and Applications", arXiv:1902.04885v1, pp. 1-19. *

Also Published As

Publication number | Publication date
CN112183565A (en) | 2021-01-05

Similar Documents

Publication | Title
CN111523673B (en) Model training method, device and system
CN110942147B (en) Neural network model training and predicting method and device based on multi-party safety calculation
CN110929886B (en) Model training and predicting method and system
CN110929870B (en) Method, device and system for training neural network model
CN111062487B (en) Machine learning model feature screening method and device based on data privacy protection
CN111061963B (en) Machine learning model training and predicting method and device based on multi-party safety calculation
CN111723404B (en) Method and device for jointly training business model
CN111523556B (en) Model training method, device and system
CN111079939B (en) Machine learning model feature screening method and device based on data privacy protection
CN112052942B (en) Neural network model training method, device and system
CN110851785A (en) Longitudinal federated learning optimization method, device, equipment and storage medium
CN111738438B (en) Method, device and system for training neural network model
CN110929887B (en) Logistic regression model training method, device and system
CN112132270B (en) Neural network model training method, device and system based on privacy protection
CN111523674B (en) Model training method, device and system
CN115730333A (en) Security tree model construction method and device based on secret sharing and homomorphic encryption
CN112183757B (en) Model training method, device and system
CN112183759B (en) Model training method, device and system
CN111523134A (en) Homomorphic encryption-based model training method, device and system
CN112101531A (en) Neural network model training method, device and system based on privacy protection
CN111737756B (en) XGB model prediction method, device and system performed through two data owners
CN111523675B (en) Model training method, device and system
CN114492850A (en) Model training method, device, medium, and program product based on federal learning
CN112183565B (en) Model training method, device and system
CN111738453B (en) Business model training method, device and system based on sample weighting

Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
REG | Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40044586)
GR01 | Patent grant