CN110210233A - Joint construction method, apparatus, storage medium and computer device for a prediction model - Google Patents

Publication number
CN110210233A
Authority
CN
China
Prior art keywords
enterprise
data
sample characteristics
encryption
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910319424.7A
Other languages
Chinese (zh)
Other versions
CN110210233B (en)
Inventor
毕野
黄博
吴振宇
王建明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201910319424.7A (patent CN110210233B)
Priority to PCT/CN2019/102911 (patent WO2020211240A1)
Publication of CN110210233A
Application granted
Publication of CN110210233B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a joint construction method, apparatus, storage medium and computer device for a prediction model, and relates to the field of information technology. Its main purpose is to prevent a third party from colluding with one data provider to leak the data of other data providers, so that data security is preserved while enterprises model jointly. The method includes: obtaining the sample feature data of each enterprise and the class labels corresponding to the sample feature data; constructing an encryption model for each enterprise according to the sample feature data and the class labels; inputting the sample feature data of each enterprise into the corresponding encryption model for encryption to obtain the encrypted data of each enterprise; and jointly constructing a prediction model according to the encrypted data of each enterprise and the corresponding class labels. The invention is applicable to the joint construction of prediction models.

Description

Joint construction method, apparatus, storage medium and computer device for a prediction model
Technical field
The present invention relates to the field of information technology, and more particularly to a joint construction method, apparatus, storage medium and computer device for a prediction model.
Background
Prediction models play a key role in areas such as intelligent financial recommendation and decision-making. To obtain a prediction model with higher accuracy, enterprises usually model jointly, especially when the phenomenon being analyzed is very complex and a large amount of data is needed for training. In joint modeling, enterprises do not share their real data. Before sharing, each enterprise usually encrypts its own data to protect its privacy, and the prediction model is then constructed from the encrypted data shared by the enterprises.
Currently, the commonly used prediction models are linear regression and logistic regression models. Encrypting data for these models usually requires a third party to provide each enterprise with a corresponding random number or public key; each enterprise encrypts its own data with the random number or public key provided by the third party and then shares it with the other enterprises. However, this encryption process depends on the presence of a third party and requires the third party to be honest. Otherwise, if the third party leaks the random number provided to one enterprise to other enterprises, those enterprises can recover that enterprise's data by working backward, and the enterprise's internal data is leaked. In addition, current encryption schemes depend on the prediction model selected. The two models above involve only addition and multiplication, so their encryption schemes are not applicable to all prediction models.
Summary of the invention
The present invention provides a joint construction method, apparatus, storage medium and computer device for a prediction model. Its main purpose is to prevent a third party from colluding with one data provider to leak the data of other data providers, so that data security is preserved while enterprises model jointly.
According to a first aspect of the present invention, a joint construction method for a prediction model is provided, comprising:
Obtaining the sample feature data of each enterprise and the class labels corresponding to the sample feature data;
Constructing an encryption model for each enterprise according to the sample feature data and the class labels;
Inputting the sample feature data of each enterprise into the corresponding encryption model for encryption to obtain the encrypted data of each enterprise;
Jointly constructing a prediction model according to the encrypted data of each enterprise and the corresponding class labels.
According to a second aspect of the present invention, a joint construction apparatus for a prediction model is provided, comprising:
An acquiring unit, configured to obtain the sample feature data of each enterprise and the class labels corresponding to the sample feature data;
A first construction unit, configured to construct an encryption model for each enterprise according to the sample feature data and the class labels;
An encryption unit, configured to input the sample feature data of each enterprise into the corresponding encryption model for encryption to obtain the encrypted data of each enterprise;
A second construction unit, configured to jointly construct a prediction model according to the encrypted data of each enterprise and the corresponding class labels.
According to a third aspect of the present invention, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the program performs the following steps:
Obtaining the sample feature data of each enterprise and the class labels corresponding to the sample feature data;
Constructing an encryption model for each enterprise according to the sample feature data and the class labels;
Inputting the sample feature data of each enterprise into the corresponding encryption model for encryption to obtain the encrypted data of each enterprise;
Jointly constructing a prediction model according to the encrypted data of each enterprise and the corresponding class labels.
According to a fourth aspect of the present invention, a computer device is provided, including a memory, a processor and a computer program stored in the memory and runnable on the processor. When executing the program, the processor performs the following steps:
Obtaining the sample feature data of each enterprise and the class labels corresponding to the sample feature data;
Constructing an encryption model for each enterprise according to the sample feature data and the class labels;
Inputting the sample feature data of each enterprise into the corresponding encryption model for encryption to obtain the encrypted data of each enterprise;
Jointly constructing a prediction model according to the encrypted data of each enterprise and the corresponding class labels.
Compared with the current approach, in which a third party must intervene to encrypt enterprise data before the enterprises model jointly on the encrypted data, the joint construction method, apparatus and computer device for a prediction model provided by the present invention obtain the sample feature data of each enterprise and the class labels corresponding to the sample feature data; construct an encryption model for each enterprise according to the sample feature data and the class labels; input the sample feature data of each enterprise into the corresponding encryption model for encryption to obtain the encrypted data of each enterprise; and jointly construct a prediction model according to the encrypted data of each enterprise and the corresponding class labels. No third party is involved: each enterprise encrypts its internal data with its own encryption model, which prevents a third party from colluding with other enterprises to leak an enterprise's internal data and improves the security of that data. In addition, encrypting enterprise data with an encryption model is applicable not only to linear regression and logistic regression prediction models but also to other prediction models.
Brief description of the drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their description are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 shows a flowchart of a joint construction method for a prediction model provided by an embodiment of the present invention;
Fig. 2 shows a flowchart of another joint construction method for a prediction model provided by an embodiment of the present invention;
Fig. 3 shows a schematic structural diagram of a joint construction apparatus for a prediction model provided by an embodiment of the present invention;
Fig. 4 shows a schematic structural diagram of another joint construction apparatus for a prediction model provided by an embodiment of the present invention;
Fig. 5 shows a schematic diagram of the physical structure of a computer device provided by an embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, where they do not conflict, the embodiments of this application and the features in those embodiments may be combined with each other.
As described in the background, the commonly used prediction models are currently linear regression and logistic regression models, and encrypting data for these models usually requires a third party to provide each enterprise with a corresponding random number or public key. However, this encryption process depends on the presence of a third party and requires the third party to be honest. Otherwise, if the third party colludes with other enterprises, an enterprise's internal data is leaked. In addition, current encryption schemes depend on the prediction model selected. The two models above involve only addition and multiplication, so their encryption schemes are not applicable to all prediction models.
To solve the above problems, an embodiment of the present invention provides a joint construction method for a prediction model. As shown in Fig. 1, the method includes:
101. Obtain the sample feature data of each enterprise and the class labels corresponding to the sample feature data.
Here, the class label corresponding to the sample feature data is the true class to which the sample feature data belongs. When enterprises model jointly, each enterprise shares its internal data with the other enterprises. So that the real data is not leaked to the other enterprises, an encryption model is established for each enterprise from its internal data, the internal data is encrypted with that model, and only the encrypted data is shared. To construct the encryption model of each enterprise, the sample feature data of each enterprise and the corresponding class labels must first be obtained. For example, suppose the enterprises jointly construct a prediction model that predicts a person's gender: the input of the prediction model is feature data and the output is the person's gender. When the prediction model is trained, the feature data in the training set includes time spent online, the time of day spent online, the amount spent on online shopping, favorite places to go and favorite foods, but no single enterprise holds all of these features. Enterprise P1 holds the sample features time spent online, time of day spent online and amount spent on online shopping, while enterprise P2 holds the sample features favorite places and favorite foods. Both P1 and P2 know the gender label corresponding to each group of their own sample feature data. The sample feature data of P1 and of P2 and the corresponding gender labels are obtained, and the encryption models of P1 and P2 are established from them respectively.
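The split of features between P1 and P2 in this example can be sketched as follows. This is a minimal illustration only: the record layout, the field names (`online_hours`, `online_period`, `shopping_spend`, `favorite_place`, `favorite_food`) and the values are hypothetical and do not come from the patent.

```python
# Hypothetical toy records; field names and values are illustrative only.
samples = [
    {"online_hours": 5.0, "online_period": "night", "shopping_spend": 320.0,
     "favorite_place": "gym", "favorite_food": "noodles", "gender": "male"},
    {"online_hours": 1.5, "online_period": "noon", "shopping_spend": 900.0,
     "favorite_place": "mall", "favorite_food": "salad", "gender": "female"},
]

# Each enterprise holds a different subset of the features.
P1_FEATURES = ("online_hours", "online_period", "shopping_spend")
P2_FEATURES = ("favorite_place", "favorite_food")


def split_for_enterprise(records, feature_names, label_name="gender"):
    """Return (feature rows, labels) restricted to one enterprise's features."""
    rows = [[record[name] for name in feature_names] for record in records]
    labels = [record[label_name] for record in records]
    return rows, labels


# P1 and P2 see different columns but the same labels for the same samples.
X1, y1 = split_for_enterprise(samples, P1_FEATURES)
X2, y2 = split_for_enterprise(samples, P2_FEATURES)
```

Each enterprise would then train its own encryption model on its own `(X, y)` pair, without seeing the other enterprise's columns.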
102. Construct an encryption model for each enterprise according to the sample feature data and the class labels.
In the embodiment of the present invention, to improve the accuracy of the prediction model, each enterprise shares its internal data with the other enterprises during joint modeling. So that the real data is not leaked to the other enterprises, an encryption model is constructed to encrypt the enterprise's internal data. Specifically, the encryption model may be a gradient descent tree encryption model: a preset gradient descent tree algorithm is trained on the obtained sample feature data and corresponding class labels of each enterprise to construct that enterprise's encryption model. For example, enterprise P1 has 100 groups of sample feature data, each comprising time spent online, time of day spent online and amount spent on online shopping, and each group corresponds to a unique gender label. The gradient descent tree algorithm is trained on these 100 groups to construct the encryption model, which is then used to encrypt the enterprise's internal data and preserve its privacy.
103. Input the sample feature data of each enterprise into the corresponding encryption model for encryption to obtain the encrypted data of each enterprise.
In the embodiment of the present invention, after each enterprise has established an encryption model from its own sample feature data and class labels, it inputs its sample feature data into that model, which converts the sample feature data into a sample feature vector composed of 0/1 elements. The enterprise's internal data is encrypted in this way.
For example, enterprise P1 constructs an encryption model from its own sample feature data. The encryption model is a gradient descent tree encryption model that contains two trees with five leaf nodes in total. A group of P1's sample feature data is input into the model and falls on the second leaf node of the first tree and the first leaf node of the second tree. The number of leaf nodes gives the dimension of the sample feature vector, and each leaf node corresponds to one component of the vector. If the sample feature data falls on a leaf node, the corresponding component is set to 1; otherwise it is set to 0. After encryption by the gradient descent tree encryption model, this group of sample feature data is therefore converted into the five-dimensional vector Z1 = [0, 1, 0, 1, 0]. Encrypting an enterprise's sample feature data with the encryption model requires no third party, and the other enterprises cannot work backward from the shared encrypted data to the original data, which ensures the security of the enterprise's internal data.
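The leaf-node encoding in this example can be sketched in a few lines. The two trees below are hand-written stand-ins for a trained model: their split features and thresholds are assumptions chosen only so that the sample lands on the second leaf of the first tree and the first leaf of the second tree, as in the text.

```python
def tree1_leaf(x):
    """Hypothetical first tree with 3 leaves; returns the index of the leaf hit."""
    hours, period, spend = x
    if hours < 2.0:
        return 0
    if spend < 500.0:
        return 1
    return 2


def tree2_leaf(x):
    """Hypothetical second tree with 2 leaves; returns the index of the leaf hit."""
    hours, period, spend = x
    return 0 if period < 18 else 1


# Each entry pairs a tree with its number of leaves (3 + 2 = 5 components).
TREES = [(tree1_leaf, 3), (tree2_leaf, 2)]


def encode(x, trees=TREES):
    """Map raw features to a 0/1 vector with one component per leaf node."""
    vector = []
    for leaf_of, n_leaves in trees:
        hit = leaf_of(x)
        vector.extend(1 if j == hit else 0 for j in range(n_leaves))
    return vector


# (time online in hours, hour of day online, amount spent): illustrative values.
z1 = encode((5.0, 9, 320.0))
print(z1)  # [0, 1, 0, 1, 0]
```

Every encoded vector contains exactly one 1 per tree, so its length equals the total number of leaves and its sum equals the number of trees.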
104. Jointly construct a prediction model according to the encrypted data of each enterprise and the corresponding class labels.
In the embodiment of the present invention, the encrypted data of each enterprise and the corresponding class labels, optionally together with an enterprise's own sample feature data, are combined into a prediction training set, and the prediction model is constructed from that training set. For example, the sample feature data X = [X1, X2] is held by enterprise P1 and enterprise P2 respectively: P1 holds the sample feature data X1 and P2 holds the sample feature data X2. X1 is encrypted by the encryption model constructed by P1 and converted into the sample feature vector Z1, and X2 is encrypted by the encryption model constructed by P2 and converted into the sample feature vector Z2. Z = [Z1, Z2] can then be used as the prediction training set. In addition, to further improve the accuracy of the prediction model, an enterprise is not limited to the training set Z = [Z1, Z2]. Enterprise P1 can also use Z = [X1, Z1, Z2] as its prediction training set and construct the prediction model from it, and enterprise P2 can likewise use Z = [X2, Z1, Z2].
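Assembling the training sets Z = [Z1, Z2] and Z = [X1, Z1, Z2] amounts to a column-wise concatenation of per-sample rows. The sketch below assumes the rows of every block are aligned on the same samples; the helper name `concat_columns` and all values are hypothetical.

```python
def concat_columns(*blocks):
    """Concatenate aligned per-sample rows from several blocks, column-wise."""
    return [[value for part in row for value in part] for row in zip(*blocks)]


X1 = [[5.0, 9.0, 320.0], [1.0, 22.0, 80.0]]   # P1's raw features, never shared
Z1 = [[0, 1, 0, 1, 0], [1, 0, 0, 0, 1]]       # P1's encrypted vectors
Z2 = [[1, 0, 0], [0, 0, 1]]                   # P2's encrypted vectors

Z_shared = concat_columns(Z1, Z2)       # training set any participant can build
Z_for_p1 = concat_columns(X1, Z1, Z2)   # richer training set only P1 can build
```

Only P1 can build `Z_for_p1`, because `X1` never leaves P1; P2 would build the analogous set from `X2`.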
Compared with the current approach, in which a third party must intervene to encrypt enterprise data before the enterprises model jointly on the encrypted data, the joint construction method for a prediction model provided by this embodiment of the present invention obtains the sample feature data of each enterprise and the class labels corresponding to the sample feature data; constructs an encryption model for each enterprise according to the sample feature data and the class labels; inputs the sample feature data of each enterprise into the corresponding encryption model for encryption to obtain the encrypted data of each enterprise; and jointly constructs a prediction model according to the encrypted data of each enterprise and the corresponding class labels. No third party is involved: each enterprise encrypts its internal data with its own encryption model, which prevents a third party from colluding with other enterprises to leak an enterprise's internal data and improves the security of that data. In addition, encrypting enterprise data with an encryption model is applicable not only to linear regression and logistic regression prediction models but also to other prediction models.
Further, to better illustrate the above process of encrypting an enterprise's internal data, and as a refinement and extension of the above embodiment, an embodiment of the present invention provides another joint construction method for a prediction model. As shown in Fig. 2, the method includes:
201. Obtain the sample feature data of each enterprise and the class labels corresponding to the sample feature data.
In the embodiment of the present invention, the sample feature data of each enterprise and the corresponding class labels are stored in advance in that enterprise's database. When the encryption model of an enterprise is constructed, the enterprise's sample feature data and the corresponding class labels are retrieved from the database.
202. Train the sample feature data and the class labels with a preset gradient descent tree algorithm to construct the gradient descent tree encryption model.
In the embodiment of the present invention, the encryption model is a gradient descent tree encryption model, and step 202 may specifically include: performing initial training on the sample feature data and the class labels with a preset decision tree algorithm to obtain a preliminary decision tree model; matching the class labels against the preliminary decision tree model to obtain the true probability value that the sample feature data, assigned to each leaf node of the preliminary decision tree model, belongs to the corresponding class; inputting the sample feature data into the preliminary decision tree model for class prediction to obtain the predicted probability value that the sample feature data, assigned to each leaf node of the preliminary decision tree model, belongs to the corresponding class; determining the residual gradient descent value of the preliminary training iteration from the difference between the true probability value and the predicted probability value; iteratively training the preliminary decision tree model according to the residual gradient descent value, the sample feature data and the class labels, and repeating the step of calculating the residual gradient descent value; and, when the calculated residual gradient descent value is the smallest residual gradient descent value, determining the decision tree model trained at the iteration corresponding to that smallest value as the gradient descent tree encryption model.
For example, enterprise P1 has 100 groups of sample feature data, each comprising time spent online, time of day spent online and amount spent on online shopping, and each group corresponds to a unique gender label. The gradient descent tree algorithm is trained on these 100 groups to construct the gradient descent tree encryption model. Specifically, an initial estimation function F_k(x) is given; it can also be set to F_k(x) = 0, k = 1, ..., K, where K is the number of classes (for gender prediction, K equals 2). The initial estimation function is used to estimate the sample feature data, giving the estimates F_1(x), ..., F_K(x). A logistic transformation is then applied to these estimates to obtain the probability p_k(x) that the sample feature data belongs to each class k:
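The formula for p_k(x) is not reproduced in this text. Assuming the conversion of estimates into probabilities is the standard multi-class logistic (softmax) transform used in gradient boosting, which is an assumption rather than something the text confirms, it would read:

```latex
p_k(x) = \frac{\exp\bigl(F_k(x)\bigr)}{\sum_{l=1}^{K} \exp\bigl(F_l(x)\bigr)},
\qquad k = 1, \dots, K
```

With the initial choice F_k(x) = 0 for every k, this gives p_k(x) = 1/K for every class, so for the two-class gender example every sample starts at probability 0.5.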
From the true probability values of the sample feature data and the probability values estimated by the initial estimation function, the log-likelihood loss function is obtained as:
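The loss formula is not reproduced in this text. A form consistent with the surrounding definitions, assuming the standard K-class log-likelihood loss in which y_k is the true probability (1 for the sample's own class, 0 otherwise) and p_k(x) is the estimated probability for class k, would be:

```latex
L\bigl(\{y_k, F_k(x)\}_{k=1}^{K}\bigr) = -\sum_{k=1}^{K} y_k \log p_k(x)
```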
Here y_k is the true probability value of the sample feature data: when a sample belongs to class k, y_k = 1; otherwise y_k = 0. Substituting the probability p_k(x) that the sample feature data belongs to each class k into the loss function and differentiating gives the gradient of the loss function:
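The gradient formula is not reproduced in this text. Assuming a log-likelihood loss of the form -Σ_k y_k log p_k(x) with softmax probabilities p_k(x), differentiating with respect to F_k and taking the negative gradient for sample i at iteration m would give the following, which agrees with the gradient error y_ik - p_{k,m-1} stated in the text:

```latex
\tilde{y}_{ik}
= -\left[\frac{\partial L}{\partial F_k(x_i)}\right]_{F = F_{m-1}}
= y_{ik} - p_{k,m-1}(x_i)
```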
From this, the gradient error of the i-th sample feature data for class k can be calculated as y_ik - p_{k,m-1}, where m-1 is the number of iterations, that is, the initial estimation function has gone through m-1 rounds of iteration. The gradient error is therefore the difference between the true probability that sample feature data i belongs to class k and the probability predicted after m-1 rounds of iteration. A decision tree model is then obtained from the sample feature data and the gradient errors, and from the generated decision tree model the residual fit value of each leaf node is calculated as:
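The leaf-value formula is not reproduced in this text, and the text does not say which update is used. One common choice, the single Newton-step leaf value from Friedman's K-class gradient boosting, is shown here purely as an assumption. R_jkm denotes the set of samples falling on leaf j of the tree fitted for class k at iteration m, and \tilde{y}_{ik} = y_ik - p_{k,m-1}(x_i) is the gradient error:

```latex
\gamma_{jkm}
= \frac{K-1}{K}\,
  \frac{\sum_{x_i \in R_{jkm}} \tilde{y}_{ik}}
       {\sum_{x_i \in R_{jkm}} \lvert \tilde{y}_{ik} \rvert \bigl(1 - \lvert \tilde{y}_{ik} \rvert\bigr)},
\qquad j = 1, \dots, J
```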
Here J is the number of leaf nodes of the decision tree model. Adding the residual fit value of each leaf node to the estimation function of the previous iteration gives the estimation function of the current iteration:
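The update formula is not reproduced in this text. Following the verbal description (previous estimation function plus the fit value of the leaf the sample falls on), and writing γ_jkm for the residual fit value of leaf j and R_jkm for the region covered by that leaf, both symbols being notation assumed here, it would read:

```latex
F_{km}(x) = F_{k,m-1}(x) + \sum_{j=1}^{J} \gamma_{jkm}\, \mathbf{1}\bigl(x \in R_{jkm}\bigr)
```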
Thus, at every iteration a decision tree is built from the gradient error currently corresponding to the sample feature data, so that the loss function moves in the direction of the negative gradient. After the preset number of iterations, when the gradient is at its minimum, the final estimation function determined at that point is the gradient descent tree encryption model.
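The iteration described above can be sketched in pure Python. This is a simplified illustration under stated assumptions, not the patent's implementation: each tree is a two-leaf stump, the probabilities use a softmax, and the leaf value uses Friedman's K-class formula, none of which the text confirms. All function names and the toy data are hypothetical.

```python
import math


def softmax(scores):
    """Convert per-class scores F_k(x) into probabilities p_k(x)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def leaf_value(residuals, n_classes):
    """Friedman-style leaf value (an assumed choice, see the lead-in)."""
    denom = sum(abs(r) * (1.0 - abs(r)) for r in residuals)
    if denom < 1e-12:
        return 0.0
    return (n_classes - 1) / n_classes * sum(residuals) / denom


def fit_stump(X, residuals, n_classes):
    """Fit a two-leaf regression tree to the residuals by least squares."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X})[:-1]:
            left = [r for row, r in zip(X, residuals) if row[f] <= t]
            right = [r for row, r in zip(X, residuals) if row[f] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, leaf_value(left, n_classes), leaf_value(right, n_classes))
    if best is None:  # every feature is constant: fall back to a single leaf
        value = leaf_value(residuals, n_classes)
        return (0, float("inf"), value, value)
    return best[1:]  # (feature, threshold, left value, right value)


def stump_predict(stump, row):
    feature, threshold, left_value, right_value = stump
    return left_value if row[feature] <= threshold else right_value


def train(X, y, n_classes, n_rounds=10):
    """Return a list of rounds; each round holds one stump per class."""
    F = [[0.0] * n_classes for _ in X]  # initial estimate F_k(x) = 0
    model = []
    for _ in range(n_rounds):
        P = [softmax(scores) for scores in F]
        stumps = []
        for k in range(n_classes):
            # Gradient error: true probability minus predicted probability.
            residuals = [(1.0 if yi == k else 0.0) - p[k] for yi, p in zip(y, P)]
            stumps.append(fit_stump(X, residuals, n_classes))
        for i, row in enumerate(X):
            for k in range(n_classes):
                F[i][k] += stump_predict(stumps[k], row)
        model.append(stumps)
    return model


def predict(model, row, n_classes):
    scores = [sum(stump_predict(rnd[k], row) for rnd in model) for k in range(n_classes)]
    return max(range(n_classes), key=lambda k: scores[k])


# Toy data: (time online in hours, amount spent); labels 0 and 1 are illustrative.
X = [[1.0, 200.0], [2.0, 150.0], [6.0, 900.0], [7.0, 800.0]]
y = [0, 0, 1, 1]
model = train(X, y, n_classes=2)
```

In practice a library implementation of gradient boosted trees would replace this hand-written loop; the sketch only makes the residual and update steps explicit.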
203. Input the sample feature data of each enterprise into the gradient descent tree encryption model for encryption to obtain the sample feature vectors corresponding to the sample feature data, and determine the sample feature vectors as the encrypted data of each enterprise.
In the embodiment of the present invention, an enterprise's internal sample feature data is input into the enterprise's encryption model for encryption, which converts the sample feature data into a sample feature vector composed of 0/1 elements. This 0/1 sample feature vector serves as the enterprise's encrypted data and can be shared with the other enterprises. Specifically, step 203 further includes: inputting the sample feature data of each enterprise into the gradient descent tree encryption model for matching, to determine whether the sample feature data matches each leaf node of the gradient descent tree encryption model; determining each feature matching value of the sample feature data according to the matching result; determining the dimension of the sample feature vector according to the number of leaf nodes of the gradient descent tree encryption model; and determining the sample feature vector corresponding to the sample feature data according to the feature matching values and the dimension of the sample feature vector. Further, determining each feature matching value of the sample feature data according to the matching result includes: if the sample feature data matches a leaf node of the gradient descent tree encryption model, setting the corresponding feature matching value to 1; if the sample feature data does not match that leaf node, setting the corresponding feature matching value to 0. The sample feature data is thereby converted into a sample feature vector. This encryption scheme requires no third party, and the other enterprises cannot work backward from the shared encrypted data to the original data, which ensures the security of the enterprise's internal data.
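One way to see why a leaf-membership vector does not determine the original values is that the mapping is many-to-one: every input that reaches the same leaves yields the same vector. The sketch below illustrates only that property, with two hypothetical hand-written trees; it is not a security proof, and the thresholds and values are assumptions.

```python
def leaf_hits(x):
    """Return (leaf index, leaf count) for two hypothetical trees."""
    hours, spend = x
    first = 0 if hours < 2.0 else (1 if spend < 500.0 else 2)  # tree 1: 3 leaves
    second = 0 if spend < 100.0 else 1                         # tree 2: 2 leaves
    return [(first, 3), (second, 2)]


def encode(x):
    """Set the component of each matched leaf to 1 and all others to 0."""
    vector = []
    for hit, n_leaves in leaf_hits(x):
        vector.extend(1 if j == hit else 0 for j in range(n_leaves))
    return vector


# Two clearly different raw records (hours online, amount spent) ...
a = encode((5.0, 320.0))
b = encode((9.5, 140.0))
# ... produce the same shared vector, so the raw values are not recoverable
# from the vector alone in this toy setting.
print(a, b)  # [0, 1, 0, 0, 1] [0, 1, 0, 0, 1]
```

How much an observer can still infer depends on the number and depth of the trees and on what else the observer knows, which this sketch does not address.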
204. Train the encrypted data of each enterprise and the corresponding class labels with a preset logistic regression algorithm to construct the logistic regression prediction model.
In the embodiment of the present invention, the prediction model is a logistic regression prediction model, and step 204 specifically further includes: training the encrypted data of each enterprise and the corresponding class labels with a maximum likelihood estimation algorithm to obtain a maximum likelihood estimation prediction model; and performing a convergence calculation on the maximum likelihood estimation prediction model with a gradient descent algorithm to obtain the logistic regression prediction model. For example, suppose the enterprises jointly construct a gender prediction model. The 100 groups of encrypted data Z1 of enterprise P1 and the 100 groups of encrypted data Z2 of enterprise P2 are obtained, each group of encrypted data corresponds to a unique gender label, and Z = [Z1, Z2] is used as the prediction training set from which the logistic regression prediction model is constructed. First, the prediction function is constructed as follows:
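The prediction function itself is not reproduced in this text. Assuming the standard logistic regression hypothesis, where θ is the parameter vector and x is the input feature vector (here a row of the encrypted training set), it would read:

```latex
h_\theta(x) = \frac{1}{1 + e^{-\theta^{\mathsf{T}} x}}
```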
Here the prediction function h_θ(x) denotes the probability that the prediction result is 1. For the input feature data to be predicted, the probabilities that it is classified as class 1 and class 0 are, respectively:
P(y = 1 | x; θ) = h_θ(x)
P(y = 0 | x; θ) = 1 - h_θ(x)
Here y = 1 means the classification result is male and y = 0 means the classification result is female. Then, based on the prediction function, the loss function is constructed with the maximum likelihood algorithm as follows:
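The loss function is not reproduced in this text. Assuming the standard negative log-likelihood for logistic regression, with x^(i) and y^(i) the i-th sample and its label, m the number of samples, and h_θ(x) the probability that the prediction result is 1, it would read:

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
\Bigl[\, y^{(i)} \log h_\theta\bigl(x^{(i)}\bigr)
      + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - h_\theta\bigl(x^{(i)}\bigr)\bigr) \Bigr]
```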
Here i indexes the i-th sample and m is the number of samples. The gradient descent algorithm is used to solve for the parameter θ that minimizes the maximum likelihood loss function; this θ is the optimal parameter, and the final prediction function determined by it is the logistic regression prediction model. Because the encrypted data of different enterprises is combined as the prediction training set when the logistic regression prediction model is constructed, the accuracy of the prediction model can be further improved.
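Fitting θ by gradient descent on the combined 0/1 vectors can be sketched as follows. This is a minimal illustration under assumptions: batch gradient descent with a fixed learning rate, an intercept term, and four made-up training rows; the function names, hyperparameters and data are hypothetical.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def predict_proba(theta, row):
    """h_theta(x): probability that the label is 1; theta[0] is the intercept."""
    return sigmoid(theta[0] + sum(t * v for t, v in zip(theta[1:], row)))


def train_logistic(Z, y, learning_rate=0.5, n_iters=2000):
    """Minimize the average negative log-likelihood by batch gradient descent."""
    m, n = len(Z), len(Z[0])
    theta = [0.0] * (n + 1)
    for _ in range(n_iters):
        grad = [0.0] * (n + 1)
        for row, label in zip(Z, y):
            error = predict_proba(theta, row) - label
            grad[0] += error
            for j, value in enumerate(row):
                grad[j + 1] += error * value
        theta = [t - learning_rate * g / m for t, g in zip(theta, grad)]
    return theta


# Rows are concatenated encrypted vectors [Z1, Z2]; values are illustrative.
Z = [
    [0, 1, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 1],
    [0, 0, 1, 0, 1, 0, 0, 1],
]
y = [1, 1, 0, 0]  # 1 = male, 0 = female, following the convention in the text

theta = train_logistic(Z, y)
predictions = [1 if predict_proba(theta, row) > 0.5 else 0 for row in Z]
```

Because the inputs are 0/1 leaf indicators, each coefficient in `theta` acts as a learned weight for one leaf of one enterprise's trees.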
Compared with the current approach, in which a third party must intervene to encrypt enterprise data before the enterprises model jointly on the encrypted data, this other joint construction method for a prediction model provided by an embodiment of the present invention obtains the sample feature data of each enterprise and the class labels corresponding to the sample feature data; constructs an encryption model for each enterprise according to the sample feature data and the class labels; inputs the sample feature data of each enterprise into the corresponding encryption model for encryption to obtain the encrypted data of each enterprise; and jointly constructs a prediction model according to the encrypted data of each enterprise and the corresponding class labels. No third party is involved: each enterprise encrypts its internal data with its own encryption model, which prevents a third party from colluding with other enterprises to leak an enterprise's internal data and improves the security of that data. In addition, encrypting enterprise data with an encryption model is applicable not only to linear regression and logistic regression prediction models but also to other prediction models.
Further, as a specific implementation of the method of Fig. 1, an embodiment of the present invention provides a joint construction device for a prediction model. As shown in Fig. 3, the device includes an acquiring unit 31, a first construction unit 32, an encryption unit 33 and a second construction unit 34.
The acquiring unit 31 may be used to obtain the sample feature data of each enterprise and the class labels corresponding to the sample feature data. The acquiring unit 31 is the main functional module of the device for obtaining these data and labels.
The first construction unit 32 may be used to construct the encryption model of each enterprise according to the sample feature data and the class labels. The first construction unit 32 is the main functional module, and also a core module, of the device for constructing the encryption models.
The encryption unit 33 may be used to input the sample feature data of each enterprise into the corresponding encryption model for encryption, to obtain the encrypted data of each enterprise. The encryption unit 33 is the main functional module, and also a core module, of the device for performing this encryption.
The second construction unit 34 may be used to jointly construct a prediction model according to the encrypted data of each enterprise and its corresponding class labels. The second construction unit 34 is the main functional module of the device for jointly constructing the prediction model.
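The division of labour among units 31 to 34 can be pictured with the following Python skeleton. It is only a structural sketch under stated assumptions: the class name, the callable signatures and the toy stand-in functions are hypothetical, and pooling the encrypted rows of all enterprises into one list is one possible reading of "jointly construct", not something this text specifies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = List[float]
Dataset = Tuple[List[Row], List[int]]   # (sample feature data, class labels)
Encoder = Callable[[Row], List[int]]    # maps one sample to its encrypted vector

@dataclass
class JointConstructionDevice:
    acquire: Callable[[], Dict[str, Dataset]]                               # acquiring unit 31
    build_encryption_model: Callable[[List[Row], List[int]], Encoder]       # first construction unit 32
    build_prediction_model: Callable[[List[List[int]], List[int]], object]  # second construction unit 34

    def encrypt(self, encoder: Encoder, rows: List[Row]) -> List[List[int]]:
        # Encryption unit 33: pass each sample through the enterprise's own model.
        return [encoder(row) for row in rows]

    def run(self):
        encrypted, labels = [], []
        for _enterprise, (rows, y) in self.acquire().items():
            encoder = self.build_encryption_model(rows, y)
            encrypted.extend(self.encrypt(encoder, rows))
            labels.extend(y)
        # Assumption: encrypted rows from all enterprises are pooled here.
        return self.build_prediction_model(encrypted, labels)

# Hypothetical stand-ins so that the skeleton can be run end to end.
def toy_acquire():
    return {"enterprise_a": ([[20.0], [50.0]], [0, 1]),
            "enterprise_b": ([[30.0], [70.0]], [0, 1])}

def toy_build_encryption_model(rows, labels):
    # Stand-in for a trained tree: one split at the mean of the first feature.
    threshold = sum(r[0] for r in rows) / len(rows)
    return lambda row: [1, 0] if row[0] <= threshold else [0, 1]

def toy_build_prediction_model(vectors, labels):
    # Stand-in for a real learner: simply returns the pooled training set.
    return list(zip(map(tuple, vectors), labels))

device = JointConstructionDevice(toy_acquire, toy_build_encryption_model,
                                 toy_build_prediction_model)
training_set = device.run()
```

Only 0/1 vectors and class labels reach `build_prediction_model` in this sketch; the raw feature rows stay inside the per-enterprise loop.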
In this embodiment of the present invention, the encryption model is a gradient boosting decision tree (GBDT) encryption model, and the first construction unit 32 may specifically be used to train on the sample feature data and the class labels using a preset GBDT algorithm, so as to construct the GBDT encryption model.
In addition, the first construction unit 32 further includes an initial training module 321, a matching module 322, a prediction module 323, a determining module 324 and an iterative training module 325.
The initial training module 321 may be used to perform initial training on the sample feature data and the class labels using a preset decision tree algorithm, to obtain a preliminary decision tree model.
The matching module 322 may be used to match the class labels against the preliminary decision tree model, to obtain the true probability value that the sample feature data falling in each leaf node of the preliminary decision tree model belongs to the corresponding class.
The prediction module 323 may be used to input the sample feature data into the preliminary decision tree model for class prediction, to obtain the predicted probability value that the sample feature data falling in each leaf node of the preliminary decision tree model belongs to the corresponding class.
The determining module 324 may be used to determine the residual gradient descent value of the preliminary iterative training according to the difference between the true probability value and the predicted probability value.
The iterative training module 325 may be used to iteratively train the preliminary decision tree model according to the residual gradient descent value, the sample feature data and the class labels, and to repeat the step of calculating the residual gradient descent value.
The determining module 324 may also be used to, when the calculated residual gradient descent value is the smallest residual gradient descent value, determine the decision tree model trained at the iteration level corresponding to that smallest value as the GBDT encryption model.
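The iterative procedure carried out by modules 321 to 325 can be sketched in Python as a simple residual-fitting boosting loop. This is a minimal illustration under assumptions, not the patented procedure: two-leaf regression stumps stand in for the decision trees, the residual is taken as the class label minus the current predicted probability, and training stops after a fixed number of rounds or when the residual falls below a tolerance. All names, hyperparameters and data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Fit a depth-1 regression tree (two leaves) to the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:
        raise ValueError("no feature can split the samples")
    return best[1:]  # (feature index, threshold, left leaf value, right leaf value)

def stump_predict(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm

def train_boosted_trees(X, y, max_rounds=20, lr=0.5, tol=1e-3):
    scores = [0.0] * len(X)
    trees, history = [], []
    for _ in range(max_rounds):
        # Residual: true label minus currently predicted probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        mean_abs = sum(abs(r) for r in residuals) / len(residuals)
        if mean_abs < tol:
            break
        history.append(mean_abs)
        stump = fit_stump(X, residuals)
        trees.append(stump)
        scores = [s + lr * stump_predict(stump, row) for s, row in zip(scores, X)]
    return trees, history

def predict_proba(trees, row, lr=0.5):
    return sigmoid(sum(lr * stump_predict(s, row) for s in trees))

# Toy data: [age, flag] rows with a binary class label.
X = [[25, 1], [30, 0], [35, 1], [45, 1], [50, 0], [60, 0]]
y = [0, 0, 0, 1, 1, 1]
trees, history = train_boosted_trees(X, y)
```

The `history` list records the mean absolute residual before each round, so it shows whether the residual is still shrinking from one iteration to the next.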
In this embodiment of the present invention, the encryption unit 33 includes an encryption module 331 and a determining module 332.
The encryption module 331 may be used to input the sample feature data of each enterprise into the gradient boosting decision tree (GBDT) encryption model for encryption, to obtain the sample feature vectors corresponding to the sample feature data.
The determining module 332 may be used to determine the sample feature vectors as the encrypted data of each enterprise.
In addition, regarding the specific process of converting the sample feature data into sample feature vectors, the encryption module 331 further includes a matching submodule 3311 and a determining submodule 3312.
The matching submodule 3311 may be used to input the sample feature data of each enterprise into the GBDT encryption model for matching, to determine whether the sample feature data matches each leaf node of the GBDT encryption model.
The determining submodule 3312 may be used to determine each feature matching value of the sample feature data according to the matching result.
The determining submodule 3312 may also be used to determine the dimension of the sample feature vectors according to the number of leaf nodes of the GBDT encryption model.
The determining submodule 3312 may also be used to determine the sample feature vectors corresponding to the sample feature data according to each feature matching value of the sample feature data and the dimension of the sample feature vectors.
In addition, regarding the determination of each feature matching value of the sample feature data, the determining submodule 3312 may specifically be used to set the feature matching value of the sample feature data to 1 if the sample feature data matches a leaf node of the GBDT encryption model, and to set it to 0 if the sample feature data does not match that leaf node.
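The leaf-matching step described above can be sketched in Python as follows. The sketch assumes that the trained tree model is available as a list of trees, each represented as nested dictionaries; that representation, the function names and the example thresholds are hypothetical. The vector has one position per leaf node, set to 1 for the leaf a sample falls into in each tree and 0 elsewhere.

```python
def count_leaves(node):
    # A leaf is any node carrying the key "leaf".
    if "leaf" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

def leaf_index(node, row):
    # Position (left to right, 0-based) of the leaf that `row` falls into.
    if "leaf" in node:
        return 0
    if row[node["feature"]] <= node["threshold"]:
        return leaf_index(node["left"], row)
    return count_leaves(node["left"]) + leaf_index(node["right"], row)

def encode(trees, row):
    # Dimension of the result = total number of leaf nodes over all trees.
    vector = []
    for tree in trees:
        one_hot = [0] * count_leaves(tree)   # feature matching value 0: no match
        one_hot[leaf_index(tree, row)] = 1   # feature matching value 1: matched leaf
        vector.extend(one_hot)
    return vector

LEAF = {"leaf": True}
# Hypothetical model: feature 0 is age, feature 1 is income.
trees = [
    {"feature": 0, "threshold": 35, "left": LEAF,
     "right": {"feature": 1, "threshold": 5000, "left": LEAF, "right": LEAF}},
    {"feature": 1, "threshold": 8000, "left": LEAF, "right": LEAF},
]

vector_a = encode(trees, [40, 6000])
vector_b = encode(trees, [30, 9000])
```

Here the first tree has three leaves and the second has two, so every sample is mapped to a five-dimensional 0/1 vector that contains neither the raw age nor the raw income.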
In this embodiment of the present invention, the second construction unit 34 may specifically be used to combine the encrypted data of each enterprise and its corresponding class labels, together with the sample feature data of the enterprise, into a prediction training set, and to construct the prediction model according to the prediction training set.
In addition, where the prediction model is a logistic regression prediction model, the second construction unit 34 may also specifically be used to train on the encrypted data of each enterprise and its corresponding class labels using a preset logistic regression algorithm, so as to construct the logistic regression prediction model.
Further, regarding the specific process of constructing the logistic regression prediction model, the second construction unit 34 further includes a training module 341 and a computing module 342.
The training module 341 may be used to train on the encrypted data of each enterprise and its corresponding class labels using a maximum likelihood estimation algorithm, to obtain a maximum-likelihood-estimation prediction model.
The computing module 342 may be used to perform a convergence calculation on the maximum-likelihood-estimation prediction model using the gradient descent algorithm, to obtain the logistic regression prediction model.
It should be noted that, for other descriptions of the functional modules of the joint construction device for a prediction model provided in this embodiment of the present invention, reference may be made to the corresponding description of the method shown in Fig. 1; details are not repeated here.
Based on the method shown in Fig. 1, an embodiment of the present invention correspondingly further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program performs the following steps: obtaining the sample feature data of each enterprise and the class labels corresponding to the sample feature data; constructing the encryption model of each enterprise according to the sample feature data and the class labels; inputting the sample feature data of each enterprise into the corresponding encryption model for encryption, to obtain the encrypted data of each enterprise; and jointly constructing a prediction model according to the encrypted data of each enterprise and its corresponding class labels.
Based on the embodiments of the method shown in Fig. 1 and the device shown in Fig. 3, an embodiment of the present invention further provides a physical structure diagram of a computer device. As shown in Fig. 5, the computer device includes a processor 41, a memory 42, and a computer program stored in the memory 42 and executable on the processor, where the memory 42 and the processor 41 are both arranged on a bus 43. When executing the program, the processor 41 performs the following steps: obtaining the sample feature data of each enterprise and the class labels corresponding to the sample feature data; constructing the encryption model of each enterprise according to the sample feature data and the class labels; inputting the sample feature data of each enterprise into the corresponding encryption model for encryption, to obtain the encrypted data of each enterprise; and jointly constructing a prediction model according to the encrypted data of each enterprise and its corresponding class labels.
According to the technical solution of the present invention, the sample feature data of each enterprise and the class labels corresponding to the sample feature data can be obtained; the encryption model of each enterprise is constructed according to the sample feature data and the class labels; the sample feature data of each enterprise are input into the corresponding encryption model for encryption, to obtain the encrypted data of each enterprise; and a prediction model is jointly constructed according to the encrypted data of each enterprise and its corresponding class labels. No third-party intervention is therefore needed: an enterprise can encrypt its internal data with the encryption model, which prevents a third party from colluding with other enterprises to leak the enterprise's internal data and improves the security of that data. In addition, encrypting enterprise data with an encryption model is applicable not only to linear regression and logistic regression prediction models but also to other prediction models.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented by a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described can be performed in an order different from the one given here. Alternatively, they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the present invention; for those skilled in the art, the present invention may be modified and varied in various ways. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A joint construction method of a prediction model, characterized by comprising:
obtaining the sample feature data of each enterprise and the class labels corresponding to the sample feature data;
constructing the encryption model of each enterprise according to the sample feature data and the class labels;
inputting the sample feature data of each enterprise into the corresponding encryption model for encryption, to obtain the encrypted data of each enterprise;
jointly constructing a prediction model according to the encrypted data of each enterprise and its corresponding class labels.
2. The method according to claim 1, characterized in that the encryption model is a gradient boosting decision tree (GBDT) encryption model, and said constructing the encryption model of each enterprise according to the sample feature data and the class labels comprises:
training on the sample feature data and the class labels using a preset GBDT algorithm, to construct the GBDT encryption model;
said inputting the sample feature data of each enterprise into the corresponding encryption model for encryption, to obtain the encrypted data of each enterprise, comprises:
inputting the sample feature data of each enterprise into the GBDT encryption model for encryption, to obtain the sample feature vectors corresponding to the sample feature data;
determining the sample feature vectors as the encrypted data of each enterprise.
3. The method according to claim 2, characterized in that said training on the sample feature data and the class labels using a preset gradient boosting decision tree (GBDT) algorithm, to construct the GBDT encryption model, comprises:
performing initial training on the sample feature data and the class labels using a preset decision tree algorithm, to obtain a preliminary decision tree model;
matching the class labels against the preliminary decision tree model, to obtain the true probability value that the sample feature data falling in each leaf node of the preliminary decision tree model belongs to the corresponding class;
inputting the sample feature data into the preliminary decision tree model for class prediction, to obtain the predicted probability value that the sample feature data falling in each leaf node of the preliminary decision tree model belongs to the corresponding class;
determining the residual gradient descent value of the preliminary iterative training according to the difference between the true probability value and the predicted probability value;
iteratively training the preliminary decision tree model according to the residual gradient descent value, the sample feature data and the class labels, and repeating the step of calculating the residual gradient descent value;
when the calculated residual gradient descent value is the smallest residual gradient descent value, determining the decision tree model trained at the iteration level corresponding to the smallest residual gradient descent value as the GBDT encryption model.
4. The method according to claim 2, characterized in that said inputting the sample feature data of each enterprise into the gradient boosting decision tree (GBDT) encryption model for encryption, to obtain the sample feature vectors corresponding to the sample feature data, comprises:
inputting the sample feature data of each enterprise into the GBDT encryption model for matching, to determine whether the sample feature data matches each leaf node of the GBDT encryption model;
determining each feature matching value of the sample feature data according to the matching result;
determining the dimension of the sample feature vectors according to the number of leaf nodes of the GBDT encryption model;
determining the sample feature vectors corresponding to the sample feature data according to each feature matching value of the sample feature data and the dimension of the sample feature vectors.
5. The method according to claim 4, characterized in that said determining each feature matching value of the sample feature data according to the matching result comprises:
if the sample feature data matches a leaf node of the gradient boosting decision tree (GBDT) encryption model, determining the feature matching value of the sample feature data as 1;
if the sample feature data does not match the leaf node of the GBDT encryption model, determining the feature matching value of the sample feature data as 0.
6. The method according to claim 1, characterized in that said jointly constructing a prediction model according to the encrypted data of each enterprise and its corresponding class labels comprises:
combining the encrypted data of each enterprise and its corresponding class labels, together with the sample feature data of the enterprise, into a prediction training set, and constructing the prediction model according to the prediction training set.
7. The method according to any one of claims 1 to 6, characterized in that the prediction model is a logistic regression prediction model, and said jointly constructing a prediction model according to the encrypted data of each enterprise and its corresponding class labels comprises:
training on the encrypted data of each enterprise and its corresponding class labels using a preset logistic regression algorithm, to construct the logistic regression prediction model.
8. A joint construction device for a prediction model, characterized by comprising:
an acquiring unit, for obtaining the sample feature data of each enterprise and the class labels corresponding to the sample feature data;
a first construction unit, for constructing the encryption model of each enterprise according to the sample feature data and the class labels;
an encryption unit, for inputting the sample feature data of each enterprise into the corresponding encryption model for encryption, to obtain the encrypted data of each enterprise;
a second construction unit, for jointly constructing a prediction model according to the encrypted data of each enterprise and its corresponding class labels.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
10. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the computer program, when executed by the processor, implements the steps of the method according to any one of claims 1 to 7.
CN201910319424.7A 2019-04-19 2019-04-19 Combined construction method and device of prediction model, storage medium and computer equipment Active CN110210233B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910319424.7A CN110210233B (en) 2019-04-19 2019-04-19 Combined construction method and device of prediction model, storage medium and computer equipment
PCT/CN2019/102911 WO2020211240A1 (en) 2019-04-19 2019-08-27 Joint construction method and apparatus for prediction model, and computer device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910319424.7A CN110210233B (en) 2019-04-19 2019-04-19 Combined construction method and device of prediction model, storage medium and computer equipment

Publications (2)

Publication Number Publication Date
CN110210233A true CN110210233A (en) 2019-09-06
CN110210233B CN110210233B (en) 2024-05-24

Family

ID=67786051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910319424.7A Active CN110210233B (en) 2019-04-19 2019-04-19 Combined construction method and device of prediction model, storage medium and computer equipment

Country Status (2)

Country Link
CN (1) CN110210233B (en)
WO (1) WO2020211240A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728375A (en) * 2019-10-16 2020-01-24 支付宝(杭州)信息技术有限公司 Method and device for training logistic regression model by combining multiple computing units
CN111428887A (en) * 2020-03-19 2020-07-17 腾讯云计算(北京)有限责任公司 Model training control method, device and system based on multiple computing nodes
CN111738441A (en) * 2020-07-31 2020-10-02 支付宝(杭州)信息技术有限公司 Prediction model training method and device considering prediction precision and privacy protection
CN112199706A (en) * 2020-10-26 2021-01-08 支付宝(杭州)信息技术有限公司 Tree model training method and business prediction method based on multi-party safety calculation
CN112668016A (en) * 2020-01-02 2021-04-16 华控清交信息科技(北京)有限公司 Model training method and device and electronic equipment
CN112816898A (en) * 2021-01-26 2021-05-18 三一重工股份有限公司 Battery failure prediction method and device, electronic equipment and storage medium
WO2022088606A1 (en) * 2020-10-29 2022-05-05 平安科技(深圳)有限公司 Gbdt and lr fusion method and apparatus based on federated learning, device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101727462A (en) * 2008-10-17 2010-06-09 北京大学 Method and device for generating Chinese comparative sentence sorter model and identifying Chinese comparative sentences
US20150032680A1 (en) * 2013-07-25 2015-01-29 International Business Machines Corporation Parallel tree based prediction
CN108615044A (en) * 2016-12-12 2018-10-02 腾讯科技(深圳)有限公司 A kind of method of disaggregated model training, the method and device of data classification
US20180349740A1 (en) * 2016-02-04 2018-12-06 Abb Schweiz Ag Machine learning based on homomorphic encryption
CN109002861A (en) * 2018-08-10 2018-12-14 深圳前海微众银行股份有限公司 Federal modeling method, equipment and storage medium
CN109299728A (en) * 2018-08-10 2019-02-01 深圳前海微众银行股份有限公司 Federal learning method, system and readable storage medium storing program for executing

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109308418B (en) * 2017-07-28 2021-09-24 创新先进技术有限公司 Model training method and device based on shared data
CN108520181B (en) * 2018-03-26 2022-04-22 联想(北京)有限公司 Data model training method and device
CN109033854B (en) * 2018-07-17 2020-06-09 阿里巴巴集团控股有限公司 Model-based prediction method and device
CN109635462A (en) * 2018-12-17 2019-04-16 深圳前海微众银行股份有限公司 Model parameter training method, device, equipment and medium based on federation's study

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MOHAMED WALEED FAKHR: "Multiple Encrypted Random Forests using Compressed Sensing for Private Classification", 2018 INTERNATIONAL CONFERENCE ON INNOVATION AND INTELLIGENCE FOR INFORMATICS, COMPUTING, AND TECHNOLOGIES (3ICT), 20 November 2018 (2018-11-20), pages 1 - 7, XP033622822, DOI: 10.1109/3ICT.2018.8855770 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728375A (en) * 2019-10-16 2020-01-24 支付宝(杭州)信息技术有限公司 Method and device for training logistic regression model by combining multiple computing units
WO2021073234A1 (en) * 2019-10-16 2021-04-22 支付宝(杭州)信息技术有限公司 Method and device for jointly training logistic regression model by multiple computing units
CN110728375B (en) * 2019-10-16 2021-03-19 支付宝(杭州)信息技术有限公司 Method and device for training logistic regression model by combining multiple computing units
CN112668016A (en) * 2020-01-02 2021-04-16 华控清交信息科技(北京)有限公司 Model training method and device and electronic equipment
CN112668016B (en) * 2020-01-02 2023-12-08 华控清交信息科技(北京)有限公司 Model training method and device and electronic equipment
CN111428887A (en) * 2020-03-19 2020-07-17 腾讯云计算(北京)有限责任公司 Model training control method, device and system based on multiple computing nodes
CN111738441B (en) * 2020-07-31 2020-11-17 支付宝(杭州)信息技术有限公司 Prediction model training method and device considering prediction precision and privacy protection
CN111738441A (en) * 2020-07-31 2020-10-02 支付宝(杭州)信息技术有限公司 Prediction model training method and device considering prediction precision and privacy protection
CN112199706A (en) * 2020-10-26 2021-01-08 支付宝(杭州)信息技术有限公司 Tree model training method and business prediction method based on multi-party safety calculation
CN112199706B (en) * 2020-10-26 2022-11-22 支付宝(杭州)信息技术有限公司 Tree model training method and business prediction method based on multi-party safety calculation
WO2022088606A1 (en) * 2020-10-29 2022-05-05 平安科技(深圳)有限公司 Gbdt and lr fusion method and apparatus based on federated learning, device, and storage medium
CN112816898A (en) * 2021-01-26 2021-05-18 三一重工股份有限公司 Battery failure prediction method and device, electronic equipment and storage medium
CN112816898B (en) * 2021-01-26 2022-03-01 三一重工股份有限公司 Battery failure prediction method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110210233B (en) 2024-05-24
WO2020211240A1 (en) 2020-10-22

Similar Documents

Publication Publication Date Title
CN110210233A (en) Joint mapping method, apparatus, storage medium and the computer equipment of prediction model
Cheng et al. Secureboost: A lossless federated learning framework
US20230078061A1 (en) Model training method and apparatus for federated learning, device, and storage medium
Zhu et al. Federated learning on non-IID data: A survey
CN110490738A (en) A kind of federal learning method of mixing and framework
CN112085159B (en) User tag data prediction system, method and device and electronic equipment
CN109767301A (en) Recommended method and system, computer installation, computer readable storage medium
CN113505882B (en) Data processing method based on federal neural network model, related equipment and medium
Chen et al. Secure social recommendation based on secret sharing
Liu et al. Keep your data locally: Federated-learning-based data privacy preservation in edge computing
CN114595396B (en) Federal learning-based sequence recommendation method and system
Ramadass et al. Evaluation of cloud vendors from probabilistic linguistic information with unknown/partial weight values
Rolf et al. Representation matters: Assessing the importance of subgroup allocations in training data
CN113609398A (en) Social recommendation method based on heterogeneous graph neural network
CN112084520B (en) Method and device for protecting business prediction model of data privacy through joint training of two parties
CN113065046A (en) Product defect detection equipment and method
CN114168988B (en) Federal learning model aggregation method and electronic device
Hussain et al. Federated learning: A survey of a new approach to machine learning
KR101522306B1 (en) A system and control method for a meta-heuristic algorithm utilizing similarity for performance enhancement
Jia et al. A multi-criterion group decision-making method based on regret theory under 2-tuple linguistic environment
CN117521102A (en) Model training method and device based on federal learning
CN112101609B (en) Prediction system, method and device for user repayment timeliness and electronic equipment
Liu et al. A cyber physical system crowdsourcing inference method based on tempering: an advancement in artificial intelligence algorithms
WO2023029324A1 (en) Marketing arbitrage underground industry identification method based on dynamic attention graph network
CN114463590A (en) Information processing method, apparatus, device, storage medium, and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant