CN112634027A - Self-adaptive federal parameter aggregation method for credit assessment of small and micro enterprises - Google Patents

Self-adaptive federal parameter aggregation method for credit assessment of small and micro enterprises

Info

Publication number
CN112634027A
CN112634027A (application CN202011600934.0A)
Authority
CN
China
Prior art keywords
data
parameter
similarity
vector
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011600934.0A
Other languages
Chinese (zh)
Inventor
詹士潇
张帅
黄方蕾
汪小益
吴琛
胡麦芳
张珂杰
匡立中
谢杨洁
邱炜伟
蔡亮
李伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Qulian Technology Co Ltd
Original Assignee
Hangzhou Qulian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Qulian Technology Co Ltd filed Critical Hangzhou Qulian Technology Co Ltd
Priority to CN202011600934.0A priority Critical patent/CN112634027A/en
Publication of CN112634027A publication Critical patent/CN112634027A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03Credit; Loans; Processing thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Development Economics (AREA)
  • Technology Law (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises, which comprises the following steps: S1, the participants construct a network; S2, each participant trains locally to obtain its accuracy; S3, similarity detection is performed on the participants using a smart contract; S4, a training matrix is obtained and participants are selected for parameter aggregation; S5, the aggregated parameters are returned to the participants, which retrain to obtain a new accuracy; S6, the two accuracies are compared, and if the accuracy has improved, training ends; otherwise the weights in the distance matrix and the aggregated parameters are updated; S7, parameter aggregation is performed again according to the training matrix; S8, the process repeats until the results of all participants have improved. The method automatically selects the participants for parameter aggregation, so that aggregation over all participants is unnecessary; it adaptively updates the weights in the distance matrix according to the accuracy before and after training; and it requires no third party, is decentralized, and preserves the privacy of the data.

Description

Self-adaptive federal parameter aggregation method for credit assessment of small and micro enterprises
Technical Field
The invention relates to the field of credit assessment of small and micro enterprises, and in particular to a self-adaptive federated parameter aggregation method for such credit assessment.
Background
A common approach to the problems of data islands and data privacy is federated learning. Federated learning was originally proposed by Google; its main idea is to build machine learning models from data sets distributed across multiple devices while preventing data leakage. Recent improvements have focused on overcoming statistical challenges and on improving the security of federated learning. WeBank was the first institution in China to propose federated learning as a general solution to the problems of data islands and data privacy protection in the practical deployment of artificial intelligence.
Federated learning is an encrypted, distributed machine learning technique that allows users to train machine learning models on multiple data sets distributed over different locations while preventing data leakage and complying with strict data privacy regulations. It is a learning process in which the data owners jointly train a model, with the trained parameters returned to the participants; in this process, no data owner exposes its data to the others.
Depending on how the data sets differ, federated learning is divided into horizontal federated learning, vertical federated learning, and federated transfer learning. When two data sets overlap heavily in user features but little in users, the data are split horizontally (along the user dimension), and the portions with the same features but different users are extracted for training; this is horizontal federated learning. When two data sets overlap heavily in users but little in user features, the data are split vertically (along the feature dimension), and the portions with the same users but different features are extracted for training; this is vertical federated learning. When two data sets overlap little in both users and features, the data are not split; instead, transfer learning is used to compensate for insufficient data or labels; this is federated transfer learning.
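The three settings can be illustrated with a toy sketch of how the effective training set is assembled in the first two cases (the party names and values here are hypothetical, not taken from the patent):

```python
# Horizontal FL: parties share the feature space but hold different users,
# so the effective training set is the union of their rows.
bank_a = {"u1": {"deposit": 10}, "u2": {"deposit": 20}}
bank_b = {"u3": {"deposit": 5},  "u4": {"deposit": 15}}
horizontal_rows = {**bank_a, **bank_b}  # more rows, same features

# Vertical FL: parties share users but hold different features,
# so the effective training set joins feature columns on common user IDs.
bank = {"u1": {"deposit": 10}, "u2": {"deposit": 20}}
tax  = {"u1": {"tax": 1},      "u2": {"tax": 2}}
common = bank.keys() & tax.keys()
vertical_rows = {u: {**bank[u], **tax[u]} for u in common}
```

In a real federated setting neither union nor join is materialized at one site; only model parameters are exchanged.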
At present, federated learning adopts a centralized aggregation method: the parameters of all participants must be aggregated, there is no guarantee that each participant's data makes a positive contribution to the aggregate, and the centralized aggregation process carries a heavy communication burden. In practice, a federated network may consist of a very large number of devices, such as millions of smartphones, and communication in the network may be many orders of magnitude slower than local computation.
Disclosure of Invention
The invention aims to provide a self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises, in order to solve the problems in the prior art and achieve the best possible accuracy of credit assessment for small and micro enterprises.
In order to achieve the purpose, the invention provides the following scheme:
the invention provides a self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises, comprising the following steps:
S1, acquiring the geographical position information of the participants, constructing a network from it, and calculating a distance matrix over the participant nodes;
S2, constructing a training model and training each participant locally on its data to obtain the accuracy;
S3, performing similarity detection on the participants using a smart contract (comprising data similarity detection and parameter similarity detection);
S4, obtaining the training matrix, selecting models, and performing parameter aggregation;
S5, returning the aggregated parameters to the models and retraining to obtain the new accuracy;
S6, comparing the two accuracies and at the same time updating the weights in the aggregation parameters;
S7, if the accuracy has improved, ending the training; otherwise, updating the weights in the distance matrix and the aggregated parameters, and performing parameter aggregation again according to the training matrix;
S8, repeating until the results of all participants have improved.
The detailed process is as follows:
S1: a network is built according to the geographical positions of the participants, and a distance matrix A is computed over the participant nodes. The distance matrix represents the distance relationships between the nodes and is analogous to the adjacency matrix of an undirected graph. Assume there are n participants, i.e., n nodes.
The distance matrix is defined as:

A[i, j] = w_ij (n - k)

where w_ij is a weight that is large when nodes i and j are close and small when they are far apart, A[i, j] is the distance value between nodes i and j (the corresponding entry of the distance matrix), and k is the number of nodes lying between nodes i and j;
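As a sketch, the distance matrix of S1 can be computed as follows for the A-B-C path network used in the worked example below; only the formula A[i, j] = w_ij(n - k) is from the text, and the function and variable names are illustrative:

```python
import numpy as np

def distance_matrix(n, weights, between):
    """Build A with A[i, j] = w_ij * (n - k): n is the number of
    participant nodes, w_ij the closeness weight of nodes i and j,
    and k the number of nodes lying between them. Diagonal stays 0."""
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                A[i, j] = weights[(i, j)] * (n - between[(i, j)])
    return A

# Worked example: path network A-B-C, so w = 2 for the adjacent pairs
# (A,B) and (B,C), w = 1 for (A,C), and one node (B) lies between A and C.
w = {(0, 1): 2, (1, 0): 2, (1, 2): 2, (2, 1): 2, (0, 2): 1, (2, 0): 1}
k = {(0, 1): 0, (1, 0): 0, (1, 2): 0, (2, 1): 0, (0, 2): 1, (2, 0): 1}
A = distance_matrix(3, w, k)  # [[0, 6, 2], [6, 0, 6], [2, 6, 0]]
```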
S2: a neural network model is constructed, and each participant performs model training locally to obtain its individual training accuracy Acc_i;
And S3, carrying out similarity detection on the participants by using the intelligent contract: the method comprises data similarity detection and parameter similarity detection.
1. And (3) detecting the data similarity:
vectorizing data, specifically including:
Segment the data and convert each word into one dimension of a vector by word embedding. A number is embedded by combining it with the words before and after it: Q_i = τ_i Q_{i-1} + τ_{i+1} Q_{i+1}, where Q_i is the embedding vector of the number, Q_{i-1} is the dimension of the converted vector for the word preceding the number, Q_{i+1} is the dimension for the word following the number, and τ_i and τ_{i+1} are weights, set to 0.6 and 0.4 respectively. This yields the vector E_1 of one segment, and each segment is vectorized in the same way. Finally, the overall data vector E is obtained from the segment vectors E_i (the combining formula is shown as an image), and data similarity detection is then performed;
Data similarity detection proceeds as follows:
Take any two participants a and b with vectors x and y after data vectorization, and assume x = [x_1, x_2, ..., x_n]^T, y = [y_1, y_2, ..., y_n]^T.
1) a and b inform each other of the modular lengths of their data vectors, i.e. |x| and |y|; at the same time a informs b of an arbitrary hyperplane through x: z: u_1 z_1 + u_2 z_2 + ... + u_n z_n = c, where c is a constant; one of the many hyperplanes through x is chosen at random and communicated to b;
2) b, knowing a hyperplane of a, computes the projection vector y_∥ of y on z and informs a. Since the normal vector of the hyperplane is n = [n_1, n_2, ..., n_N]^T, the projection vector of y on n is

((y · n) / |n|^2) n

and thus the projection of y on the hyperplane is

y_∥ = y - ((y · n) / |n|^2) n
3) a, knowing y_∥, computes the similarity distance (the formula is shown as an image) and informs b; ε_i is a weight in the formula with initial value 1. Since y = y_∥ + y_⊥ and x · y_⊥ = 0, we have x · y = x · y_∥, so the distance can be computed from the projection alone. A new participant data vector is then reselected and the similarity distance computed, until the data similarity matrix is obtained:

P = [P(i, j)], the matrix of pairwise data similarity distances.
Mechanism analysis:
1) a knows only the modular length |y| of y and the projection y_∥ of y on z, and cannot fully recover y;
2) b knows only the modular length |x| of x and a hyperplane z through x, and cannot fully recover x;
3) the privacy of the vector information is therefore ensured without any third party and without centralization.
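The three-step exchange above can be sketched as follows. This is a sketch under assumptions: the hyperplane is taken through the origin (c = 0, as in the worked example later in the description), the similarity is computed in cosine form from the disclosed projection, and the function names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_hyperplane_normal(x):
    """Party a: choose a random normal vector n orthogonal to x, defining
    a hyperplane z through the origin that contains x. Many such planes
    exist; one is picked at random and sent to b."""
    r = rng.normal(size=x.shape)
    return r - (r @ x) / (x @ x) * x  # remove the component of r along x

def project_onto_plane(y, n):
    """Party b: project y onto the hyperplane with normal n,
    y_par = y - ((y.n) / |n|^2) n. Only y_par (and |y|) reach a."""
    return y - (y @ n) / (n @ n) * n

def similarity(x, y_norm, y_par):
    """Party a: because x lies in the plane, x.y_perp = 0, hence
    x.y = x.y_par, so a cosine similarity is computable from y_par."""
    return (x @ y_par) / (np.linalg.norm(x) * y_norm)

x = np.array([1.0, 2.0, 3.0])   # a's private vector
y = np.array([2.0, 0.5, 1.0])   # b's private vector
n = make_hyperplane_normal(x)
y_par = project_onto_plane(y, n)
sim = similarity(x, np.linalg.norm(y), y_par)
true_sim = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
```

Neither side can reconstruct the other's full vector: a sees only |y| and the projection y_par, while b sees only |x| and one hyperplane through x.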
2. Parameter similarity detection
Vectorize the model parameters: the parameters of each layer of the model trained by each participant are sampled in turn by rejection sampling, and each sampled parameter forms one dimension of the parameter vector, yielding the model's parameter vector. For models with fewer layers, the missing dimensions are automatically padded with 0 so that all parameter vectors have the same length.
Detecting the parameter similarity: and (4) randomly taking the parameter vectors of the c model and the d model, and calculating the similarity distance, wherein the formula is shown as follows. And reselecting the two parameter vectors, and carrying out parameter similarity detection until all the models are detected, and finally obtaining a parameter similarity matrix.
Similarity distance:
Figure BDA0002871353370000051
parameter similarity matrix:
Figure BDA0002871353370000052
S3 (continued): the data similarity matrix and the parameter similarity matrix are combined to obtain the training matrix T (the combining formula is shown as an image), where T is the training matrix, P the data similarity matrix, S the parameter similarity matrix, and x a value between 0 and 4 varied in steps of 0.001 and set according to the scale of the participants; the weight of the data similarity is large at the start and then gradually decreases;
S4: according to the training requirements, the models with high values in the training matrix are selected for parameter aggregation, the aggregated parameters being computed by a weighted formula (shown as an image) in which σ_i is a weight with initial value 1;
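The aggregation formula is likewise shown only as an image; this sketch assumes a FedAvg-style weighted mean with per-participant weights σ_i (all initialized to 1), which matches the description but is not guaranteed to be the exact patented formula:

```python
import numpy as np

def aggregate(param_vectors, sigma):
    """Assumed aggregation: a sigma-weighted average of the selected
    participants' parameter vectors, W = sum_i sigma_i W_i / sum_i sigma_i."""
    params = np.asarray(param_vectors, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return (sigma[:, None] * params).sum(axis=0) / sigma.sum()

# Two selected participants with sigma = 1 each: a plain average.
w = aggregate([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])  # -> [2.0, 3.0]
```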
S5: the aggregated parameters are returned to the models, which retrain, and the new accuracy Acc_j is recorded;
S6: whether Acc_j > Acc_i is judged, and at the same time σ_i in the aggregated parameters is updated (the update formula is shown as an image), where σ_i is the weight with initial value 1 and σ_i' denotes the updated weight; the parameters are updated according to the training result. Assuming that models c and d have finished training, the σ_i for the next round of model c is obtained from this formula and used as the weight for parameter aggregation between c and the other models;
S7: if Acc_j > Acc_i, the training of the selected models ends. Otherwise, w_ij of the training models in the distance matrix is updated (the update formula is shown as an image), where η is the update step size and w_ij' is the updated weight in the distance matrix. After updating the corresponding w_ij in the distance matrix, similarity detection is performed again to obtain an updated training matrix, and models with high values in the training matrix are reselected for training until the accuracy improves;
S8: the participants are reselected for training (excluding the models already trained), until the results of all models have improved.
The invention discloses the following technical effects:
The method automatically selects the participants for parameter aggregation, so that aggregation over all participants is unnecessary; the participants are selected by combining parameter similarity and data similarity; participants are reselected for aggregation according to the weights in the distance matrix, which are adaptively updated using the accuracy before and after training; and the method requires no third party, is decentralized, and ensures the privacy of the vector and data information.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
Fig. 1 is a flowchart of the self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises.
Detailed Description
Reference will now be made in detail to various exemplary embodiments of the invention, the detailed description should not be construed as limiting the invention but as a more detailed description of certain aspects, features and embodiments of the invention.
It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. Further, for numerical ranges in this disclosure, it is understood that each intervening value, between the upper and lower limit of that range, is also specifically disclosed. Every smaller range between any stated value or intervening value in a stated range and any other stated or intervening value in a stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included or excluded in the range.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although only preferred methods and materials are described herein, any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention. All documents mentioned in this specification are incorporated by reference herein for the purpose of disclosing and describing the methods and/or materials associated with the documents. In case of conflict with any incorporated document, the present specification will control.
It will be apparent to those skilled in the art that various modifications and variations can be made in the specific embodiments of the present disclosure without departing from the scope or spirit of the disclosure. Other embodiments will be apparent to those skilled in the art from consideration of the specification. The specification and examples are exemplary only.
As used herein, the terms "comprising," "including," "having," "containing," and the like are open-ended terms that mean including, but not limited to.
The "parts" in the present invention are all parts by mass unless otherwise specified.
Assume that a small and micro enterprise S applies to a commercial bank for a loan, and that the participants are the central bank's credit bureau A, a commercial bank B, and a tax bureau C (the example is kept small for brevity; further participants, such as agricultural banks, tax bureaus, and statistics bureaus, could be added).
1. A network is first constructed based on the geographical locations of A, B, and C. Assume the network is A-B-C. The distance of each pair of nodes is obtained from the formula A[i, j] = w_ij (n - k), where w_ij is the weight: nearby network nodes receive a large weight and distant nodes a small weight. With w_ij set to 2 between A and B and between B and C, and to 1 between A and C, the distance between every pair of nodes is computed (a node's distance to itself is set to 0), giving the distance matrix

A =
0 6 2
6 0 6
2 6 0
2. A, B, and C train locally, obtaining the accuracies acc_1, acc_2, acc_3 of their respective credit evaluations of the company, and each records its training parameters: A: α_A = 0.13, β_A = 0.53, γ_A = 0.32; B: α_B = 0.26, β_B = 0.85, γ_B = 0.64; C: α_C = 0.42, β_C = 1.65, γ_C = 1.84.
3. The similarity of the participants' data and parameters is detected using the smart contract, as follows:
1) Data similarity detection: the data, such as the record in A that an enterprise with good credit holds a deposit of 500,000 yuan at the commercial bank, and the record in C that it pays 50,000 yuan of tax per year, are segmented and vectorized. The specific method is:
Segment the data and convert each word into one dimension of a vector by word embedding. A number is embedded by combining it with the words before and after it: Q_i = τ_i Q_{i-1} + τ_{i+1} Q_{i+1}, where Q_i is the embedding vector of the number, Q_{i-1} is the dimension of the converted vector for the word preceding the number, Q_{i+1} is the dimension for the word following the number, and τ_i and τ_{i+1} are weights set to 0.6 and 0.4 respectively. This yields the vector E_1 of one segment; each segment is then vectorized as described above. Assume the three segment vectors of A are obtained (values shown as an image). The vectors are generated from the same dictionary, so their dimensions agree. Finally, the data vector E_A of A is obtained, and the data vectors E_B and E_C of B and C in the same way (values shown as images).
After the data vectors are obtained, data similarity detection is performed on them:
(1) A, B, and C inform each other of the modular lengths of their data vectors (the value of |E_A| is shown as an image; similarly |E_B| = 2.73, |E_C| = 0.74). At the same time A tells B an arbitrary hyperplane through E_A: z: u_1 z_1 + u_2 z_2 + ... + u_n z_n = 0 (the chosen coefficients are shown as an image); hyperplanes through E_A are numerous, and one is selected at random and communicated to B;
(2) B, knowing a hyperplane of A, computes the projection vector y_∥ of E_B on z and informs A. The projection vector of E_B on the normal n is ((E_B · n) / |n|^2) n, and hence y_∥ = E_B - ((E_B · n) / |n|^2) n (the numerical values are shown as images);
(3) A, knowing y_∥, computes the similarity distance: since E_B = y_∥ + y_⊥ and E_A · y_⊥ = 0, we have E_A · E_B = E_A · y_∥, giving P(A, B) (the value is shown as an image);
(4) the remaining pairs of participant data vectors are selected in turn and their similarity distances computed: P(A, C) = 0.62, P(B, C) = 0.53;
(5) the data similarity matrix P is obtained (shown as an image).
And (3) mechanism analysis:
(a) knowing only the length y of y1And y projection vector y on z||Y cannot be completely recovered;
b knowing only the length x of x1And a hyperplane z passing x, x cannot be completely recovered;
and thirdly, the privacy of the vector information can be ensured without the participation of a third party and without centralization.
2) Parameter similarity detection: the trainable parameters of each layer of the models trained by participants A, B, and C are sampled in turn by rejection sampling, each sampled parameter forming one dimension of the parameter vector. This yields the parameter vector of each model; for models with fewer layers, the missing dimensions are padded with 0 so that all vectors have the same length. Suppose the model trained by A has 2 layers, with 5 trainable parameters in the first layer and 7 in the second, and the model trained by B has 3 layers, with 6, 6, and 9 trainable parameters respectively. Rejection sampling is applied to each layer of A and B, taking 3 trainable parameters per layer; since B has one more layer, the parameter vector of A has an extra dimension set to 0. The parameter vector of A is obtained (shown as an image), and likewise for B.
The similarity of the parameter vectors is detected, giving d(A, B) (the value is shown as an image); then d(A, C) = 0.23 and d(B, C) = 0.56 are computed, yielding the parameter similarity matrix S (shown as an image).
The training matrix T is then obtained (shown as an image).
4. Assuming A must participate: since 1.13 < 1.36, the pair A and C, which has the higher value in the training matrix, is selected for parameter aggregation (the aggregated parameters are shown as an image).
5. The aggregated parameters are returned to A, B, and C for retraining, and the accuracies Acc1, Acc2, and Acc3 are recorded; at the same time the aggregation weights are updated (the update formulas are shown as images), where σ_i is the weight with initial value 1 and σ_i' the updated weight, the weights being updated according to the training result. Assuming that models A and C have finished training, the σ_i for the next round of model A is obtained from the formula and used as the weight for parameter aggregation with the other models.
6. If Acc_j > Acc_i, the training of A and C ends. Otherwise, w_ij of the training models in the distance matrix is updated (the update formula is shown as an image), where η is the update step size and w_ij' the updated weight in the distance matrix. After updating the corresponding w_ij, similarity detection is performed again to obtain an updated training matrix, and models with high values in the training matrix are reselected for training until the accuracy improves.
7. Participants A and B are reselected for training (excluding the combination A and C trained previously).
8. All participants trained locally show improved evaluation accuracy.
The above-described embodiments merely illustrate preferred embodiments of the present invention and do not limit its scope; various modifications and improvements made by those skilled in the art to the technical solutions of the present invention without departing from its spirit fall within the scope of protection defined by the claims.

Claims (8)

1. A self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises, characterized in that the method comprises the following steps:
S1, acquiring the geographical position information of the participants, constructing a network from it, and calculating a distance matrix over the participant nodes;
S2, constructing a neural network model, and performing model training locally at each participant to obtain the accuracy of individual training;
S3, performing similarity detection on the participants using the smart contract, comprising data similarity detection and parameter similarity detection; obtaining a data similarity matrix from the data similarity detection, obtaining a parameter similarity matrix from the parameter similarity detection, and combining the data similarity matrix and the parameter similarity matrix to obtain a training matrix;
S4, after the training matrix is obtained, selecting the models with high values in the training matrix for parameter aggregation;
S5, returning the aggregated parameters to the models, retraining, and recording the accuracy;
S6, comparing the two accuracies and updating the weights in the aggregation parameters; if the accuracy has improved, ending the training; otherwise, updating the weights in the distance matrix and the aggregated parameters, and performing similarity detection again;
S7, performing parameter aggregation again according to the training matrix;
S8, repeating until the results of all participants improve.
2. The self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises according to claim 1, characterized in that: the data similarity detection in S3 comprises data vectorization and data similarity detection, the data vectorization being: segmenting the data, converting each word into one dimension of a vector by word embedding, and embedding each number by combining it with the preceding and following words via the formula for Q_i; a vector is obtained for each segment, each segment is vectorized, and the total data vector E is finally obtained from the segment vectors E_i according to a formula (shown as an image), where E is the total data vector and E_i is the vector of each segment; data similarity detection is then performed on the data vectors.
3. The self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises according to claim 2, characterized in that: the data similarity detection comprises: taking the vectors x and y of any two participants a and b after data vectorization;
1) a and b inform each other of the modular lengths of their data vectors, and a informs b of an arbitrary hyperplane z through x;
2) b, knowing a hyperplane of a, computes the projection vector of y on z and informs a, the projection of y on the known normal vector n of the hyperplane being obtained first;
3) a, having obtained the projection vector of y on z, computes the similarity distance and informs b; a new participant data vector is then reselected and the similarity distance computed, giving the data similarity matrix.
4. The self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises according to claim 2, characterized in that: the formula for Q_i is Q_i = τ_i Q_{i-1} + τ_{i+1} Q_{i+1}, wherein Q_i is the embedding vector of the number, Q_{i-1} is the dimension of the converted vector for the word preceding the number, Q_{i+1} is the dimension of the converted vector for the word following the number, and τ_i and τ_{i+1} are weights, τ_i being set to 0.6 and τ_{i+1} to 0.4.
5. The self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises according to claim 1, characterized in that: the parameter similarity detection in S3 comprises model parameter vectorization and parameter similarity detection, the model parameter vectorization being: sampling the parameters of each layer of the model trained by each participant in turn, each sampled parameter forming one dimension of the parameter vector to obtain the parameter vector of the model, the dimensions of models with fewer layers being automatically padded with 0 so that all vectors have the same length.
6. The self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises according to claim 5, characterized in that: the sampling is performed by rejection sampling.
7. The self-adaptive federated parameter aggregation method for credit assessment of small and micro enterprises according to claim 1, characterized in that: the parameter similarity detection in S3 is: taking the parameter vectors of any two models c and d, computing their similarity distance, reselecting two parameter vectors, and repeating the parameter similarity detection until all models have been compared, finally obtaining the parameter similarity matrix.
8. The adaptive federated parameter aggregation method for credit assessment of small and micro enterprises according to claim 1, characterized in that: the training matrix in S3 is:
[formula given as image FDA0002871353360000031 in the original filing]
wherein T is the training matrix, P is the data similarity matrix and S is the parameter similarity matrix; x runs from 0 to 4 in steps of 0.001, the step size being set according to the number of participants, so that the weight of the data similarity is large at the start and then gradually decreases.
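Because the claimed formula survives only as an image, the exact form of T cannot be recovered; the sketch below uses an exponential decay weight exp(−x) as an ASSUMPTION that merely reproduces the stated behaviour (P dominates at x = 0 and its weight decays as x grows toward 4). It is an illustration of the blending idea, not the patented formula.

```python
import numpy as np

def training_matrix(P: np.ndarray, S: np.ndarray, x: float) -> np.ndarray:
    """Blend data similarity P with parameter similarity S.
    The exp(-x) weight is an assumption standing in for the image-only
    formula: at x = 0 the result equals P, and as x grows the weight
    shifts toward S, matching the decay described in the claim."""
    w = np.exp(-x)
    return w * P + (1.0 - w) * S

# per the claim, x is swept from 0 to 4 in steps of 0.001
xs = np.arange(0.0, 4.0 + 1e-9, 0.001)
```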
CN202011600934.0A 2020-12-30 2020-12-30 Self-adaptive federal parameter aggregation method for credit assessment of small and micro enterprises Pending CN112634027A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011600934.0A CN112634027A (en) 2020-12-30 2020-12-30 Self-adaptive federal parameter aggregation method for credit assessment of small and micro enterprises

Publications (1)

Publication Number Publication Date
CN112634027A true CN112634027A (en) 2021-04-09

Family

ID=75286360

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011600934.0A Pending CN112634027A (en) 2020-12-30 2020-12-30 Self-adaptive federal parameter aggregation method for credit assessment of small and micro enterprises

Country Status (1)

Country Link
CN (1) CN112634027A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111125779A (en) * 2019-12-17 2020-05-08 山东浪潮人工智能研究院有限公司 Block chain-based federal learning method and device
CN111598143A (en) * 2020-04-27 2020-08-28 浙江工业大学 Credit evaluation-based defense method for federal learning poisoning attack
CN111600707A (en) * 2020-05-15 2020-08-28 华南师范大学 Decentralized federal machine learning method under privacy protection

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113240184A (en) * 2021-05-21 2021-08-10 浙江大学 Building space unit cold load prediction method and system based on federal learning
CN113240184B (en) * 2021-05-21 2022-06-24 浙江大学 Building space unit cold load prediction method and system based on federal learning
CN113469370A (en) * 2021-06-22 2021-10-01 河北工业大学 Industrial Internet of things data sharing method based on federal incremental learning
CN113469370B (en) * 2021-06-22 2022-08-30 河北工业大学 Industrial Internet of things data sharing method based on federal incremental learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210409