CN114969078A - Method for updating expert research interest of federated learning through real-time online prediction - Google Patents

Method for updating expert research interest of federated learning through real-time online prediction

Info

Publication number
CN114969078A
Authority
CN
China
Prior art keywords
information
expert
model
text
global
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210653137.1A
Other languages
Chinese (zh)
Inventor
王书海
彭浩
陈扬
刘欣
王壬欢
刘明瑞
唐翊群
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shijiazhuang Tiedao University
Original Assignee
Shijiazhuang Tiedao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shijiazhuang Tiedao University filed Critical Shijiazhuang Tiedao University
Priority to CN202210653137.1A priority Critical patent/CN114969078A/en
Publication of CN114969078A publication Critical patent/CN114969078A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a federated-learning method for real-time online prediction and updating of expert research interests, which comprises the following steps: each participant locally extracts expert text information for modeling, obtaining semantic information and time information; expert features are learned with a text convolutional neural network model, and a self-attention mechanism is introduced to strengthen the semantic information of the text and to learn the time information; horizontal federated learning is adopted, whereby each participant trains a model locally and submits the training parameters to a global server, which optimizes them with a federated matched-averaging algorithm augmented with layer-wise matching and a communication mechanism, performs global aggregation of the parameters, and returns the global parameters to update each local model; and the encoding vectors produced by the trained text convolutional neural network model are clustered to obtain expert research interests. The method resolves the data-island problem, protects data privacy, and improves the accuracy of expert research-interest prediction.

Description

Method for updating expert research interest of federated learning through real-time online prediction
Technical Field
The invention belongs to the technical field of data processing, and in particular relates to a federated-learning method for real-time online prediction and updating of expert research interests.
Background
At present, the data resources related to scientific-research experts are vast in quantity, scattered in distribution, and not deeply processed, which makes it difficult for users to obtain academically valuable information quickly and accurately. The problem is especially prominent for expert research interests. When searching for experts in different fields, the subject an expert has self-reported is not accurate enough: the expert's more prominent research may lie in other fields, and cooperation across fields grows ever closer. Hence, for needs such as a project-review department seeking suitable reviewers, a researcher seeking collaborators, or a professional question requiring the help of a domain expert, a great deal of time and effort is spent while the query results are not necessarily valuable. Extracting an expert's research fields from academic experience, published papers, and joint projects, and predicting the expert's updated research interests from the accompanying time information, is of great help in accurately finding the experts needed.
The heterogeneity and privacy of expert data pose a great challenge to expert-information integration, mainly in the form of data-security problems, huge text volume, and the data-island problem. Existing federated-learning techniques are mainly of two types. Horizontal federated learning: when the user features of two datasets overlap heavily but the users overlap little, the datasets are split horizontally (by sample). Vertical federated learning: when the users of two datasets overlap heavily but the user features overlap little, the datasets are split vertically (by feature).
Because user data are not independent and identically distributed and the data volumes of users are unbalanced, traditional federated learning shows a large deviation when updating from local data; at the same time, the model automatically drifts toward devices with good network conditions, so the inconsistency of systems and data distorts the training result.
Disclosure of Invention
To solve the above problems, the invention provides a federated-learning method for real-time online prediction and updating of expert research interests. Expert features are learned with a text convolutional neural network model and an attention mechanism, and expert research interests are obtained from the expert information through the text convolutional neural network. Horizontal federated learning is introduced, with layer-wise matched-averaging optimization and an added communication mechanism for training the model; this resolves the data-island problem and protects data privacy while improving the accuracy of expert research-interest prediction.
To achieve this purpose, the invention adopts the following technical scheme: a federated-learning method for real-time online prediction and updating of expert research interests, comprising the following steps:
S10, facing the expert-information data of the heterogeneous, non-public talent and expert databases of various regions, each participant locally extracts expert text information for modeling, obtaining semantic information and time information;
S20, learning expert features with a text convolutional neural network model, introducing a self-attention mechanism to strengthen the semantic information of the text, and learning the time information;
S30, adopting horizontal federated learning: after training the model locally, each participant submits the training parameters to a global server, which optimizes them with a federated matched-averaging algorithm augmented with layer-wise matching and a communication mechanism, performs global aggregation of the parameters, returns the global parameters to each local model, and updates the local models;
S40, clustering the encoding vectors produced by the trained text convolutional neural network model to obtain expert research interests.
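By way of non-limiting illustration, the following minimal Python sketch shows one way to realize the clustering of step S40; the use of scikit-learn's KMeans, the cluster count, and the array shapes are assumptions on our part, since the method specifies only that the encoding vectors are clustered.

```python
# A minimal sketch of step S40 (assumed k-means clustering of the encoding
# vectors produced by the trained text convolutional neural network).
import numpy as np
from sklearn.cluster import KMeans

def extract_research_interests(encodings, n_interests=10):
    """encodings: (n_texts, d) array of text-CNN encoding vectors."""
    km = KMeans(n_clusters=n_interests, n_init=10, random_state=0)
    labels = km.fit_predict(encodings)          # interest-cluster id per text
    return labels, km.cluster_centers_          # centroids summarize interests

labels, centers = extract_research_interests(np.random.rand(200, 64))
```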
Further, in step S10, facing the expert-information data of the heterogeneous, non-public talent and expert databases of various regions, each participant locally extracts expert text information for modeling, obtaining semantic information and time information, by the steps of:
S11, each participant locally extracting expert text information, including the expert's personal academic-experience information, published-paper information, project-participation information, and award information;
S12, inputting the personal academic experience, published-paper information, project-participation information, and award information, in document form, into a word-vector model for training and unified encoding, obtaining word vectors that comprise a semantic-information vector x and timestamp information t.
Further, the word-vector model adopts a Word2vec pre-trained model.
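For illustration only, a minimal sketch of steps S11-S12 follows, assuming gensim's Word2Vec, a simple (tokens, timestamp) layout for the extracted expert texts, and sentence vectors formed by averaging word vectors; none of these choices is prescribed by the method itself.

```python
# A minimal sketch of local word-vector training (steps S11-S12); the data
# layout and hyperparameters below are illustrative assumptions.
import numpy as np
from gensim.models import Word2Vec

def encode_expert_texts(expert_docs):
    """expert_docs: list of (tokens, timestamp) pairs extracted locally from
    academic experience, published papers, projects and awards."""
    sentences = [tokens for tokens, _ in expert_docs]
    w2v = Word2Vec(sentences, vector_size=128, window=5, min_count=1, sg=1)
    encoded = []
    for tokens, t in expert_docs:
        # semantic-information vector x: mean of the sentence's word vectors
        x = np.mean([w2v.wv[w] for w in tokens], axis=0)
        encoded.append((x, t))                  # pair x with timestamp t
    return encoded

# The raw text never leaves the participant; only model parameters are shared.
pairs = encode_expert_texts([(["graph", "neural", "networks"], 2021),
                             (["federated", "learning", "privacy"], 2022)])
```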
Further, in step S20, learning the expert features with the text convolutional neural network model comprises the steps of:
S211, representing the text semantic information: the text is divided into sentences, each sentence containing n words and each word being a k-dimensional vector, so the resulting word-vector matrix is expressed as
$$x_{1:n} = x_1 \oplus x_2 \oplus \cdots \oplus x_n,$$
where $\oplus$ is the concatenation operator, so that $x_{1:n}$ denotes the concatenation of the words $x_i$;
s212, generating characteristic c for each word by using a filter i Expressed as:
c i =f(w·x i:i+h-1 +b);
wherein h is the height of the filter, b is a bias term, w is the weight, and f is a nonlinear function;
s213, generating sentence features from the obtained word features c as follows:
c=[c 1 ,c 2 ,...,c n-h+1 ];
s214, regularizing the penultimate layer of the text convolution neural network by using dropout, and finally outputting:
Figure BDA0003686637080000031
where w is the regularization weight, z is the sentence feature set,
Figure BDA0003686637080000032
is the bitwise multiplication operator, r is the height of the bernoulli random variable filter, and b is a bias term.
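As a non-limiting illustration of S211-S214, a minimal PyTorch sketch of the text convolutional neural network follows; the filter heights (3, 4, 5), channel count, embedding size k = 128, and the max-pooling over each feature map are assumptions in the spirit of standard text CNNs, not details taken from the method.

```python
# A minimal text-CNN sketch: filters of height h slide over the n-by-k
# word-vector matrix, producing c_i = f(w·x_{i:i+h-1} + b) per window, and
# dropout regularizes the penultimate layer (y = w·(z∘r) + b).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, k=128, heights=(3, 4, 5), channels=100, out_dim=64, p=0.5):
        super().__init__()
        # one convolution per filter height; each kernel spans the full width k
        self.convs = nn.ModuleList(
            nn.Conv2d(1, channels, kernel_size=(h, k)) for h in heights)
        self.dropout = nn.Dropout(p)            # Bernoulli masking of z
        self.fc = nn.Linear(channels * len(heights), out_dim)

    def forward(self, x):                       # x: (batch, n, k) word vectors
        x = x.unsqueeze(1)                      # -> (batch, 1, n, k)
        feats = []
        for conv in self.convs:
            c = F.relu(conv(x)).squeeze(3)      # c = [c_1, ..., c_{n-h+1}]
            feats.append(F.max_pool1d(c, c.size(2)).squeeze(2))
        z = torch.cat(feats, dim=1)             # sentence feature set z
        return self.fc(self.dropout(z))         # final encoding vector
```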
Further, in step S20, strengthening the semantic-information representation and learning the time information with a multi-head self-attention mechanism comprises the steps of:
S221, for each semantic-information vector x there is a corresponding piece of timestamp information t; adding t into the word-vector semantics yields a new feature matrix Y(x, t);
S222, computing a similarity coefficient between each sentence and its adjacent sentences and normalizing it to obtain the attention coefficients;
S223, fusing the neighborhood information according to the obtained attention coefficients and aggregating it with the sentence's original features to form new features;
S224, introducing a multi-head self-attention mechanism that uses several shared parameter matrices W simultaneously and merges the several newly generated sentence representations into one, obtaining the time-stamped information encoding.
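By way of illustration, a minimal PyTorch sketch of S221-S224 follows under stated assumptions: the timestamp t is appended to each sentence feature and linearly projected to form Y(x, t), nn.MultiheadAttention supplies the heads with shared parameter matrices whose outputs are merged into one representation, and a residual connection stands in for the aggregation with the original features in S223.

```python
# A minimal sketch of the time-aware multi-head self-attention step;
# all dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class TimeAwareSelfAttention(nn.Module):
    def __init__(self, d_sem=64, d_model=64, n_heads=4):
        super().__init__()
        self.fuse = nn.Linear(d_sem + 1, d_model)   # Y(x, t): append t to x
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x, t):
        # x: (batch, n_sent, d_sem) sentence features; t: (batch, n_sent) times
        y = self.fuse(torch.cat([x, t.unsqueeze(-1)], dim=-1))
        # S222-S223: similarity coefficients, softmax-normalized into attention
        # weights, then neighborhood information is fused across sentences
        out, coeff = self.attn(y, y, y)
        return out + y, coeff                       # keep original features too
```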
Further, in step S30, the extracted hidden elements with similar features are matched and averaged with a federated matched-averaging algorithm, and a shared global model is built layer by layer.
Further, in each iteration a corresponding global model is found from the given weight-matrix estimate and then matched against the local neurons on the clients with the Hungarian algorithm, yielding a new, expanded global model, by the steps of:
S321, before averaging the parameters of the neural networks, finding the parameter permutation and merging the weights of neurons with similar features;
s322, given an increasing function, optimizing the Nippon average algorithm to obtain an optimized average matching formula as follows:
Figure BDA0003686637080000041
Figure BDA0003686637080000042
wherein: l represents the number of nodes of the hidden layer, c (w) jli ) Representing the similarity relationship between the ith neuron learned on the client j and the ith neuron in the global server model, f (i) representing an increasing function, σ representing a threshold,
Figure BDA0003686637080000043
represents an arrangement of parameters;
s323, given the weight W provided by J clients j,1 ,W j,2 Calculating the weight of the federal neural network:
Figure BDA0003686637080000044
and S324, layering and matching, and finally aggregating to obtain model updating parameters.
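For illustration, a minimal NumPy/SciPy sketch of single-layer matched averaging (S321-S324) follows, with simplifying assumptions: the similarity c(w_jl, θ_i) is taken as squared Euclidean distance, SciPy's linear_sum_assignment plays the role of the Hungarian algorithm, and a local neuron whose best match costs more than σ + f(i) spawns a new global neuron, in the spirit of S322 and the threshold rule below.

```python
# A minimal matched-averaging sketch for one layer; the cost choice, threshold
# handling, and initialization from the first client are assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_and_average(client_weights, sigma=1.0, f=lambda i: 0.01 * i):
    """client_weights: list of (L_j, d) arrays, one per client j."""
    theta = client_weights[0].copy()              # initial global neurons
    counts = np.ones(len(theta))
    for W in client_weights[1:]:
        # c(w_jl, theta_i): squared distance between local and global neurons
        cost = ((W[:, None, :] - theta[None, :, :]) ** 2).sum(-1)
        rows, cols = linear_sum_assignment(cost)  # Hungarian matching
        new = []
        for l, i in zip(rows, cols):
            if cost[l, i] <= sigma + f(len(theta)):
                theta[i] = (theta[i] * counts[i] + W[l]) / (counts[i] + 1)
                counts[i] += 1
            else:                                 # too dissimilar: new neuron
                new.append(W[l])
        if new:
            theta = np.vstack([theta, np.array(new)])
            counts = np.concatenate([counts, np.ones(len(new))])
    return theta
```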
Further, if the cost of the optimal match exceeds a set threshold, no match is made, and a new global neuron is created for the corresponding local neuron.
Further, the layer-wise matching scheme comprises the steps of:
S331, the global server collects only the first-layer weights from the clients and performs the single-layer matching described above to obtain the first-layer weights of the federated model;
S332, the global server broadcasts the obtained first-layer federated weights to the clients, and each client continues training all subsequent layers on its own dataset while keeping the federated matched layer frozen;
S333, this procedure is repeated up to the last layer; for the last layer, a weighted average is taken over the clients' data points according to their class proportions to obtain the final parameters.
Further, a communication mechanism is added to the federated matched-averaging algorithm, comprising the step of:
at the start of each new round, the local client receives the matched global model and, based on the previous round's matching result, reconstructs a local model of the same size as the original local model, so that the size of the global model is kept within a small range.
The beneficial effects of this technical scheme are as follows:
according to the invention, aiming at expert information data of heterogeneous and non-public talent expert libraries in various regions, each participant extracts expert text information locally for modeling; learning the characteristics of the experts by using a text convolutional neural network model, introducing a self-attention mechanism to learn semantic information of an enhanced text, learning time information, and obtaining codes; and (3) adopting a transverse federal learning mode, carrying out model training locally by each participant, submitting training parameters to a global server, optimizing by using a federal matching average algorithm, and adding a hierarchical matching and communication mechanism. The server carries out global aggregation of the parameters, returns the global parameters to each local model and updates the local models; and clustering the coding vectors obtained by the network model by the training text convolution spirit to obtain the research interest of experts. According to the method, the semantic and structural information in the expert information is fully utilized, the expert research interest is obtained from the expert information through the text convolution neural network, the horizontal federal learning technology is introduced, the problem of data islanding is solved, the data privacy is protected, and meanwhile, the accuracy of prediction of the expert research interest is improved.
To use the various elements of expert information more effectively, the invention fully mines expert data from different sources: the elements and timestamps in the expert information are vectorized to construct feature vectors that contain both textual and temporal features. This effectively expresses the different elements of the expert information and the relations between them, so more knowledge is captured and the model outperforms common information-representation methods.
The text convolutional neural network model introduced to encode the information has, as its greatest advantages, a simple network structure and few parameters, so its computational efficiency is higher than that of other models. Dropout regularization is added to prevent overfitting while the parameters are updated by gradient methods. The multi-head self-attention mechanism strengthens the semantic information, learns the time information, and accounts for the influence of time on research interests. Meanwhile, federated learning lets every party train on its own data while privacy is protected, which enlarges the effective data volume and improves the accuracy of the prediction results.
The invention focuses on data skew: each domain is treated as a client, so the local models are not affected by aggregate data bias and can learn meaningful relations between features and classes; the algorithm can then learn a good, unbiased global model. Whereas existing federated-learning algorithms average parameters element-wise with weights proportional to client dataset sizes (and are thus limited by them), the invention exploits the permutation invariance of neurons to merge the weights of neurons with similar features and build a shared global model. This optimizes the training result of the text convolutional neural network and improves communication efficiency.
To better suit a real online environment, a communication mechanism is added to the horizontal federated-learning algorithm, since data privacy in real environments prevents training on the data as a whole. The mechanism connects the global server and the local clients: at the start of each new round, a local client receives the matched global model and reconstructs a local model of the same size as its original one from the previous round's matching result. This keeps the global model small and thereby reduces the performance demands on the local clients' networks. When a newly joined client trains, the previously trained parameters can be transferred directly to its model before new training rounds proceed, so new data is integrated into the whole more conveniently. In heterogeneous-data scenarios, the algorithm therefore outperforms other federated-learning methods.
Drawings
FIG. 1 is a flow chart of the federated-learning method for real-time online prediction and updating of expert research interests according to the present invention;
FIG. 2 is a block diagram of the expert research-interest prediction model of the present invention;
FIG. 3 is a schematic diagram of the principle framework of horizontal federated learning of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described with reference to the accompanying drawings.
In this embodiment, referring to fig. 1 and fig. 2, the invention provides a federated-learning technique for real-time online prediction and updating of expert research interests, comprising the steps of:
s10, aiming at expert information data of heterogeneous and non-public talent expert libraries in various regions, each participant respectively extracts expert text information locally for modeling to obtain a semantic information vector X and a timestamp vector t;
s20; learning expert features by using a text convolution neural network model, introducing a self-attention mechanism to strengthen semantic information of a text, and learning time information;
s30, training models locally by each participant in a horizontal federal learning mode, submitting training parameters to a global server, optimizing by using a federal matching average algorithm, adding a hierarchical matching and communication mechanism, performing global aggregation on the parameters by the server, returning the global parameters to each local model, and updating the local models;
and S40, clustering the coding vectors obtained by the network model by the training text convolution spirit to obtain the research interest of experts.
As optimization scheme 1 of the above example, in step S10, facing the expert-information data of the heterogeneous, non-public talent and expert databases of various regions, each participant locally extracts expert text information for modeling, obtaining semantic information and time information, by the steps of:
S11, each participant locally extracting expert text information, including the expert's personal academic-experience information, published-paper information, project-participation information, and award information;
S12, inputting the personal academic experience, published-paper information, project-participation information, and award information, in document form, into a word-vector model for training and unified encoding, obtaining a semantic-information vector x and timestamp information t.
As optimization scheme 2 of the above example, in step S20, learning the expert features with the text convolutional neural network model comprises the steps of:
S212, using a filter to generate a feature $c_i$ for each window of h words, expressed as
$$c_i = f(w \cdot x_{i:i+h-1} + b),$$
where h is the height of the filter, b is a bias term, w is the filter weight, and f is a nonlinear function;
S213, generating the sentence feature from the obtained word features as
$$c = [c_1, c_2, \ldots, c_{n-h+1}];$$
S214, regularizing the penultimate layer of the text convolutional neural network with dropout, giving the final output
$$y = w \cdot (z \circ r) + b,$$
where w is the regularization weight, z is the sentence feature set, $\circ$ is the element-wise multiplication operator, r is a masking vector of Bernoulli random variables, and b is a bias term.
The word-vector model of this optimization scheme adopts a Word2vec pre-trained model.
As optimization scheme 3 of the above example, in step S20, strengthening the semantic-information representation and learning the time information with a multi-head self-attention mechanism comprises the steps of:
S221, for each semantic-information vector x there is a corresponding piece of timestamp information t; adding t into the word-vector semantics yields a new feature matrix Y(x, t);
S222, computing a similarity coefficient between each sentence and its adjacent sentences and normalizing it to obtain the attention coefficients;
S223, fusing the neighborhood information according to the obtained attention coefficients and aggregating it with the sentence's original features to form new features;
S224, introducing a multi-head self-attention mechanism that uses several shared parameter matrices W simultaneously and merges the several newly generated sentence representations into one, obtaining the encoding.
As optimization scheme 4 of the above example, as shown in fig. 3, step S30 comprises:
S321, before averaging the parameters of the neural networks, finding the parameter permutation and merging the weights of neurons with similar features;
S322, given an increasing function, optimizing the federated averaging algorithm to obtain the optimized matched-averaging formulation
$$\min_{\{\pi_{li}^{j}\}} \sum_{i=1}^{L'} \sum_{j,l} \pi_{li}^{j}\, C_{li}^{j}, \qquad \text{s.t. } \sum_{i} \pi_{li}^{j} = 1 \;\; \forall j, l; \quad \sum_{l} \pi_{li}^{j} \le 1 \;\; \forall i, j,$$
with the matching cost
$$C_{li}^{j} = \begin{cases} c(w_{jl}, \theta_i), & i \le L, \\ \sigma + f(i), & L < i \le L', \end{cases}$$
where L is the number of hidden-layer nodes (L' after expansion), $c(w_{jl}, \theta_i)$ denotes the similarity between the l-th neuron learned on client j and the i-th neuron of the global server model, f(i) is an increasing function, σ is a threshold, and $\pi_{li}^{j}$ is the parameter assignment (permutation).
S323, given the weights $W_{j,1}, W_{j,2}$ provided by the J clients, computing the weights of the federated neural network as the permutation-aligned average
$$W_h = \frac{1}{J} \sum_{j=1}^{J} \Pi_j^{\top} W_{j,h}, \quad h \in \{1, 2\},$$
where $\Pi_j$ is the matching (permutation) matrix of client j.
S324, matching layer by layer and finally aggregating to obtain the required parameters.
As optimization scheme 5 of the above example, a layer-wise matching scheme is adopted, comprising the steps of:
S331, the global server collects only the first-layer weights from the clients and performs the single-layer matching described above to obtain the first-layer weights of the federated model.
S332, the global server broadcasts these weights to the clients; each client continues training all subsequent layers on its own dataset while keeping the federated matched layer frozen.
S333, this procedure is repeated up to the last layer. For the last layer, a weighted average is taken over the clients' data points according to their class proportions to obtain the final parameters.
Finally, a communication mechanism is added to the federated matched-averaging algorithm, comprising the step of:
at the start of each new round, the local client receives the matched global model and, based on the previous round's matching result, reconstructs a local model of the same size as the original local model, thereby keeping the size of the global model within a small range.
The foregoing shows and describes the general principles and main features of the present invention and its advantages. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above, which merely illustrate its principles; various changes and modifications may be made without departing from the spirit and scope of the invention, and such changes and modifications fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.

Claims (10)

1. A federated-learning method for real-time online prediction and updating of expert research interests, characterized by comprising the following steps:
s10, aiming at expert information data of heterogeneous and non-public talent expert libraries in various regions, each participant respectively extracts expert text information locally for modeling to obtain semantic information and time information;
s20; learning expert features by using a text convolution neural network model, introducing a self-attention mechanism to strengthen semantic information of a text, and learning time information;
s30, adopting a transverse federal learning mode, submitting training parameters to a global server after each participant trains the model locally, optimizing by using a federal matching average algorithm, adding a hierarchical matching and communication mechanism, performing global aggregation of the parameters by the server, returning the global parameters to each local model, and updating the local model;
and S40, clustering the coding vectors obtained by training the text convolutional neural network model to obtain the research interest of experts.
2. The federated-learning method for real-time online prediction and updating of expert research interests according to claim 1, wherein in step S10, facing the expert-information data of the heterogeneous, non-public talent and expert databases of various regions, each participant locally extracts expert text information for modeling, obtaining semantic information and time information, by the steps of:
S11, each participant locally extracting expert text information, including the expert's personal academic-experience information, published-paper information, project-participation information, and award information;
S12, inputting the personal academic experience, published-paper information, project-participation information, and award information, in document form, into a word-vector model for training and unified encoding, obtaining word vectors that comprise a semantic-information vector x and timestamp information t.
3. The method of claim 2, wherein the word-vector model is a Word2vec pre-trained model.
4. The federated-learning method for real-time online prediction and updating of expert research interests according to claim 1, wherein in step S20, learning the expert features with the text convolutional neural network model comprises the steps of:
S211, representing the text semantic information: the text is divided into sentences, each sentence containing n words and each word being a k-dimensional vector, so the resulting word-vector matrix is expressed as
$$x_{1:n} = x_1 \oplus x_2 \oplus \cdots \oplus x_n,$$
where $\oplus$ is the concatenation operator, so that $x_{1:n}$ denotes the concatenation of the words $x_i$;
S212, using a filter to generate a feature $c_i$ for each window of h words, expressed as
$$c_i = f(w \cdot x_{i:i+h-1} + b),$$
where h is the height of the filter, b is a bias term, w is the filter weight, and f is a nonlinear function;
S213, generating the sentence feature from the obtained word features as
$$c = [c_1, c_2, \ldots, c_{n-h+1}];$$
S214, regularizing the penultimate layer of the text convolutional neural network with dropout, giving the final output
$$y = w \cdot (z \circ r) + b,$$
where w is the regularization weight, z is the sentence feature set, $\circ$ is the element-wise multiplication operator, r is a masking vector of Bernoulli random variables, and b is a bias term.
5. The federated-learning method for real-time online prediction and updating of expert research interests according to claim 1 or 4, wherein in step S20, the semantic-information representation is strengthened and the time information is learned with a multi-head self-attention mechanism, comprising the steps of:
S221, for each semantic-information vector x there is a corresponding piece of timestamp information t; adding t into the word-vector semantics yields a new feature matrix Y(x, t);
S222, computing a similarity coefficient between each sentence and its adjacent sentences and normalizing it to obtain the attention coefficients;
S223, fusing the neighborhood information according to the obtained attention coefficients and aggregating it with the sentence's original features to form new features;
S224, introducing a multi-head self-attention mechanism that uses several shared parameter matrices W simultaneously and merges the several newly generated sentence representations into one, obtaining the time-stamped information encoding.
6. The federated-learning method for real-time online prediction and updating of expert research interests according to claim 1, wherein in step S30, the extracted hidden elements with similar features are matched and averaged with a federated matched-averaging algorithm, and a shared global model is built layer by layer.
7. The federated-learning method for real-time online prediction and updating of expert research interests according to claim 6, wherein in each iteration a corresponding global model is found from the given weight-matrix estimate and then matched against the local neurons on the clients with the Hungarian algorithm, yielding a new, expanded global model, by the steps of:
S321, before averaging the parameters of the neural networks, finding the parameter permutation and merging the weights of neurons with similar features;
S322, given an increasing function, optimizing the federated averaging algorithm to obtain the optimized matched-averaging formulation
$$\min_{\{\pi_{li}^{j}\}} \sum_{i=1}^{L'} \sum_{j,l} \pi_{li}^{j}\, C_{li}^{j}, \qquad \text{s.t. } \sum_{i} \pi_{li}^{j} = 1 \;\; \forall j, l; \quad \sum_{l} \pi_{li}^{j} \le 1 \;\; \forall i, j,$$
with the matching cost
$$C_{li}^{j} = \begin{cases} c(w_{jl}, \theta_i), & i \le L, \\ \sigma + f(i), & L < i \le L', \end{cases}$$
where L is the number of hidden-layer nodes (L' after expansion), $c(w_{jl}, \theta_i)$ denotes the similarity between the l-th neuron learned on client j and the i-th neuron of the global server model, f(i) is an increasing function, σ is a threshold, and $\pi_{li}^{j}$ is the parameter assignment (permutation);
S323, given the weights $W_{j,1}, W_{j,2}$ provided by the J clients, computing the weights of the federated neural network as the permutation-aligned average
$$W_h = \frac{1}{J} \sum_{j=1}^{J} \Pi_j^{\top} W_{j,h}, \quad h \in \{1, 2\},$$
where $\Pi_j$ is the matching (permutation) matrix of client j;
S324, matching layer by layer and finally aggregating to obtain the model-update parameters.
8. The method of claim 7, wherein if the cost of the optimal match exceeds a set threshold, no match is made, and a new global neuron is created for the corresponding local neuron.
9. The federated-learning method for real-time online prediction and updating of expert research interests according to claim 7 or 8, wherein the layer-wise matching scheme comprises the steps of:
S331, the global server collects only the first-layer weights from the clients and performs the single-layer matching described above to obtain the first-layer weights of the federated model;
S332, the global server broadcasts the obtained first-layer federated weights to the clients, and each client continues training all subsequent layers on its own dataset while keeping the federated matched layer frozen;
S333, this procedure is repeated up to the last layer; for the last layer, a weighted average is taken over the clients' data points according to their class proportions to obtain the final parameters.
10. The federated-learning method for real-time online prediction and updating of expert research interests according to claim 7, wherein a communication mechanism is added to the federated matched-averaging algorithm, comprising the step of:
at the start of each new round, the local client receives the matched global model and, based on the previous round's matching result, reconstructs a local model of the same size as the original local model, so that the size of the global model is kept within a small range.
CN202210653137.1A 2022-06-09 2022-06-09 Method for updating expert research interest of federated learning through real-time online prediction Pending CN114969078A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210653137.1A CN114969078A (en) 2022-06-09 2022-06-09 Method for updating expert research interest of federated learning through real-time online prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210653137.1A CN114969078A (en) 2022-06-09 2022-06-09 Method for updating expert research interest of federated learning through real-time online prediction

Publications (1)

Publication Number Publication Date
CN114969078A 2022-08-30

Family

ID=82961790

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210653137.1A Pending CN114969078A (en) 2022-06-09 2022-06-09 Method for updating expert research interest of federated learning through real-time online prediction

Country Status (1)

Country Link
CN (1) CN114969078A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117494191A (en) * 2023-10-17 2024-02-02 南昌大学 Point-of-interest micro-service system and method for information physical security
CN117350373A (en) * 2023-11-30 2024-01-05 艾迪恩(山东)科技有限公司 Personalized federal aggregation algorithm based on local self-attention mechanism
CN117350373B (en) * 2023-11-30 2024-03-01 艾迪恩(山东)科技有限公司 Personalized federal aggregation algorithm based on local self-attention mechanism

Similar Documents

Publication Publication Date Title
CN111414461B (en) Intelligent question-answering method and system fusing knowledge base and user modeling
CN114969078A (en) Method for updating expert research interest of federated learning through real-time online prediction
CN112967088A (en) Marketing activity prediction model structure and prediction method based on knowledge distillation
Pérez et al. Data augmentation through multivariate scenario forecasting in Data Centers using Generative Adversarial Networks
CN111274790A (en) Chapter-level event embedding method and device based on syntactic dependency graph
CN113065649A (en) Complex network topology graph representation learning method, prediction method and server
CN112257841A (en) Data processing method, device and equipment in graph neural network and storage medium
CN112884045B (en) Classification method of random edge deletion embedded model based on multiple visual angles
CN112988917A (en) Entity alignment method based on multiple entity contexts
CN113344615A (en) Marketing activity prediction method based on GBDT and DL fusion model
CN112417289A (en) Information intelligent recommendation method based on deep clustering
Liu et al. Research and citation analysis of data mining technology based on Bayes algorithm
CN116386899A (en) Graph learning-based medicine disease association relation prediction method and related equipment
CN117131933A (en) Multi-mode knowledge graph establishing method and application
Xu et al. Improving knowledge tracing via a heterogeneous information network enhanced by student interactions
Chen et al. Feature extraction method of 3D art creation based on deep learning
CN116541593B (en) Course recommendation method based on hypergraph neural network
CN115588487B (en) Medical image data set manufacturing method based on federal learning and antagonism network generation
Zhang [Retracted] Analysis of College Students’ Network Moral Behavior by the History of Ideological and Political Education under Deep Learning
Zhang et al. Probabilistic matrix factorization recommendation of self-attention mechanism convolutional neural networks with item auxiliary information
CN115063251A (en) Social communication propagation dynamic network representation method based on relationship strength and feedback mechanism
Liang et al. [Retracted] Painting Classification in Art Teaching under Machine Learning from the Perspective of Emotional Semantic Analysis
CN109299291A (en) A kind of Ask-Answer Community label recommendation method based on convolutional neural networks
Shen et al. Online teaching course recommendation based on autoencoder
CN114970684A (en) Community detection method for extracting network core structure by combining VAE

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination