CN110851491A

CN110851491A - Network link prediction method based on multiple semantic influences of multiple neighbor nodes

Info

Publication number: CN110851491A
Application number: CN201910985752.0A
Authority: CN
Inventors: 王博; 宋美贤; 胡清华
Original assignee: Tianjin University
Current assignee: Tianjin University
Priority date: 2019-10-17
Filing date: 2019-10-17
Publication date: 2020-02-28
Anticipated expiration: 2039-10-17
Also published as: CN110851491B

Abstract

The invention discloses a network link prediction method based on multiple semantic influences of multiple neighbor nodes. It relates to data mining and topological structure analysis and addresses a research problem in the field of social computing. The method comprises three steps. Data analysis: the interest features of the nodes and the structural features of the network are extracted from node behavior data and node relationship data in the social network. Model training: the model combines the multiple semantic influences of multiple neighbor nodes to obtain an embedding vector for each node. Prediction analysis: the similarity between the embedding vectors of a node pair is used to measure the probability that a friend link exists between them. Rather than assigning each neighbor a constant influence score, the invention models the specific semantic influence of each neighbor on a node. The method jointly models the local-level and global-level semantic influence of neighbor nodes during network embedding training, and trains for each node a joint embedding vector based on the semantic influence of all its neighbor nodes.

Description

Network link prediction method based on multiple semantic influences of multiple neighbor nodes

Technical Field

The invention relates to data mining and topological structure analysis, and belongs to a research problem in the field of social computing. A network link prediction method combining multiple semantic effects of multiple neighbor nodes is provided.

Background

Among the many tasks in social networks, link prediction is of great importance. This task includes two problems: the first is to infer social links that may be generated in the social network in the future, and the other is to recreate existing links that are missing from the current snapshot of the social network. The aim of the invention is to solve the latter, i.e. to reconstruct missing links in a social network.

To implement link prediction, conventional methods widely use the topology information of the network; these are referred to as topology-based methods. Topology-based link prediction considers only the structural information of the social network. Inspired by Network Embedding (NE) technology, a large number of topology-based models have been proposed in recent years for learning node embedding vectors, which are then used for link prediction. For example, DeepWalk [1] treats the node sequences obtained by random walks as sentences and learns embedding vector representations of the nodes with the Skip-Gram method. Topology-based methods ignore node attributes that are actually useful for link prediction. By jointly modeling topological and semantic information, hybrid approaches can provide better performance. For example, TADW [2] improves the matrix-factorization form of DeepWalk by incorporating text information.

The invention predicts the probability of a social link between two people by embedding different types of attributes into a unified space and calculating the similarity of the embedding vectors. The idea of predicting connections from similarity is closely related to the theory of homophily in sociology. To explain the similarity between individuals in social networks, the theory of homophily proposes two principles: selection and influence. The selection principle explains the similarity between connected people by assuming that people tend to connect with others who are similar to them, while the influence principle assumes that similarity stems from people becoming more similar to their friends over time. Compared with the influence principle, the selection principle is more intuitive and is widely applied in current link prediction research: people tend to select friends who are similar to themselves in structural or semantic attributes.

However, influence also plays an important role in establishing social connections. The theory of homophily in sociology indicates that people in existing relationships influence each other. Through this influence, a person's neighbors affect that person's selection of new friends. Psychological studies also support the joint role of influence and selection in human choice behavior. In psychology, the difference between influence and selection can be understood in terms of two causes, intrinsic and extrinsic motivation, which together drive choice behavior [3]. Intrinsic motivation is determined by a person's own interests, whereas extrinsic motivation comes from external influence.

In the invention, the influence of neighbors is introduced into the link prediction task. This raises two main challenges:

(1) User nodes in a social network often have different influences on different neighbor nodes. In conventional methods, however, a user node has only a single constant influence score, so the differences in its influence on the different neighbor nodes around it cannot be captured. Thus, to know how a given user node is affected by different neighbor nodes while social links are being established, the pairwise influence between friend nodes must be modeled.

(2) The influence between people in a relationship is usually semantic, for example research interests or political standpoints. Such semantics may exist at different linguistic levels. On the one hand, local-level semantic influence describes the influence between two user nodes on the semantics of certain specific terms. On the other hand, global-level semantic influence refers to the semantic influence of the overall interests of neighbor nodes.

References

[1] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. DeepWalk: online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, New York, NY, USA, August 24-27, 2014. 701–710.

[2] Cheng Yang, Zhiyuan Liu, Deli Zhao, Maosong Sun, and Edward Y. Chang. 2015. Network Representation Learning with Rich Text Information. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015. 2111–2117.

[3] Richard M. Ryan and Edward L. Deci. 2000. Intrinsic and Extrinsic Motivations: Classic Definitions and New Directions. Contemporary Educational Psychology 25, 1 (2000), 54–67.

Disclosure of Invention

The invention designs a network link prediction method combining multiple semantic influences of multiple neighbor nodes. Namely, link prediction is carried out based on a network embedding method, and each neighbor node of each node has multi-level semantic influence. The invention aims to predict the probability of friend links between certain node pairs based on relevant topological information and interest text information.

The invention provides a network link prediction method based on multiple semantic influences of multiple neighbor nodes, which comprises the following steps:

Step one, data analysis: node behavior data and relationship data between nodes in the social network are analyzed; the relevant attribute vectors are derived from the interest attributes and the friend attributes of the nodes, respectively, yielding the node interest features and the network structure features.

Step two, model training: a model is constructed for obtaining the embedding vectors of nodes in the social network; based on the node interest features and network structure features obtained in the data analysis step, the model models the multiple semantic influences of multiple neighbor nodes to obtain an embedding vector for each node.

Step three, prediction analysis: the similarity between the embedding vectors of a node pair is used to measure the probability that a friend link exists between the corresponding nodes.

Furthermore, in step one of the method, the social network is denoted as G = (N, E, S), and the nodes in the social network all have text attributes that imply interest information. Here N = {u_1, u_2, ..., u_n} is the node set of the social network, E is the set of friend links in the social network, and S is the set of text attributes of the nodes. The text attribute of node u_i is represented as a word sequence S_i = (w_1, w_2, ..., w_n), where w_t is the t-th word in S_i.

In step two of the method, the training objective is to obtain the network embedding matrix V = [v_1, v_2, ..., v_n], which is formed by combining the embedding vectors of all nodes, where v_i is the embedding vector of node u_i. To train the embedding vector of each node in the network, the sum of the probabilities of all known edges is maximized, that is, the per-edge objective L(e) is summed over all known edges e in E. L(e) combines a topology-based objective function L_T(e) and an influence-based objective function L_I(e), and the topology-based and influence-based embeddings are mapped into the same representation space:

L(e) = αL_T(e) + (1-α)L_I(e)

In the topology-based objective function L_T(e) and the influence-based objective function L_I(e), w_ij is the weight of an edge in the social network and represents the strength or polarity of the friend relationship, which makes the invention applicable to various networks.

When the influence-based embedding vector of a node is obtained during model training, the semantic influence of each neighbor of the node is modeled from the semantics of that neighbor and the interest text of the node. The semantic influence is modeled at the local and global levels separately, and the two are merged into a joint influence-based embedding vector. Local-level semantic influence captures the semantic influence of local regions of the text, which can be interpreted through particular terms in the interest text. Global-level semantic influence captures the influence caused by the overall interests of the neighbor, that is, the semantics described by the interest text as a whole.

The influence-based embeddings of all neighbors on node u_i are averaged to generate the final influence-based embedding vector of u_i: the m per-neighbor embeddings are summed and divided by m, where m denotes the number of neighbor nodes of u_i and each term is the influence-based embedding of a neighbor node u_k on node u_i. The influence-based embedding of neighbor node u_k on node u_i is obtained by concatenating its local-level semantic influence embedding and its global-level semantic influence embedding.

In step two, the training of the embedding vector based on local-level semantic influence relies on a Convolutional Neural Network (CNN) and an attention mechanism. The training comprises: obtaining the text information sequences S_i, S_k of a pair of friend nodes u_i, u_k, and passing them through a lookup layer, a convolutional layer, an attention layer and an output layer to obtain the final embedding vector based on local-level semantic influence.

From the text information sequence S_i, the lookup layer produces a text embedding matrix X = [x_1, x_2, ..., x_n]; the following convolution formula then yields a local feature matrix C^(i) = [c_1, c_2, ..., c_{n-h+1}]:

c_i = f(W_c x_{i:i+h-1} + b)

In the same manner, the local feature matrices C^(i), C^(k) of nodes u_i, u_k are obtained.

An attention mechanism couples the local semantic relevance of a pair of friend nodes and generates an attention vector for each of the two local feature matrices, so that local semantic information from a neighbor node directly influences the embedding vector of the node. To generate the attention vectors, a semantic matching matrix M for local semantic influence is first constructed from the local feature matrices C^(i), C^(k), the goal being to capture semantic matching signals; M_xy denotes the element in row x and column y of M. Mean pooling and a softmax operation are then applied to M to generate the attention vectors:

a^(i) = softmax(mean_row(M))

a^(k) = softmax(mean_col(M))

where a^(i), a^(k) are the attention vectors of the local feature matrices C^(i), C^(k) respectively, and mean_row(·) and mean_col(·) denote mean pooling of the matrix along the row and column directions, respectively.

The embedding vector of node u_k on node u_i based on local-level semantic influence is then calculated, and the embedding vector of node u_i on node u_k based on local-level semantic influence is calculated in the same manner.

In step two, the training of the embedding vector based on global-level semantic influence uses a bidirectional GRU (Bi-GRU) model to obtain the global semantic influence, as follows:

Given a node u_i, its text embedding matrix X is first obtained. The t-th hidden state of the Gated Recurrent Unit (GRU) model is calculated with the reset gate r_t and the update gate z_t:

r_t = σ(W_xr x_t + W_hr h_{t-1})

z_t = σ(W_xz x_t + W_hz h_{t-1})

The forward hidden state and the backward hidden state of node u_i are obtained and concatenated to give the hidden-layer context state of the Bi-GRU model. Mean pooling is applied over all historical hidden states, and the pooled vector is mapped to the corresponding dimension by a projection matrix. The resulting vector is the embedding vector of node u_k on node u_i based on global-level semantic influence. The embedding vector of node u_i on node u_k based on global-level semantic influence is calculated in the same manner.

In step two, model optimization is performed while training the embedding vector of each node in the network, as follows:

The original objective function is accelerated with a negative sampling algorithm; that is, a negative-sampling objective function is specified for each known edge (u_i, u_k), where K denotes the number of negative sampled edges and σ(·) denotes the sigmoid function.

In step three of the method, the similarity between the embedding vectors of a node pair is used to measure the probability that a friend link exists between the corresponding nodes. During prediction analysis, the probability that nodes u_i and u_j in the social network form a link e_ij is computed from v_i and v_j, the embedding vectors of nodes u_i and u_j respectively. The embedding vector of each node is a combination of the node's topology-based embedding vector and its influence-based embedding vector.

Compared with the prior art, the invention has the following advantages:

(1) In the method of the invention, a joint embedding vector that carries the semantic influence of a user's neighbors is trained for each user from the observed neighbor relations and the text attributes of the users. Rather than using a constant influence score for a neighbor, the invention models the specific influence of each neighbor on the user, based on the text attributes of the neighbor and of the user. Finally, for any pair of nodes that are not connected in the current network, the missing link between them is predicted by calculating the similarity between their embedding vectors.

(2) In the invention, the local-level and global-level semantic influence of neighbor nodes are modeled jointly during network embedding training. Because semantic influence is modeled at multiple levels, the semantic influence relation between pairs of friend users is modeled more fully, which can improve the accuracy and robustness of link prediction.

Drawings

FIG. 1 is a schematic diagram of the network link prediction based on multiple semantic effects of multiple neighboring nodes according to the present invention;

FIG. 2 is a diagram of a network link prediction framework based on multiple semantic effects of multiple neighboring nodes according to the present invention.

FIG. 3 is a framework diagram of the module for modeling multiple semantic influences in step two of the present invention.

Detailed Description

The technical solutions of the present invention are further described in detail with reference to the accompanying drawings and specific embodiments, which are only illustrative of the present invention and are not intended to limit the present invention.

The network link prediction method based on multiple semantic influences of multiple neighbor nodes comprises the following three steps: data analysis, model training, and prediction analysis.

1. Data analysis: user behavior data and user relationship data in the social network are analyzed; the relevant attribute vectors are derived from the interest attributes and the friend attributes of the users, respectively, yielding the interest features of the nodes and the network structure features. The social network is represented as G = (N, E, S), and the nodes in the social network all have text attributes that imply interest information. Here N = {u_1, u_2, ..., u_n} is the node set of the social network, E is the set of friend links in the social network, and S is the set of text attributes of the nodes. The text attribute of node u_i is represented as a word sequence S_i = (w_1, w_2, ..., w_n), where w_t is the t-th word in S_i.
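As a minimal sketch of the data representation described above, the structure G = (N, E, S) could be held in memory as follows. The class name SocialNetwork, the helper build_network, and the lowercase whitespace tokenization are illustrative assumptions; the text does not specify how the text attributes are tokenized.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SocialNetwork:
    """G = (N, E, S): node set, friend links, and one word sequence per node."""
    nodes: list[str]
    edges: set[frozenset[str]] = field(default_factory=set)
    texts: dict[str, list[str]] = field(default_factory=dict)

    def add_link(self, u: str, v: str) -> None:
        # Friend links are undirected, so each one is stored as an unordered pair.
        self.edges.add(frozenset((u, v)))

    def neighbors(self, u: str) -> list[str]:
        # All nodes that share a friend link with u.
        return sorted(w for e in self.edges if u in e for w in e if w != u)


def build_network(raw_texts: dict[str, str],
                  links: list[tuple[str, str]]) -> SocialNetwork:
    """Build G from raw interest texts and a list of friend links."""
    g = SocialNetwork(nodes=sorted(raw_texts))
    for node, text in raw_texts.items():
        # S_i = (w_1, ..., w_n): assumed tokenization is lowercase + whitespace split.
        g.texts[node] = text.lower().split()
    for u, v in links:
        g.add_link(u, v)
    return g
```

Storing each link as a frozenset keeps the edge set free of duplicate (u, v) / (v, u) entries, which matters later when candidate pairs are checked against the known links.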

2. Model training: a model is constructed for obtaining the embedding vectors of nodes in the social network. Based on the node interest features and network structure features obtained in the data analysis step, the model models the multiple semantic influences of multiple neighbor nodes to obtain an embedding vector for each node. The training objective is to obtain the network embedding matrix V = [v_1, v_2, ..., v_n], which is formed by combining the embedding vectors of all nodes, where v_i is the embedding vector of node u_i. To train the embedding vector of each node in the network, the sum of the probabilities of all known edges is maximized, that is, the per-edge objective L(e) is summed over all known edges e in E.

L(e) combines a topology-based objective function L_T(e) and an influence-based objective function L_I(e), and the topology-based and influence-based embeddings are mapped into the same representation space, as shown in the following formula:

L(e) = αL_T(e) + (1-α)L_I(e) (2)

In the topology-based objective function L_T(e) and the influence-based objective function L_I(e), w_ij is the weight of an edge in the social network and represents the strength or polarity of the friend relationship, which makes the invention applicable to various networks.

Further, when the influence-based embedding vector of a node is obtained during training, the influence of each neighbor of the node is considered (an example of influence in a social network is shown in FIG. 1). The influence is modeled from the semantics of each neighbor and the interest text of the node. Semantic influence is modeled at the local and global levels separately, and the two are merged into a joint influence-based embedding vector (shown in the left and middle parts of FIG. 2). Local-level semantic influence captures specific semantic influence that can be interpreted through the semantics of particular terms in the interest text, while global-level semantic influence captures the semantic influence of the neighbor's interest text as a whole (the semantic influence modeling process is shown in FIG. 3).

The influence-based embeddings of all neighbors on node u_i are averaged to generate the final influence-based embedding vector of u_i. The influence-based embedding of a neighbor u_k on u_i is obtained by concatenating its local-level semantic influence embedding and its global-level semantic influence embedding.
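A minimal sketch of this aggregation, assuming plain Python lists as vectors: each neighbor's local-level and global-level influence embeddings are concatenated, and the resulting per-neighbor vectors are averaged over the m neighbors. The function names are hypothetical and not taken from the source.

```python
def concat_influence(local_vec: list[float], global_vec: list[float]) -> list[float]:
    """Influence-based embedding of one neighbor: local part followed by global part."""
    return list(local_vec) + list(global_vec)


def mean_influence(per_neighbor: list[list[float]]) -> list[float]:
    """Average the influence-based embeddings of all m neighbors of a node."""
    m = len(per_neighbor)
    if m == 0:
        raise ValueError("the node has no neighbors to average over")
    dim = len(per_neighbor[0])
    if any(len(vec) != dim for vec in per_neighbor):
        raise ValueError("all per-neighbor embeddings must share one dimension")
    return [sum(vec[d] for vec in per_neighbor) / m for d in range(dim)]


# Two neighbors, each with a 2-d local and a 1-d global influence embedding.
per_neighbor = [
    concat_influence([1.0, 0.0], [2.0]),
    concat_influence([3.0, 2.0], [4.0]),
]
influence_embedding = mean_influence(per_neighbor)
```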

Next, the details of the training of the influence-based embedding vector will be described.

2.1 Embedded vectors based on local level semantic impact

The embedding vector based on local-level semantic influence is obtained mainly with a convolutional neural network and an attention mechanism. Given the text information sequences S_i, S_k of a pair of friend nodes u_i, u_k, the final embedding vector based on local-level semantic influence is obtained through a lookup layer, a convolutional layer, an attention layer and an output layer.

From the text information sequence S_i, the lookup layer produces a text embedding matrix X = [x_1, x_2, ..., x_n]; the following convolution formula then yields a local feature matrix C^(i) = [c_1, c_2, ..., c_{n-h+1}]:

c_i = f(W_c x_{i:i+h-1} + b) (4)
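The convolution c_i = f(W_c x_{i:i+h-1} + b) can be sketched in plain Python as below. Two points are assumptions rather than statements from the text: the window x_{i:i+h-1} is taken to be the concatenation of h consecutive word vectors, and the activation f defaults to tanh because the text does not name it.

```python
import math
from typing import Callable


def conv_features(X: list[list[float]],
                  W_c: list[list[float]],
                  b: list[float],
                  h: int,
                  f: Callable[[float], float] = math.tanh) -> list[list[float]]:
    """Slide a window of h word vectors over X and return [c_1, ..., c_{n-h+1}]."""
    n = len(X)
    features = []
    for i in range(n - h + 1):
        # x_{i:i+h-1}: the h word vectors in the window, flattened into one vector.
        window = [value for x in X[i:i + h] for value in x]
        c_i = [
            f(sum(w * v for w, v in zip(row, window)) + bias)
            for row, bias in zip(W_c, b)
        ]
        features.append(c_i)
    return features


# Four 1-d word vectors, window size h = 2, one output channel, identity activation.
X = [[1.0], [2.0], [3.0], [4.0]]
C = conv_features(X, W_c=[[1.0, 1.0]], b=[0.0], h=2, f=lambda z: z)
```

With n = 4 words and h = 2 the result has n - h + 1 = 3 feature vectors, matching the size of the local feature matrix given above.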

An attention mechanism couples the local semantic relevance of a pair of friend nodes and generates an attention vector for each of the two local feature matrices, so that local semantic information from a friend node can directly influence the embedding vector of the node.

To obtain the attention vectors, a semantic matching matrix M for local semantic influence is first constructed from the local feature matrices C^(i), C^(k), the goal being to capture semantic matching signals. M_xy denotes the element in row x and column y of M.

Mean pooling and a softmax operation are then applied to the semantic matching matrix M to generate the attention vectors:

a^(i) = softmax(mean_row(M)) (6)

a^(k) = softmax(mean_col(M)) (7)

where a^(i), a^(k) are the attention vectors of the local feature matrices C^(i), C^(k) respectively, and mean_row(·) and mean_col(·) denote mean pooling of the matrix along the row and column directions, respectively.
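Equations (6) and (7) can be sketched as follows. The sketch assumes that the rows of M index the local features of C^(i) and the columns index those of C^(k), so that mean_row averages each row and mean_col averages each column; how M itself is built is not covered here.

```python
import math


def softmax(values: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mean_row(M: list[list[float]]) -> list[float]:
    """One value per row of M: the mean of that row."""
    return [sum(row) / len(row) for row in M]


def mean_col(M: list[list[float]]) -> list[float]:
    """One value per column of M: the mean of that column."""
    rows = len(M)
    return [sum(M[r][c] for r in range(rows)) / rows for c in range(len(M[0]))]


def attention_vectors(M: list[list[float]]) -> tuple[list[float], list[float]]:
    """Return (a_i, a_k) = (softmax(mean_row(M)), softmax(mean_col(M)))."""
    return softmax(mean_row(M)), softmax(mean_col(M))


# A 2 x 2 matching matrix in which the first feature of u_i matches more strongly.
a_i, a_k = attention_vectors([[1.0, 1.0], [0.0, 0.0]])
```

Each attention vector sums to 1, so it can be read as a weighting over the local features of the corresponding node.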

Taking the node pair u_i, u_k as an example, the embedding vector for the local-level semantic influence of node u_k on node u_i is then calculated. The embedding vector for the local-level semantic influence of node u_i on node u_k is calculated in the same way.

2.2 Embedded vectors based on Global level semantic impact

Bidirectional GRU (Bi-GRU) models are commonly used to capture global-level semantics and have been successfully applied to various NLP tasks. A Bi-GRU models context dependencies with a forward GRU and a backward GRU. Two hidden representations are thus obtained, and the forward and backward hidden states of each word are then concatenated. Given a node u_i, its text embedding matrix X is first obtained; the t-th hidden state of the GRU model is calculated with the reset gate r_t and the update gate z_t:

r_t = σ(W_xr x_t + W_hr h_{t-1}) (9)

z_t = σ(W_xz x_t + W_hz h_{t-1}) (10)
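The two gate equations can be sketched directly, with the recurrent weights written as W_hr and W_hz. Only the reset gate r_t and update gate z_t are computed, since those are the two formulas given; the helper names sigmoid, matvec and gru_gates are illustrative, and the gates carry no bias term because none appears in the equations.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def matvec(W: list[list[float]], v: list[float]) -> list[float]:
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_gates(x_t, h_prev, W_xr, W_hr, W_xz, W_hz):
    """Return (r_t, z_t) for one time step of a GRU."""
    r_t = [sigmoid(a + b) for a, b in zip(matvec(W_xr, x_t), matvec(W_hr, h_prev))]
    z_t = [sigmoid(a + b) for a, b in zip(matvec(W_xz, x_t), matvec(W_hz, h_prev))]
    return r_t, z_t


# One hidden unit, a 2-d word vector, and hand-picked weights.
r_t, z_t = gru_gates(
    x_t=[1.0, 2.0], h_prev=[0.5],
    W_xr=[[0.0, 0.0]], W_hr=[[0.0]],
    W_xz=[[1.0, 0.0]], W_hz=[[0.0]],
)
```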

The forward hidden state and the backward hidden state of node u_i are obtained, and the two are concatenated to give the hidden-layer context state of the Bi-GRU model.

In the present invention, instead of simply using the hidden state at the final step as the global semantics, mean pooling is applied over all historical hidden states.

To match the dimension of the pooled vector to the target dimension, the vector is mapped to the corresponding dimension by a projection matrix. The resulting vector is the embedding vector of node u_k on node u_i based on global-level semantic influence. The embedding vector of node u_i on node u_k based on global-level semantic influence is calculated by the same process.
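The pooling and projection steps can be sketched as below, taking the forward and backward hidden states as given inputs. The projection is assumed to be a plain linear map without bias or activation, which the text does not confirm; the function name and argument names are hypothetical.

```python
def global_influence_vector(forward_states: list[list[float]],
                            backward_states: list[list[float]],
                            W_proj: list[list[float]]) -> list[float]:
    """Concatenate per-word hidden states, mean-pool over words, then project."""
    if len(forward_states) != len(backward_states) or not forward_states:
        raise ValueError("need one forward and one backward state per word")
    # Hidden-layer context state of each word: forward state followed by backward state.
    context = [fwd + bwd for fwd, bwd in zip(forward_states, backward_states)]
    n = len(context)
    # Mean pooling over all historical hidden states, not just the final one.
    pooled = [sum(h[d] for h in context) / n for d in range(len(context[0]))]
    # Map the pooled vector to the target dimension with the projection matrix.
    return [sum(w * p for w, p in zip(row, pooled)) for row in W_proj]


# Two words, 1-d forward and backward states, projected from 2 to 3 dimensions.
vector = global_influence_vector(
    forward_states=[[1.0], [3.0]],
    backward_states=[[2.0], [4.0]],
    W_proj=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
)
```

Pooling over every position lets words early in the interest text contribute as much as the last ones, which is the stated reason for not using only the final hidden state.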

2.3 model optimization

The present invention aims to maximize the conditional probability of each known edge (u_i, u_k). To reduce the computational cost, the original objective function is accelerated with a negative sampling algorithm; that is, a negative-sampling objective function is specified for each known edge (u_i, u_k), where K denotes the number of negative sampled edges and σ(·) denotes the sigmoid function.
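The exact per-edge objective is not reproduced in the text. As an illustration only, the sketch below uses the widely used skip-gram negative-sampling form, log σ(v_i · v_k) plus the sum over K negative samples of log σ(-v_i · v_j); treating this as the patent's objective is an assumption, as is the way the negative samples are supplied.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log of the logistic function."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def edge_objective(v_i: list[float],
                   v_k: list[float],
                   negatives: list[list[float]]) -> float:
    """Assumed negative-sampling objective for one known edge (u_i, u_k).

    `negatives` holds the embeddings of the K sampled non-neighbor nodes.
    """
    positive_term = log_sigmoid(dot(v_i, v_k))
    negative_term = sum(log_sigmoid(-dot(v_i, v_j)) for v_j in negatives)
    return positive_term + negative_term


# The objective is larger when the true neighbor is similar and the negative is not.
good = edge_objective([1.0, 0.0], [1.0, 0.0], negatives=[[-1.0, 0.0]])
bad = edge_objective([1.0, 0.0], [-1.0, 0.0], negatives=[[1.0, 0.0]])
```

Only K sampled nodes enter the sum, so the cost per edge does not grow with the size of the network, which is the reason for using negative sampling.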

3. Prediction analysis: the similarity between the embedding vectors of a node pair is used to measure the probability that a friend link exists between the corresponding nodes.

The probability is measured from the similarity between the embedding vectors of a pair of user nodes, and link prediction is performed accordingly (as shown in the right part of FIG. 2). For example, the probability that nodes u_i and u_j in the social network form a link e_ij is computed from v_i and v_j, the embedding vectors of nodes u_i and u_j respectively. Each user's embedding vector is a combination of the user's topology-based embedding vector and influence-based embedding vector.
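The probability formula itself is not reproduced in the text. As a hedged illustration of similarity-based prediction, the sketch below scores a node pair with the sigmoid of the inner product of the two embedding vectors and ranks all pairs that are not yet linked. Both the scoring function and the helper names are assumptions, not the patent's stated formula.

```python
import math
from itertools import combinations


def link_probability(v_i: list[float], v_j: list[float]) -> float:
    """Assumed similarity score: sigmoid of the inner product of two embeddings."""
    score = sum(a * b for a, b in zip(v_i, v_j))
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def rank_missing_links(embeddings: dict[str, list[float]],
                       known_edges: list[tuple[str, str]]):
    """Score every unlinked node pair and return them from most to least likely."""
    known = {frozenset(edge) for edge in known_edges}
    candidates = [
        (link_probability(embeddings[a], embeddings[b]), a, b)
        for a, b in combinations(sorted(embeddings), 2)
        if frozenset((a, b)) not in known
    ]
    return sorted(candidates, reverse=True)


embeddings = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [-1.0, 0.0]}
ranked = rank_missing_links(embeddings, known_edges=[("a", "c")])
```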

The experiments of the invention use four social network data sets of types widely used in related research: the Cora citation network, the HepTh citation network, the Twitter social network, and the Coauthorship collaboration network. The diversity of the data sets helps to verify the robustness of the invention. Table 1 summarizes the relevant information for the four data sets.

Table 1 data set information statistics

Through a link prediction algorithm, similarity scores between embedded vectors of each pair of nodes in the network can be obtained after prediction work. Although the higher the similarity score is, the higher the possibility that a link exists between nodes is, a corresponding evaluation index is also required to evaluate the feasibility and the accuracy of the link prediction algorithm. To test the accuracy of the algorithm, the link edges in the network are typically divided into a test set and a training set, and the edges in the test set and the edges in the network that are not present are referred to as unknown edges. After calculation by the link prediction algorithm, each unknown edge has a similarity score, and the higher the score is, the higher the possibility that the edge exists is.

The most commonly used index for evaluating the accuracy of link prediction algorithms is AUC. AUC is the area under the ROC curve and is often used in signal detection theory to evaluate a classifier. The traditional AUC measure requires plotting the ROC curve and calculating the area under it. When AUC is used to evaluate a link prediction algorithm, it can be understood as the probability that a randomly selected edge from the test set receives a higher score than a randomly selected nonexistent edge.

To compute the AUC index, one edge is drawn from the test set and one from the nonexistent edges each time. If the score of the nonexistent edge is lower than the score of the test-set edge, 1 point is added; if the two scores are equal, 0.5 points are added. After n independent comparisons, if 1 point was added n' times and 0.5 points were added n'' times, the AUC value is defined as:

AUC = (n' + 0.5n'') / n
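The sampling procedure for AUC, in which n' comparisons score 1 point, n'' comparisons score 0.5 points, and AUC = (n' + 0.5n'') / n, can be sketched as follows. The function name, the fixed random seed, and sampling with replacement are illustrative choices rather than details taken from the text.

```python
import random


def sampled_auc(test_scores: list[float],
                nonexistent_scores: list[float],
                n: int,
                seed: int = 0) -> float:
    """Estimate AUC from n independent comparisons of a test edge vs a nonexistent edge."""
    rng = random.Random(seed)
    wins = 0   # n': the test-set edge scored strictly higher
    ties = 0   # n'': both edges received the same score
    for _ in range(n):
        test_score = rng.choice(test_scores)
        nonexistent_score = rng.choice(nonexistent_scores)
        if nonexistent_score < test_score:
            wins += 1
        elif nonexistent_score == test_score:
            ties += 1
    return (wins + 0.5 * ties) / n


perfect = sampled_auc([0.9, 0.8], [0.1, 0.2], n=100)
chance = sampled_auc([0.5], [0.5], n=10)
```

A predictor that scores edges at random yields a value near 0.5, so the amount by which the AUC exceeds 0.5 indicates how much better the method is than chance.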

When a data set is split into a training set and a test set, subsets of different proportions, namely 20%, 40%, 60% and 80%, are randomly selected from the data set to form the training network. For each training proportion, the embedding vectors are first trained on the training set. The remaining instances are then used as the test set for evaluating the performance of the link prediction method. Tables 2, 3, 4 and 5 show the experimental results of the invention on the four real data sets and compare them with the results of the existing DeepWalk and TADW models.
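A minimal sketch of such a split, assuming the instances being split are the edges of the network and that rounding to the nearest whole edge is acceptable; the function name and the seed parameter are hypothetical.

```python
import random


def split_edges(edges: list[tuple[int, int]],
                train_ratio: float,
                seed: int = 0) -> tuple[list[tuple[int, int]], list[tuple[int, int]]]:
    """Randomly keep `train_ratio` of the edges for training; the rest form the test set."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must lie strictly between 0 and 1")
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_ratio * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


edges = [(i, i + 1) for i in range(10)]
splits = {ratio: split_edges(edges, ratio) for ratio in (0.2, 0.4, 0.6, 0.8)}
```

Each proportion produces its own training network, so the embedding vectors are retrained once per proportion before the held-out edges are scored.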

TABLE 2 AUC Performance indicators based on the Coauthorship collaboration network dataset

TABLE 3 AUC Performance indicators based on Cora citation network datasets

TABLE 4 AUC Performance index based on HepTh citation network dataset

TABLE 5 AUC Performance indicators based on Twitter social network dataset

From the performance evaluation results, the invention achieves significant improvement over the baseline model under different data sets and different proportions.

The above embodiments are merely illustrative, and not restrictive, and various changes and modifications may be made by those skilled in the art without departing from the spirit and scope of the invention, and all equivalent technical solutions are intended to be included within the scope of the invention.

Claims

1. A network link prediction method based on multiple semantic influences of multiple neighbor nodes is characterized by comprising the following steps:

2. The method according to claim 1, wherein in step one, the social network is represented as G = (N, E, S), and the nodes in the social network all have text attributes that imply interest information, where N = {u_1, u_2, ..., u_n} is the node set of the social network, E is the set of friend links in the social network, and S is the set of text attributes of the nodes; the text attribute of node u_i is represented as a word sequence S_i = (w_1, w_2, ..., w_n), where w_t is the t-th word in S_i.

3. The network link prediction method based on multiple semantic influences of multiple neighbor nodes according to claim 1, wherein in step two, the training objective is to obtain the network embedding matrix V = [v_1, v_2, ..., v_n], which is formed by combining the embedding vectors of all nodes, where v_i is the embedding vector of node u_i; to train the embedding vector of each node in the network, the sum of the probabilities of all known edges is maximized, as follows:

L(e) = αL_T(e) + (1-α)L_I(e)

wherein L_T(e) is the topology-based objective function and L_I(e) is the influence-based objective function; w_ij is the weight of an edge in the social network and represents the strength or polarity of the friend relationship, which makes the invention applicable to various networks;

the influence-based embeddings of all neighbors on node u_i are averaged to generate the final influence-based embedding vector of u_i, where m denotes the number of neighbor nodes of u_i and each averaged term is the influence-based embedding of a neighbor node u_k on node u_i; the influence-based embedding of neighbor node u_k on node u_i is obtained by concatenating the local-level semantic influence embedding and the global-level semantic influence embedding.

4. The network link prediction method based on multiple semantic influences of multiple neighbor nodes according to claim 3, wherein the training of the embedding vector based on local-level semantic influence is based on a convolutional neural network and an attention mechanism; the training comprises: obtaining the text information sequences S_i, S_k of a pair of friend nodes u_i, u_k, and obtaining the final embedding vector based on local-level semantic influence through a lookup layer, a convolutional layer, an attention layer and an output layer;

c_i = f(W_c x_{i:i+h-1} + b)

wherein M is the semantic matching matrix;

a^(i) = softmax(mean_row(M))

a^(k) = softmax(mean_col(M))

5. The network link prediction method based on multiple semantic influences of multiple neighbor nodes according to claim 3, wherein the training of the embedding vector based on global-level semantic influence uses a Bi-GRU model to obtain the global semantic influence, comprising the following steps:

given a node u_i, its text embedding matrix X is first obtained, and the t-th hidden state of the GRU model is calculated with the reset gate r_t and the update gate z_t:

r_t = σ(W_xr x_t + W_hr h_{t-1})

z_t = σ(W_xz x_t + W_hz h_{t-1})

the forward hidden state and the backward hidden state of node u_i are obtained and concatenated to give the hidden-layer context state of the Bi-GRU model;

mean pooling is applied over all historical hidden states, and the pooled vector is mapped to the corresponding dimension by a projection matrix; the resulting vector is the embedding vector of node u_k on node u_i based on global-level semantic influence; in the same manner, the embedding vector of node u_i on node u_k based on global-level semantic influence is calculated.

6. The network link prediction method based on multiple semantic influences of multiple neighbor nodes according to claim 3, wherein model optimization is performed while training the embedding vector of each node in the network, the model optimization comprising the following steps:

7. The network link prediction method based on multiple semantic influences of multiple neighbor nodes according to claim 1, wherein in step three, the similarity between the embedding vectors of a node pair is used to measure the probability that a friend link exists between the corresponding nodes; during prediction analysis, the probability that nodes u_i and u_j in the social network form a link e_ij is computed from v_i and v_j, the embedding vectors of nodes u_i and u_j respectively; the embedding vector of each node is a combination of the node's topology-based embedding vector and its influence-based embedding vector.