CN115759234A

CN115759234A - Rumor source detection method based on space-time convolution neural network

Info

Publication number: CN115759234A
Application number: CN202211510609.4A
Authority: CN
Inventors: 吴锡濠; 倪秋芬; 蔡煜; 黄杰彬
Original assignee: Guangdong University of Technology
Current assignee: Guangdong University of Technology
Priority date: 2022-11-29
Filing date: 2022-11-29
Publication date: 2023-03-07

Abstract

The invention discloses a rumor source detection method based on a space-time convolution neural network, comprising the following steps: S1: constructing snapshot graphs of different batches according to different rumor propagation processes, wherein the snapshot graph of each batch comprises a plurality of snapshots taken at different time periods; S2: determining a rumor propagation model, and assigning node labels according to the infection status of the user nodes; S3: inputting the constructed multi-batch snapshot graphs into a node feature matrix processing algorithm to obtain a node feature matrix; S4: inputting a preset social network graph and the obtained node feature matrix into a space-time convolution module to extract space-time features and form a space-time feature matrix; S5: reducing the dimensionality of the extracted space-time feature matrix and aggregating its features; S6: taking the node with the highest node label value in the result of step S5 as the predicted rumor source node of the batch. The method improves the accuracy of rumor source detection and, by introducing an improved space-time convolution module, avoids the over-smoothing and overlong training time of the traditional model.

Description

Rumor source detection method based on space-time convolution neural network

Technical Field

The invention relates to the technical field of space-time convolution neural networks, in particular to a rumor source detection method based on a space-time convolution neural network.

Background

Social networks have evolved rapidly over the past decades and have dramatically changed people's lifestyles. In the past, because communication technology lagged behind and information circulated inconveniently, people were mostly passive recipients of information. Nowadays, social networks such as Twitter, Facebook, and Sina Weibo have become part of people's lives. In a social network, people can easily share information, and thanks to the rapid development of communication technology they can communicate and learn through social networks at low cost. However, the rapid development of social networks also has disadvantages: part of the information on social networks may be rumor rather than fact. Rumors and other misinformation can spread rapidly and exponentially on social networks, with devastating consequences for individuals, organizations, and even society. For rumor propagation at this scale, effective rumor source detection therefore calls for structured processing using big data analysis and artificial intelligence. In the early period of rumor propagation, a rumor infection snapshot graph is constructed according to the infection process, and the features of the snapshots are extracted and analyzed to locate the rumor source, which effectively helps network security personnel bring the rumor under control quickly.

Most conventional rumor source detection methods use a single snapshot of the rumor diffusion graph as input to estimate the rumor source node, and ignore the temporal characteristics of rumor propagation. Thanks to the development of technology, it is now possible to obtain multiple snapshots observed at different stages of rumor propagation, and these independent snapshots help to reveal the temporal characteristics of rumor propagation. Recent research has begun to produce space-time convolution models that use multiple snapshots as input. However, the traditional space-time convolution model still has defects: because its input is a single batch of multiple snapshots, the accuracy of rumor source detection is not ideal; and because its space convolution layer uses a traditional GCN, model training takes too long and over-smoothing occurs.

The prior art discloses a rumor detection method and a rumor detection device based on graph neural network feature aggregation, wherein the method comprises the following steps: acquiring a first event source text graph; inputting the first event source text graph and a training label into a preset first graph neural network model for training so as to determine a graph neural network prediction model; and inputting the first event source text graph to be detected into the graph neural network prediction model so as to carry out rumor detection on the event source text and the response tweets on the Internet. The device is used for executing the method. That scheme uses a graph neural network and discriminates rumors at text-level and word-level granularity, but it does not solve the slow training and over-smoothing problems of the traditional detection model.

Disclosure of Invention

The invention provides a rumor source detection method based on a space-time convolution neural network, which improves the accuracy of rumor source detection and at the same time solves the slow training and over-smoothing problems of the traditional detection model.

The primary objective of the present invention is to solve the above technical problems, and the technical solution of the present invention is as follows: a rumor source detection method based on a space-time convolution neural network comprises the following steps:

S1: constructing snapshot graphs of different batches according to different rumor propagation processes, wherein the snapshot graph of each batch comprises a plurality of snapshots taken at different time periods;

S2: determining a rumor propagation model, and assigning node labels according to the infection status of the user nodes;

S3: inputting the constructed multi-batch snapshot graphs into a node feature matrix processing algorithm to obtain a node feature matrix;

S4: inputting a preset social network graph and the obtained node feature matrix into a space-time convolution module to extract space-time features and form a space-time feature matrix;

S5: reducing the dimensionality of the extracted space-time feature matrix and aggregating its features;

S6: taking the node with the highest node label value in the result of step S5 as the predicted rumor source node of the batch.

Further, the snapshots of the different batches correspond to different and independent rumor-transmitted infection processes; the several snapshots at different time periods correspond to the infection status of the user nodes in the graph at different time periods of the rumor propagation process.

Further, the snapshot graphs of different batches are acquired as follows: in each batch, any node in the graph is selected as the source node, infection is then propagated iteratively multiple times, and the infection states of all nodes after each iteration are recorded as a snapshot;

the plurality of snapshots at different time periods are acquired as follows: a subset of the snapshots recorded in each batch is selected at random.

Further, determining a rumor propagation model and assigning node labels according to the infection status of the user nodes specifically comprises: selecting the SIR (Susceptible-Infected-Recovered) propagation model as the rumor propagation model and classifying the user nodes according to their infection status. Uninfected users are marked as the Susceptible (S) class, representing users who have not received the rumor; infected users are marked as the Infected (I) class, representing users who have forwarded or published the rumor; users who have recovered after infection are marked as the Recovered (R) class, representing users who, after forwarding or publishing the rumor, recognized it as a rumor and deleted it. Node labels are assigned according to the SIR class: S is assigned -1, I is assigned +1, and R is assigned -1.
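The label assignment just described can be illustrated with a minimal sketch. The dictionary name `SIR_LABELS`, the helper `assign_labels` and the example snapshot are hypothetical; only the mapping S to -1, I to +1, R to -1 comes from the text.

```python
# Label convention stated in the text: S -> -1, I -> +1, R -> -1.
SIR_LABELS = {"S": -1, "I": +1, "R": -1}


def assign_labels(states):
    """Map a list of per-node SIR states to a list of node labels."""
    return [SIR_LABELS[s] for s in states]


# Made-up snapshot of five user nodes, for illustration only.
snapshot_states = ["S", "I", "I", "R", "S"]
labels = assign_labels(snapshot_states)
print(labels)
```

Note that susceptible and recovered nodes share the label -1, so a single snapshot distinguishes only "currently infected" from "not currently infected".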

Further, the specific process of step S3 is:

taking an infection state matrix $X$, a degree matrix $D$, an adjacency matrix $A$ and a hyperparameter $\alpha$ as the input of the node feature matrix processing algorithm, wherein the hyperparameter $\alpha$ represents the percentage of label values that a node aggregates from its neighbor nodes;

constructing a relationship graph according to the connections between nodes in the rumor propagation model, and a normalized matrix $U$ from $A$ and $D$,

where $A$ is the adjacency matrix of the relationship graph and $D$ is a diagonal matrix whose $(i, i)$ element equals the sum of the $i$-th row of $A$;

setting the number of snapshot batches as $Q$, the number of snapshots as $T$ and the number of user nodes of the social network graph as $N$, and obtaining, from the node labels assigned in step S2, the node infection state matrix $X \in \mathbb{R}^{Q \times T \times N}$ of all snapshots of all batches;

since the matrix operations below apply to two-dimensional matrices, reducing $X$ to a two-dimensional matrix $X_1$ through the reshape function;

then defining matrices $P$ and $O$ and initializing both to the two-dimensional matrix $X_1$;

in $P$, elements equal to $-1$ are set to 0, and in $O$, elements equal to $+1$ are set to 0;

in order to obtain the rumor centrality feature and the source prominence feature of the nodes, four matrices $a$, $b$, $c$ and $d$ are defined: $a = X_1$, $b = (1-\alpha)(I-\alpha U)^{-1} X_1$, $c = (1-\alpha)(I-\alpha U)^{-1} P$, $d = (1-\alpha)(I-\alpha U)^{-1} O$;

where the $P$ and $O$ matrices are defined in order to obtain the $c$ and $d$ matrices. The matrix $b$ captures the source prominence feature of the nodes, and $c$ and $d$ capture the rumor centrality feature of the nodes; the four matrices $a$, $b$, $c$ and $d$ are then spliced through the concatenate function and restored through the reshape function into a 4-dimensional matrix $Y^{0}$, i.e. the node feature matrix.
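As an illustration of the four matrices a, b, c and d, the following NumPy sketch computes them for a tiny graph. It rests on two assumptions that the text does not confirm: the normalized matrix is taken to be the symmetric normalization U = D^(-1/2) A D^(-1/2), and X₁ is laid out with one row per node (N rows) so that the product with (I - αU)^(-1) is defined. The function name `node_features` and the example data are hypothetical.

```python
import numpy as np


def node_features(A, X1, alpha=0.5):
    """Stack a, b, c, d along a last axis for a label matrix X1 of shape (N, M)."""
    A = np.asarray(A, dtype=float)
    X1 = np.asarray(X1, dtype=float)
    deg = A.sum(axis=1)  # diagonal of the degree matrix D
    # ASSUMPTION: symmetric normalization U = D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    U = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    n = A.shape[0]
    # Shared propagation operator (1 - alpha) (I - alpha U)^{-1}.
    prop = (1 - alpha) * np.linalg.inv(np.eye(n) - alpha * U)
    P = np.where(X1 == -1, 0.0, X1)  # keep only the +1 labels
    O = np.where(X1 == 1, 0.0, X1)   # keep only the -1 labels
    a, b, c, d = X1, prop @ X1, prop @ P, prop @ O
    return np.stack([a, b, c, d], axis=-1)


# Path graph 0-1-2-3 with two made-up snapshots (columns).
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
X1 = np.array([[1, -1], [1, 1], [-1, 1], [-1, -1]])
features = node_features(A, X1, alpha=0.5)
print(features.shape)  # (4, 2, 4)
```

Because every label is either +1 or -1, P + O equals X₁, so b = c + d in this sketch; the three propagated channels therefore carry two independent signals plus the raw labels.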

Further, the node feature matrix has dimensions $Q \times T \times N \times C_{in}$, where $Q$ is the number of snapshot batches, $T$ is the number of time snapshots collected under each batch, $N$ is the number of observed user nodes, and $C_{in}$ is the input dimension.

Further, the space-time convolution module comprises a first time convolution layer, a first space convolution layer and a second time convolution layer, wherein the output of the first time convolution layer is connected to the input of the first space convolution layer, and the output of the first space convolution layer is connected to the input of the second time convolution layer. The first and second time convolution layers each consist of a CNN and the activation function GLU, and their expressions are as follows:

in equation (1.1), $X$ represents the input matrix of the time convolution layer, and the input $X$ of the first time convolution layer of the space-time convolution module is the node feature matrix $Y^{0}$ obtained in S3. Let the convolution kernel size of the CNN be $K_t$; then each time convolution layer shortens the time sequence of the nodes by $K_t - 1$. $C_{out1}$ represents the output dimension;
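To make the effect of a time convolution layer concrete, here is a minimal NumPy sketch of a gated (GLU) convolution along the time axis. It is not the patent's equation (1.1); it assumes the common formulation in which a convolution without padding produces 2·C_out channels that are split into two halves P and Q, with output P ⊙ sigmoid(Q). The function name, the weight layout and the random inputs are hypothetical.

```python
import numpy as np


def temporal_glu_conv(X, W, b):
    """Gated temporal convolution without padding.

    X: (T, N, C_in) node time series
    W: (K_t, C_in, 2 * C_out) kernel, b: (2 * C_out,) bias
    Returns an array of shape (T - K_t + 1, N, C_out).
    """
    K_t, _, two_c_out = W.shape
    c_out = two_c_out // 2
    T = X.shape[0]
    outputs = []
    for t in range(T - K_t + 1):
        window = X[t:t + K_t]                            # (K_t, N, C_in)
        conv = np.einsum("knc,kco->no", window, W) + b   # (N, 2 * C_out)
        P, Q = conv[:, :c_out], conv[:, c_out:]
        outputs.append(P * (1.0 / (1.0 + np.exp(-Q))))   # P * sigmoid(Q)
    return np.stack(outputs)


rng = np.random.default_rng(0)
X = rng.normal(size=(16, 5, 4))       # T=16 snapshots, N=5 nodes, C_in=4
W = rng.normal(size=(3, 4, 2 * 8))    # K_t=3, C_out=8
Y = temporal_glu_conv(X, W, np.zeros(16))
print(Y.shape)  # (14, 5, 8)
```

With K_t = 3 the time axis shrinks from 16 to 14, i.e. by K_t - 1, which is the reduction the text describes.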

the first space convolution layer uses APPNP to extract the spatial features of the nodes. It decouples the message propagation formula of graph convolution into a feature transformation formula and a neighborhood aggregation formula, and adds a hyperparameter to the neighborhood aggregation formula so that, during neighborhood aggregation, each node also aggregates part of its own original features. The expression of the first space convolution layer is as follows:

$H^{(0)} = Z = f_\theta(Y_1)$ (1.2)

equation (1.2) is the feature transformation formula, where $f_\theta$ is a neural network with parameter set $\theta$ that is used to predict the class of each node; equation (1.3) is the neighborhood aggregation formula, through which the features of the neighbor nodes of each node are aggregated; $\delta$ is a set hyperparameter representing the percentage of label information that any node $v$ aggregates from its own original features, and $Z$ is the initial features of the nodes; aggregating part of the original features $Z$ during neighborhood aggregation overcomes the over-smoothing problem; wherein

$A$ is the adjacency matrix, $I$ is the identity matrix, and $H^{(l+1)}$ is the activation matrix of the $(l+1)$-th layer; in equation (1.4), $L$ represents the number of power iteration steps, where $l \in [0, L-2]$; the spatial features of the nodes are extracted and aggregated through equations (1.2), (1.3) and (1.4);
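The neighborhood aggregation can be sketched as a power iteration. The code below follows the standard published APPNP update, H ← (1 - δ)·Â·H + δ·Z with Â the symmetrically normalized adjacency matrix with self-loops; that this matches the patent's equations (1.3) and (1.4) term by term is an assumption, and the final activation is omitted. The function name and example data are hypothetical.

```python
import numpy as np


def appnp_propagate(A, Z, delta=0.1, L=5):
    """Run L power-iteration steps of APPNP-style aggregation on features Z (N, C)."""
    A = np.asarray(A, dtype=float)
    Z = np.asarray(Z, dtype=float)
    A_tilde = A + np.eye(A.shape[0])               # add self-loops
    d_inv_sqrt = A_tilde.sum(axis=1) ** -0.5
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    H = Z
    for _ in range(L):
        # Mix aggregated neighbor features with a fixed share of the original Z.
        H = (1 - delta) * (A_hat @ H) + delta * Z
    return H


A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
Z = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
H = appnp_propagate(A, Z, delta=0.1, L=5)
print(H.shape)  # (3, 2)
```

Because δ·Z is re-injected at every step, the iterate never drifts entirely away from each node's own features, which is the mechanism the text credits with countering over-smoothing. There are also no trainable weights inside the loop, only in f_θ.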

each time convolution layer shortens the time sequence of the nodes by $K_t - 1$, and a space-time convolution module contains two time convolution layers, so each space-time convolution module shortens the time sequence of the nodes by $2K_t - 2$;

if $L_1$ space-time convolution modules are used to extract the features, the time sequence of the node feature matrix is shortened by $2L_1K_t - 2L_1$, and the output matrix after the $L_1$ space-time convolution modules has dimensions $Q \times (T - 2L_1K_t + 2L_1) \times N \times C_{out}$,

where $C_{out}$ represents the output dimension of the last time convolution layer.
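The time-length bookkeeping can be checked with a few lines of arithmetic. The helper name `output_time_length` is hypothetical; the example values (T = 16, K_t = 3, two modules) are those used later in Example 3.

```python
def output_time_length(T, K_t, L1):
    """Remaining time steps after L1 modules, each with two time convolution layers."""
    reduction = 2 * L1 * (K_t - 1)   # equals 2*L1*K_t - 2*L1
    if reduction >= T:
        raise ValueError("too few snapshots for this kernel size and depth")
    return T - reduction


print(output_time_length(16, 3, 1))  # 12
print(output_time_length(16, 3, 2))  # 8
```

The check also shows a practical constraint: the number of selected snapshots T must exceed 2·L₁·(K_t - 1), otherwise no time steps remain for the output layer.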

Further, the dimensionality reduction and aggregation of the extracted space-time features is completed in an output layer of the model, wherein the output layer comprises a time convolution layer and a fully connected layer: the time convolution layer reduces the time sequence of the node feature matrix to 1, and the fully connected layer reduces the output dimension $C_{out}$ of the last time convolution layer to 1. Because both the time sequence and $C_{out}$ are reduced to 1, the 4-dimensional matrix can be reduced through the reshape function to a two-dimensional matrix $Y \in \mathbb{R}^{Q \times N}$,

where the value in row $i$, column $j$ of the matrix $Y$ represents the probability that node $j$ is the rumor source node in the infection process of the $i$-th batch. The output layer is expressed by the following formula:

where FC represents the fully connected layer. So that the label values under each batch sum to 1, the node outputs are normalized using the Softmax function, finally giving the normalized matrix $Y_4$, in which each normalized node label value is the probability that the node is the rumor source node.

Further, the highest value of the node labels in the result obtained in S5 is used as a rumor source, and the specific process is as follows: using the argmax function to obtain the node with the highest label value of each batch in the Q batches as the predicted rumor source node of the batch, where the expression is as follows:

$Y = \arg\max(Y_4)$ (1.6)

finally, Y can determine the predicted rumor source nodes in all batches.
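The normalization and selection steps can be sketched as follows. The function name `predict_sources` and the score values are made up for illustration; the Softmax is applied per batch (per row) so that each row sums to 1, as the text requires.

```python
import numpy as np


def predict_sources(scores):
    """scores: (Q, N) raw node outputs. Returns (row-wise softmax, argmax per batch)."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)
    return probs, probs.argmax(axis=1)


# Two batches, three nodes each (made-up scores).
scores = np.array([[0.1, 2.0, -1.0],
                   [3.0, 0.0, 0.5]])
probs, sources = predict_sources(scores)
print(sources)  # [1 0]
```

Softmax is monotonic, so the argmax of the normalized values equals the argmax of the raw scores; the normalization matters for interpreting the values as probabilities and for the loss, not for the selected node.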

Further, a cross-entropy function is used as the loss function in step S6, and the source node detection task is described by the following formula:

where $v$ represents any node in the graph, $V$ represents the set of all nodes in the graph, $G$ represents the social network graph, $S$ represents the input snapshot graphs, and $P(v \mid (G, S))$ represents the probability that node $v$ is the source node.
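As a hedged illustration of a cross-entropy loss for this task, the sketch below averages the negative log-probability that the model assigns to the true source of each batch. This is the usual multi-class cross-entropy; whether it matches the patent's formula exactly is an assumption, and the function name and numbers are hypothetical.

```python
import numpy as np


def source_cross_entropy(probs, true_sources, eps=1e-12):
    """Mean negative log-probability of the true source node in each batch.

    probs: (Q, N) row-normalized probabilities; true_sources: (Q,) node indices.
    """
    rows = np.arange(len(true_sources))
    picked = probs[rows, true_sources]          # P(v_true | (G, S)) per batch
    return float(-np.mean(np.log(picked + eps)))


probs = np.array([[0.5, 0.5],
                  [0.25, 0.75]])
true_sources = np.array([0, 1])
loss = source_cross_entropy(probs, true_sources)
print(loss)
```

The loss is 0 only when every batch assigns probability 1 to its true source, and it grows without bound as that probability approaches 0.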

Compared with the prior art, the technical scheme of the invention has the beneficial effects that:

Node features are extracted from multiple snapshot graphs in different batches, which improves the accuracy of rumor source detection, and the improved space-time convolution module overcomes the over-smoothing and overlong training time of the traditional space-time convolution model.

Drawings

Fig. 1 is a flowchart illustrating a rumor source detection method based on a spatio-temporal convolutional neural network according to an embodiment of the present invention.

FIG. 2 is a schematic view of an infection model according to an embodiment of the present invention.

Fig. 3 is a process diagram of a node feature matrix processing algorithm according to an embodiment of the present invention.

FIG. 4 is a schematic diagram of a space-time convolution module according to an embodiment of the present invention.

Detailed Description

In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention, taken in conjunction with the accompanying drawings and detailed description, is set forth below. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.

Example 1

As shown in fig. 1, a rumor source detection method based on a space-time convolutional neural network includes the following steps:

It should be noted that the snapshots of the different batches correspond to different and independent rumor infection processes; the several snapshots at different time periods correspond to the infection status of the user nodes in the graph at different time periods of the rumor propagation process.

The snapshot graphs of different batches are acquired as follows: in each batch, any node in the graph is selected as the source node, infection is then propagated iteratively multiple times, and the infection states of all nodes after each iteration are recorded as a snapshot;

In one embodiment, the multiple snapshots are constructed by collecting rumor propagation data on the social network and, from the collected data, recording the infection status of all nodes of the social network at different periods of the propagation from a single rumor source. Each recorded rumor propagation process is treated as one batch. The snapshots under every batch are then randomly screened, i.e. only part of the snapshots recorded in each batch is selected as input, which reduces the likelihood of the model over-smoothing. All propagation processes are divided into a training set, a test set and a validation set. In the invention, the rumor propagation process is simulated: given a social network graph, any node in the graph is selected as the rumor source node and infection is propagated from it, and snapshots of the node states during the infection process are recorded and stored. This is the process that generates the multiple infection snapshots of a single batch.

It should be noted that rumor propagation is similar to virus transmission, and therefore the present invention uses the SIR infection model as the propagation model. Each user node in a rumor infection snapshot is given a label according to its infection status: if the user node is infected (I), the label is +1; if it is uninfected (S), the label is -1; and if it has recovered after infection (R), the label is -1.

Three kinds of infection model are currently common: the Susceptible-Infected (SI) model, the Susceptible-Infected-Susceptible (SIS) model and the Susceptible-Infected-Recovered (SIR) model. Because the rumor propagation process most closely resembles the propagation mode of the SIR model, the SIR model is adopted in the invention. As shown in FIG. 2, the nodes in the graph represent social network users and the edges represent friend relationships between users. A Susceptible node represents a susceptible user, an Infected node represents an infected user, and a Recovered node represents a recovered user. A rumor can only spread from an infected node to a susceptible node, and only if the two users are friends (i.e., connected by an edge). Each infected user recovers with a certain probability and becomes a recovered user; a recovered user is not infected again and neither infects nor cures neighboring users. For example, in FIG. 2 node $v_1$ infects node $v_2$ with probability $p$, and node $v_1$ becomes a recovered node with probability $q$. According to the actual state of the rumor, the users are classified into the three SIR classes and given labels.
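The propagation rules above can be sketched as a small pure-Python simulation that records one snapshot per iteration. This is an illustrative stand-in, not the simulator used in the embodiments; the function name, the adjacency-dictionary format and the example graph are hypothetical, and the exact update order within an iteration is an assumption (here all nodes update synchronously from the previous state).

```python
import random


def simulate_sir(adj, source, p, q, steps, seed=None):
    """Simulate SIR spreading from `source`; return one state dict per iteration.

    adj: dict mapping each node to a list of its neighbors (friend relationships)
    p: infection probability per infected-susceptible edge, q: recovery probability
    """
    rng = random.Random(seed)
    state = {v: "S" for v in adj}
    state[source] = "I"
    snapshots = []
    for _ in range(steps):
        new_state = dict(state)
        for v, s in state.items():
            if s != "I":
                continue
            for u in adj[v]:
                # Only susceptible neighbors of an infected node can be infected.
                if state[u] == "S" and rng.random() < p:
                    new_state[u] = "I"
            # An infected node recovers with probability q and stays recovered.
            if rng.random() < q:
                new_state[v] = "R"
        state = new_state
        snapshots.append(dict(state))
    return snapshots


# Made-up path graph 0-1-2-3-4 with node 0 as the rumor source.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
snapshots = simulate_sir(adj, source=0, p=0.25, q=0.1, steps=30, seed=1)
print(len(snapshots))  # 30
```

Each node can only move forward along S, I, R, so every snapshot sequence is monotone per node; this is the property that distinguishes SIR from the SIS model mentioned above.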

the specific process is as follows:

Then matrices $P$ and $O$ are defined and both initialized to the two-dimensional matrix $X_1$.

In $P$, elements equal to $-1$ are set to 0; in $O$, elements equal to $+1$ are set to 0;

In order to obtain the rumor centrality feature and the source prominence feature of the nodes, four matrices $a$, $b$, $c$ and $d$ are defined: $a = X_1$, $b = (1-\alpha)(I-\alpha U)^{-1} X_1$, $c = (1-\alpha)(I-\alpha U)^{-1} P$, $d = (1-\alpha)(I-\alpha U)^{-1} O$;

where the $P$ and $O$ matrices are defined in order to obtain the $c$ and $d$ matrices. The matrix $b$ captures the source prominence feature of the nodes, and $c$ and $d$ capture the rumor centrality feature of the nodes; the four matrices $a$, $b$, $c$ and $d$ are then spliced through the concatenate function and restored through the reshape function into a 4-dimensional matrix $Y^{0}$, i.e., the node feature matrix. Feeding the source prominence and rumor centrality features of the nodes to the model as input further improves its rumor source detection accuracy.

S4: inputting a preset social network diagram and the obtained node feature matrix into a time-space convolution module to extract time-space features to form a time-space feature matrix;

It should be noted that the model input of the present invention is the different batches of snapshots constructed on the social network graph. The time convolution layer in the space-time convolution model extracts the temporal features of node propagation from the node feature matrix $Y^{0}$ obtained in S3, so the node feature matrix is required as input. Meanwhile, the space convolution layer in the space-time convolution model extracts the spatial features of the nodes from the node connectivity of the social network graph, so the social network graph also needs to be an input of the model.

More specifically, the node feature matrix has dimensions $Q \times T \times N \times C_{in}$, where $Q$ is the number of snapshot batches, $T$ is the number of time snapshots collected under each batch, $N$ is the number of observed user nodes, and $C_{in}$ is the input dimension.

It should be noted that the space-time convolution module comprises a first time convolution layer, a first space convolution layer and a second time convolution layer, wherein the output of the first time convolution layer is connected to the input of the first space convolution layer, and the output of the first space convolution layer is connected to the input of the second time convolution layer. The first and second time convolution layers each consist of a CNN and the activation function GLU, and their expressions are as follows:

in equation (1.1), $X$ represents the input matrix of the time convolution layer, and the input $X$ of the first time convolution layer of the first space-time convolution module is the node feature matrix $Y^{0}$ obtained in S3. Let the convolution kernel size of the CNN be $K_t$; then each time convolution layer shortens the time sequence of the nodes by $K_t - 1$. $C_{out1}$ represents the output dimension;

$H^{(0)} = Z = f_\theta(Y_1)$ (1.2)

equation (1.2) is the feature transformation formula, where $f_\theta$ is a neural network with parameter set $\theta$ that is used to predict the class of each node; equation (1.3) is the neighborhood aggregation formula, through which the features of the neighbor nodes of each node are aggregated; $\delta$ is a set hyperparameter representing the percentage of label information that any node $v$ aggregates from its own original features, and $Z$ is the initial features of the nodes; aggregating part of the original features $Z$ during neighborhood aggregation overcomes the over-smoothing problem;

each time convolution layer shortens the time sequence of the nodes by $K_t - 1$, and a space-time convolution module contains two time convolution layers, so each space-time convolution module shortens the time sequence of the nodes by $2K_t - 2$;

if $L_1$ space-time convolution modules are used to extract the features, the time sequence of the node feature matrix is shortened by $2L_1K_t - 2L_1$, and the output matrix after the $L_1$ space-time convolution modules has dimensions $Q \times (T - 2L_1K_t + 2L_1) \times N \times C_{out}$.

Example 2

The present embodiment specifically explains the determination of the spatio-temporal feature matrix and the rumor source node based on the above process.

It should be noted that the space-time feature matrix has been extracted in step S4; it then needs to be reduced to a two-dimensional matrix, with the features further compressed and aggregated, so that the label value of each node is obtained as the probability that the node is the rumor source. This dimensionality reduction and aggregation is completed in an output layer of the model, wherein the output layer comprises a time convolution layer and a fully connected layer: the time convolution layer reduces the time sequence of the node feature matrix to 1, and the fully connected layer reduces the output dimension $C_{out}$ of the last time convolution layer to 1. Because both the time sequence and $C_{out}$ are reduced to 1, the 4-dimensional matrix can be reduced through the reshape function to a two-dimensional matrix $Y \in \mathbb{R}^{Q \times N}$,

where the value in row $i$, column $j$ of the matrix $Y$ represents the probability that node $j$ is the rumor source node in the infection process of the $i$-th batch. The output layer is expressed by the following formula:

It should be noted that, in this step, the node with the largest label value is selected through the argmax function. The node value is selected from the N-dimensional vector of each batch according to the source centrality principle, and the selected nodes are the likely rumor sources computed by the trained model. In this step, the two-dimensional matrix $Y_4$ obtained in S5 is used as the input of the argmax function, finally giving the predicted rumor source node of each batch.

The specific process is as follows:

using the argmax function to obtain the node with the highest label value of each batch in the Q batches as the predicted rumor source node of the batch, where the expression is as follows:

$Y = \arg\max(Y_4)$ (1.6)

finally, Y can determine predicted rumor source nodes in all batches.

A cross-entropy function is used as the loss function in step S6, and the source node detection task is described by the following formula:

Example 3

The embodiment performs verification and analysis through specific data.

S1: In the present invention, a social network graph G with 1000 nodes is given, and the SIR infection process is simulated with NDlib. With the social network graph as input, 2000 batches of infection processes are generated, each consisting of 30 infection iterations. A random function produces an array of 2000 random numbers in the range (0, N], where N is the number of nodes of the social network graph; these serve as the rumor sources of the 2000 infection batches. For each batch, 30 infection iterations are run with the infection rate and recovery rate set to 0.25 and 0.1 respectively, and the node states after each iteration are recorded as a snapshot, giving 30 infection snapshots per batch. From the snapshots of each batch, 16 are then selected at random as input. The final input is therefore 2000 infection process batches with 16 snapshots each; 1600 infection processes are used as the training set, 200 as the test set and 200 as the validation set, and the number of training epochs is set to 50.
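The snapshot selection and data split described in S1 can be sketched with the standard library alone. The function names are hypothetical, and two details are assumptions not stated in the text: the selected snapshots are kept in chronological order, and batches are shuffled before being split.

```python
import random


def sample_snapshots(num_iterations=30, num_selected=16, rng=None):
    """Pick `num_selected` distinct snapshot indices out of `num_iterations`."""
    rng = rng or random.Random()
    # ASSUMPTION: selected snapshots keep their chronological order.
    return sorted(rng.sample(range(num_iterations), num_selected))


def split_batches(num_batches=2000, n_train=1600, n_test=200, seed=0):
    """Split batch indices into training, test and validation sets."""
    rng = random.Random(seed)
    ids = list(range(num_batches))
    rng.shuffle(ids)  # ASSUMPTION: random assignment of batches to the sets
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    val = ids[n_train + n_test:]
    return train, test, val


chosen = sample_snapshots(rng=random.Random(42))
train, test, val = split_batches()
print(len(chosen), len(train), len(test), len(val))  # 16 1600 200 200
```

Splitting by batch rather than by snapshot keeps all 16 snapshots of one infection process in the same set, so no propagation process leaks between training and evaluation.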

S2: The node labels are set in a loop according to the state of each node: the label of an infected node is set to +1, and the labels of uninfected nodes and of nodes recovered after infection are set to -1, until the node labels of every snapshot of the 2000 batches of infection processes have been set according to the infection status. The node infection state matrix $X \in \mathbb{R}^{2000 \times 16 \times 1000}$ is then constructed from the node labels.

S3: The purpose of this step is to obtain the source prominence and rumor centrality features of the nodes. A flow chart of the algorithm is shown in FIG. 3. The node state matrix X obtained in S2 is used; the corresponding adjacency matrix A is obtained from the social network graph G of S1, and the degree matrix D is obtained from the adjacency matrix. To avoid the overfitting that too high an aggregation ratio would cause, the hyperparameter α is set to 0.5, and X, A, D and α are taken as input to the multi-batch multi-snapshot input algorithm. Step 1 of FIG. 3 obtains the normalization matrix U by the following formula:

In step 2 of FIG. 3, the three-dimensional matrix X is reduced to a two-dimensional matrix $X_1$ by the reshape function.

Next, steps 3 to 12 of the algorithm of FIG. 3 define and initialize P and O, and then, in two loops, change the label values of -1 in P to 0 and the label values of +1 in O to 0. The matrices c and d are obtained from P and O. The matrices a, b, c and d are obtained in steps 13 to 16. The source prominence feature of the nodes is obtained from matrix b, and the rumor centrality feature from matrices c and d. The matrices a, b, c and d all have the same dimensions as the matrix $X_1$. They are then spliced into a single matrix through the concatenate function.

Finally, the reshape function generates the node feature matrix $Y^{0} \in \mathbb{R}^{2000 \times 16 \times 1000 \times 4}$.
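The reshape steps around the feature computation can be traced on a toy tensor. The layout below is an assumption (the text does not give the exact axis order): X is flattened so that nodes index the rows of X₁, and after the four feature channels are stacked the result is restored to Q × T × N × 4. The four channels here are placeholders that simply repeat X₁, so only the shapes and the round trip are demonstrated.

```python
import numpy as np

Q, T, N = 3, 4, 5  # toy sizes; the embodiment uses 2000, 16 and 1000
rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(Q, T, N))

# Step 2: flatten batches and snapshots, with nodes as rows (assumed layout).
X1 = X.reshape(Q * T, N).T                      # (N, Q*T)

# Placeholder for a, b, c, d: four channels with the same shape as X1.
channels = np.stack([X1, X1, X1, X1], axis=-1)  # (N, Q*T, 4)

# Final step: restore the batch, snapshot and node axes.
Y0 = channels.transpose(1, 0, 2).reshape(Q, T, N, 4)
print(Y0.shape)  # (3, 4, 5, 4)
```

The round trip is lossless: channel 0 of `Y0` equals the original `X`, which is a quick way to confirm that a chosen axis order has not scrambled batches, snapshots and nodes.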

S4: extracting time-space characteristics from a node characteristic matrix Y space-time convolution module corresponding to the social network diagram G and S3 to form a time-space characteristic matrix;

The structure of the model is shown in Fig. 4. The space-time convolution module corresponds to the ST-Conv Block in the figure.

S4.1, setting training parameters: The number of space-time convolution modules is set to 2. In the first space-time convolution module, the input and output dimensions of the first layer (a temporal convolution layer) are 4 and 64, those of the second layer (a spatial convolution layer) are 64 and 64, and those of the third layer (a temporal convolution layer) are 64 and 128. In the second space-time convolution module, the input and output dimensions are 128 and 64 for the first layer, 64 and 64 for the second layer, and 64 and 72 for the third layer. The convolution kernel size K_t of the temporal convolution layers is set to 3, and the convolution kernel size of the spatial convolution layers is set to 5.

S4.2, setting a loss function: Since the problem addressed by the present invention is a multi-class classification problem, a cross-entropy function is employed as the loss function. Batch gradient descent is performed using an RMSProp optimizer. The learning rate starts at 0.001, with a decay rate of 0.7 applied every 5 training batches.
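The learning rate schedule of S4.2 can be illustrated with a short sketch. It assumes that the decay is a multiplicative step decay, i.e. the learning rate is multiplied by 0.7 once every 5 training batches; the text does not state the exact form, and the function name learning_rate is hypothetical.

```python
def learning_rate(step, base_lr=0.001, decay=0.7, every=5):
    """Step decay: multiply the base learning rate by `decay` once per `every` steps.
    ASSUMPTION: multiplicative step decay; the source only gives the numbers."""
    return base_lr * decay ** (step // every)

# Learning rate at the start and after 5 and 10 training batches
schedule = [learning_rate(s) for s in (0, 5, 10)]
```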

S4.3, feature extraction and training: The temporal convolution kernel size K_t of the model is set to 3. After the first temporal convolution layer extracts and fuses the temporal features of the nodes, the dimension of the node feature matrix is 2000 × 14 × 1000 × 64. After the second layer, the spatial convolution layer, extracts and fuses the spatial features, the dimension of the node feature matrix remains unchanged at 2000 × 14 × 1000 × 64. After the last temporal convolution layer, the time series of the matrix is further compressed and fused, and the dimension of the node feature matrix is reduced to 2000 × 12 × 1000 × 128. The second space-time convolution module works in the same way as the first. After its two temporal convolution layers and one spatial convolution layer, the final node feature matrix is
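The change in the time dimension described in S4.3 can be checked with a small sketch. It assumes 16 input snapshots per batch, a value inferred from the stated length of 14 after the first temporal convolution with K_t = 3 and not given explicitly here; the function name trace_time_steps is hypothetical.

```python
def trace_time_steps(t_in, k_t=3, n_blocks=2):
    """Return the time-series length after each temporal convolution layer.
    Each temporal convolution shortens the series by k_t - 1; the spatial
    convolution layer between them leaves the length unchanged."""
    lengths = []
    t = t_in
    for _ in range(n_blocks):
        for _ in range(2):          # two temporal convolution layers per module
            t -= k_t - 1
            lengths.append(t)
    return lengths

# ASSUMPTION: 16 snapshots per batch at the input of the first module
lengths = trace_time_steps(16)
```

Under this assumption the first two entries reproduce the lengths 14 and 12 stated for the first module, and each module shortens the series by 2K_t - 2 = 4.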

S5: A temporal convolution layer and a fully connected layer are constructed as the output layer, and the node feature matrix Y₃ obtained in S4 is taken as the input of the output layer. Y₃ first passes through the temporal convolution layer, which compresses and fuses the temporal features. The temporal convolution layer compresses the time series of the nodes to 1 to obtain

The matrix then passes through the fully connected layer, which fuses the features and compresses the output dimension to 1 to obtain the matrix

Finally, the matrix obtained after dimensionality reduction of the matrix Y₃ is

However, since the sum of the label values in each batch is not 1, in order to obtain the probability that each node in each batch is a rumor source node, the invention normalizes the result through the Softmax function to obtain the final probability matrix Y.

S6: According to the principle of source centrality, the node with the highest label value among the N nodes is the predicted rumor source of the corresponding batch infection process. The list of selected nodes gives the possible rumor sources calculated by the trained algorithm. In this step, the node with the largest label value can be selected through the argmax function: the final matrix Y is taken as the input of the argmax function to obtain the node with the maximum rumor source probability in each batch.
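The normalization of S5 and the selection of S6 can be sketched together. The sketch assumes that the output layer has already produced a Q × N array of raw node values (one row per batch); the function name predict_sources and the toy values are hypothetical.

```python
import numpy as np

def predict_sources(scores):
    """scores: Q x N array of raw node outputs, one row per batch.
    Returns (Q x N probabilities, length-Q array of predicted source indices)."""
    shifted = scores - scores.max(axis=1, keepdims=True)   # for numerical stability
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)           # Softmax over the nodes of each batch
    return probs, probs.argmax(axis=1)

# Hypothetical raw outputs for 2 batches of 3 nodes
raw = np.array([[0.2, 1.5, -0.3],
                [2.0, 0.1, 0.4]])
probs, sources = predict_sources(raw)
```

Because Softmax is monotonic, the argmax of the normalized values coincides with the argmax of the raw values; the normalization matters for interpreting each value as a probability and for the cross-entropy loss.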

It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims

1. A rumor source detection method based on a space-time convolution neural network is characterized by comprising the following steps:

S1: constructing snapshot graphs of different batches of spreading infection according to different rumor spreading processes, wherein the snapshot graph of each batch comprises a plurality of snapshots in different time periods;

S6: taking the node corresponding to the highest node label value in the result obtained in step S5 as the predicted rumor source node of the batch.

2. The method of claim 1, wherein said different batches of snapshots correspond to different and independent rumor-transmitted infection processes; the several snapshots at different time periods correspond to the infection status of the user nodes in the graph at different time periods of the rumor propagation process.

3. The method of claim 2, wherein the different batches of snapshot map acquisition methods comprise: selecting any one node in the snapshot map as a source node in each batch, then propagating infection through multiple iterations, and recording the infection states of all nodes in the map after each iteration infection in the snapshot map;

4. The method of claim 1, wherein the rumor propagation model is determined and node labels are assigned according to the infection condition of the user nodes as follows: an SIR (Susceptible-Infected-Recovered) propagation model is selected as the rumor propagation model, and the user nodes are classified according to their infection condition; uninfected users are marked as the Susceptible (S) class, representing users who have not received the rumor information; infected users are marked as the Infected (I) class, representing users who have forwarded or posted the rumor; users who have recovered after infection are marked as the Recovered (R) class, representing users who, after forwarding or posting the rumor, recognized it as a rumor and deleted it; node labels are assigned according to the SIR class, the S class being assigned -1, the I class +1, and the R class -1.

5. The method of claim 1, wherein the specific process of step S3 is as follows:

constructing a relationship graph according to the positional relationships of the nodes in the rumor propagation model, and constructing the matrix

where A is the adjacency matrix of the relationship graph and D is a diagonal matrix whose (i, i) element equals the sum of the i-th row of A;

then defining the matrices P and O and initializing each of them to the two-dimensional matrix

in the matrix P, the elements with value -1 are set to 0, and in the matrix O, the elements with value +1 are set to 0;

in order to obtain the rumor center feature and the source probability feature of the nodes, four matrices a, b, c and d are defined as follows: a = X₁, b = (1-α)(I-αU)⁻¹X₁, c = (1-α)(I-αU)⁻¹P, d = (1-α)(I-αU)⁻¹O;

the matrices P and O are defined in order to obtain the matrices c and d, wherein the matrix b captures the source probability feature of the nodes, and the matrices c and d capture the rumor center probability feature of the nodes; the four matrices a, b, c and d are then spliced through a concatenate function and restored through a reshape function to a 4-dimensional matrix Y⁰, i.e. the node feature matrix.

6. The method of claim 1, wherein the node feature matrix has dimensions Q × T × N × C_in, where Q is the number of snapshot graph batches, T is the number of time snapshots collected in each batch, N is the number of observed user nodes, and C_in is the input dimension.

7. The method of claim 1, wherein the spatio-temporal convolution module comprises a first temporal convolution layer, a first spatial convolution layer and a second temporal convolution layer, wherein the output of the first temporal convolution layer is connected to the input of the first spatial convolution layer, and the output of the first spatial convolution layer is connected to the input of the second temporal convolution layer; the first and second temporal convolution layers each consist of a CNN and a GLU activation function, and are expressed as follows:

in equation (1.1), X represents the input matrix of the temporal convolution layer; the input X of the first temporal convolution layer of the spatio-temporal convolution module corresponds to the node feature matrix Y⁰ obtained in S3; with the convolution kernel size of the CNN set to K_t, the time series of the nodes is reduced by K_t - 1 for each temporal convolution layer; C_out1 represents the output dimension;

the first spatial convolution layer uses APPNP to extract the spatial features of the nodes, decoupling the message propagation formula of graph convolution into a neighborhood aggregation formula and a feature transformation formula; a hyperparameter is added to the neighborhood aggregation formula to control how each node aggregates its neighborhood; the expression of the first spatial convolution layer comprises the following parts:

H⁽⁰⁾ = Z = f_θ(Y₁) (1.2)

equation (1.2) is the feature transformation formula, where f_θ is a neural network with parameter set θ, used to predict the class of each node; equation (1.3) is the neighborhood aggregation formula, through which the features of the neighbor nodes of each node are aggregated; δ is a preset hyperparameter representing the proportion of label information that any node v aggregates from its own original features, and Z is the initial feature of the nodes; during neighborhood aggregation, the over-smoothing problem is solved by also aggregating part of the original features Z; wherein

A is the adjacency matrix, I is the identity matrix, and H^(l+1) denotes the activation matrix of the (l+1)-th layer; in equation (1.4), L denotes the number of power iteration steps, where l ∈ [0, L-2]; the spatial features of the nodes are extracted and aggregated through equations (1.2), (1.3) and (1.4);

after each temporal convolution layer, the time series of the nodes decreases by K_t - 1; since one spatio-temporal convolution module contains two temporal convolution layers, the time series of the nodes decreases by 2K_t - 2 each time it passes through a module;
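The neighborhood aggregation of the spatial convolution layer can be illustrated with a short sketch. Because equations (1.3) and (1.4) are not reproduced in this text, the update below follows the commonly published form of APPNP, H^(l+1) = (1-δ) Â H^(l) + δ Z with Â = D̃^(-1/2)(A + I)D̃^(-1/2); this form, the omission of the final activation, and the function name appnp_propagate are assumptions.

```python
import numpy as np

def appnp_propagate(Z, A, delta=0.1, L=10):
    """Power-iteration neighborhood aggregation in the style of APPNP.
    Z: N x C initial node features (output of f_theta); A: N x N adjacency matrix.
    ASSUMPTION: H <- (1 - delta) * A_hat @ H + delta * Z, repeated L times."""
    N = A.shape[0]
    A_tilde = np.asarray(A, dtype=float) + np.eye(N)        # add self-loops
    d_inv_sqrt = A_tilde.sum(axis=1) ** -0.5                # row sums are >= 1
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    H = Z
    for _ in range(L):
        # mix aggregated neighbor features with a share delta of the original features
        H = (1 - delta) * A_hat @ H + delta * Z
    return H

# Hypothetical toy input: a 3-node path graph with one-hot initial features
A_toy = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
Z_toy = np.eye(3)
H_toy = appnp_propagate(Z_toy, A_toy)
```

Keeping a share δ of the original features Z in every step is what limits over-smoothing: with δ = 1 the output equals Z, while with δ = 0 the iteration reduces to repeated plain aggregation.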

8. The method of claim 1, wherein the dimensionality of the extracted spatio-temporal feature matrix is reduced through an output layer, said output layer comprising a temporal convolution layer, through which the time series of the node feature matrix is reduced to 1, and a fully connected layer, through which the output dimension C_out of the last temporal convolution layer is reduced to 1; once the time series and C_out have both been reduced to 1, the 4-dimensional matrix can be reduced to a two-dimensional matrix through a reshape function

wherein FC denotes the fully connected layer; so that the label values in each batch sum to 1, the node outputs are normalized using a Softmax function, finally yielding the normalized matrix Y₄, in which each normalized node label value corresponds to the probability that the node is the rumor source node.

9. The method of claim 1, wherein a node corresponding to a highest node label value in the result obtained in S5 is used as a rumor source node, and the method comprises the following steps:

using the argmax function to obtain the node with the highest label value of each batch of the Q batches as the predicted rumor source node of the batch, where the expression is as follows:

Y = argmax(Y₄) (1.6)

finally, Y can determine the predicted rumor source nodes in all batches.

10. The rumor source detection method based on a spatio-temporal convolutional neural network of claim 1, wherein a cross-entropy function is used as the loss function in step S6, and the source node detection task is described by the following formula: