CN114826678A

CN114826678A - Network propagation source positioning method based on seepage process and evolutionary computation

Info

Publication number: CN114826678A
Application number: CN202210321271.1A
Authority: CN
Inventors: 刘洋; 汪小琦; 王震; 王茜; 李学龙
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2022-03-24
Filing date: 2022-03-24
Publication date: 2022-07-29
Anticipated expiration: 2042-03-24
Also published as: CN114826678B

Abstract

The invention provides a network propagation source positioning method based on a seepage (percolation) process and evolutionary computation. First, a network data set is input to obtain its node and edge attributes, and the propagation model parameters are initialized. Next, based on the theory and methods of percolation and evolutionary computation, an AEF algorithm iteratively updates an initial observation point sequence to obtain a final sequence, and a given proportion of observation points is deployed in the network according to that sequence. A propagation source is then selected at random and set to the infected state to start the propagation process, which stops once the detected outbreak reaches a given range. The target connected components are found from the information captured by the observation points to obtain a subgraph, and the RIS algorithm is run on the subgraph to detect the propagation source. Finally, the neighbors within a fixed hop count of the detected propagation source are added to a candidate set, which serves as the range for the subsequent search for the real propagation source. The invention enables rapid propagation source positioning in large-scale networks, so that the spread of malicious information can be controlled in time and the resulting losses reduced.

Description

Network propagation source positioning method based on seepage process and evolutionary computation

Technical Field

The invention belongs to the technical field of network information propagation, and particularly relates to a network propagation source positioning method based on a seepage process and evolutionary computation.

Background

Complex networks of many kinds, such as social networks, power grids, and road traffic networks, exist in the world today. Their high interconnectivity and cohesion facilitate the exchange of information between nodes, but also increase the chance that risks of various kinds arise in the network: rumors spread rapidly in social networks, computer viruses infect large numbers of hosts in a short time, and infectious diseases break out in populations. Quickly locating the region that contains the propagation source, and thereby containing the impact of its spread, is therefore of great research value and significance.

The main task in the propagation source localization problem is to design an estimator that infers the propagation source; the ideal estimator is one that finds the true source. However, because node communication patterns are complex and the diffusion model is uncertain, it is theoretically almost impossible for a designed estimator to infer the true source, even when the underlying network is a tree. The error distance was therefore introduced as a criterion for evaluating estimators: one estimator is said to be better than another if the source it infers is closer to the real source. Based on different assumptions about the available information, researchers have developed different methods to minimize the error distance. In practice, however, a question remains: once an estimator with a small error distance has been obtained, how is the source actually traced? One can examine the vicinity of the estimated propagation source more intensively and eventually locate the real source. For a network with a relatively simple structure, a small error distance usually means that only a small number of nodes need this more intensive examination before the true propagation source is found. However, most real-world networks are heterogeneous, that is, a single node may be directly connected to a very large number of nodes, so the number of nodes in the neighborhood that must be examined more intensively may be proportional to the size of the network, which is clearly infeasible in practice.
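The error distance criterion, and the reason a small error distance does not by itself bound the search effort, can be illustrated with a short example. This is a minimal sketch, assuming an undirected graph stored as an adjacency dictionary and measuring the error distance in hops; the function name `hop_distance` and the star-shaped toy network are illustrative assumptions, not part of the patent.

```python
from collections import deque

def hop_distance(adj, src, dst):
    """Breadth-first search: number of hops from src to dst, or None if unreachable."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

# Toy heterogeneous network: hub 0 connected to five leaves.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}

true_source = 3
estimate = 0                                          # an estimator that returns the hub
error = hop_distance(star, estimate, true_source)     # error distance in hops
to_inspect = len(star[estimate])                      # nodes within one hop of the estimate
```

In this toy network the error distance is a single hop, yet every other node lies within one hop of the estimate, so the follow-up search still covers the whole network.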

To date, a great deal of research has addressed the localization of propagation sources in complex networks, and a growing number of algorithms have been proposed to detect the sources of false or malicious information. Propagation source localization algorithms can generally be divided into three categories: 1) Methods based on complete observation, in which the researcher obtains the state and infection time of every node in the network in order to detect the propagation source; examples are rumor centrality, the minimum description length method, and the dynamic-age source identification method. 2) Methods based on network snapshot observation, in which the researcher obtains, within a unit of time, how each node in the network receives and spreads information; this condition is easier to satisfy than complete observation. Examples are the Jordan center method and the dynamic message passing method. 3) Methods based on sensor observation, in which the researcher deploys a certain number of observation points in the network as sensors and uses the infection information they collect at specific nodes to detect the propagation source. Pinto et al. first proposed such a method in 2012. It rests on two assumptions: that the network propagation delays follow a Gaussian distribution, and that the propagation path of the information is a depth-first traversal tree rooted at a node. The propagation source is estimated by maximum likelihood from the times at which the states of the observation points first change and from the direction from which the information arrived. Node centrality is also a practical tool for analyzing network properties, and some algorithms use centrality measures such as degree centrality, closeness centrality, and betweenness centrality to identify propagation sources.

The propagation source detection problem is important, but several issues remain. On the one hand, most existing methods are designed for tree-structured networks, whereas most networks in practice are general complex networks. Applying or extending a tree-based method directly to a general network usually lowers detection efficiency and makes accuracy hard to guarantee. On the other hand, most existing methods are designed for small networks and have high computational complexity, so they are difficult to apply to large-scale networks in practice.

Disclosure of Invention

To overcome the shortcomings of the prior art, the invention provides a network propagation source positioning method based on a seepage (percolation) process and evolutionary computation. First, a network data set is input to obtain its node and edge attributes, and the propagation model parameters are initialized. Next, based on the theory and methods of percolation and evolutionary computation, an AEF algorithm iteratively updates an initial observation point sequence to obtain a final sequence, and a given proportion of observation points is deployed in the network according to that sequence. A propagation source is then selected at random and set to the infected state to start the propagation process, which stops once the detected outbreak reaches a given range. The target connected components are found from the information captured by the observation points to obtain a subgraph $V'_c$, and the RIS algorithm is run on the subgraph $V'_c$ to detect the propagation source. Finally, the neighbors within a fixed hop count of the detected propagation source are added to a candidate set, which serves as the range for the subsequent search for the real propagation source. The invention enables rapid propagation source positioning in large-scale networks, so that the spread of malicious information can be controlled in time and the resulting losses reduced.

A network propagation source positioning method based on a seepage process and evolutionary computation is characterized by comprising the following steps:

Step 1: input an experimental network data set $G(V, E)$, where $V$ denotes the set of network nodes and $E$ denotes the set of edges in the network; initialize the edge infection rate $\beta_{uv}$ and the node recovery rate $\gamma_u$ of the fixed propagation model, where $\beta_{uv}$ takes values in $[0, 1]$ and $\gamma_u$ takes values in $[0, 1]$; determine the outbreak rate $\varepsilon$, which takes values in $[0, 1]$; initialize all nodes in the network to the susceptible state;

Step 2: construct an initial sequence $S$ of all nodes of the graph by random ordering or by node-degree ordering, and update the node sequence $S$ with the AEF algorithm; from the updated node sequence, select nodes from front to back in proportion $q$ as observation points to form the observation point set $O$, mark the observation points on the network, and record the absolute time at which each observation point is infected; from the observation point set $O$, randomly select observation points in proportion $R_d = |O_d| / |O|$ to form the set $O_d$, whose observation points also record the direction from which they were infected; $q$ takes values in $[0, 0.2]$ and $R_d$ takes values in $[0.001, 1]$;

Step 3: at time $t = 0$, randomly select a propagation source $v_s$ from all nodes of the data set $G(V, E)$, set it to the infected state, and start the propagation process; during propagation, each observation point records the absolute time of its own infection, and the observation points in the set $O_d$ also record the direction from which they were infected; each node in the infected state spreads the virus to its neighbor nodes in the susceptible state with edge infection rate $\beta_{uv}$ and, at the same time, enters the recovered state with recovery rate $\gamma_u$; a node that has been infected enters the infected state and becomes a new infecting node; all nodes in the infected state continue the propagation behavior until the number $n_1$ of infected nodes and the number $n_2$ of recovered nodes in the network satisfy $(n_1 + n_2)/n \ge \varepsilon$, at which point the propagation process stops, and the resulting distribution of node infection states forms the infection graph $G_I$; here $n$ denotes the number of nodes in the network $G$;

Step 4: take the observation point set $O$ of step 2 as the removed node set $V_r$; the nodes of the data set $G$ other than the observation points form the remaining node set $V_o$; removing the nodes of $V_r$, together with the edges that connect nodes of $V_o$ to nodes of $V_r$, yields a number of connected components of different sizes, each larger than 1, where $c_i$ denotes the $i$-th connected component, $i = 1, 2, \ldots, C$, and $C$ denotes the total number of connected components; according to

determine the boundary of connected component $c_i$

where $u$ denotes any node in the removed node set $V_r$, $v$ denotes any node in connected component $c_i$, and $e_{uv}$ denotes the edge connecting node $u$ and node $v$ in the infection graph $G_I$; according to

determine the connected components covered by the removed node set $V_r$, where $\Gamma(u)$ denotes the set of neighbor nodes of node $u$ in the infection graph $G_I$, and $c_i(v)$ denotes the connected component $c_i$ to which node $v$ belongs;

Step 5: select the observation points with the earliest infection time to form the infected observation point subset $O'$; according to the formula

construct the subgraph $V'_c$, where $x$ denotes any observation point in the subset $O'$, and $\alpha(x)$ denotes the region covered by the connected components that contain the neighbor nodes of observation point $x$ in the infection graph $G_I$;

Step 6: compute the relative infection time $t'_x$ according to $t'_x = t_x - t_{\min}$, where $t_x$ denotes the time at which observation point $x$ was infected,

on the subgraph $V'_c$, the RIS algorithm is used to find the propagation source

Step 7: for the propagation source

add its neighbor nodes within a fixed order to the candidate set $V_c$, and use the relative size of the candidate set, $\varphi = |V_c| / n$, as the evaluation index; a smaller $\varphi$ means a smaller range within which the infection has to be contained; here $|V_c|$ denotes the number of nodes in the candidate set $V_c$, and the fixed order is first order or second order.
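Step 7 can be illustrated with a bounded breadth-first search. The following is a minimal sketch, assuming an undirected graph stored as an adjacency dictionary and assuming that the detected source itself is counted in the candidate set (the text does not state this either way); the names `candidate_set` and `relative_size` are hypothetical.

```python
from collections import deque

def candidate_set(adj, source, hops):
    """Return the detected source plus every node within `hops` edges of it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == hops:          # do not expand beyond the fixed order
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def relative_size(candidates, n):
    """Evaluation index phi = |V_c| / n."""
    return len(candidates) / n

# Path graph 0-1-2-3-4 with detected source 2.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
first_order = candidate_set(path, 2, hops=1)
second_order = candidate_set(path, 2, hops=2)
```

On this five-node path the first-order candidate set covers three nodes and the second-order set covers the whole graph, which shows how quickly $\varphi$ can grow with the order.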

Further, the specific process of updating the node sequence S by using the AEF algorithm described in step 2 is as follows:

Step a: according to

randomly determine the segment length parameter $n_s$ for sequence segmentation, and segment the node sequence $S$ from front to back so that each segment contains $n_s$ nodes; the $k$-th segment is denoted $S_k$, where $n$ is the total number of nodes in the sequence $S$ and $k = 1, 2, \ldots$

Step b: update each segment sequence $S_k$ in parallel according to the following process to obtain the updated segment sequence $S_k$, where the number of updates is

Step S1: initialize the parameters: set the loop counter $j = n_s$ and the initial intermediate node sequence $S'_k = S_k$, and construct the initial subgraph $G'_k(V'_k, E'_k)$, where

denotes the concatenation of all sequences that follow the sequence $S_k$, $V'_k$ is the node set of the subgraph and contains all nodes of the sequence

$E'_k = E \cap (V'_k \times V'_k)$ is the edge set of the subgraph, and $V'_k \times V'_k$ denotes the set of all possible edges between the nodes of the set; randomly determine the number-of-selections parameter $\delta$ and the node selection parameter $x$, with $\delta \in [1, 50]$ and $x \in (0, 1]$;

Step S2: randomly select nodes from the set $\{S'_k(z),\ z \in [\max(j - x \times n_s, 1),\ j]\}$, repeating the selection $\delta$ times; the selected nodes form the candidate set

where $S'_k(z)$ denotes the $z$-th node of the sequence $S'_k$;

Step S3: from the candidate set

select the node $r$ that satisfies

where $y$ is a node of the candidate set

and $\xi(y)$ denotes the size of the connected component of node $y$, computed according to

or

where $c(y)$ denotes the set of connected components that includes node $y$, $c'_i$ denotes any connected component, and $|c'_i|$ denotes the number of nodes in connected component $c'_i$;

Step S4: update the node set $V'_k$ of the subgraph $G'_k$ according to $V'_k \leftarrow V'_k \cup \{r\}$, and then, based on the updated node set $V'_k$, update the edge set $E'_k$ of the subgraph $G'_k$ according to

where $\{r\}$ denotes the set containing node $r$, $v'$ denotes any node other than $r$ in the updated node set $V'_k$, and $e_{rv'}$ denotes the edge connecting node $r$ and node $v'$ in the original graph $G'_k$; if the sequence $S'_k$ contains a node $S'_k(z)$ with $S'_k(z) = r$, swap $S'_k(j)$ and $S'_k(z)$;

Step S5: if $j > 0$, return to step S2; otherwise, if

then $S_k = S'_k$; let

and return to step S1; when

the sequence $S_k$ obtained is the updated sequence; here $F$ denotes the evaluation index function of a sequence, computed as $F = \sum_q |c''_{\max}| / n$, where $|c''_{\max}|$ denotes the number of nodes in the largest connected component of the sequence at each value of $q$, $n$ denotes the total number of nodes, and $q$ ranges over $[1/n_s, 1]$ with step $1/n_s$;

denotes the concatenation of the two sequences;

Step c: let $T_p = T_p - 1$ and return to step a; when the counter $T_p = 0$, the final sequence $S$ is obtained; here $T_p = 5000$ for networks with $n \le 10^5$ nodes, $T_p = 2500$ for networks with $10^5 < n \le 10^6$ nodes, and $T_p = 500$ for networks with $n > 10^6$ nodes.
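The evaluation index $F = \sum_q |c''_{\max}| / n$ can be computed in a single pass by adding nodes back in reverse order, in the same spirit as the reverse insertion of steps S1 to S4. The sketch below is one possible reading, not the patented procedure: it assumes that "at a value of $q$" means "after the first $q \cdot n$ nodes of the sequence have been removed", it evaluates the whole sequence with a step of one node rather than a segment with step $1/n_s$, and it uses a union-find structure that the text does not mention. The name `evaluate_sequence` is hypothetical.

```python
def evaluate_sequence(adj, sequence):
    """Sum over i = 1..n of (largest component after removing the first i nodes) / n."""
    n = len(sequence)
    parent, size = {}, {}

    def find(u):
        # Union-find root lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    largest = 0
    total = 0.0
    # Walk backwards: just before sequence[i] is re-inserted, the nodes present
    # are sequence[i+1:], i.e. the first i+1 nodes of the sequence are removed.
    for i in range(n - 1, -1, -1):
        total += largest / n
        u = sequence[i]
        parent[u], size[u] = u, 1
        for v in adj[u]:
            if v in parent:                      # neighbor already re-inserted
                ru, rv = find(u), find(v)
                if ru != rv:
                    if size[ru] < size[rv]:
                        ru, rv = rv, ru
                    parent[rv] = ru              # union by size
                    size[ru] += size[rv]
        largest = max(largest, size[find(u)])
    return total

# Path graph 0-1-2-3: removing interior nodes first fragments it sooner.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
f_interior_first = evaluate_sequence(path, [1, 2, 0, 3])
f_end_first = evaluate_sequence(path, [0, 1, 2, 3])
```

Under this reading, a sequence that breaks the network into small components early receives a smaller value of $F$ than one that leaves a large component intact.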

Further, the specific process of finding the propagation source by using the RIS algorithm in step 6 is as follows:

Step a: initialize the node set $\Lambda$ as the empty set; for the subgraph $V'_c$, let $G'(V', E')$ be its reverse network, which satisfies $|V'| = |V'_c|$ and, whenever edge $e_{ba}$ is contained in the subgraph $V'_c$, edge $e_{ab} \in E'$; here $|V'_c|$ denotes the number of nodes in the subgraph $V'_c$, $V'$ and $E'$ denote the node set and the edge set of the reverse network respectively, and $|V'|$ denotes the number of nodes in the reverse network;

Step b: randomly select a node $m$ from the infected observation point subset $O'$, and compute the random-walk step count $t''$ of the node according to $t'' = t'_0 + t'_m$, where $t'_m$ denotes the relative infection time of node $m$ and $t'_0$ is a random number drawn from the interval

where

takes values in $[0, 20]$;

Step c: take node $m$ as the starting point of the random walk, walk to a randomly chosen neighbor of node $m$, and then change the state of node $m$ to recovered;

Step d: continue the walk for $t''$ steps, let $v$ denote the last node of the random walk, and update the node set according to $\Lambda = \Lambda \cup \{v\}$;

Step e: repeat steps b to d $T_\Lambda$ times to obtain the final updated node set $\Lambda$; the node that occurs most often in the final node set $\Lambda$ is the propagation source

$T_\Lambda$ is set to $10^6$.
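Steps a to e above can be sketched as repeated random walks on the reverse network. This is a minimal sketch under several assumptions that the text leaves open: the subgraph is given as a list of directed edges `(b, a)`, the random offset $t'_0$ is drawn uniformly from the integers in `[0, t0_max]` (the actual interval is not reproduced in the source), relative infection times are integers, a walk stops early when it reaches a node with no outgoing reverse edge, and the state change of node $m$ in step c is omitted. All function and parameter names are hypothetical.

```python
import random
from collections import Counter

def reverse_network(edges):
    """For every edge (b, a) of the subgraph, the reverse network has the edge (a, b)."""
    rev = {}
    for b, a in edges:
        rev.setdefault(a, []).append(b)
        rev.setdefault(b, [])
    return rev

def estimate_source(edges, rel_time, t0_max, n_walks, rng=None):
    """Return the most frequent end node of the reverse walks and all end-node counts."""
    rng = rng or random.Random()
    rev = reverse_network(edges)
    observers = list(rel_time)               # earliest-infected observation points O'
    hits = Counter()                         # multiset Lambda of walk end nodes
    for _ in range(n_walks):
        m = rng.choice(observers)
        steps = rng.randint(0, t0_max) + rel_time[m]   # t'' = t'_0 + t'_m
        v = m
        for _ in range(steps):
            if not rev[v]:                   # nowhere left to walk (assumption: stop here)
                break
            v = rng.choice(rev[v])
        hits[v] += 1
    return hits.most_common(1)[0][0], hits

# Infection chain s -> a -> b -> m, observed at m three time units after the start.
chain = [("s", "a"), ("a", "b"), ("b", "m")]
source, counts = estimate_source(chain, {"m": 3}, t0_max=5, n_walks=200,
                                 rng=random.Random(0))
```

On this chain every walk has at least three steps and a single possible route, so all walks end at `s`; on a general subgraph the end nodes spread out and the most frequent one is taken as the estimate.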

The beneficial effects of the invention are as follows. Because network percolation is combined with evolutionary computation, the sequence of observation points deployed in the network is optimized and the sizes of the connected components that remain after the observation point set is removed are kept small, so the propagation source can be located with fewer observation points and the cost of protecting the network is reduced. By drawing on strategies from the network immunization problem, the observation point information is used to isolate a small number of nodes and so control the spread of the epidemic, which narrows both the localization range and the search range for the real propagation source. The invention provides technical support for suppressing malicious propagation under resource constraints and can be used for propagation source positioning in large networks.

Drawings

FIG. 1 is a flow chart of the network propagation source positioning method based on the seepage process and evolutionary computation according to the present invention;

FIG. 2 is a schematic diagram of the process of determining a propagation source localization sub-graph according to the present invention;

In the figure: (a) infection graph obtained from the propagation process; (b) first example of a subgraph of infection graph (a); (c) second example of a subgraph of infection graph (a);

FIG. 3 is a graphical representation of the results of candidate set ratios for different infection rates obtained using different methods in four different networks;

In the figure: (a) results for the ER model network; (b) results for the SF model network; (c) results for the PG network; (d) results for the SCM network;

FIG. 4 is a graph showing the results of candidate set ratios for different observation point ratios using different network immunization methods in two networks;

In the figure: (a) results in the LOCG network with $R_d = 0$; (b) results in the LOCG network with $R_d = 1$; (c) results in the WG network with $R_d = 0$; (d) results in the WG network with $R_d = 1$.

Detailed Description

The invention is further described below with reference to the drawings and embodiments; the invention includes, but is not limited to, the following embodiments.

As shown in fig. 1, the present invention provides a network propagation source positioning method based on a percolation process and evolutionary computation, which is implemented as follows:

Step 1: input an experimental network data set $G(V, E)$, where $V$ denotes the set of network nodes and $E$ denotes the set of edges in the network; initialize the edge infection rate $\beta_{uv}$ and the node recovery rate $\gamma_u$ of the fixed propagation model, where $\beta_{uv}$ takes values in $[0, 1]$ and $\gamma_u$ takes values in $[0, 1]$; determine the outbreak rate $\varepsilon$, which takes values in $[0, 1]$; initialize all nodes in the network to the susceptible state;

Step 2: the propagation source localization problem can be viewed as a network immunization problem, whose objective is to control the spread of an epidemic by isolating a small number of nodes. The problem then becomes: can the network be broken into connected components by removing a few nodes, so that every connected component is small? For each network there is a critical threshold $q_c$ of $q$, where $q$ is the proportion of the observation point set in the network (the observation point set is the removed node set): (1) if $q < q_c$, the graph contains a large connected component with high probability; (2) if $q > q_c$, the graph contains no large connected component with high probability. In general, the configuration of the observation point set plays an important role in limiting the size of the connected components, so obtaining a better node sequence is the crux of the propagation source localization problem. The invention constructs an initial sequence $S$ of all nodes of the graph by random ordering or by node-degree ordering, and updates the node sequence $S$ with the AEF algorithm, which is based on an evolutionary framework. The specific AEF process is as follows:

step a: according to

Step b: update each segment sequence $S_k$ in parallel (the segments are independent and do not affect one another) according to the following process to obtain the updated segment sequence $S_k$, where the number of updates is

where $S'_k(z)$ denotes the $z$-th node of the sequence $S'_k$;

Step S3: from the candidate set

select the node $r$ that satisfies

where $y$ is a node of the candidate set

and $\xi(y)$ denotes the size of the connected component of node $y$, computed according to

or

Step S5: if $j > 0$, return to step S2; otherwise, if

then $S_k = S'_k$; let

and return to step S1; when

the sequence $S_k$ obtained is the updated sequence; here $F$ denotes the evaluation index function of a sequence, computed as $F = \sum_q |c''_{\max}| / n$, where $|c''_{\max}|$ denotes the number of nodes in the largest connected component of the sequence at each value of $q$, $n$ denotes the total number of nodes, and $q$ ranges over $[1/n_s, 1]$ with step $1/n_s$;

denotes the concatenation of the two sequences,

and likewise for the other concatenation;

From the updated node sequence, select nodes from front to back in proportion $q$ as observation points to form the observation point set $O$, mark the observation points on the network, and record the absolute time at which each observation point is infected; from the observation point set $O$, randomly select observation points in proportion $R_d$ to form the set $O_d$, whose observation points also record the direction from which they were infected; $q$ takes values in $[0, 0.2]$ and $R_d$ takes values in $[0.001, 1]$;

Step 3: at time $t = 0$, randomly select a propagation source $v_s$ from all nodes of the data set $G(V, E)$, set it to the infected state, and start the propagation process; during propagation, each observation point records the absolute time of its own infection, and the observation points in the set $O_d$ also record the direction from which they were infected; each node in the infected state spreads the virus to its neighbor nodes in the susceptible state with edge infection rate $\beta_{uv}$ and, at the same time, enters the recovered state with recovery rate $\gamma_u$; a node that has been infected enters the infected state and becomes a new infecting node; all nodes in the infected state continue the propagation behavior until the number $n_1$ of infected nodes and the number $n_2$ of recovered nodes in the network satisfy $(n_1 + n_2)/n \ge \varepsilon$, at which point the propagation process stops, and the resulting distribution of node infection states forms the infection graph $G_I$; here $n$ denotes the number of nodes in the network $G$;

Step 4: take the observation point set $O$ of step 2 as the removed node set $V_r$; the nodes of the data set $G$ other than the observation points form the remaining node set $V_o$; removing the nodes of $V_r$, together with the edges that connect nodes of $V_o$ to nodes of $V_r$, yields a number of connected components of different sizes, each larger than 1, where $c_i$ denotes the $i$-th connected component, $i = 1, 2, \ldots, C$, and $C$ denotes the total number of connected components; according to

determine the boundary of connected component $c_i$

where $u$ denotes any node in the removed node set $V_r$, $v$ denotes any node in connected component $c_i$, and $e_{uv}$ denotes the edge connecting node $u$ and node $v$ in the infection graph $G_I$; according to

Fig. 2 is a schematic diagram of the process of determining the propagation source localization subgraph. Nodes are divided by infection status into three types, susceptible, infected, and recovered, which are shown in the figure in three different gray levels; the propagation source is drawn as a star-shaped node and the observation points as cross nodes. Panel (a) is the infection graph, in which the label $t_i$ on an observation point indicates the time at which observation point $i$ was infected; for example, $t_1$ is the time at which observation point 1 was infected. Panel (b) shows the case in which $t_1$ is the earliest infection time: $O' = \{1\}$, and the union of the connected components covered by the set $O'$ is the subgraph $V'_c$, shown hatched in the figure. Panel (c) shows the case in which $t_1 = t_2$ is the earliest infection time: $O' = \{1, 2\}$, and the subgraph $V'_c$ is the shaded part of the figure.
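The construction illustrated in Fig. 2 can be sketched in two stages: split the network into connected components by removing the observation points, then take the union of the components that contain neighbors of the earliest-infected observation points. This is a minimal sketch, assuming an undirected adjacency dictionary, at least one infected observation point, and a literal reading of "larger than 1" that discards single-node components; whether the observation points themselves belong to $V'_c$ is not stated, and they are left out here. The names `components_after_removal` and `target_subgraph` are hypothetical.

```python
from collections import deque

def components_after_removal(adj, removed):
    """Connected components (size > 1) of the graph once `removed` nodes are deleted."""
    removed = set(removed)
    seen, components = set(), []
    for start in adj:
        if start in removed or start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        if len(comp) > 1:
            components.append(comp)
    return components

def target_subgraph(adj, observers, infected_at):
    """Return (O', node set of V'_c) for the earliest-infected observation points."""
    comps = components_after_removal(adj, observers)
    comp_of = {v: i for i, comp in enumerate(comps) for v in comp}
    t_min = min(infected_at[x] for x in observers if x in infected_at)
    earliest = [x for x in observers if infected_at.get(x) == t_min]
    covered = set()
    for x in earliest:
        for v in adj[x]:                     # neighbors of x
            if v in comp_of:
                covered |= comps[comp_of[v]]
    return earliest, covered

# Path graph 0-1-...-7 with observation points at nodes 2 and 5.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 7] for i in range(8)}
single = target_subgraph(path, [2, 5], {2: 4, 5: 7})   # one earliest observer, as in panel (b)
tie = target_subgraph(path, [2, 5], {2: 4, 5: 4})      # tied infection times, as in panel (c)
```

When two observation points share the earliest infection time, the subgraph grows to the union of both coverage regions, which matches the difference between panels (b) and (c).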

On the subgraph $V'_c$, the invention uses the RIS algorithm proposed by Borgs et al. to find the propagation source

The specific steps are as follows:

Step a: initialize the node set $\Lambda$ as the empty set; for the subgraph $V'_c$, let $G'(V', E')$ be its reverse network, which satisfies $|V'| = |V'_c|$ and, whenever edge $e_{ba}$ is contained in the subgraph $V'_c$, edge $e_{ab} \in E'$; here $|V'_c|$ denotes the number of nodes in the subgraph $V'_c$, $V'$ and $E'$ denote the node set and the edge set of the reverse network respectively, and $|V'|$ denotes the number of nodes in the reverse network;

Step b: randomly select a node $m$ from the infected observation point subset $O'$, and compute the random-walk step count $t''$ of the node according to $t'' = t'_0 + t'_m$, where $t'_m$ denotes the relative infection time of node $m$ and $t'_0$ is a random number drawn from the interval

where

takes values in $[0, 20]$;

$T_\Lambda$ is set to $10^6$.

Step 7: for the propagation source

To verify the effectiveness of the method of the invention, experiments were performed on model networks and real networks; the experimental network data are shown in Table 1.

TABLE 1

Data set	Number of nodes	Number of edges
ER	10000	35000
SF	10000	40000
PG	4941	6594
SCM	7228	24784
LOCG	196591	950327
WG	875713	4322051

In the experiments, the propagation source localization algorithm JC (Jordan Center) and three methods from the field of network immunization, CI (Collective Influence), MSRG (Min-Sum and Reverse-Greedy), and FINDER (FInding key players in Networks through DEep Reinforcement learning), are used as comparison methods. The JC algorithm uses all infected and recovered nodes to locate the propagation source, and a candidate set is constructed by ranking the nodes. The relevant parameters of the method are set as follows:

$T_p = 5000$ for networks with $n \le 10^5$ nodes, $T_p = 2500$ for networks with $10^5 < n \le 10^6$ nodes, and $T_p = 500$ for networks with $n > 10^6$ nodes; in the RIS algorithm, $T_\Lambda = 10^6$; in the experiments, the edge infection rates $\beta_{uv}$ in a network are all set to the same fixed value $\beta$, the recovery rates $\gamma_u$ are all set to the same fixed value $\gamma = 0.1$, and the outbreak rate is $\varepsilon = 0.1$.
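With uniform rates as in the experimental setting, the propagation process of step 3 can be simulated in a few lines. This is a minimal sketch, assuming synchronous discrete time steps, a single rate $\beta$ for every edge and a single rate $\gamma$ for every node, recovery attempted after a node's infection attempts in the same step, and an early stop if the infection dies out before the outbreak rate is reached; none of these scheduling details is specified in the text. The function name `simulate_sir`, the ring network, and the seed are illustrative.

```python
import random

def simulate_sir(adj, source, beta, gamma, eps, rng=None, max_steps=10_000):
    """Run SIR spreading from `source` until (infected + recovered) / n >= eps."""
    rng = rng or random.Random()
    n = len(adj)
    state = {v: "S" for v in adj}        # S = susceptible, I = infected, R = recovered
    state[source] = "I"
    infected_at = {source: 0}            # absolute infection time of each node
    infected_by = {source: None}         # direction information: who infected whom
    t = 0
    while t < max_steps:
        n_inf = sum(1 for s in state.values() if s == "I")
        n_rec = sum(1 for s in state.values() if s == "R")
        if (n_inf + n_rec) / n >= eps or n_inf == 0:
            break
        t += 1
        for u in [v for v in adj if state[v] == "I"]:   # nodes infected before this step
            for v in adj[u]:
                if state[v] == "S" and rng.random() < beta:
                    state[v] = "I"
                    infected_at[v] = t
                    infected_by[v] = u
            if rng.random() < gamma:
                state[u] = "R"
    return state, infected_at, infected_by

# Ring of 50 nodes with gamma = 0.1 and eps = 0.1 as in the experiments; beta = 0.5.
ring = {i: [(i - 1) % 50, (i + 1) % 50] for i in range(50)}
state, infected_at, infected_by = simulate_sir(
    ring, source=0, beta=0.5, gamma=0.1, eps=0.1, rng=random.Random(1))
```

An observation point would read its own entry of `infected_at`, and a direction-aware observation point in $O_d$ would additionally read its entry of `infected_by`.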

Fig. 3 shows the candidate set ratios obtained by different methods in four different networks for different infection probabilities. JC denotes the JC algorithm, Hubs_s denotes the observation point selection method based on node-degree ordering, and PrEF($R_d$) denotes the PrEF algorithm for a particular value of $R_d$; for example, PrEF(0) means that no direction information is available, while PrEF(1) means that all direction information is known. The abscissa is the infection rate $\beta$ and the ordinate is the candidate set ratio $\varphi$. As can be seen from (a) and (b) of Fig. 3, when the propagation process is symmetric (that is, when the infection probability is large), the JC algorithm is an effective propagation source estimator, but its performance degrades as the infection probability decreases. By contrast, the method of the invention is more stable over the whole range of infection probabilities and outperforms the JC algorithm when the infection probability is low: for example, at $\beta = 0.1$ in the SF network, $\varphi(\mathrm{PrEF}(1)) = 0.0004$ whereas $\varphi(\mathrm{JC}) = 0.0721$. In addition, PrEF(0) performs clearly better in the ER network than in the SF network, which indicates that the hub nodes of the SF network affect the performance of PrEF(0). The real networks in (c) and (d) of Fig. 3 further confirm this conclusion: in the PG network the Hubs_s algorithm works better than the JC algorithm, but in the SCM network only the present method, PrEF(1), works, while the other methods fail.

Fig. 4 shows the candidate set ratios obtained by different methods in two large networks for different observation point ratios. CI denotes the CI algorithm, MSRG the MSRG algorithm, FINDER the FINDER algorithm, and PrEF the method of the invention. The abscissa is the observation point ratio $q$ and the ordinate is the candidate set ratio $\varphi$; $\beta$ is fixed at 0.5 in this comparison. Panels (a) and (b) are for the LOCG network and panels (c) and (d) for the WG network; $R_d = 0$ in panels (a) and (c), and $R_d = 1$ in panels (b) and (d). It can be seen that as the observation point ratio (the removal ratio) approaches 0, the candidate set ratio approaches 1; that a method which performs well at $R_d = 0$ also performs well at $R_d = 1$; and that, for a given value of $q$, the PrEF algorithm of the invention yields a smaller candidate set than the CI, MSRG, and FINDER algorithms and so narrows the search range for the propagation source, especially in the WG network.

In summary, the present invention realizes network propagation source positioning, in which the size of the observation point set, the proportion of observation points that acquire direction information, and the strategy used to generate the observation point set all play a crucial role in narrowing the propagation source search range and improving the positioning efficiency. In particular, the method of the invention performs well over the whole value range of q and is robust across different propagation models. By drawing on the network immunization problem, the invention realizes the idea of decomposing the network first and then positioning the propagation source; it is effective, efficient and stable, and is suitable for positioning propagation sources in large-scale networks.

Claims

1. A network propagation source positioning method based on a seepage process and evolutionary computation is characterized by comprising the following steps:

Step 2: constructing an initial sequence S of all nodes of the graph by random ordering or node degree ordering; updating the node sequence S by the AEF algorithm; selecting, from front to back in the updated node sequence, nodes in proportion q as observation points to form an observation point set O; marking the observation points on the network, the observation points recording the absolute time at which they are infected; and randomly selecting from the observation point set O a proportion R_d = |O_d|/|O| of observation points to form a set O_d, whose observation points additionally record the direction from which they are infected; q has a value range of [0, 0.2], and R_d has a value range of [0.001, 1];

Step 3: at time t = 0, randomly selecting a propagation source V_s from all nodes of the data set G(V, E), setting it to the infected state, and starting the propagation process; during propagation, each observation point records the absolute time of its own infection, and the observation points in the set O_d also record the direction from which they are infected; a node u in the infected state transmits the virus to its neighbor nodes v in the susceptible state at the edge infection rate β_uv, and at the same time recovers at the recovery rate γ_u and enters the recovered state; a node that has been infected enters the infected state and becomes a new infected node, and all nodes in the infected state continue the propagation behavior until the number n1 of infected nodes and the number n2 of recovered nodes in the network satisfy (n1 + n2)/n ≥ ε, whereupon the propagation process stops and the resulting distribution of node infection states forms the infection graph G_I; where n represents the number of nodes in the network G;

The determination of connected component c_i is limited by

where u represents a node of the removed node set V_r, v represents an arbitrary node of connected component c_i, and e_uv denotes an edge of the infection graph G_I connecting node u and node v; according to

Step 5: selecting the observation points with the earliest infection time to form an infected observation point subset O′; according to the formula

Step 7: the propagation source
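
Steps 2 and 3 of claim 1 can be illustrated with a short sketch. It is not the patented implementation: it assumes a discrete-time SIR process, a single infection rate `beta` for every edge and a single recovery rate `gamma` for every node (the claim allows per-edge β_uv and per-node γ_u), and an adjacency-list dictionary as the network. The function names `select_observers` and `simulate_sir` are hypothetical, and the node sequence is taken as already updated.

```python
import random


def select_observers(sequence, q, r_d, rng):
    """Step 2 sketch: the first q-fraction of the node sequence forms O;
    a random R_d-fraction of O forms O_d (records infection direction)."""
    n_obs = int(q * len(sequence))
    observers = list(sequence[:n_obs])
    n_dir = int(round(r_d * len(observers)))
    directional = set(rng.sample(observers, n_dir))
    return observers, directional


def simulate_sir(adj, source, beta, gamma, eps, observers, directional, rng):
    """Step 3 sketch: discrete-time SIR (an assumption). Stops when
    (n1 + n2) / n >= eps or when no infected node remains."""
    n = len(adj)
    state = {u: "S" for u in adj}
    state[source] = "I"
    obs_set = set(observers)
    infected_time = {}   # observer -> absolute infection time
    infected_from = {}   # observer in O_d -> neighbor that infected it
    if source in obs_set:
        infected_time[source] = 0
    t = 0

    def outbreak():
        # Fraction of nodes that are infected or recovered.
        return sum(1 for s in state.values() if s != "S") / n

    while outbreak() < eps and any(s == "I" for s in state.values()):
        t += 1
        infected = [u for u in adj if state[u] == "I"]
        newly = {}
        for u in infected:
            for v in adj[u]:
                if state[v] == "S" and v not in newly and rng.random() < beta:
                    newly[v] = u
        for u in infected:
            if rng.random() < gamma:
                state[u] = "R"
        for v, u in newly.items():
            state[v] = "I"
            if v in obs_set:
                infected_time[v] = t
                if v in directional:
                    infected_from[v] = u
    return state, infected_time, infected_from
```

With `beta = 1.0` and `gamma = 0.0` the process is deterministic, which makes it easy to check that observers record the expected times and directions on a small path graph.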

2. The method for positioning network propagation sources based on the seepage process and the evolutionary computation as claimed in claim 1, wherein: the specific process of updating the node sequence S by using the AEF algorithm in step 2 is as follows:

step a: according to

Step b: updating each segment sequence S_k in parallel according to the following process to obtain an updated segment sequence S_k, wherein the number of updates is

where S′_k(z) denotes the z-th node in the sequence S′_k;

Step S3: from the candidate set

selecting the node satisfying

where y is a node in the candidate set

ξ(y) represents the size of the connected component containing node y, according to

or

Step S5: if j is greater than 0, returning to step S2; otherwise, if

then S_k = S′_k, and let

and return to step S1; when

is satisfied, the resulting sequence S_k is the updated sequence; where F denotes the evaluation index function of a sequence, calculated as F = Σ_q |c″_max| / n, in which |c″_max| represents the number of nodes of the largest connected component of the sequence under different values of q, n represents the total number of nodes, and q ranges over [1/n_s, 1] with a step of 1/n_s;

denotes the sequence obtained by concatenating two sequences;
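
The evaluation index F = Σ_q |c″_max| / n used in claim 2 can be sketched as follows. The sketch rests on assumptions the claim text does not spell out: n_s is read as the length of the sequence, so each step of q removes one more node from the front of the sequence, and |c″_max| is read as the size of the largest connected component of the network that remains after those removals. The function names are hypothetical.

```python
from collections import deque


def largest_component_size(adj, removed):
    """Size of the largest connected component after deleting `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


def evaluate_sequence(adj, sequence):
    """F = sum over q of |c_max| / n, removing one more node of the
    sequence at each step (assumed reading of q = 1/n_s, 2/n_s, ..., 1)."""
    n = len(adj)
    removed = set()
    total = 0.0
    for node in sequence:
        removed.add(node)
        total += largest_component_size(adj, removed) / n
    return total
```

Under this reading, a sequence that fragments the network earlier gives a smaller F: on a five-node path, removing the middle node first scores lower than removing the nodes from one end to the other.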

3. The method for positioning network propagation sources based on the seepage process and the evolutionary computation as claimed in claim 1, wherein: the specific process of finding the propagation source by adopting the RIS algorithm in the step 6 is as follows:

Step a: initializing the node set Λ as an empty set; for the subgraph V′_c, letting G′(V′, E′) be its reverse network, satisfying |V′| = |V′_c|, and, if edge e_ba is contained in the subgraph V′_c, then edge e_ab ∈ E′; where |V′_c| represents the number of nodes in the subgraph V′_c, V′ and E′ respectively represent the node set and the edge set of the reverse network, and |V′| represents the number of nodes in the reverse network;

is a random number,

has a value range of [0, 20];

T_Λ takes the value 10^6.
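
The reverse network of step a in claim 3 can be sketched as below. This covers only the edge reversal stated in the claim (an edge e_ba in the subgraph yields an edge e_ab in E′); the sampling steps of the RIS algorithm are not reproduced, because their formulas are missing from this text. The function name `reverse_network` and the edge-list input format are assumptions.

```python
def reverse_network(sub_nodes, sub_edges):
    """Build the reverse network G'(V', E') of a directed subgraph.

    sub_edges holds pairs (b, a) meaning an edge from b to a; the result
    maps each node a to the list of nodes b, i.e. the reversed edge a -> b.
    """
    nodes = set(sub_nodes)
    reverse_adj = {v: [] for v in nodes}   # |V'| equals the subgraph size
    for b, a in sub_edges:
        if a in nodes and b in nodes:      # ignore edges leaving the subgraph
            reverse_adj[a].append(b)
    return reverse_adj
```

For example, a subgraph with edges 1 → 2 and 2 → 3 yields a reverse network with edges 2 → 1 and 3 → 2.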