CN116032828A

CN116032828A - Betweenness centrality approximate calculation method and device

Info

Publication number: CN116032828A
Application number: CN202310167081.3A
Authority: CN
Inventors: 王怀习; 束妮娜; 马涛; 王晨; 冯也来; 黄郡; 沈培佳; 杨成武
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2023-02-27
Filing date: 2023-02-27
Publication date: 2023-04-28
Anticipated expiration: 2043-02-27
Also published as: CN116032828B

Abstract

The invention discloses a method and a device for approximate calculation of betweenness centrality. The method comprises: computing the eigenvector centrality of a network G; using the eigenvector centrality to compute the node multiplicity sequence of G and construct a multiple network G₂; selecting sample nodes from the multiple network G₂ to obtain a duplicate-free sample node set S; and computing the betweenness centrality of the duplicate-free sample node set S to obtain an approximate betweenness centrality value for the network G. Because eigenvector centrality is cheap to compute, the invention uses it as the basis for selecting the sample nodes of the betweenness centrality calculation, so that the shortest paths between sample nodes better represent all shortest paths between network nodes. The approximate betweenness centrality of the network is then obtained from the shortest paths between the sample nodes, which enables fast calculation of betweenness centrality in large-scale networks.

Description

Betweenness centrality approximate calculation method and device

Technical Field

The invention relates to the technical field of networks, in particular to a method and a device for approximate calculation of betweenness centrality.

Background

Since the beginning of the 21st century, the informatization of human society has advanced rapidly: the mobile internet has become widespread, the Internet of Things has been deployed quickly, the number of network devices has grown markedly, and both the number of physical network devices and the complexity of their interconnections have increased. Real networks in fields such as power networks, telecommunication networks, traffic networks, social networks, communication networks and the Internet of Things together form an all-encompassing networked world.

Real networks provide rich research objects for network science. Abstracting different real networks usually yields abstract networks that share the same network attributes, and the systematic study of the properties and laws of abstract networks has formed the discipline of complex network science. The main research topics of network science include network centrality measures and global network characteristics, network model structure and function analysis, network link prediction and recommendation algorithms, network dynamics, and network control and optimization; among these, research on network centrality measures plays a fundamental role. To characterize the importance of nodes and edges in a network, researchers have proposed various centrality measures that evaluate nodes and edges from different angles, providing theoretical support for identifying key nodes and key edges in real networks.

Betweenness centrality is an index that describes the importance of a node or an edge in a network from the perspective of network connectivity. It is an important research topic in complex networks, and research on fast methods for calculating it has considerable practical significance. For example, core nodes in a social network can be used to build an influence ranking, attract users and carry out targeted network marketing; protecting the key servers in a network against viruses or hackers keeps the whole network running normally; and isolating the source of an infection effectively prevents an infectious virus from spreading. In all of these applications, the importance of each node must be known in order to find the key nodes of the network. However, as the number of nodes grows and the connections between them become denser, the network topology becomes more complex, and conventional betweenness centrality calculation methods can no longer fully meet the needs of large-scale networks. A fast method for calculating betweenness centrality is therefore urgently needed.

Disclosure of Invention

In view of the above problems, the present invention aims to provide a method and a device for approximate calculation of betweenness centrality. Because eigenvector centrality is cheap to compute, it is used as the basis for selecting the sample nodes of the betweenness centrality calculation, so that the shortest paths between the sample nodes better represent all shortest paths between network nodes. Approximate betweenness centrality values are then obtained from the shortest paths between the sample nodes, which helps to achieve fast calculation of betweenness centrality in large-scale networks.

To achieve the above object, a first aspect of the embodiments of the present invention discloses a betweenness centrality approximate calculation method, the method comprising:

computing the eigenvector centrality of a network G to obtain the eigenvector centrality $x = (x_1, x_2, \dots, x_n)$, where n is the number of components of the vector $x$ and equals the total number of nodes in the network G;

using the eigenvector centrality to compute the node multiplicity sequence of the network G and construct a multiple network G₂;

selecting sample nodes from the multiple network G₂ to obtain a duplicate-free sample node set S;

computing the betweenness centrality of the duplicate-free sample node set S to obtain the approximate betweenness centrality value of the network G.

As an optional implementation, in the first aspect of the embodiments of the present invention, using the eigenvector centrality to compute the node multiplicity sequence of the network G and construct the multiple network G₂ comprises:

computing the mean of the components of the eigenvector centrality to obtain the component mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$;

judging whether the component mean is smaller than a preset average multiplicity c, to obtain a judgment result. The value range of the average multiplicity c is bounded in terms of the average node degree of the network G; this range ensures that the samples drawn on the basis of eigenvector centrality are both broadly representative and sufficiently diverse;

according to the judgment result, obtaining a vector coefficient by means of a preset vector coefficient calculation model;

multiplying the eigenvector centrality by the vector coefficient to obtain a second eigenvector;

rounding up each component of the second eigenvector to obtain a node multiplicity vector. Each component of the node multiplicity vector is a non-negative integer, and its i-th component is the multiplicity assigned to the i-th node of the network G;

processing the network G with the node multiplicity vector to obtain the multiple network G₂.

As an optional implementation, in the first aspect of the embodiments of the present invention, obtaining the vector coefficient by means of the preset vector coefficient calculation model according to the judgment result comprises:

when the judgment result is yes, computing the smallest integer for which a preset first vector coefficient calculation model holds, and taking that smallest integer as the value of the vector coefficient. In the first vector coefficient calculation model, $\lceil \cdot \rceil$ denotes rounding up, n is the total number of nodes in the network G, the integer sought is greater than 0, and c is the preset average multiplicity;

when the judgment result is no, computing the largest integer for which a preset second vector coefficient calculation model holds, and taking the reciprocal of that largest integer as the value of the vector coefficient. In the second vector coefficient calculation model, $\lceil \cdot \rceil$ denotes rounding up, n is the total number of nodes in the network G, the integer sought is greater than 0, and c is the preset average multiplicity.

As an optional implementation, in the first aspect of the embodiments of the present invention, selecting sample nodes from the multiple network G₂ to obtain the duplicate-free sample node set S comprises:

selecting s nodes from the multiple network G₂ according to a uniform random probability distribution to obtain a first sample node set, where s is the product of n and a preset sampling ratio, rounded up. The sampling ratio is chosen by weighing the representativeness of the sample against the efficiency of the approximate calculation method;

judging whether the first sample node set contains duplicate nodes, to obtain a second judgment result;

if the second judgment result is yes, deleting the duplicate nodes from the first sample node set, selecting new nodes from the multiple network according to the uniform random probability distribution and adding them to the first sample node set until its total number of nodes reaches s, and then triggering the judgment of whether the first sample node set contains duplicate nodes again;

if the second judgment result is no, confirming the first sample node set as the duplicate-free sample node set S.

As an optional implementation, in the first aspect of the embodiments of the present invention, computing the eigenvector centrality of the network G to obtain the eigenvector centrality comprises:

constructing the adjacency matrix $A = (a_{ij})_{n \times n}$ of the network G, where $a_{ij} = 1$ if there is an edge between node $v_i$ and node $v_j$ in the network G, and $a_{ij} = 0$ otherwise;

constructing a first characteristic equation based on the adjacency matrix A. The first characteristic equation is $Ax = \lambda x$, where A is the adjacency matrix of the network, $\lambda$ is an eigenvalue and $x$ is an eigenvector;

solving the first characteristic equation to obtain the eigenvector corresponding to the largest eigenvalue, and taking that eigenvector as the eigenvector centrality.

As an optional implementation, in the first aspect of the embodiments of the present invention, computing the betweenness centrality of the duplicate-free sample node set S to obtain the approximate betweenness centrality value of the network G comprises:

processing the duplicate-free sample node set S with a betweenness centrality calculation model to obtain the betweenness centrality of the sample node set S.

The betweenness centrality calculation model is:

$B(v_i) = \sum_{v_j, v_k \in S,\; j \neq k,\; v_i \neq v_j, v_k} \frac{\sigma_{jk}(v_i)}{\sigma_{jk}}, \qquad B(e) = \sum_{v_j, v_k \in S,\; j \neq k} \frac{\sigma_{jk}(e)}{\sigma_{jk}}$

where $B(v_i)$ is the betweenness centrality of node $v_i$, $B(e)$ is the betweenness centrality of edge $e$, $\sigma_{jk}$ is the number of shortest paths from node $v_j$ to node $v_k$, $\sigma_{jk}(v_i)$ is the number of shortest paths from node $v_j$ to node $v_k$ that pass through node $v_i$, $\sigma_{jk}(e)$ is the number of shortest paths from node $v_j$ to node $v_k$ that pass through edge $e$, and S is the duplicate-free sample node set.

The betweenness centrality of the sample node set S is taken as the approximation of the betweenness centrality of the network G, giving the approximate betweenness centrality value of the network G.
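The sample-restricted model can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the network is taken to be unweighted and undirected, each unordered pair of sample nodes is counted once, and the function name `sampled_betweenness` and the adjacency-dictionary input are hypothetical choices. It uses Brandes-style dependency accumulation in which only sample nodes act as path endpoints.

```python
from collections import deque


def sampled_betweenness(adj, sample):
    """Node and edge betweenness over shortest paths whose two endpoints
    both belong to `sample`. `adj` maps each node to its neighbours."""
    sample = set(sample)
    node_bc = {v: 0.0 for v in adj}
    edge_bc = {}
    for s in sample:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            # Only nodes inside the sample count as path targets.
            target = 1.0 if (w in sample and w != s) else 0.0
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (target + delta[w])
                key = frozenset((v, w))
                edge_bc[key] = edge_bc.get(key, 0.0) + share
                delta[v] += share
            if w != s:
                node_bc[w] += delta[w]
    # Each unordered pair {v_j, v_k} was processed from both endpoints.
    node_bc = {v: b / 2.0 for v, b in node_bc.items()}
    edge_bc = {e: b / 2.0 for e, b in edge_bc.items()}
    return node_bc, edge_bc
```

When `sample` contains every node, the result coincides with the ordinary (unnormalized) betweenness of the network; with a smaller sample, only one breadth-first search per sample node is needed.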

A second aspect of the embodiments of the present invention discloses a betweenness centrality approximate calculation device, the device comprising:

a first calculation module, configured to compute the eigenvector centrality of a network G to obtain the eigenvector centrality, whose number of components n equals the total number of nodes in the network G;

a first network construction module, configured to use the eigenvector centrality to compute the node multiplicity sequence of the network G and construct a multiple network G₂;

a second network construction module, configured to select sample nodes from the multiple network G₂ to obtain a duplicate-free sample node set S;

a second calculation module, configured to compute the betweenness centrality of the duplicate-free sample node set S to obtain the approximate betweenness centrality value of the network G.

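As an optional implementation, in the second aspect of the embodiments of the present invention, the first calculation module computes the eigenvector centrality of the network G specifically by:

constructing the adjacency matrix $A = (a_{ij})_{n \times n}$ of the network G, where $a_{ij} = 1$ if there is an edge between node $v_i$ and node $v_j$ in the network G, and $a_{ij} = 0$ otherwise;

constructing a first characteristic equation based on the adjacency matrix A. The first characteristic equation is $Ax = \lambda x$, where A is the adjacency matrix of the network, $\lambda$ is an eigenvalue and $x$ is an eigenvector;

solving the first characteristic equation to obtain the eigenvector corresponding to the largest eigenvalue, and taking that eigenvector as the eigenvector centrality.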

As an optional implementation, in the second aspect of the embodiments of the present invention, the first network construction module uses the eigenvector centrality to compute the node multiplicity sequence of the network G and construct the multiple network G₂ specifically by:

computing the mean of the components of the eigenvector centrality to obtain the component mean;

judging whether the component mean is smaller than a preset average multiplicity c, to obtain a judgment result. The value range of the average multiplicity c is bounded in terms of the average node degree of the network G; this range ensures that the samples drawn on the basis of eigenvector centrality are both broadly representative and sufficiently diverse;

according to the judgment result, obtaining a vector coefficient by means of a preset vector coefficient calculation model;

multiplying the eigenvector centrality by the vector coefficient to obtain a second eigenvector;

rounding up each component of the second eigenvector to obtain a node multiplicity vector. Each component of the node multiplicity vector is a non-negative integer, and its i-th component is the multiplicity assigned to the i-th node of the network G;

processing the network G with the node multiplicity vector to obtain the multiple network G₂.

As an optional implementation, in the second aspect of the embodiments of the present invention, the first network construction module obtains the vector coefficient by means of the preset vector coefficient calculation model according to the judgment result specifically by:

when the judgment result is yes, computing the smallest integer for which a preset first vector coefficient calculation model holds, and taking that smallest integer as the value of the vector coefficient. In the first vector coefficient calculation model, $\lceil \cdot \rceil$ denotes rounding up, n is the total number of nodes in the network G, the integer sought is greater than 0, and c is the preset average multiplicity;

when the judgment result is no, computing the largest integer for which a preset second vector coefficient calculation model holds, and taking the reciprocal of that largest integer as the value of the vector coefficient. In the second vector coefficient calculation model, the integer sought is greater than 0 and c is the preset average multiplicity.

As an optional implementation, in the second aspect of the embodiments of the present invention, the second network construction module selects sample nodes from the multiple network G₂ to obtain the duplicate-free sample node set S specifically by:

selecting s nodes from the multiple network G₂ according to a uniform random probability distribution to obtain a first sample node set, where s is the product of n and a preset sampling ratio, rounded up, and n is the total number of nodes in the network G;

judging whether the first sample node set contains duplicate nodes, to obtain a second judgment result;

when the second judgment result is yes, deleting the duplicate nodes from the first sample node set, selecting new nodes from the multiple network according to the uniform random probability distribution and adding them to the first sample node set until its total number of nodes reaches s, and then triggering the judgment of whether the first sample node set contains duplicate nodes again;

when the second judgment result is no, confirming the first sample node set as the duplicate-free sample node set S.

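As an optional implementation, in the second aspect of the embodiments of the present invention, the second calculation module computes the betweenness centrality of the duplicate-free sample node set S to obtain the approximate betweenness centrality value of the network G specifically by:

processing the duplicate-free sample node set S with a betweenness centrality calculation model to obtain the betweenness centrality of the sample node set S.

The betweenness centrality calculation model is:

$B(v_i) = \sum_{v_j, v_k \in S,\; j \neq k,\; v_i \neq v_j, v_k} \frac{\sigma_{jk}(v_i)}{\sigma_{jk}}, \qquad B(e) = \sum_{v_j, v_k \in S,\; j \neq k} \frac{\sigma_{jk}(e)}{\sigma_{jk}}$

where $B(v_i)$ is the betweenness centrality of node $v_i$, $B(e)$ is the betweenness centrality of edge $e$, $\sigma_{jk}$ is the number of shortest paths from node $v_j$ to node $v_k$, $\sigma_{jk}(v_i)$ is the number of shortest paths from node $v_j$ to node $v_k$ that pass through node $v_i$, and $\sigma_{jk}(e)$ is the number of shortest paths from node $v_j$ to node $v_k$ that pass through edge $e$.

The betweenness centrality of the sample node set S is taken as the approximation of the betweenness centrality of the network G, giving the approximate betweenness centrality value of the network G.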

A third aspect of the present invention discloses another betweenness centrality approximate calculation device, the device comprising:

a memory storing executable program code;

a processor coupled to the memory;

the processor invokes the executable program code stored in the memory to perform some or all of the steps of the betweenness centrality approximate calculation method disclosed in the first aspect of the embodiments of the present invention.

A fourth aspect of the present invention discloses a computer storage medium storing computer instructions which, when invoked, perform some or all of the steps of the betweenness centrality approximate calculation method disclosed in the first aspect of the embodiments of the present invention.

The invention has the beneficial effects that:

The betweenness centrality approximate calculation method of the invention computes the eigenvector centrality of a network G; uses the eigenvector centrality to compute the node multiplicity sequence of G and construct a multiple network G₂; selects sample nodes from the multiple network G₂ to obtain a duplicate-free sample node set S; and computes the betweenness centrality of the duplicate-free sample node set S to obtain the approximate betweenness centrality value of the network G. Because eigenvector centrality is cheap to compute, the invention uses it as the basis for selecting the sample nodes of the betweenness centrality calculation, so that the shortest paths between sample nodes better represent all shortest paths between network nodes. The approximate betweenness centrality of the network is obtained from the shortest paths between the sample nodes, which achieves fast calculation of betweenness centrality in large-scale networks.

Drawings

FIG. 1 is a flow chart of a betweenness centrality approximate calculation method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an exemplary three-layer routing network according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a betweenness centrality approximate calculation device according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of another betweenness centrality approximate calculation device according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

For the sake of easy understanding of the embodiments of the present application, the following will briefly introduce related concepts:

Betweenness centrality. Betweenness centrality is generally divided into node betweenness centrality and edge betweenness centrality, and it is an important centrality index for identifying key nodes and key links. Consider a given network $G = (V, E)$, where $V$ and $E$ denote the node set and the edge set of the network respectively, $n = |V|$ is the number of nodes in the network, and $m = |E|$ is the number of edges in the network.

Node betweenness is defined as the ratio of the number of shortest paths passing through the node to the total number of shortest paths in the network. The betweenness of node $v_i$ in the network is defined as:

$B(v_i) = \sum_{j \neq k,\; v_i \neq v_j, v_k} \frac{\sigma_{jk}(v_i)}{\sigma_{jk}}$

where $\sigma_{jk}$ is the number of shortest paths from node $v_j$ to node $v_k$, and $\sigma_{jk}(v_i)$ is the number of shortest paths from node $v_j$ to node $v_k$ that pass through node $v_i$.

Edge betweenness is defined as the ratio of the number of shortest paths passing through the edge to the total number of shortest paths in the network. The betweenness of edge $e$ is defined as:

$B(e) = \sum_{j \neq k} \frac{\sigma_{jk}(e)}{\sigma_{jk}}$

where $\sigma_{jk}(e)$ is the number of shortest paths from node $v_j$ to node $v_k$ that pass through edge $e$.

For ease of study, the node and edge betweenness centrality of a network are usually normalized; in an undirected network the normalized values lie between 0 and 1.

The closer the normalized value is to 1, the more frequently the node appears on shortest paths between network nodes and the more important it is; a normalized value of 0 means the node does not lie on any shortest path between other nodes, so its importance is very low.

Theorem: in a connected network, the sum of the normalized node betweenness values equals the average shortest distance $L$ of the network minus 1, and the sum of the normalized edge betweenness values equals the average shortest distance $L$, i.e.

$\sum_{v_i \in V} \tilde{B}(v_i) = L - 1, \qquad \sum_{e \in E} \tilde{B}(e) = L$

where $\tilde{B}$ denotes the normalized betweenness.

This betweenness centrality identity gives the relationship between node betweenness centrality, edge betweenness centrality and the average shortest distance of the network. It reveals how betweenness centrality and the average shortest path of a network imply one another, and it establishes a link between betweenness centrality and small-world networks. The identity holds not only on connected networks but also on general networks.
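The identity can be checked with a short derivation. The sketch below assumes a connected undirected network, betweenness sums taken over unordered node pairs, and normalization by the number of such pairs, $n(n-1)/2$. The normalization is an assumption made here so that the identity holds exactly, and the symbols $d_{jk}$ (shortest distance between $v_j$ and $v_k$), $\tilde{B}$ (normalized betweenness) and $L$ (average shortest distance) are notation chosen for illustration.

```latex
% Every shortest path from v_j to v_k has d_{jk} - 1 interior nodes
% and d_{jk} edges, so for each pair (j, k):
\sum_{v_i \neq v_j, v_k} \frac{\sigma_{jk}(v_i)}{\sigma_{jk}} = d_{jk} - 1,
\qquad
\sum_{e \in E} \frac{\sigma_{jk}(e)}{\sigma_{jk}} = d_{jk}.

% Summing over all unordered pairs and dividing by n(n-1)/2:
\sum_{v_i \in V} \tilde{B}(v_i)
  = \frac{2}{n(n-1)} \sum_{j<k} \bigl( d_{jk} - 1 \bigr) = L - 1,
\qquad
\sum_{e \in E} \tilde{B}(e)
  = \frac{2}{n(n-1)} \sum_{j<k} d_{jk} = L.
```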

Betweenness centrality is usually computed with the Brandes algorithm. For an unweighted network the algorithm uses breadth-first search, with computational complexity $O(nm)$; for a weighted network it uses the Dijkstra algorithm, with computational complexity $O(nm + n^2 \log n)$. When the network G is dense, $m = O(n^2)$ and the complexity of computing betweenness centrality is $O(n^3)$, which is too high for large-scale networks.
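For reference, the exact calculation on an unweighted network can be sketched as follows. This is a compact illustration of the Brandes algorithm, assuming an undirected network given as an adjacency dictionary; the function name `brandes_node_betweenness` is a hypothetical choice. One breadth-first search is run per node, which is where the $O(nm)$ cost comes from.

```python
from collections import deque


def brandes_node_betweenness(adj):
    """Exact node betweenness of an unweighted, undirected network.
    `adj` maps each node to its neighbours; each unordered pair of
    endpoints is counted once."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was visited from both of its endpoints.
    return {v: b / 2.0 for v, b in bc.items()}
```

On the four-node path 0-1-2-3, for example, each of the two inner nodes lies on two shortest paths between other nodes and receives the value 2.0, while the end nodes receive 0.0.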

Eigenvector centrality. Eigenvector centrality is an important measure of node centrality in a network. The eigenvector centrality of a node depends both on the number of its neighbouring nodes and on the importance of each of those neighbours, so it carries richer node information than degree centrality.

Let $A = (a_{ij})_{n \times n}$ denote the adjacency matrix of the network G: if there is an edge between node $v_i$ and node $v_j$, then $a_{ij} = 1$; otherwise $a_{ij} = 0$. The eigenvector centrality of node $v_i$ can be defined as:

$x_i = \frac{1}{\lambda} \sum_{v_j \in N(v_i)} x_j = \frac{1}{\lambda} \sum_{j=1}^{n} a_{ij} x_j$

where $N(v_i)$ is the set of all neighbouring nodes of node $v_i$ and $\lambda$ is a constant.

The definition of eigenvector centrality can be written in matrix-vector form as $Ax = \lambda x$, in which the constant $\lambda$ is an eigenvalue of the adjacency matrix A and $x$ is the eigenvector corresponding to the eigenvalue $\lambda$. In general, the adjacency matrix A has several different eigenvalues $\lambda$, each with a corresponding eigenvector. However, the requirement that all components of the eigenvector be non-negative means that only the largest eigenvalue produces the required centrality measure. The i-th component of the corresponding eigenvector of the adjacency matrix gives the eigenvector centrality value of node $v_i$. Since an eigenvector multiplied by an arbitrary constant is still an eigenvector of the eigenvalue $\lambda$, this patent introduces an average multiplicity parameter to fix the eigenvector values when constructing the multiple network from the eigenvector centrality. The time complexity of computing eigenvector centrality is low.

Example 1

Referring to FIG. 1, FIG. 1 is a schematic flow chart of a betweenness centrality approximate calculation method according to an embodiment of the present invention. The method described in FIG. 1 is applicable to information networks, social networks, the Internet of Things and traffic networks, and the embodiment of the invention is not limited in this respect. As shown in FIG. 1, the betweenness centrality approximate calculation method based on eigenvector centrality may include the following operations:

101. Compute the eigenvector centrality of the network G to obtain the eigenvector centrality.

In the embodiment of the invention, the number of components n of the eigenvector centrality equals the total number of nodes in the network G.

102. Use the eigenvector centrality to compute the node multiplicity sequence of the network G and construct the multiple network G₂.

103. Select sample nodes from the multiple network G₂ to obtain the duplicate-free sample node set S.

104. Compute the betweenness centrality of the duplicate-free sample node set S to obtain the approximate betweenness centrality value of the network G.

Thus, by implementing the betweenness centrality approximate calculation method described in the embodiment of the invention, the eigenvector centrality of the original network is computed, a duplicate-free sample node set is selected on the basis of that eigenvector centrality, and the betweenness centrality of the duplicate-free sample node set is computed to obtain the approximate betweenness centrality of the original network. This simplifies the calculation, reduces its complexity, and achieves fast calculation of betweenness centrality in large-scale networks.

In an optional embodiment, computing the eigenvector centrality of the network G in step 101 to obtain the eigenvector centrality comprises:

constructing the adjacency matrix $A = (a_{ij})_{n \times n}$ of the network G, where $a_{ij} = 1$ if there is an edge between node $v_i$ and node $v_j$ in the network G, and $a_{ij} = 0$ otherwise;

constructing a first characteristic equation based on the adjacency matrix A. The first characteristic equation is $Ax = \lambda x$, where A is the adjacency matrix of the network, $\lambda$ is an eigenvalue and $x$ is an eigenvector;

solving the first characteristic equation to obtain the eigenvector corresponding to the largest eigenvalue, and taking that eigenvector as the eigenvector centrality.

According to the Perron-Frobenius theorem, the largest eigenvalue of the non-negative matrix A has an eigenvector whose components are all non-negative. Based on this property, eigenvector centrality is usually computed with the power iteration method: an initial vector $x^{(0)}$ is chosen and $x^{(t+1)} = A x^{(t)}$ is computed iteratively. When the number of iterations t reaches a certain threshold, $x^{(t)}$ is close, up to a scaling factor, to the eigenvector corresponding to the largest eigenvalue, i.e. the eigenvector centrality.
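The power iteration described above can be sketched in a few lines. This is a minimal illustration, assuming an unweighted, undirected network given as an adjacency dictionary; the function name, the rescaling of the largest component to 1 after every step, and the stopping rule based on `tol` and `max_iter` are choices made here, not details taken from the source.

```python
def eigenvector_centrality(adj, max_iter=1000, tol=1e-12):
    """Approximate the leading eigenvector of the adjacency matrix by
    repeated multiplication. `adj` maps each node to its neighbours."""
    x = {v: 1.0 for v in adj}  # initial vector x^(0)
    for _ in range(max_iter):
        # One step x <- A x: each node sums the values of its neighbours.
        nxt = {v: sum(x[w] for w in adj[v]) for v in adj}
        norm = max(nxt.values())
        if norm == 0:
            return x  # network without edges: nothing to iterate
        # Rescale so the largest component is 1; this keeps values bounded.
        nxt = {v: val / norm for v, val in nxt.items()}
        if max(abs(nxt[v] - x[v]) for v in adj) < tol:
            return nxt
        x = nxt
    return x
```

Each step costs time proportional to the number of edges. On bipartite networks the plain iteration can oscillate instead of converging; a common remedy is to iterate with $A + I$ instead of $A$, which has the same eigenvectors.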

It can be seen that the eigenvector centrality value characterizes the importance of a node in the network G and determines the probability that the node is selected as a sample node.

In another optional embodiment, using the eigenvector centrality in step 102 to compute the node multiplicity sequence of the network G and construct the multiple network G₂ comprises:

computing the mean of the components of the eigenvector centrality to obtain the component mean;

judging whether the component mean is smaller than a preset average multiplicity c, to obtain a judgment result. The value range of the average multiplicity c is bounded in terms of the average node degree of the network G;

according to the judgment result, obtaining a vector coefficient by means of a preset vector coefficient calculation model;

multiplying the eigenvector centrality by the vector coefficient to obtain a second eigenvector;

rounding up each component of the second eigenvector to obtain a node multiplicity vector. Each component of the node multiplicity vector is a non-negative integer, and its i-th component is the multiplicity assigned to the i-th node of the network G;

processing the network G with the node multiplicity vector to obtain the multiple network G₂.

Thus, a reasonable choice of the average multiplicity ensures that the samples drawn on the basis of eigenvector centrality are both broadly representative and diverse. The multiplicity of each network node is determined from the mean of the components of the eigenvector centrality and the average multiplicity, so that the multiplicity of a node in the multiple network is proportional to its eigenvector centrality. As a result of this construction, more important nodes become sample nodes with higher probability.

In yet another optional embodiment, obtaining the vector coefficient by means of the preset vector coefficient calculation model according to the judgment result comprises:

when the judgment result is yes, computing the smallest integer for which a preset first vector coefficient calculation model holds, and taking that smallest integer as the value of the vector coefficient. In the first vector coefficient calculation model, $\lceil \cdot \rceil$ denotes rounding up, n is the number of components of the eigenvector centrality, the integer sought is greater than 0, and c is the preset average multiplicity;

when the judgment result is no, computing the largest integer for which a preset second vector coefficient calculation model holds, and taking the reciprocal of that largest integer as the value of the vector coefficient. In the second vector coefficient calculation model, $\lceil \cdot \rceil$ denotes rounding up, n is the number of components of the eigenvector centrality, the integer sought is greater than 0, and c is the preset average multiplicity.

It can be seen that the eigenvector centrality values represent the importance of the different nodes, and the average multiplicity controls the value of the vector coefficient used to construct the multiple network G₂. This ensures that important nodes appear in the multiple network with higher multiplicity and secondary nodes with lower multiplicity, while the total multiplicity of the multiple network stays within a reasonable range.
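The scaling-and-rounding step described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the exact model of the invention: the helper names `vector_coefficient` and `node_multiplicities` are hypothetical, and the acceptance condition (the mean of the rounded-up scaled components reaching the average multiplicity c) is an assumed reading of the two vector coefficient calculation models, whose formulas are not reproduced in this text.

```python
import math
from fractions import Fraction


def _mean_multiplicity(x, scale):
    """Mean of ceil(scale * x_i) over all components of x."""
    return sum(math.ceil(scale * xi) for xi in x) / len(x)


def vector_coefficient(x, c):
    """Choose a scale so the mean rounded-up multiplicity reaches c.

    ASSUMPTION: the test `mean multiplicity >= c` stands in for the
    first and second vector coefficient calculation models.
    """
    if c <= 1 or max(x) <= 0:
        raise ValueError("need c > 1 and at least one positive component")
    if sum(x) / len(x) < c:
        # Component mean below c: smallest integer k that scales x up enough.
        k = 1
        while _mean_multiplicity(x, Fraction(k)) < c:
            k += 1
        return Fraction(k)
    # Component mean at or above c: largest integer k whose reciprocal still works.
    k = 1
    while _mean_multiplicity(x, Fraction(1, k + 1)) >= c:
        k += 1
    return Fraction(1, k)


def node_multiplicities(x, c):
    """Round up each component of the scaled vector (the second feature vector)."""
    scale = vector_coefficient(x, c)
    return [math.ceil(scale * xi) for xi in x]
```

For example, `node_multiplicities([0.5, 0.25, 0.25], 2)` scales by 5 and yields multiplicities 3, 2 and 2, so the most central node appears most often in the multiple network.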

In yet another alternative embodiment, selecting sample nodes from the multiple network G₂ in step 103 to obtain the duplicate-free sample node set S includes:

selecting nodes from the multiple network G₂ according to a uniform random probability distribution to obtain a first sample node set, the number of nodes selected being determined by a preset sampling ratio (between 0.1 and 0.2 of the size of network G);

judging whether the first sample node set contains repeated nodes to obtain a second judgment result;

if the second judgment result is yes, deleting the repeated nodes from the first sample node set, selecting new nodes from the multiple network according to a uniform random probability distribution and adding them to the first sample node set until the total number of nodes in the first sample node set again reaches the required sample size, and then repeating the judgment of whether the first sample node set contains repeated nodes to obtain a second judgment result;

if the second judgment result is no, confirming the first sample node set as the duplicate-free sample node set S.

It can be seen that the sampling ratio is preset by jointly considering the representativeness of the samples and the efficiency of the approximation method, so that the size of the duplicate-free sample node set S is between 0.1 and 0.2 of the size of the initial network G.
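The sampling loop described above can be sketched as follows. This is an illustrative sketch only: the function name `sample_duplicate_free` is hypothetical, and the sample size is assumed to be the sampling ratio times the number of nodes, rounded up.

```python
import math
import random


def sample_duplicate_free(multiplicity, ratio, rng=None):
    """Draw a duplicate-free sample node set from the multiple network.

    multiplicity maps each node to its number of copies in the multiple network.
    ASSUMPTION: sample size s = ceil(ratio * number of nodes).
    """
    rng = rng or random.Random()
    nodes = [v for v, w in multiplicity.items() if w > 0]
    s = math.ceil(ratio * len(multiplicity))
    if s > len(nodes):
        raise ValueError("not enough distinct nodes to sample")
    # Expand the multiple network: each node appears once per copy.
    multiset = [v for v in nodes for _ in range(multiplicity[v])]
    sample = [rng.choice(multiset) for _ in range(s)]
    while len(set(sample)) < s:
        # Delete repeated nodes, then top the sample back up to s draws.
        sample = list(dict.fromkeys(sample))
        sample.extend(rng.choice(multiset) for _ in range(s - len(sample)))
    return set(sample)
```

Because every copy in the expanded list is equally likely to be drawn, a node with higher multiplicity is proportionally more likely to enter the sample.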

In yet another alternative embodiment, calculating the betweenness centrality of the duplicate-free sample node set S in step 104 to obtain the betweenness centrality approximation of network G includes:

processing the duplicate-free sample node set S with a betweenness centrality calculation model to obtain the betweenness centrality of the sample node set S.

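Up to the normalization coefficient, the betweenness centrality calculation model is:

$$C_B(v) = \sum_{\substack{s, t \in S \\ s \neq v \neq t}} \frac{\sigma_{st}(v)}{\sigma_{st}}, \qquad C_B(e) = \sum_{\substack{s, t \in S \\ s \neq t}} \frac{\sigma_{st}(e)}{\sigma_{st}}$$

where $C_B(v)$ denotes the betweenness centrality of node $v$; $C_B(e)$ denotes the betweenness centrality of edge $e$; $\sigma_{st}$ is the number of shortest paths from node $s$ to node $t$; $\sigma_{st}(v)$ is the number of shortest paths from node $s$ to node $t$ that pass through node $v$; $\sigma_{st}(e)$ is the number of shortest paths from node $s$ to node $t$ that pass through edge $e$; and $S$ denotes the duplicate-free sample node set.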

The betweenness centrality of the sample node set S is taken as the betweenness centrality approximation of network G, giving the approximate betweenness centrality of the nodes of network G.

It can be seen that the set of shortest paths between every pair of nodes, which must be traversed under the exact definition of betweenness centrality, is replaced by the set of shortest paths between nodes of the duplicate-free sample node set S only. This greatly reduces the scale of the path search, while the betweenness centrality approximation preserves, as far as possible, the relative order of the betweenness centrality values of different nodes and edges. Because the size of the duplicate-free sample node set S is only 0.1 to 0.2 of the size of the initial node set, the computational complexity of the approximation based on S is about 0.01 to 0.04 of that of the exact method, which sharply reduces the cost of betweenness computation for large-scale networks.
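Restricting the shortest-path search to pairs of sample nodes can be sketched as follows for node betweenness in an unweighted, undirected network. This is a sketch under stated assumptions rather than the calculation model of the invention: the function names are hypothetical, the scores are left unnormalized, and every node of the network (not only sample nodes) is scored as an intermediate node.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Distances and shortest-path counts from source by breadth-first search."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def sampled_betweenness(adj, sample):
    """Unnormalized node betweenness using only pairs (s, t) of sample nodes."""
    info = {s: bfs_counts(adj, s) for s in sample}
    score = {v: 0.0 for v in adj}
    for s, t in combinations(list(sample), 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue  # s and t are not connected
        for v in adj:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            # v lies on a shortest s-t path exactly when the distances add up.
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score
```

Only one breadth-first search per sample node is needed, so the number of searches drops from the number of nodes in the network to the size of the sample.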

To illustrate the method of this embodiment, a typical three-layer routing network G is used as an example.

The structure of this typical three-layer routing network is shown in fig. 2. Numerals 1 to 31 in fig. 2 denote the 31 router nodes of routing network G, which form the router node set V. The connections between routers form edges, and all edges form the edge set E.

Specifically, the router node set V and the edge set E of network G are as follows:

。

To calculate the betweenness centrality of the nodes in the network, the approximate calculation method based on eigenvector centrality proceeds as follows:

Step 1: compute the eigenvector centrality of the nodes of network G; the eigenvector centrality of each node is shown in Table 1:

Table 1 Eigenvector centrality of each node

Step 2: calculate the multiplicities of the nodes of network G using the eigenvector centrality, and construct the multiple network G₂.

Set the average multiplicity: in view of the average degree of the nodes of the network, the average multiplicity c is selected accordingly;

The distribution of node multiplicities is determined from the distribution of eigenvector centrality, as shown in Table 2:

Table 2 Node multiplicity distribution

The multiple network G₂ is constructed from the node multiplicities. As shown in Table 2, G₂ contains 14 copies of node 1, 9 copies of node 2, ..., and 3 copies of node 31.

Step 3: from multiple networksG ₂ Sample nodes are selected to obtain a non-heavy sample node set S;

according to the node weight distribution and the method of uniform random probability distribution, the node sampling probability distribution can be calculated as shown in table 3:

TABLE 3 node sampling probability distribution
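The relation between node multiplicities and sampling probabilities can be sketched as follows. Under uniform selection from the multiple network, the probability of drawing a node in a single draw is its multiplicity divided by the total multiplicity. The function name `sampling_probabilities` is hypothetical, and the toy input reuses only the three multiplicities quoted in the text (14, 9 and 3 for nodes 1, 2 and 31), not the full table.

```python
from fractions import Fraction


def sampling_probabilities(multiplicity):
    """Single-draw probability of each node under uniform choice from the multiple network."""
    total = sum(multiplicity.values())
    return {node: Fraction(copies, total) for node, copies in multiplicity.items()}


# Toy input: three nodes only, so the probabilities differ from the full 31-node table.
toy = sampling_probabilities({1: 14, 2: 9, 31: 3})
```

In this three-node toy case node 1 is drawn with probability 14/26, which illustrates how higher multiplicity translates directly into a higher chance of being sampled.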

Set the sampling ratio to 0.2. From the sampling ratio, the size of the sample node set S is determined, and sampling yields the duplicate-free sample node set S.

Step 4: calculating the centrality of S bets of node sets without heavy samples to obtain a networkGThe median centrality approximation is shown in table 4:

table 4 networkGIntermediate center approximation

Since the approximate and exact values of betweenness centrality use the same normalization coefficient but involve different node sets, S and V respectively, comparing their numerical values directly has no practical significance. The selection of key nodes and key edges depends mainly on the rank order of the betweenness centrality values. According to the distribution of the approximate values, the nodes rank by importance as follows: 1, 4, 6, 3, 5, 2, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31.

To analyze the effect of the approximation method, Table 5 gives the distribution of the exact betweenness centrality values, from which the nodes rank by importance as follows: 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31.

Table 5 Exact betweenness centrality values of network G

Comparing the rankings obtained from the approximate and exact values, the importance order of nodes 2, 3 and 5 changes while the order of the other nodes is maintained, and the accuracy exceeds 90%.

Without a duplicate-free sample node set, exact computation of betweenness centrality must traverse the shortest paths between every pair of nodes in network G, count how often the shortest paths pass through each node and edge, and from these counts compute the betweenness centrality of the nodes and edges. The number of node pairs in the network grows quadratically with n, the number of nodes in the network, so the time complexity of exact betweenness computation, which depends on both the number of nodes and the number of edges in the network, is high, and the exact method cannot be applied to large-scale networks. To compute betweenness centrality quickly in a large-scale network, this method replaces network G with the duplicate-free sample node set S: the number of node pairs falls from quadratic in the number of nodes of G to quadratic in the number of nodes of S, and the shortest-path traversal workload falls accordingly. Meanwhile, the sample nodes generated by sampling on the basis of eigenvector centrality represent the important nodes of the network, and the shortest paths between them represent typical shortest paths in the network. The betweenness centrality approximation calculated from the shortest paths between sample nodes therefore preserves well the relative order of the exact betweenness centrality values of nodes and edges. This relative order is what matters most in real network application scenarios, so the order-preserving property of the approximation makes it effective for practical network problems.
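The reduction in the number of node pairs can be checked with a short calculation. This is a back-of-the-envelope sketch: the function name `pair_reduction` is hypothetical, node pairs are counted as unordered pairs, and the sample size is assumed to be the sampling ratio times n, rounded up.

```python
import math


def pair_reduction(n, ratio):
    """Return (sample size, node pairs in the full network, node pairs in the sample)."""
    s = math.ceil(ratio * n)          # ASSUMPTION: sample size is ceil(ratio * n)
    all_pairs = n * (n - 1) // 2      # unordered pairs among n nodes
    sample_pairs = s * (s - 1) // 2   # unordered pairs among s sample nodes
    return s, all_pairs, sample_pairs


s, all_pairs, sample_pairs = pair_reduction(31, 0.2)
```

For the 31-node example with a sampling ratio of 0.2, this gives 7 sample nodes and 21 node pairs instead of 465, about 4.5% of the original pair count.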

Example two

Referring to fig. 3, fig. 3 is a schematic diagram of a betweenness centrality approximation calculation apparatus according to an embodiment of the invention. The apparatus described in fig. 3 is applicable to information networks, social networks, the Internet of Things and traffic networks, which the embodiment of the invention does not limit. As shown in fig. 3, the apparatus may include:

a first computing module, configured to compute the eigenvector centrality of network G, the eigenvector centrality being a vector with n components, where n equals the total number of nodes in network G;

a first network construction module, configured to calculate the node multiplicities of network G using the eigenvector centrality and to construct the multiple network G₂;

a second network construction module, configured to select sample nodes from the multiple network G₂ to obtain the duplicate-free sample node set S;

a second calculation module, configured to calculate the betweenness centrality of the duplicate-free sample node set S to obtain the betweenness centrality approximation of network G.

Therefore, by implementing the betweenness centrality approximation calculation apparatus described in fig. 3, a duplicate-free sample node set is reasonably selected and its betweenness centrality is used as the betweenness centrality approximation of the original network. This greatly reduces the scale of the path search, while the approximation preserves, as far as possible, the relative order of the betweenness centrality values of different nodes and edges.

In another alternative embodiment, as shown in FIG. 3, the first computing module computes the eigenvector centrality of network G specifically by:

constructing the adjacency matrix A of network G, where $A = (a_{ij})$ and $a_{ij} = 1$ if there is an edge between node $v_i$ and node $v_j$ of network G, and $a_{ij} = 0$ otherwise;

constructing a first characteristic equation based on the adjacency matrix A, the first characteristic equation being

$$A x = \lambda x$$

where $A$ is the adjacency matrix of the network, $\lambda$ is an eigenvalue, and $x$ is an eigenvector.
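One common way to solve the characteristic equation for the eigenvector of the largest eigenvalue is power iteration, sketched below. The text does not state which solver is used, so this is an assumed technique, and the function name `eigenvector_centrality` is hypothetical. The sketch assumes a connected, undirected network given as a 0/1 adjacency matrix, and iterates with A + I, which has the same eigenvectors as A but avoids the oscillation that plain power iteration shows on bipartite networks such as trees.

```python
def eigenvector_centrality(A, max_iter=1000, tol=1e-12):
    """Leading eigenvector of adjacency matrix A, scaled so its largest entry is 1."""
    n = len(A)
    x = [1.0] * n
    for _ in range(max_iter):
        # One step with A + I: y_i = x_i + sum_j a_ij * x_j.
        y = [x[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        top = max(y)
        y = [value / top for value in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


# Star network: node 0 is connected to nodes 1, 2 and 3.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
centrality = eigenvector_centrality(star)
```

In the star example the hub receives the largest centrality and the three leaves share an equal, smaller value, matching the intuition that the hub is the most important node.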

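In yet another alternative embodiment, as shown in FIG. 3, the first network construction module uses the eigenvector centrality to calculate the node multiplicities of network G and construct the multiple network G₂ specifically by:

calculating the mean of the components of the eigenvector centrality to obtain a component mean;

judging whether the component mean is smaller than a preset average multiplicity c to obtain a judgment result, where the value range of c is set with reference to the average degree of the nodes of network G;

obtaining a vector coefficient from the judgment result by using a preset vector coefficient calculation model;

multiplying the eigenvector centrality by the vector coefficient to obtain a second feature vector;

rounding up each component of the second feature vector to obtain a node weight vector, each component of which is a non-negative integer, the i-th component giving the multiplicity of the i-th node of network G;

processing network G using the node weight vector to obtain the multiple network G₂.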

In yet another optional embodiment, the first network construction module obtains the vector coefficient from the judgment result by using a preset vector coefficient calculation model specifically by:

when the judgment result is yes, calculating the minimum integer for which a preset first vector coefficient calculation model holds, and taking that minimum integer as the value of the vector coefficient;

in the first vector coefficient calculation model, ⌈·⌉ denotes rounding up; n is the number of components of the eigenvector centrality; the minimum integer is an integer greater than 0; and c is the preset average multiplicity;

when the judgment result is no, calculating the maximum integer for which a preset second vector coefficient calculation model holds, and taking the reciprocal of that maximum integer as the value of the vector coefficient;

in the second vector coefficient calculation model, the maximum integer is an integer greater than 0, and c is the preset average multiplicity.

In yet another alternative embodiment, as shown in FIG. 3, the second network construction module selects sample nodes from the multiple network G₂ to obtain the duplicate-free sample node set S specifically by:

selecting nodes from the multiple network G₂ according to a uniform random probability distribution to obtain a first sample node set, where the number of nodes selected is determined from the preset sampling ratio and n by rounding up, and n is the number of components of the eigenvector centrality.

In yet another alternative embodiment, as shown in FIG. 3, the second calculation module calculates the betweenness centrality of the duplicate-free sample node set S to obtain the betweenness centrality approximation of network G specifically by:

processing the duplicate-free sample node set S with a betweenness centrality calculation model to obtain the betweenness centrality of the sample node set S.

Up to the normalization coefficient, the betweenness centrality calculation model is:

$$C_B(v) = \sum_{\substack{s, t \in S \\ s \neq v \neq t}} \frac{\sigma_{st}(v)}{\sigma_{st}}, \qquad C_B(e) = \sum_{\substack{s, t \in S \\ s \neq t}} \frac{\sigma_{st}(e)}{\sigma_{st}}$$

where $C_B(v)$ denotes the betweenness centrality of node $v$; $C_B(e)$ denotes the betweenness centrality of edge $e$; $\sigma_{st}$ is the number of shortest paths from node $s$ to node $t$; $\sigma_{st}(v)$ is the number of shortest paths from node $s$ to node $t$ that pass through node $v$; and $\sigma_{st}(e)$ is the number of shortest paths from node $s$ to node $t$ that pass through edge $e$.

The betweenness centrality of the sample node set S is determined as the betweenness centrality approximation of network G, giving the approximate betweenness centrality of the nodes of network G.

Example three

Referring to fig. 4, fig. 4 is a schematic structural diagram of another betweenness centrality approximation calculation apparatus according to an embodiment of the present invention. The apparatus described in fig. 4 is applicable to information networks, social networks, the Internet of Things and traffic networks, which the embodiment of the invention does not limit. As shown in fig. 4, the apparatus may include:

a memory 401 storing executable program codes;

a processor 402 coupled with the memory 401;

the processor 402 invokes the executable program code stored in the memory 401 to perform the steps of the eigenvector-centrality-based betweenness centrality approximation calculation method described in embodiment one.

Example four

The embodiment of the invention discloses a computer-readable storage medium storing a computer program for electronic data exchange, wherein the computer program causes a computer to execute the steps of the eigenvector-centrality-based betweenness centrality approximation calculation method described in embodiment one.

Example five

The embodiment of the invention discloses a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to execute the steps of the eigenvector-centrality-based betweenness centrality approximation calculation method described in embodiment one.

Finally, it should be noted that the betweenness centrality approximation calculation method and apparatus disclosed in the embodiments of the invention are disclosed only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions recorded in the various embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications and substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments.

Claims

1. A betweenness centrality approximation calculation method, comprising:

computing the eigenvector centrality of a network G, the number of components of the eigenvector centrality being equal to the total number of nodes in network G;

calculating the node multiplicities of network G using the eigenvector centrality, and constructing a multiple network G₂;

selecting sample nodes from the multiple network G₂ to obtain a duplicate-free sample node set S;

calculating the betweenness centrality of the duplicate-free sample node set S to obtain a betweenness centrality approximation of network G.

2. The betweenness centrality approximation calculation method according to claim 1, wherein calculating the node multiplicities of network G using the eigenvector centrality and constructing the multiple network G₂ comprises:

calculating the mean of the components of the eigenvector centrality to obtain a component mean;

judging whether the component mean is smaller than a preset average multiplicity c to obtain a judgment result;

obtaining a vector coefficient from the judgment result by using a preset vector coefficient calculation model;

multiplying the eigenvector centrality by the vector coefficient to obtain a second feature vector;

rounding up each component of the second feature vector to obtain a node weight vector;

processing network G using the node weight vector to obtain the multiple network G₂.

3. The betweenness centrality approximation calculation method according to claim 2, wherein obtaining the vector coefficient from the judgment result by using a preset vector coefficient calculation model comprises:

when the judgment result is yes, calculating the minimum integer for which a preset first vector coefficient calculation model holds, and taking that minimum integer as the value of the vector coefficient;

wherein, in the first vector coefficient calculation model, the centrality term is any component of the eigenvector centrality; ⌈·⌉ denotes rounding up; n is the number of components of the eigenvector centrality; the minimum integer is an integer greater than 0; and c is the preset average multiplicity;

when the judgment result is no, calculating the maximum integer for which a preset second vector coefficient calculation model holds, and taking the reciprocal of that maximum integer as the value of the vector coefficient;

wherein, in the second vector coefficient calculation model, the centrality term is any component of the eigenvector centrality; ⌈·⌉ denotes rounding up; n is the number of components of the eigenvector centrality; the maximum integer is an integer greater than 0; and c is the preset average multiplicity.

4. The betweenness centrality approximation calculation method according to claim 1, wherein selecting sample nodes from the multiple network G₂ to obtain the duplicate-free sample node set S comprises:

selecting s nodes from the multiple network G₂ according to a uniform random probability distribution to obtain a first sample node set, where s is determined from a preset sampling ratio and the total number of nodes in network G;

judging whether repeated nodes exist in the first sample node set to obtain a second judgment result;

when the second judgment result is yes, deleting the repeated nodes from the first sample node set, selecting new nodes from the multiple network according to a uniform random probability distribution and adding them to the first sample node set until the total number of nodes in the first sample node set is s, and then repeating the judgment of whether repeated nodes exist in the first sample node set to obtain a second judgment result;

when the second judgment result is no, confirming the first sample node set as the duplicate-free sample node set S.

5. The betweenness centrality approximation calculation method according to claim 1, wherein computing the eigenvector centrality of network G comprises:

constructing the adjacency matrix A of network G;

constructing a first characteristic equation based on the adjacency matrix A;

the first characteristic equation being

$$A x = \lambda x$$

where $A$ is the adjacency matrix of the network, $\lambda$ is an eigenvalue, and $x$ is an eigenvector;

6. The betweenness centrality approximation calculation method according to claim 1, wherein calculating the betweenness centrality of the duplicate-free sample node set S to obtain the betweenness centrality approximation of network G comprises:

processing the duplicate-free sample node set S with a betweenness centrality calculation model to obtain the betweenness centrality of the sample node set S;

determining the betweenness centrality of the sample node set S as the betweenness centrality approximation of network G.

7. A betweenness centrality approximation calculation apparatus, the apparatus comprising:

a first computing module, configured to compute the eigenvector centrality of a network G, the number of components of the eigenvector centrality being equal to the total number of nodes in network G;

a first construction module, configured to calculate the node multiplicities of network G using the eigenvector centrality and to construct a multiple network G₂;

a second construction module, configured to select sample nodes from the multiple network G₂ to obtain a duplicate-free sample node set S;

a second calculation module, configured to calculate the betweenness centrality of the duplicate-free sample node set S to obtain a betweenness centrality approximation of network G.

8. A betweenness centrality approximation calculation apparatus, the apparatus comprising:

a memory storing executable program code;

a processor coupled to the memory;

the processor invokes the executable program code stored in the memory to perform the betweenness centrality approximation calculation method of any of claims 1-6.

9. A computer storage medium storing computer instructions which, when invoked, perform the betweenness centrality approximation calculation method of any one of claims 1-6.