CN118055030A - Propagation network reconstruction method, system, storage medium and equipment - Google Patents

Propagation network reconstruction method, system, storage medium and equipment

Info

Publication number
CN118055030A
Authority
CN
China
Prior art keywords
node
score
nodes
scoring
final
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410439289.0A
Other languages
Chinese (zh)
Inventor
黄浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Qiushi Higher Research Institute
Original Assignee
Jiangxi Qiushi Higher Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Qiushi Higher Research Institute filed Critical Jiangxi Qiushi Higher Research Institute
Priority to CN202410439289.0A priority Critical patent/CN118055030A/en
Publication of CN118055030A publication Critical patent/CN118055030A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a propagation network reconstruction method, system, storage medium and equipment. The method comprises: obtaining historical observation data of network nodes to obtain a historical observation data set; calculating a scoring function for every two nodes in the network according to the historical observation data set to obtain a scoring set, and calculating candidate parent nodes of each node from the scoring set to obtain a candidate parent node set; calculating the final parent node of each node by combining the candidate parent nodes, and obtaining a final parent node set from the final parent nodes of the nodes; and generating directed edges from the parent nodes to each node according to the node and each parent node in its corresponding final parent node set, thereby obtaining a propagation network topology and reconstructing the propagation network structure. The application can reconstruct the network from the node states at the final observation alone, without requiring the exact time at which each node was infected or the infection state of the nodes during each diffusion process; it also requires no probability estimation, so that more accurate results can be obtained from a small number of data samples.

Description

Propagation network reconstruction method, system, storage medium and equipment
Technical Field
The present invention relates to the field of information propagation technologies, and in particular, to a propagation network reconstruction method, system, storage medium, and device.
Background
A propagation network is a mathematical model for studying how substances and information spread. The spread of opinions, rumors, and diseases is typically modeled as a probabilistic process over a propagation network. In such a network, directed edges represent parent-child relationships, in which a parent node can influence its child nodes with a certain probability. Propagation network structure reconstruction aims to infer the propagation network structure (i.e. the topology of the influence relationships) from observed data. In most cases this influence is not directly visible, and only a limited number of historical propagation processes can be observed. The problem of recovering an accurate propagation network structure from a limited amount of historical propagation data is of considerable interest in fields such as social networks, viral marketing and epidemic prevention, because the reconstructed structure intuitively reveals potential interactions between nodes, is crucial for formulating strategies to control future propagation processes, and can help researchers better predict, promote or organize the future propagation of substances and information.
However, conventional propagation network reconstruction has two limitations. On the one hand, existing reconstruction methods assume that the observed data contains the exact time at which each node was infected and the infection state of the nodes during each diffusion; in less ideal and more realistic environments, the time information of node infection is unknown. On the other hand, existing reconstruction methods are often based on probabilistic approaches, whose results are exactly correct only as the amount of known data tends to infinity. Probabilities therefore have to be estimated, accurate results can be obtained only with a large amount of sample data, and the results are strongly affected by the sample size.
Disclosure of Invention
Based on this, the present invention aims to provide a propagation network reconstruction method, system, storage medium and device, to solve the technical problems in the prior art that propagation network reconstruction depends on the exact infection time of each node and that relatively accurate results can be obtained only with a large amount of sample data.
In one aspect, the present invention provides a method for reconstructing a propagation network, including:
Acquiring historical observation data of a network node to obtain a historical observation data set;
Calculating a scoring function f(v_i, v_y) of every two nodes in the network according to the historical observation dataset to obtain a scoring set, and calculating candidate parent nodes of every node from the scoring set to obtain a candidate parent node set C, wherein v_i and v_y are nodes;
Calculating the final parent node of each node by combining the candidate parent nodes, and obtaining a final parent node set F according to the final parent node of each node;
Generating a directed edge from a parent node to node v_i according to each node v_i and each parent node in the final parent node set F corresponding to node v_i, thereby obtaining a propagation network topology structure to reconstruct the propagation network structure;
Wherein the step of calculating a final parent node of each node in combination with the candidate parent nodes includes:
Obtaining a preset queue Q, a preset set T and a current upper score limit g_max, wherein the set T is initially an empty set, the queue Q is preset, and the current upper score limit g_max is preset to negative infinity (−∞);
Judging whether the queue Q is empty;
If not, acquiring an element E from the queue Q, and recording the sequence number j of the node corresponding to the element E;
Taking the (j+1)-th node from the candidate parent node set of node v_i, denoting it as node v_p, and adding it to the set T, wherein the final parent node set F_i of node v_i is preset to an empty set;
Calculating a scoring function f(T, v_i);
Judging whether the scoring function f(T, v_i) is larger than the current upper score limit g_max;
If it is greater than the current upper score limit g_max, updating the upper score limit g_max according to the scoring function f(T, v_i), updating the final parent node set F_i according to the set T, and calculating the score upper limit function g(C_i, T, v_i) according to the updated final parent node set F_i;
Judging whether the score upper limit function g(C_i, T, v_i) is larger than the upper score limit g_max;
If it is greater than the upper score limit g_max, adding node v_p to the queue Q, and returning to the step of judging whether the queue Q is empty, until the number of candidate parent nodes of v_i is reached, so as to obtain the final parent node set F of any node v_i.
With the propagation network reconstruction method described above, the network can be reconstructed from the node states at the final observation alone, without requiring the exact time at which each node was infected or the infection state of the nodes during each diffusion process. Furthermore, the technical solution of the application does not need to estimate probabilities; it converts the probability problem into a sampling problem and eliminates the dependence on the number of samples, so that more accurate results can be obtained from a small number of data samples. This solves the technical problem in the prior art that propagation network reconstruction depends on the exact infection time of each node and can obtain relatively accurate results only with a large amount of sample data.
In addition, the propagation network reconstruction method according to the present invention may further have the following additional technical features:
Further, the calculation formula of the scoring function f(C, v_i) is:
Where Tr(·) represents the trace of a matrix; n is the number of records in the propagation network node historical observation dataset; J is an n × n matrix with J = I − 1/n, where I represents the n × n identity matrix; K_C and K_{v_i} are n × n symmetric matrices; the element in row i and column j of K_C is the inner product of the i-th record and the j-th record of the nodes in the set C, and the element in row i and column j of K_{v_i} is the inner product of the i-th record and the j-th record of node v_i.
Further, the calculation formula of the scoring upper limit function is:
Where n is the number of records; Δ_n represents the set of reorderings of the n records, of which there are n! in total; δ is one of the reordering schemes; and (v_i)_δ represents the n records of node v_i reordered according to δ.
Further, the step of calculating a scoring function f(v_i, v_y) of every two nodes in the network according to the historical observation dataset to obtain a scoring set includes: calculating the scoring function f(v_i, v_y) of every two nodes in the network to obtain n(n−1)/2 scores, which form the scoring set, denoted {f_{1,2}, f_{1,3}, …, f_{i,j}, …, f_{n−1,n}}.
Further, the step of calculating a candidate parent node for each node from the scoring set includes:
Applying K-Means clustering to the scoring set, wherein the K value is set to 2 and one cluster center is fixed at 0;
Selecting, from the clustered scoring set, the class of scores whose cluster center is not 0, and, for each score f_{i,j} in that class, recording that node v_y has the candidate parent node v_i.
Another aspect of the invention provides a propagation network reconstruction system, the system comprising:
The acquisition module is used for acquiring historical observation data of the network node to acquire a historical observation data set;
The candidate parent node calculation module is used for calculating a scoring function f(v_i, v_y) of every two nodes in the network according to the historical observation data set to obtain a scoring set, and calculating candidate parent nodes of every node from the scoring set to obtain a candidate parent node set C, wherein v_i and v_y are nodes;
The final parent node calculation module is used for calculating the final parent node of each node by combining the candidate parent nodes and obtaining a final parent node set F according to the final parent node of each node;
A reconstruction module, configured to generate a directed edge from a parent node to node v_i according to each node v_i and each parent node in the final parent node set F corresponding to node v_i, so as to obtain a propagation network topology structure to reconstruct the propagation network structure;
The final parent node calculation module comprises:
Obtaining a preset queue Q, a preset set T and a current upper score limit g_max, wherein the set T is initially an empty set, the queue Q is preset, and the current upper score limit g_max is preset to negative infinity (−∞);
Judging whether the queue Q is empty;
If not, acquiring an element E from the queue Q, and recording the sequence number j of the node corresponding to the element E;
Taking the (j+1)-th node from the candidate parent node set of node v_i, denoting it as node v_p, and adding it to the set T, wherein the final parent node set F_i of node v_i is preset to an empty set;
Calculating a scoring function f(T, v_i);
Judging whether the scoring function f(T, v_i) is larger than the current upper score limit g_max;
If it is greater than the current upper score limit g_max, updating the upper score limit g_max according to the scoring function f(T, v_i), updating the final parent node set F_i according to the set T, and calculating the score upper limit function g(C_i, T, v_i) according to the updated final parent node set F_i;
Judging whether the score upper limit function g(C_i, T, v_i) is larger than the upper score limit g_max;
If it is greater than the upper score limit g_max, adding node v_p to the queue Q, and returning to the step of judging whether the queue Q is empty, until the number of candidate parent nodes of v_i is reached, so as to obtain the final parent node set F of any node v_i.
Another aspect of the invention provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements a propagation network reconstruction method as described above.
In another aspect, the present invention also provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing a propagation network reconstruction method as described above when executing the program.
Drawings
FIG. 1 is a flow chart of a method of propagation network reconstruction in an embodiment of the present invention;
The invention will be further described in the following detailed description in conjunction with the above-described figures.
Detailed Description
In order that the invention may be readily understood, a more complete description of the invention will be rendered by reference to the appended drawings. Several embodiments of the invention are presented in the figures. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
In order to solve the technical problems in the prior art that propagation network reconstruction depends on the exact infection time of each node and obtains relatively accurate results only with a large amount of sample data, the application provides a propagation network reconstruction method, system, storage medium and equipment. The network can be reconstructed from the node states at the final observation alone, without requiring the exact time at which each node was infected or the infection state of the nodes during each diffusion process. Furthermore, the technical solution of the application does not need to estimate probabilities; it converts the probability problem into a sampling problem and eliminates the dependence on the number of samples, so that more accurate results can be obtained from a small number of data samples.
The propagation network is a directed graph G = {V, E}, where V represents the set of vertices (nodes) and E represents the set of directed edges. Some nodes are initially "infected" and have state flag 1, while uninfected nodes have state flag 0. At regular intervals, infected nodes infect the neighboring nodes pointed to by their own directed edges with a certain probability p; this is called the IC (independent cascade) propagation model of the propagation network. At the end of a period of time, the infection status of all nodes in the network is recorded; this is called a historical infection (observation) record.
The propagation network node historical observation dataset D = {D_1, D_2, …, D_n} has n records in total, each representing the final diffusion (infection) result of one propagation process on the network. Each record D_i = (d_{i,1}, d_{i,2}, …, d_{i,m}) is an m-dimensional vector with components d_{i,j} ∈ {0, 1}, where 0 indicates that node v_j was not infected in this diffusion and 1 indicates that it was infected.
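As an illustration of how such a dataset D could be produced by simulation, the following minimal Python sketch runs the IC model and keeps only the final 0/1 state of every node. It assumes the usual independent-cascade convention in which each newly infected node gets a single chance to infect each of its children; the function names `simulate_ic` and `build_dataset`, and all parameters, are hypothetical and are not taken from the application.

```python
import random

def simulate_ic(edges, num_nodes, seed_ratio, p, rng):
    """Run one diffusion and return the final 0/1 state vector (one record D_i)."""
    children = {v: [] for v in range(num_nodes)}
    for parent, child in edges:
        children[parent].append(child)
    # Assumption: the initially infected nodes are drawn uniformly at random.
    num_seeds = max(1, round(seed_ratio * num_nodes))
    seeds = rng.sample(range(num_nodes), num_seeds)
    state = [0] * num_nodes
    for s in seeds:
        state[s] = 1
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in children[u]:
                # Each directed edge fires at most once, with probability p.
                if state[v] == 0 and rng.random() < p:
                    state[v] = 1
                    next_frontier.append(v)
        frontier = next_frontier
    return state

def build_dataset(edges, num_nodes, seed_ratio, p, n_records, seed=0):
    """Collect n_records final-state vectors; infection times are discarded."""
    rng = random.Random(seed)
    return [simulate_ic(edges, num_nodes, seed_ratio, p, rng)
            for _ in range(n_records)]
```

Only the final states are stored, which matches the statement that neither infection times nor intermediate states are available to the reconstruction method.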
In the propagation network, if there is a directed edge from node v_i to node v_j, node v_i is referred to as a parent of node v_j. The invention defines a scoring function f(C, v_i), where v_i represents the i-th node of the propagation network and C represents the candidate parent node set of v_i. f(C, v_i) measures how well the node set C matches as the parent node set of node v_i; the greater f(C, v_i), the better the match.
Wherein, the calculation formula of the scoring function f(C, v_i) is:
Where Tr(·) represents the trace of a matrix; n is the number of records in the propagation network node historical observation dataset; J is an n × n matrix with J = I − 1/n, where I represents the n × n identity matrix; K_C and K_{v_i} are n × n symmetric matrices. The element in row i and column j of K_C is the inner product of the i-th record and the j-th record of the nodes in the set C. For example, if the set C has three nodes {v_2, v_11, v_17} whose states in the 5th record are (1, 0, 1) and whose states in the 7th record are (1, 1, 1), both written as vectors, then the inner product of the two vectors is 2, and the element in row 5 and column 7 of the matrix K_C is 2. K_{v_i} is defined in the same way: the element in row i and column j of K_{v_i} is the inner product, i.e. the product, of the i-th record and the j-th record of node v_i.
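The matrices named above can be built directly from the dataset. The sketch below, using NumPy, constructs K_C, K_{v_i} and J, reading J = I − 1/n as subtracting 1/n from every entry of the identity matrix. Because the scoring formula itself is not reproduced in this text, the function `trace_score` is only an assumed HSIC-style combination Tr(K_C J K_{v_i} J) of those matrices and should not be read as the application's formula; all function names are hypothetical.

```python
import numpy as np

def kernel_matrix(D, node_indices):
    """K[i, j] = inner product of records i and j restricted to the given nodes."""
    X = np.asarray(D, dtype=float)[:, list(node_indices)]  # shape (n, |C|)
    return X @ X.T

def centering_matrix(n):
    """J = I - 1/n, read as the identity minus 1/n in every entry (assumption)."""
    return np.eye(n) - np.full((n, n), 1.0 / n)

def trace_score(D, parents, child):
    """ASSUMED stand-in for f(C, v_i): the trace Tr(K_C J K_v J)."""
    n = len(D)
    J = centering_matrix(n)
    K_C = kernel_matrix(D, parents)
    K_v = kernel_matrix(D, [child])
    return float(np.trace(K_C @ J @ K_v @ J))

# Two illustrative records over three nodes: (1, 0, 1) and (1, 1, 1).
example = [[1, 0, 1], [1, 1, 1]]
K = kernel_matrix(example, [0, 1, 2])  # K[0, 1] is the inner product of the two records
```

Passing a single index in `parents` gives the pairwise case f(v_i, v_y), and passing several indices gives the set case f(T, v_i), since a single node is treated as a one-element set.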
It should be further noted that the parameters of the scoring function are sets: a single node is treated as a set with only one element, and several nodes are treated as a set with several elements. The scoring functions f(v_i, v_y) and f(T, v_i) can therefore both be calculated by applying the calculation formula of the scoring function f(C, v_i).
The invention also defines a scoring upper limit function g(C, C′, v_i): for node v_i, whose candidate parent node set is C and whose determined parent node set is C′ ⊆ C, no matter how new parent nodes are added to C′, the score f(C′, v_i) cannot exceed g(C, C′, v_i). The calculation formula of the scoring upper limit function is as follows:
Where n is the number of records; Δ_n represents the set of reorderings of the n records, of which there are n! in total; δ is one of the reordering schemes; and (v_i)_δ represents the n records of node v_i reordered according to δ.
In order to facilitate an understanding of the invention, several embodiments of the invention will be presented below. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Example 1
Referring to FIG. 1, a propagation network reconstruction method according to a first embodiment of the present invention is shown, and the method includes steps S101 to S104:
S101, acquiring historical observation data of network nodes to obtain a historical observation data set.
S102, calculating a scoring function f(v_i, v_y) of every two nodes in the network according to the historical observation data set to obtain a scoring set, and calculating candidate parent nodes of every node from the scoring set to obtain a candidate parent node set C, wherein v_i and v_y are nodes.
As a specific example:
The scoring function f(v_i, v_y) of every two nodes in the network is calculated to obtain n(n−1)/2 scores, which form the scoring set, denoted {f_{1,2}, f_{1,3}, …, f_{i,j}, …, f_{n−1,n}}.
K-Means clustering is applied to the scoring set, wherein the K value is set to 2 and one cluster center is fixed at 0;
From the clustered scoring set, the class of scores whose cluster center is not 0 is selected, and, for each score f_{i,j} in that class, node v_y is recorded as having the candidate parent node v_i.
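The clustering step above can be sketched in plain Python as a one-dimensional K-Means with K = 2 in which one center stays pinned at 0 and only the other center is updated. The initialization at the largest score, the dictionary layout `{(i, y): score}` and the function names are assumptions made for this illustration.

```python
def split_scores_fixed_zero(scores, max_iter=100):
    """Return the pairs whose score falls in the cluster with the non-zero center."""
    values = list(scores.values())
    if not values:
        return set()
    centre = max(values)  # assumption: the free center starts at the largest score
    for _ in range(max_iter):
        # Assign each score to the nearer of the two centers (0 and `centre`).
        high = [s for s in values if abs(s - centre) < abs(s)]
        if not high:
            return set()
        new_centre = sum(high) / len(high)
        if new_centre == centre:
            break
        centre = new_centre
    return {pair for pair, s in scores.items() if abs(s - centre) < abs(s)}

def candidate_parents(selected_pairs):
    """For every selected pair (i, y), record v_i as a candidate parent of v_y."""
    candidates = {}
    for i, y in selected_pairs:
        candidates.setdefault(y, set()).add(i)
    return candidates

pairwise = {(0, 1): 0.9, (0, 2): 0.05, (1, 2): 0.8, (2, 3): 0.0}
selected = split_scores_fixed_zero(pairwise)
parents_of = candidate_parents(selected)
```

Fixing one center at 0 means that low scores are absorbed by that cluster, so only pairs with clearly non-zero scores survive as candidate parent relationships.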
S103, calculating the final parent node of each node by combining the candidate parent nodes, and obtaining a final parent node set F according to the final parent node of each node.
In this embodiment, the step of calculating the final parent node of each node in combination with the candidate parent node includes:
Obtaining a preset queue Q, a preset set T and a current upper score limit g_max, wherein the set T is initially an empty set, the queue Q is preset, and the current upper score limit g_max is preset to negative infinity (−∞);
Judging whether the queue Q is empty;
If not, acquiring an element E from the queue Q, and recording the sequence number j of the node corresponding to the element E;
Taking the (j+1)-th node from the candidate parent node set of node v_i, denoting it as node v_p, and adding it to the set T, wherein the final parent node set F_i of node v_i is preset to an empty set;
Calculating a scoring function f(T, v_i);
Judging whether the scoring function f(T, v_i) is larger than the current upper score limit g_max;
If it is greater than the current upper score limit g_max, updating the upper score limit g_max according to the scoring function f(T, v_i), updating the final parent node set F_i according to the set T, and calculating the score upper limit function g(C_i, T, v_i) according to the updated final parent node set F_i;
Judging whether the score upper limit function g(C_i, T, v_i) is larger than the upper score limit g_max;
If it is greater than the upper score limit g_max, adding node v_p to the queue Q, and returning to the step of judging whether the queue Q is empty, until the number of candidate parent nodes of v_i is reached, so as to obtain the final parent node set F of any node v_i.
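One possible reading of the steps above is a queue-based branch-and-bound search over subsets of the candidate parents of a single node. The Python sketch below follows that reading under several assumptions: the queue starts with the empty subset, each queue element stores a subset together with the index of its last node, and the bound is checked for every extended subset. The callables `score` and `upper_bound` stand in for f(T, v_i) and g(C_i, T, v_i) and are supplied by the caller; none of these names come from the application.

```python
from collections import deque

def search_final_parents(candidates, score, upper_bound):
    """Return (F_i, g_max) for one node, given its ordered candidate parents."""
    best_score = float("-inf")   # g_max, preset to negative infinity
    best_parents = frozenset()   # F_i, preset to the empty set
    queue = deque([((), -1)])    # assumption: Q initially holds the empty subset
    while queue:
        subset, last = queue.popleft()
        # Extend the subset only with later candidates, so no subset repeats.
        for p in range(last + 1, len(candidates)):
            T = subset + (candidates[p],)
            s = score(T)
            if s > best_score:
                best_score, best_parents = s, frozenset(T)
            # Keep T for further extension only if its bound can still beat g_max.
            if upper_bound(T) > best_score:
                queue.append((T, p))
    return best_parents, best_score

# Toy example: nodes "a" and "c" are true parents, "b" is a distractor.
def toy_score(T):
    good = len(set(T) & {"a", "c"})
    return good - 0.5 * (len(T) - good)

exhaustive = search_final_parents(["a", "b", "c"], toy_score, lambda T: float("inf"))
pruned = search_final_parents(["a", "b", "c"], toy_score, lambda T: 2.0)
```

With an infinite bound the search enumerates every subset; with a valid finite bound, branches that cannot exceed the current best score are dropped, which is the purpose of the scoring upper limit function.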
S104, generating a directed edge from a parent node to node v_i according to each node v_i and each parent node in the final parent node set F corresponding to node v_i, thereby obtaining a propagation network topology structure to reconstruct the propagation network structure.
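Step S104 amounts to turning the per-node final parent sets into a list of directed edges. A minimal Python sketch, assuming the final parent sets are held in a dictionary `{node: set_of_parents}` (a hypothetical layout chosen for this illustration):

```python
def build_edges(final_parents):
    """Return the directed edges (parent, node) implied by the final parent sets."""
    return sorted(
        (parent, node)
        for node, parents in final_parents.items()
        for parent in parents
    )

topology = build_edges({0: set(), 1: {0}, 2: {0, 1}})
```

The resulting edge list is the reconstructed topology; a node with an empty final parent set simply contributes no incoming edges.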
Taking the DUNF dataset as an example, DUNF is a real-world blog network dataset containing 750 users, which represent nodes, and 2974 follow relationships, which represent directed edges between nodes. The IC propagation model is applied to the network, different proportions of initially infected nodes (0.05, 0.1, 0.15, 0.2, 0.25) are selected, and 5 × 200 historical observation records are obtained through simulation.
The propagation network reconstruction method of the present invention and the prior-art method TENDS are applied to this dataset, and after all steps are completed, the recall and the accuracy are recorded. The recall is the proportion of the real edges of the propagation network that the method finds correctly, and the accuracy (i.e. precision) is the proportion of the edges found by the method that are correct. The harmonic mean of the recall and the accuracy gives the F-score, which reflects the overall performance of the method on both measures; its range is [0, 1], and the higher the value, the better the performance. The results are shown in Table 1:
Table 1:
Taking the DPU dataset as an example, DPU is a larger blog network dataset that contains 1038 users and 11385 follow relationships. The IC propagation model is applied to the network, different proportions of initially infected nodes (0.05, 0.1, 0.15, 0.2, 0.25) are selected, and 5 × 200 historical observation records are obtained through simulation. The propagation network reconstruction method of the present invention and the prior-art method TENDS are applied to this dataset, and the recall and the accuracy are recorded after all steps are completed. The results are shown in Table 2:
Table 2:
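For reference, the evaluation measures used in the experiments above (recall, accuracy in the sense of precision, and the F-score) can be computed from the true and reconstructed edge sets as in the following Python sketch. The function name and the convention of returning 0 for empty inputs are assumptions of this illustration; the numbers in the example are made up and are not results from the application.

```python
def edge_metrics(true_edges, found_edges):
    """Return (recall, precision, f_score) for directed edge sets."""
    true_edges, found_edges = set(true_edges), set(found_edges)
    correct = len(true_edges & found_edges)
    recall = correct / len(true_edges) if true_edges else 0.0
    precision = correct / len(found_edges) if found_edges else 0.0
    if precision + recall == 0:
        return recall, precision, 0.0
    # The F-score is the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return recall, precision, f_score

# Direction matters: (3, 0) does not match the true edge (0, 3).
recall, precision, f_score = edge_metrics(
    true_edges=[(0, 1), (1, 2), (2, 3), (0, 3)],
    found_edges=[(0, 1), (1, 2), (3, 0)],
)
```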
In summary, with the propagation network reconstruction method in the above embodiment of the present application, the network can be reconstructed from the node states at the final observation alone, without requiring the exact time at which each node was infected or the infection state of the nodes during each diffusion process. Furthermore, the technical solution of the application does not need to estimate probabilities; it converts the probability problem into a sampling problem and eliminates the dependence on the number of samples, so that more accurate results can be obtained from a small number of data samples. This solves the technical problem in the prior art that propagation network reconstruction depends on the exact infection time of each node and can obtain relatively accurate results only with a large amount of sample data.
Example 2
The propagation network reconstruction system provided in the second embodiment of the present invention includes:
The acquisition module is used for acquiring historical observation data of the network node to acquire a historical observation data set;
The candidate parent node calculation module is used for calculating a scoring function f(v_i, v_y) of every two nodes in the network according to the historical observation data set to obtain a scoring set, and calculating candidate parent nodes of every node from the scoring set to obtain a candidate parent node set C, wherein v_i and v_y are nodes;
The final parent node calculation module is used for calculating the final parent node of each node by combining the candidate parent nodes and obtaining a final parent node set F according to the final parent node of each node;
A reconstruction module, configured to generate a directed edge from a parent node to node v_i according to each node v_i and each parent node in the final parent node set F corresponding to node v_i, so as to obtain a propagation network topology structure to reconstruct the propagation network structure;
The final parent node calculation module comprises:
Obtaining a preset queue Q, a preset set T and a current upper score limit g_max, wherein the set T is initially an empty set, the queue Q is preset, and the current upper score limit g_max is preset to negative infinity (−∞);
Judging whether the queue Q is empty;
If not, acquiring an element E from the queue Q, and recording the sequence number j of the node corresponding to the element E;
Taking the (j+1)-th node from the candidate parent node set of node v_i, denoting it as node v_p, and adding it to the set T, wherein the final parent node set F_i of node v_i is preset to an empty set;
Calculating a scoring function f(T, v_i);
Judging whether the scoring function f(T, v_i) is larger than the current upper score limit g_max;
If it is greater than the current upper score limit g_max, updating the upper score limit g_max according to the scoring function f(T, v_i), updating the final parent node set F_i according to the set T, and calculating the score upper limit function g(C_i, T, v_i) according to the updated final parent node set F_i;
Judging whether the score upper limit function g(C_i, T, v_i) is larger than the upper score limit g_max;
If it is greater than the upper score limit g_max, adding node v_p to the queue Q, and returning to the step of judging whether the queue Q is empty, until the number of candidate parent nodes of v_i is reached, so as to obtain the final parent node set F of any node v_i.
In summary, with the propagation network reconstruction system in the above embodiment of the present application, the network can be reconstructed from the node states at the final observation alone, without requiring the exact time at which each node was infected or the infection state of the nodes during each diffusion process. Furthermore, the technical solution of the application does not need to estimate probabilities; it converts the probability problem into a sampling problem and eliminates the dependence on the number of samples, so that more accurate results can be obtained from a small number of data samples. This solves the technical problem in the prior art that propagation network reconstruction depends on the exact infection time of each node and can obtain relatively accurate results only with a large amount of sample data.
Furthermore, an embodiment of the present invention proposes a computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, implements the steps of the method in the above-mentioned embodiment.
Furthermore, an embodiment of the present invention also proposes a data processing apparatus including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method in the above embodiment when executing the program.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium may even be paper or other suitable medium upon which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, Programmable Gate Arrays (PGAs), Field Programmable Gate Arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims (8)

1. A method of propagation network reconstruction, comprising:
acquiring historical observation data of network nodes to obtain a historical observation data set;
calculating a scoring function f(v_i, v_y) for every two nodes in the network according to the historical observation data set to obtain a score set, and calculating the candidate parent nodes of each node from the score set to obtain a candidate parent node set C, wherein v_i and v_y are nodes;
calculating the final parent nodes of each node from the candidate parent nodes, and obtaining a final parent node set F from the final parent nodes of each node;
generating, for each node v_i and each parent node in the final parent node set F corresponding to that node, a directed edge from the parent node to the node v_i, thereby obtaining the propagation network topology so as to reconstruct the propagation network structure;
wherein the step of calculating the final parent nodes of each node from the candidate parent nodes comprises:
obtaining a preset queue Q, a preset set T, and a current score upper limit g_max, wherein the set T is initially an empty set, the queue Q holds its preset initial content, and the current score upper limit g_max is preset to negative infinity (−∞);
judging whether the queue Q is empty;
if not, taking an element E from the queue Q and recording the sequence number j of the node corresponding to the element E;
taking the (j+1)-th node from the candidate parent node set of node v_i, denoting it as node v_p, and adding it to the set T, wherein the final parent node set F_i of node v_i is preset to an empty set;
calculating the scoring function f(T, v_i);
judging whether the scoring function f(T, v_i) is greater than the current score upper limit g_max;
if it is greater than the current score upper limit g_max, updating the score upper limit g_max according to the scoring function f(T, v_i), updating the final parent node set F_i according to the set T, and, following the update of F_i, calculating the score upper limit function g(C_i, T, v_i);
judging whether the score upper limit function g(C_i, T, v_i) is greater than the score upper limit g_max;
if it is greater than the score upper limit g_max, adding the node v_p to the queue Q and returning to the step of judging whether the queue Q is empty, until the number of candidate parent nodes of v_i is reached, thereby obtaining the final parent node set F of any node v_i.
2. The propagation network reconstruction method according to claim 1, wherein the calculation formula of the scoring function f(C, v_i) is:
where Tr(·) denotes the trace of a matrix; J is an n × n matrix, n being the number of records in the historical observation data set of the propagation network nodes, and J = I − 1/n, where I is the n × n identity matrix; K_C and K_{v_i} are n × n symmetric matrices; the entry in row i, column j of K_C is the inner product of the i-th and j-th records of the nodes in the set C, and the entry in row i, column j of K_{v_i} is the inner product of the i-th and j-th records of node v_i.
3. The propagation network reconstruction method according to claim 2, wherein the calculation formula of the score upper limit function is:
where n is the number of records; Δ_n denotes the set of reorderings of the n records, of which there are n! in total; δ is one of the reordering schemes; and (v_i)^δ denotes the n records of node v_i reordered according to δ.
4. The propagation network reconstruction method according to claim 1, wherein the step of calculating the scoring function f(v_i, v_y) for every two nodes in the network according to the historical observation data set to obtain a score set comprises: calculating the scoring function f(v_i, v_y) for every two nodes in the network to obtain n(n−1)/2 scores, which form the score set, denoted {f_{1,2}, f_{1,3}, …, f_{i,j}, …, f_{n−1,n}}.
5. The propagation network reconstruction method according to claim 4, wherein the step of calculating the candidate parent nodes of each node from the score set comprises:
applying K-Means clustering to the score set, wherein K is set to 2 and one cluster center is fixed at 0;
selecting, from the clustered score set, the class of scores whose cluster center is not 0, and, for each score f_{i,j} in that class, recording that node v_y has a candidate parent node v_i.
6. A propagation network reconstruction system, the system comprising:
an acquisition module, configured to acquire historical observation data of network nodes to obtain a historical observation data set;
a candidate parent node calculation module, configured to calculate a scoring function f(v_i, v_y) for every two nodes in the network according to the historical observation data set to obtain a score set, and to calculate the candidate parent nodes of each node from the score set to obtain a candidate parent node set C, wherein v_i and v_y are nodes;
a final parent node calculation module, configured to calculate the final parent nodes of each node from the candidate parent nodes and to obtain a final parent node set F from the final parent nodes of each node;
a reconstruction module, configured to generate, for each node v_i and each parent node in the final parent node set F corresponding to that node, a directed edge from the parent node to the node v_i, thereby obtaining the propagation network topology so as to reconstruct the propagation network structure;
wherein the final parent node calculation module is specifically configured for:
obtaining a preset queue Q, a preset set T, and a current score upper limit g_max, wherein the set T is initially an empty set, the queue Q holds its preset initial content, and the current score upper limit g_max is preset to negative infinity (−∞);
judging whether the queue Q is empty;
if not, taking an element E from the queue Q and recording the sequence number j of the node corresponding to the element E;
taking the (j+1)-th node from the candidate parent node set of node v_i, denoting it as node v_p, and adding it to the set T, wherein the final parent node set F_i of node v_i is preset to an empty set;
calculating the scoring function f(T, v_i);
judging whether the scoring function f(T, v_i) is greater than the current score upper limit g_max;
if it is greater than the current score upper limit g_max, updating the score upper limit g_max according to the scoring function f(T, v_i), updating the final parent node set F_i according to the set T, and, following the update of F_i, calculating the score upper limit function g(C_i, T, v_i);
judging whether the score upper limit function g(C_i, T, v_i) is greater than the score upper limit g_max;
if it is greater than the score upper limit g_max, adding the node v_p to the queue Q and returning to the step of judging whether the queue Q is empty, until the number of candidate parent nodes of v_i is reached, thereby obtaining the final parent node set F of any node v_i.
7. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements a propagation network reconstruction method as claimed in any one of claims 1-5.
8. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the propagation network reconstruction method according to any one of claims 1-5 when executing the program.
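The pairwise scoring and candidate-selection steps recited in claims 2, 4 and 5 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The score is assumed to take the form Tr(K_C J K_{v_i} J) without any normalising factor, because the exact formula of claim 2 is not reproduced in the text, and J is assumed to be the centering matrix I − (1/n)·11ᵀ. The function names `gram`, `trace_score`, `two_means_zero_center` and `candidate_parents` are hypothetical, as is the rule used here to update the free cluster center.

```python
import numpy as np


def gram(x):
    """Inner-product matrix of the records; x has shape (n_records,) or (n_records, k)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    return x @ x.T


def trace_score(x_c, x_i):
    """Assumed score Tr(K_C J K_vi J); normalisation is omitted (assumption)."""
    n = len(x_i)
    # Assumption: J is the centering matrix I - (1/n) * ones.
    J = np.eye(n) - np.full((n, n), 1.0 / n)
    return float(np.trace(gram(x_c) @ J @ gram(x_i) @ J))


def two_means_zero_center(scores, iters=100):
    """1-D K-Means with K = 2 where one center stays fixed at 0.

    Returns a boolean mask marking the scores assigned to the non-zero center.
    """
    s = np.asarray(scores, dtype=float)
    c = s.max()  # initial guess for the free center (assumption)
    mask = np.zeros(s.shape, dtype=bool)
    for _ in range(iters):
        mask = np.abs(s - c) < np.abs(s)  # closer to c than to 0
        if not mask.any():
            break
        new_c = s[mask].mean()
        if np.isclose(new_c, c):
            break
        c = new_c
    return mask


def candidate_parents(data):
    """data: (n_records, n_nodes) array of observed node states.

    Scores every unordered pair (i, y) with i < y, clusters the scores, and
    records v_i as a candidate parent of v_y for pairs in the non-zero class.
    Whether the reverse direction is also recorded is not stated in the claims.
    """
    data = np.asarray(data, dtype=float)
    n_nodes = data.shape[1]
    pairs = [(i, y) for i in range(n_nodes) for y in range(i + 1, n_nodes)]
    scores = [trace_score(data[:, i], data[:, y]) for i, y in pairs]
    keep = two_means_zero_center(scores)
    cands = {v: [] for v in range(n_nodes)}
    for (i, y), k in zip(pairs, keep):
        if k:
            cands[y].append(i)
    return cands


if __name__ == "__main__":
    # Toy data: columns 0 and 1 are identical, column 2 is uncorrelated with both.
    demo = np.array([[1, 1, 1], [1, 1, 0], [0, 0, 1], [0, 0, 0],
                     [1, 1, 1], [1, 1, 0], [0, 0, 1], [0, 0, 0]])
    print(candidate_parents(demo))
```

On the toy data, only the pair (v_0, v_1) falls in the non-zero cluster, so v_0 is recorded as a candidate parent of v_1 and the other nodes receive no candidates. The final-parent search of claim 1 is not sketched, because the score upper limit function of claim 3 is not reproduced in the text.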
CN202410439289.0A 2024-04-12 2024-04-12 Propagation network reconstruction method, system, storage medium and equipment Pending CN118055030A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410439289.0A CN118055030A (en) 2024-04-12 2024-04-12 Propagation network reconstruction method, system, storage medium and equipment

Publications (1)

Publication Number Publication Date
CN118055030A (en) 2024-05-17

Family

ID=91052236

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410439289.0A Pending CN118055030A (en) 2024-04-12 2024-04-12 Propagation network reconstruction method, system, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN118055030A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115801600A (en) * 2022-11-14 2023-03-14 Wuhan University Method and device for reconstructing propagation network structure facing noise data environment
CN116308853A (en) * 2022-09-09 2023-06-23 Wuhan University Propagation network structure reconstruction method, device, equipment and readable storage medium
CN116304205A (en) * 2023-02-28 2023-06-23 Jiangxi Qiushi Higher Research Institute Propagation network structure reconstruction method, device, equipment and storage medium
CN116611508A (en) * 2023-04-13 2023-08-18 Jiangxi Qiushi Higher Research Institute Propagation network reconstruction method and device for non-infection timestamp data

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116308853A (en) * 2022-09-09 2023-06-23 Wuhan University Propagation network structure reconstruction method, device, equipment and readable storage medium
CN115801600A (en) * 2022-11-14 2023-03-14 Wuhan University Method and device for reconstructing propagation network structure facing noise data environment
CN116304205A (en) * 2023-02-28 2023-06-23 Jiangxi Qiushi Higher Research Institute Propagation network structure reconstruction method, device, equipment and storage medium
CN116611508A (en) * 2023-04-13 2023-08-18 Jiangxi Qiushi Higher Research Institute Propagation network reconstruction method and device for non-infection timestamp data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
孔露露: "Research on propagation network inference based on clustering pruning", CNKI Outstanding Master's Theses Full-text Database, 15 March 2024 (2024-03-15) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination