CN113254717A - Multidimensional graph network node clustering processing method, apparatus and device
- Publication number
- CN113254717A (CN202110645181.3A)
- Authority
- CN
- China
- Prior art keywords
- node
- nodes
- graph network
- dimensional
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application relates to a multidimensional graph network node clustering processing method, apparatus and device. The method comprises: converting an original unweighted multidimensional graph network into a weighted multidimensional graph network according to the attribute similarity and the structural similarity of the nodes; performing intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to a constructed intra-layer transition probability and a constructed cross-layer random walk transition probability, to obtain a sampling sequence of each node of the weighted multidimensional graph network; converting the sampling sequence of each node into a low-dimensional embedding based on the SkipGram model; clustering the low-dimensional embedding of each node with the K-means algorithm to obtain a clustering result of each node of the weighted multidimensional graph network; and projecting each low-dimensional embedding into a two-dimensional space with a dimension reduction technique and displaying the clustering result with a graph visualization technique. The method is intended to significantly improve the node clustering effect.
Description
Technical Field
The present application relates to the field of network data processing technologies, and in particular, to a multidimensional graph network node clustering method, apparatus, and device.
Background
Network theory can be used to model complex relationships among entities in real life. Traditional approaches to modeling the relationships among individuals in a system mostly adopt a simple single network, or single-layer network, i.e. a network in which all nodes are of the same type and only one type of interaction exists; the nodes represent individuals in a complex system, and the connecting edges represent the interactions between individuals. A multilayer network can model the different interactions that exist among individuals: it comprises a plurality of layers, each of which is an independent single-layer network (i.e. a traditional network). Edges within one layer of a multilayer network are of the same type, but edge types may differ between layers; the node types of different layers may also differ.
The networks at different layers of a multidimensional graph network are composed of the same entities, while the connections between nodes in each layer have different properties. A multidimensional graph network is thus a special type of multilayer network. Node clustering of an attribute single-layer network aims to satisfy the following requirements: 1) structural compactness, i.e. nodes in the same cluster are closely connected, while nodes in different clusters are far apart; 2) attribute homogeneity, i.e. nodes in the same cluster have similar attribute values, while nodes in different clusters differ significantly in attribute values. In practice, node clustering of an attribute multidimensional graph network must satisfy not only structural compactness and attribute homogeneity, but must also take into account the associations between different dimensions and the information content of the graph network of each dimension; that is, the graph networks of different dimensions differ in their importance to node clustering of the whole system. However, in the process of implementing the present invention, the inventor found that current graph network node clustering techniques suffer from a poor clustering effect.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a multidimensional graph network node clustering method with a better clustering effect, a multidimensional graph network node clustering device, a computer device, and a computer readable storage medium.
In order to achieve the above purpose, the embodiment of the invention adopts the following technical scheme:
in one aspect, an embodiment of the present invention provides a multidimensional graph network node clustering method, including:
converting the original unweighted multidimensional graph network into a weighted multidimensional graph network according to the attribute similarity and the structure similarity of the nodes;
performing intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability, to obtain a sampling sequence of each node of the weighted multidimensional graph network;
converting the sampling sequence of each node into a low-dimensional embedding based on the SkipGram model;
clustering the low-dimensional embedding of each node by adopting a K-means algorithm to obtain a clustering result of each node of the weighted multidimensional graph network;
and projecting each low-dimensional embedding into a two-dimensional space by adopting a dimension reduction technology and displaying the clustering result by adopting a graph visualization technology.
In another aspect, a multidimensional graph network node clustering processing apparatus is also provided, including:
the network conversion module is used for converting the original unweighted multidimensional graph network into a weighted multidimensional graph network according to the attribute similarity and the structure similarity of the nodes;
the random walk processing module is used for performing intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability, to obtain a sampling sequence of each node of the weighted multidimensional graph network;
the embedding processing module is used for converting the sampling sequence of each node into a low-dimensional embedding based on the SkipGram model;
the clustering processing module is used for clustering the low-dimensional embedding of each node by adopting a K-means algorithm to obtain a clustering result of each node of the weighted multidimensional graph network;
and the visualization module is used for projecting each low-dimensional embedding into a two-dimensional space by adopting a dimension reduction technology and displaying the clustering result by adopting a graph visualization technology.
In still another aspect, a computer device is further provided, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the steps of any one of the above-mentioned multidimensional graph network node clustering processing methods when executing the computer program.
In still another aspect, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any one of the above-mentioned methods for processing clusters of nodes in a multidimensional graph network.
One of the above technical solutions has the following advantages and beneficial effects:
according to the method, apparatus and device for multidimensional graph network node clustering processing, the original unweighted attribute multidimensional graph network (i.e. the unweighted multidimensional graph network) is first converted into a weighted multidimensional graph network; the conversion encodes the combined attribute similarity and structural similarity between nodes that share a connecting edge, which can enhance node clustering performance. Second, because the graph networks of different dimensions differ in their importance to node clustering, i.e. in their information content, the cross-layer random walk transition probability is constructed heterogeneously according to this difference in information content and combined with the intra-layer transition probability to obtain a sampling sequence of each node; the sampling sequence captures the neighbor node information of each node. The resulting sampling sequences are then converted into low-dimensional embeddings using a network embedding technique. Based on the low-dimensional embeddings of all nodes, the nodes are clustered with the K-means clustering algorithm to obtain the node clustering result. Finally, the low-dimensional embeddings are projected into a two-dimensional space with a dimension reduction technique to obtain the coordinates of each node in that space, the label information of the nodes is used as the color mapping, and a graph visualization technique displays the node clustering effect visually. In this way the clustering effect is significantly improved and the application range of network embedding technology is expanded.
Drawings
FIG. 1 is a schematic diagram of a conventional single-layer network and a multi-dimensional network;
FIG. 2 is a flowchart illustrating a method for clustering nodes in a multidimensional graph network according to an embodiment;
FIG. 3 is a flow diagram that illustrates the conversion process of the multidimensional graph network in one embodiment;
FIG. 4 is a diagram illustrating conversion of an unweighted multidimensional graph network to a weighted multidimensional graph network, under an embodiment;
FIG. 5 is a schematic block diagram of a multidimensional graph network node clustering processing apparatus in an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
In addition, the technical solutions of the embodiments of the present invention may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, the combination should be considered not to exist and falls outside the protection scope of the present invention.
FIG. 1 is a schematic diagram of a conventional single-layer network and a multidimensional network. Different layers of the multidimensional network can be regarded as the interactions that exist among the same node set from different angles. If a network with multiple interaction relationships is represented as a multi-relationship fusion network, as shown in FIG. 1(a), neither the structural features of each dimension's graph network nor the coupling and interaction information between different dimensions can be clearly expressed. Compared with FIG. 1(b), FIG. 1(c) more clearly represents the interactions of the same node set in three different dimensions, namely dimension A, dimension B and dimension C, together with the correlation information between layers. Compared with the traditional single-layer network, the multidimensional graph network describes different characteristics of a complex system from different angles and compensates for the bias introduced by a single viewpoint, so that results obtained by analyzing the complex system on the basis of the multidimensional graph network are more accurate. When the nodes in the network carry attribute features, the multidimensional graph network is called an attribute multidimensional graph network.
In addition, visualization technology expresses information as visual images and provides strong support for discovering and understanding scientific laws. Graph visualization has become an important method for analyzing graph network data, and mainly comprises force-directed methods and methods based on data dimension reduction. Compared with force-directed methods, graph visualization based on data dimension reduction optimizes an objective function to preserve the similarity between the node distribution in the original graph space and that in the two-dimensional layout space, so that the node distribution in the two-dimensional layout reflects the node information of the original graph space. Visualization based on nonlinear dimension reduction can reflect structured data with nonlinear relationships, and is more widely applied than visualization based on linear dimension reduction.
Node clustering of an attribute single-layer network aims to satisfy the following requirements: 1) structural compactness, i.e. nodes in the same cluster are closely connected, while nodes in different clusters are far apart; 2) attribute homogeneity, i.e. nodes in the same cluster have similar attribute values, while nodes in different clusters differ significantly in attribute values. In practice, node clustering of an attribute multidimensional graph network must satisfy not only structural compactness and attribute homogeneity, but must also take into account the associations between different dimensions and the information content of the graph network of each dimension; that is, the graph networks of different dimensions differ in their importance to node clustering of the whole system.
The inventor found that current graph network node clustering techniques suffer from a poor clustering effect, which shows in the following ways: 1) the classical random-walk-based network embedding technique (node2vec) focuses on single-layer networks and usually uses only topology information, without considering the attribute features of nodes; 2) attribute graph network node clustering methods that do consider attribute features are generally directed only at single-layer graph networks; 3) traditional node clustering methods for multidimensional graph networks usually extend single-layer methods, such as modularity or matrix decomposition, to the multidimensional graph network, but these methods are not suitable for large-scale graph networks.
The invention provides an effective solution to the technical problem of poor clustering effect of the node clustering technology of the current graph network, and can remarkably enhance the clustering effect of the nodes.
For convenience of illustration and understanding, a multidimensional graph network (or multi-relationship network) composed of $M$ graph networks of different dimensions, as shown in FIG. 1(c), is represented as $G = (V, E, X, A)$, where $V$ represents the node set consisting of the $n$ nodes in the multidimensional network; $E$ represents the set of edges of the multidimensional network; $|V|$ and $|E|$ respectively represent the sizes of the node set and the edge set in the multidimensional network; $X$ represents the feature matrix formed by the feature values of the $n$ nodes; $A$ represents the adjacency matrix of the multidimensional network; $G^{m}$ denotes the graph network of dimension $m$; and $A^{m}$ (of size $n \times n$) indicates that this graph network is an unweighted network, where $A^{m}_{uv} = 1$ represents that node $u$ and node $v$ have a connecting edge in the graph network of dimension $m$ (i.e. $(u, v) \in E^{m}$), and otherwise $A^{m}_{uv} = 0$.
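As a minimal sketch of how such a network might be held in memory, the snippet below stores one adjacency structure per dimension over a shared node set. The class name `MultiDimGraph`, its fields, and the integer node ids are illustrative assumptions, not part of the described method.

```python
from dataclasses import dataclass, field


@dataclass
class MultiDimGraph:
    """Unweighted multidimensional graph: shared nodes, one edge set per dimension."""
    num_nodes: int
    features: list                                # features[u] = attribute vector of node u
    layers: dict = field(default_factory=dict)    # dimension name -> {u: set of neighbours}

    def add_edge(self, dim, u, v):
        # Create the layer lazily, with an empty neighbour set for every node.
        adj = self.layers.setdefault(dim, {n: set() for n in range(self.num_nodes)})
        adj[u].add(v)
        adj[v].add(u)

    def has_edge(self, dim, u, v):
        """Adjacency entry: 1 if (u, v) is an edge in dimension `dim`, else 0."""
        return int(v in self.layers.get(dim, {}).get(u, ()))


g = MultiDimGraph(num_nodes=3, features=[[1, 0], [1, 1], [0, 1]])
g.add_edge("A", 0, 1)
g.add_edge("B", 1, 2)
```

Keeping a neighbour set per node makes the common-neighbour lookups needed by structural similarity measures a plain set intersection.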
Referring to fig. 2, in one aspect, the present invention provides a method for processing node clusters of a multidimensional graph network, including the following processing steps S12 to S20:
S12, converting the original unweighted multidimensional graph network into a weighted multidimensional graph network according to the attribute similarity and the structural similarity of the nodes.
S14, performing intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability, to obtain a sampling sequence of each node of the weighted multidimensional graph network.
It is understood that in a multidimensional graph network, the graph networks of different dimensions represent different relationships among the same group of nodes from different views, and the attribute information of the nodes is shared by all dimensions. The purpose of multidimensional graph network node clustering is to detect the clusters shared by the graph networks of all dimensions, while also taking into account the correlation information among the graph networks of different dimensions. From the perspective of the whole system, different layers generally contribute differently to overall performance; that is, the graph networks of different dimensions can be ranked by their importance to the node clustering performance of the multidimensional graph network, giving a ranking in descending order.
Based on the above, this embodiment designs a multilayer random walk method for a multidimensional graph network that covers two cases: intra-layer random walk and cross-layer random walk.
S16, converting the sampling sequence of each node into a low-dimensional embedding based on the SkipGram model; it can be understood that the SkipGram model is the skip-gram neural network model known in the art.
S18, clustering the low-dimensional embedding of each node by adopting a K-means algorithm to obtain a clustering result of each node of the weighted multidimensional graph network; it can be understood that the K-means algorithm, also known in the art as the K-means clustering algorithm, is an iterative cluster analysis algorithm. Clustering the low-dimensional embeddings with the K-means clustering algorithm divides the nodes into K different clusters such that the within-cluster sum of squares is minimized, giving the node clustering result, i.e. K different clusters.
S20, projecting each low-dimensional embedding into a two-dimensional space by using a dimension reduction technology and displaying the clustering result by using a graph visualization technology.
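The clustering in step S18 can be illustrated with a short pure-Python sketch of Lloyd's iteration for K-means. The function name `kmeans`, the random initialization from sampled points, and the toy two-dimensional "embeddings" are assumptions made for illustration; a production pipeline would more likely call an existing library implementation.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centre assignment and centre update."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]   # initial centres: k sampled points
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to the centre with the smallest squared distance.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:      # assignments stable, so the iteration has converged
            break
        labels = new_labels
        # Update step: move each centre to the mean of its members (empty clusters keep theirs).
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


embeddings = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(embeddings, k=2)
```

Each iteration cannot increase the within-cluster sum of squares, which is the quantity step S18 seeks to minimize.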
According to the multidimensional graph network node clustering processing method, the original unweighted attribute multidimensional graph network (i.e. the unweighted multidimensional graph network) is first converted into a weighted multidimensional graph network; the conversion encodes the combined attribute similarity and structural similarity between nodes that share a connecting edge, which can enhance node clustering performance. Second, because the graph networks of different dimensions differ in their importance to node clustering, i.e. in their information content, the cross-layer random walk transition probability is constructed heterogeneously according to this difference in information content and combined with the intra-layer transition probability to obtain a sampling sequence of each node; the sampling sequence captures the neighbor node information of each node. The resulting sampling sequences are then converted into low-dimensional embeddings using a network embedding technique. Based on the low-dimensional embeddings of all nodes, the nodes are clustered with the K-means clustering algorithm to obtain the node clustering result. Finally, the low-dimensional embeddings are projected into a two-dimensional space with a dimension reduction technique to obtain the coordinates of each node in that space, the label information of the nodes is used as the color mapping, and a graph visualization technique displays the node clustering effect visually. In this way the clustering effect is significantly improved and the application range of network embedding technology is expanded.
Referring to fig. 3 and 4, in an embodiment, the step S12 may include the following steps:
S122, for the graph network of each dimension, determining the attribute similarity of nodes that share a connecting edge according to the number of similar entries in their F-dimensional attribute vectors;
S124, determining the structural similarity between the nodes by adopting a structural similarity measurement method;
S126, adding a weight to each unweighted connecting edge in the graph network of each dimension by using the attribute similarity and the structural similarity, thereby converting the unweighted multidimensional graph network into a weighted multidimensional graph network.
It can be understood that the clustering target of the attribute multidimensional graph network needs to satisfy not only the compactness of the structure but also the homogeneity of the attributes. Therefore, based on the idea, for the graph network of each dimension, firstly measuring the attribute similarity between the nodes with connected edges in the graph network, secondly calculating the structural similarity between the nodes based on a structural similarity measurement method, and then fusing the attribute similarity and the structural similarity between the nodes to obtain the connected edge weight between the nodes, namely converting the unweighted graph network of each dimension into a weighted graph network.
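Specifically, for processing attribute similarity:
the most intuitive way to measure the attribute similarity between nodes $u$ and $v$ that share a connecting edge $(u, v)$ is to compare the attribute vectors $x_u$ and $x_v$ of node $u$ and node $v$ one entry at a time and count the number of similar entries among the $F$ attribute dimensions:
$$\delta(x_{u,f}, x_{v,f}) = \begin{cases} 1, & x_{u,f} = x_{v,f} \\ 0, & \text{otherwise} \end{cases} \tag{1}$$
based on formula (1), the attribute similarity between node $u$ and node $v$ can be expressed as:
$$S_{attr}(u, v) = \sum_{f=1}^{F} \delta(x_{u,f}, x_{v,f}) \tag{2}$$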
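The entry-by-entry comparison described above can be sketched as follows. The function name `attribute_similarity` is hypothetical, and the optional `normalize` flag (dividing the count by F) is an added assumption for illustration rather than something the text specifies.

```python
def attribute_similarity(x_u, x_v, normalize=False):
    """Count attribute dimensions on which two nodes agree.

    normalize=True divides by the number of dimensions F; that scaling is an
    illustrative assumption, not taken from the source text.
    """
    if len(x_u) != len(x_v):
        raise ValueError("attribute vectors must have the same length F")
    matches = sum(1 for a, b in zip(x_u, x_v) if a == b)
    return matches / len(x_u) if normalize else matches


count = attribute_similarity([1, 0, 1, 1], [1, 1, 1, 0])
fraction = attribute_similarity([1, 0, 1, 1], [1, 1, 1, 0], normalize=True)
```

Here the two vectors agree on the first and third entries, so the count is 2 out of F = 4.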
for processing structural similarity:
there are many common methods for measuring the structural similarity between nodes, such as the Common Neighbors (CN) algorithm, the Jaccard coefficient, the Resource Allocation (RA) index, the Adamic-Adar (AA) index, Preferential Attachment (PA), and the Community Resource Allocation index. The structural similarity between the nodes can be determined by adopting any one of these structural similarity measurement methods.
In one embodiment, the structural similarity measurement method is preferably the RA index. Since the RA index is the best-performing method in graph analysis tasks such as community detection and link prediction, this embodiment measures the structural similarity between nodes based on the RA index, calculated as follows:
$$S_{RA}(u, v) = \sum_{z \in N(u) \cap N(v)} \frac{1}{k_z} \tag{3}$$
where $N(u) \cap N(v)$ represents the common neighbors of node $u$ and node $v$, and $k_z$ represents the degree of each node $z$ among the common neighbors; taking the reciprocal of each degree and summing gives the RA value between node $u$ and node $v$, i.e. the structural similarity of node $u$ and node $v$. In this way, optimal processing performance can be achieved when determining the structural similarity between nodes.
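A minimal sketch of the RA index as it is commonly defined (sum of reciprocal degrees over common neighbours) is given below. The function name `resource_allocation` and the dictionary-of-sets adjacency layout are illustrative assumptions.

```python
def resource_allocation(adj, u, v):
    """RA index: sum of 1/degree over the common neighbours of u and v."""
    common = adj[u] & adj[v]
    # Every common neighbour is adjacent to both u and v, so its degree is at least 2.
    return sum(1.0 / len(adj[z]) for z in common)


# Toy layer: nodes 0 and 1 share neighbours 2 (degree 2) and 3 (degree 3).
adj = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1, 4}, 4: {3}}
ra_01 = resource_allocation(adj, 0, 1)   # 1/2 + 1/3
ra_04 = resource_allocation(adj, 0, 4)   # only node 3 is shared: 1/3
```

Low-degree common neighbours contribute more, which is what distinguishes the RA index from a plain common-neighbour count.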
Constructing a weighted multidimensional graph network based on the attribute similarity and the structural similarity: in particular, for the graph network $G^{m}$ of dimension $m$, a weight $w^{m}_{uv}$ can be added to each unweighted connecting edge $(u, v)$. The weight encodes the structural similarity and the attribute similarity between node $u$ and node $v$, and is calculated according to formula (4):
$$w^{m}_{uv} = \alpha \, S_{RA}(u, v) + \beta \, S_{attr}(u, v) \tag{4}$$
where the parameters $\alpha$ and $\beta$ respectively weigh the relative magnitude of the structural similarity and the attribute similarity between node $u$ and node $v$; $S_{RA}(u, v)$ represents the calculation result of formula (3), and $S_{attr}(u, v)$ represents the calculation result of formula (2). Based on formula (4), the attribute unweighted multidimensional graph network $G$ can be converted into a weighted multidimensional graph network $G_w$, as shown in FIG. 4, where A, B and C represent three different dimensions.
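The weighting step can be sketched for a single dimension as below, assuming the edge weight is a linear combination of the RA value and the count of matching attributes. The function name `edge_weights`, the default values of `alpha` and `beta`, and the toy data are hypothetical.

```python
def edge_weights(adj, features, alpha=0.5, beta=0.5):
    """Weight each edge (u, v), u < v, by alpha * RA(u, v) + beta * attribute matches."""
    weights = {}
    for u in adj:
        for v in adj[u]:
            if u < v:                                   # visit each undirected edge once
                ra = sum(1.0 / len(adj[z]) for z in adj[u] & adj[v])
                attr = sum(1 for a, b in zip(features[u], features[v]) if a == b)
                weights[(u, v)] = alpha * ra + beta * attr
    return weights


# Toy layer: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
features = {0: [1, 0], 1: [1, 1], 2: [0, 1], 3: [0, 1]}
w = edge_weights(adj, features)
```

Note that under this linear form an edge whose endpoints share no neighbours and no attribute values receives weight 0, so a small positive offset may be worth considering if every edge must remain traversable by the random walk.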
In an embodiment, in step S14, performing intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability may specifically include the following processing:
performing intra-layer biased random walk processing on the weighted multidimensional graph network by adopting a graph data embedding method; the intra-layer transition probability $\pi_{vx}$ from node $v$ to node $x$ is $\pi_{vx} = \alpha_{pq}(t, x) \cdot w_{vx}$, where $\alpha_{pq}(t, x)$ is calculated as follows:
$$\alpha_{pq}(t, x) = \begin{cases} 1/p, & d_{tx} = 0 \\ 1, & d_{tx} = 1 \\ 1/q, & d_{tx} = 2 \end{cases} \tag{5}$$
where $d_{tx}$ represents the distance between node $t$ and node $x$, node $t$ is the previous node of node $v$, node $x$ is the next-hop node of node $v$, $w_{vx}$ represents the weight of the connecting edge between node $v$ and node $x$, and the parameters $p$ and $q$ guide the random walker in performing a biased random walk;
determining the cross-layer random walk transition probability according to the modularity of the graph network of each dimension; the cross-layer random walk transition probability is given by formula (6),
where $w_{vx}$ represents the weight of the connecting edge between node $v$ and node $x$; $\alpha_{pq}(t, x)$ measures, according to the distance between node $t$ and node $x$, the probability that node $v$ selects node $x$ as the next hop; and $r$ represents the layer jump probability. When the modularity of the graph network of dimension $m$ is high, $r^{m}$ is set to a higher value; conversely, when the modularity of the graph network of dimension $m$ is low, $r^{m}$ is set to a lower value. $r^{m}$ represents the probability value set for the graph network of dimension $m$ according to its modularity, and likewise for the graph networks of the other dimensions; for example, $r^{1}$ represents the probability value set for the graph network of dimension 1 according to its modularity: the higher that modularity, the higher the value of $r^{1}$, and the random walker then jumps to the graph network of dimension 1 with probability $|r^{1}|$;
and instructing the random walker to determine, according to the cross-layer random walk transition probability, the layer of the weighted multidimensional graph network to traverse and the node to move to, and performing the cross-layer random walk.
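The intra-layer bias can be illustrated with the sketch below, which assumes the standard node2vec search bias: 1/p for returning to the previous node, 1 for a node adjacent to the previous node, and 1/q otherwise. The function names and the `frozenset`-keyed weight table are illustrative assumptions.

```python
def search_bias(adj, t, x, p, q):
    """node2vec bias for stepping to x when the walker arrived from t."""
    if x == t:              # distance 0: return to the previous node
        return 1.0 / p
    if x in adj[t]:         # distance 1: x is also a neighbour of t
        return 1.0
    return 1.0 / q          # distance 2: move away from t


def intra_layer_scores(adj, weights, t, v, p=1.0, q=1.0):
    """Unnormalized transition scores from v to each neighbour x: bias times edge weight."""
    return {x: search_bias(adj, t, x, p, q) * weights[frozenset((v, x))]
            for x in adj[v]}


adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
weights = {frozenset((0, 1)): 1.0, frozenset((0, 2)): 1.0,
           frozenset((1, 2)): 2.0, frozenset((1, 3)): 1.0}
# The walker came from node 0 and now stands on node 1.
scores = intra_layer_scores(adj, weights, t=0, v=1, p=2.0, q=0.5)
```

With p = 2 and q = 0.5 the walker is discouraged from backtracking to node 0 and encouraged to explore node 3, while the heavier edge to node 2 raises that neighbour's score as well.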
Specifically, for intra-layer random walk, the node2vec method (i.e., a graph data embedding method) is adopted. Given a source node $u$, $c_i$ denotes the $i$-th sampled node in a walk path of length $l$. Assume that the current sampled node is $c_{i-1} = v$; the random walker can walk to the next neighbor node according to the following probability distribution:
$$P(c_i = x \mid c_{i-1} = v) = \begin{cases} \dfrac{\pi_{vx}}{Z}, & (v, x) \in E^{m} \\ 0, & \text{otherwise} \end{cases} \tag{7}$$
where $(v, x) \in E^{m}$ indicates that node $v$ and node $x$ have a connecting edge in the graph network of relationship type $m$, $\pi_{vx}$ represents the intra-layer transition probability between node $v$ and node $x$, $Z$ represents a normalization constant, and $\alpha_{pq}(t, x)$ is given by formula (5) above.
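Turning unnormalized scores into a sampling distribution and drawing the next node can be sketched as follows. The helper names and the example scores are hypothetical; the scores stand in for the bias-times-weight values of the current node's neighbours.

```python
import random


def transition_distribution(scores):
    """Divide each unnormalized score by the normalization constant Z."""
    z = sum(scores.values())
    return {x: s / z for x, s in scores.items()} if z > 0 else {}


def sample_next(scores, rng):
    """Draw the next node with probability proportional to its score."""
    nodes = list(scores)
    return rng.choices(nodes, weights=[scores[x] for x in nodes], k=1)[0]


scores = {0: 0.5, 2: 2.0, 3: 1.5}        # neighbours of the current node and their scores
dist = transition_distribution(scores)   # Z = 4.0
rng = random.Random(0)
next_node = sample_next(scores, rng)
```

Nodes that are not neighbours of the current node simply do not appear in `scores`, which corresponds to the zero branch of the distribution.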
For cross-layer random walk, assume that the random walker is initially at a node $u$. In the first step, the random walker decides which layer to traverse and initializes the biased random walk; it then decides the next node to move to. In each subsequent step, the random walker again determines the layer to traverse and the next node to walk to.
The cross-layer random walk transition probability is determined by the importance of the graph network of each dimension to the node clustering performance of the multidimensional graph network. The amount of information in the graph network of each dimension can therefore be measured with a heuristic modularity method, which yields one modularity value for each of the $M$ graph networks. Next, a layer jump probability is defined:
In some embodiments, the difference between the layer jump probability values associated with a graph network of lower modularity is smaller than the corresponding difference for a graph network of higher modularity, where each probability value is set for the graph network of a given dimension according to the modularity of that graph network.
It is to be understood that if the modularity of a graph network is very low, a lower value is set for its difference; otherwise, a slightly higher value is set. The rationale is as follows: for a graph network layer with low modularity (a small amount of information), the random walker traverses that layer with relatively low probability; conversely, for a graph network layer with high modularity (a large amount of information), the random walker traverses that layer with relatively high probability. The reason is that a graph network layer with high modularity plays a more important role in the node clustering of the whole system.
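The layer selection described above can be sketched as follows. The text states only that higher modularity yields a larger difference and therefore a higher jump probability; the proportional mapping used here is an illustrative assumption, as are the function names and the modularity values, which are also assumed to be positive.

```python
import bisect
import random
from itertools import accumulate


def layer_thresholds(modularities):
    """Cumulative thresholds in (0, 1]; each layer owns an interval whose
    width is proportional to its modularity (an illustrative assumption)."""
    total = sum(modularities)
    return list(accumulate(m / total for m in modularities))


def choose_layer(thresholds, rng):
    """Pick a layer index by locating a uniform draw among the thresholds."""
    u = rng.random()
    # Clamp guards against the last threshold rounding to just below 1.0.
    return min(bisect.bisect_right(thresholds, u), len(thresholds) - 1)


modularities = [0.6, 0.3, 0.1]      # hypothetical per-layer modularity values
thresholds = layer_thresholds(modularities)

rng = random.Random(0)
counts = [0] * len(modularities)
for _ in range(10_000):
    counts[choose_layer(thresholds, rng)] += 1
```

Under this assumption, the width of each interval plays the role of the difference discussed above: the layer with the highest modularity owns the widest interval and is visited most often.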
In an embodiment, the step S16 may specifically include the following processing steps:
S162, dividing the sampling sequence through a window to obtain a training sample sequence of node information;
S164, inputting the training sample sequence into a SkipGram model and optimizing an objective function by a stochastic gradient descent method to obtain a low-dimensional dense embedding of the nodes;
wherein the process of optimizing the objective function by the stochastic gradient descent method comprises refining the conditional probability successively under a conditional independence assumption and a symmetry assumption, and obtaining the objective function from the refined conditional probability.
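The window division of step S162 can be sketched as below. This is a minimal illustration, assuming a symmetric window around each position; the function name, the node identifiers, and the window size are hypothetical.

```python
def window_pairs(walk, window):
    """Return (center, context) pairs for all nodes within `window`
    positions of each center node in one sampled walk."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:                  # a node is not its own context
                pairs.append((center, walk[j]))
    return pairs


walk = ["v1", "v4", "v2", "v7"]         # one sampled walk (hypothetical)
pairs = window_pairs(walk, window=1)
```

Each pair is one training sample for the Skip-Gram model: the center node is the input and the context node is the neighbor whose appearance probability is to be maximized.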
It can be understood that, in the processing of step S14 above, sampling sequences of the $N$ nodes in the $M$-dimensional graph network are obtained through the intra-layer and cross-layer random walk strategies, and each sequence contains the context information (i.e., the neighbor nodes) of a node. The sequences are then divided by a window to obtain training sample sequences of node information. The training sample sequences are input into the Skip-Gram model, and the following objective function is optimized by stochastic gradient descent to obtain the low-dimensional dense embedding $f(u)$ of each node $u$:

$$\max_{f} \sum_{u \in V} \log \Pr\left(N_S(u) \mid f(u)\right)$$

where $f$ is the embedding matrix of size $N \times d$ that maps each node to a $d$-dimensional vector; $d$ is a preset parameter; $V$ is the node set; $N_S(u)$ represents the neighbor nodes of node $u$; and $\Pr(N_S(u) \mid f(u))$ is a conditional probability. The objective states that, given each node, the probability of its neighbor nodes appearing is maximized.

Under the conditional independence assumption (i.e., given a source node, the probability that one of its neighbor nodes appears is independent of the other nodes in the neighbor set), the conditional probability can be further expressed as:

$$\Pr\left(N_S(u) \mid f(u)\right) = \prod_{n_i \in N_S(u)} \Pr\left(n_i \mid f(u)\right)$$

Under the symmetry assumption (i.e., the influence between two nodes in the $d$-dimensional feature space is symmetric, so that a node shares the same low-dimensional embedding whether it acts as a source node or as a neighbor node), the conditional probability can be further refined and expressed as:

$$\Pr\left(n_i \mid f(u)\right) = \frac{\exp\left(f(n_i) \cdot f(u)\right)}{\sum_{v \in V} \exp\left(f(v) \cdot f(u)\right)}$$
In one embodiment, regarding step S18, specifically, step S16 yields the low-dimensional embeddings of the $N$ nodes in the multidimensional graph network (the embedding of node $v_i$ is denoted $x_i$), that is, the obtained low-dimensional embedding set is $X = \{x_1, x_2, \ldots, x_N\}$, where each embedding is a $d$-dimensional real vector. These low-dimensional embeddings are clustered with the K-means algorithm, i.e., the $N$ nodes are divided into $K$ different clusters $C = \{C_1, C_2, \ldots, C_K\}$ so that the within-cluster sum of squares is minimized. In other words, the clustering goal of K-means is to find the clusters that satisfy the following equation:

$$\arg\min_{C} \sum_{k=1}^{K} \sum_{x \in C_k} \lVert x - \mu_k \rVert^2$$

where $\mu_k$ represents the mean of all points in cluster $C_k$, so that each point belongs to the cluster whose mean (cluster center) is closest to it. Finally, the $K$ different clusters into which the $N$ nodes of the original multidimensional graph network are grouped are obtained.
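The clustering goal above can be illustrated with a minimal pure-Python sketch of Lloyd's iteration, the usual heuristic for K-means. The random initialization, the iteration cap, and the toy two-dimensional embeddings are assumptions made for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and mean-update steps."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)         # k distinct points as initial centers
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins the cluster with the nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])   # keep the center of an empty cluster
        if new_centers == centers:               # converged
            break
        centers = new_centers
    return labels, centers


def within_cluster_ss(points, labels, centers):
    """The quantity that K-means minimizes."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


# Toy embeddings (hypothetical): two well-separated groups of three nodes.
embeddings = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(embeddings, k=2)
wcss = within_cluster_ss(embeddings, labels, centers)
```

Lloyd's iteration only finds a local minimum in general; on well-separated data such as this toy set it recovers the two groups.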
In an embodiment, the step S20 may specifically include the following processing steps S202 to S208:
S202, calculating, according to the low-dimensional embeddings, the $P$ distribution of the similarity between nodes; the $P$ distribution is as follows:

$$p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)} \tag{13}$$

$$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N} \tag{14}$$

where $p_{j|i}$ represents the conditional probability that node $v_i$ selects node $v_j$ as its neighboring point, $\lVert x_i - x_j \rVert$ represents the distance between node $v_i$ and node $v_j$, $\sigma_i^2$ represents the variance of the Gaussian distribution centered at node $v_i$, $\lVert x_i - x_k \rVert$ represents the distance between node $v_i$ and node $v_k$, $p_{i|j}$ represents the conditional probability that node $v_j$ selects node $v_i$ as its neighboring point, $p_{ij}$ represents the $P$ distribution of the similarity between node $v_i$ and node $v_j$, and $N$ represents the number of nodes.
It can be understood that, because the low-dimensional embeddings of the $N$ nodes have already captured the attribute and structural similarity between nodes in the multidimensional graph network, as well as the correlation between the graph networks of different dimensions and their differing importance to the clustering performance, the $P$ distribution calculated from $p_{j|i}$ and $p_{i|j}$ by formulae (13) and (14) can describe the clustering characteristics in the multidimensional graph network more accurately and comprehensively.
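The calculation of the $P$ distribution can be sketched as follows. The sketch assumes one fixed Gaussian bandwidth `sigma` shared by all nodes, whereas t-SNE implementations typically tune a separate bandwidth per node from a perplexity setting; the function names and the toy embeddings are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def conditional_p(embeddings, sigma=1.0):
    """cond[i][j]: probability that node i picks node j as its neighbor."""
    n = len(embeddings)
    cond = []
    for i in range(n):
        row = [0.0 if j == i else
               math.exp(-sq_dist(embeddings[i], embeddings[j]) / (2.0 * sigma ** 2))
               for j in range(n)]
        z = sum(row)                        # normalize over all k != i
        cond.append([r / z for r in row])
    return cond


def joint_p(cond):
    """Symmetrized joint distribution: (p_j|i + p_i|j) / 2N."""
    n = len(cond)
    return [[(cond[i][j] + cond[j][i]) / (2.0 * n) for j in range(n)]
            for i in range(n)]


# Toy embeddings (hypothetical): nodes 0 and 1 are close, node 2 is far away.
embeddings = [(0.0, 0.0), (0.1, 0.0), (3.0, 3.0)]
cond = conditional_p(embeddings)
joint = joint_p(cond)
```

The symmetrization guarantees that every node contributes to the joint distribution, even a node that is far from all others.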
S204, measuring, based on the Student-t distribution, the proximity between node $v_i$ and node $v_j$ in the two-dimensional layout space, and calculating the $Q$ distribution; the $Q$ distribution is as follows:

$$q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}$$

where $\lVert y_i - y_j \rVert$ represents the distance between node $v_i$ and node $v_j$ in the two-dimensional layout space, $y_i$ and $y_j$ respectively represent the coordinate values of node $v_i$ and node $v_j$ in the two-dimensional layout space, $q_{ij}$ represents the $Q$ distribution of the similarity between node $v_i$ and node $v_j$, and $y_k$ and $y_l$ respectively represent the coordinate values of node $v_k$ and node $v_l$ in the two-dimensional layout space.
It is understood that, like the $P$ distribution, the $Q$ distribution indicates that similar nodes are closer together and dissimilar nodes are relatively farther apart in the two-dimensional layout space.
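S206, calculating the KL divergence between the $P$ distribution and the $Q$ distribution; the calculation formula is as follows:

$$KL(P \,\|\, Q) = \sum_{i} \sum_{j} p_{ij} \log \frac{p_{ij}}{q_{ij}}$$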
It can be appreciated that, in the model optimization process, continuously reducing $KL(P \,\|\, Q)$ makes $Q$ reflect $P$ as closely as possible, i.e., the coordinate positions of the nodes in the two-dimensional layout space reflect the feature information of the original graph network as much as possible.
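To illustrate why reducing the KL divergence improves the layout, the following sketch evaluates the Student-t based $Q$ distribution for two candidate layouts and compares their divergence from a given $P$. The matrix `p`, the two layouts, and the function names are hypothetical; the gradient-based optimization of the coordinates is omitted.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def student_t_q(coords):
    """Q distribution from a Student-t kernel with one degree of freedom."""
    n = len(coords)
    kernel = [[0.0 if i == j else 1.0 / (1.0 + sq_dist(coords[i], coords[j]))
               for j in range(n)] for i in range(n)]
    z = sum(sum(row) for row in kernel)     # sum over all pairs k != l
    return [[k / z for k in row] for row in kernel]


def kl_divergence(p, q):
    """KL(P || Q) over off-diagonal entries; zero entries of P contribute 0."""
    total = 0.0
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            if i != j and pij > 0.0:
                total += pij * math.log(pij / q[i][j])
    return total


# Hypothetical joint similarities: nodes 0 and 1 are similar, node 2 is not.
p = [[0.0, 0.4, 0.05],
     [0.4, 0.0, 0.05],
     [0.05, 0.05, 0.0]]
good_layout = [(0.0, 0.0), (0.2, 0.0), (4.0, 4.0)]   # 0 and 1 placed together
bad_layout = [(0.0, 0.0), (4.0, 4.0), (0.2, 0.0)]    # 0 and 1 placed apart
kl_good = kl_divergence(p, student_t_q(good_layout))
kl_bad = kl_divergence(p, student_t_q(bad_layout))
```

The layout that keeps the similar nodes together yields the smaller divergence, which is the behavior the iterative optimization exploits.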
S208, when the iterative optimization of the KL divergence stops, obtaining the two-dimensional coordinate values of each node, drawing nodes with the same label in the same color and nodes with different labels in different colors, and displaying the clusters.
It will be appreciated that when the model stops iterative optimization, the resulting $Y = \{y_1, y_2, \ldots, y_N\}$ gives the two-dimensional coordinate values of the $N$ nodes. In this embodiment, the processing of projecting the low-dimensional embeddings into the two-dimensional space and visually displaying the clustering result can be summarized as follows:
(1) based on the low-dimensional embeddings, calculating the similarity between the low-dimensional embeddings of the $N$ nodes, i.e., calculating the $P$ distribution;
(2) in the two-dimensional layout space, calculating the layout proximity between the $N$ nodes, i.e., calculating the $Q$ distribution;
(3) calculating the KL divergence between the $P$ distribution and the $Q$ distribution, and iteratively optimizing the objective function to reduce the difference between the two distributions, so as to obtain the two-dimensional coordinate values of the $N$ nodes in the two-dimensional layout space;
(4) performing graph visualization mapping according to the two-dimensional coordinate values and the labels of the nodes.
In this embodiment, nodes with the same label are drawn in the same color and nodes with different labels are drawn in different colors. The resulting visualization shows the clustering of the multidimensional graph network nodes: nodes in the same cluster lie close to each other, and nodes in different clusters lie far apart.
Compared with the prior art, the present application converts the original unweighted network into a weighted network based on the attribute similarity and the structural similarity of the nodes, so that the weights encode the comprehensive similarity between nodes, which can enhance the clustering effect of the nodes. The present application also sets the cross-layer random walk transition probability according to the amount of information in the different layers, which can further enhance the clustering effect of the nodes. Finally, the present application applies the classic random-walk-based network embedding technique, originally designed for single-layer networks, to node clustering in attributed multidimensional graph networks, thereby expanding the application range of network embedding techniques.
It should be understood that although the steps in the flowcharts of fig. 2 and 3 are shown in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the execution of these steps is not strictly limited to the order shown, and they may be performed in other orders. Moreover, at least some of the steps in fig. 2 and 3 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Referring to fig. 5, a multidimensional graph network node clustering processing apparatus 100 is further provided, which includes a network conversion module 13, a migration processing module 15, an embedding processing module 17, a clustering processing module 19, and a visualization module 21. The network conversion module 13 is configured to convert the original unweighted multidimensional graph network into a weighted multidimensional graph network according to the attribute similarity and the structural similarity of the nodes. The migration processing module 15 is configured to perform intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability, so as to obtain a sampling sequence of each node of the weighted multidimensional graph network. The embedding processing module 17 is configured to convert the sampling sequence of each node into a low-dimensional embedding based on the SkipGram model. The clustering processing module 19 is configured to cluster the low-dimensional embeddings of the nodes with the K-means algorithm to obtain a clustering result of each node of the weighted multidimensional graph network. The visualization module 21 is configured to project each low-dimensional embedding into a two-dimensional space by a dimension reduction technique and to display the clustering result by a graph visualization technique.
Through the cooperation of its modules, the multidimensional graph network node clustering processing device 100 first converts the original unweighted attributed multidimensional graph network (i.e., the unweighted multidimensional graph network) into a weighted multidimensional graph network. During the conversion, the weight of each edge encodes the comprehensive similarity, combining attribute similarity and structural similarity, between the nodes it connects, which can enhance the clustering performance of the nodes. Second, because graph networks of different dimensions differ in their importance to node clustering, that is, they carry different amounts of information, the cross-layer random walk transition probability is constructed in a differentiated manner according to these differences in information amount and is combined with the intra-layer transition probability to obtain a sampling sequence of each node, which captures the neighbor node information of that node. The resulting sampling sequences are then converted into low-dimensional embeddings using a network embedding technique. Based on the low-dimensional embeddings of all nodes, the nodes are clustered with the K-means clustering algorithm to obtain the clustering result. Finally, the low-dimensional embeddings are projected into a two-dimensional space by a dimension reduction technique to obtain the coordinate values of each node in the two-dimensional space, the label information of the nodes is used for color mapping, and a graph visualization technique displays the node clustering effect visually. In this way, the clustering effect is significantly improved and the application range of network embedding techniques is expanded.
In one embodiment, the network conversion module 13 includes an attribute sub-module, a structure sub-module, and a conversion sub-module. The attribute sub-module is configured to determine, for the graph network of each dimension, the attribute similarity of the nodes connected by an edge according to the number of similar attributes among the F-dimensional attributes in the attribute vectors of those nodes. The structure sub-module is configured to determine the structural similarity between nodes by a structural similarity measurement method. The conversion sub-module is configured to add a weight to each unweighted connecting edge in the graph network of each dimension by using the attribute similarity and the structural similarity, thereby converting the unweighted multidimensional graph network into a weighted multidimensional graph network.
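As a rough illustration of the attribute sub-module, the sketch below counts the attribute positions on which two connected nodes agree. Treating "similar" as exact equality and normalizing the count by F are assumptions made for illustration; the function name and the attribute values are hypothetical.

```python
def attribute_similarity(attrs_u, attrs_v):
    """Fraction of the F attribute positions on which two nodes agree
    (exact-match comparison is an illustrative assumption)."""
    if len(attrs_u) != len(attrs_v):
        raise ValueError("attribute vectors must have the same dimension F")
    matches = sum(1 for a, b in zip(attrs_u, attrs_v) if a == b)
    return matches / len(attrs_u)


# Hypothetical F = 4 attribute vectors of two nodes joined by an edge.
attrs = {
    "u": ["cs", "2019", "beijing", "phd"],
    "v": ["cs", "2019", "shanghai", "msc"],
}
sim = attribute_similarity(attrs["u"], attrs["v"])
```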
In one embodiment, the structural similarity measure is an RA index measure.
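Assuming that "RA index" refers to the resource allocation index commonly used in link prediction, the structural similarity of two nodes is the sum, over their common neighbors, of the reciprocal of each common neighbor's degree. The following sketch shows this calculation; the adjacency structure and the function name are hypothetical.

```python
def ra_index(adj, u, v):
    """Resource allocation index: sum of 1/degree over common neighbors."""
    common = adj[u] & adj[v]
    return sum(1.0 / len(adj[z]) for z in common)


# Toy unweighted layer (hypothetical), stored as neighbor sets.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}
ra_bd = ra_index(adj, "b", "d")   # common neighbors a and c, each of degree 3
ra_ac = ra_index(adj, "a", "c")   # common neighbors b and d, each of degree 2
```

Common neighbors with many connections contribute less, so two nodes that share a low-degree neighbor are considered structurally more similar than two nodes that share a hub.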
In an embodiment, when performing the intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability, the migration processing module 15 may specifically be configured to implement the following processing:
carrying out intra-layer biased random walk processing on the weighted multidimensional graph network by adopting a graph data embedding method; the intra-layer transition probability $\pi_{vx}$ from node $v$ to node $x$ is $\pi_{vx} = \alpha_{pq}(t,x) \cdot w_{vx}$, where $\alpha_{pq}(t,x)$ is calculated as follows:

$$\alpha_{pq}(t,x) = \begin{cases} 1/p, & d_{tx} = 0 \\ 1, & d_{tx} = 1 \\ 1/q, & d_{tx} = 2 \end{cases}$$

where $d_{tx}$ represents the distance between node $t$ and node $x$, node $t$ is the previous node of node $v$, node $x$ is the next-hop node of node $v$, $w_{vx}$ represents the weight of the connecting edge between node $v$ and node $x$, and parameters $p$ and $q$ guide the random walker in performing the biased random walk;
determining the cross-layer random walk transition probability according to the modularity of the graph network of each dimension; in the cross-layer random walk transition probability, $w_{vx}$ represents the weight of the connecting edge between node $v$ and node $x$, and $\alpha_{pq}(t,x)$ represents the probability, measured according to the distance between the previous node $t$ and node $x$, that node $v$ selects $x$ as its next-hop node; the layer jump probability is a probability value set for the graph network of each dimension according to the modularity of that graph network: when the modularity of the graph network of a given dimension is high, a higher value is set; conversely, when the modularity is low, a lower value is set. For example, when the modularity of the graph network of dimension 1 is higher, its probability value is set higher, and the random walker then jumps to the graph network of dimension 1 with a correspondingly higher probability;
and instructing a random walker to determine, according to the cross-layer random walk transition probability, the layer of the weighted multidimensional graph network to traverse and the node to move to, and performing the cross-layer random walk.
In an embodiment, when performing the intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability, the migration processing module 15 may further be configured such that:
the difference between the layer jump probability values associated with a graph network of lower modularity is smaller than the corresponding difference for a graph network of higher modularity;
where each probability value is set for the graph network of a given dimension according to the modularity of that graph network.
In an embodiment, the embedding processing module 17 may be specifically configured to implement the following processing steps: dividing the sampling sequence through a window to obtain a training sample sequence of node information; inputting the training sample sequence into a SkipGram model and optimizing an objective function by a stochastic gradient descent method to obtain a low-dimensional dense embedding of the nodes; wherein the process of optimizing the objective function by the stochastic gradient descent method comprises refining the conditional probability successively under a conditional independence assumption and a symmetry assumption, and obtaining the objective function from the refined conditional probability.
In one embodiment, the visualization module 21 may include a first distribution calculation sub-module, a second distribution calculation sub-module, a divergence calculation sub-module, and a presentation sub-module. The first distribution calculation sub-module is configured to calculate, according to the low-dimensional embeddings, the $P$ distribution of the similarity between nodes; the $P$ distribution is as follows:

$$p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)}$$

$$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}$$

where $p_{j|i}$ represents the conditional probability that node $v_i$ selects node $v_j$ as its neighboring point, $\lVert x_i - x_j \rVert$ represents the distance between node $v_i$ and node $v_j$, $\sigma_i^2$ represents the variance of the Gaussian distribution centered at node $v_i$, $\lVert x_i - x_k \rVert$ represents the distance between node $v_i$ and node $v_k$, $p_{i|j}$ represents the conditional probability that node $v_j$ selects node $v_i$ as its neighboring point, $p_{ij}$ represents the $P$ distribution of the similarity between node $v_i$ and node $v_j$, and $N$ represents the number of nodes.
The second distribution calculation sub-module is configured to measure, based on the Student-t distribution, the proximity between node $v_i$ and node $v_j$ in the two-dimensional layout space, and to calculate the $Q$ distribution; the $Q$ distribution is as follows:

$$q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}$$

where $\lVert y_i - y_j \rVert$ represents the distance between node $v_i$ and node $v_j$ in the two-dimensional layout space, $y_i$ and $y_j$ respectively represent the coordinate values of node $v_i$ and node $v_j$ in the two-dimensional layout space, $q_{ij}$ represents the $Q$ distribution of the similarity between node $v_i$ and node $v_j$, and $y_k$ and $y_l$ respectively represent the coordinate values of node $v_k$ and node $v_l$ in the two-dimensional layout space;
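The divergence calculation sub-module is configured to calculate the KL divergence between the $P$ distribution and the $Q$ distribution; the calculation formula is as follows:

$$KL(P \,\|\, Q) = \sum_{i} \sum_{j} p_{ij} \log \frac{p_{ij}}{q_{ij}}$$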
The presentation sub-module is configured to, when the iterative optimization of the KL divergence stops, obtain the two-dimensional coordinate values of each node, draw nodes with the same label in the same color and nodes with different labels in different colors, and display the clusters.
For specific limitations of the multidimensional graph network node clustering processing apparatus 100, reference may be made to the corresponding limitations of the above multidimensional graph network node clustering processing method, which is not described herein again. The modules in the multidimensional graph network node clustering processing device 100 can be wholly or partially realized by software, hardware and a combination thereof. The modules may be embedded in a hardware form or a device independent of a specific data processing function, or may be stored in a memory of the device in a software form, so that a processor can call and execute operations corresponding to the modules, where the computing device may be, but is not limited to, various computers existing in the field.
In still another aspect, a computer device is provided, which includes a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program: converting the original unweighted multidimensional graph network into a weighted multidimensional graph network according to the attribute similarity and the structural similarity of the nodes; performing intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability, to obtain a sampling sequence of each node of the weighted multidimensional graph network; converting the sampling sequence of each node into a low-dimensional embedding based on the SkipGram model; clustering the low-dimensional embeddings of the nodes with the K-means algorithm to obtain a clustering result of each node of the weighted multidimensional graph network; and projecting each low-dimensional embedding into a two-dimensional space by a dimension reduction technique and displaying the clustering result by a graph visualization technique.
In one embodiment, the processor, when executing the computer program, may further implement the additional steps or sub-steps in the embodiments of the multidimensional graph network node clustering processing method.
In yet another aspect, a computer-readable storage medium is also provided, having a computer program stored thereon, the computer program implementing the following steps when executed by a processor: converting the original unweighted multidimensional graph network into a weighted multidimensional graph network according to the attribute similarity and the structural similarity of the nodes; performing intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability, to obtain a sampling sequence of each node of the weighted multidimensional graph network; converting the sampling sequence of each node into a low-dimensional embedding based on the SkipGram model; clustering the low-dimensional embeddings of the nodes with the K-means algorithm to obtain a clustering result of each node of the weighted multidimensional graph network; and projecting each low-dimensional embedding into a two-dimensional space by a dimension reduction technique and displaying the clustering result by a graph visualization technique.
In one embodiment, the computer program, when executed by the processor, may further implement the additional steps or sub-steps in the embodiments of the multidimensional graph network node clustering processing method.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus DRAM (RDRAM), and direct Rambus DRAM (DRDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above examples only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for those skilled in the art, various changes and modifications can be made without departing from the spirit of the present application, and all of them fall within the scope of the present application. Therefore, the protection scope of the present patent should be subject to the appended claims.
Claims (9)
1. A multidimensional graph network node clustering processing method is characterized by comprising the following steps:
converting the original unweighted multidimensional graph network into a weighted multidimensional graph network according to the attribute similarity and the structure similarity of the nodes;
according to the constructed intra-layer transition probability and cross-layer random walk transition probability, performing intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network to obtain a sampling sequence of each node of the weighted multidimensional graph network;
converting the sampling sequence of each node into low-dimensional embedding based on a SkipGram model;
clustering the low-dimensional embedding of each node by adopting a K-means algorithm to obtain a clustering result of each node of the weighted multi-dimensional graph network;
and projecting each low-dimensional embedding into a two-dimensional space by a dimension reduction technique and displaying the clustering result by a graph visualization technique.
2. The method for processing node clusters in a multidimensional graph network according to claim 1, wherein the step of converting the original unweighted multidimensional graph network into a weighted multidimensional graph network according to the attribute similarity and the structural similarity of the nodes comprises:
for the graph network of each dimension, determining the attribute similarity of the nodes connected by an edge according to the number of similar attributes among the F-dimensional attributes in the attribute vectors of those nodes;
determining the structural similarity between nodes by adopting a structural similarity measurement method;
and adding weights to each unweighted connecting edge in the graph network of each dimension by using the attribute similarity and the structural similarity, and converting the unweighted multidimensional graph network into the weighted multidimensional graph network.
3. The method of claim 2, wherein the structural similarity measurement method is an RA index measurement method.
4. The multidimensional graph network node clustering processing method according to claim 1, wherein the process of performing intra-layer and cross-layer multilayer network random walk processing on the weighted multidimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability includes:
carrying out intra-layer biased random walk processing on the weighted multidimensional graph network by adopting a graph data embedding method; the intra-layer transition probability $\pi_{vx}$ from node $v$ to node $x$ is $\pi_{vx} = \alpha_{pq}(t,x) \cdot w_{vx}$, where $\alpha_{pq}(t,x)$ is calculated as follows:

$$\alpha_{pq}(t,x) = \begin{cases} 1/p, & d_{tx} = 0 \\ 1, & d_{tx} = 1 \\ 1/q, & d_{tx} = 2 \end{cases}$$

where $d_{tx}$ represents the distance between node $t$ and node $x$, node $t$ is the previous node of node $v$, node $x$ is the next-hop node of node $v$, $w_{vx}$ represents the weight of the connecting edge between node $v$ and node $x$, and parameters $p$ and $q$ guide the random walker in performing the biased random walk;
determining the cross-layer random walk transition probability according to the modularity of the graph network of each dimension; in the cross-layer random walk transition probability, $w_{vx}$ represents the weight of the connecting edge between node $v$ and node $x$, $\alpha_{pq}(t,x)$ represents the probability, measured according to the distance between the previous node $t$ and node $x$, that node $v$ selects $x$ as its next-hop node, and the layer jump probability is a probability value set for the graph network of each of the $M$ dimensions, from dimension 1 to dimension $M$, according to the modularity of that graph network, $M$ being an integer greater than 1;
and instructing the random walker to determine, according to the cross-layer random walk transition probability, the layer of the weighted multidimensional graph network to traverse and the node to move to, and performing the cross-layer random walk.
5. The method according to claim 4, wherein the process of performing intra-layer and cross-layer multi-layer network random walk processing on the weighted multi-dimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability further comprises:
the difference value for a graph network with lower modularity is smaller than the difference value for a graph network with higher modularity.
6. The multi-dimensional graph network node clustering processing method according to claim 1, wherein the step of converting the sampling sequence of each node into a low-dimensional embedding based on a SkipGram model comprises:
dividing the sampling sequence with a window to obtain a training sample sequence of node information;
inputting the training sample sequence into the SkipGram model and optimizing an objective function by stochastic gradient descent to obtain a low-dimensional dense embedding of each node;
wherein the process of optimizing the objective function by stochastic gradient descent comprises optimizing the conditional probability successively under the conditional independence assumption and the symmetry assumption, and obtaining the objective function from the optimized conditional probability.
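The window division and objective in claim 6 can be sketched as follows. This is a hedged illustration rather than the claimed implementation: it assumes the common SkipGram formulation for graphs, in which conditional independence lets the neighbourhood likelihood factor into per-pair terms and symmetry lets each term be a softmax over embedding dot products. The function names and the dict-of-lists embedding layout are assumptions, and the gradient update itself is omitted.

```python
import math


def window_pairs(walk, window):
    """Split one sampled walk into (center, context) training pairs."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


def log_prob(center, context, emb):
    """log Pr(context | center) as a softmax over embedding dot products."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    scores = {node: dot(emb[center], vec) for node, vec in emb.items()}
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    return scores[context] - log_z


def objective(pairs, emb):
    """Log-likelihood to maximise; pairs contribute independently."""
    return sum(log_prob(center, context, emb) for center, context in pairs)


pairs = window_pairs(["a", "b", "c"], window=1)
# Hypothetical 2-dimensional embeddings, all zero before training.
emb = {"a": [0.0, 0.0], "b": [0.0, 0.0], "c": [0.0, 0.0]}
value = objective(pairs, emb)
```

In practice the full softmax normaliser is expensive for large graphs, so implementations usually approximate it, for example with negative sampling. The source does not state which approximation, if any, is used.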
7. The method for processing the clustering of the nodes in the multidimensional graph network according to any one of claims 1 to 6, wherein the step of projecting each low-dimensional embedding into a two-dimensional space by using a dimension reduction technique and displaying the clustering result by using a graph visualization technique comprises:
obtaining the similarity $P$ distribution between nodes by calculation from the low-dimensional embeddings; the $P$ distribution is:
$$p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)}, \qquad p_{ij} = p_{ji} = \frac{p_{j|i} + p_{i|j}}{2N}$$
where $p_{j|i}$ represents the conditional probability that the $i$-th node $x_i$ selects the $j$-th node $x_j$ as its neighboring point, $\lVert x_i - x_j \rVert$ represents the distance between node $x_i$ and node $x_j$, $\sigma_i^2$ represents the variance of the Gaussian distribution centered on node $x_i$, $\lVert x_i - x_k \rVert$ represents the distance between node $x_i$ and the $k$-th node $x_k$, $p_{i|j}$ represents the conditional probability that node $x_j$ selects node $x_i$ as its neighboring point, $p_{ij}$ and $p_{ji}$ represent the $P$ distribution of the similarity between node $x_i$ and node $x_j$, and $N$ represents the number of nodes;
measuring, based on the Student's t-distribution, the proximity between node $y_i$ and node $y_j$ in the two-dimensional layout space, and calculating the $Q$ distribution; the $Q$ distribution is:
$$q_{ij} = q_{ji} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{\sum_{k \neq h} \left(1 + \lVert y_k - y_h \rVert^2\right)^{-1}}$$
where $\lVert y_i - y_j \rVert$ represents the distance between node $i$ and node $j$ in the two-dimensional layout space, $y_i$ and $y_j$ respectively represent the coordinate values of node $i$ and node $j$ in the two-dimensional layout space, $q_{ij}$ and $q_{ji}$ represent the $Q$ distribution of the similarity between node $i$ and node $j$, and $y_k$ and $y_h$ respectively represent the coordinate values of the $k$-th node and the $h$-th node in the two-dimensional layout space;
calculating the KL divergence between the $P$ distribution and the $Q$ distribution; the calculation formula is as follows:
$$C = KL(P \,\|\, Q) = \sum_{i} \sum_{j} p_{ij} \log \frac{p_{ij}}{q_{ij}}$$
when iterative optimization of the KL divergence stops, obtaining the two-dimensional coordinate values of each node, and, according to the two-dimensional coordinate values of each node, drawing nodes with the same label in the same color and nodes with different labels in different colors to display the clusters.
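The quantities named in claim 7 correspond to the standard t-SNE construction, and the sketch below computes them in plain Python under that assumption. For brevity it uses one fixed Gaussian bandwidth `sigma` for every node instead of a per-node value, and it evaluates the KL divergence for a given layout without running the iterative optimization. The function names and sample coordinates are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def p_matrix(X, sigma=1.0):
    """Symmetrised Gaussian similarities between the embedded nodes."""
    n = len(X)
    cond = []
    for i in range(n):
        row = [
            0.0 if k == i else math.exp(-sq_dist(X[i], X[k]) / (2 * sigma ** 2))
            for k in range(n)
        ]
        z = sum(row)
        cond.append([v / z for v in row])  # conditional p(j | i)
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)] for i in range(n)]


def q_matrix(Y):
    """Student-t similarities between nodes in the two-dimensional layout."""
    n = len(Y)
    num = [
        [0.0 if i == j else 1.0 / (1.0 + sq_dist(Y[i], Y[j])) for j in range(n)]
        for i in range(n)
    ]
    z = sum(map(sum, num))
    return [[v / z for v in row] for row in num]


def kl_divergence(P, Q):
    """KL(P || Q), skipping zero entries of P (the diagonal)."""
    return sum(
        p * math.log(p / q)
        for p_row, q_row in zip(P, Q)
        for p, q in zip(p_row, q_row)
        if p > 0
    )


X = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]  # low-dimensional embeddings
Y = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]                 # candidate 2-D layout
P = p_matrix(X)
Q = q_matrix(Y)
cost = kl_divergence(P, Q)
```

A full implementation would move the layout coordinates along the gradient of this cost until it stops decreasing, then color each node by its cluster label.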
8. A multi-dimensional graph network node clustering processing apparatus, comprising:
a network conversion module, configured to convert an original unweighted multi-dimensional graph network into a weighted multi-dimensional graph network according to the attribute similarity and the structural similarity of the nodes;
a walk processing module, configured to perform intra-layer and cross-layer multi-layer network random walk processing on the weighted multi-dimensional graph network according to the constructed intra-layer transition probability and cross-layer random walk transition probability, to obtain a sampling sequence of each node of the weighted multi-dimensional graph network;
an embedding processing module, configured to convert the sampling sequence of each node into a low-dimensional embedding based on a SkipGram model;
a clustering processing module, configured to cluster the low-dimensional embeddings of the nodes by using a K-means algorithm, to obtain a clustering result for each node of the weighted multi-dimensional graph network;
and a visualization module, configured to project each low-dimensional embedding into a two-dimensional space by using a dimension reduction technique and to display the clustering result by using a graph visualization technique.
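The clustering module's K-means step can be illustrated with a short sketch of Lloyd's algorithm over node embeddings. The function names, the random choice of initial centers, and the toy embeddings are assumptions for the example. The source does not say how centers are initialised or how the number of clusters is chosen.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Cluster `points` (lists of floats) into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # assumption: random initial centers
    labels = None
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical 2-dimensional node embeddings forming two well-separated groups.
embeddings = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
]
labels, centers = kmeans(embeddings, k=2)
```

The resulting labels are what the visualization module would map to colors when drawing the two-dimensional layout.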
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program performs the steps of the method for cluster processing of nodes of a multidimensional graph network as claimed in any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110645181.3A CN113254717A (en) | 2021-06-10 | 2021-06-10 | Multidimensional graph network node clustering processing method, apparatus and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110645181.3A CN113254717A (en) | 2021-06-10 | 2021-06-10 | Multidimensional graph network node clustering processing method, apparatus and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113254717A true CN113254717A (en) | 2021-08-13 |
Family
ID=77187250
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110645181.3A Pending CN113254717A (en) | 2021-06-10 | 2021-06-10 | Multidimensional graph network node clustering processing method, apparatus and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113254717A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113729686A (en) * | 2021-09-23 | 2021-12-03 | 南京航空航天大学 | Brain local function dynamic real-time measurement system |
CN113729686B (en) * | 2021-09-23 | 2023-12-01 | 南京航空航天大学 | Brain local function dynamic real-time measurement system |
CN114819971A (en) * | 2022-04-22 | 2022-07-29 | 支付宝(杭州)信息技术有限公司 | Wind control method based on multi-dimensional relational data, graph clustering method and device |
CN114826921A (en) * | 2022-05-05 | 2022-07-29 | 苏州大学应用技术学院 | Network resource dynamic allocation method, system and medium based on sampling subgraph |
CN114826921B (en) * | 2022-05-05 | 2024-05-17 | 苏州大学应用技术学院 | Dynamic network resource allocation method, system and medium based on sampling subgraph |
CN115631799A (en) * | 2022-12-20 | 2023-01-20 | 深圳先进技术研究院 | Sample phenotype prediction method and device, electronic equipment and storage medium |
CN115631799B (en) * | 2022-12-20 | 2023-03-28 | 深圳先进技术研究院 | Sample phenotype prediction method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210813 |