CN102194149B - Community discovery method - Google Patents

Publication number
CN102194149B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201010122852
Other languages
Chinese (zh)
Other versions
CN102194149A (en
Inventor
韩毅
李爱平
贾焰
韩伟红
杨树强
周斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN 201010122852
Publication of CN102194149A
Application granted
Publication of CN102194149B
Legal status: Expired - Fee Related

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a community discovery method comprising the following steps: delimiting a search region in a social network according to the size range of the communities to be discovered; pruning within the search region according to the number of neighbor nodes of each node, so that nodes whose neighbor count is less than the closeness of the community to be discovered are removed from the social network; selecting a node from the nodes remaining in the pruned social network, searching the neighbor nodes of that node for a community of size |S|-1, combining the node with the community found to form the community to be discovered, and adding it to a result set, where |S| denotes the size of the community to be discovered; and moving the left boundary of the search region to the left and repeating the preceding steps in the enlarged search region until the search region reaches the minimum size of the communities to be discovered.

Description

Community discovery method
Technical field
The present invention relates to network computing, and in particular to a community discovery method.
Background art
A social network (SN) is a relational network that represents the connections among individuals in a society. Popular real-world services such as Facebook and Twitter can be regarded as social networks. A social network can be represented as a matrix or as a graph. In the graph representation, each node stands for an individual, each link between nodes stands for a connection between individuals, and the weight of a link represents how close that connection is.
Connections between an individual and the other individuals in a social network range from close to sparse; a group of closely connected individuals in the social network is called a community. Communities differ considerably in attributes such as size and density. In this application, a community that differs markedly from the others on one or more attributes is called a typical community; for example, a community that is larger than the other communities is considered typical. Finding typical communities in a social network (also called community discovery), especially in a large-scale social network, is important for many applications, such as targeted advertising and the detection of online public opinion.
The prior art already includes techniques for finding communities in a social network. One example is mining frequent structures in the network. Methods of this kind search the network for frequent substructures, which are commonly used to find groups of communities with a fixed communication pattern or, in bioinformatics, to find functional structures of proteins. Typical representatives are Apriori-based methods and FP-Growth-based methods. These methods aim to find communities whose structure is fixed and occurs frequently, and they are not suited to finding typical communities.
Another example is finding communities by discovering the maximal substructures (such as complete graphs) in the network that meet a given condition. In graph theory, deciding whether a complete subgraph of a given size exists, or mining the maximum complete subgraph, was shown in 1972 to be one of 21 NP-complete problems (see Reference 1: R. M. Karp, "Reducibility among combinatorial problems," Complexity of Computer Computations, 1972). Methods of this kind therefore have high computational complexity and can hardly be used to find communities in large-scale networks.
A further example is social network community discovery based on density and entropy. These methods model the network with a connection matrix or vectors and use high-dimensional matrix operations to find the high-density parts of the network. They tend to centre communities on hub nodes that have many neighbors, or on parts of the network with high connection density and high weights, and may overlook the sparse parts of the network; they also usually require a threshold that constrains density or entropy. Because they consider only connection density, the communities they find are concentrated in the dense parts and the low-density parts are ignored. In a real social network, however, everyone belongs to some community environment, and the sparse parts are in fact more likely to form larger communities. These methods are therefore also not well suited to finding communities.
Yet another example is content-constrained methods. These methods analyse additional information on the network (such as the text labels carried by network links) and cluster nodes by their intrinsic attributes to obtain the community information of the network. They require the data set itself to carry text labels or content labels, but for privacy reasons or because of the nature of the data, many data sets have neither, so the applicability of these methods is limited.
In short, although the prior art contains several types of methods for finding communities in a social network, each of them exploits only a single attribute of a community during discovery, such as density or structural frequency. They are therefore unsuitable for finding typical communities in a social network, particularly communities that are typical on two or more attributes.
Summary of the invention
The object of the present invention is to overcome the defect that prior-art community discovery methods are not suited to finding typical communities in a social network, and thus to provide a community discovery method.
To achieve this object, the invention provides a community discovery method comprising:
Step 1): in a social network, delimiting a search region according to the size range of the communities to be discovered, wherein the left boundary L of the search region is the size of the largest community currently expected to be discovered, the upper boundary U is the neighbor count of the node with the most neighbors in the social network, the right boundary is $U/\beta + 1$, and the lower boundary is $\beta(L-1)$; β denotes a predefined ratio;
Step 2): pruning within the search region according to the neighbor count of each node, so that nodes whose neighbor count is less than the closeness of the community to be discovered are removed from the social network;
Step 3): selecting a node from the nodes remaining in the pruned social network, searching the neighbor nodes of that node for a community of size |S|-1 and, once one is found, combining the node with that community of size |S|-1 to form the community to be discovered and adding it to the result set, wherein |S| denotes the size of the community expected to be discovered;
Step 4): moving the left boundary of the search region to the left, and then re-executing step 2) and step 3) in the enlarged search region until the search region reaches the minimum size of the communities to be discovered.
In the above technical solution, step 2) comprises:
Step 2-1): selecting an unprocessed node in the social network;
Step 2-2): judging whether the node degree of the selected node is less than or equal to β(|S|-1); if so, deleting the node from the social network and proceeding to the next step; otherwise, re-executing step 2-1) until all nodes in the social network have been processed;
Step 2-3): judging whether the node degree of each neighbor node of the selected node is less than or equal to β(|S|-1); if so, deleting that neighbor node from the social network and repeating this step for the neighbor nodes of the deleted node, until all nodes in the social network have been judged.
In the above technical solution, step 3) comprises:
Step 3-1): selecting an unprocessed node in the social network;
Step 3-2): searching the neighbor nodes of the node for a community of size |S|-1; if one is found, combining the node with that community to form the community to be discovered and adding it to the result set; if none is found, re-executing step 3-1);
Step 3-3): after all community search operations among the neighbor nodes of the selected node have been completed, deleting the selected node from the social network and re-executing step 3-1) until all nodes in the social network have been processed.
In the above technical solution, in step 3-1), unprocessed nodes are selected starting from the node with the smallest node degree in the social network.
In the above technical solution, the value of β lies between 0 and 1.
In the above technical solution, in step 4), the left boundary of the search region is moved to the left by 1 each time.
The advantages of the invention are:
1. The invention can find both closely connected communities and large communities.
2. The invention can find communities in networks of any connection density.
3. Because the invention adopts an optimized pruning strategy, it effectively reduces execution overhead and can therefore find communities in very large networks.
Brief description of the drawings
Fig. 1 is a schematic diagram of a community of size 6 with β = 80%;
Fig. 2 is a schematic diagram of a Skyband of width 3;
Fig. 3 is a schematic diagram illustrating the closeness and size constraint;
Fig. 4 is a schematic diagram illustrating the search space;
Fig. 5(a) to Fig. 5(c) are schematic diagrams of the transformation of the search region;
Fig. 6 is a flow chart of the method of the invention.
Detailed description of the embodiments
The invention is described below with reference to the drawings and embodiments.
Before the method is described in detail, the concepts used in the invention are first explained together for ease of understanding.
Graph: a graph in the invention represents a social network and is written G = (V, E, D), where V is the set of nodes v (v ∈ V), E is the set of edges (links) e between nodes (e ∈ E), and D(e) is a labelling function on edge e that gives the distance on the graph between the two nodes u and v that e connects. Distance in the graph is not limited to the geographical distance between individuals in the social network; it can also express how close the relationship between individuals (such as people) is. In a communication network, for example, the relationship between two people can be described by how often they communicate: if two people communicate frequently, their relationship is considered close.
Node degree: the node degree of a node v is the number of nodes directly connected to v; that is, in G = (V, E, D) the degree is |{u | {u, v} ∈ E}|, written d(v).
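The graph and node degree definitions can be illustrated with a minimal sketch. The adjacency-dictionary layout and the names `build_graph` and `degree` are assumptions made for illustration only; the patent does not prescribe a data structure.

```python
from collections import defaultdict

def build_graph(weighted_edges):
    """Build an undirected graph {node: {neighbor: distance}} from (u, v, distance) triples."""
    adj = defaultdict(dict)
    for u, v, distance in weighted_edges:
        adj[u][v] = distance  # D(e) for the edge e = {u, v}
        adj[v][u] = distance
    return dict(adj)

def degree(adj, v):
    """Node degree d(v): the number of nodes directly connected to v."""
    return len(adj[v])

# Hypothetical toy network: four individuals and the distances between them.
edges = [("a", "b", 1.0), ("a", "c", 2.5), ("b", "c", 0.7), ("c", "d", 3.1)]
graph = build_graph(edges)
```

Here `degree(graph, "c")` counts the three nodes linked to `c`, while `d` has a single neighbor.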
Nearest neighbors (NN): in a graph G = (V, E, D), the nearest neighbors of a node u (u ∈ V) can be written formally as $NN(u) = \{\, v \in V \mid \nexists\, v' \in V : D(u, v) > D(u, v') \,\}$; that is, NN(u) is the nearest-neighbor set of u, and no node outside NN(u) is closer to u.
k nearest neighbors (k-NN): by analogy with NN(u), kNN(u) denotes the k-nearest-neighbor set of node u; that is, for a natural number k ≥ 1, kNN(u) contains every node v for which k nodes v′ (v′ ∈ V) satisfying D(u, v) > D(u, v′) cannot be found. kNN(u) thus represents the k nodes nearest to u in the network G. Note that if u ∈ kNN(v), then u ∈ (k+1)NN(v), whereas the converse does not hold.
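As a rough illustration of the kNN definition, the sketch below ranks the direct neighbors of a node by distance and keeps the first k. Restricting the ranking to direct neighbors (treating unconnected nodes as infinitely far away) and the function name `knn` are assumptions, and distances are assumed to be distinct.

```python
def knn(adj, u, k):
    """Return the k neighbors of u with the smallest distance D(u, v).

    adj maps each node to {neighbor: distance}; ties are assumed not to occur.
    """
    ranked = sorted(adj[u], key=lambda v: adj[u][v])
    return set(ranked[:k])

# Hypothetical distances from node "u" to three neighbors.
adj = {
    "u": {"x": 0.5, "y": 1.2, "z": 3.0},
    "x": {"u": 0.5},
    "y": {"u": 1.2},
    "z": {"u": 3.0},
}
nearest_two = knn(adj, "u", 2)
```

Because the ranking is fixed, every member of `knn(adj, u, k)` is also a member of `knn(adj, u, k + 1)`, which mirrors the monotonicity noted in the definition.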
Using the k-nearest-neighbor relation to measure relationships between individuals in a social network has the following advantages:
1. The k-nearest-neighbor relation is adaptive: across the whole network, a k-nearest-neighbor query returns the closest results in closely connected parts, and still returns relatively close results in sparsely connected parts.
2. The k-nearest-neighbor query is controllable in scale: however complex and extensive a person's social relationships are, the relation only ever considers that person's k closest companions, so the size of the communities found through it can be controlled.
Complete graph: (V, E is D) with set of node S for G= ( S ⊆ V ) , If any two the node u among the set of node S, v have their fillet to belong to E, promptly { then S is a complete graph for u, v} ∈ E.
K community (k-cliques): figure G=(V, E, D) in, for a natural number k (k>=1), if set of node S ( S ⊆ V ) In any two node u, v satisfy u ∈ kNN (v), and v ∈ kNN (u), then S be a k community.That is to say that in this community, any two nodes all satisfy the arest neighbors relation.K community is a complete graph.
Tight ness rating: if S is a k community, rather than (k-1) community, then the tight ness rating of the S of community is k, is designated as closeness (S)=k.The meaning of tight ness rating is a k value maximum in a community, thinks on the ordinary meaning that tight ness rating is more little, and then the relation of community is tight more.Tight ness rating is in the k-community, the minimum value of the maximum k in all kNN relations.
From the above definitions, the size of a k-community cannot exceed k+1 (distance is a continuous variable, and an individual is assumed never to be at exactly the same distance from two of its neighbors, so there are no ties for k-th place). By the definition of a k-community, S is in fact a complete subgraph of G whose edges all satisfy the kNN constraint. In graph theory, however, deciding whether a complete subgraph of a given size exists, or mining the maximum complete subgraph, has very high computational complexity, so the invention uses kNN to control community size during community discovery. Even so, defining communities as complete subgraphs makes discovery difficult: a complete subgraph of size n contains $C_n^2$ bidirectional edges, and in practice it is very hard to find a large community in which every edge satisfies the kNN constraint. For example, finding a community of size 30 requires $C_{30}^2 = 435$ edges in the community to satisfy the kNN relation, which is clearly a very large computation. The invention therefore proposes the following community definition based on quasi-complete subgraphs.
(β, k)-community ((β, k)-clique): in a graph G = (V, E, D) with a complete graph S (S ⊆ V), if for every node v (v ∈ S) one can find β(|S|-1) nodes u in S other than v such that u ∈ kNN(v), then S is a (β, k)-community. If S is a (β, k)-community but not a (β, k-1)-community, then closeness(S) = k. |S| denotes the number of nodes in S.
The quasi-complete subgraph requires only that, in a (β, k)-community, for any node the proportion of its neighbor nodes that satisfy the kNN constraint, out of all its neighbor nodes in the community, reaches β. This definition relaxes the requirement on the graph, with the aim of finding larger communities. Fig. 1 shows a community of size 6 with β = 80%.
Domination: suppose $S_1$ and $S_2$ are both (β, k)-communities. If closeness($S_1$) ≤ closeness($S_2$) and $|S_1| \ge |S_2|$ (with the two equalities not holding at the same time), then $S_1$ dominates $S_2$. The original denotes this relation with a symbol that survives only as an image (Figure GSA00000030860300061).
m-Skyband: the m-skyband is the set of all (β, k)-communities that satisfy the following condition: for any (β, k)-community S (S ⊆ V), if m communities S′ that differ from S and dominate S (the relation shown in the original as Figure GSA00000030860300063) cannot be found in the network G, then S is said to be on the Skyband of width m, written S ∈ m-skyband.
Communities satisfying this constraint are both large and closely connected. It follows from the definition that if the width m = 1, no community in the whole network dominates a community in the Skyband. Fig. 2 shows a Skyband of width 3. Each black dot represents a community in the network, and its coordinates represent the closeness and size of that community. If community $S_1$ dominates $S_2$, $S_1$ appears to the lower right of $S_2$ in the figure. In Fig. 2, communities S′ and S″ lie to the lower right of community S, so S′ and S″ dominate S. The number of communities dominating S is less than 3, so S is in the Skyband of width 3. The same holds for every other point in the figure: for a point in the Skyband of width 3, no more than 2 other points dominate it on the closeness and size attributes simultaneously.
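The domination relation and the m-skyband condition can be sketched as follows. The `Community` record, the function names, and the sample closeness and size values are hypothetical and chosen only to mirror the situation described for Fig. 2; this is an illustrative reading of the definitions, not the patent's implementation.

```python
from typing import NamedTuple

class Community(NamedTuple):
    name: str
    closeness: int
    size: int

def dominates(a, b):
    """a dominates b: closeness no larger, size no smaller, and not equal on both."""
    return (a.closeness <= b.closeness and a.size >= b.size
            and (a.closeness, a.size) != (b.closeness, b.size))

def m_skyband(communities, m):
    """Keep the communities that are dominated by fewer than m other communities."""
    return [s for s in communities
            if sum(dominates(t, s) for t in communities if t is not s) < m]

# Hypothetical communities: S is dominated by S' and S'', T by all three others.
s = Community("S", closeness=5, size=6)
s1 = Community("S'", closeness=4, size=7)
s2 = Community("S''", closeness=3, size=8)
t = Community("T", closeness=6, size=5)
band3 = [c.name for c in m_skyband([s, s1, s2, t], 3)]
```

With these values, S has two dominators and stays in the width-3 Skyband, while T has three and is excluded.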
With the basic concepts described, the concrete implementation steps of the invention are explained below.
The invention aims to find typical communities in a social network that are both large and closely connected. In a social network, large size and close connection constrain each other: in general, the larger the community, the looser its connections. Both factors therefore have to be considered together when finding typical communities. Given the definition of the m-skyband above, the invention holds that a community satisfying the m-skyband constraint is a community in the social network that is both large and closely connected. In other words, the invention converts the problem of finding communities that are simultaneously large and closely connected into the problem of finding communities that satisfy the m-skyband constraint.
As noted in the description of the (β, k)-community, defining communities as complete subgraphs makes discovery very difficult and the associated computation very large; all communities to be found in the invention are therefore (β, k)-communities. In addition, to make community discovery more efficient and to avoid unnecessary computation, the invention prunes nodes of the graph during discovery and searches for communities within a bounded search space.
To make the principle of pruning and of searching within a bounded search space easier to understand, three properties used by the invention are first stated here, based on the definitions above.
Property 1 (node degree): if node u belongs to a (β, k)-community S of size |S|, then the node degree of u satisfies d(u) ≥ β(|S|-1) and d(u) ≥ closeness(S).
Property 1 follows from the definition of the (β, k)-community and can be used for pruning during the search: when the degree of a node falls short of the requirement of Property 1, the node cannot belong to a community of the required size and can be eliminated directly, without computing its kNN.
Property 1 further implies that, during the search, if the degree of node u only just meets the requirement of Property 1, i.e. d(u) = β(|S|-1), and a neighbor v of u has been pruned, then u can also be pruned. This can be proved by contradiction: if u could join a community S of size |S|, then since d(u) = β(|S|-1), by the definition of the (β, k)-community at least β(|S|-1) neighbors of u in S must be kNN of u, so v must also be in S; since v is not in S, u is not in S either, and u can be pruned.
Property 2 (closeness and size constraint): for any (β, k)-community S of closeness w, k ≥ β(|S|-1) holds, and by the definition of closeness, closeness(S) ≥ β(|S|-1). It follows that $\frac{closeness(S)}{\beta} + 1 \ge |S|$.
By this property, in the plot of Fig. 3, whose ordinate is closeness and whose abscissa is size, all communities can only appear above the dashed line. The property also shows that the search space for communities is confined to the area above the oblique line in the figure, and that if of two communities u and v, u dominates v, then u is necessarily closer to the oblique line than v.
Property 3: for an individual v and a parameter k, if m nodes among the neighbors of v form a (β, k)-community S, then S ∪ {v} is also a (β, k)-community.
By Property 3, when searching for a (β, k)-community it suffices to search within the kNN set of a node.
The method of the invention is described below with reference to Fig. 6 and a concrete example.
In one embodiment there is a mobile communication network whose nodes represent mobile phone numbers, 100,000 in total, and whose edges represent communication records between phone numbers: if two phone numbers have communicated, an edge is placed between the nodes that represent them, and the distance between the nodes expresses the communication frequency between the numbers. Communities of size 50 to 100 are to be discovered in this network. Before the concrete steps are carried out, the value of β must be set. It is usually set between 0.5 and 1, depending on actual requirements. For example, to find the circles of friends of mobile users, β should be set high, e.g. above 0.8, whereas to find the users' business contact circles it can be set lower, e.g. 0.5. In this embodiment β is set to 0.8. The concrete steps of community discovery are described below.
First, the node degrees in a social network usually follow a long-tailed degree distribution: a few nodes have many neighbor nodes, while most nodes have few neighbors, approaching a constant. If a high size threshold is set, most nodes are therefore pruned, which reduces the workload and improves computational efficiency. The invention accordingly starts with a large size threshold and discovers large communities first, to obtain part of the solution (the skyband); once all communities whose size meets the requirement have been found, the size threshold is gradually relaxed to enlarge the search region and so obtain the complete solution. In this embodiment, community size can first be limited to between 90 and 100.
By Property 2, these values yield a search space whose left boundary is 90; whose upper boundary is 150 (the upper boundary cannot exceed the neighbor count of the node with the most neighbor nodes, assumed here to be 150); whose lower boundary is 71.2 (computed as β(L-1), where the left boundary L is 90 and β is the 0.8 set earlier); and whose right boundary is 188.5 (computed as $U/\beta + 1$, where U is the upper boundary). Fig. 4 shows a schematic of the search space. Once the current search space has been obtained, pruning and community judgement can be carried out within it.
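The boundary arithmetic of this embodiment can be reproduced with a short sketch. The function name `search_region` and the dictionary layout are assumptions for illustration; the formulas β(L-1) and U/β + 1 are the ones stated in the text.

```python
def search_region(left, max_degree, beta):
    """Return the four boundaries of the search region for a given left boundary L."""
    return {
        "left": left,                      # L: smallest community size currently admitted
        "upper": max_degree,               # U: largest neighbor count in the network
        "lower": beta * (left - 1),        # beta * (L - 1)
        "right": max_degree / beta + 1,    # U / beta + 1
    }

# Values from the embodiment: L = 90, U = 150, beta = 0.8.
region = search_region(90, 150, 0.8)
```

For these inputs the lower boundary evaluates to about 71.2 and the right boundary to about 188.5, matching the figures quoted in the embodiment (up to floating-point rounding).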
The principle of the pruning operation was explained in the description of Property 1. For example, if some node in the social network has 40 neighbor nodes, its neighbor count is less than 71.2 (i.e. β(L-1)), so it cannot be a node of a candidate community and can be pruned in this search. By Property 1, once a node has been pruned, each of its neighbor nodes is examined in turn to see whether it still satisfies Property 1; if not, it is pruned as well, otherwise it is kept. Pruning is applied recursively to the neighbor nodes of pruned nodes until all nodes in the social network have been judged. Pruning can greatly reduce the number of nodes in the social network; for example, of the 100,000 nodes mentioned above, perhaps only 8,000 remain after pruning, which reduces the workload of the subsequent steps.
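The cascading pruning described above might be sketched as below. This is an illustrative reading, not the patent's implementation: the function name `prune_low_degree` is hypothetical, a queue replaces the recursion, and a node is removed when its current degree is strictly less than the threshold β(L-1), following Property 1 (the claim wording of step 2-2 says "less than or equal to").

```python
from collections import deque

def prune_low_degree(adj, threshold):
    """Return a copy of adj ({node: set(neighbors)}) without nodes of degree < threshold.

    Removing a node lowers its neighbors' degrees, so they are re-checked in turn.
    """
    graph = {v: set(nbrs) for v, nbrs in adj.items()}
    queue = deque(v for v in graph if len(graph[v]) < threshold)
    while queue:
        v = queue.popleft()
        if v not in graph:          # already pruned via another path
            continue
        for u in graph.pop(v):      # detach v from each remaining neighbor
            graph[u].discard(v)
            if len(graph[u]) < threshold:
                queue.append(u)
    return graph

# Hypothetical toy network: a 4-clique a-b-c-d with a tail d-e-f.
adj = {
    "a": {"b", "c", "d"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"}, "e": {"d", "f"}, "f": {"e"},
}
pruned = prune_low_degree(adj, 3)
```

In the toy network the tail nodes `e` and `f` are removed and the clique survives; the input dictionary is left unmodified so the same network can be pruned again with a different threshold.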
After pruning, community discovery is carried out in the current search space. As mentioned above, the size of the communities that can be found in the current search space is restricted; in this embodiment it is limited to between 90 and 100. Suppose a community of size 100 is to be searched for; the search then proceeds as follows:
First, a node is selected from the nodes remaining in the pruned social network, and it is judged whether a community of size 99 can be found among its neighbor nodes. If so, the node and that community of size 99 form a community of size 100; if not, another node is selected from the social network and the search continues. The neighbor count of the selected node is usually greater than 99; with 105 neighbor nodes, for instance, finding a community of size 99 among the neighbors requires $C_{105}^{99}$ checks.
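To give a sense of scale, the number of candidate subsets in this example can be computed directly. The helper name `candidate_subsets` is hypothetical; `math.comb` is the standard-library binomial coefficient.

```python
from math import comb

def candidate_subsets(neighbor_count, target_size):
    """Number of ways to choose target_size nodes from neighbor_count neighbors."""
    return comb(neighbor_count, target_size)

# 105 neighbors, looking for a community of size 99 among them.
checks = candidate_subsets(105, 99)
```

Choosing 99 of 105 neighbors is the same as choosing the 6 to leave out, which gives roughly 1.6 billion candidate subsets for a single node; this is why pruning before the search matters.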
Once all possible community discovery operations on the neighbor nodes of the selected node have been completed, the selected node is deleted from the social network. Since all communities containing this node have already been traversed, deleting it avoids repeated computation later.
A node is then selected again from the remaining nodes of the social network and the above operations are repeated, until the social network no longer contains a node whose degree exceeds 99.
In a preferred implementation, node selection starts from the node with the smallest node degree. Continuing the example above, nodes with degree less than 99 are deleted first, and the nodes with exactly 99 neighbor nodes are traversed next. As soon as a node has been traversed it is removed (removing a node of degree 99, for example, reduces the degree of 99 other nodes by 1, which can greatly improve performance). The nodes with exactly 100 neighbor nodes are traversed after that, and so on, until all nodes in the social network have been processed.
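The search loop with lowest-degree-first selection could be sketched as follows. This is a brute-force illustration under stated assumptions: the names `find_communities` and `is_complete` are hypothetical, node labels are assumed to be sortable, and a plain complete-subgraph test stands in for the full (β, k) community judgement, which a caller could pass in as `is_community`.

```python
from itertools import combinations

def is_complete(adj, nodes):
    """True if every pair of the given nodes is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))

def find_communities(adj, size, is_community=is_complete):
    """Enumerate communities of the given size, always processing the lowest-degree node first."""
    graph = {v: set(nbrs) for v, nbrs in adj.items()}
    results = []
    while graph:
        v = min(graph, key=lambda n: (len(graph[n]), n))   # lowest degree, ties by label
        if len(graph[v]) >= size - 1:
            # Look for a community of size |S| - 1 among the neighbors of v.
            for group in combinations(sorted(graph[v]), size - 1):
                if is_community(graph, group):
                    results.append(frozenset(group) | {v})
        for u in graph.pop(v):      # delete v so it is never revisited
            graph[u].discard(v)
    return results

# Hypothetical toy network: a 4-clique a-b-c-d plus a pendant node e.
adj = {
    "a": {"b", "c", "d"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"}, "e": {"d"},
}
triangles = find_communities(adj, 3)
```

Deleting each node after it has been processed means every community is reported once, by whichever of its members is processed first.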
The search procedure has been explained above for communities of size 100, but those of ordinary skill in the art will understand that communities of other sizes are searched for in the same way.
The search procedure above refers to a community judgement operation, whose concrete implementation is explained below.
The community judgement operation decides, given n points, whether those n points form a community. Any two nodes u and v are taken first; if the edge {u, v} between them does not belong to G.E, the n points are not a community. If every pair of members is joined by an edge, the n points are a community. The closeness of the community is then computed: for each member u, the other members are sorted by their distance to u, and the rank, among all neighbors of u, of the member in position β(|S|-1) is recorded as k(u). k(u) is computed for all members, and the largest k(u) is taken as the closeness.
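The judgement and closeness computation might look like the sketch below. Several details are assumptions rather than statements of the patent: the function names, the rounding of β(|S|-1) up to an integer with `ceil`, distinct distances, and a community of at least two members.

```python
from itertools import combinations
from math import ceil

def is_community(adj, members):
    """True if every pair of members is joined by an edge of the graph."""
    return all(v in adj[u] for u, v in combinations(members, 2))

def closeness(adj, members, beta):
    """Largest k(u) over all members u; adj maps node -> {neighbor: distance}."""
    need = ceil(beta * (len(members) - 1))     # position beta * (|S| - 1), rounded up
    worst = 0
    for u in members:
        # Other members ordered by distance to u; pick the one in position `need`.
        others = sorted((m for m in members if m != u), key=lambda m: adj[u][m])
        target = others[need - 1]
        # Rank of that member among all neighbors of u, inside or outside the community.
        ranked = sorted(adj[u], key=lambda n: adj[u][n])
        worst = max(worst, ranked.index(target) + 1)
    return worst

# Hypothetical toy network: triangle a-b-c, with an outsider x very close to a.
adj = {
    "a": {"b": 2.0, "c": 3.0, "x": 1.0},
    "b": {"a": 2.0, "c": 1.0},
    "c": {"a": 3.0, "b": 1.0},
    "x": {"a": 1.0},
}
k_half = closeness(adj, ["a", "b", "c"], 0.5)
```

In this toy network the outsider `x` pushes the members of the triangle down the ranking of `a`, so the closeness of the triangle is larger than it would be without `x`.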
After the community search in the current search space is complete, the search region is enlarged: its left boundary is moved to the left, and the community search is then carried out in the newly generated search region. For example, the search space mentioned earlier has a left boundary of 90, an upper boundary of 150, a lower boundary of 71.2, and a right boundary of 188.5. After the search region is enlarged, the new search space has a left boundary of 89 (the former left boundary minus 1), an upper boundary of 150, a lower boundary of 70.4 (computed as β(L-1), where the left boundary L is 89 and β is 0.8, as set earlier), and a right boundary of 188.5.
In the new search space, the pruning operation and the community search operation described above are carried out again, and the newly found communities are added to the result set. The expansion of the search space is repeated until the search space reaches the boundary of the scale of the communities to be found.
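The pruning operation that is repeated in each enlarged region can be sketched as a cascading deletion. This is an illustrative version under stated assumptions: the graph is an adjacency dictionary, and `prune` and its `inclusive` flag are hypothetical. The flag exists because the worked example deletes nodes whose degree is less than the threshold, while the claim wording says "less than or equal to"; the sketch defaults to the strict comparison.

```python
from collections import deque


def prune(adj, size, beta, inclusive=False):
    """Repeatedly delete nodes whose degree falls below beta * (size - 1).

    adj       -- dict mapping node -> set of neighbors (undirected, symmetric)
    size      -- |S|, the size of the community being sought
    inclusive -- if True, also delete nodes whose degree equals the threshold
    Returns a pruned copy; the input graph is left untouched.
    """
    adj = {n: set(nbs) for n, nbs in adj.items()}
    threshold = beta * (size - 1)

    def too_small(n):
        degree = len(adj[n])
        return degree <= threshold if inclusive else degree < threshold

    queue = deque(n for n in adj if too_small(n))
    while queue:
        n = queue.popleft()
        if n not in adj:  # already deleted through an earlier queue entry
            continue
        for nb in adj.pop(n):
            adj[nb].discard(n)
            # A neighbor may drop below the threshold once n is gone.
            if too_small(nb):
                queue.append(nb)
    return adj
```

Because degrees only ever decrease, a node that has entered the queue never needs to be re-checked before deletion.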
Because the preceding embodiment is fairly complex, a simpler example is used below to illustrate how the search region changes. In the example shown in Fig. 5, the goal is to find a skyband of width 2. In Fig. 5(a), communities are searched for within the current search region, and the communities that satisfy the conditions are generated. In Fig. 5(b), the search region is then enlarged by setting left = left - 1, and the search continues in the new region. When a new community S is found in the search region as described above, check whether the current set of qualifying communities, skyband, contains two communities U and V that dominate S. If it does, shrink the upper bound of the search region to the region dominated by U and V, as in Fig. 5(c). If it does not, add the newly found community S to skyband, update skyband, and check whether the newly added community dominates other communities in skyband. If it does, check whether the number of dominations exceeds 2; if so, the dominated community cannot be in the 2-skyband and is eliminated, and the computation continues with the other communities. If it does not, skyband remains unchanged.
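The skyband bookkeeping can be illustrated with the standard k-skyband definition, in which a point is kept when fewer than k other points dominate it. The sketch below makes assumptions the text does not state. Each community is summarized as a hypothetical `(size, closeness)` pair, and a community dominates another when it is at least as large and has a closeness value at least as small, without being identical. The text is not fully explicit about whether elimination happens at two dominators or at more than two; the sketch follows the standard definition.

```python
def dominates(a, b):
    """Assumed dominance rule on (size, closeness) pairs.

    a dominates b if a is at least as large, has a closeness value that is
    at least as small, and is not the same point.
    """
    return a[0] >= b[0] and a[1] <= b[1] and a != b


def k_skyband(points, k=2):
    """Keep every point dominated by fewer than k other points."""
    return [
        p for p in points
        if sum(dominates(q, p) for q in points) < k
    ]
```

For example, among the pairs (10, 5), (9, 6), (8, 7) and (8, 4), only (8, 7) has two or more dominators, so `k_skyband` with k = 2 keeps the other three. This quadratic scan is meant only to make the definition concrete; the incremental update described in the text avoids recomputing every count.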
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements of the technical solution of the present invention that do not depart from its spirit and scope shall all fall within the scope of the claims of the present invention.

Claims (5)

1. A community discovery method, comprising:
Step 1), in a social network, delimiting a search region according to the scale range of the communities to be found; wherein the left boundary L of the search region is the size of the largest community currently expected to be found, the upper boundary U is the number of neighbors of the node that has the most neighbors in the social network, the right boundary is given by the formula in figure FDA00001978607200011 (image not reproduced here), and the lower boundary is β(L-1); β denotes, for any node in the community, the proportion of its neighbor nodes that satisfy the k-nearest-neighbor constraint relative to the total number of neighbor nodes of that node;
Step 2), performing a pruning operation in the search region according to the number of neighbor nodes of each node, removing from the social network the nodes whose number of neighbor nodes is less than the closeness of the community to be found; said step 2) comprising:
Step 2-1), selecting an unprocessed node in the social network;
Step 2-2), judging whether the degree of the selected node is less than or equal to β(|S|-1); if so, deleting the node from the social network and then carrying out the next step; otherwise, executing step 2-1) again, until all nodes in the social network have been processed; wherein |S| denotes the size of the community expected to be found;
Step 2-3), judging whether the degree of a neighbor node of the selected node is less than or equal to β(|S|-1); if so, deleting that neighbor node from the social network and then repeating this step for the neighbor nodes of the deleted node, until all nodes in the social network have been judged;
Step 3), selecting a node among the remaining nodes of the pruned social network, and searching the neighbor nodes of that node for a community of size |S|-1; once one is found, combining the node with the found community of size |S|-1 to form the community to be found, and adding it to the result set;
Step 4), moving the left boundary of the search region to the left, then executing step 2) and step 3) again in the enlarged search region, until the search region reaches the minimum value of the scale of the communities to be found.
2. The community discovery method according to claim 1, characterized in that said step 3) comprises:
Step 3-1), selecting an unprocessed node in the social network;
Step 3-2), searching the neighbor nodes of that node for a community of size |S|-1; if one is found, combining the node with the found community of size |S|-1 to form the community to be found, and adding it to the result set; if none is found, executing step 3-1) again;
Step 3-3), after all community search operations among the neighbor nodes of the selected node are complete, deleting the selected node from the social network and then executing step 3-1) again, until all nodes in the social network have been processed.
3. The community discovery method according to claim 2, characterized in that, in said step 3-1), the selection of unprocessed nodes starts from the node with the smallest degree in the social network.
4. The community discovery method according to claim 1, characterized in that the value of β lies between 0 and 1.
5. The community discovery method according to claim 1, characterized in that, in said step 4), when the left boundary of the search region is moved to the left, it is moved by 1.
CN 201010122852 2010-03-01 2010-03-01 Community discovery method Expired - Fee Related CN102194149B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010122852 CN102194149B (en) 2010-03-01 2010-03-01 Community discovery method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010122852 CN102194149B (en) 2010-03-01 2010-03-01 Community discovery method

Publications (2)

Publication Number Publication Date
CN102194149A CN102194149A (en) 2011-09-21
CN102194149B true CN102194149B (en) 2012-12-05

Family

ID=44602184

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010122852 Expired - Fee Related CN102194149B (en) 2010-03-01 2010-03-01 Community discovery method

Country Status (1)

Country Link
CN (1) CN102194149B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103327092A (en) * 2012-11-02 2013-09-25 中国人民解放军国防科学技术大学 Cell discovery method and system on information networks
CN103325061B (en) * 2012-11-02 2017-04-05 中国人民解放军国防科学技术大学 A kind of community discovery method and system
CN103914493A (en) * 2013-01-09 2014-07-09 北大方正集团有限公司 Method and system for discovering and analyzing microblog user group structure
CN103150350B (en) * 2013-02-18 2016-01-27 北京邮电大学 A kind of method and apparatus building relational network
CN104503997B (en) * 2014-12-05 2017-12-26 北京百度网讯科技有限公司 Colleague's localization method, device and computer equipment
CN106708844A (en) * 2015-11-12 2017-05-24 阿里巴巴集团控股有限公司 User group partitioning method and device
CN107171838B (en) * 2017-05-18 2018-04-13 陕西师范大学 A kind of Web content based on limited content backup reconstructs method for optimizing
CN108804516B (en) * 2018-04-26 2021-03-02 平安科技(深圳)有限公司 Similar user searching device, method and computer readable storage medium
CN108959453B (en) * 2018-06-14 2021-08-27 中南民族大学 Information extraction method and device based on text clustering and readable storage medium
CN111274457B (en) * 2020-02-03 2023-12-19 中国人民解放军国防科技大学 Network graph segmentation method and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101149756A (en) * 2007-11-09 2008-03-26 清华大学 Individual relation finding method based on path grade at large scale community network
JP2009116844A (en) * 2007-10-19 2009-05-28 Nec Corp Electronic computer and program for calculating social network structural model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10229389B2 (en) * 2008-02-25 2019-03-12 International Business Machines Corporation System and method for managing community assets

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121205

Termination date: 20140301