CN103678474B - Method for quickly obtaining a large number of hot topics in a social network - Google Patents

Method for quickly obtaining a large number of hot topics in a social network

Info

Publication number
CN103678474B
Authority
CN
China
Prior art keywords
event
status
user
coverage
friend
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310440419.4A
Other languages
Chinese (zh)
Other versions
CN103678474A (en)
Inventor
王灿
王哲
金家禾
卜佳俊
陈纯
何占盈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201310440419.4A
Publication of CN103678474A
Application granted
Publication of CN103678474B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/954 Navigation, e.g. using categorised browsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method for quickly obtaining a large number of hot topics in a social network: capture the forwarding records of the statuses that users post in the social network; cluster all status contents with a clustering algorithm, each cluster being defined as one event; by analyzing the status forwarding records, select for a target user the K friends in the user's friend group who can cover the most events in the shortest time; place these K friends in a dedicated friend group and recommend it to the target user. The advantage of the method is that, by analyzing the forwarding history of a user's friends in the social network, the friends who can cover all hot information are gathered and placed in one dedicated group. When time is limited or there are too many statuses, the user only needs to skim the messages of this group to grasp current news and hot topics as quickly as possible.

Description

Method for quickly obtaining a large number of hot topics in a social network
Technical field
The present invention relates to the technical field of friend grouping optimization in social networks, and in particular to optimizing friend grouping so as to help users obtain more hot information in a short time.
Background art
In recent years, with the rapid development of the Internet, people's social circles have begun to move from real life to the network, and the rise of social networks has greatly widened the range of people one can befriend. From relatives and friends close at hand to industry celebrities and entertainment stars one has never met, social networks give ordinary users a broader platform for making friends and an effective way to obtain information. Users can obtain a large amount of information from social networks every day, and the volume of information spread through social networks far exceeds that of traditional news media such as radio, television, and newspapers. However, most messages are time-sensitive, and as a user follows more friends the number of messages increases sharply. Everyone's time and energy are limited, so how to filter more hot information out of a large number of messages in the shortest time is a problem in urgent need of a solution.
Many social networks provide a friend grouping function that lets users read selectively by showing only the statuses of a chosen group. Users can group friends by their relationship to them, such as relatives or friends, or by the friends' occupation, such as film stars or computer engineers. To help users obtain a large amount of hot information in the shortest time, the present invention proposes a new friend grouping method: by analyzing the forwarding history of a user's friends, the friends who can cover all hot information are gathered and placed in one dedicated group. When time is limited or there are too many statuses, the user only needs to skim the messages of this group to grasp current news and hot topics as quickly as possible.
Summary of the invention
To help users obtain a large amount of hot information in the shortest time and keep up with current news and hot topics, the present invention proposes a method for quickly obtaining a large number of hot topics in a social network:
1. The method comprises the following steps:
1) Capture, in the social network, the forwarding records of the statuses posted by users, including user name, forwarded content, forwarding time, forward count, original author, and the posting time of the original status;
2) Cluster all status contents with a clustering algorithm; each cluster is defined as one event;
3) By analyzing the status forwarding records, select for the target user the K friends in the user's friend group who can cover the most events in the shortest time;
4) Place these K friends in a dedicated friend group and recommend it to the target user.
2. The clustering of all status contents by a clustering algorithm in step 2), with each cluster defined as one event, is characterized in that:
1) Each cluster of statuses is defined as one event; having read any one status in an event counts as having received the message of that topic.
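Steps 1) and 2) can be sketched as follows. The text does not name a clustering algorithm, so the single-pass token-overlap clustering below is only a stand-in chosen for brevity; the `ForwardRecord` field names, the `jaccard` helper, and the `threshold` value are all assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForwardRecord:
    """One captured forwarding record (field names are hypothetical)."""
    user: str             # user who forwarded the status
    content: str          # forwarded text
    forward_time: float   # time of the forward
    forward_count: int    # how many times the status was forwarded
    original_author: str  # author of the original status
    original_time: float  # posting time of the original status


def jaccard(a: set, b: set) -> float:
    """Token-set similarity in [0, 1]."""
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_into_events(records, threshold=0.5):
    """Single-pass clustering: a record joins the first event whose seed
    text is similar enough, otherwise it starts a new event."""
    seeds = []   # token set of the first record of each event
    events = []  # events[i] is the list of records in event i
    for rec in records:
        tokens = set(rec.content.lower().split())
        for i, seed in enumerate(seeds):
            if jaccard(tokens, seed) >= threshold:
                events[i].append(rec)
                break
        else:
            seeds.append(tokens)
            events.append([rec])
    return events


records = [
    ForwardRecord("alice", "earthquake hits city center", 10.0, 120, "news", 1.0),
    ForwardRecord("bob", "earthquake hits city", 12.0, 80, "news", 2.0),
    ForwardRecord("carol", "new phone released today", 15.0, 40, "tech", 3.0),
]
print([len(e) for e in cluster_into_events(records)])  # [2, 1]
```

Any clustering method that assigns each status to exactly one cluster could replace `cluster_into_events`; the later steps only need the mapping from statuses to events.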
3. The selection in step 3), by analyzing the status forwarding records, of the K friends in the target user's friend group who can cover the most events in the shortest time is characterized in that:
3.1 If a user has forwarded any status in an event, the user is said to cover that event;
3.2 Arbitrarily choose K friends of the target user to form a set A and define $t = T(i, A)$ as the time at which set A covers event i, i.e. the minimum over all users in A of the time at which each covers event i; if set A does not cover event i, set $T(i, A) = \infty$;
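The coverage time $T(i, A)$ can be sketched in a few lines. The `cover_time` dictionary and the user and event names are hypothetical toy data; the only logic taken from the text is "minimum over the members of A, infinity when nobody in A covered the event".

```python
import math

# cover_time[user][event] = earliest time at which `user` forwarded any
# status belonging to `event` (hypothetical toy data, in minutes).
cover_time = {
    "alice": {"e1": 5.0, "e2": 40.0},
    "bob": {"e1": 12.0},
    "carol": {"e2": 8.0, "e3": 30.0},
}


def T(event, A, cover_time):
    """Coverage time of `event` by friend set A: the minimum of the
    members' coverage times, or infinity if no member covered it."""
    times = [cover_time[u][event] for u in A if event in cover_time.get(u, {})]
    return min(times, default=math.inf)


print(T("e1", {"alice", "bob"}, cover_time))  # 5.0
print(T("e3", {"alice", "bob"}, cover_time))  # inf
```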
3.3 Define a penalty function $f_i(t)$ that maps a time t to a real number representing the loss incurred when event i is covered at time t. The importance coefficient of event i is $m_i / \mathrm{sum}(i)$, where $m_i$ is the number of forwards of all statuses in event i and $\mathrm{sum}(i)$ is the number of forwards of all statuses. The importance of an event is assumed to be proportional to its share of forwards, so the loss is proportional to the product of the coverage time and the importance coefficient: $f_i(t) \propto \frac{m_i}{\mathrm{sum}(i)}\, t$. The penalty function $f_i(t)$ may be changed to suit the actual situation. If $T(i, A) = \infty$, $f_i(t)$ takes the function's maximum value $F_{\max}$, which is set manually;
3.4 Traverse all events and define the penalty function of the whole network:
$$F(A) = \sum_i p(i)\, f_i\big(T(i, A)\big)$$
where $p(i)$ denotes the probability that event i occurs, $m_i$ is the number of forwards of all statuses in event i, and $\mathrm{total}(i)$ is the number of all statuses;
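A minimal sketch of the network penalty $F(A)$ follows. It rests on several assumptions that are not fixed by the text: the penalty is taken to be exactly the importance weight times the coverage time, it is capped at `F_MAX` so that an uncovered event is always the worst case, and `p` and `weight` are supplied directly as toy numbers instead of being computed from forwarding counts.

```python
import math

F_MAX = 100.0  # manually chosen maximum penalty (assumed value)


def penalty(weight, t, f_max=F_MAX):
    """f_i(t): importance weight times coverage time, capped at f_max.
    An uncovered event (t = infinity) gets the maximum penalty."""
    return f_max if math.isinf(t) else min(weight * t, f_max)


def network_loss(A, events, cover_time):
    """F(A) = sum over events i of p(i) * f_i(T(i, A))."""
    total = 0.0
    for e in events:
        times = [cover_time[u][e["id"]] for u in A
                 if e["id"] in cover_time.get(u, {})]
        t = min(times, default=math.inf)  # T(i, A)
        total += e["p"] * penalty(e["weight"], t)
    return total


# Hypothetical toy data: two events and two friends.
events = [
    {"id": "e1", "p": 0.5, "weight": 0.6},
    {"id": "e2", "p": 0.5, "weight": 0.4},
]
cover_time = {"alice": {"e1": 5.0}, "carol": {"e2": 10.0}}

print(network_loss({"alice"}, events, cover_time))
print(network_loss({"alice", "carol"}, events, cover_time))
```

With this data, adding carol lowers the loss because event e2 moves from uncovered (penalty `F_MAX`) to covered at time 10.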
3.5 Suppose that some behavior of user b is directly or indirectly influenced by user a and the influence factor exceeds a certain threshold; then a is said to cover b. For example, when user b forwards a status, b may be influenced not only by the original author and the user whose post b forwarded but also, potentially, by other users. The original author and the forwarded user may influence b only in this one event, whereas user a may play a more decisive role in other events. This process is represented with a probabilistic model;
3.6 Define $\sigma(A)$ as the number of users covered by set A, i.e. the total number of distinct users covered by the users in A. $\sigma(A)$ can be computed in several ways; the present invention uses the linear threshold model and defines:
$$\sigma(A) = \sum_i \sum_v I\Big(\sum_w b_{v,w} \ge \theta_v\Big)$$
In this model, $b_{v,w} = e^{-a(t_v - t_w)}$ (when $t_v > t_w$ and v and w are friends; otherwise the value is 0) is the influence factor of user w on user v in event i, $t_w$ and $t_v$ are the times at which w and v cover event i, $a$ and $\theta_v$ are adjustable parameters, and $I$ is the indicator function.
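The linear threshold count $\sigma(A)$ can be sketched as below. Two points are assumptions of this sketch: the inner sum runs over the members w of A (the formula does not state the range of w), and a single threshold `theta` is used for every user instead of a per-user $\theta_v$. The function follows the formula literally, so it counts (event, user) pairs; `cover_time` and `friends` are hypothetical toy data.

```python
import math


def influence(t_v, t_w, a, are_friends):
    """b_{v,w} = exp(-a * (t_v - t_w)) if w covered the event before v
    and the two are friends, otherwise 0."""
    if are_friends and t_v > t_w:
        return math.exp(-a * (t_v - t_w))
    return 0.0


def sigma(A, cover_time, friends, a=0.1, theta=0.5):
    """Count (event, user) pairs whose summed influence from members of A
    reaches the threshold theta (theta is assumed to be positive)."""
    events = {e for times in cover_time.values() for e in times}
    covered = 0
    for e in events:
        for v, times_v in cover_time.items():
            if e not in times_v:
                continue  # v never covered this event
            total = sum(
                influence(times_v[e], cover_time[w][e], a,
                          frozenset((v, w)) in friends)
                for w in A
                if e in cover_time.get(w, {})
            )
            if total >= theta:
                covered += 1
    return covered


# Hypothetical toy data: alice forwards first, bob soon after, carol much later.
cover_time = {"alice": {"e1": 0.0}, "bob": {"e1": 2.0}, "carol": {"e1": 50.0}}
friends = {frozenset(("alice", "bob")), frozenset(("alice", "carol"))}

print(sigma({"alice"}, cover_time, friends))  # 1
```

Here bob forwards two time units after alice, so the influence exp(-0.2) exceeds 0.5 and bob counts as covered; carol forwards so much later that the decayed influence falls below the threshold.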
3.7 Define the objective function:
$$G(A) = \min_A F(A) - \beta\,\sigma(A)$$
where $F(A)$ is the penalty function of the whole network, $\sigma(A)$ is the number of users covered by set A, and $\beta$ is an adjustable parameter. Solving this objective yields the K friends who can cover the most events in the shortest time; K is set manually. The users in set A have the following characteristics: a) they have considerable influence within the user group considered by the objective function, and the messages they post are forwarded at a high rate; b) within a short time they forward a large number of important messages posted by others;
3.8 Minimizing the objective function $G(A)$ is an NP-hard problem. Define $R_i(A) = f_i(\infty) - f_i(T(i, A))$ and $R(A) = \sum_i p(i)\, R_i(A)$; then minimizing $G(A)$ is equivalent to maximizing $H(A) = R(A) + \beta\,\sigma(A)$. It can be shown that $H(A)$ is a submodular function, so an approximate solution can be obtained with a greedy algorithm, with an approximation ratio of at least $1 - 1/e \approx 0.63$.
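The greedy selection mentioned in 3.8 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the penalty is taken as weight times time capped at `F_MAX`, `p` and `weight` are toy numbers, and the `reach` sets are a simplified stand-in for the linear threshold $\sigma(A)$ so that the example stays short. Only `greedy_select` is the technique itself: repeatedly add the friend with the largest marginal gain in $H$.

```python
import math

F_MAX = 100.0  # manually chosen maximum penalty (assumed value)


def make_H(events, cover_time, sigma, beta):
    """Build H(A) = sum_i p(i) * (f_i(inf) - f_i(T(i, A))) + beta * sigma(A)."""
    def H(A):
        reward = 0.0
        for e in events:
            times = [cover_time[u][e["id"]] for u in A
                     if e["id"] in cover_time.get(u, {})]
            t = min(times, default=math.inf)
            f = F_MAX if math.isinf(t) else min(e["weight"] * t, F_MAX)
            reward += e["p"] * (F_MAX - f)
        return reward + beta * sigma(A)
    return H


def greedy_select(candidates, K, H):
    """Pick up to K friends, each round adding the one with the largest
    positive marginal gain H(A | {u}) - H(A)."""
    A = set()
    for _ in range(K):
        best, best_gain = None, 0.0
        for u in sorted(candidates - A):  # sorted for deterministic ties
            gain = H(A | {u}) - H(A)
            if gain > best_gain:
                best, best_gain = u, gain
        if best is None:  # no remaining friend improves H
            break
        A.add(best)
    return A


# Hypothetical toy data.
events = [
    {"id": "e1", "p": 0.5, "weight": 0.6},
    {"id": "e2", "p": 0.3, "weight": 0.3},
    {"id": "e3", "p": 0.2, "weight": 0.1},
]
cover_time = {
    "alice": {"e1": 5.0, "e2": 40.0},
    "bob": {"e1": 12.0},
    "carol": {"e2": 8.0, "e3": 30.0},
}
reach = {"alice": {"u1", "u2"}, "bob": {"u2"}, "carol": {"u3"}}


def toy_sigma(A):
    """Stand-in for sigma(A): number of distinct users reached by A."""
    return len(set().union(*(reach[u] for u in A)))


H = make_H(events, cover_time, toy_sigma, beta=1.0)
print(sorted(greedy_select(set(cover_time), 2, H)))  # ['alice', 'carol']
```

In this toy data bob is never selected: alice already covers e1 earlier and already reaches u2, so adding bob has zero marginal gain.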
The present invention proposes a new friend grouping method for social networks. Its advantage is that, by analyzing the forwarding history of a user's friends, the friends who can cover all hot information are gathered and placed in one dedicated group. When time is limited or there are too many statuses, the user only needs to skim the messages of this group to grasp current news and hot topics as quickly as possible.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the present invention.
Detailed description
The present invention is further described with reference to the accompanying drawing:
A method for quickly obtaining a large number of hot topics in a social network:
1. The method comprises the following steps:
1) Capture, in the social network, the forwarding records of the statuses posted by users, including user name, forwarded content, forwarding time, forward count, original author, and the posting time of the original status;
2) Cluster all status contents with a clustering algorithm; each cluster is defined as one event;
3) By analyzing the status forwarding records, select for the target user the K friends in the user's friend group who can cover the most events in the shortest time;
4) Place these K friends in a dedicated friend group and recommend it to the target user.
2. The clustering of all status contents by a clustering algorithm in step 2), with each cluster defined as one event, is characterized in that:
1) Each cluster of statuses is defined as one event; having read any one status in an event counts as having received the message of that topic.
3. The selection in step 3), by analyzing the status forwarding records, of the K friends in the target user's friend group who can cover the most events in the shortest time is characterized in that:
1) If a user has forwarded any status in an event, the user is said to cover that event;
2) Arbitrarily choose K friends of the target user to form a set A and define $t = T(i, A)$ as the time at which set A covers event i, i.e. the minimum over all users in A of the time at which each covers event i; if set A does not cover event i, set $T(i, A) = \infty$;
3) Define a penalty function $f_i(t)$ that maps a time t to a real number representing the loss incurred when event i is covered at time t. The importance coefficient of event i is $m_i / \mathrm{sum}(i)$, where $m_i$ is the number of forwards of all statuses in event i and $\mathrm{sum}(i)$ is the number of forwards of all statuses. The importance of an event is assumed to be proportional to its share of forwards, so the loss is proportional to the product of the coverage time and the importance coefficient: $f_i(t) \propto \frac{m_i}{\mathrm{sum}(i)}\, t$. The penalty function $f_i(t)$ may be changed to suit the actual situation. If $T(i, A) = \infty$, $f_i(t)$ takes the function's maximum value $F_{\max}$, which is set manually;
4) Traverse all events and define the penalty function of the whole network:
$$F(A) = \sum_i p(i)\, f_i\big(T(i, A)\big)$$
where $p(i)$ denotes the probability that event i occurs, $m_i$ is the number of forwards of all statuses in event i, and $\mathrm{total}(i)$ is the number of all statuses;
5) Suppose that some behavior of user b is directly or indirectly influenced by user a and the influence factor exceeds a certain threshold; then a is said to cover b. For example, when user b forwards a status, b may be influenced not only by the original author and the user whose post b forwarded but also, potentially, by other users. The original author and the forwarded user may influence b only in this one event, whereas user a may play a more decisive role in other events. This process is represented with a probabilistic model;
6) Define $\sigma(A)$ as the number of users covered by set A, i.e. the total number of distinct users covered by the users in A. $\sigma(A)$ can be computed in several ways; the present invention uses the linear threshold model and defines:
$$\sigma(A) = \sum_i \sum_v I\Big(\sum_w b_{v,w} \ge \theta_v\Big)$$
In this model, $b_{v,w} = e^{-a(t_v - t_w)}$ (when $t_v > t_w$ and v and w are friends; otherwise the value is 0) is the influence factor of user w on user v in event i, $t_w$ and $t_v$ are the times at which w and v cover event i, $a$ and $\theta_v$ are adjustable parameters, and $I$ is the indicator function.
7) Define the objective function:
$$G(A) = \min_A F(A) - \beta\,\sigma(A)$$
where $F(A)$ is the penalty function of the whole network, $\sigma(A)$ is the number of users covered by set A, and $\beta$ is an adjustable parameter. Solving this objective yields the K friends who can cover the most events in the shortest time; K is set manually. The users in set A have the following characteristics: a) they have considerable influence within the user group considered by the objective function, and the messages they post are forwarded at a high rate; b) within a short time they forward a large number of important messages posted by others;
8) Minimizing the objective function $G(A)$ is an NP-hard problem. Define $R_i(A) = f_i(\infty) - f_i(T(i, A))$ and $R(A) = \sum_i p(i)\, R_i(A)$; then minimizing $G(A)$ is equivalent to maximizing $H(A) = R(A) + \beta\,\sigma(A)$. It can be shown that $H(A)$ is a submodular function, so an approximate solution can be obtained with a greedy algorithm, with an approximation ratio of at least $1 - 1/e \approx 0.63$.
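The equivalence between minimizing the objective and maximizing $H(A)$ can be checked with a short derivation. It assumes that $R(A)$ aggregates the per-event rewards with the same weights as $F(A)$, that is $R(A) = \sum_i p(i)\, R_i(A)$; this aggregate is an assumption of the sketch, chosen because it is the form that makes the stated equivalence hold.

```latex
\begin{aligned}
F(A) &= \sum_i p(i)\, f_i\big(T(i,A)\big)
      = \sum_i p(i)\, f_i(\infty) - \sum_i p(i)\, R_i(A) \\
     &= C - R(A), \qquad C = \sum_i p(i)\, f_i(\infty), \\
F(A) - \beta\,\sigma(A) &= C - \big(R(A) + \beta\,\sigma(A)\big) = C - H(A).
\end{aligned}
```

Because $C$ does not depend on the chosen set A, any set that minimizes $F(A) - \beta\,\sigma(A)$ also maximizes $H(A)$, and vice versa.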
The content described in the embodiments of this specification merely enumerates forms in which the inventive concept may be realized. The scope of protection of the present invention should not be regarded as limited to the specific forms set out in the embodiments; it also extends to equivalent technical means that a person skilled in the art could conceive on the basis of the inventive concept.

Claims (1)

1. A method for quickly obtaining a large number of hot topics in a social network, characterized in that:
1) Capture, in the social network, the forwarding records of the statuses posted by users, including user name, forwarded content, forwarding time, forward count, original author, and the posting time of the original status;
2) Cluster all status contents with a clustering algorithm, each cluster being defined as one event; the clustering of all status contents by a clustering algorithm in step 2), with each cluster defined as one event, is characterized in that:
Each cluster of statuses is defined as one event, and having read any one status in an event counts as having received the message of that topic;
3) By analyzing the status forwarding records, select for the target user the K friends in the user's friend group who can cover the most events in the shortest time;
The selection in step 3), by analyzing the status forwarding records, of the K friends in the target user's friend group who can cover the most events in the shortest time is characterized in that:
3.1 If a user has forwarded any status in an event, the user is said to cover that event;
3.2 Arbitrarily choose K friends of the target user to form a set A and define $t = T(i, A)$ as the time at which set A covers event i, i.e. the minimum over all users in A of the time at which each covers event i; if set A does not cover event i, set $T(i, A) = \infty$;
3.3 Define a penalty function $f_i(t)$ that maps a time t to a real number representing the loss incurred when event i is covered at time t. The importance coefficient of event i is $m_i / \mathrm{sum}(i)$, where $m_i$ is the number of forwards of all statuses in event i and $\mathrm{sum}(i)$ is the number of forwards of all statuses. The importance of an event is assumed to be proportional to its share of forwards, so the loss is proportional to the product of the coverage time and the importance coefficient: $f_i(t) \propto \frac{m_i}{\mathrm{sum}(i)}\, t$. The penalty function $f_i(t)$ may be changed to suit the actual situation. If $T(i, A) = \infty$, $f_i(t)$ takes the function's maximum value $F_{\max}$;
3.4 Traverse all events and define the penalty function of the whole network:
$$F(A) = \sum_i p(i)\, f_i\big(T(i, A)\big)$$
where $p(i)$ denotes the probability that event i occurs, $m_i$ is the number of forwards of all statuses in event i, and $\mathrm{total}(i)$ is the number of all statuses;
3.5 Suppose that some behavior of user b is directly or indirectly influenced by user a and the influence factor exceeds a certain threshold; then a is said to cover b. For example, when user b forwards a status, b may be influenced not only by the original author and the user whose post b forwarded but also, potentially, by other users. The original author and the forwarded user may influence b only in this one event, whereas user a may play a more decisive role in other events. This process is represented with a probabilistic model;
3.6 Define $\sigma(A)$ as the number of users covered by set A, i.e. the total number of distinct users covered by the users in A. $\sigma(A)$ can be computed in several ways; the present invention uses the linear threshold model and defines:
$$\sigma(A) = \sum_i \sum_v I\Big(\sum_w b_{v,w} \ge \theta_v\Big)$$
In this model, $b_{v,w} = e^{-a(t_v - t_w)}$ (when $t_v > t_w$ and v and w are friends; otherwise the value is 0) is the influence factor of user w on user v in event i, $t_w$ and $t_v$ are the times at which w and v cover event i, $a$ and $\theta_v$ are adjustable parameters, and $I$ is the indicator function;
3.7 Define the objective function:
$$G(A) = \min_A F(A) - \beta\,\sigma(A)$$
where $F(A)$ is the penalty function of the whole network, $\sigma(A)$ is the number of users covered by set A, and $\beta$ is an adjustable parameter. Solving this objective yields the K friends who can cover the most events in the shortest time; K is set manually. The users in set A have the following characteristics:
a) They have considerable influence within the user group considered by the objective function, and the messages they post are forwarded at a high rate;
b) Within a short time they forward a large number of important messages posted by others;
3.8 Minimizing the objective function $G(A)$ is an NP-hard problem. Define $R_i(A) = f_i(\infty) - f_i(T(i, A))$ and $R(A) = \sum_i p(i)\, R_i(A)$; then minimizing $G(A)$ is equivalent to maximizing $H(A) = R(A) + \beta\,\sigma(A)$. It can be shown that $H(A)$ is a submodular function, so an approximate solution can be obtained with a greedy algorithm, with an approximation ratio of at least $1 - 1/e \approx 0.63$;
4) Place these K friends in a dedicated friend group and recommend it to the target user.
CN201310440419.4A 2013-09-24 2013-09-24 Method for quickly obtaining a large number of hot topics in a social network Active CN103678474B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310440419.4A CN103678474B (en) 2013-09-24 2013-09-24 Method for quickly obtaining a large number of hot topics in a social network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310440419.4A CN103678474B (en) 2013-09-24 2013-09-24 Method for quickly obtaining a large number of hot topics in a social network

Publications (2)

Publication Number Publication Date
CN103678474A (en) 2014-03-26
CN103678474B (en) 2016-10-05

Family

ID=50316022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310440419.4A Active CN103678474B (en) 2013-09-24 2013-09-24 Method for quickly obtaining a large number of hot topics in a social network

Country Status (1)

Country Link
CN (1) CN103678474B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809252B (en) * 2015-05-20 2018-05-04 成都信息工程大学 Internet data extraction system
CN108153914B (en) * 2018-01-25 2021-03-23 北京东方科诺科技发展有限公司 Perception maximization-based network burst hotspot perception method
CN109800351A (en) * 2018-12-29 2019-05-24 常熟理工学院 High-impact usage mining method in microblogging specific topics
CN111310058B (en) * 2020-03-27 2023-08-08 北京百度网讯科技有限公司 Information theme recommendation method, device, terminal and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8626768B2 (en) * 2010-01-06 2014-01-07 Microsoft Corporation Automated discovery aggregation and organization of subject area discussions
CN102316046B (en) * 2010-06-29 2016-03-30 国际商业机器公司 To the method and apparatus of the user's recommendation information in social networks
CN102117325A (en) * 2011-02-24 2011-07-06 清华大学 Method for predicting dynamic social network user behaviors
CN103246670B (en) * 2012-02-09 2016-02-17 深圳市腾讯计算机系统有限公司 Microblogging sequence, search, methods of exhibiting and system

Also Published As

Publication number Publication date
CN103678474A (en) 2014-03-26

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant