CN101552689A - User interest drift detection method and system based on network structure - Google Patents


Info

Publication number
CN101552689A
Authority
CN
China
Prior art keywords
user
interest
consistency
project
drift
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2009101404571A
Other languages
Chinese (zh)
Other versions
CN101552689B (en)
Inventor
陈恩红
杨杰
刘淇
宝腾飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN2009101404571A priority Critical patent/CN101552689B/en
Publication of CN101552689A publication Critical patent/CN101552689A/en
Application granted granted Critical
Publication of CN101552689B publication Critical patent/CN101552689B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention provides a user interest drift detection method and system based on network structure. The method comprises the following steps: constructing a global item network (GIN); detecting whether a user is a noise user; and, after removing noise users, applying a density-based segmentation algorithm to detect whether the user's interest has drifted and to locate the drift position. The detection method greatly reduces the amount of computation, detects user interest drift quickly, and prevents noise users from affecting the stability of a recommender system.

Description

User interest drift detection method and system based on network structure
Technical field
The present invention relates to the field of network media services, and in particular to a personalized recommendation method and system.
Background
As Internet resources continue to grow, problems such as information overload and information disorientation are becoming increasingly serious, and personalized recommender systems have therefore attracted wide attention. A personalized recommender system establishes a binary relation between users and information products and uses existing selection histories or similarity relations to mine the objects each user is potentially interested in, which relieves users, to some extent, of the pressure brought by the information explosion.
The foundation and core of a personalized recommender system is user interest modeling. User interest modeling acquires and maintains knowledge about a user's interests, needs, or habits, and produces a user model that represents the user's particular background knowledge, interests, or needs. In real life, however, a user's interests often change. Such interest drift makes user modeling much harder and also reduces the recommendation accuracy of the recommender system.
Existing methods that address interest drift without requiring explicit feedback fall into two main kinds. The first kind passively updates or adjusts the user interest model; these are mainly adaptive model-update methods, including the time window method, the forgetting model method, the long-term and short-term interest model method, and some other adaptive methods. The second kind uses an algorithm to actively detect interest drift and handle it accordingly; such methods generally have an interest drift detection mechanism, keep the original interest model essentially unchanged while no drift is detected, and rebuild or extensively adjust the interest model once drift is detected.
The drawback of the existing methods is that they are relatively complex and computationally expensive, cannot meet the real-time requirements of online systems, cannot effectively exclude noise interference, and are inefficient. A method that solves these problems is therefore needed.
Summary of the invention
The purpose of the present invention is to solve at least one of the above technical defects, in particular the high computational cost and the effect of noise interference on stability.
To achieve this purpose, the present invention proposes a user interest drift detection method based on network structure, comprising the following steps: constructing a global item network (GIN); detecting whether a user is a noise user; and, after removing noise users, applying a density-based segmentation algorithm to detect whether the user's interest has drifted and where the drift occurs.
As one embodiment of the present invention, constructing the global item network GIN comprises the following steps: computing the similarity between items from the attributes of the items, $\mathrm{sim}(I_i, I_j) = \sum_{l=1}^{t} w_l \times \frac{|P_{i,l} \cap P_{j,l}|}{|P_{i,l} \cup P_{j,l}|}$, where $w_l$ is the weight of the $l$-th attribute of an item and represents how much that attribute contributes to item similarity, $|P_{i,l} \cap P_{j,l}|$ is the number of values in the intersection of attribute $P_{i,l}$ of item $I_i$ and attribute $P_{j,l}$ of item $I_j$, and $|P_{i,l} \cup P_{j,l}|$ is the number of values in their union; and comparing the computed similarity with a preset threshold S0, where an edge is established between two items if their similarity is greater than S0 and no edge is established otherwise, thereby constructing the global item network GIN.
As one embodiment of the present invention, detecting whether a user is a noise user comprises the following steps: selecting unprocessed users in turn, in order of user id, as the target user; extracting the target user's user visited item network (UVIN) from the global item network GIN; computing the density from the edges of the UVIN, and computing the continuity from the chronological order in which the user visited the items; and comparing the computed density and continuity with a preset density threshold D0 and continuity threshold C0, where the target user is a noise user if the density is less than D0 and the continuity is less than C0, and an ordinary user otherwise.
As one embodiment of the present invention, the density represents how concentrated the user's interest is and is computed as density = e/(v(v-1)/2) = 2e/(v(v-1)), where e is the number of edges in the UVIN and v is the number of items visited by the user. The continuity represents how persistent the user's interest is and is computed as continuity = e'/(v-1), where e' is the number of edges between adjacent items when the items are arranged in a queue in the order in which the user visited them, and v is the number of items visited by the user.
As one embodiment of the present invention, the density-based segmentation algorithm comprises the following steps: computing the density d_total of the subgraph formed by the input item sequence, recording the number of items item_num in the input sequence (the sequence length), and initializing the detection position i = L0; judging whether the detection position has not yet reached the L0-th position from the end, that is, whether i ≤ item_num-L0; computing the density d_left of the subgraph formed by the item sequence to the left of node i and the density d_right of the subgraph formed by the item sequence to its right; computing and recording the density cut increment of node i, T_i = (d_left+d_right)/2-d_total; after T_i has been computed for every node i satisfying i ≤ item_num-L0, finding the maximum T_max among all T_i, recording the corresponding node as P_max, and obtaining the left sequence S_l and the right sequence S_r that result from cutting at P_max; judging whether T_max is not less than the preset interest drift threshold T0 and whether the lengths of S_l and S_r are not less than L0; judging whether S_l and S_r are independent interests; and, if all of the above conditions are satisfied, cutting the input sequence at P_max to obtain S_l and S_r, recording the cut node, and then calling the algorithm recursively on S_l and on S_r.
As one embodiment of the present invention, judging whether S_l and S_r are independent interests comprises the following steps: checking whether S_l is at the beginning of the whole sequence of items visited by the user and, if so, proceeding directly to the check on S_r; otherwise computing the density cut increment T_l = (d_l+d_l')/2-d_ll' between S_l and the adjacent subsequence S_l' produced by a previous cut, where d_l is the density of S_l, d_l' is the density of S_l', and d_ll' is the density of the sequence formed by merging S_l and S_l'; judging whether T_l is not less than the threshold T0 and, if so, continuing the check, otherwise concluding that S_l and S_r are not independent interests; checking whether S_r is at the end of the whole sequence of items visited by the user and, if so, concluding that S_l and S_r are independent interests, otherwise continuing the check; computing the density cut increment T_r = (d_r+d_r')/2-d_rr' between S_r and the adjacent subsequence S_r' produced by a previous cut, where d_r is the density of S_r, d_r' is the density of S_r', and d_rr' is the density of the sequence formed by merging S_r and S_r'; and checking whether T_r is not less than the threshold T0, where S_l and S_r are independent interests if so and are not independent interests otherwise, after which the function returns.
The present invention also proposes a personalized recommendation method that uses the network-structure-based user interest drift detection method, comprising an offline processing part and an online recommendation part. The offline processing part comprises global network construction, user data preprocessing, and recommender system construction.
As one embodiment of the present invention, the user data preprocessing comprises the following steps: judging whether all users have been preprocessed; selecting unprocessed users in turn, in order of user id, as the target user; checking whether the target user is a noise user and, if so, removing the target user, otherwise continuing; checking whether the target user is an interest drift user and, if so, dividing the target user's visit data into segments according to the positions where drift occurs, each segment representing one class of the user's interests; and passing the segmented user data to recommender system construction.
As one embodiment of the present invention, the online recommendation part comprises the following steps: checking the user data; if the data is noise data, sending the user a friendly prompt that asks the user to choose according to their own interests, and recommending popular items to the user; if the user is an interest drift user, obtaining the user's latest interest with the density-based segmentation algorithm and producing recommended items from this latest interest in combination with the constructed recommender system; and updating the user data, recording the user if the user is a noise user, and recording the user and the drift positions if the user is an interest drift user.
In another aspect, the present invention proposes a personalized recommendation system that uses the network-structure-based user interest drift detection method, comprising a GIN construction module, a noise user detection module, an interest drift user detection module, a parcel program module, an online recommendation module, a recommender system construction module, and a recommender system update module. The GIN construction module constructs the global item network GIN from the similarity between items and provides the network information to the noise user detection module and the interest drift user detection module. The noise user detection module detects, from the density and the continuity, whether a user is a noise user and provides the result to the parcel program module and the online recommendation module. The interest drift user detection module uses the DBS algorithm to detect whether a user's interest drifts and where, and provides the result to the parcel program module and the online recommendation module. The parcel program module preprocesses the user data and provides the result to the recommender system construction module. The online recommendation module checks and records user data and provides feedback update information to the recommender system update module. The recommender system construction module builds the recommender system from the data provided by the parcel program module. The recommender system update module keeps the constructed recommender system up to date.
By constructing a similarity network from the items visited by users, removing noise users on the basis of the density and continuity of that network, and detecting whether and when interest drift occurs from changes in density, the present invention greatly reduces the amount of computation, detects interest drift quickly, and prevents noise user data from affecting the stability of the recommender system.
Additional aspects and advantages of the invention are given in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow chart of the network-structure-based user interest drift detection method according to an embodiment of the invention;
Fig. 2 is a schematic diagram of an example of constructing the global item network GIN according to an embodiment of the invention;
Fig. 3 is a schematic diagram of two examples of the user visited item network UVIN for the example shown in Fig. 2;
Fig. 4 is a flow chart of the density-based segmentation algorithm according to an embodiment of the invention;
Fig. 5 is a framework diagram of the personalized recommendation method that uses the network-structure-based user interest drift detection method according to an embodiment of the invention;
Fig. 6 is a flow chart of user data preprocessing according to an embodiment of the invention;
Fig. 7 is a structure diagram of the personalized recommendation system that uses the network-structure-based user interest drift detection method according to an embodiment of the invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the drawings, in which identical or similar reference labels denote identical or similar elements, or elements with identical or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it.
The present invention constructs a similarity network from the items visited by users, removes noise users on the basis of the density and continuity of that network, and detects whether and when interest drift occurs from changes in density. This greatly reduces the amount of computation, detects interest drift quickly, and prevents noise user data from affecting the stability of the recommender system.
As shown in Fig. 1, the network-structure-based user interest drift detection method of an embodiment of the invention comprises the following steps:
Step S101: compute the similarity between items from the attributes of the items. If the similarity between two items is greater than a preset threshold S0, add an edge between the two items; otherwise add no edge. The global item network GIN is constructed in this way.
As one embodiment of the present invention, the item set of a recommender system with n items is written $I = \{I_1, I_2, \ldots, I_n\}$. Each item $I_i$ ($i = 1, 2, \ldots, n$) has several attributes and can be written $I_i = \langle P_{i,1}, P_{i,2}, \ldots, P_{i,t} \rangle$, where $P_{i,j}$ ($1 \le j \le t$) is the $j$-th attribute of item $I_i$ and $t$ is the number of attributes of $I_i$. Each attribute $P_{i,j}$ can in turn take a number of values, and that number may vary, so the $j$-th attribute of item $I_i$ can be written $P_{i,j} = \{k_1, k_2, \ldots, k_n, \ldots\}$, where $k_n$ is one value of attribute $P_{i,j}$. Each attribute $P_{i,j}$ has a corresponding weight $w_j$, which represents how much the attribute contributes to item similarity. The similarity between items $I_i$ and $I_j$ can therefore be expressed as a weighted sum:
$$\mathrm{sim}(I_i, I_j) = \sum_{l=1}^{t} w_l \times \frac{|P_{i,l} \cap P_{j,l}|}{|P_{i,l} \cup P_{j,l}|}$$
where $w_l$ is the weight of the $l$-th attribute, $|P_{i,l} \cap P_{j,l}|$ is the number of values in the intersection of attributes $P_{i,l}$ and $P_{j,l}$, and $|P_{i,l} \cup P_{j,l}|$ is the number of values in their union.
As one embodiment of the present invention, the films in the following table are used as an example to describe how the similarity between items is computed and how the global item similarity network GIN is constructed.

| Node | Film | Genre | Writer | Director |
|---|---|---|---|---|
| A | "When Harry Met Sally" | Comedy, Romance | Nora Ephron | Rob Reiner |
| B | "Titanic" | Romance | James Cameron | James Cameron |
| C | "City of Angels" | Romance | W.Wender, P.Handke | B.Silberling |
| D | "Lolita" | Drama, Romance | V.Nabokov, S.Schiff | A.Lyne |
| E | "Notting Hill" | Comedy, Romance | Richard Curtis | Roger Michell |
| F | "Interview with the Vampire" | Drama, Horror | Anne Rice | Neil Jordan |
| G | "Braveheart" | Drama, War | Randall Wallace | Mel Gibson |
| H | "Stargate" | Action, Adventure, Sci-Fi | D.Devlin, R.Emmerich | R.Emmerich |
| I | "E.T. the Extra-Terrestrial" | Children's, Fantasy, Sci-Fi | Mellissa Mathison | S.Spielberg |
| J | "Star Wars: Episode V" | Action, Sci-Fi, War | G.Lucas, L.Brackett | I.Kershner |
| K | "Transformers" | Action, Sci-Fi | R.Orci, A.Kurtzman | Michael Bay |
| L | "The Matrix" | Action, Sci-Fi | A., L.Wachowski | A., L.Wachowski |
| M | "Scream 2" | Horror, Thriller | Kevin Williamson | Wes Craven |
| N | "8MM" | Thriller | Andrew Kevin Walker | Joel Schumacher |
| O | "Basic Instinct" | Mystery, Thriller | Joe Eszterhas | Paul Verhoeven |
| P | "Office Killer" | Thriller | T.Kalin, E.MacAdam | Cindy Sherman |
| Q | "Diamonds" | Mystery | A.A.Katz | J.M.Asher |

The film attributes are genre, writer, and director, denoted G, W, and D respectively, and the corresponding attribute weights can be set to 1, 0.5, and 0.5. For films $M_i$ and $M_j$, the similarity is therefore

$$\mathrm{sim}(M_i, M_j) = \frac{|G_i \cap G_j|}{|G_i \cup G_j|} + 0.5 \times \frac{|W_i \cap W_j|}{|W_i \cup W_j|} + 0.5 \times \frac{|D_i \cap D_j|}{|D_i \cup D_j|}.$$

For example, the similarity between "Titanic" and "City of Angels" is $\mathrm{sim}(M_B, M_C) = \frac{1}{1} + 0.5 \times \frac{0}{2} + 0.5 \times \frac{0}{2} = 1$, and the similarity between "Titanic" and "The Matrix" is $\mathrm{sim}(M_B, M_L) = \frac{0}{2} + 0.5 \times \frac{0}{2} + 0.5 \times \frac{0}{2} = 0$. If the similarity threshold is set to 0.2, an edge is established between "Titanic" and "City of Angels", and no edge is established between "Titanic" and "The Matrix". The similarity between the other films is computed in the same way and compared with the threshold to decide whether to establish an edge. This yields the global film similarity network shown in Fig. 2, in which the letters correspond one to one with the node letters in the table. It should be understood that the above embodiment is only illustrative and does not limit the scope of the invention.
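The similarity computation and edge rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `jaccard`, `similarity`, and `build_gin`, the dictionary layout of the items, and the representation of an edge as a `frozenset` of two node labels are all hypothetical choices. Only three films from the table are included.

```python
def jaccard(a, b):
    """|a ∩ b| / |a ∪ b| for two sets of attribute values (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(x, y, weights):
    """Weighted sum of per-attribute Jaccard ratios, following sim(I_i, I_j)."""
    return sum(w * jaccard(x[attr], y[attr]) for attr, w in weights.items())


def build_gin(items, weights, s0):
    """Return the GIN edge set: one edge per item pair with similarity > s0."""
    names = sorted(items)
    return {
        frozenset((a, b))
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if similarity(items[a], items[b], weights) > s0
    }


# Attribute weights from the film example: genre 1, writer 0.5, director 0.5.
WEIGHTS = {"genre": 1.0, "writer": 0.5, "director": 0.5}

# Three rows of the film table (nodes B, C, and L).
MOVIES = {
    "B": {"genre": {"Romance"}, "writer": {"James Cameron"},
          "director": {"James Cameron"}},
    "C": {"genre": {"Romance"}, "writer": {"W.Wender", "P.Handke"},
          "director": {"B.Silberling"}},
    "L": {"genre": {"Action", "Sci-Fi"}, "writer": {"A.Wachowski", "L.Wachowski"},
          "director": {"A.Wachowski", "L.Wachowski"}},
}

gin_edges = build_gin(MOVIES, WEIGHTS, s0=0.2)
print(similarity(MOVIES["B"], MOVIES["C"], WEIGHTS))  # similarity of B and C
print(gin_edges)                                      # edges above the threshold
```

With the threshold 0.2 used in the example, only the pair B and C is linked in this three-film subset, which agrees with the two similarities worked out in the text.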
Step S102: extract the user visited item network UVIN from the global item network GIN, compute the density from the edges of the UVIN, and compute the continuity from the chronological order in which the user visited the items. If the user's density is less than the preset threshold D0 and the user's continuity is less than the preset threshold C0, the user is a noise user and the user's visit data is noise data; otherwise the user is a non-noise user.
As one embodiment of the present invention, the user visited item network UVIN is a subnetwork of the global item network GIN. The vertices of the UVIN are the vertices that represent the items visited by the user, and the edges of the UVIN are the edges between the corresponding vertices in the GIN. Let e be the number of edges in the UVIN and v the number of items visited by the user. Arrange the items in a queue in the order in which the user visited them, count only the edges between adjacent items in the queue, and call that number e'. The density is then defined as density = e/(v(v-1)/2) = 2e/(v(v-1)), and the continuity as continuity = e'/(v-1).
As one embodiment of the present invention, for the MovieLens data set the density threshold is set to D0 = 0.25 and the continuity threshold to C0 = 0.25.
Specifically, the global film similarity network shown in Fig. 2 is used as an example to illustrate how density and continuity are computed. If user ID1 visits items in the order G → E → B → F → C → D → A → M, the UVIN shown in Fig. 3a can be extracted, with v = 8, e = 14, and e' = 4, so the density of user ID1 is 0.5 and the continuity is 0.57. If user ID2 visits items in the order D → P → K → F → Q → M → H → J, the UVIN shown in Fig. 3b can be extracted, with v = 8, e = 3, and e' = 0, so the density of user ID2 is 0.11 and the continuity is 0. User ID1 is therefore an ordinary user and user ID2 is a noise user. As the example shows, the density and continuity of an ordinary user's visited item sequence are relatively high, whereas those of a noise user are low or even close to 0. It should be understood that the above embodiment is only illustrative and does not limit the scope of the invention.
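The two measures and the noise test can be sketched as follows. The formulas are those given above; the function names, the `frozenset` edge representation, and the small graph at the end are hypothetical, and the sketch assumes each item appears at most once in a visit sequence.

```python
def density(e, v):
    """density = 2e / (v(v-1)); defined as 0.0 when fewer than two items."""
    return 2.0 * e / (v * (v - 1)) if v > 1 else 0.0


def continuity(e_adj, v):
    """continuity = e' / (v-1); defined as 0.0 when fewer than two items."""
    return e_adj / (v - 1) if v > 1 else 0.0


def uvin_measures(visits, gin_edges):
    """Compute (density, continuity) of a visit sequence against GIN edges."""
    v = len(visits)
    # e: edges of the GIN between any two visited items (the UVIN edges).
    e = sum(
        1
        for i in range(v)
        for j in range(i + 1, v)
        if frozenset((visits[i], visits[j])) in gin_edges
    )
    # e': edges between items that are adjacent in visiting order.
    e_adj = sum(
        1 for a, b in zip(visits, visits[1:]) if frozenset((a, b)) in gin_edges
    )
    return density(e, v), continuity(e_adj, v)


def is_noise_user(visits, gin_edges, d0=0.25, c0=0.25):
    """A user is noise when density < D0 and continuity < C0."""
    d, c = uvin_measures(visits, gin_edges)
    return d < d0 and c < c0


# Counts quoted in the text for users ID1 and ID2.
print(round(density(14, 8), 2), round(continuity(4, 8), 2))
print(round(density(3, 8), 2), round(continuity(0, 8), 2))

# Hypothetical toy graph: a, b, c are mutually similar; d and e are isolated.
toy_edges = {frozenset(p) for p in (("a", "b"), ("b", "c"), ("a", "c"))}
print(is_noise_user(["a", "b", "c", "d"], toy_edges))
print(is_noise_user(["a", "d", "b", "e"], toy_edges))
```

The default thresholds are the MovieLens values mentioned above; other data sets would presumably need their own settings.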
Step S103: after noise users have been excluded, use the density-based segmentation algorithm (Density Based Segmentation, DBS) to detect whether the user's interest has drifted and where the drift occurs. The input of the DBS algorithm is the sequence of items visited by the user, and the output is whether the user's interest has drifted and the positions where the drift occurs. Specifically, as shown in Fig. 4, the algorithm comprises the following steps:
Step S401: compute the density d_total of the subgraph formed by the input item sequence, and record the number of items item_num in the input sequence, that is, the sequence length. Initialize the detection position i = L0. Here, the subgraph formed by the input item sequence is the subgraph of the GIN whose vertices are the items in the input S and whose edges are the edges between the corresponding vertices in the global item network GIN. The detection position i means that the DBS algorithm starts from the i-th item of the input sequence and then moves backward one position at a time. L0 is the minimum length of a subsequence after cutting; in an embodiment of the invention L0 = 10, and the first L0 and last L0 items are not considered as cut positions.
Step S402: judge whether the detection position has not yet reached the L0-th position from the end, that is, whether i ≤ item_num-L0. If so, continue with step S403; otherwise go to step S406.
Step S403: compute the density d_left of the subgraph formed by the item sequence to the left of node i and the density d_right of the subgraph formed by the item sequence to its right.
Step S404: compute the density cut increment T_i of node i and record the result. The density cut increment T_i is the increase in the mean density of the subgraphs formed by the left and right item sequences after cutting at point i, relative to the density without cutting, that is, T_i = (d_left+d_right)/2-d_total.
Step S405: move the detection position backward by one, that is, i = i+1, and repeat the above detection process until i > item_num-L0.
Step S406: find the maximum among all T_i, denoted T_max, record the corresponding node as P_max, and obtain the left sequence S_l and the right sequence S_r that result from cutting at P_max.
Step S407: judge whether T_max is not less than T0 and whether the lengths of the left and right sequences S_l and S_r are not less than L0. If both conditions hold, continue with step S408; otherwise the function returns. The threshold T0 decides whether interest drift has occurred; in an embodiment of the invention, T0 = 0.25 for the MovieLens data set. If T_max is not less than this threshold, the user's interest has drifted at that point; otherwise it has not. L0 is the minimum length of a cut sequence; if a subsequence after cutting would be shorter than L0, the cut is abandoned.
The following table lists the visited item sequences of an ordinary user and of a drift user.

| User | Type | Visited items (in chronological order) | T_max | Drift position |
|---|---|---|---|---|
| ID1 | Ordinary user | G→E→B→C→D→A→F→M | 0.08 | No drift |
| ID3 | Drift user | C→A→E→F→M→N→O→P | 0.46 | At F → M |

The T_max of the ordinary user is small, which means that cutting increases the density only slightly, so there is no interest drift. The T_max of the drift user reaches 0.46, which means that cutting the sequence at the position "F → M" increases the density substantially, so it is very likely that this user's interest drifted at that point.
Step S408: judge whether the left and right item subsequences S_l and S_r produced by the cut are independent interests. If so, continue with step S409; otherwise the function returns.
Judging whether the left and right item subsequences S_l and S_r are independent interests consists of two checks:
First, check whether the left subsequence S_l is at the beginning of the whole sequence of items visited by the user. If so, go directly to the second check. Otherwise, to the left of S_l there must be an adjacent subsequence S_l' produced by a previous cut. Compute the density cut increment between S_l and S_l', T_l = (d_l+d_l')/2-d_ll', where d_l is the density of S_l, d_l' is the density of S_l', and d_ll' is the density of the sequence formed by merging S_l and S_l'. If T_l ≥ T0, go on to the second check; otherwise conclude that the two subsequences are not independent interests.
Second, check whether the right subsequence S_r is at the end of the whole sequence of items visited by the user. If so, conclude that the two subsequences are independent interests. Otherwise, to the right of S_r there must be an adjacent subsequence S_r' produced by a previous cut. Compute the density cut increment between S_r and S_r', T_r = (d_r+d_r')/2-d_rr', where d_r is the density of S_r, d_r' is the density of S_r', and d_rr' is the density of the sequence formed by merging S_r and S_r'. If T_r ≥ T0, the two subsequences are independent interests; otherwise they are not, and the function returns.
It should be understood that the above method of judging whether the left and right subsequences are independent interests is only an illustrative embodiment and does not limit the scope of the invention.
Step S409: if the conditions in steps S407 and S408 are all satisfied, cut the input sequence at P_max to obtain S_l and S_r, record the cut node, and then call the algorithm recursively on the left and right sequences produced by the cut.
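Steps S401 to S409 can be sketched as a recursive Python function. This is a reading of the steps above under explicit assumptions rather than a definitive implementation: a cut position p is taken to mean the boundary between `seq[p-1]` and `seq[p]`, candidate positions are restricted so that both halves already have at least L0 items (which makes the length check of S407 automatic), items are assumed to be distinct, and all names (`dbs`, `cut_increment`, `independent`, `bounds`) are hypothetical. Density is recounted from scratch for clarity.

```python
from bisect import bisect_left, insort
from itertools import combinations


def density(items, edges):
    """Density 2e / (v(v-1)) of the subgraph induced by `items`."""
    v = len(items)
    if v < 2:
        return 0.0
    e = sum(1 for a, b in combinations(items, 2) if frozenset((a, b)) in edges)
    return 2.0 * e / (v * (v - 1))


def cut_increment(left, right, edges):
    """T = (d_left + d_right) / 2 - d_merged."""
    merged = density(left + right, edges)
    return (density(left, edges) + density(right, edges)) / 2 - merged


def dbs(seq, edges, L0=10, T0=0.25):
    """Return the sorted cut positions found in `seq` (empty list: no drift)."""
    bounds = [0, len(seq)]  # segment boundaries recorded so far

    def independent(lo, p, hi):
        # S408, first check: S_l against the segment to its left, if any.
        if lo > 0:
            prev = bounds[bisect_left(bounds, lo) - 1]
            if cut_increment(seq[prev:lo], seq[lo:p], edges) < T0:
                return False
        # S408, second check: S_r against the segment to its right, if any.
        if hi < len(seq):
            nxt = bounds[bisect_left(bounds, hi) + 1]
            if cut_increment(seq[p:hi], seq[hi:nxt], edges) < T0:
                return False
        return True

    def segment(lo, hi):
        best_t, best_p = None, None
        # S402 to S405: scan every position that leaves L0 items on each side.
        for p in range(lo + L0, hi - L0 + 1):
            t = cut_increment(seq[lo:p], seq[p:hi], edges)
            if best_t is None or t > best_t:
                best_t, best_p = t, p  # S406: keep the maximum
        # S407: the increment must reach T0 (length is enforced by the range).
        if best_p is None or best_t < T0:
            return
        if not independent(lo, best_p, hi):
            return
        insort(bounds, best_p)  # S409: record the cut, then recurse
        segment(lo, best_p)
        segment(best_p, hi)

    segment(0, len(seq))
    return bounds[1:-1]


def groups(*sizes):
    """Hypothetical test data: consecutive fully connected item groups."""
    seq, edges, start = [], set(), 0
    for size in sizes:
        members = list(range(start, start + size))
        seq += members
        edges |= {frozenset(p) for p in combinations(members, 2)}
        start += size
    return seq, edges


seq, edges = groups(4, 4, 4)
print(dbs(seq, edges, L0=2))  # cut positions for three groups of four items
```

On the toy sequence, the items of each group are linked only to each other, so the sketch places cuts exactly at the group boundaries. `L0=2` is used only because the toy sequence is short; the embodiment above sets L0 = 10.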
As one embodiment of the present invention, for an item sequence of length N, T_i is computed point by point from left to right. Each move adds one new point i to the left part and removes that point from the right part. When counting edges, the left side therefore only needs to check whether edges exist between point i and the i-1 points before it, and the right side only needs to check whether edges exist between point i and the N-i points after it, so each move requires N-1 checks. Since T_i has to be computed for N-2×L0 points, the computational complexity of the DBS algorithm is (N-1)×(N-2×L0) = O(N²).
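The incremental edge counting described in this paragraph can be sketched as follows. The sketch assumes the same conventions as before (hypothetical names, edges as `frozenset` pairs, split position p dividing the sequence into `seq[:p]` and `seq[p:]`, distinct items). Each step moves one item from the right part to the left part and adjusts the two edge counts with N-1 edge lookups, instead of recounting both subgraphs.

```python
from itertools import combinations


def scan_cut_increments(seq, edges, L0):
    """Return {p: T_p} for every split position p with L0 <= p <= len(seq) - L0."""
    n = len(seq)

    def linked(a, b):
        return frozenset((a, b)) in edges

    def dens(e, v):
        return 2.0 * e / (v * (v - 1)) if v > 1 else 0.0

    # One full count for the whole sequence gives d_total.
    total_e = sum(linked(a, b) for a, b in combinations(seq, 2))
    d_total = dens(total_e, n)

    left_e, right_e = 0, total_e  # the left part starts empty
    increments = {}
    for p in range(1, n):
        moved = seq[p - 1]  # item that crosses from the right part to the left
        # The left part gains the edges between `moved` and the items before it.
        left_e += sum(linked(moved, x) for x in seq[:p - 1])
        # The right part loses the edges between `moved` and the items after it.
        right_e -= sum(linked(moved, x) for x in seq[p:])
        if L0 <= p <= n - L0:
            increments[p] = (dens(left_e, p) + dens(right_e, n - p)) / 2 - d_total
    return increments


# Hypothetical toy data: two fully connected groups of five items each.
seq = list(range(10))
edges = {
    frozenset(pair)
    for group in (range(0, 5), range(5, 10))
    for pair in combinations(group, 2)
}
increments = scan_cut_increments(seq, edges, L0=2)
best = max(increments, key=increments.get)
print(best, round(increments[best], 3))
```

The scan visits every position from 1 to N-1 so that the running counts stay correct, but it records T only for positions that leave at least L0 items on each side.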
The present invention also proposes a kind of user interest drift detection method that utilizes above-mentioned structure Network Based to the method that general personalized recommendation system is optimized, and comprises processed offline part and online recommendation part.As shown in Figure 5, be the commending system structure of the embodiment of the invention and the framework schematic diagram of online recommendation.
Wherein, main global network structure, user data preliminary treatment and the commending system be responsible for of processed offline part makes up, and general amount of calculation is bigger, and time loss is more.Particularly, as shown in Figure 6, preliminary treatment may further comprise the steps to user data:
Step S601: determine whether all users have been preprocessed. If so, end preprocessing; otherwise continue to step S602.
Step S602: in order of user id, select the next unprocessed user as the target user to be processed. The user id is the user's unique identifier; it may be an integer value or a character string, depending on the specific case.
Step S603: check whether the user is a noise user. If so, remove this user; otherwise continue to step S604. A noise user is a user, detected in step S102 above, whose user-visited item network has a consistency and a continuation degree that are both below the set thresholds.
Step S604: check whether this user is an interest-drift user. If so, continue to step S605; otherwise go to step S606. An interest-drift user is a user for whom the DBS algorithm in step S103 above detected interest drift; the positions at which the drift occurred are recorded.
Step S605: according to the recorded positions at which interest drift occurred, divide the user's access data into segments, each segment representing one class of the user's interest.
Step S606: deliver the segmented user data, segment by segment, to the recommendation system construction part.
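Steps S601 to S606 can be sketched as a single loop. This is an illustrative outline only: `is_noise` and `drift_positions` are hypothetical callbacks standing in for the detection of steps S102 and S103, user data is assumed to be a mapping from user id to the list of visited items in time order, and user ids are assumed to be mutually comparable so they can be sorted.

```python
def preprocess_users(user_sequences, is_noise, drift_positions):
    """Return (user_id, segment) pairs to hand to recommender construction.

    user_sequences: dict mapping user id -> list of visited items, in time order.
    is_noise: callable(user_id, seq) -> bool (hypothetical stand-in for S102).
    drift_positions: callable(user_id, seq) -> list of cut indices
        (hypothetical stand-in for the DBS algorithm of S103).
    """
    segments = []
    for uid in sorted(user_sequences):               # S601/S602: users in id order
        seq = user_sequences[uid]
        if is_noise(uid, seq):                       # S603: drop noise users
            continue
        cuts = sorted(drift_positions(uid, seq))     # S604: empty when no drift
        bounds = [0] + cuts + [len(seq)]
        for start, end in zip(bounds, bounds[1:]):   # S605: one segment per interest
            segments.append((uid, seq[start:end]))   # S606: hand over each segment
    return segments
```

A user with no detected drift yields a single segment containing the whole access sequence.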
As one embodiment of the present invention, the recommendation system to be built may be any general recommendation system, such as a content-based or collaborative-filtering system, so the method is broadly applicable.
The online recommendation part uses the methods described in steps S102 and S103 above to examine the user data. If the user is a noise user, a friendly prompt is sent asking the user to select items according to his or her own interests, and popular items are recommended to the user. If the user is an interest-drift user, the user's current, most recent interest is obtained with the DBS algorithm, and recommended items are produced from this latest interest in combination with the recommendation system construction part. At the same time, the user data set needs to be updated: a noise user is recorded as such, and for an interest-drift user both the user and the positions at which drift occurred are recorded. In this way, not all users need to be reprocessed the next time the recommendation system is built.
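The online branch logic can be sketched as follows. Everything named here is hypothetical: `is_noise`, `drift_positions`, and `recommend_for` are callbacks standing in for steps S102, S103, and the underlying recommender, and `registry` is an assumed in-memory record of noise users and drift positions. The handling of a user who is neither a noise user nor an interest-drift user is not specified in the text; the sketch assumes the whole sequence is used.

```python
def recommend_online(user_id, seq, is_noise, drift_positions,
                     recommend_for, popular_items, registry):
    """Return a dict with an optional prompt and a list of recommended items."""
    if is_noise(user_id, seq):
        registry["noise"].add(user_id)               # record the noise user
        return {
            "prompt": "Please pick items that match your own interests.",
            "items": list(popular_items),            # fall back to popular items
        }
    cuts = sorted(drift_positions(user_id, seq))
    if cuts:
        registry["drift"][user_id] = cuts            # record user and drift positions
        latest = seq[cuts[-1]:]                      # segment after the last drift
    else:
        latest = seq                                 # assumption: no drift, use everything
    return {"prompt": None, "items": recommend_for(latest)}
```

Keeping the noise and drift records in `registry` mirrors the stated goal that not every user has to be reprocessed when the recommender is rebuilt.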
Fig. 7 is a structural diagram of a personalized recommendation system that uses the user interest drift detection method based on network structure according to an embodiment of the present invention. The system comprises the following modules: a GIN building module 701, a noise user detection module 702, an interest drift user detection module 703, a parcel program module 704, an online recommendation module 705, a recommendation system building module 706, and a recommendation system update module 707. The GIN building module 701 builds the global item network GIN according to the similarity between items and provides the network information to the noise user detection module 702 and the interest drift user detection module 703. The noise user detection module 702 detects, according to consistency and continuation degree, whether a user is a noise user, and provides the detection result to the parcel program module 704 and the online recommendation module 705. The interest drift user detection module 703 uses the DBS algorithm to detect whether a user's interest drifts and where the drift occurs, and provides the detection result to the parcel program module 704 and the online recommendation module 705. The parcel program module 704 preprocesses the user data and provides the result to the recommendation system building module 706. The online recommendation module 705 examines and records user data and provides feedback update information to the recommendation system update module 707. The recommendation system building module 706 builds the recommendation system from the data provided by the parcel program module 704. The recommendation system update module 707 updates the already-built recommendation system in a timely manner.
The present invention builds a similarity network from the items that users visit, removes noise users on the basis of the consistency and continuation degree of this network, and detects whether and when interest drift occurs from changes in consistency. This greatly reduces the amount of computation, detects interest drift quickly, and avoids the impact of noise user data on the stability of the recommendation system.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will appreciate that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principles and spirit of the invention; the scope of the invention is defined by the appended claims and their equivalents.

Claims (20)

1. A user interest drift detection method based on network structure, characterized by comprising the following steps:
building a global item network GIN;
detecting whether a user is a noise user;
after removing noise users, using a consistency-based cutting algorithm to detect whether the user's interest drifts and the positions at which drift occurs.
2. The user interest drift detection method based on network structure according to claim 1, characterized in that the global item network GIN is a global network built from all items according to their similarity, with all items as vertices and the similarity relations between items as edges.
3. The user interest drift detection method based on network structure according to claim 1, characterized in that building the global item network comprises the following steps:
computing the similarity between items from the attributes of the items, $sim(I_i, I_j) = \sum_{l=1}^{t} w_l \times \frac{|P_{i,l} \cap P_{j,l}|}{|P_{i,l} \cup P_{j,l}|}$, where w_l is the weight of the l-th attribute of an item and represents how much that attribute contributes to item similarity, |P_{i,l} ∩ P_{j,l}| is the number of values in the intersection of attribute P_{i,l} of item I_i and attribute P_{j,l} of item I_j, and |P_{i,l} ∪ P_{j,l}| is the number of values in their union;
comparing the computed similarity between items with a preset threshold S0; if the similarity is greater than S0, an edge is set up between the two items, otherwise no edge is set up, thereby building the global item network GIN.
4. The user interest drift detection method based on network structure according to claim 1, characterized in that detecting whether a user is a noise user comprises the following steps:
in order of user id, selecting unprocessed users in turn as the target user;
extracting the target user's user-visited item network UVIN from the global item network GIN;
computing the consistency from the edges of the UVIN, and computing the continuation degree from the chronological order in which the user visited the items;
comparing the computed consistency and continuation degree with a preset consistency threshold D0 and a preset continuation degree threshold C0; if the consistency is less than D0 and the continuation degree is less than C0, the target user is a noise user, otherwise the target user is an ordinary user.
5. The user interest drift detection method based on network structure according to claim 1, characterized in that the user id is the user's unique identifier and may be an integer value or a character string.
6. The user interest drift detection method based on network structure according to claim 1, characterized in that the user-visited item network UVIN is a subnetwork of the global item network GIN, the vertices of the UVIN are the vertices representing the items the user visited, and the edges of the UVIN correspond to the edges between the respective vertices in the GIN.
7. The user interest drift detection method based on network structure according to claim 1, characterized in that the consistency represents the degree of concentration of the user's interest and is computed as density = e / (v(v-1)/2) = 2e / (v(v-1)), where e is the number of edges in the UVIN and v is the number of items the user visited.
8. The user interest drift detection method based on network structure according to claim 1, characterized in that the continuation degree represents how persistent the user's interest is and is computed as continuity = e' / (v-1), where e' is the number of edges between adjacent items only, when the items are arranged in a queue in the chronological order of the user's visits, and v is the number of items the user visited.
9. The user interest drift detection method based on network structure according to claim 1, characterized in that the noise user is a user who has registered with and uses the recommendation system but visits items at random for amusement rather than according to his or her own interests.
10. The user interest drift detection method based on network structure according to claim 1, characterized in that the consistency-based cutting algorithm comprises the following steps:
computing the consistency d_total of the subgraph formed by the input item sequence, recording the number of items item_num in the input sequence (that is, the sequence length), and initializing the position of the detection node to i = L0;
judging whether the detection position has not yet reached the L0-th position from the end, that is, whether i ≤ item_num - L0;
computing the consistency d_left of the subgraph formed by the item sequence to the left of node i and the consistency d_right of the subgraph formed by the item sequence to its right;
computing and recording the consistency cutting increment T_i of node i;
after T_i has been computed for every node i satisfying i ≤ item_num - L0, finding the maximum T_max among all T_i, recording the node corresponding to T_max as P_max, and obtaining the left sequence S_l and the right sequence S_r that result from cutting at P_max;
judging whether T_max is not less than a preset interest-drift threshold T0, and whether the lengths of S_l and S_r are not less than L0;
judging whether S_l and S_r are independent interests;
if all of the above conditions are satisfied, cutting the input sequence at P_max to obtain S_l and S_r, recording the cut node, and then recursively calling this algorithm on S_l and S_r respectively.
11. The user interest drift detection method based on network structure according to claim 1, characterized in that the subgraph formed by the input item sequence is a subgraph of the global item network GIN, with the input items as vertices and the edges between the corresponding vertices in the GIN as edges.
12. The user interest drift detection method based on network structure according to claim 1, characterized in that L0 is the minimum length of a subsequence after cutting.
13. The user interest drift detection method based on network structure according to claim 1, characterized in that the consistency cutting increment T_i is the increase of the mean consistency of the subgraphs formed by the two item sequences obtained by cutting at node i, relative to the consistency without cutting, that is, T_i = (d_left + d_right)/2 - d_total.
14. The user interest drift detection method based on network structure according to claim 1, characterized in that judging whether S_l and S_r are independent interests comprises the following steps:
checking whether S_l lies at the beginning of the user's whole visited-item sequence; if not, continuing to the next step; otherwise skipping the following two steps and directly checking whether S_r lies at the end of the sequence;
computing the consistency cutting increment T_l = (d_l + d_l')/2 - d_ll' between S_l and the subsequence S_l' that is adjacent to it and was produced by an earlier cut, where d_l is the consistency of S_l, d_l' is the consistency of S_l', and d_ll' is the consistency of the sequence formed by merging S_l and S_l';
judging whether T_l is not less than the threshold T0; if so, continuing the check; otherwise concluding that S_l and S_r are not independent interests;
checking whether S_r lies at the end of the user's whole visited-item sequence; if so, concluding that S_l and S_r are independent interests; otherwise continuing the check;
computing the consistency cutting increment T_r = (d_r + d_r')/2 - d_rr' between S_r and the subsequence S_r' that is adjacent to it and was produced by an earlier cut, where d_r is the consistency of S_r, d_r' is the consistency of S_r', and d_rr' is the consistency of the sequence formed by merging S_r and S_r';
checking whether T_r is not less than the threshold T0; if so, S_l and S_r are independent interests, otherwise they are not independent interests, and the function returns.
15. The user interest drift detection method based on network structure according to claim 1, characterized in that the complexity of the consistency-based cutting algorithm is O(N^2).
16. A personalized recommendation method using the user interest drift detection method based on network structure, characterized by comprising an offline processing part and an online recommendation part.
17. The personalized recommendation method using the user interest drift detection method based on network structure according to claim 16, characterized in that the offline processing part comprises global network construction, user data preprocessing, and recommendation system construction.
18. The personalized recommendation method using the user interest drift detection method based on network structure according to claim 16, characterized in that the user data preprocessing comprises the following steps:
judging whether all users have been preprocessed;
in order of user id, selecting unprocessed users in turn as the target user;
checking whether the target user is a noise user; if so, removing the target user; otherwise continuing;
checking whether the target user is an interest-drift user; if so, dividing the target user's access data into segments according to the positions at which drift occurred, each segment representing one class of the user's interest;
delivering the segmented user data, segment by segment, to recommendation system construction.
19. The personalized recommendation method using the user interest drift detection method based on network structure according to claim 16, characterized in that the online recommendation part comprises the following steps:
examining the user data; if the user is a noise user, sending a friendly prompt asking the user to select items according to his or her own interests, and recommending popular items to the user; if the user is an interest-drift user, obtaining the user's current, most recent interest with the consistency-based cutting algorithm, and producing recommended items from this latest interest in combination with the construction of the recommendation system;
updating the user data; if the user is a noise user, recording this noise user; if the user is an interest-drift user, recording this interest-drift user and the positions at which drift occurred.
20. A personalized recommendation system using the user interest drift detection method based on network structure, characterized by comprising a GIN building module, a noise user detection module, an interest drift user detection module, a parcel program module, an online recommendation module, a recommendation system building module, and a recommendation system update module, wherein:
the GIN building module is used to build the global item network GIN according to the similarity between items and to provide the network information to the noise user detection module and the interest drift user detection module;
the noise user detection module is used to detect, according to consistency and continuation degree, whether a user is a noise user, and to provide the detection result to the parcel program module and the online recommendation module;
the interest drift user detection module is used to detect, with the DBS algorithm, whether a user's interest drifts and where the drift occurs, and to provide the detection result to the parcel program module and the online recommendation module;
the parcel program module is used to preprocess the user data and to provide the result to the recommendation system building module;
the online recommendation module is used to examine and record user data and to provide feedback update information to the recommendation system update module;
the recommendation system building module is used to build the recommendation system from the data provided by the parcel program module;
the recommendation system update module is used to update the already-built recommendation system in a timely manner.
CN2009101404571A 2009-05-15 2009-05-15 User interest drift detection method and system based on network structure Expired - Fee Related CN101552689B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009101404571A CN101552689B (en) 2009-05-15 2009-05-15 User interest drift detection method and system based on network structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009101404571A CN101552689B (en) 2009-05-15 2009-05-15 User interest drift detection method and system based on network structure

Publications (2)

Publication Number Publication Date
CN101552689A true CN101552689A (en) 2009-10-07
CN101552689B CN101552689B (en) 2011-11-02

Family

ID=41156697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009101404571A Expired - Fee Related CN101552689B (en) 2009-05-15 2009-05-15 User interest drift detection method and system based on network structure

Country Status (1)

Country Link
CN (1) CN101552689B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103823880A (en) * 2014-03-03 2014-05-28 国家认证认可监督管理委员会信息中心 Attribute weight-based method for calculating similarity between detection mechanisms
CN105893515A (en) * 2016-03-30 2016-08-24 腾讯科技(深圳)有限公司 Information processing method and server
CN109857857A (en) * 2019-01-17 2019-06-07 中国人民解放军国防科技大学 Method for detecting drift of user reading interest topic

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1848742A (en) * 2005-01-10 2006-10-18 三星电子株式会社 Contextual task recommendation system and method for determining user's context and suggesting tasks
CN101339562A (en) * 2008-08-15 2009-01-07 北京航空航天大学 Portal personalized recommendation service system introducing into interest model feedback and update mechanism
CN101339563B (en) * 2008-08-15 2010-06-02 北京航空航天大学 Interest model update method facing to odd discovery recommendation

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103823880A (en) * 2014-03-03 2014-05-28 国家认证认可监督管理委员会信息中心 Attribute weight-based method for calculating similarity between detection mechanisms
CN105893515A (en) * 2016-03-30 2016-08-24 腾讯科技(深圳)有限公司 Information processing method and server
CN105893515B (en) * 2016-03-30 2021-02-05 腾讯科技(深圳)有限公司 Information processing method and server
CN109857857A (en) * 2019-01-17 2019-06-07 中国人民解放军国防科技大学 Method for detecting drift of user reading interest topic
CN109857857B (en) * 2019-01-17 2020-11-20 中国人民解放军国防科技大学 Method for detecting drift of user reading interest topic

Also Published As

Publication number Publication date
CN101552689B (en) 2011-11-02

Similar Documents

Publication Publication Date Title
CN105069122B (en) A kind of personalized recommendation method and its recommendation apparatus based on user behavior
CN111444395B (en) Method, system and equipment for obtaining relation expression between entities and advertisement recall system
CN105354277A (en) Recommendation method and system based on recurrent neural network
US20090287687A1 (en) System and method for recommending venues and events of interest to a user
CN109933721B (en) Interpretable recommendation method integrating user implicit article preference and implicit trust
CN104077417B (en) People tag in social networks recommends method and system
CN106997358A (en) Information recommendation method and device
CN106156127A (en) Select the method and device that data content pushes to terminal
CN105187242B (en) A kind of user's anomaly detection method excavated based on variable-length pattern
CN103116639A (en) Item recommendation method and system based on user-item bipartite model
CN103106285A (en) Recommendation algorithm based on information security professional social network platform
CN103116588A (en) Method and system for personalized recommendation
CN111723292B (en) Recommendation method, system, electronic equipment and storage medium based on graph neural network
CN111259263A (en) Article recommendation method and device, computer equipment and storage medium
CN107274242A (en) A kind of Method of Commodity Recommendation based on association analysis algorithm
CN111061962A (en) Recommendation method based on user score analysis
CN107578292A (en) A kind of user's portrait constructing system
CN106168980A (en) Multimedia resource recommends sort method and device
CN104778237A (en) Individual recommending method and system based on key users
KR20100029581A (en) Recommended search terms providing system and method for each user and computer readable medium processing the method
CN106776873A (en) A kind of recommendation results generation method and device
CN111429161B (en) Feature extraction method, feature extraction device, storage medium and electronic equipment
CN103500228A (en) Similarity measuring method improved through collaborative filtering recommendation algorithm
CN106776859A (en) Mobile solution App commending systems based on user preference
CN104992348A (en) Method and device for displaying information

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20111102

Termination date: 20170515