CN109977299A - A recommendation algorithm fusing item popularity and expert coefficient - Google Patents

Publication number
CN109977299A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910128705.4A
Other languages
Chinese (zh)
Other versions
CN109977299B (en)
Inventor
宋小磊
薛妍
王宾
陈春芳
贺小伟
侯榆青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest University
Original Assignee
Northwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwest University
Priority to CN201910128705.4A
Publication of CN109977299A
Application granted
Publication of CN109977299B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/251 Fusion techniques of input or preprocessed data

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a recommendation algorithm that fuses item popularity and an expert coefficient. The algorithm introduces user background as the feature used to build similar subgroups; within each similar subgroup, an item popularity coefficient and an expert opinion coefficient are introduced to revise the original rating matrix; for a target user, the recommendation neighbors are computed within the subgroup; and suitable items are recommended to the target user according to the neighbors' ratings. The algorithm not only improves the recommendation accuracy of collaborative filtering but also reduces its computational cost, and it has significant reference and application value in the field of personalized recommender systems.

Description

A recommendation algorithm fusing item popularity and expert coefficient
Technical field
The invention belongs to the field of personalized recommender systems and relates to a collaborative filtering recommendation algorithm that is based on user background and fuses item popularity with an expert coefficient.
Background art
With the development of the Internet, the volume of information has grown exponentially. Faced with this explosion of information, users do not know where to start, which leads to problems such as low information utilization and the needless loss of useful information. How to filter the content that users really want out of massive amounts of information is therefore a major problem facing business and technology today. Since they were proposed in the last century, recommender systems have been the main means of solving this kind of problem and play an increasingly important role in alleviating the effects of information overload. After long-term research, the algorithms applied in recommender systems fall mainly into content-based recommendation algorithms, collaborative filtering recommendation algorithms, hybrid recommendation algorithms, popularity-based recommendation algorithms, and advanced non-traditional recommendation algorithms. Among these, collaborative filtering models user preferences and item characteristics from historical records such as ratings, clicks, and purchases. Because it is adaptable, logically simple, and easy to implement, it is widely used in industry. Although collaborative filtering performs better in practice than other algorithms, it has defects of its own, such as cold start, data sparsity, and scalability problems. Collaborative filtering builds recommendations by finding patterns in user behavior once some information about the user is already known, so data sparsity inevitably leads to a system cold-start problem. In addition, because recommendation depends on item popularity, long-tail items are recommended poorly. Finally, similarity-based collaborative filtering has to relate every pair of users when searching for similar users, which increases the running time of the system.
In summary, collaborative filtering still has several problems in personalized recommender systems: it is overly sensitive to data sparsity, it suffers from system cold start, and its recommendation accuracy and performance remain to be improved.
Summary of the invention
The object of the invention is to propose a recommendation algorithm fusing item popularity and an expert coefficient, so as to improve the accuracy of the recommendation algorithm and reduce its error and computational cost.
To accomplish this task, the invention adopts the following technical solution:
A recommendation algorithm fusing item popularity and an expert coefficient, comprising the following steps:
Step 1, according to user background information, build user-background similar subgroups by clustering;
Step 2, select the item rating data of all users in a user-background similar subgroup to form an item rating matrix, and revise the item rating matrix with the item popularity rating coefficient to reduce its sparsity, obtaining a simplified item rating matrix;
Step 3, use the expert recommendation coefficient to correct the features of the simplified item rating matrix, obtaining an optimized item rating matrix; compute the user rating similarity from the optimized item rating matrix, and linearly combine the user rating similarity with the background similarity to obtain the total similarity and build the user similarity matrix;
Step 4, according to the user similarity matrix, obtain the candidate set for predicting the target user's ratings, and predict the target user's ratings of items by similarity-weighted averaging, thereby obtaining the item recommendation result.
Further, step 1 specifically comprises:
Step 1.1, taking the basic information table of all users as samples, build a user data set and quantize it into a sample list, in which each point represents one quantized sample; select two distance thresholds, T1 and T2;
Step 1.2, take an arbitrary point P from the list and compute the distance between P and every subset center; if no subset exists yet, make P a subset center and delete P from the list; otherwise go to step 1.3;
Step 1.3, if the distance between P and some subset center is within T2, delete P from the list and add P to that subset;
Step 1.4, if the distance between P and some subset center is between T1 and T2, add P to that subset but do not delete P from the list;
Step 1.5, if the distances between P and all subset centers are greater than T1, make P a subset center and delete P from the list;
Step 1.6, repeat steps 1.2 to 1.5 until the list is empty; once the list is empty, the coarse clustering is complete and yields the different subsets; average all points in each subset to obtain the center point of each subset;
Step 1.7, cluster the result of the coarse clustering with the K-means algorithm to generate the user-background similar subgroups; the initial cluster centers of K-means are the center points of the subsets, and the K value of K-means is the number of subsets generated by the coarse clustering.
Further, the item popularity rating coefficient in step 2 is computed by the following formula:
where i denotes an item in category c, Z(c) denotes the set of items in category c, $P_{u,i}$ denotes the rating given by user u to film i, $\sum_{i \in Z(c)} P_{u,i}$ denotes the total rating of user u within category c, $\sum P_u$ denotes the total rating of user u over all categories, d(C) is the total number of items, over all categories, that user u has rated, and d(c) is the total number of items of category c that the user has rated.
Further, using the expert recommendation coefficient to correct the features of the simplified item rating matrix and obtain the optimized item rating matrix in step 3 comprises:
The expert recommendation coefficient is computed by the following formula:
where $N_{u,c}$ denotes the number of ratings by user u on items of type c, $N_{u,C}$ denotes the number of ratings by user u on items of all types, $t_{u,c}$ denotes the time difference between the time at which user u rated an item of type c and the user's first rating record in the user data set, and e is the base of the natural logarithm;
The expert recommendation coefficient is applied to the simplified item rating matrix to correct the features of the item rating matrix and obtain the optimized item rating matrix, computed as:
$$R_{uc}(t_{uc}) = r_{u,c}(t_{u,c}) \times R$$
In the formula above, $r_{u,c}(t_{u,c})$ is the expert recommendation coefficient, R is the simplified item rating matrix from step 2, and $R_{uc}(t_{uc})$ is the optimized item rating matrix.
Further, computing the user rating similarity from the optimized item rating matrix and linearly combining it with the background similarity to obtain the total similarity and build the user similarity matrix in step 3 comprises:
The rating similarity of users is computed from the optimized item rating matrix by the following formula:
where u and $u_1$ denote any two users, $p_{u,c}$ denotes the rating of user u on items of category c in $R_{uc}(t_{uc})$, $\bar{p}_u$ denotes the average rating of user u over items of all categories in $R_{uc}(t_{uc})$, $p_{u_1,c}$ denotes the rating of user $u_1$ on items of category c in $R_{uc}(t_{uc})$, $\bar{p}_{u_1}$ denotes the average rating of user $u_1$ over items of all categories in $R_{uc}(t_{uc})$, and C denotes the set of item categories;
The user background similarity between every two users in the same user-background similar subgroup is computed from the user basic information by the following formula:
where $r_u$ and $r_{u_1}$ denote the background attribute feature vectors obtained by vectorizing the basic information of users u and $u_1$;
The user rating similarity and the user background similarity are linearly combined to obtain the total user similarity:
$$\mathrm{sim}_{UBICF}(u,u_1) = \lambda\,\mathrm{sim}_{UB}(u,u_1) + (1-\lambda)\,\mathrm{sim}_{IC}(u,u_1)$$
where the fusion parameter $\lambda \in [0,1]$;
According to the total user similarity formula above, the total user similarity between every two users in the user-background similar subgroup is computed, forming the user similarity matrix $\mathrm{sim}_u$.
Further, step 4 specifically comprises:
Step 4.1, for a target user u', select from the user similarity matrix the N users with the highest similarity to u' to form the candidate set $U_{neigh}$ for predicting the target user's ratings;
Step 4.2, from the items rated by the users in $U_{neigh}$, delete the items that the target user u' has already rated; the remaining items form the item recommendation candidate set of the target user; the similarity-weighted average score of each item in the candidate set is taken as the predicted rating of u' for that item, computed as:
where $u_1$ is any user in $U_{neigh}$, $\mathrm{sim}_{UBICF}(u',u_1)$ is the total user similarity between the target user u' and user $u_1$, $p_{u_1,i}$ denotes the rating of user $u_1$ for item i, $N(U_{neigh})$ denotes the number of users in $U_{neigh}$ who have rated item i, and $p'_{u',i}$ denotes the predicted rating of the target user u' for item i;
Step 4.3, sort the items in the item candidate set by the target user's predicted ratings and recommend the M items with the highest predicted ratings to the target user, obtaining the item recommendation result.
Compared with the prior art, the invention has the following technical features:
1. The invention effectively improves recommendation accuracy on highly sparse samples, and introducing the concept of similar-background subgroups mitigates the cold-start problem of the recommender system.
2. The invention revises the original rating matrix by fusing item popularity with the expert recommendation coefficient, which reduces the sparsity of the matrix and improves recommendation accuracy. In terms of root mean square error (RMSE), the improved algorithm is about 30% lower than the traditional collaborative filtering algorithm and about 20% lower than the clustering-based collaborative filtering algorithm.
3. Because it introduces the concept of similar-background subgroups, the algorithm reduces the computational cost of finding the set of neighboring users.
Description of the drawings
Fig. 1 is an overall flow diagram of the method of the invention;
Fig. 2 shows the recommendation accuracy curves obtained in the simulation experiments for different fusion parameters λ;
Fig. 3 shows the recommendation accuracy curves in the simulation experiments for different values of N;
Fig. 4 compares the recommendation accuracy of the algorithm of the invention (UBICF) with that of the existing user-based collaborative filtering (CCF) and clustering-based collaborative filtering (UCF).
Detailed description of the embodiments
Experience shows that similar users generally make similar choices, so similar-background subgroups are generated by clustering users. Because a new user offers few features to refer to, recommendations produced by collaborative filtering for such a user tend to be biased. This scheme therefore uses the empirical observation that similar users generally make similar choices to propose the concept of similar subgroups: at the start of recommendation, the user is first assigned to a similar user subgroup, and the subsequent recommendation is carried out within that subgroup. The data sets used in recommendation are usually very sparse, which increases the amount of computation when collaborative filtering is used for recommendation. This scheme uses the ideas of item popularity and expert opinion to recompute the recommendation data set, thereby reducing the amount of computation and improving accuracy.
Based on the above ideas, this scheme proposes an improved collaborative filtering algorithm based on cluster optimization. To address the difficulty of recommending to unknown users and to optimize recommendation accuracy and recommendation time, it provides a clustering approach based on user background and expert opinion similarity: users are clustered to generate similar-user recommendation subgroups, a recommendation candidate list is produced within each subgroup by the K-nearest-neighbor algorithm, and the final Top-N recommendation list is then generated from the weighted predicted ratings. Because every recommendation is generated within a subgroup of similar users, both recommendation accuracy and recommendation speed are improved. In addition, because features such as user background and expert opinion are added when the user subgroups are generated, the resulting subgroups perform much better in the later recommendation stage than user subgroups obtained by plain traditional clustering. The specific steps of the invention are as follows:
A recommendation algorithm fusing item popularity and an expert coefficient, comprising the following steps:
Step 1, according to user background information, build user-background similar subgroups by clustering.
The invention clusters the user basic information table with a combined clustering algorithm. Coarse clustering is performed first. The procedure has two stages: the first quickly and approximately divides the data into subsets, and the second re-clusters the points in the subsets with an exact computation method. The key idea of the cluster optimization is as follows: set the K initial subset center points and the region radii for the sample data set, and efficiently divide the data into several overlapping subsets so that every object falls within the coverage of some subset; for the objects that fall in the same region, recompute a new center point and reassign each object to a region according to its distance from the new center points; repeat the "divide into subsets, compute center points" process until the positions of the K center points no longer change. The distances between a target user and the centers of the different similar-background user subgroups are then computed to obtain the similarity relationship between the target user and each subgroup, and the samples are finally divided into groups according to this similarity. The specific steps are as follows:
Step 1.1, taking the basic information table of all users as samples, build a user data set and quantize it into a sample list, in which each point represents one quantized sample; select two distance thresholds, T1 and T2, where T1 > T2; the values of T1 and T2 can be determined by cross-validation.
Step 1.2, take an arbitrary point P from the list and compute the distance between P and every subset center; if no subset exists yet, make P a subset center and delete P from the list; otherwise go to step 1.3;
Step 1.3, if the distance between P and some subset center is within T2, delete P from the list and add P to that subset;
At this point P is considered close enough to this subset, so P can no longer serve as the center of another subset.
Step 1.4, if the distance between P and some subset center is between T1 and T2, add P to that subset but do not delete P from the list;
This means that P can still take part in the next round of clustering;
Step 1.5, if the distances between P and all subset centers are greater than T1, make P a subset center and delete P from the list;
Step 1.6, repeat steps 1.2 to 1.5 until the list is empty; once the list is empty, the coarse clustering is complete and yields the different subsets; average all points in each subset to obtain the center point of each subset.
After the coarse clustering is complete, the center point of each subset and the number of subsets are obtained and serve as the basic parameters for the fine clustering in the next step. The fine clustering uses the K-means algorithm:
Step 1.7, cluster the result of the coarse clustering in step 1.6 with the K-means algorithm to generate the user-background similar subgroups; the initial cluster centers of K-means are the center points of the subsets obtained in step 1.6, and the K value of K-means is the number of subsets generated by the coarse clustering in step 1.6.
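The coarse-then-fine clustering of steps 1.1 to 1.7 can be sketched as follows. This is a minimal illustration and not the patented implementation: it uses the standard Canopy formulation, in which every point picked from the list becomes a subset center, because that variant is guaranteed to terminate. The function names (`dist`, `mean`, `canopy`, `kmeans`), the Euclidean distance, and the `seed` and `max_iter` parameters are assumptions made for the example.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def canopy(points, t1, t2, seed=0):
    """Coarse clustering (standard Canopy variant); returns one center per subset."""
    assert t1 > t2, "the loose threshold T1 must exceed the tight threshold T2"
    rng = random.Random(seed)
    remaining = list(points)
    centers = []
    while remaining:
        # Pick an arbitrary point P, remove it from the list, use it as a center.
        p = remaining.pop(rng.randrange(len(remaining)))
        members = [p]
        still_open = []
        for q in remaining:
            d = dist(p, q)
            if d <= t1:
                members.append(q)      # within T1: q joins this subset
            if d > t2:
                still_open.append(q)   # not within T2: q stays in the list
        remaining = still_open
        centers.append(mean(members))  # subset center = average of its points
    return centers


def kmeans(points, centers, max_iter=100):
    """Fine clustering: K-means seeded with the coarse centers, K = len(centers)."""
    labels = []
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in points]
        new_centers = []
        for k in range(len(centers)):
            group = [p for p, lab in zip(points, labels) if lab == k]
            new_centers.append(mean(group) if group else centers[k])
        if new_centers == centers:     # centers no longer move: converged
            break
        centers = new_centers
    return labels, centers
```

A typical call would be `labels, centers = kmeans(vectors, canopy(vectors, t1, t2))`, where `vectors` are the quantized user records and `t1`, `t2` are chosen by cross-validation as described in step 1.1.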
In this embodiment, the public film data set MovieLens-1M is taken as an example. The information tables in the data set are described in Tables 1, 2, and 3:
Table 1 User basic information table in MovieLens
Attribute     Description
UserID        Unique user identifier, numbered 1 to 6040
Gender        User gender, binary feature with value "F" or "M"
Age           User age, discrete feature with values from 1 to 56
Occupation    User occupation, discrete feature with 21 values
Zip-code      Postal code
Table 2 Film basic information table in MovieLens
Attribute     Description
MovieID       Unique film identifier, numbered 1 to 3952
Title         Film title
Genres        Film genres, 18 different genres in total
Table 3 User-film rating information table in MovieLens
Attribute     Description
UserID        User number
MovieID       Film number
Rating        User's rating of the film, in [1,5]
Time          Time at which the user rated the film, stored as a timestamp
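As an illustration of how a record from Table 1 might be quantized into a sample point for step 1.1, the sketch below parses one line in the `::`-delimited layout used by the MovieLens-1M `users.dat` file and turns it into a numeric vector. The encoding itself (binary gender, age scaled by the maximum age 56, one-hot occupation indexed 0 to 20, zip code ignored) is an assumption chosen for the example; the text does not specify how the quantization is done.

```python
def parse_user(line):
    """Parse one 'UserID::Gender::Age::Occupation::Zip-code' record."""
    user_id, gender, age, occupation, zip_code = line.strip().split("::")
    return {
        "user_id": int(user_id),
        "gender": gender,
        "age": int(age),
        "occupation": int(occupation),
        "zip": zip_code,
    }


def user_vector(user, n_occupations=21, max_age=56):
    """Quantize a parsed user record into a numeric feature vector.

    Hypothetical encoding: [gender, scaled age, one-hot occupation...].
    """
    gender = [1.0 if user["gender"] == "M" else 0.0]
    age = [user["age"] / max_age]
    occupation = [0.0] * n_occupations
    occupation[user["occupation"]] = 1.0
    return gender + age + occupation
```

For example, `user_vector(parse_user("1::F::1::10::48067"))` yields a 23-dimensional vector that can be passed to a distance function during clustering.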
Step 2, select the item rating data of all users in a user-background similar subgroup to form an item rating matrix, and revise the item rating matrix with the item popularity rating coefficient to reduce its sparsity, obtaining a simplified item rating matrix.
In this embodiment, the items are films, and the user item rating data are taken from the user-film rating information table of the public film data set MovieLens-1M, shown in Table 3; the item rating data of all users together form the item rating matrix.
In this scheme, item popularity is defined as:
A. the share of the user's ratings on items of a specific type among the user's ratings on all items;
B. the share of the user's valid ratings among all items;
The item popularity rating coefficient is then computed by the following formula:
where i denotes an item in category c, Z(c) denotes the set of items in category c, $P_{u,i}$ denotes the rating given by user u to film i, $\sum_{i \in Z(c)} P_{u,i}$ denotes the total rating of user u within category c, $\sum P_u$ denotes the total rating of user u over all categories, d(C) is the total number of items, over all categories, that user u has rated, and d(c) is the total number of items of category c that the user has rated.
According to the item popularity rating coefficient above, the highly sparse rating matrix whose dimension is the number of items is reduced to a less sparse matrix whose dimension is the number of item categories (note: empirically, the number of items is always greater than the number of item categories). Specifically, the item rating matrix is processed with the item popularity rating coefficient formula above, and the result is the revised, simplified item rating matrix based on item popularity.
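The reduction from a user-by-item matrix to a user-by-category matrix can be sketched as below. The exact coefficient formula is not reproduced in this text, so the sketch makes a loudly stated assumption: it multiplies the two shares named in definitions A and B, that is, the category's share of the user's total rating and the category's share of the user's rated items, d(c)/d(C). It also assumes one category per item, although MovieLens films can carry several genres. The names `popularity_matrix`, `ratings`, and `item_category` are hypothetical.

```python
def popularity_matrix(ratings, item_category):
    """Collapse {user: {item: score}} into {user: {category: coefficient}}.

    Assumed form (not confirmed by the source):
        coefficient = (sum of scores in c / sum of all scores) * (d(c) / d(C))
    """
    result = {}
    for user, scores in ratings.items():
        if not scores:
            continue                       # users without ratings get no row
        total_score = sum(scores.values())  # sum of P_u over all categories
        total_count = len(scores)           # d(C): items rated by the user
        cat_score, cat_count = {}, {}
        for item, score in scores.items():
            c = item_category[item]
            cat_score[c] = cat_score.get(c, 0.0) + score
            cat_count[c] = cat_count.get(c, 0) + 1   # d(c)
        result[user] = {
            c: (cat_score[c] / total_score) * (cat_count[c] / total_count)
            for c in cat_score
        }
    return result
```

Whatever the exact formula, the design point is the same as in the text: each user row shrinks from one column per film to one column per category, so far fewer entries are empty.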
Step 3, use the expert recommendation coefficient to correct the features of the simplified item rating matrix, obtaining an optimized item rating matrix; compute the user rating similarity from the optimized item rating matrix, and linearly combine the user rating similarity with the background similarity.
After the item popularity rating coefficient has been introduced to revise the item rating matrix, analysis of real film recommendation scenarios shows that users' viewing choices are influenced by the guidance of expert users in the group. This scheme therefore introduces the expert recommendation coefficient as a feature for correcting the rating matrix, and fuses the user background similarity with the item popularity and expert recommendation similarity into a new user similarity computation.
Step 3.1, the expert recommendation coefficient is defined by:
A. the share of the user's comments on items of a specific type among all of the user's comments;
B. the time-decay coefficient of the user's comments.
The expert recommendation coefficient is then computed by the following formula:
where $N_{u,c}$ denotes the number of ratings by user u on items of type c, $N_{u,C}$ denotes the number of ratings by user u on items of all types, $t_{u,c}$ denotes the time difference between the time at which user u rated an item of type c and the user's first rating record in the user data set, and e is the base of the natural logarithm.
Here, the ratio $N_{u,c}/N_{u,C}$ is user u's share of comments on items of type c, and the term in e is the decay coefficient of the user's comments, which indicates whether the user has been active recently. Under the influence of the decay coefficient, $r_{u,c}(t_{u,c})$ is a function of time that varies within (0,1): the more recent the comments and the more numerous they are, the larger the expert coefficient of user u and the more useful it is for the next recommendation step.
Step 3.2, apply the expert recommendation coefficient to the simplified item rating matrix to correct the features of the item rating matrix, obtaining the optimized item rating matrix;
Because the expert recommendation coefficient is a number in (0,1) that varies with time, it can be used as a coefficient of the simplified item rating matrix. The matrix built by fusing the item popularity rating coefficient with the expert recommendation coefficient is therefore computed as:
$$R_{uc}(t_{uc}) = r_{u,c}(t_{u,c}) \times R$$
In the formula above, $r_{u,c}(t_{u,c})$ is the expert recommendation coefficient, R is the simplified item rating matrix from step 2, and $R_{uc}(t_{uc})$ is the optimized item rating matrix.
Step 3.3, compute the rating similarity of users from the optimized item rating matrix by the following formula:
where u and $u_1$ denote any two users, $p_{u,c}$ denotes the rating of user u on items of category c in $R_{uc}(t_{uc})$, $\bar{p}_u$ denotes the average rating of user u over items of all categories in $R_{uc}(t_{uc})$, $p_{u_1,c}$ denotes the rating of user $u_1$ on items of category c in $R_{uc}(t_{uc})$, $\bar{p}_{u_1}$ denotes the average rating of user $u_1$ over items of all categories in $R_{uc}(t_{uc})$, and C denotes the set of item categories.
Step 3.4, compute the user background similarity between every two users in the same user-background similar subgroup from the user basic information by the following formula:
where $r_u$ and $r_{u_1}$ denote the background attribute feature vectors obtained by vectorizing the basic information of users u and $u_1$.
Step 3.5, linearly combine the user rating similarity and the user background similarity to obtain the total user similarity:
$$\mathrm{sim}_{UBICF}(u,u_1) = \lambda\,\mathrm{sim}_{UB}(u,u_1) + (1-\lambda)\,\mathrm{sim}_{IC}(u,u_1)$$
where the fusion parameter $\lambda \in [0,1]$.
According to the total user similarity formula above, the total user similarity between every two users in the user-background similar subgroup is computed, forming the user similarity matrix $\mathrm{sim}_u$.
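The fusion in step 3.5 can be sketched as follows. The two component formulas are not reproduced in this text, so the sketch assumes a Pearson correlation over category values for the rating similarity (suggested by the per-user averages in the symbol definitions) and a cosine similarity for the background vectors; both choices, and the function names, are assumptions made for illustration. Only the final weighted sum is taken directly from the formula above.

```python
import math


def pearson(a, b):
    """Pearson-style rating similarity between two {category: value} rows.

    Each mean is taken over all categories of that user; the sums run over
    the categories the two users share.
    """
    common = [c for c in a if c in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a.values()) / len(a)
    mean_b = sum(b.values()) / len(b)
    num = sum((a[c] - mean_a) * (b[c] - mean_b) for c in common)
    den = (math.sqrt(sum((a[c] - mean_a) ** 2 for c in common))
           * math.sqrt(sum((b[c] - mean_b) ** 2 for c in common)))
    return num / den if den else 0.0


def cosine(x, y):
    """Cosine similarity between two background feature vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0


def total_similarity(bg_u, bg_v, row_u, row_v, lam=0.5):
    """sim_UBICF = lam * sim_UB (background) + (1 - lam) * sim_IC (rating)."""
    assert 0.0 <= lam <= 1.0
    return lam * cosine(bg_u, bg_v) + (1.0 - lam) * pearson(row_u, row_v)
```

Setting `lam=1.0` uses only the background similarity and `lam=0.0` only the rating similarity; intermediate values blend the two.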
Step 4, according to the user similarity matrix, obtain the candidate set for predicting the target user's ratings, and predict the target user's ratings of items by similarity-weighted averaging, thereby obtaining the item recommendation result.
Because the number of users and the number of films in the original user-film rating matrix are severely imbalanced (the number of users is far smaller than the number of films), the matrix contains many blank entries (that is, the user has not rated the film). The ultimate goal of the recommender system is to determine whether the user would like these unrated films: a film is added to the user's recommendation set if so, and is not added otherwise.
Step 3 yields the user similarity matrix $\mathrm{sim}_u$, so each user has a set of neighbor users sorted by similarity. The N most similar users are then chosen as the film rating prediction candidate set of the target user. The process of predicting a user's film ratings from the user similarity matrix $\mathrm{sim}_u$ and the users' item rating matrix is as follows:
Step 4.1, for a target user u', select from the user similarity matrix the N users with the highest similarity to u' to form the candidate set $U_{neigh}$ for predicting the target user's ratings; the size of N can be set as needed;
Step 4.2, from the items rated by the users in $U_{neigh}$, delete the items that the target user u' has already rated; the remaining items form the item recommendation candidate set of the target user; the similarity-weighted average score of each item in the candidate set is taken as the predicted rating of u' for that item, computed as:
where $u_1$ is any user in $U_{neigh}$, $\mathrm{sim}_{UBICF}(u',u_1)$ is the total user similarity between the target user u' and user $u_1$, $p_{u_1,i}$ denotes the rating of user $u_1$ for item i, $N(U_{neigh})$ denotes the number of users in $U_{neigh}$ who have rated item i, and $p'_{u',i}$ denotes the predicted rating of the target user u' for item i.
Step 4.3, sort the items in the item candidate set by the target user's predicted ratings and recommend the M items with the highest predicted ratings to the target user, obtaining the item recommendation result; the value of M can be set as needed, for example between 1 and 5.
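Steps 4.1 to 4.3 can be sketched as follows. The prediction formula is not reproduced in this text; based on the symbol definitions, the sketch assumes that the predicted rating is the sum of similarity-weighted neighbor ratings divided by $N(U_{neigh})$, the number of neighbors who rated the item. A common alternative normalizes by the sum of similarities instead. The function name `recommend` and the dictionary-based data layout are hypothetical.

```python
def recommend(target, sim_row, ratings, n_neighbors=3, top_m=2):
    """Return the top-M (item, predicted rating) pairs for the target user.

    sim_row: {other_user: total similarity to the target user}
    ratings: {user: {item: score}}
    """
    # Step 4.1: the N most similar users form the candidate set U_neigh.
    neighbors = sorted((u for u in sim_row if u != target),
                       key=lambda u: sim_row[u], reverse=True)[:n_neighbors]

    # Step 4.2: drop items the target has rated, then average the
    # similarity-weighted neighbor scores over the neighbors who rated them.
    seen = set(ratings.get(target, {}))
    weighted, raters = {}, {}
    for u in neighbors:
        for item, score in ratings.get(u, {}).items():
            if item in seen:
                continue
            weighted[item] = weighted.get(item, 0.0) + sim_row[u] * score
            raters[item] = raters.get(item, 0) + 1
    predicted = {item: weighted[item] / raters[item] for item in weighted}

    # Step 4.3: sort by predicted rating and keep the top M items.
    ranked = sorted(predicted, key=lambda item: predicted[item], reverse=True)
    return [(item, predicted[item]) for item in ranked[:top_m]]
```

Both `n_neighbors` (N) and `top_m` (M) are tuning parameters; the text reports that N has the larger effect on accuracy.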
Figs. 2 to 4 show the results of the different simulation experiments for the method of the invention. The test results show that the fusion parameter λ has little influence on the recommendation accuracy of the invention, whereas the value of N has a larger influence. Fig. 4 shows that, for different values of N, the invention achieves a clearly lower root mean square error (RMSE) than comparable algorithms, which indicates that the recommendations of the invention are substantially better than those of the existing algorithms.
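For reference, the RMSE metric used in the comparison is the square root of the mean squared difference between predicted and actual ratings. A minimal sketch, with a hypothetical function name and list-based inputs, is:

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length rating sequences."""
    assert len(predicted) == len(actual) and len(predicted) > 0
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

A lower value means the predicted ratings are closer to the held-out ratings.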

Claims (6)

1. A recommendation algorithm fusing item popularity and an expert coefficient, characterized by comprising the following steps:
Step 1, according to user background information, build user-background similar subgroups by clustering;
Step 2, select the item rating data of all users in a user-background similar subgroup to form an item rating matrix, and revise the item rating matrix with the item popularity rating coefficient to reduce its sparsity, obtaining a simplified item rating matrix;
Step 3, use the expert recommendation coefficient to correct the features of the simplified item rating matrix, obtaining an optimized item rating matrix; compute the user rating similarity from the optimized item rating matrix, and linearly combine the user rating similarity with the background similarity to obtain the total similarity and build the user similarity matrix;
Step 4, according to the user similarity matrix, obtain the candidate set for predicting the target user's ratings, and predict the target user's ratings of items by similarity-weighted averaging, thereby obtaining the item recommendation result.
2. The recommendation algorithm fusing project popularity and expert coefficient according to claim 1, characterized in that step 1 specifically comprises:
Step 1.1: taking the basic information tables of all users as samples, construct a user data set and quantize it into a sample set list, in which each point represents one quantized sample; select two distance thresholds, T1 and T2;
Step 1.2: take any point P from the list and calculate the distance between P and the centers of all subsets; if no subset exists yet, take P as a subset center and delete P from the list; otherwise go to step 1.3;
Step 1.3: if the distance between P and some subset center is within T2, delete P from the list and add P to that subset;
Step 1.4: if the distance between P and some subset center is between T1 and T2, add P to that subset but do not delete P from the list;
Step 1.5: if the distances between P and all subset centers are all beyond T1, take P as a new subset center and delete P from the list;
Step 1.6: repeat steps 1.2 to 1.5 until the list is empty; once the list is empty, the coarse clustering is complete and the different subsets are obtained; average all points in each subset to obtain the central point of each subset;
Step 1.7: cluster the coarse clustering result with the K-means algorithm to generate the user-background-similar subgroups; the initial cluster centers of the K-means algorithm are the central points of the subsets, and the K value of the K-means algorithm is the number of subsets produced by the coarse clustering.
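The two-stage clustering of steps 1.1 to 1.7 can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard Canopy loop (each pass takes a remaining point as a new center, points within T1 join the subset, points within T2 are also removed from the list), which is structured differently from the claim wording but serves the same purpose; it assumes T1 > T2, Euclidean distance, and samples already quantized into numeric tuples. The names `canopy`, `kmeans` and `cluster_users` are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two numeric tuples (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise average of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def canopy(points, t1, t2):
    """Coarse clustering; returns the central point of each subset."""
    if t1 <= t2:
        raise ValueError("this sketch assumes t1 > t2")
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)  # "any point"; taking the first is a choice
        members = [center]
        kept = []
        for q in remaining:
            d = dist(center, q)
            if d < t1:       # loosely bound: joins the subset
                members.append(q)
            if d >= t2:      # not tightly bound: stays in the list
                kept.append(q)
        remaining = kept
        centers.append(mean(members))
    return centers


def kmeans(points, centers, max_iter=100):
    """Plain K-means started from the given centers (K = len(centers))."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            clusters[idx].append(p)
        new_centers = [mean(c) if c else centers[k]
                       for k, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


def cluster_users(points, t1, t2):
    """Canopy supplies K and the initial centers; K-means refines them."""
    return kmeans(points, canopy(points, t1, t2))
```

Seeding K-means with the coarse centers removes the need to guess K in advance, which appears to be the motivation for step 1.7.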
3. The recommendation algorithm fusing project popularity and expert coefficient according to claim 1, characterized in that the calculation formula of the project popularity score coefficient in step 2 is:
where $i$ denotes a project in category $c$; $Z(c)$ denotes the set of projects under category $c$; $P_{u,i}$ denotes user $u$'s score for film $i$; $\sum_{i \in Z(c)} P_{u,i}$ denotes user $u$'s total score under category $c$; $\sum P_u$ denotes user $u$'s total score under all categories; $D(C)$ is the total number of categories of the projects that user $u$ has rated; and $d(c)$ is the total number of projects of category $c$ that the user has rated.
4. The recommendation algorithm fusing project popularity and expert coefficient according to claim 1, characterized in that, in step 3, using the expert recommendation coefficient to correct the features of the simplified project rating matrix and obtain the optimized project rating matrix comprises:
The formula for calculating the expert recommendation coefficient is:
where $N_{u,c}$ denotes the number of ratings by user $u$ on projects of type $c$; $N_{u,C}$ denotes the number of ratings by user $u$ on projects of all types; $t_{u,c}$ denotes the time difference between the time user $u$ rated projects of type $c$ and the time of that user's first rating record in the user data set; and $e$ is the base of the natural logarithm;
The expert recommendation coefficient is applied to the simplified project rating matrix to correct the features of the project rating matrix, giving the optimized project rating matrix; the calculation formula is:
$$R_{uc}(t_{uc}) = r_{u,c}(t_{u,c}) \times R$$
In the above formula, $r_{u,c}(t_{u,c})$ is the expert recommendation coefficient, $R$ is the simplified project rating matrix from step 2, and $R_{uc}(t_{uc})$ is the optimized project rating matrix.
5. The recommendation algorithm fusing project popularity and expert coefficient according to claim 1, characterized in that, in step 3, calculating the user rating similarity from the optimized project rating matrix, linearly combining the user rating similarity with the background similarity to obtain the total similarity, and constructing the user similarity matrix comprises:
The rating similarity of users is calculated from the optimized project rating matrix; the calculation formula is as follows:
where $u$ and $u_1$ denote any two users; $p_{u,c}$ denotes user $u$'s score for category-$c$ projects in $R_{uc}(t_{uc})$; the remaining terms denote, in order, user $u$'s average score over all project types in $R_{uc}(t_{uc})$, user $u_1$'s score for category-$c$ projects in $R_{uc}(t_{uc})$, and user $u_1$'s average score over all project types in $R_{uc}(t_{uc})$; and $C$ denotes the set of project categories;
User basic information is used to calculate the user background similarity between every two users within the same user-background-similar subgroup; the calculation formula is as follows:
where $r_u$ and the corresponding vector for $u_1$ denote the background attribute feature vectors obtained by vectorizing the basic information of users $u$ and $u_1$;
The user rating similarity and the user background similarity are linearly combined to obtain the total user similarity:
$$\mathrm{sim}_{UBICF}(u,u_1) = \lambda\,\mathrm{sim}_{UB}(u,u_1) + (1-\lambda)\,\mathrm{sim}_{IC}(u,u_1)$$
where the fusion parameter $\lambda \in [0,1]$;
According to the above formula for total user similarity, the total similarity between every two users in the user-background-similar subgroup is calculated, forming the user similarity matrix $\mathrm{sim}_u$.
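The similarity fusion in this claim can be sketched as below. The two component formulas are not reproduced in this text, so the sketch makes labeled assumptions: the rating similarity is taken to be a Pearson-style correlation over the categories both users have scores for, the background similarity is taken to be the cosine of the two feature vectors, and $\mathrm{sim}_{UB}$ is assumed to be the background term and $\mathrm{sim}_{IC}$ the rating term. All function names are hypothetical.

```python
import math


def rating_similarity(scores_u, scores_v):
    """Pearson-style similarity (assumed form).

    scores_* map category -> score; each user's mean is taken over all of
    that user's categories, the sums run over the shared categories.
    """
    common = set(scores_u) & set(scores_v)
    if not common:
        return 0.0
    mean_u = sum(scores_u.values()) / len(scores_u)
    mean_v = sum(scores_v.values()) / len(scores_v)
    num = sum((scores_u[c] - mean_u) * (scores_v[c] - mean_v) for c in common)
    den_u = math.sqrt(sum((scores_u[c] - mean_u) ** 2 for c in common))
    den_v = math.sqrt(sum((scores_v[c] - mean_v) ** 2 for c in common))
    den = den_u * den_v
    return num / den if den else 0.0


def background_similarity(vec_u, vec_v):
    """Cosine similarity of two background feature vectors (assumed form)."""
    dot = sum(a * b for a, b in zip(vec_u, vec_v))
    norm = (math.sqrt(sum(a * a for a in vec_u))
            * math.sqrt(sum(b * b for b in vec_v)))
    return dot / norm if norm else 0.0


def total_similarity(sim_ub, sim_ic, lam):
    """Linear combination: lam * sim_ub + (1 - lam) * sim_ic, lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * sim_ub + (1.0 - lam) * sim_ic
```

With `lam = 1` only the first term contributes and with `lam = 0` only the second, so the parameter sets the balance between the two sources of similarity.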
6. The recommendation algorithm fusing project popularity and expert coefficient according to claim 1, characterized in that step 4 specifically comprises:
Step 4.1: for target user $u'$, select from the user similarity matrix the N users with the highest similarity to $u'$ to form the candidate set $U_{neigh}$ for predicting the target user's scores;
Step 4.2: from the projects rated by the users in candidate set $U_{neigh}$, delete the projects that target user $u'$ has already rated; the remaining projects constitute the target user's project recommendation candidate set; the similarity-weighted average score of each project in the candidate set is taken as target user $u'$'s predicted score for that project; the calculation formula is as follows:
where $u_1$ is any user in candidate set $U_{neigh}$; $\mathrm{sim}_{UBICF}(u,u_1)$ is the total user similarity between target user $u'$ and user $u_1$; the rating term denotes user $u_1$'s score for project $i$; $N(U_{neigh})$ denotes the number of users in candidate set $U_{neigh}$ who have rated project $i$; and $p'_{u',i}$ denotes target user $u'$'s predicted score for project $i$;
Step 4.3: sort the projects in the project candidate set by the target user's predicted scores, and recommend the M projects with the highest predicted scores to the target user, obtaining the project recommendation result.
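Steps 4.1 to 4.3 can be sketched as follows. The prediction formula is not reproduced in this text, so the normalization here is an assumption: the weighted sum of neighbor scores is divided by the sum of the similarities of the neighbors who rated the project. The claim's formula refers to $N(U_{neigh})$, the number of such raters, and may normalize differently. The function name `recommend` and the dictionary-based inputs are hypothetical.

```python
def recommend(target, sim, ratings, n_neighbors, m):
    """Return up to m (project, predicted score) pairs, best first.

    sim:     dict user -> total similarity to the target user
    ratings: dict user -> {project: score}
    """
    # Step 4.1: the N users most similar to the target form the candidate set.
    neighbors = sorted((u for u in sim if u != target),
                       key=lambda u: sim[u], reverse=True)[:n_neighbors]

    # Step 4.2: projects rated by neighbors, minus those the target has rated.
    seen = set(ratings.get(target, {}))
    candidates = {i for u in neighbors for i in ratings.get(u, {})} - seen

    predicted = {}
    for item in candidates:
        raters = [u for u in neighbors if item in ratings.get(u, {})]
        weight = sum(sim[u] for u in raters)
        if weight > 0:  # assumed normalization: sum of rater similarities
            total = sum(sim[u] * ratings[u][item] for u in raters)
            predicted[item] = total / weight

    # Step 4.3: sort by predicted score (ties broken by project key), keep M.
    ranked = sorted(predicted.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:m]
```

For example, with neighbors `a` (similarity 0.9) and `b` (similarity 0.5) scoring a project 5 and 3, the predicted score under this assumption is (0.9 × 5 + 0.5 × 3) / 1.4.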
CN201910128705.4A 2019-02-21 2019-02-21 Recommendation algorithm fusing project popularity and expert coefficient Active CN109977299B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910128705.4A CN109977299B (en) 2019-02-21 2019-02-21 Recommendation algorithm fusing project popularity and expert coefficient

Publications (2)

Publication Number Publication Date
CN109977299A true CN109977299A (en) 2019-07-05
CN109977299B CN109977299B (en) 2022-12-27

Family

ID=67077170

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910128705.4A Active CN109977299B (en) 2019-02-21 2019-02-21 Recommendation algorithm fusing project popularity and expert coefficient

Country Status (1)

Country Link
CN (1) CN109977299B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant