CN109977299A - A recommendation algorithm fusing item popularity and expert coefficients - Google Patents
A recommendation algorithm fusing item popularity and expert coefficients
- Publication number
- CN109977299A (application CN201910128705.4A)
- Authority
- CN
- China
- Prior art keywords
- user
- project
- similarity
- coefficient
- indicate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/251—Fusion techniques of input or preprocessed data
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a recommendation algorithm that fuses item popularity and expert coefficients. The algorithm uses user background as the feature for constructing similar subgroups; within each similar subgroup, it introduces an item popularity coefficient and an expert opinion coefficient to correct the original rating matrix; for a target user, it computes the recommending neighbor users within the subgroup; and it recommends suitable items to the target user according to the neighbors' ratings. The algorithm not only improves the recommendation accuracy of collaborative filtering but also reduces its computational cost, and therefore has significant reference and application value in the field of personalized recommender systems.
Description
Technical field
The invention belongs to the field of personalized recommender systems and relates to a collaborative filtering recommendation algorithm, based on user background, that fuses item popularity and expert coefficients.
Background art
With the development of the Internet, the volume of information is growing exponentially. Users are at a loss when faced with this explosion of information, which leads to problems such as low information utilization and the needless loss of useful information. How to filter the content that users actually want out of massive amounts of information is therefore a major problem facing business and technology today. Since they were proposed in the last century, recommender systems have been the main means of solving this kind of problem and play an increasingly important role in alleviating the effects of information overload. After long-term research, the algorithms applied to recommender systems fall mainly into content-based recommendation, collaborative filtering recommendation, hybrid recommendation, popularity-based recommendation, and advanced non-traditional recommendation algorithms. Among them, collaborative filtering models user preferences and item characteristics from historical records such as ratings, clicks, and purchases; because it adapts well, is logically simple, and is easy to implement, it is widely used in industry.
Although collaborative filtering performs better in practice than the other algorithms, it has defects of its own, such as cold start, data sparsity, and scalability. Collaborative filtering builds recommendations by finding patterns in user behavior, which presupposes that some information about the user is already known, so data sparsity inevitably causes a cold-start problem. In addition, because it favors popular items, it recommends long-tail products poorly. Finally, similarity-based collaborative filtering must relate every pair of users when searching for similar users, which increases the system's computation time.
In summary, collaborative filtering still has several problems in personalized recommender systems: it is overly sensitive to data sparsity, and its cold-start behavior, recommendation accuracy, and recommendation performance all need improvement.
Summary of the invention
The purpose of the present invention is to propose a recommendation algorithm fusing item popularity and expert coefficients, in order to improve the accuracy of the recommendation algorithm and reduce its error and computational cost.
To accomplish this task, the invention adopts the following technical solution:
A recommendation algorithm fusing item popularity and expert coefficients, comprising the following steps:
Step 1, according to user background information, construct similar user-background subgroups by clustering;
Step 2, take the item rating data of all users in a similar user-background subgroup to form an item rating matrix, and correct the item rating matrix with the item popularity rating coefficient to reduce its sparsity, obtaining a simplified item rating matrix;
Step 3, correct the features of the simplified item rating matrix with the expert recommendation coefficient to obtain an optimized item rating matrix; compute the user rating similarity from the optimized item rating matrix, and linearly combine the user rating similarity with the background similarity to obtain the total similarity and construct the user similarity matrix;
Step 4, according to the user similarity matrix, obtain the candidate set for predicting the target user's ratings, and predict the target user's ratings of items by similarity-weighted averaging to obtain the item recommendation result.
Further, step 1 specifically comprises:
Step 1.1, taking the basic information table of all users as samples, construct a user data set; quantize the user data set to obtain a sample set list, in which each point represents one quantized sample; select two distance thresholds, T1 and T2;
Step 1.2, take any point P from the list and compute the distance between P and every subset center; if no subset exists yet, make P a subset center and delete P from the list; otherwise go to step 1.3;
Step 1.3, if the distance between P and some subset center is within T2, delete P from the list and add P to that subset;
Step 1.4, if the distance between P and some subset center is between T2 and T1, add P to that subset but do not delete P from the list;
Step 1.5, if the distance between P and every subset center is greater than T1, make P a subset center and delete P from the list;
Step 1.6, repeat steps 1.2 to 1.5 until the list is empty; once the list is empty, coarse clustering is complete and the different subsets are obtained; average all the points in each subset to obtain the central point of each subset;
Step 1.7, cluster the result of the coarse clustering with the K-means algorithm to generate the similar user-background subgroups; the initial cluster centers of the K-means algorithm are the central points of the subsets, and the K value of the K-means algorithm is the number of subsets generated by the coarse clustering.
Further, the item popularity rating coefficient described in step 2 is computed by the following formula:
where i denotes an item in category c; Z(c) denotes the set of items in category c; P_{u,i} denotes the rating of user u on film i; ∑_{i∈Z(c)} P_{u,i} denotes the total rating of user u in category c; ∑P_u denotes the total rating of user u over all categories; d(C) is the total, over all categories, of the items that user u has rated; and d(c) is the total number of items of category c that user u has rated.
Further, correcting the features of the simplified item rating matrix with the expert recommendation coefficient in step 3 to obtain the optimized item rating matrix comprises:
The expert recommendation coefficient is computed by the following formula:
where N_{u,c} denotes the number of ratings user u has given to items of type c; N_{u,C} denotes the number of ratings user u has given to items of all types; t_{u,c} denotes the time difference between user u's rating of an item of type c and that user's first rating record in the user data set; and e is the base of the natural logarithm;
The expert recommendation coefficient is applied to the simplified item rating matrix to correct the features of the item rating matrix and obtain the optimized item rating matrix, computed as follows:
R_{uc}(t_{uc}) = r_{u,c}(t_{uc}) × R
In the above formula, r_{u,c}(t_{uc}) is the expert recommendation coefficient, R is the simplified item rating matrix from step 2, and R_{uc}(t_{uc}) is the optimized item rating matrix.
Further, computing the user rating similarity from the optimized item rating matrix in step 3, and linearly combining the user rating similarity with the background similarity to obtain the total similarity and construct the user similarity matrix, comprises:
The user rating similarity is computed from the optimized item rating matrix by the following formula:
where u and u_1 denote any two users; p_{u,c} denotes the rating of user u on items of type c in R_{uc}(t_{uc}); the other terms of the formula denote, respectively, the average rating of user u over all item types, the rating of user u_1 on items of type c, and the average rating of user u_1 over all item types, all taken from R_{uc}(t_{uc}); and C denotes the set of item categories;
The user background similarity between every two users in the same similar user-background subgroup is computed from the user basic information table by the following formula:
where r_u and the corresponding vector for u_1 denote the background attribute feature vectors obtained by vectorizing the basic information of users u and u_1;
The user rating similarity and the user background similarity are linearly combined to obtain the total user similarity:
sim_{UBICF}(u, u_1) = λ·sim_{UB}(u, u_1) + (1 − λ)·sim_{IC}(u, u_1)
where the fusion parameter λ ∈ [0, 1];
According to the above total user similarity formula, the total user similarity between every two users in the similar user-background subgroup is computed, forming the user similarity matrix sim_u.
Further, step 4 specifically comprises:
Step 4.1, for target user u′, select from the user similarity matrix the N users most similar to u′ to form the candidate set U_{neigh} for predicting the target user's ratings;
Step 4.2, from the items rated by the users in candidate set U_{neigh}, remove the items that target user u′ has already rated; the remaining items form the target user's item recommendation candidate set; the similarity-weighted average score of each item in the candidate set is taken as target user u′'s predicted rating of that item, computed as follows:
where u_1 is any user in candidate set U_{neigh}; sim_{UBICF}(u, u_1) is the total user similarity between target user u′ and user u_1; the rating term of the formula denotes the rating of user u_1 on item i; N(U_{neigh}) denotes the number of users in candidate set U_{neigh} who have rated item i; and p′_{u′,i} denotes target user u′'s predicted rating of item i;
Step 4.3, sort the items in the item candidate set by the target user's predicted ratings, and recommend the M items with the highest predicted ratings to the target user, obtaining the item recommendation result.
Compared with the prior art, the present invention has the following technical features:
1. The invention effectively improves recommendation accuracy on highly sparse samples, and the introduction of similar-background subgroups alleviates the cold-start problem of the recommender system.
2. By fusing item popularity and expert recommendation coefficients, the invention corrects the original rating matrix, reducing its sparsity and improving recommendation accuracy; in terms of root-mean-square error (RMSE), the improved recommendation algorithm is about 30% lower than the traditional collaborative filtering algorithm and about 20% lower than clustering-based collaborative filtering.
3. Because it introduces similar-background subgroups, the algorithm reduces the computational cost of computing the neighbor user set.
Brief description of the drawings
Fig. 1 is the overall flow diagram of the method of the present invention;
Fig. 2 shows the recommendation accuracy curves obtained in the simulation experiments for different fusion parameters λ;
Fig. 3 shows the recommendation accuracy curves in the simulation experiments for different values of N;
Fig. 4 compares the recommendation accuracy of the algorithm of the invention (UBICF) with that of the existing user-based collaborative filtering algorithm (CCF) and the clustering-based collaborative filtering algorithm (UCF).
Detailed description of the embodiments
Experience shows that similar users generally make similar choices. Because a new user offers few features to refer to, recommendations produced for such a user by collaborative filtering tend to be biased. This scheme therefore uses the empirical observation that similar users make similar choices to propose the concept of similar subgroups: similar-background subgroups are generated by clustering users, each user is first assigned to a similar user subgroup when recommendation begins, and the subsequent recommendation is carried out within that subgroup. The data sets used in recommendation are usually very sparse, which increases the amount of computation when collaborative filtering is used; this scheme uses the ideas of item popularity and expert opinion to recompute the recommendation data set, thereby reducing the amount of computation and improving accuracy.
Based on the above idea, this scheme proposes an improved collaborative filtering algorithm based on clustering optimization. It targets the difficulty of recommending to unknown users, recommendation accuracy, and recommendation time: users are clustered by user background and expert opinion similarity to generate similar-user recommendation subgroups, a recommendation candidate list is generated within each subgroup by the K-nearest-neighbor algorithm, and the final Top-N recommendation list is then generated from the weighted predicted ratings. Because every recommendation is generated within a similar user subgroup, both recommendation accuracy and recommendation speed improve. In addition, because features such as user background and expert opinion are added when the user subgroups are generated, the resulting cluster subgroups perform much better in the later recommendation than user cluster subgroups obtained by traditional clustering alone. The specific steps of the invention are as follows:
A recommendation algorithm fusing item popularity and expert coefficients, comprising the following steps:
Step 1, according to user background information, construct similar user-background subgroups by clustering.
The invention clusters the user basic information table with a combined clustering algorithm. The clustering comprises two stages: the first stage, coarse clustering, quickly and approximately divides the data into subsets; the second stage clusters the points in the subsets again with an exact method. The main idea of the clustering optimization is as follows: set K initial subset centers and a region radius for the sample data set, and efficiently divide the data into several overlapping subsets so that every object falls within the coverage of some subset; for the objects falling in the same region, recompute a new central point and reassign each object to a region according to its distance from the new central points; repeat the "divide into subsets, compute central points" process until the positions of the K central points no longer change. The distances between the target user and the centers of the different similar-background user subgroups are then computed to obtain the similarity between the target user and each subgroup, and the samples are finally divided into groups according to similarity. The specific steps are as follows:
Step 1.1, taking the basic information table of all users as samples, construct a user data set; quantize the user data set to obtain a sample set list, in which each point represents one quantized sample; select two distance thresholds T1 and T2, where T1 > T2; the values of T1 and T2 can be determined by cross-validation.
Step 1.2, take any point P from the list and compute the distance between P and every subset center; if no subset exists yet, make P a subset center and delete P from the list; otherwise go to step 1.3;
Step 1.3, if the distance between P and some subset center is within T2, delete P from the list and add P to that subset;
In this step, point P is considered close enough to the subset, so P can no longer serve as the center of another subset.
Step 1.4, if the distance between P and some subset center is between T2 and T1, add P to that subset but do not delete P from the list;
This means that point P can take part in the next round of clustering;
Step 1.5, if the distance between P and every subset center is greater than T1, make P a subset center and delete P from the list;
Step 1.6, repeat steps 1.2 to 1.5 until the list is empty; once the list is empty, coarse clustering is complete and the different subsets are obtained; average all the points in each subset to obtain the central point of each subset.
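The coarse clustering of steps 1.1 to 1.6 closely resembles what is commonly called Canopy clustering. The sketch below uses the standard Canopy formulation rather than a literal transcription of the steps above: it picks a center, sweeps all remaining points against it, and repeats, which guarantees termination. The function names `canopy` and `subset_means` and the use of Euclidean distance are assumptions made for illustration.

```python
import math

def canopy(points, t1, t2):
    """Standard Canopy coarse clustering; t1 is the loose threshold, t2 the tight one."""
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    canopies = []  # list of (center, members) pairs
    while remaining:
        center = remaining.pop(0)  # any remaining point may serve as a center
        members = [center]
        still_remaining = []
        for p in remaining:
            d = math.dist(center, p)  # ASSUMPTION: Euclidean distance
            if d < t1:
                members.append(p)          # loosely close: joins this subset
            if d >= t2:
                still_remaining.append(p)  # not tightly bound: stays in the list
        remaining = still_remaining
        canopies.append((center, members))
    return canopies

def subset_means(canopies):
    """Average all points of each subset to obtain its central point (step 1.6)."""
    return [tuple(sum(col) / len(members) for col in zip(*members))
            for _, members in canopies]
```

Points whose distance lies between T2 and T1 stay in the list, so they can join several overlapping subsets or become centers themselves, which matches the intent of step 1.4.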
After coarse clustering is complete, the central point of each subset and the number of subsets are obtained and serve as the basic parameters for the fine clustering in the next step. The fine clustering uses the K-means algorithm:
Step 1.7, cluster the result of the coarse clustering of step 1.6 with the K-means algorithm to generate the similar user-background subgroups; the initial cluster centers of the K-means algorithm are the central points of the subsets described in step 1.6, and the K value of the K-means algorithm is the number of subsets generated by the coarse clustering of step 1.6.
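As a minimal sketch of step 1.7, the following plain-Python version of Lloyd's K-means takes its initial centers from the coarse clustering, so K is simply the number of centers passed in. The function name `kmeans`, the `max_iter` cap, and the handling of empty clusters are illustrative assumptions, not details given in the text.

```python
import math

def kmeans(points, init_centers, max_iter=100):
    """Lloyd's K-means seeded with given centers; K equals len(init_centers)."""
    centers = [tuple(c) for c in init_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for k, old in enumerate(centers):
            cluster = [p for p, lab in zip(points, labels) if lab == k]
            if cluster:
                new_centers.append(tuple(sum(col) / len(cluster) for col in zip(*cluster)))
            else:
                new_centers.append(old)  # ASSUMPTION: an empty cluster keeps its center
        if new_centers == centers:  # centers stopped moving: converged
            break
        centers = new_centers
    return centers, labels
```

Seeding from the coarse result removes the need to guess K and tends to shorten convergence compared with random initialization.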
In this embodiment, the public film data set MovieLens-1M is taken as an example; the information tables in the data set are described in Table 1, Table 2, and Table 3:
Table 1: User basic information table in MovieLens
Attribute | Description |
UserID | Unique user identifier, numbered 1~6040 |
Gender | User gender, binary feature, value "F" or "M" |
Age | User age, discrete feature, values 1~56 |
Occupation | User occupation, discrete feature, 21 kinds |
Zip-code | Postal code |
Table 2: Film basic information table in MovieLens
Attribute | Description |
MovieID | Unique film identifier, numbered 1~3952 |
Title | Film title |
Genres | Film genre, 18 different genres |
Table 3: User-film rating information table in MovieLens
Attribute | Description |
UserID | User number |
MovieID | Film number |
Rating | User's rating of the film, in [1,5] |
Time | Time at which the user rated the film, given as a timestamp |
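Step 1.1 quantizes each user record into a point before clustering, but the text does not say how. One possible encoding of the Table 1 attributes is sketched below; the function `encode_user`, the 0/1 gender flag, the age scaling by 56, and the one-hot occupation vector are all assumptions made for illustration.

```python
def encode_user(gender, age, occupation, n_occupations=21, max_age=56):
    """Turn one row of the user basic information table into a numeric vector.

    ASSUMPTION: gender becomes a 0/1 flag, age is scaled into (0, 1],
    and the occupation code (0 .. n_occupations - 1) is one-hot encoded.
    """
    one_hot = [0.0] * n_occupations
    one_hot[occupation] = 1.0
    return [1.0 if gender == "M" else 0.0, age / max_age] + one_hot
```

Scaling keeps the age component from dominating the distance computation; the zip code is left out here because it has no natural numeric meaning.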
Step 2, take the item rating data of all users in a similar user-background subgroup to form an item rating matrix, and correct the item rating matrix with the item popularity rating coefficient to reduce its sparsity, obtaining a simplified item rating matrix.
In this embodiment, the items are films, and the user item rating data are taken from the user-film rating information table of the public film data set MovieLens-1M, shown in Table 3; the item rating data of all users together form the item rating matrix.
In this scheme, item popularity is defined by:
a. the share of the user's ratings of a specific item type in the user's ratings of all items;
b. the share of the user's valid ratings among all items;
The item popularity rating coefficient is then computed by the following formula:
where i denotes an item in category c; Z(c) denotes the set of items in category c; P_{u,i} denotes the rating of user u on film i; ∑_{i∈Z(c)} P_{u,i} denotes the total rating of user u in category c; ∑P_u denotes the total rating of user u over all categories; d(C) is the total, over all categories, of the items that user u has rated; and d(c) is the total number of items of category c that user u has rated.
Using the item popularity rating coefficient, the highly sparse rating matrix whose dimension is the number of items is reduced to a less sparse matrix whose dimension is the number of item categories (note: in practice, the number of items is always higher than the number of item categories). Specifically, the item rating matrix is processed with the item popularity rating coefficient formula above, and the result is the corrected, simplified item rating matrix based on item popularity.
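The two ratios named in definitions a and b can be computed per user and category as sketched below. The coefficient formula itself is not reproduced in this text, so the way the two ratios are combined in `popularity_coefficient` (a plain product) is purely a hypothetical choice, as are the function names and the simplification that each item belongs to a single category.

```python
def category_shares(user_ratings, item_category):
    """Per-category rating shares for one user.

    user_ratings: {item: score}; item_category: {item: category}.
    Returns {category: (score_share, count_share)} where
      score_share = sum of scores in c / sum of all scores   (definition a)
      count_share = d(c) / d(C), rated items in c / all rated items (definition b)
    """
    total_score = sum(user_ratings.values())
    total_count = len(user_ratings)
    score_sum, count = {}, {}
    for item, score in user_ratings.items():
        c = item_category[item]  # ASSUMPTION: one category per item
        score_sum[c] = score_sum.get(c, 0.0) + score
        count[c] = count.get(c, 0) + 1
    return {c: (score_sum[c] / total_score, count[c] / total_count)
            for c in score_sum}

def popularity_coefficient(score_share, count_share):
    # ASSUMPTION: the source formula is missing; a product is one possible combination.
    return score_share * count_share
```

Whatever the exact combination, the result has one value per (user, category) pair, which is what turns the user-by-item matrix into the much denser user-by-category matrix.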
Step 3, correct the features of the simplified item rating matrix with the expert recommendation coefficient to obtain an optimized item rating matrix; compute the user rating similarity from the optimized item rating matrix, and linearly combine the user rating similarity with the background similarity.
After the item popularity rating coefficient has been introduced to correct the item rating matrix, analysis of real film recommendation scenarios shows that users' viewing choices are influenced by the guidance of expert users within the group. This scheme therefore introduces the expert recommendation coefficient as a feature for correcting the rating matrix, and fuses the user background similarity with the item popularity and expert recommendation similarity into a new user similarity computation.
Step 3.1, the expert recommendation coefficient is defined by:
a. the share of the user's ratings of a specific item type in the user's total ratings;
b. the time decay coefficient of the user's ratings.
The expert recommendation coefficient is then computed by the following formula:
where N_{u,c} denotes the number of ratings user u has given to items of type c; N_{u,C} denotes the number of ratings user u has given to items of all types; t_{u,c} denotes the time difference between user u's rating of an item of type c and that user's first rating record in the user data set; and e is the base of the natural logarithm.
Here, the first term represents the share of user u's ratings that fall on items of type c, and the second term is the decay coefficient of the user's ratings, which indicates whether the user has been active recently. Under the influence of the decay coefficient, r_{u,c}(t_{uc}) is a function of time that varies within (0,1); the more recent the ratings and the more numerous they are, the larger the expert coefficient of user u and the more it helps the next recommendation step.
Step 3.2, apply the expert recommendation coefficient to the simplified item rating matrix to correct the features of the item rating matrix and obtain the optimized item rating matrix;
Because the expert recommendation coefficient is a number within (0,1) that varies with time, it can be used as a coefficient of the simplified item rating matrix. The matrix built by fusing the item popularity rating coefficient with the expert recommendation coefficient is therefore computed as follows:
R_{uc}(t_{uc}) = r_{u,c}(t_{uc}) × R
In the above formula, r_{u,c}(t_{uc}) is the expert recommendation coefficient, R is the simplified item rating matrix from step 2, and R_{uc}(t_{uc}) is the optimized item rating matrix.
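Read element by element, the formula above scales each user's entry for a category by that user's expert recommendation coefficient for the same category. The sketch below assumes this element-wise reading and a nested-dictionary layout (user, then category); both, along with the function name `apply_expert_coefficients`, are assumptions for illustration.

```python
def apply_expert_coefficients(R, r):
    """Scale the simplified matrix R by the expert coefficients r, entry by entry.

    R and r are both {user: {category: value}}.
    ASSUMPTION: the product in the formula is element-wise per (user, category).
    """
    return {u: {c: r[u][c] * score for c, score in row.items()}
            for u, row in R.items()}
```

Because every coefficient lies in (0,1), the scaling can only shrink entries, so ratings from recently active, prolific users in a category are reduced the least.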
Step 3.3, compute the user rating similarity from the optimized item rating matrix by the following formula:
where u and u_1 denote any two users; p_{u,c} denotes the rating of user u on items of type c in R_{uc}(t_{uc}); the other terms of the formula denote, respectively, the average rating of user u over all item types, the rating of user u_1 on items of type c, and the average rating of user u_1 over all item types, all taken from R_{uc}(t_{uc}); and C denotes the set of item categories.
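The quantities listed for step 3.3 (per-category ratings, each user's average, a sum over the category set C) are the ingredients of a Pearson-style correlation. Since the formula itself is not reproduced here, the sketch below is an assumed Pearson correlation over the categories both users have values for; the function name `rating_similarity` and the zero return for degenerate cases are illustrative choices.

```python
import math

def rating_similarity(row_u, row_v):
    """Pearson-style similarity between two users' category rating rows.

    row_u, row_v: {category: rating} taken from the optimized matrix.
    ASSUMPTION: the sum runs over categories present in both rows, and
    each mean is taken over all of that user's categories.
    """
    common = set(row_u) & set(row_v)
    if not common:
        return 0.0
    mean_u = sum(row_u.values()) / len(row_u)
    mean_v = sum(row_v.values()) / len(row_v)
    num = sum((row_u[c] - mean_u) * (row_v[c] - mean_v) for c in common)
    den = (math.sqrt(sum((row_u[c] - mean_u) ** 2 for c in common))
           * math.sqrt(sum((row_v[c] - mean_v) ** 2 for c in common)))
    return num / den if den else 0.0
```

Subtracting each user's mean removes individual rating bias, so a generous rater and a strict rater with the same preferences still come out as similar.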
Step 3.4, compute the user background similarity between every two users in the same similar user-background subgroup from the user basic information table by the following formula:
where r_u and the corresponding vector for u_1 denote the background attribute feature vectors obtained by vectorizing the basic information of users u and u_1.
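The background similarity of step 3.4 compares two feature vectors, but its formula is not reproduced in this text. A common choice for vector similarity is the cosine of the angle between them, and the sketch below assumes that choice; the function name `background_similarity` is hypothetical.

```python
import math

def background_similarity(vec_u, vec_v):
    """Cosine similarity between two background feature vectors.

    ASSUMPTION: cosine similarity stands in for the formula missing
    from the text; zero vectors are given similarity 0.
    """
    dot = sum(x * y for x, y in zip(vec_u, vec_v))
    norm_u = math.sqrt(sum(x * x for x in vec_u))
    norm_v = math.sqrt(sum(y * y for y in vec_v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

For non-negative feature vectors the result lies in [0, 1], which puts it on a scale comparable to the rating similarity before the two are combined.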
Step 3.5, linearly combine the user rating similarity and the user background similarity to obtain the total user similarity:
sim_{UBICF}(u, u_1) = λ·sim_{UB}(u, u_1) + (1 − λ)·sim_{IC}(u, u_1)
where the fusion parameter λ ∈ [0, 1].
According to the above total user similarity formula, the total user similarity between every two users in the similar user-background subgroup is computed, forming the user similarity matrix sim_u.
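The fusion formula of step 3.5 translates directly into code. In the sketch below, sim_UB is read as the background similarity and sim_IC as the rating similarity, which is an interpretation of the subscripts rather than something the text states; the function names and the dictionary-of-dictionaries matrix layout are likewise assumptions.

```python
def total_similarity(sim_ub, sim_ic, lam):
    """Linear fusion: lam * sim_UB + (1 - lam) * sim_IC, with lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * sim_ub + (1.0 - lam) * sim_ic

def similarity_matrix(users, sim_ub, sim_ic, lam):
    """Total similarity for every ordered pair of distinct users in one subgroup.

    sim_ub and sim_ic are callables taking two user ids.
    """
    return {u: {v: total_similarity(sim_ub(u, v), sim_ic(u, v), lam)
                for v in users if v != u}
            for u in users}
```

Restricting `users` to a single subgroup is what keeps the matrix small: the pairwise cost grows with the square of the subgroup size, not of the whole user base.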
Step 4, according to the user similarity matrix, obtain the candidate set for predicting the target user's ratings, and predict the target user's ratings of items by similarity-weighted averaging to obtain the item recommendation result.
Because the ratio of users to films in the original user-film rating matrix is severely unbalanced (the number of users is far smaller than the number of films), the matrix contains many blank ratings (that is, films the user has not rated). The ultimate goal of the recommender system is to determine whether the user would like these unrated films: a film is added to the user's recommendation set if so, and is not added otherwise.
Step 3 yields the user similarity matrix sim_u, so each user has a neighbor user set sorted by similarity. The N most similar users are then selected as the target user's candidate set for film rating prediction. The process of predicting a user's film ratings from the user similarity matrix sim_u and the users' item rating matrix is as follows:
Step 4.1, for target user u′, select from the user similarity matrix the N users most similar to u′ to form the candidate set U_{neigh} for predicting the target user's ratings; the size of N can be set as needed;
Step 4.2, from the items rated by the users in candidate set U_{neigh}, remove the items that target user u′ has already rated; the remaining items form the target user's item recommendation candidate set; the similarity-weighted average score of each item in the candidate set is taken as target user u′'s predicted rating of that item, computed as follows:
where u_1 is any user in candidate set U_{neigh}; sim_{UBICF}(u, u_1) is the total user similarity between target user u′ and user u_1; the rating term of the formula denotes the rating of user u_1 on item i; N(U_{neigh}) denotes the number of users in candidate set U_{neigh} who have rated item i; and p′_{u′,i} denotes target user u′'s predicted rating of item i.
Step 4.3, sort the items in the item candidate set by the target user's predicted ratings, and recommend the M items with the highest predicted ratings to the target user, obtaining the item recommendation result; the value of M can be set as needed, for example 1 to 5.
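Steps 4.1 to 4.3 can be sketched as a single function. The prediction formula is not reproduced in this text; the version below normalizes by the sum of absolute similarities, which is the common form of a similarity-weighted average and may differ from the patent's formula (the definitions above mention the rater count N(U_{neigh})). The function name `recommend` and the dictionary layouts are assumptions.

```python
def recommend(target, sim_row, ratings, n_neighbors, m_items):
    """Top-M items for `target` predicted from its N most similar users.

    sim_row: {user: total similarity to target}
    ratings: {user: {item: score}}
    ASSUMPTION: prediction = sum(sim * score) / sum(|sim|) over neighbors who rated the item.
    """
    # Step 4.1: the N most similar users form the candidate set.
    neighbors = sorted(sim_row, key=sim_row.get, reverse=True)[:n_neighbors]
    # Step 4.2: neighbors' items minus the items the target already rated.
    seen = set(ratings.get(target, {}))
    candidates = {i for v in neighbors for i in ratings.get(v, {})} - seen
    predicted = {}
    for item in candidates:
        raters = [v for v in neighbors if item in ratings.get(v, {})]
        weight = sum(abs(sim_row[v]) for v in raters)
        if weight > 0:
            predicted[item] = sum(sim_row[v] * ratings[v][item] for v in raters) / weight
    # Step 4.3: rank by predicted rating and keep the top M.
    ranked = sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:m_items]
```

For example, with neighbors of similarity 0.9 and 0.5 who rated an item 5 and 3, the predicted rating is (0.9 × 5 + 0.5 × 3) / 1.4, which leans toward the more similar neighbor's score.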
Fig. 2 to Fig. 4 give the results of the different simulation experiments for the method of the invention. The test results show that the fusion parameter λ has only a small effect on the recommendation accuracy of the invention, whereas the value of N has a larger effect. Fig. 4 shows that, for different values of N, the invention achieves a clearly lower root-mean-square error (RMSE) than comparable algorithms, indicating that its recommendations are considerably better than those of the existing algorithms.
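The RMSE used in the comparison above is the square root of the mean squared difference between predicted and actual ratings. A minimal sketch follows; the function name `rmse` and the list-based inputs are illustrative assumptions.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equally long rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))
```

A lower RMSE means the predicted ratings sit closer to the held-out true ratings, which is why it serves as the accuracy measure in Fig. 2 to Fig. 4.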
Claims (6)
1. A recommendation algorithm fusing item popularity and expert coefficients, characterized by comprising the following steps:
Step 1, according to user background information, construct similar user-background subgroups by clustering;
Step 2, take the item rating data of all users in a similar user-background subgroup to form an item rating matrix, and correct the item rating matrix with the item popularity rating coefficient to reduce its sparsity, obtaining a simplified item rating matrix;
Step 3, correct the features of the simplified item rating matrix with the expert recommendation coefficient to obtain an optimized item rating matrix; compute the user rating similarity from the optimized item rating matrix, and linearly combine the user rating similarity with the background similarity to obtain the total similarity and construct the user similarity matrix;
Step 4, according to the user similarity matrix, obtain the candidate set for predicting the target user's ratings, and predict the target user's ratings of items by similarity-weighted averaging to obtain the item recommendation result.
2. The recommendation algorithm fusing project popularity and expert coefficient according to claim 1, characterized in that step 1 specifically comprises:
Step 1.1: taking the basic information tables of all users as samples, construct a user data set and quantize it into a sample list, each point of which represents one quantized sample; select two distance thresholds, T1 and T2;
Step 1.2: take an arbitrary point P from the list and calculate the distance from P to every subset center; if no subset exists yet, make P a subset center and delete P from the list; otherwise go to step 1.3;
Step 1.3: if the distance from P to some subset center is within T2, delete P from the list and add P to that subset;
Step 1.4: if the distance from P to some subset center is between T1 and T2, add P to that subset but do not delete P from the list;
Step 1.5: if the distances from P to all subset centers are beyond T1, make P a subset center and delete P from the list;
Step 1.6: repeat steps 1.2 to 1.5 until the list is empty; once the list is empty, the coarse clustering is complete and the different subsets are obtained; average all the points in each subset to obtain the center point of each subset;
Step 1.7: cluster the result of the coarse clustering with the K-means algorithm to generate the user-background similar subgroups, where the initial cluster centers of the K-means algorithm are the center points of the subsets and the K value is the number of subsets produced by the coarse clustering.
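The coarse-then-fine clustering of steps 1.1 to 1.7 can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the patented implementation: it assumes Euclidean distance, T1 > T2, and the common Canopy formulation in which each pass takes the first remaining point as a new center, so a point left in the list (between T2 and T1 of a center) may later seed a subset of its own. The function names `canopy` and `kmeans` and the sample points are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length points (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def canopy(points, t1, t2):
    """Coarse clustering; returns the mean point of each subset (step 1.6)."""
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    subsets = []
    while remaining:
        center = remaining.pop(0)      # new subset center, removed from the list
        subset, keep = [center], []
        for q in remaining:
            d = dist(center, q)
            if d <= t1:
                subset.append(q)       # loosely bound: joins the subset
            if d > t2:
                keep.append(q)         # not tightly bound: stays in the list
        remaining = keep
        subsets.append(subset)
    return [mean(s) for s in subsets]

def kmeans(points, centers, iters=20):
    """K-means seeded with the coarse centers; K = len(centers) (step 1.7)."""
    centers = list(centers)

    def nearest(p):
        return min(range(len(centers)), key=lambda j: dist(p, centers[j]))

    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p)].append(p)
        updated = [mean(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break                      # assignments no longer move the centers
        centers = updated
    return [nearest(p) for p in points], centers

# Hypothetical quantized user samples forming two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
coarse_centers = canopy(points, t1=5.0, t2=3.0)
labels, final_centers = kmeans(points, coarse_centers)
```

Seeding K-means with the coarse centers removes the need to guess K in advance, which is the design choice step 1.7 describes.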
3. The recommendation algorithm fusing project popularity and expert coefficient according to claim 1, characterized in that the project popularity scoring coefficient in step 2 is calculated by the following formula:
[The formula is not reproduced in this text.]
where i denotes a project in category c; Z(c) denotes the set of projects under category c; $P_{u,i}$ denotes the rating given by user u to film i; $\sum_{i \in Z(c)} P_{u,i}$ denotes the total score given by user u in category c; $\sum P_u$ denotes the total score given by user u over all categories; D(C) is the total number of categories of the projects that user u has rated; and d(c) is the total number of projects of category c that the user has rated.
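Because the coefficient formula itself is not available in this text, the sketch below computes only the four quantities that the definitions above name for one user, and deliberately does not combine them. The function name `category_stats`, the dictionary layout, and the sample data are hypothetical.

```python
from collections import defaultdict

def category_stats(user_ratings, item_category):
    """Quantities named in claim 3 for a single user.

    user_ratings:  {project: rating} for one user u
    item_category: {project: category}
    """
    score_by_cat = defaultdict(float)
    count_by_cat = defaultdict(int)
    for item, score in user_ratings.items():
        c = item_category[item]
        score_by_cat[c] += score       # sum over i in Z(c) of P_{u,i}
        count_by_cat[c] += 1           # d(c): projects of category c rated by u
    return {
        "category_score": dict(score_by_cat),
        "total_score": sum(user_ratings.values()),  # sum of P_u over all categories
        "D_C": len(count_by_cat),                   # number of categories u has rated
        "d_c": dict(count_by_cat),
    }

# Hypothetical data: three rated films in two categories.
stats = category_stats({"a": 4, "b": 2, "c": 5}, {"a": "x", "b": "x", "c": "y"})
```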
4. The recommendation algorithm fusing project popularity and expert coefficient according to claim 1, characterized in that correcting the features of the simplified project rating matrix with the expert recommendation coefficient in step 3, to obtain the optimized project rating matrix, comprises:
calculating the expert recommendation coefficient by the following formula:
[The formula is not reproduced in this text.]
where $N_{u,c}$ denotes the number of ratings given by user u to projects of type c; $N_{u,C}$ denotes the number of ratings given by user u to projects of all types; $t_{u,c}$ denotes the time difference between the time at which user u rated a project of type c and the time of that user's first rating record in the user data set; and e is the base of the natural logarithm;
adding the expert recommendation coefficient to the simplified project rating matrix so as to correct the features of the project rating matrix and obtain the optimized project rating matrix, calculated as:
$R_{u,c}(t_{u,c}) = r_{u,c}(t_{u,c}) \times R$
where $r_{u,c}(t_{u,c})$ is the expert recommendation coefficient, R is the simplified project rating matrix from step 2, and $R_{u,c}(t_{u,c})$ is the optimized project rating matrix.
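The correction step above, in which the coefficient multiplies the simplified rating matrix, can be illustrated as follows. This sketch assumes the coefficients have already been computed (the coefficient formula is not available in this text) and that each rating is scaled by the coefficient of its project's category; the name `apply_expert_coefficient` and the data layout are hypothetical.

```python
def apply_expert_coefficient(ratings, item_category, coef):
    """Scale every rating by the expert coefficient of (user, category).

    ratings:       {user: {project: rating}}   simplified matrix R
    item_category: {project: category}
    coef:          {(user, category): r_uc}    precomputed coefficients (assumed given)
    """
    optimized = {}
    for user, items in ratings.items():
        optimized[user] = {
            item: coef[(user, item_category[item])] * score
            for item, score in items.items()
        }
    return optimized

# Hypothetical data: one user, two projects in different categories.
optimized = apply_expert_coefficient(
    {"u": {"a": 4.0, "b": 2.0}},
    {"a": "x", "b": "y"},
    {("u", "x"): 0.5, ("u", "y"): 1.5},
)
```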
5. The recommendation algorithm fusing project popularity and expert coefficient according to claim 1, characterized in that calculating the user rating similarity from the optimized project rating matrix in step 3, linearly combining the user rating similarity with the background similarity to obtain the total similarity, and constructing the user similarity matrix comprises:
calculating the rating similarity of users from the optimized project rating matrix by the following formula:
[The formula is not reproduced in this text.]
where u and $u_1$ denote any two users; $p_{u,c}$ denotes the rating of user u for projects of type c in $R_{u,c}(t_{u,c})$; $\bar{p}_u$ denotes the average rating of user u over projects of all types in $R_{u,c}(t_{u,c})$; $p_{u_1,c}$ denotes the rating of user $u_1$ for projects of type c in $R_{u,c}(t_{u,c})$; $\bar{p}_{u_1}$ denotes the average rating of user $u_1$ over projects of all types in $R_{u,c}(t_{u,c})$; and C denotes the set of project categories;
calculating, from the users' basic information, the user background similarity between every pair of users in the same user-background similar subgroup by the following formula:
[The formula is not reproduced in this text.]
where $r_u$ and $r_{u_1}$ denote the background attribute feature vectors obtained by vectorizing the basic information of users u and $u_1$;
linearly combining the user rating similarity and the user background similarity to obtain the total user similarity:
$\mathrm{sim}_{UBICF}(u,u_1) = \lambda\,\mathrm{sim}_{UB}(u,u_1) + (1-\lambda)\,\mathrm{sim}_{IC}(u,u_1)$
where the fusion parameter $\lambda \in [0,1]$;
calculating, by the above total similarity formula, the total similarity between every two users in the user-background similar subgroup, thereby forming the user similarity matrix $\mathrm{sim}_u$.
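The similarity fusion above can be sketched as follows. Two points are assumptions rather than facts from the text: the rating similarity is taken to be a mean-centered (Pearson-style) correlation over categories, which the definitions of the per-category and average ratings suggest, and the background similarity is taken to be the cosine of the two feature vectors. Reading the subscripts UB as "user background" and IC as the rating-based term is likewise a guess from the abbreviations. All function names are hypothetical.

```python
import math

def pearson(p_u, p_v):
    """Mean-centered correlation of two users' per-category ratings (assumed form)."""
    shared = sorted(set(p_u) & set(p_v))
    if not shared:
        return 0.0
    mu_u = sum(p_u.values()) / len(p_u)   # average over all of u's categories
    mu_v = sum(p_v.values()) / len(p_v)
    num = sum((p_u[c] - mu_u) * (p_v[c] - mu_v) for c in shared)
    den = math.sqrt(sum((p_u[c] - mu_u) ** 2 for c in shared)) * \
          math.sqrt(sum((p_v[c] - mu_v) ** 2 for c in shared))
    return num / den if den else 0.0

def cosine(r_u, r_v):
    """Cosine of two background feature vectors (assumed form)."""
    num = sum(a * b for a, b in zip(r_u, r_v))
    den = math.sqrt(sum(a * a for a in r_u)) * math.sqrt(sum(b * b for b in r_v))
    return num / den if den else 0.0

def total_similarity(sim_ub, sim_ic, lam):
    """sim_UBICF = lam * sim_UB + (1 - lam) * sim_IC, with lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * sim_ub + (1.0 - lam) * sim_ic

# Hypothetical users: proportional ratings, identical background direction.
sim_ic = pearson({"x": 1, "y": 2, "z": 3}, {"x": 2, "y": 4, "z": 6})
sim_ub = cosine((1, 1), (2, 2))
fused = total_similarity(sim_ub, sim_ic, lam=0.3)
```

With λ = 0 only the rating-based term contributes, and with λ = 1 only the background term does, so λ controls how much weight the background information receives.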
6. The recommendation algorithm fusing project popularity and expert coefficient according to claim 1, characterized in that step 4 specifically comprises:
Step 4.1: for a target user u′, select from the user similarity matrix the N users with the highest similarity to the target user u′ to form the candidate set $U_{neigh}$ for predicting the target user's ratings;
Step 4.2: from the projects rated by the users in the candidate set $U_{neigh}$, delete the projects that the target user u′ has already rated; the remaining projects constitute the project recommendation candidate set of the target user; the similarity-weighted average score of each project in the candidate set is taken as the predicted rating of the target user u′ for that project, calculated by the following formula:
[The formula is not reproduced in this text.]
where $u_1$ is any user in the candidate set $U_{neigh}$; $\mathrm{sim}_{UBICF}(u,u_1)$ is the total user similarity between the target user u′ and user $u_1$; $p_{u_1,i}$ denotes the rating given by user $u_1$ to project i; $N(U_{neigh})$ denotes the number of users in the candidate set $U_{neigh}$ who have rated project i; and $p'_{u',i}$ denotes the predicted rating of the target user u′ for project i;
Step 4.3: according to the target user's predicted ratings, sort the projects in the project candidate set, select the M projects with the highest predicted ratings, and recommend them to the target user to obtain the project recommendation result.
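Steps 4.1 to 4.3 can be sketched as follows. Since the prediction formula is not available in this text, the sketch assumes one common form of similarity-weighted averaging: the weighted sum of neighbor ratings divided by the sum of the similarities of the neighbors who rated the project. The function name `recommend`, its parameters, and the sample data are hypothetical.

```python
def recommend(target, sim_row, ratings, n_neighbors, m_items):
    """Top-M projects for `target` from its top-N most similar users.

    sim_row: {user: total similarity to target}
    ratings: {user: {project: rating}}
    """
    # Step 4.1: the N users most similar to the target form U_neigh.
    neighbors = sorted(
        (u for u in sim_row if u != target),
        key=lambda u: sim_row[u],
        reverse=True,
    )[:n_neighbors]

    # Step 4.2: neighbor-rated projects minus those the target already rated.
    seen = set(ratings.get(target, {}))
    candidates = {i for u in neighbors for i in ratings.get(u, {})} - seen

    predicted = {}
    for item in candidates:
        raters = [u for u in neighbors if item in ratings.get(u, {})]
        weight = sum(sim_row[u] for u in raters)
        if weight > 0:
            # Assumed normalization: divide by the summed similarities.
            predicted[item] = sum(sim_row[u] * ratings[u][item] for u in raters) / weight

    # Step 4.3: rank by predicted rating (ties broken by project id), keep M.
    ranked = sorted(predicted, key=lambda i: (-predicted[i], i))
    return [(i, predicted[i]) for i in ranked[:m_items]]

# Hypothetical data: target "t" has rated only project "a".
ratings = {
    "t": {"a": 5},
    "u1": {"a": 4, "b": 4, "c": 2},
    "u2": {"b": 2, "c": 4},
    "u3": {"d": 5},
}
sim_row = {"u1": 0.75, "u2": 0.25, "u3": 0.1}
top = recommend("t", sim_row, ratings, n_neighbors=2, m_items=1)
```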
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910128705.4A CN109977299B (en) | 2019-02-21 | 2019-02-21 | Recommendation algorithm fusing project popularity and expert coefficient |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109977299A (en) | 2019-07-05 |
CN109977299B (en) | 2022-12-27 |
Family
ID=67077170
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910128705.4A Active CN109977299B (en) | 2019-02-21 | 2019-02-21 | Recommendation algorithm fusing project popularity and expert coefficient |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109977299B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102479202A (en) * | 2010-11-26 | 2012-05-30 | 卓望数码技术(深圳)有限公司 | Recommendation system based on domain expert |
CN104317900A (en) * | 2014-10-24 | 2015-01-28 | 重庆邮电大学 | Multiattribute collaborative filtering recommendation method oriented to social network |
CN105868237A (en) * | 2015-12-09 | 2016-08-17 | 乐视网信息技术(北京)股份有限公司 | Multimedia data recommendation method and server |
CN106021329A (en) * | 2016-05-06 | 2016-10-12 | 西安电子科技大学 | A user similarity-based sparse data collaborative filtering recommendation method |
CN108205682A (en) * | 2016-12-19 | 2018-06-26 | 同济大学 | It is a kind of for the fusion content of personalized recommendation and the collaborative filtering method of behavior |
CN108647724A (en) * | 2018-05-11 | 2018-10-12 | 国网电子商务有限公司 | A kind of user's recommendation method and device based on simulated annealing |
US20180322206A1 (en) * | 2017-05-05 | 2018-11-08 | Microsoft Technology Licensing, Llc | Personalized user-categorized recommendations |
CN108804683A (en) * | 2018-06-13 | 2018-11-13 | 重庆理工大学 | Associate(d) matrix decomposes and the film of collaborative filtering recommends method |
CN109166017A (en) * | 2018-10-12 | 2019-01-08 | 平安科技(深圳)有限公司 | Method for pushing, device, computer equipment and storage medium based on reunion class |
CN109360057A (en) * | 2018-10-12 | 2019-02-19 | 平安科技(深圳)有限公司 | Information-pushing method, device, computer equipment and storage medium |
Non-Patent Citations (5)
Title |
---|
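U LIJI et al.: "Improved personalized recommendation based on user attributes clustering and score matrix filling", Computer Standards & Interfaces * |
WU Yifan et al.: "Collaborative filtering recommendation algorithm combined with user background information", Journal of Computer Applications * |
WANG Yufei et al.: "Collaborative filtering recommendation algorithm based on user ratings and item-category preference", Software Guide * |
XUE Yan: "Research and application of employment recommendation algorithms for university students", China Master's Theses Full-text Database, Information Science and Technology * |
LI Anneng: "Research on an improved clustering collaborative filtering recommendation algorithm based on Hadoop", China Master's Theses Full-text Database, Information Science and Technology * |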
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110490686A (en) * | 2019-07-08 | 2019-11-22 | 西北大学 | A kind of building of commodity Rating Model, recommended method and system based on Time Perception |
CN110910215A (en) * | 2019-11-20 | 2020-03-24 | 深圳前海微众银行股份有限公司 | Product recommendation method, device, equipment and computer-readable storage medium |
CN111191707A (en) * | 2019-12-25 | 2020-05-22 | 浙江工商大学 | LFM training sample construction method fusing time attenuation factors |
CN111191707B (en) * | 2019-12-25 | 2023-06-06 | 浙江工商大学 | LFM training sample construction method integrating time attenuation factors |
CN111486345A (en) * | 2020-03-10 | 2020-08-04 | 安徽科杰粮保仓储设备有限公司 | Grain depot underground pipe network liquid leakage on-line monitoring and early warning method and device |
CN111486345B (en) * | 2020-03-10 | 2021-08-24 | 安徽科杰粮保仓储设备有限公司 | Grain depot underground pipe network liquid leakage on-line monitoring and early warning method and device |
CN113497831A (en) * | 2021-06-30 | 2021-10-12 | 西安交通大学 | Content placement method and system based on feedback popularity under mobile edge network |
CN113497831B (en) * | 2021-06-30 | 2022-10-25 | 西安交通大学 | Content placement method and system based on feedback popularity under mobile edge network |
Also Published As
Publication number | Publication date |
---|---|
CN109977299B (en) | 2022-12-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||