CN108389113A - A collaborative filtering recommendation method and system - Google Patents

A collaborative filtering recommendation method and system

Publication number
CN108389113A
Authority
CN
China
Prior art keywords
project
similarity
scoring
preset
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810240236.0A
Other languages
Chinese (zh)
Other versions
CN108389113B (en)
Inventor
胡超
谭北海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN201810240236.0A priority Critical patent/CN108389113B/en
Publication of CN108389113A publication Critical patent/CN108389113A/en
Application granted granted Critical
Publication of CN108389113B publication Critical patent/CN108389113B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/17 Function evaluation by approximation methods, e.g. inter- or extrapolation, smoothing, least mean square method
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Algebra (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a collaborative filtering recommendation method and system that fuse the similarity of item ratings with the similarity of item labels into a new item similarity measure, which can improve the accuracy of item recommendations made to users. This solves the technical problems that the results recommended by existing collaborative filtering approaches are not accurate enough, that information matching a new user's interests and preferences cannot be found and recommended from ratings alone, and that data sparsity tends to degrade the user experience.

Description

A collaborative filtering recommendation method and system
Technical field
The present invention relates to the technical field of data mining and recommendation, and in particular to a collaborative filtering recommendation method and system.
Background art
With the development of big data, content on the network grows ever richer, and people constantly search the network for the information they need. In this vast sea of information, however, accurately finding what one needs is difficult and takes considerable time and effort.
A recommender system is a special form of information filtering system. By analysing a user's historical interests and preference information, it can determine which items in the item space the user is likely to like now and in the future, and then proactively provide the user with corresponding item recommendations. Unlike a general information retrieval service, in which the user initiates the retrieval process, a recommender system can proactively offer the user suggestions about information resources. An information retrieval service generally does not represent, analyse or exploit the user's historical interests while interacting with the user, whereas a recommender system can actively record the user's historical interest information during the interaction, model the user's information needs, form a knowledge model of the user's interests and preferences, and deliver the final recommendation service on the basis of that model.
At present, the most widely used recommendation method is collaborative filtering, which is divided into label-based recommendation and item-based recommendation. The core idea of item-based collaborative filtering is to recommend to users items similar to those they liked before. It does not use the content attributes of items to compute item similarity; instead, it mainly computes similarity by analysing users' behaviour records. Traditional item-based collaborative filtering considers only a single kind of user feedback, namely the similarity of item ratings, so when few ratings are known the recommendation quality is poor. In label-based collaborative filtering, a label is information that describes an object's attributes; it reflects some attributes of an item and some of the users' views on it, and is an important source of information about user data. Computing item similarity from labels better reflects how items are related and how similar they are. Traditional label-based collaborative filtering, however, considers only the similarity of item labels and does not draw on users' historical data and records, so its recommendations are not very effective either.
Whether item-based or label-based, existing collaborative filtering does not produce sufficiently accurate recommendations. Moreover, methods that depend on rating information cannot use it to find and recommend information matching a new user's interests and preferences, and data sparsity tends to degrade the user experience.
Summary of the invention
Embodiments of the present invention provide a collaborative filtering recommendation method and system, to solve the technical problems that the results recommended by existing collaborative filtering approaches are not accurate enough, that information matching a new user's interests and preferences cannot be found and recommended from ratings alone, and that data sparsity tends to degrade the user experience.
A collaborative filtering recommendation method provided by the invention includes:
S1: Build an item rating matrix from all preset items and the acquired user rating data for the preset items;
S2: From the item rating matrix, calculate the difference between a single user's ratings of every two preset items and obtain a first similarity through a first Sigmoid function; calculate the difference between the single user's rating of a single preset item and the median of the preset rating range and obtain a second similarity through a second Sigmoid function; and calculate the difference between the single user's ratings of every two preset items and the single user's average rating over all preset items and obtain a third similarity through a third Sigmoid function;
S3: Multiply the first similarity, the second similarity and the third similarity to obtain the first item scoring similarity of every two preset items;
S4: Penalize the active users of every two preset items by a penalty function, and calculate the proportion of the number of active users of every two preset items among all rating users of those two items, to obtain the second item scoring similarity of every two preset items;
S5: Multiply the first item scoring similarity by the second item scoring similarity to obtain the third item scoring similarity;
S6: Convert the item label sets of all preset items into m-dimensional numeric label vectors, and calculate the item label similarity of each pair of preset items with a similarity measure;
S7: Calculate the item label collaborative filtering similarity according to the preset weights given to the third item scoring similarity and the item label similarity.
Preferably, the first similarity is Proximity(R_ui, R_uj), the first similarity of item i and item j, obtained through the first Sigmoid function, where R_ui is user u's rating of item i and R_uj is user u's rating of item j.
Preferably, the second similarity is Significance(R_ui, R_uj), the second similarity of item i and item j, obtained through the second Sigmoid function, where R_ui is user u's rating of item i, R_uj is user u's rating of item j, and R_med is the median of the rating range.
Preferably, the third similarity is Singularity(R_ui, R_uj), the third similarity of item i and item j, obtained through the third Sigmoid function, where R_ui is user u's rating of item i, R_uj is user u's rating of item j, and the average used is the mean of user u's ratings over all items.
Preferably, the second item scoring similarity is expressed in terms of N_u(i, j), N(i), N(j) and U_ij, where N_u(i, j) denotes the number of users who have rated both item i and item j, N(i) and N(j) denote the numbers of users who have rated item i and item j respectively, and U_ij is the set of all users of item i and item j.
Preferably, after step S7 the method further includes:
S8: From the item label collaborative filtering similarity, compute the nearest-neighbour set of each preset item with the k-nearest-neighbour algorithm, compute from the nearest-neighbour set the user's predicted ratings for the preset items the user has not rated, and generate a recommendation list.
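Step S8 can be illustrated with a short Python sketch. The text names the k-nearest-neighbour algorithm but does not give the prediction formula, so the similarity-weighted average used here is only an assumption, and the names `nearest_neighbours`, `predict_rating` and the sample data are hypothetical.

```python
import heapq
from typing import Dict, List, Optional

def nearest_neighbours(similarity: Dict[str, Dict[str, float]], item: str, k: int) -> List[str]:
    """Return the k items most similar to `item` (its nearest-neighbour set)."""
    candidates = ((s, other) for other, s in similarity[item].items() if other != item)
    return [other for _, other in heapq.nlargest(k, candidates)]

def predict_rating(user_ratings: Dict[str, float],
                   similarity: Dict[str, Dict[str, float]],
                   item: str, k: int) -> Optional[float]:
    """Predict a rating for an unrated item (assumed form: similarity-weighted average)."""
    rated = [(similarity[item][n], user_ratings[n])
             for n in nearest_neighbours(similarity, item, k)
             if n in user_ratings]
    weight = sum(abs(s) for s, _ in rated)
    if weight == 0:
        return None  # no rated neighbour, so no prediction is possible
    return sum(s * r for s, r in rated) / weight

# Hypothetical similarities of item i3 to three other items, and one user's ratings.
similarity = {"i3": {"i1": 0.9, "i2": 0.1, "i4": 0.5}}
user_ratings = {"i1": 5.0, "i2": 1.0, "i4": 3.0}
prediction = predict_rating(user_ratings, similarity, "i3", k=2)
```

Sorting the unrated items by predicted rating and keeping the top entries would then give the recommendation list.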
A collaborative filtering recommendation system provided by the invention includes:
A matrix construction unit, configured to build an item rating matrix from all preset items and the acquired user rating data for the preset items;
A rating calculation unit, configured to calculate, from the item rating matrix, the difference between a single user's ratings of every two preset items and obtain a first similarity through a first Sigmoid function; to calculate the difference between the single user's rating of a single preset item and the median of the preset rating range and obtain a second similarity through a second Sigmoid function; and to calculate the difference between the single user's ratings of every two preset items and the single user's average rating over all preset items and obtain a third similarity through a third Sigmoid function;
A first calculation unit, configured to multiply the first similarity, the second similarity and the third similarity to obtain the first item scoring similarity of every two preset items;
A second calculation unit, configured to penalize the active users of every two preset items by a penalty function, and to calculate the proportion of the number of active users of every two preset items among all rating users of those two items, to obtain the second item scoring similarity of every two preset items;
A third calculation unit, configured to multiply the first item scoring similarity by the second item scoring similarity to obtain the third item scoring similarity;
A fourth calculation unit, configured to convert the item label sets of all preset items into m-dimensional numeric label vectors and to calculate the item label similarity of each pair of preset items with a similarity measure;
A fifth calculation unit, configured to calculate the item label collaborative filtering similarity according to the preset weights given to the third item scoring similarity and the item label similarity.
Preferably, the first similarity is Proximity(R_ui, R_uj), the first similarity of item i and item j, obtained through the first Sigmoid function, where R_ui is user u's rating of item i and R_uj is user u's rating of item j.
Preferably, the second similarity is Significance(R_ui, R_uj), the second similarity of item i and item j, obtained through the second Sigmoid function, where R_ui is user u's rating of item i, R_uj is user u's rating of item j, and R_med is the median of the rating range.
Preferably, the second item scoring similarity is expressed in terms of N_u(i, j), N(i), N(j) and U_ij, where N_u(i, j) denotes the number of users who have rated both item i and item j, N(i) and N(j) denote the numbers of users who have rated item i and item j respectively, and U_ij is the set of all users of item i and item j.
In the collaborative filtering recommendation method provided by the invention, the first similarity is obtained through the first Sigmoid function from the difference between a single user's ratings of every two preset items; the second similarity is obtained through the second Sigmoid function from the difference between the single user's rating of a single preset item and the median of the preset rating range; and the third similarity is obtained through the third Sigmoid function from the difference between the single user's ratings of every two preset items and the single user's average rating over all preset items. The first, second and third similarities are multiplied to obtain the first item scoring similarity. The active users of every two preset items are penalized by a penalty function, and the proportion of the number of active users of every two preset items among all rating users of those two items is calculated to obtain the second item scoring similarity of every two preset items. The first item scoring similarity is then multiplied by the second item scoring similarity to obtain the third item scoring similarity. Meanwhile, the item label sets of all preset items are converted into m-dimensional numeric label vectors, and the item label similarity of each pair of preset items is calculated with a similarity measure. Finally, the item label collaborative filtering similarity is calculated according to the preset weights given to the third item scoring similarity and the item label similarity. By fusing the similarity of item ratings with the similarity of item labels, the invention designs a new item label collaborative filtering similarity that can improve recommendation accuracy and can provide recommendations both to users who have rated items and to users who have not. This solves the technical problems that the results recommended by existing collaborative filtering approaches are not accurate enough, that information matching a new user's interests and preferences cannot be found and recommended from ratings alone, and that data sparsity tends to degrade the user experience.
Brief description of the drawings
To explain the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of one embodiment of a collaborative filtering recommendation method provided by an embodiment of the invention;
Fig. 2 is a flow diagram of another embodiment of a collaborative filtering recommendation method provided by an embodiment of the invention;
Fig. 3 is a structural diagram of one embodiment of a collaborative filtering recommendation system provided by an embodiment of the invention.
Detailed description of the embodiments
Embodiments of the present invention provide a collaborative filtering recommendation method and system, to solve the technical problems that the results recommended by existing collaborative filtering approaches are not accurate enough, that information matching a new user's interests and preferences cannot be found and recommended from ratings alone, and that data sparsity tends to degrade the user experience.
To make the objects, features and advantages of the invention clearer and easier to understand, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the embodiments described below are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Referring to Fig. 1, which is a flow chart of one embodiment of the collaborative filtering recommendation method provided by the invention, one embodiment of the method includes:
Step 101: Build an item rating matrix from all preset items and the acquired user rating data for all preset items.
It should be noted that the item rating matrix can be built with all preset items as row vectors and the user rating data for all preset items as column vectors. The item rating matrix clearly shows how each user has rated each preset item.
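As a minimal sketch of this layout, assuming the rating data arrives as (user, item, rating) triples and that a missing rating is stored as `None`, the matrix could be built as follows. The function name and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple

def build_item_rating_matrix(ratings: List[Tuple[str, str, float]]):
    """Return (items, users, matrix) with one row per item and one column per user."""
    items = sorted({item for _, item, _ in ratings})
    users = sorted({user for user, _, _ in ratings})
    item_index = {item: row for row, item in enumerate(items)}
    user_index = {user: col for col, user in enumerate(users)}
    # None marks an item that the user has not rated.
    matrix: List[List[Optional[float]]] = [[None] * len(users) for _ in items]
    for user, item, score in ratings:
        matrix[item_index[item]][user_index[user]] = score
    return items, users, matrix

# Hypothetical ratings: u1 rated i1 and i2, u2 rated only i1.
sample = [("u1", "i1", 5.0), ("u1", "i2", 3.0), ("u2", "i1", 4.0)]
items, users, matrix = build_item_rating_matrix(sample)
```

Reading along a row gives every user's rating of one item, and reading down a column gives one user's ratings of every item.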
Step 102: From the item rating matrix, calculate the difference between a single user's ratings of every two preset items and obtain a first similarity through a first Sigmoid function; calculate the difference between the single user's rating of a single preset item and the median of the preset rating range and obtain a second similarity through a second Sigmoid function; and calculate the difference between the single user's ratings of every two preset items and the single user's average rating over all preset items and obtain a third similarity through a third Sigmoid function.
It should be noted that the preset rating range can be {1, 2, 3, 4, 5}, in which case the median is 3; the preset rating range can be chosen according to the actual use case. The Sigmoid function, also called the S-shaped growth curve, is defined as S(x) = 1 / (1 + e^(-x)). Because both it and its inverse are monotonically increasing, the Sigmoid function is often used as the threshold function of a neural network, mapping a variable into the interval between 0 and 1.
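The two ingredients of this note can be checked in a few lines of Python. This is only an illustrative sketch; the helper names `sigmoid` and `rating_median` are hypothetical.

```python
import math
import statistics

def sigmoid(x: float) -> float:
    """Standard Sigmoid (logistic) function; fine for rating-scale inputs."""
    return 1.0 / (1.0 + math.exp(-x))

def rating_median(rating_range) -> float:
    """Median of the preset rating range, e.g. 3 for {1, 2, 3, 4, 5}."""
    return float(statistics.median(rating_range))

r_med = rating_median([1, 2, 3, 4, 5])
midpoint = sigmoid(0.0)  # the curve passes through 1/2 at x = 0
```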
Step 103: Multiply the first similarity, the second similarity and the third similarity to obtain the first item scoring similarity of every two preset items.
It should be noted that the first item scoring similarity PSS(i, j) between item i and item j is defined over U_ij, the set of all users who have rated both item i and item j, in terms of PSS_u(R_ui, R_uj) = Proximity(R_ui, R_uj) · Significance(R_ui, R_uj) · Singularity(R_ui, R_uj), which is the rating correlation of user u between items i and j. Proximity(R_ui, R_uj) is the first similarity, Significance(R_ui, R_uj) is the second similarity, and Singularity(R_ui, R_uj) is the third similarity.
Step 104: Penalize the active users of every two preset items by a penalty function, and calculate the proportion of the number of active users of every two preset items among all rating users of those two items, to obtain the second item scoring similarity of every two preset items.
It should be noted that active users are users who visit the website frequently and bring some value to it. In the present invention, to reduce the influence that users who rate items very often have on the calculated item scoring similarity, active users are penalized by the penalty function 1/[ln(1 + |N_u(i, j)|)]; the proportion of the number of active users of every two preset items among all rating users of those two items is then calculated to obtain the second item scoring similarity of every two preset items.
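The penalty term alone can be sketched as follows. Only the factor 1/ln(1 + n) comes from the text; the full expression for the second item scoring similarity is not reproduced in this text, so it is not reconstructed here. The name `activity_penalty` is hypothetical, and `n` stands for the count that the text writes as |N_u(i, j)|.

```python
import math

def activity_penalty(n: int) -> float:
    """Penalty 1 / ln(1 + n); it shrinks as the count n grows."""
    if n < 1:
        raise ValueError("n must be at least 1, otherwise ln(1 + n) is zero or undefined")
    return 1.0 / math.log(1.0 + n)

# A larger count yields a smaller weight.
weights = [activity_penalty(n) for n in (1, 10, 100)]
```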
Step 105: Multiply the first item scoring similarity by the second item scoring similarity to obtain the third item scoring similarity.
It should be noted that the third item scoring similarity is IPSS(i, j) = PSS(i, j) · IIF(i, j), where PSS(i, j) is the first item scoring similarity and IIF(i, j) is the second item scoring similarity.
Step 106: Convert the item label sets of all preset items into m-dimensional numeric label vectors, and calculate the item label similarity of each pair of preset items with a similarity measure.
It should be noted that the item label similarity relies on the labels in the data set. For a uniform description, each item label set is converted into an m-dimensional numeric label vector. Suppose the numeric label vector corresponding to the label set of item i1 is t = (t1, t2, t3, ..., tm) and the numeric label vector corresponding to the label set of item i2 is s = (s1, s2, s3, ..., sm). For example, if the label set of the film Movie1 is {Action, Children, Comedy} and the label set of the film Movie2 is {Action, Children, Drama}, calculating the similarity between the two films can be reduced to calculating the similarity of their text label sets. Because computing the similarity between text label sets directly is difficult, the text labels are converted into simple numeric vectors to make the calculation easier. In the MovieLens data set, experts in the field divide all films into 19 genres, so there are 19 film labels in total. Each film describes its characteristic attributes with this label set and can be represented by a 19-dimensional numeric vector. Every dimension of the vector is 0 or 1, where 1 means the film belongs to the genre and 0 means it does not. In this way, the label set of a film is converted into the corresponding numeric label vector, and the similarity between films is obtained by calculating the similarity between the corresponding numeric label vectors. There are many methods for measuring the similarity between numeric vectors, such as the Euclidean distance, cosine similarity and the Pearson correlation coefficient. When calculating the item label similarity, this embodiment of the invention uses cosine similarity to measure the similarity between numeric label vectors.
The cosine similarity of the two label vectors is cos(t, s) = (t · s) / (|t| |s|), where · denotes the inner product of the two vectors. In the cosine similarity measure, the larger the cosine value, the smaller the angle between the vectors and the more similar the items.
It should be noted that the item label similarity is TSIM(i, j) = (i · j) / (|i| |j|), where i is the label vector of item i and j is the label vector of item j.
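The Movie1 and Movie2 example can be worked through in Python. This is only a sketch: the four-genre vocabulary is an assumed subset chosen for brevity (the text uses all 19 MovieLens genres), and the helper names are hypothetical.

```python
import math

# Assumed subset of the genre vocabulary; the text works with 19 genres.
GENRES = ["Action", "Children", "Comedy", "Drama"]

def to_label_vector(labels, vocabulary=GENRES):
    """Map a label set to a 0/1 vector with one dimension per genre."""
    return [1 if genre in labels else 0 for genre in vocabulary]

def cosine_similarity(t, s) -> float:
    """Inner product of t and s divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(t, s))
    norm = math.sqrt(sum(a * a for a in t)) * math.sqrt(sum(b * b for b in s))
    return dot / norm if norm else 0.0

movie1 = to_label_vector({"Action", "Children", "Comedy"})
movie2 = to_label_vector({"Action", "Children", "Drama"})
tsim = cosine_similarity(movie1, movie2)  # two shared genres out of three each
```

Genres that neither film carries contribute zeros to both the inner product and the lengths, so extending the vocabulary to all 19 genres would leave this value unchanged.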
It should also be noted that there is no required order between step 106 and steps 101 to 105.
Step 107: Calculate the item label collaborative filtering similarity according to the preset weights given to the third item scoring similarity and the item label similarity.
It should be noted that, once the third item scoring similarity and the item label similarity have been obtained, they are combined according to their contribution rate (the preset weight α) to form the item label collaborative filtering similarity:
ITPSS(i, j) = α · TSIM(i, j) + (1 − α) · IPSS(i, j)
The new item similarity thus combines the label-based item similarity with the original rating-based item similarity, with α ∈ (0, 1).
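The weighted combination can be sketched as a single function. The input values below are hypothetical and serve only to show how α trades the label similarity off against the rating similarity.

```python
def itpss(tsim: float, ipss: float, alpha: float) -> float:
    """ITPSS(i, j) = alpha * TSIM(i, j) + (1 - alpha) * IPSS(i, j)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in the open interval (0, 1)")
    return alpha * tsim + (1.0 - alpha) * ipss

# Hypothetical similarities for one item pair, weighted equally.
combined = itpss(tsim=0.8, ipss=0.2, alpha=0.5)
```

A larger α makes the result lean on the label similarity, which is the component that remains available when ratings are sparse.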
In the collaborative filtering recommendation method provided by this embodiment of the invention, the first similarity is obtained through the first Sigmoid function from the difference between a single user's ratings of every two preset items; the second similarity is obtained through the second Sigmoid function from the difference between the single user's rating of a single preset item and the median of the preset rating range; and the third similarity is obtained through the third Sigmoid function from the difference between the single user's ratings of every two preset items and the single user's average rating over all preset items. The first, second and third similarities are multiplied to obtain the first item scoring similarity. The active users of every two preset items are penalized by a penalty function, and the proportion of the number of active users of every two preset items among all rating users of those two items is calculated to obtain the second item scoring similarity of every two preset items. The first item scoring similarity is then multiplied by the second item scoring similarity to obtain the third item scoring similarity. Meanwhile, the item label sets of all preset items are converted into m-dimensional numeric label vectors, and the item label similarity of each pair of preset items is calculated with a similarity measure. Finally, the item label collaborative filtering similarity is calculated according to the preset weights given to the third item scoring similarity and the item label similarity. By fusing the similarity of item ratings with the similarity of item labels, the invention designs a new item label collaborative filtering similarity that can improve recommendation accuracy and can provide recommendations both to users who have rated items and to users who have not. This solves the technical problems that the results recommended by existing collaborative filtering approaches are not accurate enough, that information matching a new user's interests and preferences cannot be found and recommended from ratings alone, and that data sparsity tends to degrade the user experience.
The above is one embodiment of the collaborative filtering recommendation method provided by the invention; another embodiment of the method is described below.
Referring to Fig. 2, which is a flow diagram of another embodiment of the collaborative filtering recommendation method provided by the invention, the method includes:
Step 201: Build an item rating matrix from all preset items and the acquired user rating data for all preset items.
Step 202: From the item rating matrix, calculate the difference between a single user's ratings of every two preset items and obtain a first similarity through a first Sigmoid function; calculate the difference between the single user's rating of a single preset item and the median of the preset rating range and obtain a second similarity through a second Sigmoid function; and calculate the difference between the single user's ratings of every two preset items and the single user's average rating over all preset items and obtain a third similarity through a third Sigmoid function.
Step 203: Multiply the first similarity, the second similarity and the third similarity to obtain the first item scoring similarity of every two preset items.
Step 204: Penalize the active users of every two preset items by a penalty function, and calculate the proportion of the number of active users of every two preset items among all rating users of those two items, to obtain the second item scoring similarity of every two preset items.
Step 205: Multiply the first item scoring similarity by the second item scoring similarity to obtain the third item scoring similarity.
Step 206: Convert the item label sets of all preset items into m-dimensional numeric label vectors, and calculate the item label similarity of each pair of preset items with a similarity measure.
Step 207: Calculate the item label collaborative filtering similarity according to the preset weights given to the third item scoring similarity and the item label similarity.
Further, the first similarity is:
Wherein, Proximity(R_ui, R_uj) is the first similarity of item i and item j, R_ui is the rating of user u for item i, R_uj is the rating of user u for item j, and the Sigmoid term in the formula is the first Sigmoid function.
It should be noted that, from the first similarity formula and the properties of the Sigmoid function, Proximity(R_ui, R_uj) takes values in (0, 1/2]. The larger the rating difference, the smaller Proximity(R_ui, R_uj), and hence the weaker the association between the two items and the lower their similarity.
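The formula itself is not reproduced in this text. One form that is consistent with the stated range (0, 1/2] and with the value falling as the rating difference grows is Proximity = 1 - 1/(1 + exp(-|R_ui - R_uj|)). The sketch below assumes that form; it is an assumption, not the confirmed formula, and the function name is hypothetical.

```python
import math

def proximity(r_ui, r_uj):
    """Assumed form: 1 - sigmoid(|r_ui - r_uj|), which lies in (0, 1/2]."""
    diff = abs(r_ui - r_uj)
    return 1.0 - 1.0 / (1.0 + math.exp(-diff))

same = proximity(4, 4)  # identical ratings give the maximum value, 1/2
far = proximity(1, 5)   # a large rating gap gives a value close to 0
```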
Further, the second similarity is:
Wherein, Significance(R_ui, R_uj) is the second similarity of item i and item j, R_ui is the rating of user u for item i, R_uj is the rating of user u for item j, R_med is the median of the rating range, and the Sigmoid term in the formula is the second Sigmoid function.
It should be noted that R_med represents the median of the rating range; if the rating range is {1, 2, 3, 4, 5}, then R_med = 3. From the second similarity formula and the properties of the Sigmoid function, the farther two ratings are from the median, the more significant they are. Significance(R_ui, R_uj) varies within [1/2, 1): the farther the two ratings are from the median, the larger Significance(R_ui, R_uj), and the more significant the computed rating similarity.
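As the formula is not reproduced in this text, the following sketch assumes one form that matches the stated range [1/2, 1) and grows as both ratings move away from the median: Significance = 1/(1 + exp(-|R_ui - R_med| * |R_uj - R_med|)). The product of the two distances is an assumption, and the function name is hypothetical.

```python
import math

def significance(r_ui, r_uj, r_med):
    """Assumed form: sigmoid of the product of both distances to the median."""
    distance = abs(r_ui - r_med) * abs(r_uj - r_med)
    return 1.0 / (1.0 + math.exp(-distance))

r_med = 3                             # median of the rating range {1, 2, 3, 4, 5}
neutral = significance(3, 4, r_med)   # one rating sits on the median
extreme = significance(5, 5, r_med)   # both ratings are far from the median
```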
Further, third similarity is:
Wherein, Singularity(R_ui, R_uj) is the third similarity of item i and item j, R_ui is the rating of user u for item i, R_uj is the rating of user u for item j, the formula also uses the average of user u's ratings over all items, and the Sigmoid term in the formula is the third Sigmoid function.
It should be noted that, from the third similarity formula and the properties of the Sigmoid function, Singularity(R_ui, R_uj) varies within (0, 1/2]: the larger the difference between the two ratings and the user's average rating, the smaller Singularity(R_ui, R_uj), indicating a lower similarity between the two items.
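The formula is not reproduced in this text. A form consistent with the stated range (0, 1/2] and with the described behaviour is Singularity = 1 - 1/(1 + exp(-|(R_ui + R_uj)/2 - mean_u|)), where mean_u is the average of user u's ratings. The sketch below assumes that form; comparing the mean of the two ratings with the user's average is an assumption, and the names are hypothetical.

```python
import math

def singularity(r_ui, r_uj, user_mean):
    """Assumed form: 1 - sigmoid(|mean of the two ratings - user's average|)."""
    deviation = abs((r_ui + r_uj) / 2.0 - user_mean)
    return 1.0 - 1.0 / (1.0 + math.exp(-deviation))

user_mean = 3.0                           # example average rating of user u
typical = singularity(2, 4, user_mean)    # pair average equals the user's average
unusual = singularity(5, 5, user_mean)    # pair average is far from the user's average
```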
Further, second item scoring similarity is:
Wherein, N_u(i, j) denotes the number of users who have rated both item i and item j, N(i) and N(j) denote the numbers of users who have rated item i and item j respectively, and U_ij is the set of all users of item i and item j.
It should be noted that, in this embodiment of the present invention, in order to reduce the influence of users who rate items frequently on the computed item scoring similarity, active users are penalized by the penalty function 1/[ln(1 + |N_u(i, j)|)], and the proportion of the number of active users of every two preset items among all users who rated those two preset items is calculated to obtain the second item scoring similarity of every two preset items, with N_u(i, j), N(i), and N(j) as defined above.
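Because the second item scoring similarity formula is not reproduced in this text, the sketch below does not implement it. It substitutes the inverse-user-frequency weighting known from item-based collaborative filtering, which serves the same stated purpose of damping frequent raters: each user who rated both items contributes 1/ln(1 + number of items that user rated), and the sum is normalized by sqrt(|N(i)| * |N(j)|). This is a stand-in technique chosen for illustration; the data layout and all names are hypothetical.

```python
import math

def penalized_co_rating_similarity(item_users, user_items, i, j):
    """Inverse-user-frequency style similarity (stand-in, not the source formula).

    item_users: item -> set of users who rated it
    user_items: user -> set of items that user rated
    """
    users_i = item_users[i]
    users_j = item_users[j]
    if not users_i or not users_j:
        return 0.0
    total = 0.0
    for u in users_i & users_j:
        # Each co-rating user rated at least items i and j, so the log is positive.
        total += 1.0 / math.log(1.0 + len(user_items[u]))
    return total / math.sqrt(len(users_i) * len(users_j))

# Example data: which users rated each item, and the reverse index.
item_users = {"i": {"u1", "u2"}, "j": {"u1", "u2", "u3"}, "k": {"u2"}}
user_items = {"u1": {"i", "j"}, "u2": {"i", "j", "k"}, "u3": {"j"}}
sim_ij = penalized_co_rating_similarity(item_users, user_items, "i", "j")
```

In the example, user u2 rated more items than user u1 and therefore contributes less to the similarity of items i and j.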
Further, the method also includes:
Step 208: According to the item label collaborative filtering similarity, compute the nearest-neighbor set of each preset item by the k-nearest neighbor algorithm; according to the nearest-neighbor set, calculate the user's predicted ratings for the preset items that the user has not rated, and generate a recommendation list.
It should be noted that the k-nearest neighbor algorithm (k-Nearest Neighbor, KNN) is a theoretically mature method and one of the simplest machine learning algorithms. Its idea is as follows: if most of the k samples most similar to a given sample (that is, its nearest neighbors in the feature space) belong to a certain class, then the sample also belongs to that class. The k-nearest neighbor algorithm is prior art and is not described in detail here.
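For step 208, selecting the nearest-neighbor set of an item amounts to keeping the k items with the highest similarity to it. The following is a minimal sketch, assuming the item label collaborative filtering similarities are already available as a nested dictionary; the data layout and the function name are hypothetical.

```python
import heapq

def nearest_neighbors(similarity, item, k):
    """Return the k items most similar to `item`, most similar first."""
    candidates = [
        (sim, other)
        for other, sim in similarity[item].items()
        if other != item
    ]
    return [other for _, other in heapq.nlargest(k, candidates)]

# Example similarities of item "i" to three other items.
similarity = {"i": {"j": 0.9, "k": 0.2, "l": 0.6}}
neighbors = nearest_neighbors(similarity, "i", k=2)
```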
It should be noted that the user's predicted ratings for the preset items that the user has not rated are calculated according to the nearest-neighbor set, where the formula for the predicted rating of user u for item i is:
That is, item i has k neighbor items; sim(i, j) is the similarity of items i and j, namely ITPSS(i, j); S is the nearest-neighbor set; and R_uj is the rating of user u for item j. From the predicted ratings so calculated, a recommendation list is built in descending order of rating, and items are recommended to the user.
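The prediction formula is not reproduced in this text. The sketch below assumes the usual similarity-weighted average over the neighbor set, that is, the sum of sim(i, j) * R_uj over j in S divided by the sum of |sim(i, j)| over the same j. This form is an assumption, and the function names are hypothetical.

```python
def predict_rating(user_ratings, neighbor_sims):
    """Similarity-weighted average of the user's ratings over the neighbor set S.

    user_ratings: item -> rating given by user u
    neighbor_sims: neighbor item j -> sim(i, j) for the target item i
    Returns None when the user has rated none of the neighbors.
    """
    numerator = 0.0
    denominator = 0.0
    for j, sim in neighbor_sims.items():
        if j in user_ratings:
            numerator += sim * user_ratings[j]
            denominator += abs(sim)
    if denominator == 0.0:
        return None
    return numerator / denominator

def recommend(predictions, top_n):
    """Sort predicted ratings from high to low and keep the first top_n items."""
    ranked = sorted(predictions.items(), key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked[:top_n]]

user_ratings = {"j": 5, "l": 3}
predicted = predict_rating(user_ratings, {"j": 0.9, "l": 0.6})
```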
The above describes another embodiment of a collaborative filtering recommendation method; an embodiment of a collaborative filtering recommendation system is described below.
Referring to Fig. 3, Fig. 3 shows an embodiment of a collaborative filtering recommendation system. The collaborative filtering recommendation system provided by the present invention includes:
Matrix construction unit 301, configured to construct an item rating matrix from all the acquired preset items and the user rating data of the preset items;
Score computing unit 302, configured to calculate, according to the item rating matrix, the difference between a single user's ratings of every two preset items and obtain the first similarity through a first Sigmoid function; to calculate the difference between the single user's rating of a single preset item and the median of the preset rating range and obtain the second similarity through a second Sigmoid function; and to calculate the difference between the single user's ratings of every two preset items and that user's average rating over all preset items and obtain the third similarity through a third Sigmoid function;
First computing unit 303, configured to multiply the first similarity, the second similarity, and the third similarity to obtain the first item scoring similarity of every two preset items;
Second computing unit 304, configured to penalize the active users of every two preset items with a penalty function, and to calculate the proportion of the number of active users of every two preset items among all users who rated those two preset items, to obtain the second item scoring similarity of every two preset items;
Third computing unit 305, configured to multiply the first item scoring similarity by the second item scoring similarity to obtain the third item scoring similarity;
Fourth computing unit 306, configured to convert the item label sets of all preset items into m-dimensional numeric label vectors, and to calculate the item label similarity of every pair of preset items according to a similarity metric algorithm;
Fifth computing unit 307, configured to calculate the item label collaborative filtering similarity according to the preset weights assigned to the third item scoring similarity and the item label similarity.
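As an illustration of what the matrix construction unit 301 produces, the sketch below builds an item-by-user matrix from (user, item, rating) triples. The triple format, the row and column orientation, and the use of None for missing ratings are all assumptions made for the example; the function name is hypothetical.

```python
def build_rating_matrix(ratings):
    """Build an item-by-user rating matrix from (user, item, rating) triples.

    Unrated cells are stored as None so that they cannot be mistaken
    for an actual rating.
    """
    users = sorted({user for user, _, _ in ratings})
    items = sorted({item for _, item, _ in ratings})
    user_index = {user: n for n, user in enumerate(users)}
    item_index = {item: n for n, item in enumerate(items)}
    matrix = [[None] * len(users) for _ in items]
    for user, item, rating in ratings:
        matrix[item_index[item]][user_index[user]] = rating
    return items, users, matrix

ratings = [("u1", "i1", 5), ("u1", "i2", 3), ("u2", "i1", 4)]
items, users, matrix = build_rating_matrix(ratings)
```

Each row of `matrix` corresponds to one item and each column to one user, in the order given by `items` and `users`.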
Further, the first similarity is:
Wherein, Proximity(R_ui, R_uj) is the first similarity of item i and item j, R_ui is the rating of user u for item i, R_uj is the rating of user u for item j, and the Sigmoid term in the formula is the first Sigmoid function.
Further, the second similarity is:
Wherein, Significance(R_ui, R_uj) is the second similarity of item i and item j, R_ui is the rating of user u for item i, R_uj is the rating of user u for item j, R_med is the median of the rating range, and the Sigmoid term in the formula is the second Sigmoid function.
Further, second item scoring similarity is:
Wherein, N_u(i, j) denotes the number of users who have rated both item i and item j, N(i) and N(j) denote the numbers of users who have rated item i and item j respectively, and U_ij is the set of all users of item i and item j.
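How the score computing unit and the first and third computing units fit together can be sketched as below. The three Sigmoid-based forms are assumed (the formulas are not reproduced in this text), and so is the aggregation: the per-user products are summed over the users who rated both items. All names are hypothetical, and the second item scoring similarity is passed in as a plain number.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def per_user_product(r_ui, r_uj, r_med, user_mean):
    """Product of the three per-user similarities (assumed Sigmoid forms)."""
    proximity = 1.0 - sigmoid(abs(r_ui - r_uj))
    significance = sigmoid(abs(r_ui - r_med) * abs(r_uj - r_med))
    singularity = 1.0 - sigmoid(abs((r_ui + r_uj) / 2.0 - user_mean))
    return proximity * significance * singularity

def first_item_similarity(ratings, i, j, r_med):
    """Sum the per-user products over users who rated both items (assumed)."""
    total = 0.0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            user_mean = sum(user_ratings.values()) / len(user_ratings)
            total += per_user_product(
                user_ratings[i], user_ratings[j], r_med, user_mean
            )
    return total

def third_item_similarity(first, second):
    """Product of the first and second item scoring similarities."""
    return first * second

# Example data: user -> {item: rating}.
ratings = {"u1": {"i": 5, "j": 5}, "u2": {"i": 1, "j": 5, "k": 3}}
first = first_item_similarity(ratings, "i", "j", r_med=3)
```

User u1 agrees on both items and dominates the sum, while user u2, whose two ratings are far apart, contributes very little.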
The above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
It is apparent to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, device, and modules described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions are possible in actual implementation. Multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing module, each module may exist physically on its own, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.

Claims (10)

1. A collaborative filtering recommendation method, characterized by comprising:
S1: constructing an item rating matrix from all the acquired preset items and the user rating data of the preset items;
S2: according to the item rating matrix, calculating the difference between a single user's ratings of every two of the preset items and obtaining a first similarity through a first Sigmoid function; calculating the difference between the single user's rating of a single one of the preset items and the median of a preset rating range and obtaining a second similarity through a second Sigmoid function; and calculating the difference between the single user's ratings of every two of the preset items and the single user's average rating over all the preset items and obtaining a third similarity through a third Sigmoid function;
S3: multiplying the first similarity, the second similarity, and the third similarity to obtain a first item scoring similarity of every two of the preset items;
S4: penalizing the active users of every two of the preset items with a penalty function, and calculating the proportion of the number of active users of every two of the preset items among all users who rated those two preset items, to obtain a second item scoring similarity of every two of the preset items;
S5: multiplying the first item scoring similarity by the second item scoring similarity to obtain a third item scoring similarity;
S6: converting the item label sets of all the preset items into m-dimensional numeric label vectors, and calculating an item label similarity of every pair of the preset items according to a similarity metric algorithm;
S7: calculating an item label collaborative filtering similarity according to the preset weights assigned to the third item scoring similarity and the item label similarity.
2. The collaborative filtering recommendation method according to claim 1, characterized in that the first similarity is:
Wherein, Proximity(R_ui, R_uj) is the first similarity of item i and item j, R_ui is the rating of user u for item i, R_uj is the rating of user u for item j, and the Sigmoid term in the formula is the first Sigmoid function.
3. The collaborative filtering recommendation method according to claim 1 or 2, characterized in that the second similarity is:
Wherein, Significance(R_ui, R_uj) is the second similarity of item i and item j, R_ui is the rating of user u for item i, R_uj is the rating of user u for item j, R_med is the median of the rating range, and the Sigmoid term in the formula is the second Sigmoid function.
4. The collaborative filtering recommendation method according to any one of claims 1 to 2, characterized in that the third similarity is:
Wherein, Singularity(R_ui, R_uj) is the third similarity of item i and item j, R_ui is the rating of user u for item i, R_uj is the rating of user u for item j, the formula also uses the average of user u's ratings over all items, and the Sigmoid term in the formula is the third Sigmoid function.
5. The collaborative filtering recommendation method according to claim 1, characterized in that the second item scoring similarity is:
Wherein, N_u(i, j) denotes the number of users who have rated both item i and item j, N(i) and N(j) denote the numbers of users who have rated item i and item j respectively, and U_ij is the set of all users of item i and item j.
6. The collaborative filtering recommendation method according to claim 1 or 5, characterized by further comprising, after step S7:
S8: according to the item label collaborative filtering similarity, computing the nearest-neighbor set of each of the preset items by the k-nearest neighbor algorithm; according to the nearest-neighbor set, calculating the user's predicted ratings for the preset items that the user has not rated, and generating a recommendation list.
7. A collaborative filtering recommendation system, characterized by comprising:
a matrix construction unit, configured to construct an item rating matrix from all the acquired preset items and the user rating data of the preset items;
a score computing unit, configured to calculate, according to the item rating matrix, the difference between a single user's ratings of every two of the preset items and obtain a first similarity through a first Sigmoid function; to calculate the difference between the single user's rating of a single one of the preset items and the median of a preset rating range and obtain a second similarity through a second Sigmoid function; and to calculate the difference between the single user's ratings of every two of the preset items and the single user's average rating over all the preset items and obtain a third similarity through a third Sigmoid function;
a first computing unit, configured to multiply the first similarity, the second similarity, and the third similarity to obtain a first item scoring similarity of every two of the preset items;
a second computing unit, configured to penalize the active users of every two of the preset items with a penalty function, and to calculate the proportion of the number of active users of every two of the preset items among all users who rated those two preset items, to obtain a second item scoring similarity of every two of the preset items;
a third computing unit, configured to multiply the first item scoring similarity by the second item scoring similarity to obtain a third item scoring similarity;
a fourth computing unit, configured to convert the item label sets of all the preset items into m-dimensional numeric label vectors, and to calculate an item label similarity of every pair of the preset items according to a similarity metric algorithm;
a fifth computing unit, configured to calculate an item label collaborative filtering similarity according to the preset weights assigned to the third item scoring similarity and the item label similarity.
8. The collaborative filtering recommendation system according to claim 7, characterized in that the first similarity is:
Wherein, Proximity(R_ui, R_uj) is the first similarity of item i and item j, R_ui is the rating of user u for item i, R_uj is the rating of user u for item j, and the Sigmoid term in the formula is the first Sigmoid function.
9. The collaborative filtering recommendation system according to claim 7 or 8, characterized in that the second similarity is:
Wherein, Significance(R_ui, R_uj) is the second similarity of item i and item j, R_ui is the rating of user u for item i, R_uj is the rating of user u for item j, R_med is the median of the rating range, and the Sigmoid term in the formula is the second Sigmoid function.
10. The collaborative filtering recommendation system according to claim 7 or 8, characterized in that the second item scoring similarity is:
Wherein, N_u(i, j) denotes the number of users who have rated both item i and item j, N(i) and N(j) denote the numbers of users who have rated item i and item j respectively, and U_ij is the set of all users of item i and item j.
CN201810240236.0A 2018-03-22 2018-03-22 Collaborative filtering recommendation method and system Active CN108389113B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810240236.0A CN108389113B (en) 2018-03-22 2018-03-22 Collaborative filtering recommendation method and system

Publications (2)

Publication Number Publication Date
CN108389113A (en) 2018-08-10
CN108389113B (en) 2022-04-19

Family

ID=63068024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810240236.0A Active CN108389113B (en) 2018-03-22 2018-03-22 Collaborative filtering recommendation method and system

Country Status (1)

Country Link
CN (1) CN108389113B (en)

Also Published As

Publication number Publication date
CN108389113B (en) 2022-04-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant