Book recommendation method and system based on a weighted hybrid k-nearest neighbor algorithm
Technical field
The invention belongs to the technical field of book recommendation, and in particular relates to a book recommendation method and system based on a weighted hybrid k-nearest neighbor algorithm.
Background art
With the development of information technology and the internet, people have gradually moved from an era of information scarcity into an era of information overload. Often the problem we face is not a shortage of goods or information but an excess of them, which leaves us dazzled and unsure how to choose. The sheer volume of information currently raises two problems: on the one hand, how users can find the content they are really interested in within the overloaded information; on the other hand, how information providers can get the information they offer noticed by the people interested in it rather than having it buried in the mass of information.
To address information overload, classified directories and search engines emerged. Both establish a match between information and users, and users can find information of interest by searching for keywords. However, search engines have limitations. First, the results they return are usually not personalized: different people searching with the same keyword often get the same results, even though tastes differ from person to person, so a search engine cannot accurately filter information for different users. Second, a search engine requires users to understand their own needs clearly and to be able to express them in keywords; users sometimes have needs they have not yet become aware of, and in that case a search engine cannot help. Although both tools help users quickly find information they may be interested in, neither can provide a personalized service for different users.
A recommender system is another means of matching information with users. Unlike a search engine, a recommender system does not require the user to enter keywords; it actively mines the user's preferences from records of past behavior, helps the user discover potential points of interest, and recommends relevant products or information to the user. Because recommendations are made according to the characteristics of each user, the system can meet personalization requirements, recommending to different users products that fit their individual needs and presenting information to users more accurately, while relying less on information that users actively enter to filter content.
In the field of book recommendation, weighted mixing of different recommendation algorithms has seldom been considered. Different recommendation algorithms have different strengths and weaknesses, however, so a single recommendation algorithm often fails to produce good book recommendations. The k-nearest neighbor algorithm comes in a user-based variant and an item-based variant. Previous book recommendation has usually been implemented with either the user-based or the item-based algorithm alone, and such recommendation does not combine the advantages of the user-based and item-based k-nearest neighbor algorithms.
Summary of the invention
The object of the present invention is to provide a book recommendation method and system based on a weighted hybrid k-nearest neighbor algorithm. Based on different users' ratings of books, the algorithm makes personalized recommendations of the books each user may be interested in; it takes fuller account of the similarity of the objects involved in the system and improves recommendation accuracy.
To achieve the above object, the present invention adopts the following technical solution:
A book recommendation method based on a weighted hybrid k-nearest neighbor algorithm according to the present invention comprises the following steps:
Step 1: the users' historical book-rating behavior data is randomly divided into M parts according to a uniform distribution; one part is chosen as the test set and the remaining M-1 parts serve as the training set; user-based and item-based k-nearest neighbor recommendation models are built on the training set of the historical book-rating behavior data;
Step 2: a user interest model is built on the training set by the k-nearest neighbor recommendation models and a recommended book list is generated; then, using the test set of the historical book-rating behavior data, the precision and recall of the k-nearest neighbor algorithms are computed with the number k of most similar users set to its initial value;
Step 3: the value of the number k of most similar users set in the k-nearest neighbor recommendation models is updated successively, and the book recommendation lists under the different values of k are computed; the precision and recall of the user-based and item-based k-nearest neighbor algorithms are computed for each value of k;
Step 4: for each value of k, precision and recall are added to obtain the performance index values of the user-based and item-based k-nearest neighbor algorithms; the value of k corresponding to the maximum performance index value of the user-based k-nearest neighbor algorithm is taken as the optimal parameter k of a given user in the user-based k-nearest neighbor algorithm; likewise, the value of k corresponding to the maximum of the performance index values of the item-based k-nearest neighbor algorithm is taken as the optimal parameter k of that user in the item-based k-nearest neighbor algorithm;
Step 5: the user's optimal performance index value for the user-based k-nearest neighbor algorithm and the user's performance index value for the item-based k-nearest neighbor algorithm are input, and from these two values the weights of the user-based and item-based k-nearest neighbor algorithms are assigned for that user;
Step 6: the user's optimal value of k for the user-based k-nearest neighbor algorithm and optimal value of k for the item-based k-nearest neighbor algorithm are input, and the k-nearest neighbor recommendation models of Step 1 are used to generate for the user a book recommendation list from the user-based algorithm and a book recommendation list from the item-based algorithm; then the weights assigned for that user to the user-based and item-based k-nearest neighbor algorithms are each multiplied by the number N of books in the book recommendation list, which gives the number of books in the final hybrid recommendation list contributed by the user-based algorithm and the number contributed by the item-based algorithm, and the hybrid book recommendation list is finally obtained.
As a further improvement of the present invention, the k-nearest neighbor recommendation models in Step 1 are built as follows:
Step 1.1: the training set of the historical book-rating behavior data is processed into an m*n user-book rating matrix R;
Step 1.2: the similarity between users is computed with the Pearson correlation coefficient; the similarity between items is computed with a co-occurrence matrix;
Step 1.3: for each user, the similarities between that user and the other users are sorted in descending order; likewise, for each book, the similarities between that book and the other books are sorted in descending order;
Step 1.4: from the similarity results and the algorithm parameter k, a candidate book list is generated; the prediction formula is then used to compute the predicted rating of every book in the candidate list, the candidate list is sorted by predicted rating in descending order, and the top-ranked books of the sorted candidate list form the final book recommendation list, which gives the recommended book list of the user-based k-nearest neighbor algorithm;
Step 1.5: likewise, the candidate book list of the item-based k-nearest neighbor algorithm is generated, the predicted rating of every book in the candidate list is computed, and the candidate list is sorted by predicted rating in descending order; the first P books of the sorted candidate list form the final book recommendation list, which gives the recommended book list of the item-based k-nearest neighbor algorithm.
As a further improvement of the present invention, the formula for the similarity between users in Step 1.2 is as follows:
In the above formula, $P(u, v)$ denotes the similarity between user u and user v; $I_u$ and $I_v$ denote the sets of books rated by user u and user v respectively; $r_{ui}$ and $r_{vi}$ denote the ratings of book i by user u and user v respectively; $\bar{r}_u$ and $\bar{r}_v$ denote the average book ratings of user u and user v respectively;
The similarity between items is computed with a co-occurrence matrix T as the similarity measure, in which element $t_{kn}$ denotes the degree of association between book k and book n. T is computed as follows:
When user u has rated books i, j and k, the value 1 is added to every element of the submatrix $T_{ijk}$ of T formed by books i, j and k;
This operation is carried out for all users and the results are summed; the elements of the co-occurrence matrix T are then normalized, with the normalization formula as follows:
In the above formula, $t'_{ij}$ denotes the similarity between book i and book j;
The normalized co-occurrence matrix thus gives the similarity between items.
As a further improvement of the present invention, the formula for the predicted rating in Step 1.5 is as follows:
In the above formula, $\bar{r}_u$ and $\bar{r}_{u'}$ are the average item ratings of user u and user u'; sim(u, u') is the similarity between user u and user u'; N is the set of items (neighbors) most similar to item i.
As a further improvement of the present invention, the precision and recall of the k-nearest neighbor algorithms in Step 2 are computed as follows:
Step 2.1: precision
In the formula, Precision(U(u)) denotes the precision of the user-based k-nearest neighbor algorithm for user u; Precision(I(u)) denotes the precision of the item-based k-nearest neighbor algorithm for user u; R(U(u)) denotes the book recommendation list generated for user u by the user-based k-nearest neighbor recommendation algorithm; R(I(u)) denotes the book recommendation list generated for user u by the item-based k-nearest neighbor recommendation algorithm; T(u) denotes the list of items that user u has rated; U denotes all users;
Step 2.2: recall
In the formula, Recall(U(u)) denotes the recall of the user-based k-nearest neighbor algorithm for user u; Recall(I(u)) denotes the recall of the item-based k-nearest neighbor algorithm for user u; R(U(u)) denotes the book recommendation list generated for user u by the user-based k-nearest neighbor recommendation algorithm; R(I(u)) denotes the book recommendation list generated for user u by the item-based k-nearest neighbor recommendation algorithm; T(u) denotes the list of items that user u has rated; U denotes all users.
As a further improvement of the present invention, the weights in Step 5 are computed as follows:
In the formula, Weight(U(u)) and Weight(I(u)) denote the weights that user u gives to the user-based and item-based k-nearest neighbor algorithms respectively; Pre(U(u)) and Re(U(u)) denote the precision and recall of the user-based k-nearest neighbor algorithm for user u; Pre(I(u)) and Re(I(u)) denote the precision and recall of the item-based k-nearest neighbor algorithm for user u.
A book recommendation system based on a weighted hybrid k-nearest neighbor algorithm comprises a user and item information collection layer, a storage layer, a recommendation engine module and an interface layer;
The storage layer stores the data used and generated by the system, including the basic information of users and books and user behavior information;
The user and item information collection layer is connected to the storage layer and is responsible for entering and maintaining the basic information of users and books and user behavior information;
The recommendation engine module is connected to the storage layer and computes recommendation lists on the basis of users' historical behavior data on items; the recommendation engine is built from the user-based k-nearest neighbor recommendation model and the item-based k-nearest neighbor recommendation model;
The interface layer is connected to the recommendation engine module and the storage layer and communicates with the front-end display unit. Computed data is passed to the front-end display unit, and users' book ratings are collected by the front-end display unit and passed back; the interface layer provides the data required by calls from the front-end display unit and forwards the user behavior data sent by the front-end display unit to the storage layer for storage.
Compared with the prior art, the present invention has the following advantages:
The weighted hybrid k-nearest neighbor algorithm used by the present invention combines the advantages of the user-based and item-based k-nearest neighbor algorithms; it generates recommendations from users' historical rating records of items and achieves higher recommendation performance. A single k-nearest neighbor algorithm considers the similarity of only one kind of object in the system, either users or items, and does not measure the similarity of all the objects involved. The weighted hybrid k-nearest neighbor algorithm considers both the similarity between users and the similarity between items, so the final book recommendation list contains both books recommended from user similarity and books recommended from item similarity. The recommendations therefore take fuller account of the similarity of the objects involved in the system, which improves recommendation accuracy. A further advantage of the personalized weighted hybrid algorithm is personalized weighting: different users give different weights to the k-nearest neighbor algorithms. By analyzing each user's historical book-rating records, different weights are assigned for different users to the user-based and item-based k-nearest neighbor algorithms, and the weights can be adjusted continually as the user's rating records change, so that the weights of the weighted hybrid algorithm remain optimal. Another advantage is the optimization of the core parameter k of the k-nearest neighbor algorithm. By dividing each user's historical book-rating records into a training set and a test set and using precision and recall as the recommendation performance indicators, the optimal value of k is selected for each user, and k can be adjusted dynamically as the user's history changes, which improves the performance of the recommendation algorithm.
The recommender system of the present invention, using the weighted hybrid k-nearest neighbor algorithm, has the advantages of both the user-based and the item-based k-nearest neighbor algorithms. The algorithm makes fuller use of the similarity of the objects in the system and can dynamically adjust the value of the core parameter k and the weights as the user's history changes, giving better recommendation performance.
Description of the drawings
Fig. 1 is a flowchart of training the optimal parameter k in the book recommendation method and system based on a weighted hybrid k-nearest neighbor algorithm of the present invention;
Fig. 2 is a flowchart of the recommendation process;
Fig. 3 is a flowchart of user-based collaborative filtering;
Fig. 4 is a flowchart of item-based collaborative filtering;
Fig. 5 is a flowchart of weight assignment;
Fig. 6 is a flowchart of generating the hybrid recommendation;
Fig. 7 is a diagram of the system framework;
Fig. 8 is the ER diagram;
Fig. 9 is a design diagram of the book recommendation module;
Fig. 10 is a block diagram of the book recommendation algorithm;
Fig. 11 is the hybrid recommendation list for user cytun.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings:
As shown in Fig. 1, a book recommendation method based on a weighted hybrid k-nearest neighbor algorithm according to the present invention comprises the following steps:
Step 1: first, the users' historical book-rating behavior data is randomly divided into M parts according to a uniform distribution; one part is chosen as the test set and the remaining M-1 parts serve as the training set. User-based and item-based k-nearest neighbor recommendation models are built on the training set of the historical book-rating behavior data. As shown in Fig. 2, the recommendation models are built as follows:
Step 1.1: the training set of the historical book-rating behavior data is processed into an m*n user-book rating matrix R, where m is the number of users, n is the number of books, and $r_{ui}$ denotes the rating of item i by user u.
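The split in Step 1 and the matrix in Step 1.1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the `(user, book, score)` record layout, the nested-dict representation of R and the default `m_folds=5` are all assumptions made for the example.

```python
import random
from collections import defaultdict

def split_ratings(ratings, m_folds=5, seed=0):
    """Assign each (user, book, score) record to one of m_folds parts, drawn
    uniformly at random; part 0 is the test set, the rest the training set."""
    rng = random.Random(seed)
    train, test = [], []
    for record in ratings:
        if rng.randrange(m_folds) == 0:  # uniform draw over 0..m_folds-1
            test.append(record)
        else:
            train.append(record)
    return train, test

def build_rating_matrix(train):
    """Sparse user-book rating matrix R as nested dicts: R[user][book] = score."""
    R = defaultdict(dict)
    for user, book, score in train:
        R[user][book] = score
    return dict(R)
```

A sparse dictionary is used here only because most users rate few books; a dense m*n array would serve equally well for small data sets.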
Step 1.2: the similarity between users is computed, using the Pearson correlation coefficient as the similarity measure, with the following formula:
In the above formula, $P(u, v)$ denotes the similarity between user u and user v; $I_u$ and $I_v$ denote the sets of books rated by user u and user v respectively; $r_{ui}$ and $r_{vi}$ denote the ratings of book i by user u and user v respectively; $\bar{r}_u$ and $\bar{r}_v$ denote the average book ratings of user u and user v respectively.
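The formula itself is not reproduced in this text. The sketch below assumes the usual form of the Pearson coefficient for collaborative filtering: sums run over the books both users have rated, and deviations are taken from each user's average over all of that user's ratings, matching the symbol definitions above. The function name and the nested-dict layout of R are assumptions for illustration.

```python
from math import sqrt

def pearson(R, u, v):
    """Pearson similarity P(u, v) over the books rated by both u and v.
    R is a nested dict: R[user][book] = score (an assumed layout)."""
    common = set(R[u]) & set(R[v])
    if not common:
        return 0.0  # no co-rated books: treat as no similarity
    mean_u = sum(R[u].values()) / len(R[u])
    mean_v = sum(R[v].values()) / len(R[v])
    num = sum((R[u][i] - mean_u) * (R[v][i] - mean_v) for i in common)
    den_u = sqrt(sum((R[u][i] - mean_u) ** 2 for i in common))
    den_v = sqrt(sum((R[v][i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0  # a user with constant ratings has undefined correlation
    return num / (den_u * den_v)
```

Returning 0.0 in the degenerate cases is a design choice of this sketch; the source does not say how such users are handled.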
The similarity between items is computed, using a co-occurrence matrix T as the similarity measure, in which element $t_{kn}$ denotes the degree of association between book k and book n. T is computed as follows:
When user u has rated books i, j and k, the value 1 is added to every element of the submatrix $T_{ijk}$ of T formed by books i, j and k.
This operation is carried out for all users and the results are summed; the elements of the co-occurrence matrix T are then normalized, with the normalization formula as follows:
In the above formula, $t'_{ij}$ denotes the similarity between book i and book j. The normalized co-occurrence matrix thus gives the similarity between items.
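The accumulation of the co-occurrence matrix can be sketched as below. The normalization formula is not reproduced in this text, so the `normalize` function is purely an assumption: it divides each count by the geometric mean of the two books' rater counts, a common choice for co-occurrence similarity, and may differ from the patent's formula. Function names and the nested-dict layout are likewise illustrative.

```python
from collections import defaultdict
from math import sqrt

def cooccurrence(R):
    """T[i][j] = number of users who rated both book i and book j.
    The diagonal T[i][i] counts the users who rated book i."""
    T = defaultdict(lambda: defaultdict(int))
    for books in R.values():       # one user's rated books
        for i in books:
            for j in books:
                T[i][j] += 1       # add 1 to every element of the submatrix
    return T

def normalize(T):
    """ASSUMED normalization: t'_ij = t_ij / sqrt(t_ii * t_jj)."""
    return {i: {j: T[i][j] / sqrt(T[i][i] * T[j][j]) for j in T[i] if j != i}
            for i in T}
```

With this choice every similarity lies in (0, 1] and the matrix stays symmetric.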
Step 1.3: for each user, the similarities between that user and the other users are sorted in descending order; likewise, for each book, the similarities between that book and the other books are sorted in descending order.
Step 1.4: as shown in Fig. 3, the recommended book list of the user-based k-nearest neighbor algorithm is generated. Suppose a recommended book list is to be generated for user u. First, the initial value of the parameter k is set to 20, where k is the number of most similar users. The first 20 users in user u's similarity ranking are taken as the neighbor set of user u. Then the books that these 20 neighbors have rated but user u has not are taken as user u's candidate book list, and a predicted rating is computed for each book in the candidate list with the following formula:
In the above formula, $\hat{r}_{ui}$ denotes the predicted rating of book i by user u; $\bar{r}_u$ and $\bar{r}_{u'}$ are the average book ratings of user u and user u'; sim(u, u') is the similarity between user u and user u'; $r_{u'i}$ denotes the rating of book i by user u'; M is the set of users most similar to user u.
The candidate book list is sorted by predicted rating in descending order. The parameter N, the number of books in the book recommendation list, is set to 10, and the first 10 books of the sorted candidate list form the final book recommendation list.
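A minimal sketch of Step 1.4 follows. The prediction formula is not reproduced in this text; the code assumes the common mean-centered form, in which the prediction is user u's average plus the similarity-weighted average of the neighbors' deviations from their own averages, normalized by the sum of absolute similarities. The function name, the `sim[u][v]` lookup table and the nested-dict R are assumptions for illustration.

```python
def recommend_user_based(R, sim, u, k=20, n=10):
    """Top-n books for user u from the k most similar users.
    R[user][book] = score; sim[u][v] = precomputed user similarity."""
    mean = {x: sum(r.values()) / len(r) for x, r in R.items()}
    # Neighbor set M: the k users most similar to u
    neighbors = sorted((v for v in R if v != u),
                       key=lambda v: sim[u][v], reverse=True)[:k]
    # Candidates: books rated by a neighbor but not yet by u
    candidates = {i for v in neighbors for i in R[v]} - set(R[u])
    scores = {}
    for i in candidates:
        raters = [v for v in neighbors if i in R[v]]
        den = sum(abs(sim[u][v]) for v in raters)
        if den == 0:
            continue  # no usable similarity information for this book
        num = sum(sim[u][v] * (R[v][i] - mean[v]) for v in raters)
        scores[i] = mean[u] + num / den  # assumed mean-centered prediction
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

The defaults `k=20` and `n=10` mirror the initial values given in the text.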
Step 1.5: as shown in Fig. 4, the recommended book list of the item-based k-nearest neighbor algorithm is generated.
Suppose a recommended book list is to be generated for user u. First, the initial value of the parameter k is set to 20, where k is the number of most similar books. All books that user u has not rated form the candidate book list. For a given candidate book i, the first 20 books in the similarity ranking between book i and the books user u has rated are taken as the neighbor book set of that candidate, and a predicted rating is computed for the candidate with the following formula:
In the above formula, $\hat{r}_{ui}$ denotes the predicted rating of book i by user u; $\bar{r}_u$ is the average book rating of user u; sim(i, j) is the similarity between book i and book j; $r_{uj}$ denotes the rating of book j by user u; D is the set of books most similar to book i.
The predicted rating of every book in the candidate list is computed in this way, and the candidate list is then sorted by predicted rating in descending order. The parameter N, the number of books in the book recommendation list, is set to 10. Finally, the first 10 books of the sorted candidate list form the final book recommendation list.
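Step 1.5 can be sketched in the same style. Because the formula is not reproduced in this text, the prediction below is an assumption built from the symbols defined above: user u's average plus the similarity-weighted average deviation of u's ratings on the neighbor books. The function name, the explicit `all_books` argument and the `item_sim[i][j]` table are illustrative.

```python
def recommend_item_based(R, item_sim, u, all_books, k=20, n=10):
    """Top-n unrated books for user u using item-item similarity.
    R[user][book] = score; item_sim[i][j] = similarity of books i and j."""
    rated = R[u]
    mean_u = sum(rated.values()) / len(rated)

    def s(i, j):
        return item_sim.get(i, {}).get(j, 0.0)  # missing pairs count as 0

    scores = {}
    for i in all_books:
        if i in rated:
            continue  # only unrated books are candidates
        # Neighbor set D: the k rated books most similar to candidate i
        neighbors = sorted(rated, key=lambda j: s(i, j), reverse=True)[:k]
        den = sum(abs(s(i, j)) for j in neighbors)
        if den == 0:
            continue  # candidate shares no similarity with any rated book
        num = sum(s(i, j) * (rated[j] - mean_u) for j in neighbors)
        scores[i] = mean_u + num / den  # assumed mean-centered prediction
    return sorted(scores, key=scores.get, reverse=True)[:n]
```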
Step 2: through the steps above, a user interest model can be built on the training set of the historical book-rating behavior data and a recommended book list generated. Then, using the test set of the historical book-rating behavior data, the overlap between the predicted behavior and the actual behavior in the test set is computed for the initial value k=20, that is, the precision and recall of the k-nearest neighbor algorithms. The formulas are as follows:
Step 2.1: precision
In the above formula, Precision(U(u)) denotes the precision of the user-based k-nearest neighbor algorithm for user u; Precision(I(u)) denotes the precision of the item-based k-nearest neighbor algorithm for user u; R(U(u)) denotes the book recommendation list generated for user u by the user-based k-nearest neighbor recommendation algorithm; R(I(u)) denotes the book recommendation list generated for user u by the item-based k-nearest neighbor recommendation algorithm; T(u) denotes the list of items that user u has rated; U denotes all users.
Step 2.2: recall
In the above formula, Recall(U(u)) denotes the recall of the user-based k-nearest neighbor algorithm for user u; Recall(I(u)) denotes the recall of the item-based k-nearest neighbor algorithm for user u; R(U(u)) denotes the book recommendation list generated for user u by the user-based k-nearest neighbor recommendation algorithm; R(I(u)) denotes the book recommendation list generated for user u by the item-based k-nearest neighbor recommendation algorithm; T(u) denotes the list of items that user u has rated; U denotes all users.
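The two formulas are not reproduced in this text. The sketch below assumes the standard per-user definitions: precision is the share of recommended books that appear in the user's test-set books, and recall is the share of the test-set books that were recommended. Whether the patent additionally sums over all users U is not recoverable from the text, so that aggregation is left out; the function name is illustrative.

```python
def precision_recall(recommended, test_books):
    """Per-user precision and recall of one recommendation list.
    recommended: R(U(u)) or R(I(u)); test_books: T(u)."""
    rec, actual = set(recommended), set(test_books)
    hits = len(rec & actual)                      # |R(u) ∩ T(u)|
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall
```

For example, with four recommended books of which two are among three test-set books, the function returns a precision of 0.5 and a recall of 2/3.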
Step 3: the core parameter k of the algorithms set in Steps 1.4 and 1.5 of the recommendation models built in Step 1 is updated successively to 30, 40 and 50, and the book recommendation lists under the different values of k are computed. Step 2 is then repeated to compute, for k=30, k=40 and k=50, the precision and recall of the user-based and item-based k-nearest neighbor algorithms.
Step 4: the three steps above yield, for k=20, k=30, k=40 and k=50, the precision and recall of the user-based and item-based k-nearest neighbor algorithms, 8 groups of precision and recall in total. The precision and recall for each of k=20, k=30, k=40 and k=50 are then added to obtain the performance index values of the user-based and item-based k-nearest neighbor algorithms. Next, the 4 performance index values of the user-based algorithm under the different values of k are sorted in descending order, and likewise the 4 performance index values of the item-based algorithm. Finally, the value of k corresponding to the maximum of the 4 performance index values of the user-based k-nearest neighbor algorithm is taken as the user's optimal parameter k in the user-based algorithm. Likewise, the value of k corresponding to the maximum of the 4 performance index values of the item-based k-nearest neighbor algorithm is taken as the user's optimal parameter k in the item-based algorithm.
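The selection in Step 4 reduces to taking the k with the largest precision-plus-recall sum. A minimal sketch, assuming the per-k results for one user and one algorithm are held in a dictionary (the function name and data layout are illustrative):

```python
def best_k(metrics):
    """metrics: {k: (precision, recall)} for one user and one algorithm.
    Returns (optimal k, performance index value at that k)."""
    index = {k: p + r for k, (p, r) in metrics.items()}  # precision + recall
    k_opt = max(index, key=index.get)  # on ties, the first k inserted wins
    return k_opt, index[k_opt]
```

The function would be called once with the user-based results and once with the item-based results, giving the two optimal k values and the two performance index values used in Step 5.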
The purpose of the four steps above is to train the optimal algorithm parameter k for each user and to obtain the performance index value of the algorithm at that optimal k, that is, precision plus recall. Next, the performance index values obtained in these four steps are used to assign weights to the user-based and item-based k-nearest neighbor algorithms.
Step 5: the user's optimal performance index value for the user-based k-nearest neighbor algorithm and the user's performance index value for the item-based k-nearest neighbor algorithm are input. From these two values the weights of the user-based and item-based k-nearest neighbor algorithms are assigned for that user, with the weights computed as follows:
In the above formula, Weight(U(u)) and Weight(I(u)) denote the weights that user u gives to the user-based and item-based k-nearest neighbor algorithms respectively; Pre(U(u)) and Re(U(u)) denote the precision and recall of the user-based k-nearest neighbor algorithm for user u; Pre(I(u)) and Re(I(u)) denote the precision and recall of the item-based k-nearest neighbor algorithm for user u. The computed results are rounded, and the final results are taken as the weights assigned for the user to the user-based and item-based k-nearest neighbor algorithms. The flowchart of this step is shown in Fig. 5.
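The weight formula is not reproduced in this text. A natural reading of the symbol definitions, assumed here, is that each algorithm's weight is its performance index value (precision plus recall) divided by the sum of both algorithms' values. The rounding precision is also not stated; one decimal place is assumed so that, with N = 10, the per-algorithm book counts come out as whole numbers. The function name and the equal split for the all-zero case are choices of this sketch.

```python
def assign_weights(index_user, index_item, digits=1):
    """index_user = Pre(U(u)) + Re(U(u)); index_item = Pre(I(u)) + Re(I(u)).
    Returns (Weight(U(u)), Weight(I(u))), rounded to `digits` decimals."""
    total = index_user + index_item
    if total == 0:
        return 0.5, 0.5  # assumed fallback: no evidence for either algorithm
    w_user = round(index_user / total, digits)
    return w_user, round(1 - w_user, digits)  # weights sum to 1
```

Note that Python's built-in `round` rounds exact ties to the nearest even digit, which may differ from the rounding the source intends.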
Step 6: as shown in Fig. 6, the user's optimal value of k for the user-based k-nearest neighbor algorithm and optimal value of k for the item-based k-nearest neighbor algorithm are input, and the recommendation models of Step 1 are used to generate for the user a book recommendation list from the user-based algorithm and a book recommendation list from the item-based algorithm. Then, from the weights assigned for that user to the two algorithms, the number of books in the final hybrid recommendation list contributed by the user-based algorithm and the number contributed by the item-based algorithm are computed as follows:
N(U(u)) = Weight(U(u)) × N (13)
N(I(u)) = Weight(I(u)) × N (14)
In the above formulas, N(U(u)) denotes the number of books in the hybrid recommendation list contributed by the user-based k-nearest neighbor algorithm; N(I(u)) denotes the number of books in the hybrid recommendation list contributed by the item-based k-nearest neighbor algorithm; Weight(U(u)) and Weight(I(u)) denote the weights that user u gives to the user-based and item-based k-nearest neighbor algorithms respectively; N denotes the number of books in the book recommendation list. The results are rounded to the nearest integer, and the hybrid book recommendation list is finally obtained.
The principle for generating the mixed recommendation list is as follows: the number of recommended items N is multiplied by the weight of an algorithm to obtain N′, and the top N′ books of that algorithm's recommendation list are taken as books in the final mixed recommendation list. The two algorithms thus each contribute a different number of books to the final mixed recommendation list.
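The mixing principle above can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the invention: the function name `mix_recommendations` is hypothetical, the weights are normalized by their sum before formulas (13) and (14) are applied (an assumption, since the embodiment reports integer weights such as 2 and 8), the second count is taken as the remainder so that the list always has N entries, and books already contributed by the first list are skipped.

```python
def mix_recommendations(user_based, item_based, w_user, w_item, n):
    """Blend two ranked book lists in proportion to their weights.

    user_based / item_based: ranked lists of book identifiers.
    w_user / w_item: weights of the two algorithms (hypothetical scale;
    they are normalized by their sum here, which is an assumption).
    n: desired length of the mixed recommendation list.
    """
    total = w_user + w_item
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # N(U(u)) = Weight(U(u)) x N, rounded half up.
    n_user = int(n * w_user / total + 0.5)
    # Assumption: the item-based share is the remainder, so both counts sum to n.
    n_item = n - n_user

    mixed = list(user_based[:n_user])
    target_len = len(mixed) + n_item
    for book in item_based:
        if len(mixed) >= target_len:
            break
        if book not in mixed:  # assumption: skip duplicates across the two lists
            mixed.append(book)
    return mixed
```

With weights 2 and 8 and N = 10, the sketch takes two books from the user-based list and eight from the item-based list, which matches the counts reported in the embodiment.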
The present invention also provides a book recommendation system based on the weighted hybrid k-nearest-neighbor algorithm. A recommendation system deals with two important objects, users and items, and the core of the system is the recommendation engine, which links users and items together. The main task of the book recommendation system is to provide the user with a book recommendation list. In general, three parts (users, items, and the recommendation engine) form a complete recommendation system. This section describes the overall system framework and the model design of these three parts. The system framework is shown in Fig. 7:
The storage layer stores the data used and generated by the system, mainly the basic information of users and books and the user behavior information.
The user and item information collection layer is responsible for entering and maintaining the basic information of users and books and the user behavior information.
The recommendation engine performs its calculations on the users' historical behavior data toward items and generates the recommendation list. This system uses collaborative filtering recommendation algorithms, specifically user-based k-nearest-neighbor collaborative filtering and item-based k-nearest-neighbor collaborative filtering, to build the recommendation engine. Because each algorithm has its own strengths and weaknesses, the system introduces a hybrid recommendation method that combines the k-nearest-neighbor-based algorithms by weighting.
The interface layer is responsible for communication between the system and the front end. Because the system runs in the background, the computed data must be passed to the front end for display, and the users' book ratings collected by the front end must be passed back. The work of the interface layer is therefore to provide the front end with the data it requests and to pass the user behavior data sent by the front end to the storage layer for storage.
The database uses SQLite to store the users' behavior data and the basic data of users and books. The database of this system contains the user and book entities; there is a many-to-many relationship between users and books, and a many-to-many relationship between users and users. The attributes of these entities are shown in Fig. 8:
Based on this ER diagram, the corresponding database tables can be designed:
(a) Collection of system object information
System objects comprise user information and basic book information. User information is collected in two parts. The first part comes from the information filled in at user registration. The second part is computed by the system background from the existing user behavior data; it comprises the values of the core parameter of the item-based and the user-based k-nearest-neighbor algorithms, together with the weights assigned to each of the two algorithms when they are mixed.
For a new user there is no user behavior data, so the background program cannot train the optimal algorithm parameters or assign the weights of the different algorithms. The recommendation simulation experiments show that, for most users, the optimal parameter is 50 and the weights assigned to the item-based k-nearest-neighbor algorithm and the user-based k-nearest-neighbor algorithm are 9 and 1, respectively. The system therefore sets the default optimal parameter k of a new user to 50 and the default weights to 9 and 1. The specific user attributes are shown in Table 1:
Table 1
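The default handling for new users described above can be sketched as follows. The class and field names are hypothetical and are not taken from Table 1; only the default values (k = 50, weight 9 for the item-based algorithm, weight 1 for the user-based algorithm) come from the description.

```python
from dataclasses import dataclass


@dataclass
class UserRecommendationParams:
    """Per-user algorithm parameters (field names are illustrative assumptions)."""
    k_user_based: int = 50      # default optimal k, user-based k-nearest-neighbor
    k_item_based: int = 50      # default optimal k, item-based k-nearest-neighbor
    weight_user_based: int = 1  # default weight of the user-based algorithm
    weight_item_based: int = 9  # default weight of the item-based algorithm


def params_for(user_id, trained_params):
    """Return the trained parameters of a user, or the defaults for a new user.

    trained_params: dict mapping user ids to UserRecommendationParams,
    filled in by the background training job (hypothetical structure).
    """
    return trained_params.get(user_id, UserRecommendationParams())
```

A user with rating history keeps the parameters trained for that user, while any user missing from the dictionary falls back to the defaults.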
Basic book information is collected using crawler technology through the Douban API interface. The specific book attributes are shown in Table 2:
Table 2
Collection of user behavior records:
What the algorithms in the recommendation system use is the user behavior record, which in this system mainly refers to the users' rating records of books. A record is collected when a user retrieves a book in the system front end and rates it; the rating data is then transmitted through the interface to the background database. The specific content of the user-book rating table involved in the user behavior record is shown in Table 3:
Table 3
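The storage of rating records in SQLite can be sketched with Python's standard `sqlite3` module. The table and column names below are hypothetical and are not taken from Tables 1 to 3; the sketch only illustrates the many-to-many relationship between users and books through a rating table.

```python
import sqlite3


def create_schema(conn):
    """Create illustrative user, book, and rating tables (names are assumptions)."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS users (
            user_id TEXT PRIMARY KEY,
            name    TEXT
        );
        CREATE TABLE IF NOT EXISTS books (
            book_id TEXT PRIMARY KEY,
            title   TEXT
        );
        -- One row per (user, book) pair models the many-to-many relationship.
        CREATE TABLE IF NOT EXISTS ratings (
            user_id TEXT NOT NULL REFERENCES users(user_id),
            book_id TEXT NOT NULL REFERENCES books(book_id),
            score   REAL NOT NULL,
            PRIMARY KEY (user_id, book_id)
        );
        """
    )


def save_rating(conn, user_id, book_id, score):
    """Store a rating passed back from the front end, replacing any earlier one."""
    conn.execute(
        "INSERT OR REPLACE INTO ratings (user_id, book_id, score) VALUES (?, ?, ?)",
        (user_id, book_id, score),
    )
    conn.commit()
```

For experimentation, `sqlite3.connect(":memory:")` gives a throwaway database; a file path would be used for persistent storage.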
The recommendation module mainly builds the user and item recommendation models, trains the core algorithm parameters, computes the algorithm weights, and recommends books that may interest the user. This module is the core module of the book recommendation system; its function is to produce the book recommendation list for the user. The module involves two recommendation algorithms, namely user-based k-nearest-neighbor collaborative filtering and item-based k-nearest-neighbor collaborative filtering. The module can realize recommendation by each individual algorithm, and it also realizes hybrid recommendation through the weighted mixing of the user-based and item-based k-nearest-neighbor algorithms. The design of the recommendation module is shown in Fig. 9:
The system first collects the user behavior records and builds from them a user-item rating matrix on which the related computations are carried out, thereby realizing recommendation to the user. The design block diagram of the recommendation algorithm is shown in Fig. 10:
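The construction of the user-item rating matrix and a user-based k-nearest-neighbor recommendation over it can be sketched as follows. This is a minimal illustration under stated assumptions: cosine similarity is used as the similarity measure and candidate books are scored by similarity-weighted ratings, neither of which is confirmed by this section, and all function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def build_rating_matrix(records):
    """Turn (user, book, score) records into a nested dict: user -> {book: score}."""
    matrix = defaultdict(dict)
    for user, book, score in records:
        matrix[user][book] = score
    return dict(matrix)


def cosine_similarity(a, b):
    """Cosine similarity of two sparse rating vectors (assumed similarity measure)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_user_based(matrix, user, k, n):
    """Recommend up to n unseen books using the k most similar users."""
    target = matrix[user]
    sims = [
        (cosine_similarity(target, ratings), other)
        for other, ratings in matrix.items()
        if other != user
    ]
    neighbors = sorted(sims, key=lambda pair: pair[0], reverse=True)[:k]

    scores = defaultdict(float)
    for sim, other in neighbors:
        if sim <= 0:
            continue  # ignore users with no overlap
        for book, score in matrix[other].items():
            if book not in target:
                scores[book] += sim * score
    # Highest score first; ties broken by book id for a stable order.
    return sorted(scores, key=lambda b: (-scores[b], b))[:n]
```

The item-based variant follows the same pattern with the matrix transposed, so that similarities are computed between books instead of between users.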
Embodiment:
The following uses a data set containing a total of 129334 user book rating records, involving 265 books and 1968 users.
(1) First, the data set is randomly divided into 3 parts according to a uniform distribution; one part is chosen as the test set, and the remaining 2 parts are used as the training set. Table 4 and Table 5 below show the training set and the test set of user cytun.
Table 4
Table 5
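The split in step (1) can be sketched as follows. This is an illustrative sketch in which each record is assigned to one of three parts uniformly at random; the function name, the fixed seed, and the choice of the first part as the test set are assumptions made for reproducibility of the example.

```python
import random


def split_train_test(records, parts=3, seed=0):
    """Randomly assign records to `parts` folds; one fold is the test set.

    Returns (train, test). The seed is fixed only so that the example is
    reproducible; it is not part of the described method.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(parts)]
    for record in records:
        # Each record lands in a fold chosen uniformly at random.
        folds[rng.randrange(parts)].append(record)

    test = folds[0]
    train = [record for fold in folds[1:] for record in fold]
    return train, test
```

Because assignment is random per record, the test set holds roughly one third of the records rather than exactly one third.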
(2) The user-based and item-based k-nearest-neighbor recommendation models are built on the training set of user cytun's historical book rating data. Using the test set of user cytun's historical book rating data, the precision and recall of the user-based k-nearest-neighbor algorithm and of the item-based k-nearest-neighbor algorithm are calculated for k=20, k=30, k=40, and k=50, giving 8 groups of precision and recall in total. The details are shown in Table 6 and Table 7: Table 6 gives the performance parameters of the user-based algorithm for user cytun, and Table 7 gives the performance parameters of the item-based algorithm for user cytun.
Table 6
Table 7
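The evaluation in step (2) can be sketched as follows, assuming the usual top-N definitions: precision is the fraction of recommended books that appear in the user's test set, and recall is the fraction of the user's test-set books that were recommended. The function names and the `recommend` callback are hypothetical.

```python
def precision_recall(recommended, relevant):
    """Precision and recall of one recommendation list against a test set."""
    recommended_set = set(recommended)
    relevant_set = set(relevant)
    hits = len(recommended_set & relevant_set)
    precision = hits / len(recommended_set) if recommended_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def evaluate_k_values(recommend, test_books, k_values=(20, 30, 40, 50), n=10):
    """Evaluate one algorithm for several k values.

    recommend: callable (k, n) -> list of recommended book ids (hypothetical
    interface standing in for the user-based or item-based model).
    Returns a dict mapping each k to its (precision, recall) pair.
    """
    return {k: precision_recall(recommend(k, n), test_books) for k in k_values}
```

Running `evaluate_k_values` once for the user-based model and once for the item-based model yields the 8 groups of precision and recall mentioned above.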
Using user cytun's best performance index values for the user-based k-nearest-neighbor algorithm and for the item-based k-nearest-neighbor algorithm, weights are assigned to the two algorithms for user cytun. According to the weight calculation formula, the weight of the user-based k-nearest-neighbor algorithm is 2 and the weight of the item-based k-nearest-neighbor algorithm is 8.
The optimal value of the parameter k of the user-based k-nearest-neighbor algorithm and the optimal value of the parameter k of the item-based k-nearest-neighbor algorithm for user cytun are input, and the recommendation models are used to generate for the user one book recommendation list from the user-based k-nearest-neighbor algorithm and one from the item-based k-nearest-neighbor algorithm. Then, according to the weights that the user assigns to the two algorithms, the number of books that each algorithm contributes to the final mixed recommendation list is calculated. According to the calculation formulas, the user-based k-nearest-neighbor algorithm contributes 2 books to the mixed book recommendation list and the item-based k-nearest-neighbor algorithm contributes 8 books. The final mixed book recommendation list is shown in Fig. 11.
The recommendation algorithm applied in the present invention is based on the weighted hybrid k-nearest-neighbor algorithm and specifically uses the user-based k-nearest-neighbor algorithm and the item-based k-nearest-neighbor algorithm. From the ratings that different readers give to books, these algorithms can recommend to each reader, in a personalized way, books that may interest that reader. In collaborative filtering, different parameter choices affect the recommendation results differently. The present invention therefore trains, by test experiments, the core parameter k of the k-nearest-neighbor-based algorithms for each user, in order to improve the performance of the recommendation algorithm and obtain the best recommendation results. The present invention realizes book recommendation based on a single collaborative filtering algorithm, and also combines the user-based and item-based k-nearest-neighbor collaborative filtering algorithms by weighting to realize hybrid recommendation. Finally, a personalized book recommendation system is designed that provides historical book rating record queries, book information queries, and book recommendation, and recommendation simulation experiments are carried out on the data set.
The above is only a preferred embodiment of the present invention and does not limit the scope of practice of the present invention; all equivalent changes and modifications made according to the content of the scope of the present invention shall fall within the technical scope of the present invention.