CN110263257A - Hybrid recommendation model for multi-source heterogeneous data based on deep learning - Google Patents
- Publication number
- CN110263257A (application CN201910547320.1A)
- Authority
- CN
- China
- Prior art keywords
- user
- article
- comment
- model
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Business, Economics & Management (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Accounting & Taxation (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Finance (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Deep learning has been widely applied to image and audio recognition, text classification, and representation learning, and deep-learning-based recommender systems have become a research hotspot in recent years. Deep learning models achieve excellent results in representation learning for specific data types such as images and text: they avoid complex feature engineering, produce nonlinear, multi-level abstract feature representations of heterogeneous data, and thereby overcome the heterogeneity of diverse data sources. However, no deep learning recommendation model that fuses ratings, reviews, and social networks has yet been proposed. Based on deep learning algorithms, this patent presents a recommendation pipeline with strong extensibility, analyzes which algorithms and principles suit each kind of data, and proposes a final loss function that combines reviews, ratings, and social information according to the loss functions of the individual data types, improving the accuracy of recommendation results.
Description
Technical field
In recent years deep learning has been widely applied to image and audio recognition, text classification, and representation learning, and deep-learning-based recommender systems have become a research hotspot. Deep learning models achieve excellent results in representation learning for specific data types such as images and text: they avoid complex feature engineering, yield nonlinear, multi-level abstract feature representations of heterogeneous data, and overcome the heterogeneity of diverse data sources. To date, no deep learning recommendation model fusing ratings, reviews, and social networks has been proposed. Based on deep learning algorithms, this patent presents a recommendation model with strong extensibility.
Background technique
Current deep learning models still cannot make recommendations that combine rating, review, and social network information, because feature representation of multi-source heterogeneous data remains difficult: social information cannot be fused directly with user-item interaction information. If deep learning methods can learn representations of the different heterogeneous data types and unify them in a single deep learning model, this resolves the earlier shortcoming of having to select different algorithms for algorithm fusion, exploits deep feature representation learning, and significantly improves the accuracy of recommendation results. To make full use of all three data types, this patent fuses rating and review features and adds social information to the training process, proposing a multi-source heterogeneous data recommendation model based on deep learning.
For review data, traditional topic models cannot represent text features accurately; this patent learns feature representations of review documents with the PV-DBOW model. PV-DBOW assumes the words in a document are independent and uses the document to predict each observed word: each document is represented by a dense vector that is trained to predict the words in the document. For rating data, traditional matrix factorization faces data sparsity and low accuracy; this patent trains on ratings with a neural network, which better captures the features of users and items. For social network data, this patent adds the user's social relationship information to BPR-based pairwise learning, making the sampling of this method more reasonable and improving the accuracy of recommendation results.
Summary of the invention
A recommendation model capable of handling multi-source heterogeneous data is proposed based on deep learning; it offers high accuracy and strong extensibility. The model uses a deep-learning-based method for learning text paragraph representations, designs a neural network that learns user and item features from ratings, and constrains pairwise learning with the social network. Because deep-learning-based text representation learning is relatively mature, an existing network can be used directly, and its training result fused and trained jointly with the other features to obtain a more accurate fused representation. Rating data differs from text data: user and item features can be learned from it directly, so there is no need to learn a vector representation of the rating itself or of its content.
The multi-source heterogeneous data recommendation model based on deep learning covers three kinds of data: reviews, ratings, and social networks. Each has its own characteristics and reflects features of users or items from a different angle. A deep model learns vector representations of the different data types, and the fused feature of a user or item is then obtained by concatenation. Review features reflect a user's attitude toward an item and can also express the item's attributes; the model learns feature representations of review paragraphs with the PV-DBOW algorithm, then obtains user or item vectors by weighted superposition. Rating features capture a user's overall evaluation of an item, reflecting the nonlinear characteristics of the user's satisfaction with the item, and can be learned with BPR. The social network captures friendship relations among users, which indirectly affect user-item interactions; using social network relations strengthens the constraint on users' purchasing behavior and further improves recommendation accuracy.
The method comprises the following steps:
(1) Text feature extraction: learn feature vector representations of text paragraphs with the PV-DBOW model. The model uses the Distributed Bag-of-Words architecture, which predicts randomly sampled words in a paragraph from a single paragraph vector.
(2) Rating feature extraction: learn users' ratings of items with a two-layer fully connected neural network. Unlike the text feature learning model, this method directly yields feature vector representations of users and items rather than extracting features of the rating itself.
(3) User-item feature fusion: from the review text features obtained in (1), compute user features as the weighted sum of the feature vectors of the reviews each user wrote, and item features as the weighted sum of the feature vectors of the reviews each item received. Finally, a fusion function combines each user's text and rating features into the user's fused feature, and each item's text and rating features into the item's fused feature.
(4) BPR-based optimization: sample triples reflecting user preference based on the social network, and optimize according to Bayesian theory to obtain the optimal model parameters.
(5) Recommendation: with the model parameters from step (4), feed the user and item feature vectors into the model to recommend items to users.
Step (1), text feature extraction, comprises four sub-steps:
1. Text preprocessing. Each paragraph is represented by a one-hot vector identifying a column of the paragraph matrix. The words in the review text are deduplicated and added to a dictionary, and each word is represented by a unique one-hot vector. Once built, each column of the paragraph matrix corresponds uniquely to one review.
2. Word sampling. The model predicts the words in a paragraph from a single paragraph vector, where the words are obtained by random sampling within the paragraph. Each word is treated as independent, and word order does not affect the learned paragraph vector.
3. Optimization. Using the paragraph vector from sub-step 1 as input and the words sampled in sub-step 2 as output, the paragraph vector model is trained iteratively. The model is built on a neural network with a softmax classifier, and its parameters are obtained by stochastic gradient descent.
4. Feature representation of review text. After training, each column of the paragraph matrix is the feature vector of one review. Multiplying the paragraph matrix by the one-hot vector defined in sub-step 1 yields the feature representation of the corresponding paragraph.
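The lookup in sub-step 4 can be illustrated with a minimal numpy sketch (matrix contents and sizes are toy values, not from the patent): multiplying the paragraph matrix by a review's one-hot vector selects that review's column, i.e. its feature vector.

```python
import numpy as np

# Toy sizes: 4 reviews (paragraphs), K = 3 feature dimensions.
K, n_docs = 3, 4
paragraph_matrix = np.arange(K * n_docs, dtype=float).reshape(K, n_docs)

# One-hot vector identifying the 3rd review (index 2).
one_hot = np.zeros(n_docs)
one_hot[2] = 1.0

# Multiplying the matrix by the one-hot vector extracts that review's
# feature vector, i.e. the corresponding column of the paragraph matrix.
feature = paragraph_matrix @ one_hot
```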
Step (2), rating feature extraction, comprises two sub-steps:
1. Neural network construction. The rating feature extraction model is based on a two-layer fully connected neural network with the ELU activation function. The network input is the element-wise product of the user's and item's feature vectors; the output is the user's rating of the item.
2. User-item feature optimization. According to the objective function, the rating feature vectors of users and items are optimized continually with stochastic gradient descent to reduce the loss. When the predicted rating is close enough to the actual rating, training stops and the rating features of users and items are obtained.
Step (3) merges the features from steps (1) and (2) into a new fused feature. The rating feature represents the user's overall assessment of an item and is simple and clear, while reviews contain the user's differentiated opinions in more detail. Step (3) fuses the review and rating features into a richer, more comprehensive user representation. The fusion method concatenates the review feature vector and the rating feature vector to obtain the fused feature vector.
Step (4), BPR-based optimization, mainly comprises the following sub-steps:
1. Triple generation. Because users' preferences tend to be similar to their friends', in practice a user is more likely to choose items that his or her friends have bought or prefer. This preference similarity between users is applied to the sampling of the BPR model: by constraining the sampling process more reasonably, triples that better match user behavior are obtained, improving subsequent model training and recommendation accuracy.
2. Model optimization. A unified objective function is proposed for model optimization. The fusion function for multi-source heterogeneous data was given above; an objective function is now built on the fused features so that, during learning, the fused features represent users and items more accurately. The objective can be solved with stochastic gradient descent; existing deep learning frameworks all integrate stochastic gradient descent, so the final user and item feature vectors can be obtained by calling library functions.
Step (5) recommends items of interest to users. Multiplying a user's feature vector by the feature vector of each item the user has not yet bought or browsed yields the user's preference score for that item; a higher score means the user is more likely to buy or browse the item. Sorting all items by score in descending order and taking the top N gives the user's Top-N recommendation list.
Detailed description of the invention
Fig. 1 is the flow chart of the hybrid recommendation model based on multi-source heterogeneous data.
Specific embodiment
Following the method described in this specification, implementing the recommendation model based on multi-source heterogeneous data requires the following steps:
(1) Text character extraction
1. Text Pretreatment
Let d_uv denote the review text of user u on item v, and let w denote a word in the review text. The feature vectors of users and items learned from reviews are denoted u1 and v1, the feature vector of a paragraph is denoted d_uv, and word vectors are denoted w; the words of all reviews are stored in a dictionary V. All of these feature vectors have dimension K.
2. word samples
For each paragraph, a text region is chosen at random and some words are randomly sampled from it as the targets for training the classifier. The size of the text region and the number of words sampled from it are set manually.
3. optimizing
Each review is mapped into a random high-dimensional semantic space, and the words contained in the paragraph are then predicted; optimizing this prediction through learning yields more accurate paragraph feature vectors. Under the bag-of-words assumption, the probability that word w appears in document d_uv is computed with softmax:

P(w | d_uv) = exp(w · d_uv) / Σ_{w'∈V} exp(w' · d_uv)

where w' ranges over all words in the dictionary V and exp is the exponential function with base e. This formula gives the probability of any word appearing in the document. In practice, maximizing the word occurrence probability makes the gradient expensive to compute. To reduce this cost, negative sampling is commonly used: instead of using every word in the dictionary, some of the words that do not occur are sampled according to a predefined noise distribution and used as negative samples for an approximate computation. With the negative sampling strategy, the PV-DBOW objective is defined as:

L = Σ_{d_uv} Σ_{w∈V} #(w, d_uv) [ log σ(w · d_uv) + t · E_{w'~P_V} log σ(−w' · d_uv) ]

which sums over all word-document combinations, where #(w, d_uv) is the number of times word w appears in document d_uv (terms for words that do not occur are 0), σ is the sigmoid function, t is the number of negative samples, and the expectation E is taken over the noise distribution P_V.
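As an illustration, the negative-sampling objective for a single document can be evaluated numerically. This is a minimal numpy sketch under assumed toy sizes, with a uniform noise distribution standing in for the patent's unspecified predefined P_V:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
K, vocab = 3, 5                           # assumed toy sizes
word_vecs = rng.normal(size=(vocab, K))   # one vector per dictionary word
d_uv = rng.normal(size=K)                 # paragraph vector of one review
counts = np.array([2, 0, 1, 0, 3])        # #(w, d_uv): occurrences of each word
noise = np.full(vocab, 1.0 / vocab)       # P_V: uniform noise (an assumption)
t = 2                                     # number of negative samples

scores = word_vecs @ d_uv
# E_{w'~P_V}[log σ(-w'·d_uv)]: expected negative-sample term under the noise distribution
neg_expect = noise @ np.log(sigmoid(-scores))
# Σ_w #(w,d) [ log σ(w·d) + t · E[...] ]; words that never occur contribute 0
objective = np.sum(counts * (np.log(sigmoid(scores)) + t * neg_expect))
```

Training would then adjust d_uv and the word vectors to increase this objective.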
4. the character representation of comment text
From the above objective function, the feature representation d_uv of each document is obtained. As in recommendation models based on conventional machine learning methods, the feature vectors of users and items can be expressed in terms of the review feature vectors. Here, however, user and item representations are no longer computed as the average of the review feature vectors; they are learned through the subsequent integrated model optimization.
The user feature factor is the weighted, normalized sum of the feature vectors of all of the user's reviews:

p'_uk = Σ_{v∈D_u} W_uv · d_uv,k,  p_uk = p'_uk / Σ_{k'} p'_uk'

where D_u denotes all reviews of user u, p'_uk is the user's total probability on topic k, W_uv is the weight of the user's review of item v, and p_uk is its normalized form. The feature factor of user u is:

p_u = (p_u1, ..., p_uK)

and has dimension K. The item feature factor is computed with an analogous formula:

q'_vk = Σ_{u∈D_v} W_uv · d_uv,k,  q_vk = q'_vk / Σ_{k'} q'_vk'

where D_v denotes all reviews received by item v, q'_vk is the item's total probability on topic k, W_uv is the weight of the review received from user u, and q_vk is its normalized form. The item feature factor is:

q_v = (q_v1, ..., q_vK)

with dimension K, consistent with the user factor. Here W_uv is the weight of review d_uv for user u and item v; only through these weights can the importance of different reviews be differentiated, so that reasonable user and item features can be constructed.
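The weighted sum and normalization of review vectors into a user feature factor can be sketched as follows (all vectors and weights are hypothetical toy values, not learned ones):

```python
import numpy as np

# Hypothetical toy data: user u wrote 3 reviews, each a K=4 dim PV-DBOW vector.
comment_vecs = np.array([[0.2, 0.1, 0.4, 0.3],
                         [0.1, 0.5, 0.2, 0.2],
                         [0.3, 0.3, 0.2, 0.2]])
W_uv = np.array([0.5, 0.3, 0.2])  # learned per-review weights (illustrative values)

p_raw = W_uv @ comment_vecs       # p'_uk: weighted sum over the user's reviews
p_u = p_raw / p_raw.sum()         # p_uk: normalized user feature factor, dimension K
```

The item factor q_v is computed the same way over the reviews an item receives.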
(2) scoring feature extraction
1. Neural network construction. A two-layer fully connected neural network is trained to produce the user's final rating of an item; feature vector representations of users and items are obtained directly. Let r_ui denote the rating of user u on item i; every rating r_ui has a corresponding user vector r_u and item vector r_i. The two-layer network prediction formula is:

r_ui = φ(U2 · φ(U1 (r_u ⊙ r_i) + c1) + c2)

where ⊙ denotes element-wise multiplication, φ(x) is the ELU activation function, and U1, U2, c1, and c2 are the weight and bias parameters to be learned.
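A minimal numpy sketch of this forward pass, with assumed dimensions and randomly initialized parameters rather than learned ones:

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU activation: x for x > 0, alpha*(e^x - 1) otherwise."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

rng = np.random.default_rng(1)
K, H = 4, 8                         # assumed feature and hidden-layer sizes
r_u = rng.normal(size=K)            # user rating-feature vector
r_i = rng.normal(size=K)            # item rating-feature vector
U1, c1 = rng.normal(size=(H, K)), np.zeros(H)
U2, c2 = rng.normal(size=H), 0.0

# r_ui = φ(U2 · φ(U1 (r_u ⊙ r_i) + c1) + c2)
hidden = elu(U1 @ (r_u * r_i) + c1)
r_ui = elu(U2 @ hidden + c2)
```

In training, U1, U2, c1, c2, r_u, and r_i would be updated by stochastic gradient descent on the squared rating error.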
2. User-item feature optimization. The objective function is the squared difference between the predicted and actual ratings; optimizing the parameters to minimize it yields the optimal user and item representations.
(3) consumer articles Fusion Features
The feature vectors of users and items are constructed from the interaction information between users and items. A feature fusion function f(·) is proposed: if x1 and x2 are the representations learned from the rating and text data, the fused feature is obtained through the fusion function:

x = f(x1, x2)

where x is the fused feature. Fusing by simple concatenation enhances the scalability of the user and item features, which matters greatly for a model based on multi-source heterogeneous data. The feature obtained through f(·) is thus the concatenation x = (x1, x2).
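The series (concatenation) fusion f(x1, x2) is straightforward; a small sketch with hypothetical feature values:

```python
import numpy as np

x1 = np.array([0.2, 0.8])         # e.g. review-text feature (toy values)
x2 = np.array([0.5, 0.1, 0.4])    # e.g. rating feature; dimensions may differ
x = np.concatenate([x1, x2])      # f(x1, x2): simple series (concatenation) fusion
```

Because concatenation imposes no constraint on the input dimensions, further feature sources can be appended without redesigning the model.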
(4) based on the optimization of BPR
1. Triple generation. From each user u's purchase or browsing records and the social network, define i as an item the user has bought or browsed, j as an item the user has never touched, and p as an item bought by the user's friends. Let D be the set of all items in the system, D_u the set of items user u has bought or browsed, and D_p the set of items the user's friends have bought. The items that best represent the user's preference are, first, the items D_u the user has bought; second, by the similarity of friends' preferences, the items D_p \ D_u that the user's friends bought but the user did not; and finally, the items the user is least likely to buy, D \ (D_u ∪ D_p). The user-item triples constructed from the social network information serve as the training set T:

T := {(u, i, j) | i ∈ (D_u ∪ D_p), j ∈ D \ (D_u ∪ D_p)}

where (u, i, j) is a user-item triple expressing that user u prefers item i over item j: item i was bought by the user or by a direct friend of the user, while item j was bought by neither. The user's direct friendship relations thus drive the construction of the user-item triples used to train the subsequent BPR model.
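The training-set construction T can be sketched with Python sets (item ids and the friendship data are toy values):

```python
# Toy interaction data for one user u (item ids are illustrative).
D = {1, 2, 3, 4, 5, 6}   # all items in the system
D_u = {1, 2}             # items u bought or browsed
D_p = {3}                # items u's direct friends bought

positives = D_u | D_p                # i ∈ D_u ∪ D_p
negatives = D - positives            # j ∈ D \ (D_u ∪ D_p)
T = [("u", i, j) for i in sorted(positives) for j in sorted(negatives)]
```

Each triple asserts that u prefers the positive item i over the untouched item j.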
2. model optimization
2. Model optimization. By the definition above, user u's preference for item i exceeds that for item j. A function g(·) expresses the loss in terms of the combined user and item features; here g(·) is defined as the sigmoid function, measuring the difference in the user's preference for different items: g(u, i, j) = σ(uᵀi − uᵀj). The objective function of the whole recommendation model fusing multi-source heterogeneous data is then defined as:

L = Σ_{(u,i,j)∈T} ln g(u, i, j) − W · Σ_{(u,i)} (r_ui − r̂_ui)² − λ · ||Θ||²

where W is the weight parameter of each sub-model: in the review sub-model the weight of each of a user's reviews differs and must be learned, while in the rating sub-model the features of users and items are obtained directly, so its weight parameter is fixed to 1 and need not be updated through the objective function. Θ denotes the remaining parameters to be learned, Θ = {Θ1, Θ2} = {{w, d_uv}, {U1, U2, c1, c2, r_u, r_i}}, and λ is the penalty parameter of each sub-model, with values in [0, 1]. The rating sub-model's objective carries a negative sign because it is minimized while the overall objective is maximized.
(5) recommend
The personalized recommendation list is obtained by multiplying user and item feature vectors:

s = uᵀv

Multiplying a user's feature vector by the feature vector of each item the user has not bought or browsed yields the user's preference score for that item; a higher score means the user is more likely to buy or browse it. Sorting all items by score in descending order and taking the top N gives the user's Top-N recommendation list.
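Scoring unseen items with s = uᵀv and taking the top N can be sketched as follows (feature vectors and the seen set are toy values):

```python
import numpy as np

rng = np.random.default_rng(3)
K, n_items = 4, 6
u = rng.normal(size=K)                     # user feature vector (toy values)
item_vecs = rng.normal(size=(n_items, K))  # one feature vector per item
seen = {0, 2}                              # items the user already bought/browsed

scores = item_vecs @ u                     # s = uᵀv for every item
candidates = [i for i in range(n_items) if i not in seen]
top_n = sorted(candidates, key=lambda i: scores[i], reverse=True)[:3]  # N = 3
```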
Claims (11)
1. A recommendation model capable of handling multi-source heterogeneous data, proposed based on deep learning, offering high accuracy and strong extensibility, the method comprising the following steps:
(1) text feature extraction: learning feature vector representations of text paragraphs with the PV-DBOW model; the model uses the Distributed Bag-of-Words architecture, which predicts randomly sampled words in a paragraph from a paragraph vector;
(2) rating feature extraction: learning users' ratings of items with a two-layer fully connected neural network; unlike the text feature learning model, this method directly yields feature vector representations of users and items rather than extracting features of the rating itself;
(3) user-item feature fusion: from the review text features obtained in (1), computing user features as the weighted sum of the feature vectors of the reviews each user wrote and item features as the weighted sum of the feature vectors of the reviews each item received, then combining each user's text and rating features into the user's fused feature with a fusion function, and likewise each item's text and rating features into the item's fused feature;
(4) BPR-based optimization: sampling triples reflecting user preference based on the social network and optimizing according to Bayesian theory to obtain the optimal model parameters;
(5) recommendation: with the model parameters obtained in step (4), feeding the user and item feature vectors into the model to recommend items to users.
2. The text feature extraction step (1) of claim 1, wherein in text preprocessing d_uv denotes the review text of user u on item v and w denotes a word in the review text; the feature vectors of users and items learned from reviews are denoted u1 and v1, the feature vector of a paragraph is denoted d_uv, word vectors are denoted w, and the words of all reviews are stored in a dictionary V; all these feature vectors have dimension K.
3. The text feature extraction step (1) of claim 1, wherein word sampling, for each paragraph, randomly chooses a text region and randomly samples some words from it as the targets for training the classifier; the size of the text region and the number of words sampled from it are set manually.
4. The text feature extraction step (1) of claim 1, wherein in optimization each review is mapped into a random high-dimensional semantic space and the words contained in the paragraph are predicted; optimizing through learning yields more accurate paragraph feature vectors; under the bag-of-words assumption, the probability that word w appears in document d_uv is computed with softmax:

P(w | d_uv) = exp(w · d_uv) / Σ_{w'∈V} exp(w' · d_uv)

where w' ranges over all words of the dictionary V and exp is the exponential function with base e; this formula gives the probability of any word appearing in the document; in practice, maximizing the word occurrence probability makes the gradient expensive to compute, so negative sampling is used: instead of all words in the dictionary, some non-occurring words are sampled according to a predefined noise distribution and used as negative samples for approximate computation; with the negative sampling strategy the PV-DBOW objective is defined as:

L = Σ_{d_uv} Σ_{w∈V} #(w, d_uv) [ log σ(w · d_uv) + t · E_{w'~P_V} log σ(−w' · d_uv) ]

where #(w, d_uv) is the number of times word w appears in document d_uv (terms for words that do not occur are 0), σ is the sigmoid function, t is the number of negative samples, and the expectation E is taken over the noise distribution P_V.
5. The text feature extraction step (1) of claim 1, wherein the feature representation of review text has the following characteristics: from the above objective function the feature representation d_uv of each document is obtained, and as in recommendation models based on conventional machine learning methods, the feature vectors of users and items can be expressed in terms of the review feature vectors; here, however, user and item representations are not computed as the average of the review feature vectors but learned through the subsequent integrated model optimization;
the user feature factor is the weighted, normalized sum of the feature vectors of all of the user's reviews:

p'_uk = Σ_{v∈D_u} W_uv · d_uv,k,  p_uk = p'_uk / Σ_{k'} p'_uk'

where D_u denotes all reviews of user u, p'_uk is the user's total probability on topic k, W_uv is the weight of the user's review of item v, and p_uk is its normalized form; the feature factor of user u is p_u = (p_u1, ..., p_uK), of dimension K; the item feature factor is computed analogously:

q'_vk = Σ_{u∈D_v} W_uv · d_uv,k,  q_vk = q'_vk / Σ_{k'} q'_vk'

where D_v denotes all reviews received by item v, q'_vk is the item's total probability on topic k, and q_vk is its normalized form; the item feature factor is q_v = (q_v1, ..., q_vK), with dimension K consistent with the user factor; W_uv, the weight of review d_uv for user u and item v, differentiates the importance of different reviews so that reasonable user and item features can be constructed.
6. The rating feature extraction step (2) of claim 1, wherein the neural network construction trains a two-layer fully connected neural network to produce the user's final rating of an item, directly yielding feature vector representations of users and items; let r_ui denote the rating of user u on item i, with corresponding user vector r_u and item vector r_i; the two-layer network prediction formula is:

r_ui = φ(U2 · φ(U1 (r_u ⊙ r_i) + c1) + c2)

where ⊙ denotes element-wise multiplication, φ(x) is the ELU activation function, and U1, U2, c1, and c2 are the weight and bias parameters to be learned.
7. The rating feature extraction step (2) of claim 1, wherein the objective function for user-item feature optimization is the squared difference between the predicted and actual ratings; optimizing the parameters to minimize it yields the optimal user and item representations.
8. In the user feature fusion step (3) described in claim 1, the feature vectors of users and articles are constructed from the interaction information between users and articles. A fusion function f(·) is proposed: assuming the feature representations learned from the rating data and the text data are x1 and x2, the fused feature is obtained through the fusion function:
x = f(x1, x2)
where x is the fused feature. A simple series (concatenation) fusion is used, which enhances the extensibility of the user and article features and is of great significance for a model based on multi-source heterogeneous data; the fused feature is thus obtained through the function f(·).
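A minimal sketch of the series (concatenation) fusion f(x1, x2) described above, for illustration only; the vector lengths are arbitrary:

```python
import numpy as np

def fuse(x1, x2):
    """f(x1, x2): series (concatenation) fusion of the rating-based
    feature x1 and the text-based feature x2 into one vector x."""
    return np.concatenate([x1, x2])

x1 = np.array([0.1, 0.2])        # feature learned from ratings (assumed values)
x2 = np.array([0.3, 0.4, 0.5])   # feature learned from review text
x = fuse(x1, x2)                 # -> [0.1, 0.2, 0.3, 0.4, 0.5]
```

Concatenation keeps both sources intact and lets further sources be appended without retraining the earlier extractors, which is the extensibility property the claim emphasizes.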
9. In the BPR-based optimization step (4) described in claim 1, the triple generation therein proceeds, for each user u, from the user's purchase/browsing records and social relations: an article the user has bought or browsed is defined as i; an article the user has never touched is defined as j; an article bought by the user's friends is defined as p. The set of all articles in the system is defined as D, the set of articles user u has bought or browsed as Du, and the set of articles bought by the user's friends as Dp. The articles that best represent the user's preference are, first, the articles Du the user has bought; second, by the similarity of friends' preferences, the user is likely to buy articles that friends have bought but the user has not, Dp \ Du; finally, the articles the user is least likely to buy are D \ (Du ∪ Dp). User-article triples are constructed as the training set according to the social network information; the training set T can be represented as:
T := {(u, i, j) | i ∈ (Du ∪ Dp), j ∈ D \ (Du ∪ Dp)}
where (u, i, j) is a user-article triple representing that user u prefers article i over article j; article i belongs to the articles bought by the user or by the user's direct friends, and article j belongs to the articles that neither the user nor the user's direct friends have bought. Thus user-article triples are constructed from the user's direct friend relations for the subsequent training of the BPR model.
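The triple construction above can be sketched as a set computation, for illustration; the function name and the toy article identifiers are assumptions:

```python
def build_triples(u, Du, Dp, D):
    """Training triples T = {(u, i, j) : i in Du ∪ Dp, j in D \ (Du ∪ Dp)}:
    i was bought/browsed by the user or bought by a direct friend,
    j was touched by neither."""
    pos = Du | Dp                 # Du ∪ Dp: user's and friends' articles
    neg = D - pos                 # D \ (Du ∪ Dp): least-likely articles
    return [(u, i, j) for i in sorted(pos) for j in sorted(neg)]

# user u1 bought 'a', a friend bought 'b', system catalog is {a, b, c, d}
T = build_triples('u1', {'a'}, {'b'}, {'a', 'b', 'c', 'd'})
# -> [('u1','a','c'), ('u1','a','d'), ('u1','b','c'), ('u1','b','d')]
```

Each triple encodes one pairwise preference (i over j) for the BPR objective of claim 10.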
10. In the BPR-based optimization step (4) described in claim 1, for the model optimization therein it is known from the preceding definitions that user u prefers article i over article j. A function g(·) combining the user and article feature representations is used to express the loss function and to compute the user's degree of preference between different articles; here g(·) is defined as the sigmoid function, so that g(u, i, j) = σ(u^T i − u^T j). The objective function of the whole recommendation model fusing multi-source heterogeneous data is then defined as follows:
where W is the weight parameter of each sub-model: in the comment-learning model the weight parameter of each of the user's comments is different and must be obtained by learning, while in the rating model what is learned is directly the features of users and articles, i.e. the weight parameter is set to 1 and need not be updated through the objective function. Θ represents the other parameters to be learned in the model, Θ = {Θ1, Θ2} = {{w, d_uv}, {U1, U2, c1, c2, r_u, r_i}}; λ is the penalty parameter of each sub-model, with values in the interval [0, 1]. A negative sign is added before the objective function of the rating model, because the rating model's objective is minimized while the objective of the overall model is maximized.
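The pairwise preference term g(u, i, j) = σ(u^T i − u^T j) can be sketched as follows, for illustration only; the toy vectors are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_term(u, i, j):
    """g(u, i, j) = sigma(u^T i - u^T j): the modeled probability that
    user u prefers article i over article j."""
    return sigmoid(u @ i - u @ j)

u = np.array([1.0, 0.0])   # user feature vector
i = np.array([2.0, 0.0])   # article aligned with the user's taste
j = np.array([-1.0, 0.0])  # article opposed to the user's taste
g = bpr_term(u, i, j)      # sigma(2 - (-1)) = sigma(3), close to 1
```

Maximizing the sum of ln g(u, i, j) over the triples of claim 9 pushes preferred articles above non-preferred ones, which is the standard BPR criterion the claim instantiates.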
11. In the recommendation step (5) described in claim 1, the personalized recommendation list can be obtained by multiplying the feature vectors of the user and the articles:
s = u^T v
Multiplying the user's vector with the feature vector of each article the user has not bought or browsed yields the user's preference score for that article; a higher score means the user is more likely to buy or browse the article. Sorting all articles by score in descending order and taking the top N yields the user's Top-N recommendation list.
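The Top-N selection step can be sketched as below, for illustration; the function name, the `seen` index set, and the toy item matrix are assumptions of the sketch:

```python
import numpy as np

def top_n(u, item_vecs, seen, n):
    """Score every article as s = u^T v, drop articles the user has
    already bought/browsed (`seen`), and return the indices of the
    N highest-scoring remaining articles in descending order."""
    scores = item_vecs @ u                                   # s = u^T v per article
    order = [int(k) for k in np.argsort(-scores) if int(k) not in seen]
    return order[:n]

u = np.array([1.0, 0.0])
items = np.array([[0.5, 0.0],   # item 0, score 0.5
                  [2.0, 0.0],   # item 1, score 2.0 (already seen)
                  [1.0, 0.0]])  # item 2, score 1.0
rec = top_n(u, items, seen={1}, n=2)  # -> [2, 0]
```
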
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910547320.1A CN110263257B (en) | 2019-06-24 | 2019-06-24 | Deep learning based recommendation method for processing multi-source heterogeneous data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910547320.1A CN110263257B (en) | 2019-06-24 | 2019-06-24 | Deep learning based recommendation method for processing multi-source heterogeneous data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110263257A true CN110263257A (en) | 2019-09-20 |
CN110263257B CN110263257B (en) | 2021-08-17 |
Family
ID=67920670
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910547320.1A Active CN110263257B (en) | 2019-06-24 | 2019-06-24 | Deep learning based recommendation method for processing multi-source heterogeneous data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110263257B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103399858A (en) * | 2013-07-01 | 2013-11-20 | 吉林大学 | Socialization collaborative filtering recommendation method based on trust |
US20140081965A1 (en) * | 2006-09-22 | 2014-03-20 | John Nicholas Gross | Content recommendations for Social Networks |
CN103778260A (en) * | 2014-03-03 | 2014-05-07 | 哈尔滨工业大学 | Individualized microblog information recommending system and method |
CN106022869A (en) * | 2016-05-12 | 2016-10-12 | 北京邮电大学 | Consumption object recommending method and consumption object recommending device |
CN106600482A (en) * | 2016-12-30 | 2017-04-26 | 西北工业大学 | Multi-source social data fusion multi-angle travel information perception and intelligent recommendation method |
CN107025606A (en) * | 2017-03-29 | 2017-08-08 | 西安电子科技大学 | The item recommendation method of score data and trusting relationship is combined in a kind of social networks |
CN108595527A (en) * | 2018-03-28 | 2018-09-28 | 中山大学 | A kind of personalized recommendation method and system of the multi-source heterogeneous information of fusion |
Non-Patent Citations (3)
Title |
---|
JI Z ET AL.: "Recommendation Based on Review Texts and Social Communities: A Hybrid Model", IEEE Access *
JI Zhenyan et al.: "Hybrid Recommendation Model Fusing Multi-Source Heterogeneous Data", Journal of Beijing University of Posts and Telecommunications *
JI Zhenyan et al.: "Personalized Image Retrieval and Recommendation", Journal of Beijing University of Posts and Telecommunications *
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111045716A (en) * | 2019-11-04 | 2020-04-21 | 中山大学 | Related patch recommendation method based on heterogeneous data |
CN111045716B (en) * | 2019-11-04 | 2022-02-22 | 中山大学 | Related patch recommendation method based on heterogeneous data |
CN111046672A (en) * | 2019-12-11 | 2020-04-21 | 山东众阳健康科技集团有限公司 | Multi-scene text abstract generation method |
WO2021159776A1 (en) * | 2020-02-13 | 2021-08-19 | 腾讯科技(深圳)有限公司 | Artificial intelligence-based recommendation method and apparatus, electronic device, and storage medium |
CN111274406A (en) * | 2020-03-02 | 2020-06-12 | 湘潭大学 | Text classification method based on deep learning hybrid model |
CN111612573A (en) * | 2020-04-30 | 2020-09-01 | 杭州电子科技大学 | Recommendation system scoring recommendation prediction method based on full Bayesian method |
CN111612573B (en) * | 2020-04-30 | 2023-04-25 | 杭州电子科技大学 | Recommendation system scoring recommendation prediction method based on full Bayesian method |
CN112232929A (en) * | 2020-11-05 | 2021-01-15 | 南京工业大学 | Multi-modal diversity recommendation list generation method for complementary articles |
CN112364258A (en) * | 2020-11-23 | 2021-02-12 | 北京明略软件***有限公司 | Map-based recommendation method, system, storage medium and electronic device |
CN112364258B (en) * | 2020-11-23 | 2024-02-27 | 北京明略软件***有限公司 | Recommendation method and system based on map, storage medium and electronic equipment |
CN113064965A (en) * | 2021-03-23 | 2021-07-02 | 南京航空航天大学 | Intelligent recommendation method for similar cases of civil aviation unplanned events based on deep learning |
CN112967101A (en) * | 2021-04-07 | 2021-06-15 | 重庆大学 | Collaborative filtering article recommendation method based on multi-interaction information of social users |
Also Published As
Publication number | Publication date |
---|---|
CN110263257B (en) | 2021-08-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110263257A (en) | Multi-source heterogeneous data mixing recommended models based on deep learning | |
CN108763362B (en) | Local model weighted fusion Top-N movie recommendation method based on random anchor point pair selection | |
CN102609523B (en) | The collaborative filtering recommending method classified based on taxonomy of goods and user | |
CN110674407B (en) | Hybrid recommendation method based on graph convolution neural network | |
CN110458627B (en) | Commodity sequence personalized recommendation method for dynamic preference of user | |
CN103778214B (en) | A kind of item property clustering method based on user comment | |
CN108363804A (en) | Local model weighted fusion Top-N movie recommendation method based on user clustering | |
CN110110181A (en) | A kind of garment coordination recommended method based on user styles and scene preference | |
CN109145112A (en) | A kind of comment on commodity classification method based on global information attention mechanism | |
CN103617289B (en) | Micro-blog recommendation method based on user characteristics and cyberrelationship | |
CN107220365A (en) | Accurate commending system and method based on collaborative filtering and correlation rule parallel processing | |
CN102968506A (en) | Personalized collaborative filtering recommendation method based on extension characteristic vectors | |
CN103426102A (en) | Commodity feature recommending method based on body classification | |
CN105913296A (en) | Customized recommendation method based on graphs | |
CN107330727A (en) | A kind of personalized recommendation method based on hidden semantic model | |
CN109146626A (en) | A kind of fashion clothing collocation recommended method based on user's dynamic interest analysis | |
CN109902229B (en) | Comment-based interpretable recommendation method | |
CN109584006B (en) | Cross-platform commodity matching method based on deep matching model | |
CN106951471A (en) | A kind of construction method of the label prediction of the development trend model based on SVM | |
CN105138508A (en) | Preference diffusion based context recommendation system | |
CN106157156A (en) | A kind of cooperation recommending system based on communities of users | |
CN109670909A (en) | A kind of travelling products recommended method decomposed based on probability matrix with Fusion Features | |
CN109933721A (en) | A kind of interpretable recommended method merging user concealed article preference and implicit trust | |
CN110415063A (en) | Method of Commodity Recommendation, device, electronic equipment and readable medium | |
CN109903138A (en) | A kind of individual commodity recommendation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract | ||
Application publication date: 20190920 Assignee: Institute of Software, Chinese Academy of Sciences Assignor: Beijing Jiaotong University Contract record no.: X2022990000602 Denomination of invention: Recommendation method for processing multi-source heterogeneous data based on deep learning Granted publication date: 20210817 License type: Common License Record date: 20220905 |