CN110263257A - Multi-source heterogeneous data hybrid recommendation model based on deep learning - Google Patents

Multi-source heterogeneous data hybrid recommendation model based on deep learning

Info

Publication number
CN110263257A
CN110263257A (application CN201910547320.1A)
Authority
CN
China
Prior art keywords
user
article
comment
model
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910547320.1A
Other languages
Chinese (zh)
Other versions
CN110263257B
Inventor
冀振燕
宋晓军
赵颖斯
皮怀雨
李俊东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Original Assignee
Beijing Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University filed Critical Beijing Jiaotong University
Priority to CN201910547320.1A priority Critical patent/CN110263257B/en
Publication of CN110263257A publication Critical patent/CN110263257A/en
Application granted granted Critical
Publication of CN110263257B publication Critical patent/CN110263257B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/06 Buying, selling or leasing transactions
    • G06Q 30/0601 Electronic shopping [e-shopping]
    • G06Q 30/0631 Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Finance (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

In recent years deep learning has been widely applied to image and audio recognition, text classification, representation learning, and related fields, and recommender systems based on deep learning have likewise become a research hotspot. Deep learning models achieve excellent results in representation learning for specific data such as images and text: they avoid complex feature engineering, yield nonlinear, multi-level abstract feature representations of heterogeneous data, and overcome the heterogeneity of multiple data sources. To date, however, no deep learning recommendation model that fuses ratings, reviews, and social networks has been proposed. Based on deep learning algorithms, this patent presents a highly extensible recommendation pipeline, analyzes which algorithms and principles suit each kind of data, and proposes a final loss function that combines reviews, ratings, and social information from the loss functions of the individual data sources, improving the accuracy of the recommendation results.

Description

Multi-source heterogeneous data hybrid recommendation model based on deep learning
Technical field
In recent years deep learning has been widely applied to image and audio recognition, text classification, representation learning, and related fields, and recommender systems based on deep learning have become a research hotspot. Deep learning models achieve excellent results in representation learning for specific data such as images and text: they avoid complex feature engineering, yield nonlinear, multi-level abstract feature representations of heterogeneous data, and overcome the heterogeneity of multiple data sources. To date, no deep learning recommendation model that fuses ratings, reviews, and social networks has been proposed. Based on deep learning algorithms, this patent presents a highly extensible recommendation model.
Background art
Current deep learning models still cannot make recommendations that combine rating, review, and social network information, because the feature representation of multi-source heterogeneous data remains difficult, and social information cannot be fused directly with user-item interaction information. If deep learning methods can be used to learn representations of the different heterogeneous data and unify them in a single deep learning model, this will resolve the earlier drawback of having to select different algorithms for fusion, exploit the advantages of deep-learning-based representation learning, and significantly improve the accuracy of recommendation results. To make full use of all three kinds of data, this patent fuses the features of ratings and reviews, adds social information to the training process, and proposes a multi-source heterogeneous data recommendation model based on deep learning.
For review data, traditional topic models cannot represent text features accurately. This patent learns feature representations of review documents with the PV-DBOW model, which assumes independence between the words of a document and uses the document vector to predict each observed word: each document is represented by a dense vector trained to predict the words in the document. For rating data, traditional matrix factorization faces data sparsity and low accuracy, so this patent trains on ratings with a neural network, which better captures the features of users and items. For social network data, this patent adds the user's social relationship information to BPR-based pairwise learning, making its sampling more reasonable and improving the accuracy of recommendation results.
Summary of the invention
Based on deep learning, a recommendation model capable of handling multi-source heterogeneous data is proposed; the model has the advantages of high accuracy and strong extensibility. The model adopts a deep-learning-based paragraph representation learning method for text, designs a neural network that learns user and item features from ratings, and constrains the pairwise learning with the social network. Because deep-learning-based text representation learning is already mature, an existing network can be used directly and its output trained together with the other features, yielding a more accurate fused representation. Rating data differ from text data: the features of users and items can be learned from them directly, so there is no need to learn a vector representation of the rating itself, nor to learn item content features.
The multi-source heterogeneous data recommendation model based on deep learning uses three kinds of data: reviews, ratings, and the social network. Each kind of data has its own characteristics and reflects the features of users or items from a different perspective. The model learns vector representations of the various data with deep models and then obtains the fused features of users and items by concatenation. Review features reflect a user's attitude toward an item and can also describe the item's attributes; the model learns feature representations of review paragraphs with the PV-DBOW algorithm and obtains user or item vectors by weighted superposition. Rating features are the user's overall evaluation of an item and reflect the nonlinear characteristics of users and items; the user's degree of satisfaction with items can be learned with BPR. The social network reflects friend relations between users and indirectly affects user-item interactions; social relations strengthen the constraints on user purchasing behavior and further improve recommendation accuracy.
The above method contains the following steps:
(1) Text feature extraction: learn feature vector representations of text paragraphs with the PV-DBOW model. The model uses the Distributed Bag-of-Words architecture, which uses a paragraph vector to predict words obtained from the paragraph by random sampling.
(2) Rating feature extraction: learn users' ratings of items with a two-layer fully connected neural network. Unlike the text feature learning model, this method directly obtains the feature vector representations of users and items rather than extracting features of the ratings themselves.
(3) User-item feature fusion: from the review text features obtained in (1), the weighted sum of the feature vectors of a user's reviews gives the user features, and the weighted sum of the feature vectors of the reviews an item receives gives the item features. Finally, a fusion function combines each user's text and rating features into the user's fused feature, and each item's text and rating features into the item's fused feature.
(4) BPR-based optimization: sample triples carrying user preference based on the social network, and optimize according to Bayesian theory to obtain the optimal model parameters.
(5) Recommendation: with the model parameters obtained in step (4), input the feature vectors of users and items into the model to recommend items to users.
Step (1), text feature extraction, consists of the following four sub-steps:
1. Text preprocessing
Each paragraph is represented by a one-hot vector indexing a column of the paragraph matrix. The words of the review text are deduplicated and added to a dictionary, and each word is represented by a unique one-hot vector. Once built, each column of the review matrix corresponds uniquely to one review.
2. Word sampling
The model uses a paragraph vector to predict words in the paragraph, where the words are obtained by random sampling from the paragraph. Each word is treated as independent within the paragraph, and word order does not affect the learned paragraph vector.
3. Optimization
Using the paragraph vector from sub-step 1 as input and the words sampled in sub-step 2 as output, the paragraph vector model is trained by repeated iteration. The model is built on a neural network with a softmax classifier, and its parameters are obtained by stochastic gradient descent.
4. Feature representation of review text
After training, each column of the paragraph matrix is the feature vector of one review. Multiplying the one-hot vector of a review defined in sub-step 1 by the matrix yields the feature representation of that paragraph.
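The four sub-steps above can be sketched in a few lines of numpy. This is a minimal toy illustration of the PV-DBOW idea, not the patent's trained model: the example documents, vocabulary, dimension K = 8, learning rate, and iteration count are all assumptions for demonstration, and a full softmax is used in place of negative sampling.

```python
# Toy PV-DBOW: each paragraph (review) gets a dense vector trained to
# predict words randomly sampled from that paragraph via a softmax.
import numpy as np

rng = np.random.default_rng(0)

docs = [["good", "battery", "good", "screen"],   # hypothetical reviews
        ["bad", "battery", "slow"],
        ["good", "screen", "bright"]]
vocab = sorted({w for d in docs for w in d})     # deduplicated dictionary
w2i = {w: i for i, w in enumerate(vocab)}

K = 8                                            # embedding dimension (assumed)
D = rng.normal(0, 0.1, (len(docs), K))           # paragraph matrix, one row per review
W = rng.normal(0, 0.1, (len(vocab), K))          # output word embeddings

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

lr = 0.1
for _ in range(200):                             # SGD over sampled (doc, word) pairs
    di = rng.integers(len(docs))
    wi = w2i[docs[di][rng.integers(len(docs[di]))]]  # word sampled from the paragraph
    p = softmax(W @ D[di])                       # P(word | paragraph vector)
    grad = p.copy()
    grad[wi] -= 1.0                              # cross-entropy gradient
    D[di] -= lr * (W.T @ grad)                   # update paragraph vector
    W -= lr * np.outer(grad, D[di])              # update word embeddings

doc_vecs = D                                     # feature vector of each review
```

After training, `doc_vecs[i]` plays the role of the column of the paragraph matrix for review i.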
Step (2), rating feature extraction, consists of the following two sub-steps:
1. Neural network construction
The rating feature extraction model is based on a two-layer fully connected neural network with the ELU activation function. The network input is the element-wise product of the user feature and the item feature; the network output is the user's rating of the item.
2. User-item feature optimization
According to the objective function, the rating feature vectors of users and items are optimized with stochastic gradient descent to reduce the loss. When the predicted ratings are close enough to the actual ratings, training stops and the rating features of users and items are obtained.
Step (3) fuses the features of step (1) and step (2) into a new fused feature.
The rating feature represents the user's overall evaluation of an item, simple and clear, while reviews contain the user's differing opinions in more detail. Step (3) merges the review and rating features into a richer, more comprehensive user representation. Fusion is performed by concatenating the review feature vector with the rating feature vector to obtain the fused feature vector.
Step (4), BPR-based optimization, mainly comprises the following steps:
1. Triple generation
Because a user's preferences are often similar to those of their friends (in practice, users more readily choose items their friends have bought or prefer), this preference similarity between users is applied to the sampling of the BPR model. Constraining the sampling process more reasonably yields triples that better match user behavior, improving subsequent model training and recommendation accuracy.
2. Model optimization
A unified objective function is proposed for model optimization. The fusion function for multi-source heterogeneous data was given above; an objective function must now be constructed on the fused features so that, during learning, the fused features represent users and items more accurately. Stochastic gradient descent can be used to solve it; existing deep learning frameworks all integrate stochastic gradient descent, so calling the relevant library yields the final user and item feature vectors.
Step (5) recommends items of interest to the user.
Multiplying a user's feature vector with the feature vector of each item the user has not bought or browsed yields the user's preference score for that item; a higher score means the user is more likely to buy or browse the item. Sorting all items by score in descending order and taking the first N gives the user's Top-N recommendation list.
Detailed description of the invention
Fig. 1 is a flow diagram of the hybrid recommendation model based on multi-source heterogeneous data.
Specific embodiment
Following the method described in this specification, implementing the recommendation model based on multi-source heterogeneous data requires the following steps:
(1) Text character extraction
1. Text Pretreatment
Use duvUser u is indicated to the comment text of article v, the word that comment text includes is indicated using w, pass through The feature vector of user and article that user learns the comment of article use u1And v1It indicates, the feature vector of paragraph makes Use duvIt indicates, term vector is indicated using w, the word of all comments is stored in dictionary V.The dimension of these feature vectors Number is all K.
2. Word sampling
For each paragraph, a text window is chosen at random and some words are randomly sampled from that window as the targets for the classifier. The window size and the number of words sampled from it are set manually.
3. Optimization
Each review is first mapped into a random high-dimensional semantic space; the words contained in the paragraph are then predicted, and learning refines the paragraph feature vector into a more accurate representation. Under the bag-of-words assumption, the probability that word w appears in document d_uv is computed with softmax:
P(w | d_uv) = exp(w · d_uv) / Σ_{w'∈V} exp(w' · d_uv)
where w' ranges over all words of the dictionary V and exp is the exponential function with base e. This formula gives the probability of any word appearing in the document. When maximizing the probability of the observed words, the gradient is expensive to compute. To reduce this cost, negative sampling is commonly used: a subset of non-occurring words is sampled according to a predefined noise distribution and used as negative samples for approximate computation, instead of using every word in the dictionary. With the negative sampling strategy, the objective function of PV-DBOW is defined as:
L = Σ_{d_uv} Σ_{w∈V} n(w, d_uv) [ ln σ(w · d_uv) + t · E_{w_N ∼ P_V} ln σ(−w_N · d_uv) ]
which sums over all word-document combinations, where n(w, d_uv) is the number of times word w appears in document d_uv (zero if it does not appear), σ is the sigmoid function, t is the number of negative samples, and the expectation E_{w_N ∼ P_V} is taken under the noise distribution P_V.
4. Feature representation of review text
From the above objective function the feature representation d_uv of each document is obtained. As in the recommendation models based on traditional machine learning discussed earlier, the feature vectors of users and items can be expressed in terms of the feature vectors of their reviews; here, however, the user and item representations are no longer computed as plain averages of review feature vectors but are learned through the subsequent integrated model optimization.
The user feature factor is obtained by the weighted, normalized sum of the feature vectors of all of the user's reviews:
p'_uk = Σ_{v∈D_u} W_uv · d_uv,k,  p_uk = p'_uk / Σ_{k'} p'_uk'
where D_u is the set of all reviews of user u, p'_uk is the user's total probability on topic k, W_uv is the weight of the review issued by user u on item v, and p_uk is its normalized form. The feature factor of user u is:
p_u = (p_u1, ..., p_uK)
The user feature factor has dimension K. The item feature factor is computed with the analogous formula:
q'_vk = Σ_{u∈D_v} W_uv · d_uv,k,  q_vk = q'_vk / Σ_{k'} q'_vk'
where D_v is the set of all reviews the item receives, q'_vk is the item's total probability on topic k, q_vk is its normalized form, and W_uv is the weight of the review received by item v from user u. The feature factor of the item is:
q_v = (q_v1, ..., q_vK)
whose dimension K matches that of the user factor.
Here W_uv is the weight of review d_uv for user u and item v; only through these weights can different reviews be given different degrees of importance, so that reasonable user and item features are constructed.
(2) Rating feature extraction
1. Neural network construction
A two-layer fully connected neural network is trained to produce the final user-to-item rating; the feature vector representations of users and items are obtained directly. Let r_ui denote the rating of user u on item i; then for any rating r_ui there is a corresponding user r_u and item r_i. The two-layer neural network prediction formula is:
r_ui = φ(U_2 · φ(U_1(r_u ⊙ r_i) + c_1) + c_2)
where ⊙ denotes element-wise multiplication, φ(x) is the ELU activation function, and U_1, U_2, c_1, and c_2 are the weight and bias parameters to be learned.
2. User-item feature optimization
The objective function is the squared difference between the predicted rating and the true rating; optimizing the parameters to minimize it yields the optimal user and item representations.
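The prediction formula and its squared-error training target can be sketched as follows. The shapes (K = 8, hidden size 16), the random toy features, and the example true rating are assumptions for illustration, not values from the patent.

```python
# Two-layer fully connected rating network:
# r_ui = elu(U2 @ elu(U1 @ (r_u * r_i) + c1) + c2)
import numpy as np

rng = np.random.default_rng(1)
K, H = 8, 16                         # feature and hidden dimensions (assumed)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

U1 = rng.normal(0, 0.1, (H, K)); c1 = np.zeros(H)
U2 = rng.normal(0, 0.1, (1, H)); c2 = np.zeros(1)

def predict(r_u, r_i):
    """Predicted rating from the element-wise product of user/item features."""
    h = elu(U1 @ (r_u * r_i) + c1)
    return elu(U2 @ h + c2)[0]

r_u = rng.normal(size=K)             # toy user rating feature
r_i = rng.normal(size=K)             # toy item rating feature
score = predict(r_u, r_i)

r_true = 4.0                         # hypothetical observed rating
loss = (score - r_true) ** 2         # squared-error objective to minimize
```

In training, `loss` would be summed over all observed (u, i) pairs and minimized by stochastic gradient descent over r_u, r_i, U1, U2, c1, and c2.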
(3) User-item feature fusion
The feature vectors of users and items are constructed from the interaction information between users and items. A fusion function f(·) is proposed: suppose the feature representations learned from the rating and text data are x_1 and x_2; then the fused feature is obtained through the fusion function:
x = f(x_1, x_2)
where x is the fused feature. Fusion by simple concatenation (series connection) enhances the extensibility of the user and item features, which matters greatly for a model based on multi-source heterogeneous data. The feature obtained through f(·) is therefore f(x_1, x_2) = [x_1; x_2].
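Since f(·) here is plain concatenation, it reduces to a one-line function; the 8-dimensional toy vectors below are assumptions for illustration.

```python
# Fusion by concatenation: the fused feature simply stacks the
# review-based and rating-based feature vectors end to end.
import numpy as np

def fuse(x1, x2):
    """Concatenate the review feature and the rating feature."""
    return np.concatenate([x1, x2])

x1 = np.ones(8)     # e.g. review (PV-DBOW) feature, toy values
x2 = np.zeros(8)    # e.g. rating feature, toy values
x = fuse(x1, x2)    # fused feature of dimension 2K
```

Because concatenation imposes no interaction between the parts, a further data source can be appended the same way, which is the extensibility the text refers to.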
(4) BPR-based optimization
1. Triple generation
According to the user's purchase or browsing records and the social network, for each user u define the items the user has bought or browsed as i, the items the user has never touched as j, and the items the user's friends have bought as p. Let D be the set of all items in the system, D_u the set of items user u has bought or browsed, and D_p the set of items the user's friends have bought. The items that best represent the user's preference are, first, the items D_u the user has bought; second, by the similarity of friends' preferences, the user is likely to buy items their friends have bought but the user has not, D_p \ D_u; finally, the items the user is least likely to buy are D \ (D_u ∪ D_p). User-item triples constructed from the social network information serve as the training set, which can be represented as:
T := {(u, i, j) | i ∈ (D_u ∪ D_p), j ∈ D \ (D_u ∪ D_p)}
where (u, i, j) is a user-item triple expressing that user u prefers item i over item j: item i is an item bought by the user or by the user's direct friends, and item j is an item bought by neither the user nor the user's direct friends. User-item triples are thus built from the user's direct friend relations for the subsequent training of the BPR model.
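The socially constrained sampling of T can be sketched directly from the set definitions; the item universe, D_u, and D_p below are toy data assumed for illustration.

```python
# Socially constrained BPR triple sampling: positives i come from the
# user's own items Du plus the friends' items Dp; negatives j from the rest.
import random

random.seed(0)
all_items = set(range(10))            # D, toy item universe
Du = {0, 1}                           # items user u bought or browsed
Dp = {2, 3}                           # items u's direct friends bought
positives = Du | Dp                   # Du ∪ Dp
negatives = all_items - positives     # D \ (Du ∪ Dp)

def sample_triple(u):
    i = random.choice(sorted(positives))
    j = random.choice(sorted(negatives))
    return (u, i, j)                  # user u prefers item i over item j

triples = [sample_triple("u") for _ in range(5)]
```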
2. Model optimization
From the definition above, user u's preference for item i is known to exceed that for item j. A function g(·) expresses the loss combining the user and item representations; here g(·) is defined through the sigmoid to measure the user's difference in preference between items, so that g(u, i, j) = σ(u^T i − u^T j). The objective function of the full recommendation model fusing multi-source heterogeneous data is then defined as:
L(Θ) = Σ_{(u,i,j)∈T} ln g(u, i, j) + L_PV-DBOW − L_rating − Σ_m λ_m ‖Θ_m‖²
Here W is the weight parameter of each model: in the review model it is the per-review weight of the user, differs between reviews, and must be obtained by learning; in the rating model, what is learned is directly the features of users and items, so the weight parameter is fixed to 1 and need not be updated through the objective function. Θ denotes the remaining parameters to be learned, Θ = {Θ_1, Θ_2} = {{w, d_uv}, {U_1, U_2, c_1, c_2, r_u, r_i}}, and λ is the penalty parameter of each model, with all values in the interval [0, 1]. A negative sign precedes the objective of the rating model because that objective must be minimized while the overall objective is maximized.
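The pairwise term of this objective can be maximized by stochastic gradient ascent on ln σ(u·i − u·j). The sketch below shows the standard BPR update for one triple's factors, with toy dimensions, learning rate, and regularization assumed, and omitting the review and rating terms of the full joint objective.

```python
# Gradient-ascent update for the BPR pairwise term ln σ(u·i − u·j),
# with a small L2 penalty on the factors.
import numpy as np

rng = np.random.default_rng(2)
K = 8
u = rng.normal(0, 0.1, K)            # user factor
item_i = rng.normal(0, 0.1, K)       # preferred item factor
item_j = rng.normal(0, 0.1, K)       # non-preferred item factor

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr, lam = 0.1, 0.01
before = u @ item_i - u @ item_j     # preference margin before training
for _ in range(100):
    x_uij = u @ item_i - u @ item_j
    g = 1.0 - sigmoid(x_uij)         # d ln σ(x) / dx
    u += lr * (g * (item_i - item_j) - lam * u)
    item_i += lr * (g * u - lam * item_i)
    item_j += lr * (-g * u - lam * item_j)
after = u @ item_i - u @ item_j      # margin grows as the objective rises
```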
(5) Recommendation
The personalized recommendation list is obtained by multiplying the user's feature vector with the item feature vectors:
s = u^T v
Multiplying a user's feature vector with the feature vector of each item the user has not bought or browsed yields the user's preference score for that item; a higher score means the user is more likely to buy or browse the item. Sorting all items by score in descending order and taking the first N gives the user's Top-N recommendation list.
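The final scoring and ranking step is a single matrix-vector product followed by a sort; the item count, feature values, seen-item set, and N = 5 below are toy assumptions.

```python
# Top-N recommendation: score every unseen item as s = u·v and keep the N best.
import numpy as np

rng = np.random.default_rng(3)
K, n_items = 8, 20
u = rng.normal(size=K)               # fused user feature (toy values)
V = rng.normal(size=(n_items, K))    # fused item feature matrix (toy values)
seen = {0, 1, 2}                     # items the user already bought or browsed

scores = V @ u                       # preference score for every item
candidates = [i for i in range(n_items) if i not in seen]
top_n = sorted(candidates, key=lambda i: scores[i], reverse=True)[:5]
```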

Claims (11)

1. A recommendation model based on deep learning capable of handling multi-source heterogeneous data, the model having the advantages of high accuracy and strong extensibility, the method comprising the following steps:
(1) text feature extraction: learning feature vector representations of text paragraphs with the PV-DBOW model, the model using the Distributed Bag-of-Words architecture, which uses a paragraph vector to predict words randomly sampled from the paragraph;
(2) rating feature extraction: learning users' ratings of items with a two-layer fully connected neural network, which, unlike the text feature learning model, directly obtains the feature vector representations of users and items rather than extracting features of the ratings themselves;
(3) user-item feature fusion: from the review text features obtained in (1), obtaining user features as the weighted sum of the feature vectors of each user's reviews and item features as the weighted sum of the feature vectors of the reviews each item receives, then combining each user's text and rating features into the user's fused feature with a fusion function, and each item's text and rating features into the item's fused feature;
(4) BPR-based optimization: sampling triples carrying user preference based on the social network and optimizing according to Bayesian theory to obtain the optimal model parameters;
(5) recommendation: with the model parameters obtained in step (4), inputting the feature vectors of users and items into the model to recommend items to users.
2. The text feature extraction step (1) of claim 1, wherein text preprocessing lets d_uv denote the comment text of user u on item v, w denote a word contained in the comment text, u_1 and v_1 denote the feature vectors of the user and the item learned from the user's comments on items, d_uv denote the feature vector of the paragraph, and w denote the word vector, the words of all comments being stored in a dictionary V, and all of these feature vectors having dimension K.
3. The text feature extraction step (1) of claim 1, wherein word sampling chooses, for each paragraph, a random text window and randomly samples some words from that window as the targets for the classifier, the window size and the number of words sampled from it being set manually.
4. The text feature extraction step (1) of claim 1, wherein in optimization each review is mapped into a random high-dimensional semantic space, the words contained in the paragraph are predicted, and learning refines the paragraph feature vector into a more accurate representation; under the bag-of-words assumption, the probability that word w appears in document d_uv is computed with softmax:
P(w | d_uv) = exp(w · d_uv) / Σ_{w'∈V} exp(w' · d_uv)
where w' ranges over all words of the dictionary V and exp is the exponential function with base e; this formula gives the probability of any word appearing in the document; when maximizing the probability of the observed words, the gradient is expensive to compute, so negative sampling is commonly used: a subset of non-occurring words is sampled according to a predefined noise distribution and used as negative samples for approximate computation instead of using every word in the dictionary; with the negative sampling strategy the objective function of PV-DBOW is defined as:
L = Σ_{d_uv} Σ_{w∈V} n(w, d_uv) [ ln σ(w · d_uv) + t · E_{w_N ∼ P_V} ln σ(−w_N · d_uv) ]
which sums over all word-document combinations, where n(w, d_uv) is the number of times word w appears in document d_uv (zero if it does not appear), σ is the sigmoid function, t is the number of negative samples, and the expectation is taken under the noise distribution P_V.
5. The text feature extraction step (1) of claim 1, wherein the feature representation of review text has the following characteristics: from the above objective function the feature representation d_uv of each document is obtained and, as in recommendation models based on traditional machine learning, the feature vectors of users and items can be expressed in terms of the feature vectors of their reviews, except that the user and item representations are no longer computed as plain averages of review feature vectors but are learned through the subsequent integrated model optimization;
the user feature factor is obtained by the weighted, normalized sum of the feature vectors of all of the user's reviews:
p'_uk = Σ_{v∈D_u} W_uv · d_uv,k,  p_uk = p'_uk / Σ_{k'} p'_uk'
where D_u is the set of all reviews of user u, p'_uk is the user's total probability on topic k, W_uv is the weight of the review issued by user u on item v, and p_uk is its normalized form; the feature factor of user u is:
p_u = (p_u1, ..., p_uK)
the user feature factor having dimension K; the item feature factor is computed with the analogous formula:
q'_vk = Σ_{u∈D_v} W_uv · d_uv,k,  q_vk = q'_vk / Σ_{k'} q'_vk'
where D_v is the set of all reviews the item receives, q'_vk is the item's total probability on topic k, q_vk is its normalized form, and W_uv is the weight of the review received by item v from user u; the feature factor of the item is:
q_v = (q_v1, ..., q_vK)
whose dimension K matches that of the user factor; W_uv is the weight of review d_uv for user u and item v, and only through these weights can different reviews be given different degrees of importance, so that reasonable user and item features are constructed.
6. The rating feature extraction step (2) according to claim 1, wherein the neural network is a two-layer fully connected network trained to produce the final rating of an article by a user, from which the feature vector representations of users and articles are obtained directly. Let r_ui denote the rating of article i by user u; then for any rating r_ui there are a user vector r_u and a corresponding article vector r_i. The two-layer neural network prediction formula is:
r_ui = φ(U_2 · φ(U_1(r_u ⊙ r_i) + c_1) + c_2)
where ⊙ denotes element-wise multiplication, φ(x) is the ELU activation function, and U_1, U_2, c_1, and c_2 are the weight and bias parameters to be learned.
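A minimal NumPy sketch of this two-layer prediction; the dimensions and random initialization are illustrative assumptions, not values from the patent:

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU activation: x for x > 0, alpha*(exp(x)-1) otherwise."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def predict_rating(r_u, r_i, U1, U2, c1, c2):
    """r_ui = phi(U2 . phi(U1 (r_u ⊙ r_i) + c1) + c2)"""
    h = elu(U1 @ (r_u * r_i) + c1)       # element-wise product, first layer
    return float(elu(U2 @ h + c2)[0])    # second layer gives the scalar rating

rng = np.random.default_rng(0)
K, H = 4, 8                              # feature and hidden sizes (illustrative)
r_u, r_i = rng.random(K), rng.random(K)
U1, c1 = rng.standard_normal((H, K)), np.zeros(H)
U2, c2 = rng.standard_normal((1, H)), np.zeros(1)
r_ui = predict_rating(r_u, r_i, U1, U2, c1, c2)
```

In training, U_1, U_2, c_1, c_2 and the vectors r_u, r_i would all be updated by gradient descent on the rating objective.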
7. The rating feature extraction step (2) according to claim 1, wherein the user-article feature optimization objective is the squared difference between the predicted rating and the true rating; minimizing this objective over the parameters yields the optimal user and article representations.
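A sketch of this squared-error objective (illustrative; the predictions would come from the two-layer network of claim 6):

```python
import numpy as np

def rating_loss(predictions, true_ratings):
    """Sum of squared differences between predicted and true ratings;
    minimizing this over the model parameters yields the optimal
    user and article representations."""
    predictions = np.asarray(predictions, dtype=float)
    true_ratings = np.asarray(true_ratings, dtype=float)
    return float(np.sum((predictions - true_ratings) ** 2))

loss = rating_loss([4.2, 3.1], [4.0, 3.0])   # (0.2)^2 + (0.1)^2
```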
8. The user feature fusion step (3) according to claim 1, wherein the feature vectors of users and articles are constructed from the interaction information between users and articles. A fusion function f(·) is introduced: assuming the feature representations learned from the rating data and the text data are x_1 and x_2 respectively, the fused feature is obtained through the fusion function:
x = f(x_1, x_2)
where x is the fused feature. Fusion by simple concatenation is used, which enhances the scalability of the user and article features; this is of great importance for a model based on multi-source heterogeneous data. The features obtained through f(·) are the fused user and article representations.
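The concatenation fusion f(·) described above is straightforward; a sketch with hypothetical input vectors:

```python
import numpy as np

def fuse(x1, x2):
    """f(x1, x2): concatenate the text-based and rating-based features."""
    return np.concatenate([x1, x2])

x1 = np.array([0.57, 0.43])        # e.g. topic-based feature factor
x2 = np.array([0.1, 0.9, 0.3])     # e.g. rating-based feature vector
x = fuse(x1, x2)                   # fused 5-dimensional representation
```

Because the fused vector is just the two parts side by side, further feature sources can be appended the same way, which is the scalability property the claim refers to.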
9. The BPR-based optimization step (4) according to claim 1, wherein the triples are generated from the user's purchase or browsing records and the social network. For each user u, an article the user has bought or browsed is denoted i, an article the user has never interacted with is denoted j, and an article bought by the user's friends is denoted p. The set of all articles in the system is denoted D, the set of articles user u has bought or browsed is denoted D_u, and the set of articles bought by the user's friends is denoted D_p. First, the articles that best represent the user's preference are the articles D_u the user has bought; second, by the similarity of friends' preferences, the user is likely to buy the articles D_p \ D_u that friends bought but the user has not; finally, the articles the user is least likely to buy are D \ (D_u ∪ D_p). The user-article triples constructed from the social network information form the training set T, which can be expressed as:
T := {(u, i, j) | i ∈ (D_u ∪ D_p), j ∈ D \ (D_u ∪ D_p)}
where (u, i, j) is a user-article triple expressing that user u's preference for article i is greater than that for article j; article i belongs to the articles bought by the user or by the user's direct friends, and article j belongs to the articles that neither the user nor the user's direct friends have bought. User-article triples are thus constructed from the user's direct friend relations for the subsequent training of the BPR model.
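A sketch of this triple construction using Python sets; the article identifiers and function name are illustrative:

```python
def build_triples(user, D, D_u, D_p):
    """Build BPR training triples (u, i, j) for one user.

    D   : set of all articles in the system
    D_u : articles the user bought or browsed
    D_p : articles the user's direct friends bought
    """
    positives = D_u | D_p                 # i: bought by the user or friends
    negatives = D - positives             # j: untouched by user and friends
    return [(user, i, j) for i in positives for j in negatives]

D   = {"a", "b", "c", "d", "e"}
D_u = {"a"}
D_p = {"a", "b"}
triples = build_triples("u1", D, D_u, D_p)   # 2 positives x 3 negatives
```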
10. The BPR-based optimization step (4) according to claim 1, wherein, for model optimization, it is known from the preceding definitions that user u's preference for article i is greater than that for article j. A function g(·) combining the user and article feature representations is used to compute the user's difference in preference between articles; here g(·) is defined as the sigmoid function, so that g(u, i, j) = σ(uᵀi − uᵀj). The objective function of the whole recommendation model fusing multi-source heterogeneous data is then defined accordingly, where W is the weight parameter of each sub-model: in the comment model, the weight of each of a user's comments differs and must be obtained by learning, whereas in the rating model the user and article features are learned directly, i.e. the weight parameter is fixed to 1 and need not be updated through the objective function. Θ represents the other parameters to be learned in the model, Θ = {Θ_1, Θ_2} = {{w, d_uv}, {U_1, U_2, c_1, c_2, r_u, r_i}}; λ is the penalty parameter of each sub-model, with values in the interval [0, 1]. A negative sign precedes the objective of the rating model because that objective is minimized, whereas the overall objective is maximized.
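The pairwise term g(u, i, j) = σ(uᵀi − uᵀj) can be sketched as follows; this is a generic BPR-style preference score, not the full patented objective:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pairwise_preference(u, i, j):
    """g(u, i, j) = sigma(u.T i - u.T j): probability-like score that
    user u prefers article i over article j."""
    return sigmoid(u @ i - u @ j)

u = np.array([1.0, 0.5])
i = np.array([0.9, 0.8])   # feature vector of a positive article
j = np.array([0.1, 0.2])   # feature vector of a negative article
g = pairwise_preference(u, i, j)   # > 0.5 since u.T i > u.T j
```

Maximizing the sum of ln g(u, i, j) over the training triples pushes preferred articles above non-preferred ones, which is the standard BPR criterion.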
11. The recommendation step (5) according to claim 1, wherein the personalized recommendation list is obtained by multiplying the feature vectors of the user and the article:
s = uᵀv
Multiplying a user's feature vector with that of each article the user has not bought or browsed yields the user's preference score for that article; a higher score means the user is more likely to buy or browse the article. Sorting all articles by score in descending order and taking the first N yields the user's Top-N recommendation list.
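A sketch of this Top-N step, scoring unseen articles by the dot product s = uᵀv; the container names are illustrative:

```python
import numpy as np

def top_n(u, article_vecs, seen, n):
    """Rank articles the user has not bought or browsed by s = u.T v."""
    scores = {a: float(u @ v) for a, v in article_vecs.items() if a not in seen}
    return sorted(scores, key=scores.get, reverse=True)[:n]

u = np.array([1.0, 0.0])
vecs = {"a": np.array([0.9, 0.1]),
        "b": np.array([0.2, 0.8]),
        "c": np.array([0.6, 0.4])}
rec = top_n(u, vecs, seen={"a"}, n=2)   # "a" is excluded; "c" outranks "b"
```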
CN201910547320.1A 2019-06-24 2019-06-24 Deep learning based recommendation method for processing multi-source heterogeneous data Active CN110263257B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910547320.1A CN110263257B (en) 2019-06-24 2019-06-24 Deep learning based recommendation method for processing multi-source heterogeneous data

Publications (2)

Publication Number Publication Date
CN110263257A true CN110263257A (en) 2019-09-20
CN110263257B (en) 2021-08-17

Family

ID=67920670

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910547320.1A Active CN110263257B (en) 2019-06-24 2019-06-24 Deep learning based recommendation method for processing multi-source heterogeneous data

Country Status (1)

Country Link
CN (1) CN110263257B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103399858A (en) * 2013-07-01 2013-11-20 吉林大学 Socialization collaborative filtering recommendation method based on trust
US20140081965A1 (en) * 2006-09-22 2014-03-20 John Nicholas Gross Content recommendations for Social Networks
CN103778260A (en) * 2014-03-03 2014-05-07 哈尔滨工业大学 Individualized microblog information recommending system and method
CN106022869A (en) * 2016-05-12 2016-10-12 北京邮电大学 Consumption object recommending method and consumption object recommending device
CN106600482A (en) * 2016-12-30 2017-04-26 西北工业大学 Multi-source social data fusion multi-angle travel information perception and intelligent recommendation method
CN107025606A (en) * 2017-03-29 2017-08-08 西安电子科技大学 The item recommendation method of score data and trusting relationship is combined in a kind of social networks
CN108595527A (en) * 2018-03-28 2018-09-28 中山大学 A kind of personalized recommendation method and system of the multi-source heterogeneous information of fusion

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JI Z. et al.: "Recommendation Based on Review Texts and Social Communities: A Hybrid Model", IEEE Access *
JI Zhenyan et al.: "A hybrid recommendation model fusing multi-source heterogeneous data", Journal of Beijing University of Posts and Telecommunications *
JI Zhenyan et al.: "Personalized image retrieval and recommendation", Journal of Beijing University of Posts and Telecommunications *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111045716A (en) * 2019-11-04 2020-04-21 中山大学 Related patch recommendation method based on heterogeneous data
CN111045716B (en) * 2019-11-04 2022-02-22 中山大学 Related patch recommendation method based on heterogeneous data
CN111046672A (en) * 2019-12-11 2020-04-21 山东众阳健康科技集团有限公司 Multi-scene text abstract generation method
WO2021159776A1 (en) * 2020-02-13 2021-08-19 腾讯科技(深圳)有限公司 Artificial intelligence-based recommendation method and apparatus, electronic device, and storage medium
CN111274406A (en) * 2020-03-02 2020-06-12 湘潭大学 Text classification method based on deep learning hybrid model
CN111612573A (en) * 2020-04-30 2020-09-01 杭州电子科技大学 Recommendation system scoring recommendation prediction method based on full Bayesian method
CN111612573B (en) * 2020-04-30 2023-04-25 杭州电子科技大学 Recommendation system scoring recommendation prediction method based on full Bayesian method
CN112232929A (en) * 2020-11-05 2021-01-15 南京工业大学 Multi-modal diversity recommendation list generation method for complementary articles
CN112364258A (en) * 2020-11-23 2021-02-12 北京明略软件***有限公司 Map-based recommendation method, system, storage medium and electronic device
CN112364258B (en) * 2020-11-23 2024-02-27 北京明略软件***有限公司 Recommendation method and system based on map, storage medium and electronic equipment
CN113064965A (en) * 2021-03-23 2021-07-02 南京航空航天大学 Intelligent recommendation method for similar cases of civil aviation unplanned events based on deep learning
CN112967101A (en) * 2021-04-07 2021-06-15 重庆大学 Collaborative filtering article recommendation method based on multi-interaction information of social users

Similar Documents

Publication Publication Date Title
CN110263257A Multi-source heterogeneous data hybrid recommendation model based on deep learning
CN108763362B Local model weighted fusion Top-N movie recommendation method based on random anchor point pair selection
CN102609523B Collaborative filtering recommendation method based on the classification of goods and users
CN110674407B Hybrid recommendation method based on graph convolution neural network
CN110458627B Commodity sequence personalized recommendation method for dynamic preference of user
CN103778214B Item attribute clustering method based on user comments
CN108363804A Local model weighted fusion Top-N movie recommendation method based on user clustering
CN110110181A Garment coordination recommendation method based on user styles and scene preferences
CN109145112A Commodity comment classification method based on a global information attention mechanism
CN103617289B Microblog recommendation method based on user characteristics and network relationships
CN107220365A Accurate recommendation system and method based on parallel processing of collaborative filtering and association rules
CN102968506A Personalized collaborative filtering recommendation method based on extended feature vectors
CN103426102A Commodity feature recommendation method based on ontology classification
CN105913296A Personalized recommendation method based on graphs
CN107330727A Personalized recommendation method based on a latent semantic model
CN109146626A Fashion clothing matching recommendation method based on analysis of users' dynamic interests
CN109902229B Comment-based interpretable recommendation method
CN109584006B Cross-platform commodity matching method based on a deep matching model
CN106951471A Construction method of an SVM-based label prediction model of development trends
CN105138508A Context recommendation system based on preference diffusion
CN106157156A Collaborative recommendation system based on user communities
CN109670909A Travel product recommendation method based on probability matrix factorization and feature fusion
CN109933721A Interpretable recommendation method fusing users' implicit article preferences and implicit trust
CN110415063A Commodity recommendation method, apparatus, electronic device, and readable medium
CN109903138A Personalized commodity recommendation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190920

Assignee: Institute of Software, Chinese Academy of Sciences

Assignor: Beijing Jiaotong University

Contract record no.: X2022990000602

Denomination of invention: Recommendation method for processing multi-source heterogeneous data based on deep learning

Granted publication date: 20210817

License type: Common License

Record date: 20220905