CN109033463A - Community question-answer content recommendation method based on end-to-end memory network - Google Patents

Community question-answer content recommendation method based on end-to-end memory network

Info

Publication number
CN109033463A
CN109033463A (application CN201811008620.4A)
Authority
CN
China
Prior art keywords
title
vector
community
question
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811008620.4A
Other languages
Chinese (zh)
Other versions
CN109033463B (en)
Inventor
陈细玉
林穗
孙为军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN201811008620.4A priority Critical patent/CN109033463B/en
Publication of CN109033463A publication Critical patent/CN109033463A/en
Application granted granted Critical
Publication of CN109033463B publication Critical patent/CN109033463B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 - Social networking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks


Abstract

The invention discloses a community question-answer content recommendation method based on an end-to-end memory network. First, titles are collected as a data set and preprocessed, and the data set is divided into a training set, a validation set and a test set; then an end-to-end memory network model is built from the data set; finally, the model is optimized with stochastic gradient descent (SGD) using the AdaGrad update rule.

Description

Community question-answer content recommendation method based on end-to-end memory network
Technical field
The present invention relates to the field of content recommendation, and more particularly to a community question-answer content recommendation method based on an end-to-end memory network.
Background art
Online question-answer communities such as Zhihu are now the main platforms where people solve problems and share knowledge and experience. The information they cover is broad, but not all of it interests every user, so content that matches a user's interests needs to be recommended in order to increase user stickiness.
Summary of the invention
The present invention aims to address one or more of the above defects by proposing a community question-answer content recommendation method based on an end-to-end memory network.
To achieve the above objective, the technical solution adopted is as follows:
A community question-answer content recommendation method based on an end-to-end memory network, comprising the following steps:
S1: obtain titles as a data set and preprocess them, and divide the data set into a training set, a validation set and a test set;
S2: build an end-to-end memory network model from the data set;
S3: optimize the model with stochastic gradient descent (SGD) using the AdaGrad update rule.
Preferably, in step S1 the data set is divided evenly into the training set, the validation set and the test set.
Preferably, the titles in step S1 are the titles of content in the question-answer community on which the user has browsing or other historical behavior.
Preferably, the end-to-end memory model includes a single-layer model and a multi-layer model; the single-layer model comprises a memory component, an input component and an output component.
The memory component stores the title set of historical behavior D = {x_1, x_2, ..., x_n}. A matrix A of size d × |V| embeds each word w_ij ∈ x_i into a d-dimensional memory vector a_ij, so that a_ij = A·w_ij; the whole sentence set {x_i} is thus converted by matrix A into d-dimensional memory vectors {a_i}.
The input component converts the title q currently being browsed into a vector b via a matrix B, and computes the match between b and each memory a_i as p_i = Softmax(b^T a_i), where Softmax(z_i) = e^{z_i} / Σ_j e^{z_j} and p is the attention probability vector over the memory items.
The output component converts the historical title set D = {x_1, x_2, ..., x_n} into d-dimensional output vectors c_i using a matrix C; the output o is the sum of the output vectors c_i weighted by the probability vector, o = Σ_i p_i·c_i.
The final prediction is f = Softmax(W(o + b)).
In the multi-layer model, the title q fed to the input component of each layer is the sum of the previous layer's input b and output o, i.e. the input of layer k+1 is the output o^k of layer k plus its input b^k: b^{k+1} = o^k + b^k.
Each layer has its own embedding matrices A^k and C^k for embedding the input {x_i}.
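The single-layer computation above can be summarised in a short sketch. This is an illustrative NumPy rendering of the formulas, not the patent's implementation; the toy dimensions, the random initialisation and the helper names (embed_title, single_hop) are assumptions.

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    d, V = 300, 10000                          # embedding dimension d and vocabulary size |V|
    rng = np.random.default_rng(0)
    A = rng.normal(scale=0.1, size=(d, V))     # memory embedding matrix A (d x |V|)
    B = rng.normal(scale=0.1, size=(d, V))     # input embedding matrix B
    C = rng.normal(scale=0.1, size=(d, V))     # output embedding matrix C
    W = rng.normal(scale=0.1, size=(V, d))     # final prediction matrix W

    def embed_title(bow, M):
        # bow: 0-1 bag-of-words vector of length |V|; the title embedding is the sum of word embeddings
        return M @ bow

    def single_hop(history_bows, query_bow):
        a = np.stack([embed_title(x, A) for x in history_bows])   # memory vectors a_i
        c = np.stack([embed_title(x, C) for x in history_bows])   # output vectors c_i
        b = embed_title(query_bow, B)                              # query vector b
        p = softmax(a @ b)                                         # p_i = Softmax(b^T a_i)
        o = p @ c                                                  # o = sum_i p_i c_i
        return softmax(W @ (o + b))                                # f = Softmax(W(o + b))

The multi-layer model repeats this block, feeding b^{k+1} = o^k + b^k into the next hop; a fuller multi-hop sketch appears in the embodiment below.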
Preferably, the multi-layer model further includes a sentence representation: each sentence x_i = {x_i1, x_i2, ..., x_in} is represented by embedding each word and summing the resulting vectors, with a temporal encoding added. Each word vector is a 0-1 (one-hot) vector of length |V|, so that a_i = Σ_j A·x_ij + T_A(i), where T_A(i) is the i-th row of a special matrix T_A that encodes temporal information; similarly, the output embedding uses a matrix T_C, with c_i = Σ_j C·x_ij + T_C(i). Both T_A and T_C are learned during training.
Preferably, the multi-layer model further includes word similarity: for the title q currently being browsed, at the first layer the keywords in the memory whose similarity to keywords in q exceeds 0.8 are added to q, so that titles in the memory that are close in meaning to q but use different words do not receive too low a weight.
In the corpus formed by all preprocessed titles, the keywords of the title currently being browsed are selected and their pairwise word similarities with the remaining keywords are computed according to the similarity formula, in which y_i is the coefficient at which the two words w1 and w2 begin to branch at the i-th layer.
Preferably, the evaluation criteria of the model are accuracy, recall and F1 score.
Compared with the prior art, the beneficial effects of the present invention are:
The end-to-end memory network can remember a large amount of user behavior, and time information is incorporated, making the prediction of user interests more accurate and reliable. End-to-end training reduces the amount of supervision required. The attention mechanism gives different titles different weights, so the predicted interest points can be ranked and the emphasis of the recommendation varies accordingly: interest points with larger weights rank higher, and more of their content is recommended than that of other interest points. Adding word similarity makes the prediction more accurate.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention.
Specific embodiments
The accompanying drawings are for illustrative purposes only and shall not be construed as limiting the patent.
The present invention is further described below with reference to the drawings and embodiments.
Embodiment 1
A community question-answer content recommendation method based on an end-to-end memory network, referring to Fig. 1, comprises the following steps:
S1: obtain titles as a data set and preprocess them, and divide the data set into a training set, a validation set and a test set;
S2: build an end-to-end memory network model from the data set;
S3: optimize the model with stochastic gradient descent (SGD) using the AdaGrad update rule.
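Step S3 amounts to training with SGD under the AdaGrad update rule. A minimal sketch of such a training loop is given below, assuming the memory network has been implemented as a hypothetical PyTorch module MemN2N and the training set from S1 is served by an assumed train_loader; epochs, dimensions and learning rate are illustrative.

    import torch
    from torch import nn

    model = MemN2N(vocab_size=10000, dim=300, hops=3)              # hypothetical PyTorch module
    optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01)   # SGD with the AdaGrad update rule
    criterion = nn.CrossEntropyLoss()

    for epoch in range(20):
        for history, query, label in train_loader:                 # assumed DataLoader over the training set
            optimizer.zero_grad()
            logits = model(history, query)                          # forward pass of the memory network
            loss = criterion(logits, label)
            loss.backward()
            optimizer.step()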
On Zhihu, compared with Baidu Zhidao, users are more inclined to share questions together with their answers rather than simply answer questions. Each question is brief and highly descriptive, so questions are used as titles. All collected titles are preprocessed: each title is first segmented into words, and stop words and special characters such as "a" are removed. Because many Zhihu questions contain words such as "why", "how" and "experience", these words are also removed, so that common, irrelevant words do not carry excessive weight. The maximum sentence length is set to 50 words, and content beyond that is truncated. The data set is divided evenly into a training set, a validation set and a test set.
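A minimal sketch of this preprocessing, assuming jieba for Chinese word segmentation; the stop-word list, the sample titles and the even three-way split are illustrative stand-ins for the actual corpus.

    import random
    import jieba  # Chinese word segmentation

    STOP_WORDS = {"的", "a", "为什么", "如何", "体验"}     # illustrative stop-word list
    MAX_LEN = 50                                           # maximum title length in words

    def preprocess(title):
        words = jieba.lcut(title)                          # segment the title into words
        words = [w for w in words
                 if w.strip() and w not in STOP_WORDS and w.isalnum()]
        return words[:MAX_LEN]                             # truncate titles longer than 50 words

    titles = ["如何评价端到端记忆网络", "学习编程是一种怎样的体验"]   # sample titles, illustrative only
    data = [preprocess(t) for t in titles]
    random.shuffle(data)
    n = len(data) // 3                                     # even three-way split
    train_set, val_set, test_set = data[:n], data[n:2 * n], data[2 * n:]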
The titles of the user's historical behavior are selected as the memory of the model. Historical behavior includes, in addition to the title currently being browsed, the most recently browsed titles, upvoted titles, answered titles and followed titles; for each behavior type the 5 most recent titles are selected. Since the goal is to recommend content related to the user's most recent interests, the selected titles are sorted by the user's operation time to form the title set D. In experiments, a title embedding dimension between 300 and 500 works well (a larger dimension represents the sentence better), and about 3 hops is a good choice for the model; too many or too few hops both reduce the effect.
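The memory construction above could look roughly as follows; the user_actions structure, the behaviour names and the build_memory helper are assumptions for illustration.

    BEHAVIOURS = ("browse", "upvote", "answer", "follow")   # assumed behaviour type names
    K_PER_BEHAVIOUR = 5                                     # newest 5 titles per behaviour

    def build_memory(user_actions):
        """user_actions: {behaviour: [(timestamp, title), ...]} (hypothetical structure)."""
        kept = []
        for b in BEHAVIOURS:
            newest = sorted(user_actions.get(b, []), reverse=True)[:K_PER_BEHAVIOUR]
            kept.extend(newest)
        # order the combined set by operation time so the temporal encoding T_A(i)
        # sees the titles in sequence
        kept.sort(key=lambda pair: pair[0])
        return [title for _, title in kept]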
In this embodiment, the end-to-end memory model includes a single-layer model and a multi-layer model; the single-layer model comprises a memory component, an input component and an output component.
The memory component stores the title set of historical behavior D = {x_1, x_2, ..., x_n}. A matrix A of size d × |V| embeds each word w_ij ∈ x_i into a d-dimensional memory vector a_ij, so that a_ij = A·w_ij; the whole sentence set {x_i} is thus converted by matrix A into d-dimensional memory vectors {a_i}.
The input component converts the title q currently being browsed into a vector b via a matrix B, and computes the match between b and each memory a_i as p_i = Softmax(b^T a_i), where Softmax(z_i) = e^{z_i} / Σ_j e^{z_j} and p is the attention probability vector over the memory items.
The output component converts the historical title set D = {x_1, x_2, ..., x_n} into d-dimensional output vectors c_i using a matrix C; the output o is the sum of the output vectors c_i weighted by the probability vector, o = Σ_i p_i·c_i.
The final prediction is f = Softmax(W(o + b)).
In the multi-layer model, the title q fed to the input component of each layer is the sum of the previous layer's input b and output o, i.e. the input of layer k+1 is the output o^k of layer k plus its input b^k: b^{k+1} = o^k + b^k.
Each layer has its own embedding matrices A^k and C^k for embedding the input {x_i}.
In this embodiment, the multi-layer model further includes a sentence representation: each sentence x_i = {x_i1, x_i2, ..., x_in} is represented by embedding each word and summing the resulting vectors, with a temporal encoding added. Each word vector is a 0-1 (one-hot) vector of length |V|, so that a_i = Σ_j A·x_ij + T_A(i), where T_A(i) is the i-th row of a special matrix T_A that encodes temporal information; similarly, the output embedding uses a matrix T_C, with c_i = Σ_j C·x_ij + T_C(i). Both T_A and T_C are learned during training.
Each of the matrices A, B, C and W is learned during training. To reduce the number of parameters, the first hop is tied to the input embedding, A^1 = B; the final prediction matrix is tied to the last output embedding, W^T = C^K; and the memory matrix of every other hop equals the output matrix of the previous hop, i.e. A^{k+1} = C^k. The temporal matrices T_A and T_C are tied in the same way to reduce parameters.
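Putting the embodiment together, the multi-hop forward pass with temporal encoding and the adjacent weight tying (A^1 = B, A^{k+1} = C^k, W^T = C^K) might be sketched as follows; the dimensions, initialisation and the forward helper are illustrative assumptions rather than the patent's code.

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    d, V, n_mem, hops = 300, 10000, 20, 3                  # illustrative sizes
    rng = np.random.default_rng(0)
    # With adjacent tying, K hops use K+1 embedding matrices: A^1 (= B) and C^1..C^K.
    embeds = [rng.normal(scale=0.1, size=(d, V)) for _ in range(hops + 1)]
    T = [rng.normal(scale=0.1, size=(n_mem, d)) for _ in range(hops + 1)]   # temporal matrices, tied likewise

    def forward(history_bows, query_bow):
        m = len(history_bows)
        b = embeds[0] @ query_bow                          # B = A^1
        for k in range(hops):
            A_k, C_k = embeds[k], embeds[k + 1]            # A^{k+1} = C^k
            a = np.stack([A_k @ x for x in history_bows]) + T[k][:m]       # a_i = sum_j A x_ij + T_A(i)
            c = np.stack([C_k @ x for x in history_bows]) + T[k + 1][:m]   # c_i = sum_j C x_ij + T_C(i)
            p = softmax(a @ b)                             # attention over the memory
            o = p @ c                                      # weighted sum of output vectors
            b = o + b                                      # b^{k+1} = o^k + b^k
        W = embeds[hops].T                                 # W^T = C^K
        return softmax(W @ b)                              # final prediction f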
In this embodiment, the multi-layer model further includes word similarity: for the title q currently being browsed, at the first layer the keywords in the memory whose similarity to keywords in q exceeds 0.8 are added to q, so that titles in the memory that are close in meaning to q but use different words do not receive too low a weight.
In the corpus formed by all preprocessed titles, the keywords of the title currently being browsed are selected and their pairwise word similarities with the remaining keywords are computed according to the similarity formula, in which y_i is the coefficient at which the two words w1 and w2 begin to branch at the i-th layer.
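The patent defines word similarity through the branch coefficients y_i; as an illustrative stand-in, the sketch below uses cosine similarity between pre-trained word vectors (gensim KeyedVectors, an assumed substitute) and applies the same 0.8 threshold when augmenting q. The vector file name is hypothetical.

    from gensim.models import KeyedVectors

    wv = KeyedVectors.load("zh_word_vectors.kv")            # hypothetical pre-trained vector file

    def augment_query(query_keywords, corpus_keywords, threshold=0.8):
        extra = []
        for w in corpus_keywords:
            if w in query_keywords or w not in wv:
                continue
            # add a corpus keyword if it is close enough to any keyword already in q
            if any(q in wv and wv.similarity(q, w) >= threshold for q in query_keywords):
                extra.append(w)
        return list(query_keywords) + extra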
The results predicted by the model are taken as the user's most recent interest points; for each currently browsed title, the top 5 predicted interest points are selected by ranking. Each interest point is used as a tag, and the hot content corresponding to that tag is recommended; for example, if the predicted result tags include "friend", hot content carrying the "friend" tag is recommended.
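A sketch of this recommendation step, assuming a vocab list that maps prediction indices to tag words and a hot_contents mapping from tags to hot contents (both hypothetical):

    import numpy as np

    def recommend(pred, vocab, hot_contents, top_k=5):
        """pred: probability vector over interest tags; vocab: index -> tag word;
        hot_contents: {tag: [content, ...]} (hypothetical)."""
        top_idx = np.argsort(pred)[::-1][:top_k]             # highest-weight interest points first
        recommendations = []
        for i in top_idx:                                     # higher-ranked tags contribute earlier items
            tag = vocab[i]
            recommendations.extend(hot_contents.get(tag, []))
        return recommendations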
In this embodiment, the evaluation criteria of the model are accuracy, recall and F1 score.
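These metrics can be computed with scikit-learn; the relevance labels below are illustrative only.

    from sklearn.metrics import precision_score, recall_score, f1_score

    y_true = [1, 0, 1, 1, 0]     # ground-truth relevance of recommended items (illustrative)
    y_pred = [1, 0, 0, 1, 1]     # model's recommendation decisions (illustrative)

    print("precision", precision_score(y_true, y_pred))
    print("recall   ", recall_score(y_true, y_pred))
    print("F1       ", f1_score(y_true, y_pred))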
Obviously, the above embodiment is merely an example given to clearly illustrate the present invention and is not a limitation of its embodiments. For those of ordinary skill in the art, other variations or changes in different forms can be made on the basis of the above description. It is neither necessary nor possible to exhaust all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principle of the invention shall fall within the protection scope of the claims of the present invention.

Claims (7)

1. A community question-answer content recommendation method based on an end-to-end memory network, characterized by comprising the following steps:
S1: obtain titles as a data set and preprocess them, and divide the data set into a training set, a validation set and a test set;
S2: build an end-to-end memory network model from the data set;
S3: optimize the model with stochastic gradient descent (SGD) using the AdaGrad update rule.
2. The community question-answer content recommendation method based on an end-to-end memory network according to claim 1, characterized in that in step S1 the data set is divided evenly into the training set, the validation set and the test set.
3. The community question-answer content recommendation method based on an end-to-end memory network according to claim 1, characterized in that the titles in step S1 are the titles of content in the question-answer community on which the user has browsing or other historical behavior.
4. The community question-answer content recommendation method based on an end-to-end memory network according to claim 1, characterized in that the end-to-end memory model includes a single-layer model and a multi-layer model; the single-layer model comprises a memory component, an input component and an output component;
the memory component stores the title set of historical behavior D = {x_1, x_2, ..., x_n}; a matrix A of size d × |V| embeds each word w_ij ∈ x_i into a d-dimensional memory vector a_ij, so that a_ij = A·w_ij, and the whole sentence set {x_i} is converted by matrix A into d-dimensional memory vectors {a_i};
the input component converts the title q currently being browsed into a vector b via a matrix B, and computes the match between b and each memory a_i as p_i = Softmax(b^T a_i), where Softmax(z_i) = e^{z_i} / Σ_j e^{z_j} and p is the attention probability vector over the memory items;
the output component converts the historical title set D = {x_1, x_2, ..., x_n} into d-dimensional output vectors c_i using a matrix C, and the output o is the sum of the output vectors c_i weighted by the probability vector, o = Σ_i p_i·c_i;
the final prediction is f = Softmax(W(o + b));
in the multi-layer model, the title q of the input component is the sum of the previous layer's input b and output o, i.e. the input of layer k+1 is the output o^k of layer k plus its input b^k: b^{k+1} = o^k + b^k;
each layer has its own embedding matrices A^k and C^k for embedding the input {x_i}.
5. The community question-answer content recommendation method based on an end-to-end memory network according to claim 4, characterized in that the multi-layer model further includes a sentence representation: each sentence x_i = {x_i1, x_i2, ..., x_in} is represented by embedding each word and summing the resulting vectors, with a temporal encoding added; each word vector is a 0-1 vector of length |V|, so that a_i = Σ_j A·x_ij + T_A(i), where T_A(i) is the i-th row of a special matrix T_A encoding temporal information; similarly, the output embedding uses a matrix T_C, with c_i = Σ_j C·x_ij + T_C(i); both T_A and T_C are learned during training.
6. The community question-answer content recommendation method based on an end-to-end memory network according to claim 4, characterized in that the multi-layer model further includes word similarity: for the title q currently being browsed, at the first layer the keywords in the memory whose similarity to keywords in q exceeds 0.8 are added to q, so that titles in the memory that are close in meaning to q but use different words do not receive too low a weight;
in the corpus formed by all preprocessed titles, the keywords of the title currently being browsed are selected and their pairwise word similarities with the remaining keywords are computed according to the similarity formula, in which y_i is the coefficient at which the words w1 and w2 begin to branch at the i-th layer.
7. The community question-answer content recommendation method based on an end-to-end memory network according to claim 1, characterized in that the evaluation criteria of the model are accuracy, recall and F1 score.
CN201811008620.4A 2018-08-28 2018-08-28 Community question-answer content recommendation method based on end-to-end memory network Active CN109033463B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811008620.4A CN109033463B (en) 2018-08-28 2018-08-28 Community question-answer content recommendation method based on end-to-end memory network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811008620.4A CN109033463B (en) 2018-08-28 2018-08-28 Community question-answer content recommendation method based on end-to-end memory network

Publications (2)

Publication Number Publication Date
CN109033463A true CN109033463A (en) 2018-12-18
CN109033463B CN109033463B (en) 2021-11-26

Family

ID=64625982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811008620.4A Active CN109033463B (en) 2018-08-28 2018-08-28 Community question-answer content recommendation method based on end-to-end memory network

Country Status (1)

Country Link
CN (1) CN109033463B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110134771A (en) * 2019-04-09 2019-08-16 广东工业大学 A kind of implementation method based on more attention mechanism converged network question answering systems
CN110188272A (en) * 2019-05-27 2019-08-30 南京大学 A kind of community's question and answer web site tags recommended method based on user context

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140030688A1 (en) * 2012-07-25 2014-01-30 Armitage Sheffield, Llc Systems, methods and program products for collecting and displaying query responses over a data network
US20140280087A1 (en) * 2013-03-15 2014-09-18 International Business Machines Corporation Results of Question and Answer Systems
CN106126596A (en) * 2016-06-20 2016-11-16 中国科学院自动化研究所 A kind of answering method based on stratification memory network
CN106407316A (en) * 2016-08-30 2017-02-15 北京航空航天大学 Topic model-based software question and answer recommendation method and device
CN107330130A (en) * 2017-08-29 2017-11-07 北京易掌云峰科技有限公司 A kind of implementation method of dialogue robot to artificial customer service recommendation reply content
CN108133038A (en) * 2018-01-10 2018-06-08 重庆邮电大学 A kind of entity level emotional semantic classification system and method based on dynamic memory network
US20180165361A1 (en) * 2016-12-09 2018-06-14 At&T Intellectual Property I, L.P. Mapping service and resource abstractions to network inventory graph database nodes and edges

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140030688A1 (en) * 2012-07-25 2014-01-30 Armitage Sheffield, Llc Systems, methods and program products for collecting and displaying query responses over a data network
US20140280087A1 (en) * 2013-03-15 2014-09-18 International Business Machines Corporation Results of Question and Answer Systems
CN106126596A (en) * 2016-06-20 2016-11-16 中国科学院自动化研究所 A kind of answering method based on stratification memory network
CN106407316A (en) * 2016-08-30 2017-02-15 北京航空航天大学 Topic model-based software question and answer recommendation method and device
US20180165361A1 (en) * 2016-12-09 2018-06-14 At&T Intellectual Property I, L.P. Mapping service and resource abstractions to network inventory graph database nodes and edges
CN107330130A (en) * 2017-08-29 2017-11-07 北京易掌云峰科技有限公司 A kind of implementation method of dialogue robot to artificial customer service recommendation reply content
CN108133038A (en) * 2018-01-10 2018-06-08 重庆邮电大学 A kind of entity level emotional semantic classification system and method based on dynamic memory network

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110134771A (en) * 2019-04-09 2019-08-16 广东工业大学 A kind of implementation method based on more attention mechanism converged network question answering systems
CN110134771B (en) * 2019-04-09 2022-03-04 广东工业大学 Implementation method of multi-attention-machine-based fusion network question-answering system
CN110188272A (en) * 2019-05-27 2019-08-30 南京大学 A kind of community's question and answer web site tags recommended method based on user context
CN110188272B (en) * 2019-05-27 2023-04-21 南京大学 Community question-answering website label recommendation method based on user background

Also Published As

Publication number Publication date
CN109033463B (en) 2021-11-26

Similar Documents

Publication Publication Date Title
Wei et al. Reinforcement learning to rank with Markov decision process
US9171078B2 (en) Automatic recommendation of vertical search engines
CN106339383B (en) A kind of search ordering method and system
CN105808590B (en) Search engine implementation method, searching method and device
WO2014160282A1 (en) Classifying resources using a deep network
CN109947902B (en) Data query method and device and readable medium
CN111538827A (en) Case recommendation method and device based on content and graph neural network and storage medium
CN111310023B (en) Personalized search method and system based on memory network
CN109933708A (en) Information retrieval method, device, storage medium and computer equipment
CN110222260A (en) A kind of searching method, device and storage medium
Truong et al. Content-based sensor search for the Web of Things
CN108604248B (en) Note providing method and device using correlation calculation based on artificial intelligence
WO2010037314A1 (en) A method for searching and the device and system thereof
Yigit et al. Extended topology based recommendation system for unidirectional social networks
CN113987161A (en) Text sorting method and device
Kim et al. Building concept network-based user profile for personalized web search
CN115905687A (en) Cold start-oriented recommendation system and method based on meta-learning graph neural network
CN109033463A (en) A kind of community's question and answer content recommendation method based on end-to-end memory network
CN112364245A (en) Top-K movie recommendation method based on heterogeneous information network embedding
Zhang et al. Hybrid recommender system using semi-supervised clustering based on Gaussian mixture model
Dong et al. Improving sequential recommendation with attribute-augmented graph neural networks
CN112486467B (en) Interactive service recommendation method based on dual interaction relation and attention mechanism
CN113535949A (en) Multi-mode combined event detection method based on pictures and sentences
Chen et al. Multi-context embedding based personalized place semantics recognition
CN110162535B (en) Search method, apparatus, device and storage medium for performing personalization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant