CN108959655B - Self-adaptive online recommendation method for dynamic environment - Google Patents


Info

Publication number
CN108959655B
Authority
CN
China
Prior art keywords
classifier
expert
recommendation
dynamic environment
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810889330.9A
Other languages
Chinese (zh)
Other versions
CN108959655A (en)
Inventor
张利军
卢世银
周志华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-08-07
Filing date
2018-08-07
Publication date
2020-04-03
Application filed by Nanjing University
Priority to CN201810889330.9A
Publication of CN108959655A
Application granted
Publication of CN108959655B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a dynamic-environment-oriented adaptive online recommendation method in which the recommendation task is modeled as an online multi-class classification problem and recommendations are then made with an adaptive online classification method. First, a historical data set of the application scenario is obtained. Next, a classifier and a loss function are selected, and the optimal classifier parameters on the historical data set are computed and used as initial values. Then, in each round, the recommended item is decided according to the classifier's prediction, and the classifier parameters are updated by an adaptive method that comprises a meta method and a plurality of expert methods. Compared with the prior art, the method performs online recommendation adaptively and is suitable for dynamic environments whose speed and magnitude of change cannot be predicted.

Description

Self-adaptive online recommendation method for dynamic environment
Technical Field
The invention relates to online recommendation in the field of data mining and machine learning, and in particular to a method for adaptive online recommendation in a dynamic environment, applicable to scenarios such as news recommendation, advertisement recommendation, and commodity recommendation.
Background
An online recommendation method learns users' interests and preferences from interaction data while making recommendations, and adjusts its recommendation strategy in real time to match those preferences. In each recommendation round, the method first observes the features of the user and of all candidate items, then chooses an item to recommend according to its strategy, and finally updates the strategy according to the item the user actually selects. With the rapid growth of observable data and of hardware computing power, online recommendation methods have been widely applied in economics, education, games, multimedia, and other fields. For example, in internet advertising, an online recommendation method can choose the advertisement to deliver each time a user arrives, based on the features of the user and of all candidate advertisements, and update its model after the user's feedback (a click on one advertisement) to improve subsequent delivery. In a news recommendation system, an online recommendation method can predict, each time a user arrives, the news category the user is interested in from the features of the user and of all candidate news, recommend accordingly, and update its model after the user's feedback (reading news of a certain category) to improve subsequent recommendations. In stock investment, an online recommendation method can predict the coming market movement from market features at the beginning of each investment period in order to recommend high-quality investment targets, and update its model according to the actual movement at the end of the period to improve returns in the next period.
The traditional online recommendation method mainly aims to reduce computational overhead while matching the performance of a static offline recommendation method. Although many online recommendation methods have been proven to perform, on average, as well as the best offline recommendation method when the number of recommendation rounds is sufficiently large, static offline methods tend to perform poorly in a dynamically changing environment, so these theoretical guarantees have little practical significance there. Recently, some online recommendation methods with theoretical guarantees for dynamic environments have been proposed, but they all require the speed and magnitude of environmental change to be known in advance, which limits their range of application. In many real-world scenarios, how the environment changes is difficult to control or estimate in advance. In stock investment, stock prices often change drastically when a significant event occurs; in internet advertising and news recommendation systems, user traffic is full of randomness and chance. An adaptive online recommendation method is therefore needed for highly variable dynamic environments whose changes cannot be determined in advance.
Disclosure of Invention
Purpose of the invention: existing online recommendation methods are only suitable for dynamic environments that change slowly and for which prior knowledge is available, whereas in many real scenarios the environment changes quickly and cannot be predicted in advance. To address this problem, the invention provides a dynamic-environment-oriented adaptive online recommendation method.
Technical solution: a dynamic-environment-oriented adaptive online recommendation method for application scenarios such as news recommendation, advertisement recommendation, and commodity recommendation. Specifically, a historical data set of the application scenario is first acquired. Next, a classifier and a loss function are selected, and the optimal classifier parameters on the historical data set are computed and used as initial values. Then, in each round, the recommended item is decided according to the classifier's prediction, and the classifier parameters are updated by an adaptive method comprising a meta method and a plurality of expert methods. Each expert method is configured with a different learning rate, targeting one possible dynamic environment, and updates its decision by gradient descent in each round. The meta method receives the decisions of all expert methods in each round, assigns each expert method a weight according to its recent recommendation performance in the dynamic environment, and combines the expert decisions according to these weights to determine the final recommended item.
The dynamic-environment-oriented adaptive online recommendation method comprises a meta method and expert methods.
The meta-method comprises the following specific steps:
step 100, obtaining a historical data set of the recommendation scenario $H = \{(x_i, y_i) \mid i = 1, 2, \ldots, m\}$, where $x_i$ denotes the vector formed by concatenating the user features and the features of all candidate items, and $y_i$ denotes the item actually selected by the user;
step 101, selecting a classifier $c(x, w)$ and a loss function $l(p, y)$, where $x$ denotes the vector formed by concatenating the user features and the features of all candidate items, $y$ denotes the item actually selected by the user, $w$ denotes the parameters of the classifier, and $p$ denotes the recommended item output by the classifier;
step 102, calculating, on the historical data set, the optimal parameters within the classifier parameter feasible region $W$ based on the selected classifier and loss function
Figure BDA0001756506390000021
step 103, setting the step-size parameter $\alpha$;
step 104, setting the number $N$ of expert methods;
step 105, setting the learning rate $\eta$ of each expert method;
step 106, initializing the weight of each expert method
Figure BDA0001756506390000022
step 107, in each recommendation round $t = 1, 2, \ldots, T$, performing the following steps:
step 108, obtaining the vector $x_t$ formed by concatenating the user features and the features of all candidate items;
step 109, receiving the output of each expert method
Figure BDA0001756506390000031
step 110, calculating the classifier parameters
Figure BDA0001756506390000032
where $\eta$ denotes the learning rate,
Figure BDA0001756506390000033
denotes the weight of the expert method, and $t$ denotes the index of the recommendation round;
step 111, recommending according to the item $c(x_t, w_t)$ output by the classifier;
step 112, obtaining the item $y_t$ actually selected by the user in this round;
step 113, calculating the gradient of the function $f_t(w) = l(c(x_t, w), y_t)$ at $w_t$
Figure BDA0001756506390000034
step 114, sending
Figure BDA0001756506390000035
to each expert method;
step 115, constructing the surrogate loss function $s_t(\cdot)$;
step 116, updating the weight of each expert method
Figure BDA0001756506390000036
The specific steps of each expert method are as follows:
step 200, initializing
Figure BDA0001756506390000037
step 201, in each recommendation round $t = 1, 2, \ldots, T$, performing the following steps:
step 202, sending
Figure BDA0001756506390000038
to the meta method;
step 203, receiving
Figure BDA0001756506390000039
step 204, updating the output
Figure BDA00017565063900000310
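The message flow between the meta method and the expert methods in the steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented formulas: the formulas behind the figure placeholders are not reproduced in this text, so the sketch assumes uniform initial weights, an exponentially weighted meta update on a linearized surrogate loss, projected gradient descent for each expert, and an L2 ball as the feasible region W. The names `run_adaptive`, `project`, and `squared_loss_grad`, and every numeric setting, are hypothetical.

```python
import numpy as np

def project(u, radius):
    # Euclidean projection onto the L2 ball of the given radius (assumed W).
    n = np.linalg.norm(u)
    return u if n <= radius else u * (radius / n)

def run_adaptive(stream, grad_fn, w_init, etas, alpha, radius):
    """Run the meta method with one expert per learning rate in `etas`.

    Returns the combined parameters w_t used in every round and the
    final expert weights.
    """
    experts = [np.array(w_init, dtype=float) for _ in etas]   # step 200
    weights = np.full(len(etas), 1.0 / len(etas))             # step 106 (uniform: assumption)
    history = []
    for x_t, y_t in stream:                                   # steps 107 / 201
        # Steps 109-110: weighted combination of the expert outputs.
        w_t = sum(p * w for p, w in zip(weights, experts))
        history.append(w_t.copy())
        # Steps 111-113: y_t is only used after w_t has been fixed.
        g_t = grad_fn(w_t, x_t, y_t)
        # Step 115: linearized surrogate loss of each expert (assumption).
        s = np.array([g_t @ (w - w_t) for w in experts])
        # Step 116: exponential reweighting; subtracting s.min() only
        # improves numerical stability and cancels after normalization.
        weights = weights * np.exp(-alpha * (s - s.min()))
        weights = weights / weights.sum()
        # Step 204: each expert takes a projected gradient step.
        experts = [project(w - eta * g_t, radius) for w, eta in zip(experts, etas)]
    return history, weights

def squared_loss_grad(w, x, y):
    # Gradient of f_t(w) = (w.x - y)^2 with respect to w.
    return 2.0 * (w @ x - y) * x

# Toy drifting environment: the target flips from +1 to -1 after 100 rounds.
stream = [(np.array([1.0]), 1.0)] * 100 + [(np.array([1.0]), -1.0)] * 100
history, final_weights = run_adaptive(
    stream, squared_loss_grad, w_init=[0.0],
    etas=[0.01, 0.1, 0.4], alpha=0.3, radius=2.0,
)
```

Running several experts in parallel means no single learning rate has to be chosen in advance: after the target flips, the meta weights shift toward whichever experts currently fit the gradient feedback best.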
The classifiers that can be selected in step 101 include the conventional linear classifier $c(x, w) = w^\top x$, the softmax classifier, the neural network classifier, and the like; the loss functions that can be selected are all convex, differentiable loss functions, including the squared loss $l(p, y) = (p - y)^2$, the hinge loss $l(p, y) = \max(0, 1 - yp)$, the cross-entropy loss $l(p, y) = -\sum_i y_i \log(p_i)$, and the like.
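The classifiers and loss functions named above can be written down in a few lines. The sketch below is illustrative only: the function names are hypothetical, the cross-entropy loss is written with its conventional leading minus sign, and `eps` is a numerical guard that is not part of the source.

```python
import numpy as np

def linear_classifier(x, w):
    # c(x, w) = w^T x
    return float(np.dot(w, x))

def softmax(z):
    # Numerically stable softmax over a vector of scores.
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def squared_loss(p, y):
    # l(p, y) = (p - y)^2
    return (p - y) ** 2

def hinge_loss(p, y):
    # l(p, y) = max(0, 1 - y p), with label y in {-1, +1}
    return max(0.0, 1.0 - y * p)

def cross_entropy_loss(p, y, eps=1e-12):
    # l(p, y) = -sum_i y_i log(p_i), with y a one-hot vector and p a
    # probability vector; eps avoids log(0).
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    return float(-np.sum(np.asarray(y, dtype=float) * np.log(p)))
```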
The step-size parameter $\alpha$ in step 103 is set as
Figure BDA00017565063900000311
where $D$ is the diameter of the classifier parameter feasible region $W$, and $G$ is any value such that the following holds:
Figure BDA00017565063900000312
Figure BDA00017565063900000313
The number $N$ of expert methods in step 104 is set as
Figure BDA00017565063900000314
The learning rate $\eta$ of each expert method in step 105 is set as follows: the learning rate of the $i$-th ($i = 1, 2, \ldots, N$) expert is
Figure BDA00017565063900000315
The surrogate loss function $s_t(\cdot)$ constructed in step 115 is defined as
Figure BDA00017565063900000316
where $w_t$ denotes the classifier parameters in round $t$.
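For intuition only: a surrogate loss often used with this kind of meta and expert construction is the linearization of $f_t$ at $w_t$. Treating that form as an assumption (it is not taken from the figure above), convexity of $f_t$ gives the inequality on the right, so keeping the surrogate loss small also controls the original loss:

```latex
s_t(w) = \langle \nabla f_t(w_t),\, w - w_t \rangle,
\qquad
f_t(w_t) - f_t(u) \;\le\; \langle \nabla f_t(w_t),\, w_t - u \rangle
= s_t(w_t) - s_t(u)
\quad \text{for all } u \in W .
```

Under this assumption the meta method only needs the single gradient $\nabla f_t(w_t)$ per round to evaluate the surrogate loss of every expert.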
The projection operator $\Pi_W[\cdot]$ in step 204 is defined as $\Pi_W[u] = \arg\min_{v \in W} \|u - v\|$, $u \in W$.
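When the feasible region is simple, the projection has a closed form. The sketch below assumes, purely for illustration, that W is an L2 ball centered at the origin (so its diameter D is twice the radius); the source does not fix the shape of W, and the name `project_l2_ball` is hypothetical.

```python
import numpy as np

def project_l2_ball(u, radius):
    """Return argmin over {v : ||v||_2 <= radius} of ||u - v||_2."""
    u = np.asarray(u, dtype=float)
    norm = np.linalg.norm(u)
    if norm <= radius:
        # Points already inside the ball are left unchanged.
        return u.copy()
    # Points outside are scaled back onto the boundary.
    return u * (radius / norm)
```

For example, `project_l2_ball([3.0, 4.0], 1.0)` rescales a vector of norm 5 onto the unit sphere, while a point already inside the ball is returned as is.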
Beneficial effects: compared with the prior art, the method performs online recommendation adaptively and is suitable for dynamic environments whose speed and magnitude of change cannot be predicted.
Drawings
FIG. 1 is a workflow diagram of the meta method of the present invention;
FIG. 2 is a workflow diagram of the expert method of the present invention.
Detailed Description
The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, as various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure and fall within the scope of the appended claims.
Take commodity recommendation on an e-commerce website as an example.
The workflow of the meta method is shown in FIG. 1. First, the purchase records of all users of the website over a recent period are acquired as $H = \{(x_i, y_i) \mid i = 1, 2, \ldots, m\}$, where $x_i$ denotes the vector formed by concatenating the features of the user and of all commodities, and $y_i$ denotes the commodity purchased by the user. The user features include gender, age, place of residence, income, education, and so on; the commodity features include price, sales volume, click-through rate, shopping-cart conversion rate, and so on.
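Building the input vector amounts to a plain concatenation. The sketch below is illustrative: the numeric encoding of each feature and the name `build_round_vector` are hypothetical, and the values are made up for the example.

```python
import numpy as np

def build_round_vector(user_features, item_features):
    """Concatenate one user's features with those of every candidate item."""
    parts = [np.asarray(user_features, dtype=float)]
    parts += [np.asarray(f, dtype=float) for f in item_features]
    return np.concatenate(parts)

# Hypothetical encoding: gender, age, region code.
user = [1.0, 34.0, 0.0]
# Hypothetical encoding per commodity: price, sales volume,
# click-through rate, shopping-cart conversion rate.
items = [
    [19.9, 120.0, 0.031, 0.12],
    [5.5, 980.0, 0.054, 0.20],
]
x_t = build_round_vector(user, items)
```

With a fixed number of candidate commodities the vector has a fixed length, which is what a linear or softmax classifier over $x$ requires.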
Next, the softmax classifier and the cross-entropy loss $l(p, y) = -\sum_i y_i \log(p_i)$, both commonly used in this scenario, are selected. On the purchase-record data set, the optimal classifier parameters are calculated based on the selected classifier and loss function
Figure BDA0001756506390000041
This can be done with a convex optimization method such as gradient descent.
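A minimal sketch of this initialization, assuming a softmax classifier with one weight row per commodity, the mean cross-entropy loss over the historical records, and an L2 (Frobenius) ball as the feasible region W. The function names, the radius, the learning rate, and the step count are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax_rows(z):
    # Row-wise, numerically stable softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_initial_parameters(X, y, n_items, radius=10.0, lr=0.5, n_steps=500):
    """Projected gradient descent on the mean cross-entropy loss.

    X: (m, d) array of concatenated feature vectors.
    y: (m,) array of integer indices of the purchased commodity.
    Returns an (n_items, d) weight matrix inside the assumed ball.
    """
    m, d = X.shape
    Y = np.eye(n_items)[y]                # one-hot labels, shape (m, n_items)
    W = np.zeros((n_items, d))
    for _ in range(n_steps):
        P = softmax_rows(X @ W.T)         # predicted probabilities
        grad = (P - Y).T @ X / m          # gradient of the mean loss
        W = W - lr * grad
        norm = np.linalg.norm(W)
        if norm > radius:                 # project back onto the ball
            W = W * (radius / norm)
    return W
```

The returned matrix can then serve as the starting point for the online rounds.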
Then, the number $T$ of recommendation rounds is determined, and the step-size parameter is set to
Figure BDA0001756506390000042
and the number of expert methods to
Figure BDA0001756506390000043
where $D$ is any value such that the following holds:
Figure BDA0001756506390000044
Figure BDA0001756506390000045
$G$ is any value such that the following holds:
Figure BDA0001756506390000046
Figure BDA0001756506390000047
and $W$ is the feasible region of the classifier parameters.
Then, the learning rate of each expert method is set: the learning rate of the $i$-th ($i = 1, 2, \ldots, N$) expert method is set to
Figure BDA0001756506390000048
and the weight of each expert method is initialized as
Figure BDA0001756506390000049
Finally, the online run over the recommendation rounds begins. In each recommendation round, the meta method first obtains the feature vectors of the user and of all candidate commodities in that round and concatenates them to obtain $x_t$. Next, the meta method receives the output of each expert method
Figure BDA0001756506390000051
calculates the parameters of the softmax classifier
Figure BDA0001756506390000052
and recommends a commodity according to the output $c(x_t, w_t)$ of the softmax classifier. The meta method then obtains the commodity $y_t$ actually purchased by the user in that round, calculates the gradient of the function $f_t(w) = l(c(x_t, w), y_t)$ at $w_t$
Figure BDA0001756506390000053
and sends it to all expert methods. Finally, the meta method constructs the surrogate loss function $s_t(\cdot)$ and updates the weight of each expert method
Figure BDA0001756506390000054
The workflow of each expert method is shown in fig. 2. After initialization is completed, in each recommended round, the expert method first sends the output of the current round to the meta method, then receives gradient information from the meta method, and finally updates the output of the next round using gradient descent.

Claims (7)

1. A dynamic-environment-oriented adaptive online recommendation method, characterized by comprising a meta method and expert methods;
the meta method comprises the following specific steps:
step 100, obtaining a historical data set of the recommendation scenario $H = \{(x_i, y_i) \mid i = 1, 2, \ldots, m\}$, where $x_i$ denotes the vector formed by concatenating the user features and the features of all candidate items, and $y_i$ denotes the item actually selected by the user;
step 101, selecting a classifier $c(x, w)$ and a loss function $l(p, y)$, where $x$ denotes the vector formed by concatenating the user features and the features of all candidate items, $y$ denotes the item actually selected by the user, $w$ denotes the parameters of the classifier, and $p$ denotes the recommended item output by the classifier;
step 102, calculating, on the historical data set, the optimal parameters within the classifier parameter feasible region $W$ according to the selected classifier and loss function
Figure FDA0002356077920000011
step 103, setting the step-size parameter $\alpha$;
step 104, setting the number $N$ of expert methods;
step 105, setting the learning rate $\eta$ of each expert method;
step 106, initializing the weight of each expert method
Figure FDA0002356077920000012
step 107, in each recommendation round $t = 1, 2, \ldots, T$, performing the following steps:
step 108, obtaining the vector $x_t$ formed by concatenating the user features and the features of all candidate items;
step 109, receiving the output of each expert method
Figure FDA0002356077920000013
step 110, calculating the classifier parameters
Figure FDA0002356077920000014
where
Figure FDA0002356077920000015
denotes the weight, in round $t$, of the expert with learning rate $\eta$;
step 111, recommending according to the item $c(x_t, w_t)$ output by the classifier;
step 112, obtaining the item $y_t$ actually selected by the user in this round;
step 113, calculating the gradient of the round-$t$ cost function $f_t(w) = l(c(x_t, w), y_t)$ at $w_t$
Figure FDA0002356077920000016
step 114, sending
Figure FDA0002356077920000017
to each expert method;
step 115, constructing the surrogate loss function $s_t(\cdot)$;
step 116, updating the weight of each expert method
Figure FDA0002356077920000018
the specific steps of each expert method are as follows:
step 200, initializing
Figure FDA0002356077920000019
step 201, in each recommendation round $t = 1, 2, \ldots, T$, where $T$ denotes the total number of rounds, performing the following steps:
step 202, sending
Figure FDA0002356077920000021
to the meta method;
step 203, receiving
Figure FDA0002356077920000022
step 204, updating the output
Figure FDA0002356077920000023
where $\Pi_W[\cdot]$ denotes the projection operator.
2. The dynamic environment-oriented adaptive online recommendation method of claim 1, wherein: the classifiers that can be selected in step 101 include the conventional linear classifier $c(x, w) = w^\top x$, the softmax classifier, and the neural network classifier; the loss functions that can be selected are all convex, differentiable loss functions, including the squared loss $l(p, y) = (p - y)^2$, the hinge loss $l(p, y) = \max(0, 1 - yp)$, and the cross-entropy loss $l(p, y) = -\sum_i y_i \log(p_i)$.
3. The dynamic environment-oriented adaptive online recommendation method of claim 1, wherein: the step-size parameter $\alpha$ in step 103 is set as
Figure FDA0002356077920000024
where $T$ is the total number of rounds, $D$ is the diameter of the classifier parameter feasible region $W$, and $G$ is any value such that the following holds:
Figure FDA0002356077920000025
4. The dynamic environment-oriented adaptive online recommendation method of claim 1, wherein: the number $N$ of expert methods in step 104 is set as
Figure FDA0002356077920000026
5. The dynamic environment-oriented adaptive online recommendation method of claim 1, wherein: the learning rate $\eta$ of each expert method in step 105 is set as follows: the learning rate of the $i$-th ($i = 1, 2, \ldots, N$) expert is
Figure FDA0002356077920000027
where $T$ is the total number of rounds, $D$ is the diameter of the classifier parameter feasible region $W$, and $G$ is any value such that the following holds:
Figure FDA0002356077920000028
6. The dynamic environment-oriented adaptive online recommendation method of claim 1, wherein: the surrogate loss function $s_t(\cdot)$ constructed in step 115 is defined as
Figure FDA0002356077920000029
Figure FDA00023560779200000210
7. The dynamic environment-oriented adaptive online recommendation method of claim 1, wherein: the projection operator $\Pi_W[\cdot]$ in step 204 is defined as $\Pi_W[u] = \arg\min_{v \in W} \|u - v\|$, $u \in W$.
CN201810889330.9A 2018-08-07 2018-08-07 Self-adaptive online recommendation method for dynamic environment Active CN108959655B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810889330.9A CN108959655B (en) 2018-08-07 2018-08-07 Self-adaptive online recommendation method for dynamic environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810889330.9A CN108959655B (en) 2018-08-07 2018-08-07 Self-adaptive online recommendation method for dynamic environment

Publications (2)

Publication Number Publication Date
CN108959655A CN108959655A (en) 2018-12-07
CN108959655B true CN108959655B (en) 2020-04-03

Family

ID=64468227

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810889330.9A Active CN108959655B (en) 2018-08-07 2018-08-07 Self-adaptive online recommendation method for dynamic environment

Country Status (1)

Country Link
CN (1) CN108959655B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210065276A1 (en) * 2019-08-28 2021-03-04 Fuji Xerox Co., Ltd. Information processing apparatus and non-transitory computer readable medium
CN110966937B (en) * 2019-12-18 2021-03-09 哈尔滨工业大学 Large member three-dimensional configuration splicing method based on laser vision sensing
CN111754313B (en) * 2020-07-03 2023-09-26 南京大学 Efficient communication online classification method for distributed data without projection

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104166668A (en) * 2014-06-09 2014-11-26 南京邮电大学 News recommendation system and method based on FOLFM model
CN105740430A (en) * 2016-01-29 2016-07-06 大连理工大学 Personalized recommendation method with socialization information fused
CN108108351A (en) * 2017-12-05 2018-06-01 华南理工大学 A kind of text sentiment classification method based on deep learning built-up pattern

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9449288B2 (en) * 2011-05-20 2016-09-20 Deem, Inc. Travel services search

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
一种联合的时序数据特征序列分类学习算法;史苇杭等;《计算机工程》;20160630;第196-200页 *

Also Published As

Publication number Publication date
CN108959655A (en) 2018-12-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant