CN115099886A - Long and short interest sequence recommendation method and device and storage medium - Google Patents

Long and short interest sequence recommendation method and device and storage medium

Info

Publication number
CN115099886A
CN115099886A · Application CN202210575237.7A · Granted publication CN115099886B
Authority
CN
China
Prior art keywords
user
commodity
embedded
interest
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210575237.7A
Other languages
Chinese (zh)
Other versions
CN115099886B (en)
Inventor
Xu Yong (许勇)
Li Xiang (李想)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology (SCUT)
Priority to CN202210575237.7A
Publication of CN115099886A
Application granted
Publication of CN115099886B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/06: Buying, selling or leasing transactions
    • G06Q30/0601: Electronic shopping [e-shopping]
    • G06Q30/0631: Item recommendations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • G06Q30/0202: Market predictions or forecasting for commercial activities

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a long and short interest sequence recommendation method, device and storage medium. The method comprises the following steps: constructing a graph from the interaction data of users and commodities; extracting multi-order information of commodities and users with graph convolution, and, after obtaining the initial embedding vectors of commodities and users, constructing the user-commodity interaction sequence and inputting it into a Transformer module to learn the user's long-term interest embedding vector; obtaining short-term user behavior embedding vectors and inputting them into a capsule network to obtain the user's k short-term interest embedding vectors; passing the k short-term interest embedding vectors, the long-term interest embedding vector and the commodity embedding vectors through an attention mechanism module to obtain the weight between a single commodity embedding vector and each user interest embedding vector, and then obtaining the final user embedding vector by weighting; and calculating the click probability of the interaction between the user embedding vector and the commodity embedding vector to realize commodity recommendation. The method and device effectively improve the recommendation effect and can be widely applied in the field of sequence recommendation.

Description

Long and short interest sequence recommendation method and device and storage medium
Technical Field
The invention relates to the technical field of commodity time-series recommendation, in particular to a long and short interest sequence recommendation method, device and storage medium.
Background
With the progress of network technology and the gradual popularization of devices such as mobile phones and computers, online shopping has become part of residents' daily life, and online sales have become an important and growing component of business operations. Every large e-commerce platform continuously generates a large amount of user-commodity interaction data, which expresses much important information at both the macroscopic and the microscopic level. The e-commerce industry can make full use of this data to study and predict customers' interests and hobbies, recommend commodities and services to customers better, sell merchants' commodities better, and fully meet the needs of merchants and users. It is therefore of great significance to research an effective and lightweight recommendation algorithm using the user-commodity interaction data. When a user purchases a commodity, the relationship between previously purchased commodities and the article to be purchased is generally considered, so the user-commodity interaction sequence contains important information for the purchase of the next commodity. Recommending to users the next item they really need out of a vast number of items becomes extremely important: it greatly reduces the time users spend searching for commodities, and helps merchants with stocking and bringing new commodities online.
Existing algorithms for predicting the next commodity from the user-commodity interaction sequence generally face three problems. First, traditional collaborative filtering suffers from data sparsity, which leads to a poor recommendation effect. Second, traditional neural network algorithms express a user's representation with a single fixed embedding vector, which causes popular commodities to be recommended disproportionately often and leaves insufficient recall for commodities in smaller niches. Third, many recommendation networks do not introduce time information, whereas in reality a user's representation vector changes over time.
Disclosure of Invention
In order to solve at least one of the technical problems in the prior art to a certain extent, the present invention provides a long and short interest sequence recommendation method, apparatus and storage medium.
The technical scheme adopted by the invention is as follows:
A long and short interest sequence recommendation method comprises the following steps:
acquiring data comprising the user-commodity interaction sequence;
constructing a graph of users and commodities according to the obtained data, and inputting the constructed graph into a graph neural network to obtain the initial embedding vectors of users and commodities;
learning the short-term user behavior embedding vector from the initial commodity embedding vectors;
learning K short-term user interest embedding vectors from the short-term user behavior embedding vector;
adding the user-commodity interaction sequence and the position embedding vectors, and inputting the sum into a Transformer module to obtain the long-term user interest embedding vector;
merging the long-term user interest embedding vector and the user's initial embedding vector into the K short-term user interest embedding vectors to obtain K+1 user interest embedding vectors;
learning the weight of each interest embedding vector through the attention mechanism between the user interest embedding vectors and the commodity embedding vector, and constructing the final user embedding vector;
and obtaining the commodity prediction result from the inner product of the commodity embedding vector and the final user embedding vector.
Further, the constructing a graph of users and commodities according to the obtained data and inputting the constructed graph into a graph neural network to obtain initial embedding vectors carrying the high-order semantics of users and commodities includes:
constructing a graph of users and commodities according to the obtained data, and converging the information between nodes into the central node through multiple layers of graph convolution, so as to express the initial embedding vectors of users and commodities;
the initial embedding vectors obtained by graph convolution are:
E^{(l)} = σ( (Â + I) E^{(l-1)} W_1^{(l)} + Â E^{(l-1)} ⊙ E^{(l-1)} W_2^{(l)} )
where E^{(l)} denotes the combined commodity and user embedding matrix after the l-th graph-convolution layer, and W_1^{(l)} and W_2^{(l)} are learnable parameter matrices; the term (Â + I) E^{(l-1)} W_1^{(l)} means that each node fuses its own information (the identity matrix I adds a self-loop before multiplying the embedding matrix) while aggregating the information of the user/commodity neighbors, and the term Â E^{(l-1)} ⊙ E^{(l-1)} W_2^{(l)} merges in the relevance of the user to the commodity.
Further, the learning to obtain the short-term user behavior embedding vector from the initial commodity embedding vectors includes:
the user-commodity interaction sequence, ordered by interaction time, is
S^u = ( e^u_1, e^u_2, …, e^u_m )
where e^u_1 is the embedding vector of the item with ID 1 and m is the item ID;
mean-pooling is applied to the commodity embedding vectors of the user-commodity interaction sequence to obtain the short-term user behavior embedding vector z^n_m, where n is the user ID and m is the commodity ID.
Further, the learning to obtain K short-term user interest embedding vectors from the short-term user behavior embedding vectors includes:
inputting the short-term user behavior embedding vectors into a capsule network and, through several iterations of dynamic routing in which low-level capsules approach high-level capsules, finally obtaining the K short-term user interest embedding vectors, where K is a preset hyper-parameter.
Further, the position embedding vector is expressed as:
PE(t, 2l) = sin( t / 10000^{2l/d} ),  PE(t, 2l+1) = cos( t / 10000^{2l/d} )
where PE(t, 2l) carries the embedded information of the user and the interacted commodity at positions with an even index value, PE(t, 2l+1) carries it at positions with an odd index value, t is the timestamp of the interaction between the commodity and the user, l is the index value of the commodity, and d is the dimension of the commodity embedding.
Further, the K+1 user interest embedding vectors are expressed as:
V^n = [ v^n_1, v^n_2, …, v^n_{K+1} ]
where V^n denotes the K+1 interest embedding matrix of the user, n is the user ID, i is the index of a particular interest embedding vector, and v^n_i is an interest embedding vector;
the final user embedding vector is expressed as:
u_i = V_i σ( V_i^⊤ I_j )    (1)
w_{ij} = exp( I_j^⊤ v_{ij} ) / Σ_{j'} exp( I_j^⊤ v_{ij'} ),  u_i = Σ_j w_{ij} v_{ij}    (2)
where σ is the softmax nonlinear activation function; I_j is the commodity embedding vector and V_i the user interest embedding matrix; u_i is the final user embedding vector obtained by the weighted summation of the short-term and long-term user interest embedding vectors, and w_{ij} is the weight value of each user interest embedding vector. Formula (1) is the shorthand form and formula (2) spells out the calculation of u_i.
Further, the obtaining of the commodity prediction result from the inner product of the commodity embedding vector and the final user embedding vector includes:
for a user u_i, a positive-sample commodity i⁺ and a negative-sample commodity i⁻, the output prediction values require supervised training, and the training loss is defined as:
L = -Σ_{(u,i,j)∈O} ln σ( ŷ_{ui} - ŷ_{uj} ) + λ‖Θ‖²₂
O = { (u, i, j) | (u, i) ∈ R⁺, (u, j) ∈ R⁻ }
where R⁺ is the set of observed samples and R⁻ the set of unobserved samples; σ is a nonlinear activation function; Θ denotes the learnable parameters; and ŷ_{ui} denotes the predicted preference score of user u interacting with commodity i.
Further, the long and short interest sequence recommendation method further includes the step of constructing an end-to-end model:
using the pre-trained graph neural network part, user behavior layer part, multi-interest capsule network part and the Transformer that constructs the long-term user interest embedding vector, together with the attention mechanism layer between the multiple interests and the commodity embedding vectors and the click-rate prediction part, to form an end-to-end long and short interest sequence recommendation algorithm;
and performing model parameter learning on the training data set by the stochastic gradient descent method until the model converges.
The other technical scheme adopted by the invention is as follows:
a long and short interest sequence recommendation device comprises:
at least one processor;
at least one memory for storing at least one program;
when executed by the at least one processor, cause the at least one processor to implement the method described above.
The invention adopts another technical scheme that:
a computer readable storage medium, in which a program executable by a processor is stored, the program executable by the processor being for performing the method as described above when executed by the processor.
The invention has the beneficial effects that: the method and device generate a plurality of short-term and long-term interest embedding vectors for the user, which better satisfies the user's different interest demands and further improves the recommendation effect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description is made on the drawings of the embodiments of the present invention or the related technical solutions in the prior art, and it should be understood that the drawings in the following description are only for convenience and clarity of describing some embodiments in the technical solutions of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart illustrating steps of a long and short interest sequence recommendation method according to an embodiment;
FIG. 2 is a schematic diagram of the pre-training phase of the model in this embodiment;
fig. 3 is a structural diagram of a main model module of the model in the present embodiment.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is the orientation or positional relationship shown on the basis of the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more, "a plurality" means two or more, and expressions such as "greater than", "less than" or "exceeding" are understood to exclude the stated number, while "above", "below" and "within" are understood to include it. Where "first" and "second" are described only for the purpose of distinguishing technical features, this is not to be understood as indicating or implying relative importance, implicitly indicating the number of the indicated technical features, or implicitly indicating the precedence of the indicated technical features.
In the description of the present invention, unless otherwise explicitly defined, terms such as arrangement, installation, connection and the like should be broadly construed, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the detailed contents of the technical solutions.
As shown in fig. 1, this embodiment provides a long and short interest sequence recommendation method. In the pre-training stage, after the interaction data of users and commodities is preprocessed, sampling is performed to construct the adjacency matrix of users and commodities; multiple graph convolutions are used to extract the multi-order information of commodities and users; after the initial embedding vectors of commodities and users are obtained, the user-commodity interaction sequence is constructed and input into a Transformer module to learn the user's long-term interest embedding vector. The short-term user behavior embedding vectors are obtained through a sequence extraction layer and input into a capsule network to obtain the user's k short-term interest embedding vectors. The k short-term interest embedding vectors, the long-term interest embedding vector and the commodity embedding vectors pass through an attention mechanism module to obtain the weight between a single commodity embedding vector and each user interest embedding vector, and the final user embedding vector is then obtained by weighting. The click probability of the interaction between the user embedding vector and the commodity embedding vector is calculated, and the top-k commodities are recommended. Finally, model training is supervised through the objective function BPR loss, and network parameters are learned through gradient backpropagation until convergence. The method comprises the following steps:
and S1, acquiring data containing the interaction sequence of the user and the commodity.
The user-commodity interaction data comprises the user-commodity interactions and the user-commodity interaction sequence data. The data is sampled and preprocessed, invalid data is filtered out, and the adjacency matrix of users and commodities is constructed from the interaction data; after normalization, the adjacency matrix is obtained as:
A = [ 0  R ; R^⊤  0 ],  Â = D^{-1/2} A D^{-1/2}
where Â ∈ R^{(n+m)×(n+m)} is the normalized adjacency matrix, n is the number of users and m the number of commodities; R ∈ R^{n×m} is the interaction matrix, D is the degree matrix, and N_u and N_i are the sets of adjacent points of users and commodities.
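As an illustrative sketch only (not part of the patent disclosure; the function and variable names are hypothetical), this normalization could be built with SciPy sparse matrices as follows:

```python
import numpy as np
import scipy.sparse as sp

def build_norm_adj(R: sp.csr_matrix) -> sp.csr_matrix:
    """Symmetrically normalize the user-commodity adjacency matrix.

    R is the (n x m) user-item interaction matrix; the result is the
    (n+m) x (n+m) matrix D^{-1/2} A D^{-1/2} with A = [[0, R], [R^T, 0]].
    """
    A = sp.bmat([[None, R], [R.T, None]], format="csr")   # block adjacency
    deg = np.asarray(A.sum(axis=1)).ravel().astype(np.float64)
    d_inv_sqrt = np.zeros_like(deg)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5      # isolated nodes keep weight 0
    return (sp.diags(d_inv_sqrt) @ A @ sp.diags(d_inv_sqrt)).tocsr()
```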
S2, constructing a graph of users and commodities according to the obtained data, inputting the constructed graph into a graph neural network, and obtaining the initial embedding vectors of users and commodities.
Referring to fig. 2, in the pre-training stage the user-commodity graph is input into the graph neural network. Through multi-layer graph convolution, the information between nodes is combined and gathered into the central node, so that the features of a node not only include its own characteristics but also fuse the features of the neighbors directly connected to it, which better expresses the initial embedding vectors of users and commodities.
The initial embedding vectors obtained by graph convolution are:
E^{(l)} = σ( (Â + I) E^{(l-1)} W_1^{(l)} + Â E^{(l-1)} ⊙ E^{(l-1)} W_2^{(l)} )
where E^{(l)} denotes the combined commodity and user embedding matrix after the l-th graph-convolution layer, and W_1^{(l)} and W_2^{(l)} are learnable parameter matrices. The term (Â + I) E^{(l-1)} W_1^{(l)} is equivalent to each node fusing its own information: the identity matrix I is added and multiplied with the initialized embedding matrix, aggregating the information of the user/commodity neighbors. The term Â E^{(l-1)} ⊙ E^{(l-1)} W_2^{(l)} blends in the relevance between users and commodities. Finally the two terms are added to update the embedding vectors of users and commodities.
The user embedding vectors obtained after each graph-convolution layer are e_u^{(0)}, e_u^{(1)}, …, e_u^{(L)}, and the commodity embedding vectors are e_i^{(0)}, e_i^{(1)}, …, e_i^{(L)}; the layers are spliced together as e_u = e_u^{(0)} ‖ e_u^{(1)} ‖ … ‖ e_u^{(L)} and e_i = e_i^{(0)} ‖ e_i^{(1)} ‖ … ‖ e_i^{(L)}.
Commodities: I = { i_1, i_2, …, i_m }, where i_n is the commodity embedding vector at the n-th position, 1 ≤ n ≤ m, and m is the total number of commodities.
Users: U = { u_1, u_2, …, u_L }, where u_t is the user embedding vector at the t-th position, 1 ≤ t ≤ L, and L is the total number of users.
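Under the assumption that the propagation rule above is implemented in PyTorch (the class and variable names are illustrative, not the patent's reference code), one graph-convolution layer could be sketched as:

```python
import torch
import torch.nn as nn

class GraphConvLayer(nn.Module):
    """One propagation layer of the pre-training GNN, following
    E' = sigma((A+I) E W1 + A E (*) E W2) from the text above."""

    def __init__(self, dim: int):
        super().__init__()
        self.W1 = nn.Linear(dim, dim)
        self.W2 = nn.Linear(dim, dim)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, adj: torch.Tensor, E: torch.Tensor) -> torch.Tensor:
        # adj: sparse normalized (n+m)x(n+m) adjacency; E: (n+m, dim) embeddings
        side = torch.sparse.mm(adj, E)       # A E: aggregate neighbor information
        fused = self.W1(side + E)            # (A + I) E W1: keep the node's own info
        affinity = self.W2(side * E)         # A E (*) E W2: user-commodity relevance
        return self.act(fused + affinity)

# The outputs of all layers are concatenated into the final representation,
# e.g. E_final = torch.cat([E0, E1, E2, E3], dim=1) for three layers.
```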
In the pre-training phase, for a user u_i, a positive-sample commodity i⁺ and a negative-sample commodity i⁻, the output prediction values require supervised training, and the training loss can be defined as:
L = -Σ_{(u,i,j)∈O} ln σ( ŷ_{ui} - ŷ_{uj} ) + λ‖Θ‖²₂
where O = { (u, i, j) | (u, i) ∈ R⁺, (u, j) ∈ R⁻ }, R⁺ is the set of observed samples and R⁻ the set of unobserved samples; σ is a nonlinear activation function; Θ denotes the learnable parameters, and L₂ regularization is used to reduce the overfitting of the model.
S3, learning the short-term user behavior embedding vector from the initial commodity embedding vectors.
The user-commodity interaction sequence is aggregated by mean-pooling, max-pooling or a similar operation to learn the short-term user behavior embedding vector.
The user-commodity interaction sequence, ordered by interaction time, is S^u = ( e^u_1, e^u_2, …, e^u_m ), where e^u_1 is the embedding vector of the item with ID 1 and m is the item ID. Mean-pooling is applied to the item embedding vectors of the user-commodity interaction sequence to finally obtain the short-term user behavior embedding vector z^n_m, where n is the user ID and m is the commodity ID.
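A minimal sketch of the sliding-window mean-pooling used later in the embodiment (window size 50), assuming PyTorch tensors; the function name is hypothetical.

```python
import torch

def short_term_behaviors(item_emb_seq: torch.Tensor, window: int = 50) -> torch.Tensor:
    """Mean-pool a user's time-ordered item embeddings into behavior vectors.

    item_emb_seq: (seq_len, dim) item embeddings ordered by interaction time.
    Returns (n_windows, dim): one short-term behavior vector per window position.
    """
    seq_len, _ = item_emb_seq.shape
    if seq_len <= window:
        return item_emb_seq.mean(dim=0, keepdim=True)
    # unfold yields (seq_len - window + 1, dim, window); average over the window
    return item_emb_seq.unfold(0, window, 1).mean(dim=-1)
```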
S4, learning K short-term user interest embedding vectors from the short-term user behavior embedding vectors.
The learned short-term user behavior embedding vectors are input into a capsule network; specifically, several short-term user behavior embedding vectors are input, and the number of capsules is determined dynamically by the number of the user's interactions:
K'_u = max( 1, min( K, log₂ |I_u| ) )
where K'_u is the dynamic number of user interests, obtained by taking the maximum of 1 and min( K, log₂ |I_u| ); |I_u| is the number of the user's interactions, and K is a manually set hyper-parameter.
The capsule network learns through several iterations of dynamic routing, in which low-level capsules approach high-level capsules, and finally obtains the short-term user interest embedding vectors. The formulas are as follows:
w_{ij} = softmax( b_{ij} )
where b_{ij} is randomly initialized, b_{ij} ~ N(0, σ²), and softmax( b_{ij} ) gives the weight value of each short-term behavior vector. Further, with z_i the short-term user behavior embedding vectors and j the user interest index, the behavior vectors are mapped through the bilinear mapping matrix W_{ij} and weighted and summed to obtain the candidate short-term user interest embedding vector:
s_j = Σ_i w_{ij} W_{ij} z_i
Then s_j is input into the squash nonlinear activation function to obtain the short-term user interest embedding vector v_j:
v_j = ( ‖s_j‖² / (1 + ‖s_j‖²) ) · ( s_j / ‖s_j‖ )
Then the interest embedding vector is dot-multiplied with the short-term user behavior embedding vectors, which can be regarded as their degree of correlation, and the result is added to the previous value to update b_{ij}:
b_{ij} ← b_{ij} + v_j^⊤ W_{ij} z_i
By iterating the above steps several times, the K short-term user interest embedding vectors are finally obtained.
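The routing loop above can be sketched as follows; this is an illustrative PyTorch implementation, with shapes and names assumed rather than taken from the patent.

```python
import torch
import torch.nn.functional as F

def dynamic_routing(z: torch.Tensor, W: torch.Tensor, k: int, iters: int = 3) -> torch.Tensor:
    """Extract k interest capsules from n behavior vectors by dynamic routing.

    z: (n, d) short-term user behavior embeddings.
    W: (k, d, d) bilinear mapping matrices, one per interest capsule.
    Returns (k, d) short-term user interest embeddings.
    """
    n = z.shape[0]
    u_hat = torch.einsum("kde,ne->nkd", W, z)     # W_ij z_i: map behaviors per capsule
    b = torch.randn(n, k)                         # routing logits b_ij ~ N(0, 1)
    for _ in range(iters):
        w = F.softmax(b, dim=1)                   # weight of each behavior per capsule
        s = torch.einsum("nk,nkd->kd", w, u_hat)  # weighted sum of mapped behaviors
        norm = s.norm(dim=1, keepdim=True)
        v = (norm ** 2 / (1 + norm ** 2)) * (s / (norm + 1e-9))  # squash
        b = b + torch.einsum("nkd,kd->nk", u_hat, v)             # agreement update
    return v
```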
S5, adding the user-commodity interaction sequence and the position embedding vectors, and inputting the sum into a Transformer module to obtain the long-term user interest embedding vector.
The position embedding vector is expressed as:
PE(t, 2l) = sin( t / 10000^{2l/d} ),  PE(t, 2l+1) = cos( t / 10000^{2l/d} )
where d is the dimension of the commodity embedding, t is the timestamp of the interaction, l is the index of the embedding dimension, i is the user ID, j is the commodity ID, and k is the dimension of the position embedding vector.
By utilizing the periodicity of trigonometric functions, semantic information at a specific moment can be captured, and the periodic relationship and relativity between time points can be described to generate the embedded representation. Different period sizes can be introduced, so that different periodic relationships can be extracted in different dimensions. Meanwhile, unlike pure position information, the time axis can extend indefinitely, and a representation based on trigonometric-function mapping adapts to timestamps of any size, so that times unseen during training can be handled at test time.
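Assuming the sine/cosine scheme above is applied directly to raw interaction timestamps (an assumption; the patent's figures are not reproduced here), a sketch of the encoding is:

```python
import torch

def time_encoding(timestamps: torch.Tensor, d: int) -> torch.Tensor:
    """Sinusoidal timestamp encoding: sin on even dims, cos on odd dims.

    timestamps: (seq_len,) interaction times; d: embedding dimension (even).
    Returns (seq_len, d) position/time encodings.
    """
    pos = timestamps.float().unsqueeze(1)              # (seq_len, 1)
    idx = torch.arange(0, d, 2, dtype=torch.float32)   # even dimension indices 2l
    freq = torch.pow(10000.0, idx / d)                 # 10000^{2l/d}
    pe = torch.zeros(timestamps.shape[0], d)
    pe[:, 0::2] = torch.sin(pos / freq)
    pe[:, 1::2] = torch.cos(pos / freq)
    return pe
```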
S6, merging the long-term user interest embedding vector and the user's initial embedding vector into the K short-term user interest embedding vectors to obtain K+1 user interest embedding vectors.
The long-term user interest embedding vector and the user's initial embedding vector are merged into the K short-term interest embedding vectors, finally giving K+1 user interest embedding vectors, expressed as:
V^n = [ v^n_1, v^n_2, …, v^n_{K+1} ]
where V^n denotes the K+1 interest embedding matrix of user n, n is the user ID, i is the index of a particular interest embedding vector, and v^n_i is an interest embedding vector.
S7, learning the weight of each interest embedding vector through the attention mechanism between the user interest embedding vectors and the commodity embedding vector, and constructing the final user embedding vector, expressed as:
u_i = V_i σ( V_i^⊤ I_j )    (1)
w_{ij} = exp( I_j^⊤ v_{ij} ) / Σ_{j'} exp( I_j^⊤ v_{ij'} ),  u_i = Σ_j w_{ij} v_{ij}    (2)
where σ is the softmax nonlinear activation function; I_j is the commodity embedding vector and V_i the user interest embedding matrix; u_i is the final user embedding vector obtained by the weighted summation of the short-term and long-term user interest embedding vectors.
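A compact sketch of this target-aware weighting (shapes and names are assumptions):

```python
import torch
import torch.nn.functional as F

def fuse_interests(V: torch.Tensor, item: torch.Tensor) -> torch.Tensor:
    """Fuse the K+1 interest vectors into the final user embedding.

    V: (k_plus_1, d) user interest embeddings (short-term, long-term, initial).
    item: (d,) candidate commodity embedding used as the attention target.
    Returns (d,): u = sum_j w_j * v_j with w = softmax(V @ item).
    """
    scores = V @ item              # inner product of each interest with the item
    w = F.softmax(scores, dim=0)   # attention weight per interest vector
    return w @ V                   # weighted sum of the interest vectors
```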
S8, obtaining the commodity prediction result from the inner product of the commodity embedding vector and the final user embedding vector.
In the pre-training phase, for a user u_i, a positive-sample commodity i⁺ and a negative-sample commodity i⁻, the output prediction values require supervised training, and the training loss can be defined as:
L = -Σ_{(u,i,j)∈O} ln σ( ŷ_{ui} - ŷ_{uj} ) + λ‖Θ‖²₂
O = { (u, i, j) | (u, i) ∈ R⁺, (u, j) ∈ R⁻ }
where R⁺ is the set of observed samples and R⁻ the set of unobserved samples; σ is a nonlinear activation function; Θ denotes the learnable parameters, and L₂ regularization is used to reduce the overfitting of the model.
S9, constructing an end-to-end model, and learning and updating the parameters with the training data.
The user-commodity interaction data passes through the pre-trained graph neural network part, the user behavior layer part, the multi-interest capsule network part and the Transformer that constructs the long-term user interest embedding vector, followed by the attention mechanism layer between the multiple interests and the commodity embedding vectors and the click-rate prediction part, forming an end-to-end attention-based long and short interest sequence recommendation algorithm; model parameter learning is performed on the training data set by the stochastic gradient descent method until the model converges.
The above method is explained in detail below with reference to the accompanying drawings and specific examples.
This embodiment provides an attention-based long and short interest sequence recommendation method. Invalid data is sampled and filtered, a graph of users and commodities is constructed, and it enters the pre-training module. The constructed user-commodity graph is input into a graph neural network, and after multi-layer convolution the initial embedding vectors of the user and commodity nodes, fused with the other nodes, are obtained. The obtained initial commodity embedding vectors are spliced with learnable position vectors and input into the user behavior layer, where the short-term user behavior embedding vectors are learned by mean-pooling, max-pooling or similar means; these are input into a capsule network, and K short-term user interest embedding vectors are learned through iterated dynamic routing. The user-commodity interaction sequence is input into a Transformer module to obtain the global user interest embedding vector. The global user interest embedding vector and the K short-term interest embedding vectors are spliced with the user's initial embedding vector to obtain K+1 user interest embedding vectors. The final user embedding vector is then constructed from the weight of each interest embedding vector given by the attention mechanism between the user interest embedding vectors and the commodity representation embedding vector. The commodity prediction result is obtained by the inner product of the commodity and user embedding vectors. An end-to-end model is constructed, and the parameters are learned and updated with the training data. The method specifically comprises the following steps:
Step 1, data preparation.
The data to prepare mainly includes the user-commodity interactions and their time information. Public data sets such as Taobao, Tmall or MovieLens-1M are downloaded; each sample comprises the commodities the user interacted with and the specific interaction time information. All users are traversed, users with fewer than 30 commodity interactions are removed, and invalid data is filtered out.
Step 2, the pre-training model learns the initial embedding vectors of users and commodities.
A graph of commodities and users is constructed, giving a dictionary-of-keys sparse matrix with adjacency dimensions (6040+3706, 6040+3706), which is then converted into a LIL sparse matrix. From the numbers of users and commodities a dictionary-of-keys sparse matrix of dimensions (6040, 3706) is built; this matrix is inserted by index into the adjacency matrix at the slice (0:6040, 6040:), and its transpose at the slice (6040:, 0:6040). The adjacency matrix is then converted back into a dictionary-of-keys sparse matrix, completing the construction of the user-commodity adjacency matrix.
R ∈ R^{n×m} is the interaction matrix, with n the number of users and m the number of commodities; the adjacency matrix of commodities and users is constructed from R as
A = [ 0  R ; R^⊤  0 ]
and normalization then gives Â = D^{-1/2} A D^{-1/2}, where D is the degree matrix and N_u and N_i are the sets of adjacent points of users and commodities. The degree matrix is multiplied with the adjacency matrix because nodes with a large degree would otherwise obtain large values in their feature representation and nodes with a small degree small values, which may cause vanishing or exploding gradients; this finally realizes the normalization of the adjacency matrix.
Further, the adjacency matrix needs a self-loop: an identity matrix of dimensions (9746, 9746) is created and added to the adjacency matrix. The plain aggregated representation of a node is the feature aggregation of its adjacent nodes and does not contain the node's own characteristics, so fusing in the identity matrix is equivalent to adding a closed loop that contributes each node's own attributes:
Â' = Â + I
further, inputting the user, the commodity and the constructed adjacency matrix into a pre-training model; first initializing initial embedded vectors of users (6040, 64) and commodities (3706, 64), and linear variation matrix dimensions (64, 64) and bias matrices (64, 1) in a graph neural network; the adjacent matrix based on the csr sparse matrix is changed into sparse tenar with the dimensions of (9746 ).
In each training process of the model, firstly, sampling a data sample; there are many ways to sample; 4060 users are randomly selected from 6040 users in the training set; traversing each user, and selecting a positive sample and a negative sample from the training set; and finally, obtaining a user id, and three lists of the positive sample and the negative sample commodity id corresponding to the user id after sampling.
Inputting the three lists into a neural network model of the graph; the adjacent matrix passes through a dropout layer (0.1), partial nodes are invalid, the effect is more favorable for test set generalization, and the influence of overfitting is reduced; the initial embedded vectors of the user and the goods are spliced column by column to form an overall initial embedded matrix, tenor, dimensions (9746, 64), which is included by a list.
Entering a first layer graph convolution network, carrying out matrix multiplication on an adjacent matrix (9748, 9746) and an integral initialization embedding vector matrix (9746, 64) to obtain an edge representation embedding matrix, aggregating information of all neighbors of a user to the user, a commodity to the commodity, the user and the commodity, passing the information through a layer of linear change layers, carrying out feature extraction, and obtaining a representation embedding matrix (9746, 64) summarizing neighbor information.
Furthermore, the information gathered by the neighbors is embedded into the node body of the user, the integral initialization embedded vector matrix and the expression embedded matrix for summarizing the neighbor information are subjected to point multiplication, so that the information gathered by the neighbors is gathered on the central node, and the delivery and aggregation of the commodity to the user and the commodity information from the user are realized, and the dimensions (9746 and 64); it is activated through linear change layers and nonlinearities.
Adding the information transmission from the user to the user and the information transmission from the commodity to the user, and putting a layer of graph convolution network into a list after dropout and regularization processing, wherein the obtained matrix is still dimension (9746, 64); after the graph convolution is carried out for multiple times, the user and the commodity can extract higher-order information, and the information representing the embedded vector is enriched;
The concrete calculation formula can be written as:
E^{(l)} = σ( (Â + I) E^{(l-1)} W_1^{(l)} + Â E^{(l-1)} ⊙ E^{(l-1)} W_2^{(l)} )
where E^{(l)} denotes the combined commodity and user embedding matrix after the l-th graph-convolution layer, and W_1^{(l)} and W_2^{(l)} are learnable parameter matrices. The first term is equivalent to each node integrating its own information: the identity matrix I is added and multiplied with the initialized embedding matrix, aggregating the information of the user/commodity neighbors. The second term blends in the relevance between users and commodities. Finally the two vectors are added to update the embedding vectors of users and commodities.
Finally, the user and commodity representation embedding matrices obtained by the successive graph convolutions are spliced to obtain the overall final embedding matrix after the multi-layer graph neural network; assuming 3 graph-convolution layers, its dimensions are (9746, 256).
The user embedding vectors obtained after each graph-convolution layer are e_u^{(0)}, e_u^{(1)}, …, e_u^{(L)}, the commodity embedding vectors are e_i^{(0)}, e_i^{(1)}, …, e_i^{(L)}, and all layers are spliced together: e_u = e_u^{(0)} ‖ … ‖ e_u^{(L)}, e_i = e_i^{(0)} ‖ … ‖ e_i^{(L)}.
Commodities: I = { i_1, i_2, …, i_m }, where i_n is the commodity embedding vector at the n-th position, 1 ≤ n ≤ m, and m is the total number of commodities.
Users: U = { u_1, u_2, …, u_L }, where u_t is the user embedding vector at the t-th position, 1 ≤ t ≤ L, and L is the total number of users.
The embedding matrices corresponding to users and commodities are obtained by splitting the overall final embedding matrix; the sampled user-id and commodity-id lists are then used to extract the embedding vectors of the users and of the positive- and negative-sample commodities.
Inner products are taken between the user embedding vectors and the positive- and negative-sample representation embedding vectors respectively, and the loss function is:
L = -Σ_{(u,i,j)∈O} ln σ( ŷ_{ui} - ŷ_{uj} ) + λ‖Θ‖²₂
where O = { (u, i, j) | (u, i) ∈ R⁺, (u, j) ∈ R⁻ }, R⁺ is the set of observed samples and R⁻ the set of unobserved samples; σ is a nonlinear activation function; Θ denotes the learnable parameters, and L₂ regularization is used to reduce the overfitting of the model.
Finally, model training is supervised through the BPR loss function, and the parameters are updated through gradient backpropagation. After the pre-training module finishes training, the learned user and commodity representation embedding vectors are input into the main model.
Step 3, inputting the initial commodity embedding vectors into the user behavior layer and learning the short-term user behavior embedding vectors.
1000 users are randomly extracted from the training set, giving a user-id sequence (1000,), user-commodity purchase sequences (1000, 50), user positive-sample sequences (1000, 1) and user negative-sample sequences (1000, 10); all of the above data are expanded by one dimension.
Commodities: I = { i_1, i_2, …, i_m }, where i_n is the commodity embedding vector at the n-th position, 1 ≤ n ≤ m, and m is the total number of commodities. Users: U = { u_1, u_2, …, u_L }, where u_t is the user embedding vector at the t-th position, 1 ≤ t ≤ L, and L is the total number of users.
Further, the user-commodity interaction sequences and the sequences of the numbers of commodities purchased by the users are input into the user behavior layer. A sliding window takes every 50 interactions of a user as one subsequence, giving n subsequences of interactions with different commodities, and the commodity representation embedding vectors in each subsequence are aggregated by mean-pooling into n short-term user behavior representation embedding vectors. The role of the behavior layer is to average the embedding vectors of 50 adjacent commodities of one user into one short-term behavior embedding vector; the sliding window then yields the n short-term user behavior embedding vectors.
The user-commodity interaction sequence, ordered by interaction time, is S^u = ( e^u_1, e^u_2, …, e^u_m ), where e^u_1 is the embedding vector of the item with ID 1 and m is the item ID. Mean-pooling over the item embedding vectors of the user-commodity interaction sequence finally gives the short-term user behavior embedding vector z^n_m, where n is the user ID and m is the commodity ID.
Step 4, inputting the short-term user behavior embedding vectors into the capsule network and learning the short-term user interest embedding vectors.
After the n short-term user behavior embedding vectors of a user are obtained, they are input into the capsule network, whose role is to aggregate the n input vectors into output vectors, acting as a clustering step.
Specifically, in this model the capsule network takes several short-term user behavior embedding vectors and extracts k short-term user interest embedding vectors through iterated dynamic routing.
The number of capsules is K'_u = max( 1, min( K, log₂ |I_u| ) ), where K'_u is the dynamic number of user interests, |I_u| is the number of the user's interactions, and K is a manually set hyper-parameter.
The capsule network learns through several iterations of dynamic routing, in which low-level capsules approach high-level capsules, and finally obtains the short-term user interest embedding vectors:
w_{ij} = softmax( b_{ij} )
where b_{ij} is randomly initialized, b_{ij} ~ N(0, σ²), and softmax( b_{ij} ) gives the weight value of each short-term behavior vector.
Further, with z_i the short-term user behavior embedding vectors and j the user interest index, the behavior vectors are mapped through the bilinear mapping matrix W_{ij} and weighted and summed to obtain the candidate short-term user interest embedding vector:
s_j = Σ_i w_{ij} W_{ij} z_i
Then s_j is input into the squash nonlinear activation function to obtain the short-term user interest embedding vector v_j:
v_j = ( ‖s_j‖² / (1 + ‖s_j‖²) ) · ( s_j / ‖s_j‖ )
Then the interest embedding vector is dot-multiplied with the short-term user behavior embedding vectors, which can be regarded as their degree of correlation, and the result is added to the previous value to update b_{ij}:
b_{ij} ← b_{ij} + v_j^⊤ W_{ij} z_i
By iterating the above steps several times, K short-term user interest representation embedding vectors are finally obtained.
Dynamic routing process: b_{ij} is first initialized; softmax( b_{ij} ) gives the weight of each short-term user behavior embedding vector; the several behavior embedding vectors are aggregated by weighted summation into one vector, which enters the squash nonlinear activation function to finally obtain a new embedding vector; this is then combined with the previous short-term user behavior embedding vectors to obtain the new b_{ij} values. Through several iterations, the n user behavior embedding vectors are finally aggregated into k short-term user interest embedding vectors that extract the characteristics of the user's behavior.
Step 5, learning the long-term user interest embedding vector from the user-commodity interaction sequence through the Transformer module.
Specifically, the user-commodity interaction sequence is processed by the Transformer module to obtain the user's long-term interest embedding vector, used as the user's global interest; there are two main steps.
Referring to fig. 3, the first is the mapping position embedding layer, which assigns position information to the commodity embedding vectors of the user-commodity interactions, so that time information is better retained inside the Transformer module and both long-term and short-term information is preserved when commodity features are extracted; each commodity embedding vector is given its time information as follows:
PE(t, 2l) = sin( t / 10000^{2l/d} ),  PE(t, 2l+1) = cos( t / 10000^{2l/d} )
where d is the dimension of the commodity embedding, t is the timestamp, i is the user ID, j is the commodity ID, and k is the dimension of the position embedding vector.
By utilizing the periodicity of trigonometric functions, semantic information at a specific moment can be captured, and the periodic relationship and relativity between time points can be described to generate the embedded representation. Different period sizes can be introduced, so that different periodic relationships can be extracted in different dimensions. Meanwhile, unlike pure position information, the time axis can extend indefinitely, and a representation based on trigonometric-function mapping adapts to timestamps of any size, so that times unseen during training can be handled at test time.
With d the dimension of the embedding, sin( t / 10000^{2l/d} ) lies in the range [-1, 1]; using different functions in different dimensions to build the position codes makes the high-dimensional representation space more meaningful, since each dimension of the code at different positions is assigned a different value, so every dimension carries some position information and different positions are encoded differently.
Then the representation embedding vectors with the time information merged into the commodities are input into the multi-head attention module, which learns the relationships between commodities well: the embedding vector of one commodity is obtained by the weighted summation of the other commodities through the query-key-value attention mechanism.
Step 6, combining the long-term and short-term user interest embedding vectors with an attention mechanism.
V^n = [ v^n_1, v^n_2, …, v^n_{K+1} ]
where V^n denotes the interest embedding matrix of user n, n is the user ID, i is the index of a particular interest embedding vector, and v^n_i is an interest embedding vector.
In the attention mechanism layer, the k short-term user interest embedding vectors and the long-term user interest embedding vector are weighted and summed together through the attention mechanism to obtain the final user embedding vector:
Q = E W^Q,  K = E W^K,  V = E W^V
Attention( Q, K, V ) = softmax( Q K^⊤ / √d_k ) V
where W^Q, W^K, W^V ∈ R^{d_model×k} are three globally shared trainable parameter matrices, d_model is the dimension of the commodity embedding vector, and k is the dimension of the parameter matrices.
After the query, key and value vectors of each commodity embedding vector are obtained, the query vector is matched against each key vector: the dot product of the two gives the correlation weight of query and key, a Softmax activation function turns these weights into the corresponding attention values, and the weighted summation updates each commodity embedding vector. Finally, the heads are concatenated and passed through the W^O parameter matrix:
MultiHead( Q, K, V ) = Concat( head_1, …, head_h ) W^O
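Under the assumption that a standard Transformer encoder layer realizes the query/key/value step above, and that the long-term interest vector is read out by pooling the encoder outputs (a choice not fixed by the text), a sketch is:

```python
import torch
import torch.nn as nn

class SeqEncoder(nn.Module):
    """Minimal self-attention encoder over the time-encoded item sequence."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(),
                                 nn.Linear(d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) item embeddings plus time encodings
        h, _ = self.attn(x, x, x)            # self-attention with Q = K = V = x
        x = self.norm1(x + h)                # residual connection + layer norm
        x = self.norm2(x + self.ffn(x))      # position-wise feed-forward block
        return x.mean(dim=1)                 # pooled long-term interest vector
```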
for k +1 user interest embedded vectors, our final goal is to synthesize the final user embedded vector; therefore, the k +1 user interest embedded vector passes through a Relu nonlinear activation function layer, the numerical values are all larger than 0, gradient propagation is prevented from disappearing, and gradient transfer is facilitated.
Inputting the k +1 user interest embedding vectors into an attention mechanism layer, and taking the k +1 user interest embedding vectors as keys; then, acquiring an embedded vector of the commodity according to the id value of the randomly sampled positive sample item; and similarly, the negative sample also obtains the commodity embedded vector of the negative sample.
The internal mechanism is that firstly, a commodity embedding vector is used as query, inner product is carried out on the commodity embedding vector and each interest embedding vector, then probability value is obtained through Softmax, and then the embedding vector of the user is obtained through weighted summation of the probability values.
Finally, the last output is used as a long-time interest embedding vector of the user and added into a short-time user interest representation embedding vector; and the initial embedded vectors of the users in the pre-training stage are added together, so that the embedded vectors of the short-time user interest are enriched.
The attention values are normalized with the Softmax function to obtain the attention probability distribution of each interest embedding vector:
w_{ij} = exp( I_j^⊤ v_{ij} ) / Σ_{j'} exp( I_j^⊤ v_{ij'} )
u_i = Σ_j w_{ij} v_{ij}
where I_j^⊤ v_{ij} is the attention value of each interest, and w_{ij} is the final attention weight of each interest vector, with as many entries as there are attention values and with its elements summing to 1; σ is the softmax nonlinear activation function; I_j is the commodity embedding vector and V_i the user interest embedding matrix; u_i is the final user embedding vector obtained by the weighted summation of the short-term and long-term user interest embedding vectors.
Step 7, the scoring prediction layer predicts the click rate.
The predicted value is obtained as the dot product of the commodity representation embedding vector and the user representation embedding vector:
ŷ_{ui} = u_i^⊤ e_i
BPR loss:
L = -Σ_{(u,i,j)∈O} ln σ( ŷ_{ui} - ŷ_{uj} ) + λ‖Θ‖²₂
where the loss is defined over O = { (u, i, j) | (u, i) ∈ R⁺, (u, j) ∈ R⁻ }, R⁺ is the set of observed samples and R⁻ the set of unobserved samples; sigmoid is the nonlinear activation function; Θ denotes the learnable parameters, and L₂ regularization is used to reduce the overfitting of the model.
Step 8, model training.
An end-to-end model is constructed, and the parameters are learned and updated with the training data, specifically:
First, in the pre-training stage, the user-commodity interaction data is preprocessed and sampled to construct the adjacency matrix of users and commodities, and multiple graph convolutions extract the multi-order information of commodities and users; after the initial embedding vectors of commodities and users are obtained, the user-commodity interaction sequence is constructed and input into the Transformer module to learn the user's long-term global interest representation embedding vector. Second, the short-term user behavior embedding vectors are obtained through the sequence extraction layer and input into the capsule network to obtain the user's k interest representation embedding vectors. The k short-term interest embedding vectors, the global interest embedding vector and the commodity embedding vectors pass through the attention mechanism module to obtain the weight between a single commodity embedding vector and each user interest embedding vector, and the final user embedding vector is then obtained by weighting. The interaction probability of the user embedding vector with the commodity embedding vector is calculated, and the top-k commodities are recommended. Finally, model training is supervised through the objective function BPR loss, and the network parameters are learned through gradient backpropagation until convergence.
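A schematic training loop for this end-to-end phase, reusing the bpr_loss sketch above; the model interface and data loader are illustrative placeholders, not the patent's implementation.

```python
import torch

def train(model, loader, epochs: int = 50, lr: float = 1e-3, reg: float = 1e-4):
    """Stochastic-gradient training on the BPR objective until convergence."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for epoch in range(epochs):
        total = 0.0
        for user_ids, seq, pos_ids, neg_ids in loader:
            # model returns final user, positive-item and negative-item embeddings
            u, pos, neg = model(user_ids, seq, pos_ids, neg_ids)
            loss = bpr_loss(u, pos, neg, list(model.parameters()), reg)
            opt.zero_grad()
            loss.backward()        # gradient backpropagation
            opt.step()
            total += loss.item()
        print(f"epoch {epoch}: mean loss {total / max(1, len(loader)):.4f}")
```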
The present embodiment further provides a long and short interest sequence recommendation apparatus, including:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of fig. 1.
The long and short interest sequence recommendation device of this embodiment can execute the long and short interest sequence recommendation method provided by the method embodiment of the present invention, in any combination of its implementation steps, and has the corresponding functions and beneficial effects of the method.
The embodiment of the application also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor to cause the computer device to perform the method illustrated in fig. 1.
The present embodiment further provides a storage medium storing an instruction or program capable of executing the long and short interest sequence recommendation method provided in the method embodiment of the present invention; when the instruction or program is executed, any combination of the steps of the method embodiment may be performed, with the corresponding functions and advantages of the method.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those of ordinary skill in the art will be able to practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, for example, as a sequential list of executable instructions that may be thought of as implementing logical functions, may be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that may fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Further, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following technologies, which are well known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A long and short interest sequence recommendation method is characterized by comprising the following steps:
acquiring data comprising a user and commodity interaction sequence;
constructing a graph of the user and the commodity according to the obtained data, and inputting the constructed graph into a graph neural network to obtain an initial embedded vector of the user and the commodity;
learning to obtain a user short-term behavior embedded vector according to the initial embedded vector of the commodity;
learning to obtain K short-term user interest embedded vectors according to the user short-term behavior embedded vectors;
adding the user and commodity interaction sequence and the position embedded vector, and inputting the result into a Transformer module to obtain a long-term user interest embedded vector;
merging the long-term user interest embedded vector and the initial embedded vector of the user into the K short-term user interest embedded vectors to obtain K +1 user interest embedded vectors;
learning the weight of each interest embedded vector through an attention mechanism between the user interest embedded vectors and the commodity embedded vectors, and constructing final user embedded vectors;
and obtaining a commodity prediction result according to the inner product of the commodity embedded vector and the final embedded vector of the user.
2. The long and short interest sequence recommendation method according to claim 1, wherein the method for constructing a graph of the user and the commodity according to the obtained data, inputting the constructed graph into a graph neural network, and obtaining an initial embedding vector of high-order semantics of the user and the commodity comprises:
constructing a graph of the user and the commodity according to the obtained data, and converging the information between the nodes into a central node through multilayer graph convolution so as to express an initial embedded vector of the user and the commodity;
the expression of the initial embedding vector obtained by graph convolution is as follows:

$$E^{(l)} = \mathrm{LeakyReLU}\!\left( \left(\mathcal{L} + I\right) E^{(l-1)} W_1^{(l)} + \mathcal{L} E^{(l-1)} \odot E^{(l-1)} W_2^{(l)} \right)$$

in the formula, E^{(l)} denotes the joint user-commodity embedding matrix obtained after l graph convolution layers; W_1^{(l)} and W_2^{(l)} are learnable parameter matrices; the term $(\mathcal{L} + I) E^{(l-1)}$ adds the identity matrix I, multiplied by the embedding matrix, so that each node integrates its own information while aggregating the information of its user/commodity neighbors; and the term $\mathcal{L} E^{(l-1)} \odot E^{(l-1)}$ incorporates the affinity between the user and the commodity.
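By way of illustration only, one propagation layer of this graph convolution could be sketched as follows (assuming PyTorch; GraphConvLayer and the use of a sparse Laplacian L are assumptions, not part of the claim):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphConvLayer(nn.Module):
    """One propagation layer of the graph convolution in claim 2 (a sketch).

    L is assumed to be the sparse Laplacian of the user-commodity graph."""
    def __init__(self, dim: int):
        super().__init__()
        self.W1 = nn.Linear(dim, dim, bias=False)   # W_1^(l)
        self.W2 = nn.Linear(dim, dim, bias=False)   # W_2^(l)

    def forward(self, L: torch.Tensor, E: torch.Tensor) -> torch.Tensor:
        side = torch.sparse.mm(L, E)    # L·E^(l-1): aggregate neighbor information
        self_agg = side + E             # (L + I)·E^(l-1): add each node's own information
        affinity = side * E             # L·E^(l-1) ⊙ E^(l-1): user-commodity affinity
        return F.leaky_relu(self.W1(self_agg) + self.W2(affinity))
```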
3. The long and short interest sequence recommendation method according to claim 1, wherein learning to obtain the user short-term behavior embedding vector according to the initial embedding vector of the commodity comprises:
the expression of the interaction sequence of the user and the commodity is as follows:

$$S_n = \left[ e_1, e_2, \ldots, e_m \right]$$

wherein S_n represents the interaction sequence of user n with commodities, ordered by interaction time; e_1 is the embedding vector of the commodity with ID 1, and m is the commodity ID;

mean-pooling is performed on the commodity embedding vectors of the user-commodity interaction sequence to obtain the short-term user behavior embedding vector

$$h_n = \frac{1}{m}\sum_{j=1}^{m} e_j$$

wherein n is the user ID and m is the commodity ID.
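For illustration, the mean-pooling step could be sketched as (assuming PyTorch; names are hypothetical):

```python
import torch

def short_term_behavior(seq_item_embs: torch.Tensor) -> torch.Tensor:
    """Mean-pooling over a user's time-ordered commodity embeddings.

    seq_item_embs: (m, d) embeddings of the interacted commodities
    returns:       (d,)  the short-term user behavior embedding
    """
    return seq_item_embs.mean(dim=0)
```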
4. The long and short interest sequence recommendation method according to claim 1, wherein learning K short-term user interest embedded vectors according to the user short-term behavior embedded vector comprises:
inputting the user short-term behavior embedded vector into a capsule network, and obtaining K user short-term interest embedded vectors through multiple iterations of dynamic routing, wherein K is a preset hyper-parameter.
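A minimal single-user sketch of dynamic routing in the style of multi-interest capsule networks (assuming PyTorch; the routing-matrix initialization and the three routing iterations are assumptions, not specified by the claim):

```python
import torch
import torch.nn.functional as F

def squash(z: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    """Capsule squash: keeps the direction, bounds the norm below 1."""
    n2 = (z * z).sum(-1, keepdim=True)
    return (n2 / (1.0 + n2)) * z / torch.sqrt(n2 + eps)

def dynamic_routing(behavior: torch.Tensor, K: int, iters: int = 3) -> torch.Tensor:
    """Sketch of dynamic routing producing K interest capsules for one user.

    behavior: (m, d) short-term behavior (commodity) embeddings
    returns:  (K, d) short-term user interest embeddings
    """
    m, d = behavior.shape
    S = torch.randn(K, d, d) * 0.01                    # bilinear routing matrices (assumed init)
    b = torch.zeros(K, m)                              # routing logits
    u_hat = torch.einsum('kde,me->kmd', S, behavior)   # capsule predictions, (K, m, d)
    for _ in range(iters):
        w = F.softmax(b, dim=0)                        # couple each behavior to the K capsules
        v = squash((w.unsqueeze(-1) * u_hat).sum(1))   # weighted sum, then squash -> (K, d)
        b = b + torch.einsum('kmd,kd->km', u_hat, v)   # update logits by agreement
    return v
```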
5. The long and short interest sequence recommendation method according to claim 1, wherein the expression of the position embedding vector is as follows:
$$PE_{(l,\,2i)} = \sin\!\left(l \,/\, 10000^{2i/d}\right), \qquad PE_{(l,\,2i+1)} = \cos\!\left(l \,/\, 10000^{2i/d}\right)$$

in the formula, PE_{(l,2i)} denotes the position embedding information of the user and the interacted commodity at even index values, PE_{(l,2i+1)} denotes the position embedding information at odd index values, and the argument $l / 10000^{2i/d}$ carries the timestamp (order) information of the commodity-user interaction; l represents the index value of the commodity and d is the dimension of the commodity embedding.
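An illustrative sketch of these sinusoidal position embeddings (assuming PyTorch and an even embedding dimension d; names are hypothetical):

```python
import torch

def position_embedding(seq_len: int, d: int) -> torch.Tensor:
    """Sinusoidal position embeddings per claim 5 (d assumed even).

    PE(l, 2i)   = sin(l / 10000^(2i/d))
    PE(l, 2i+1) = cos(l / 10000^(2i/d))
    """
    l = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)   # positions l
    i2 = torch.arange(0, d, 2, dtype=torch.float32)               # even index values 2i
    div = torch.pow(torch.tensor(10000.0), i2 / d)                # 10000^(2i/d)
    pe = torch.zeros(seq_len, d)
    pe[:, 0::2] = torch.sin(l / div)   # even index values: sine
    pe[:, 1::2] = torch.cos(l / div)   # odd index values: cosine
    return pe
```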
6. The long and short interest sequence recommendation method according to claim 1, wherein the K +1 user interest embedding vectors have the following expressions:
$$V_n = \left[ v_n^{1}, v_n^{2}, \ldots, v_n^{K+1} \right]$$

in the formula, V_n represents the K+1 interest embedding matrix of user n, n being the user ID and i the index of a particular interest embedding vector; v_n^{i} is an interest embedding vector;

the expression of the user final embedding vector is as follows:

$$w_{ij} = \sigma\!\left(I_j^{\top} V_i\right)$$

$$u_i = \sum_{i=1}^{K+1} w_{ij} V_i$$

wherein σ is the softmax nonlinear activation function; I_j is the commodity embedding vector and V_i is a user interest embedding vector; u_i is the final user embedding vector obtained by weighted summation of the short-term and long-term user interest embedding vectors, and w_ij is the weight value of each user interest embedding vector.
7. The method for recommending long and short interest sequences according to claim 1, wherein the obtaining of the result of commodity prediction according to the inner product of the commodity embedded vector and the final embedded vector of the user comprises:
for a user u_i, a positive-sample commodity i^{+} and a negative-sample commodity i^{-}, supervised training of the output prediction values is required, and the training loss is defined as follows:

$$L_{BPR} = -\sum_{(u,i,j)\in o} \ln \sigma\left(\hat{y}_{ui} - \hat{y}_{uj}\right) + \lambda \left\lVert \Theta \right\rVert_2^2$$

$$o = \{(u,i,j) \mid (u,i)\in R^{+},\ (u,j)\in R^{-}\}$$

in the formula, R^{+} is the set of observed samples and R^{-} the set of unobserved samples; σ is the sigmoid nonlinear activation function; Θ denotes the learnable parameters E^{(0)}; and $\hat{y}_{ui}$ represents the predicted preference score of the user's interaction with the commodity.
8. The long-short interest sequence recommendation method according to claim 1, further comprising the step of constructing an end-to-end model:
constructing, from the pre-trained graph neural network part, the user behavior layer part, the multi-interest capsule network part and the Transformer, the long-term user interest embedded vector, the attention mechanism layer between the multiple interests and the commodity embedded vector, and the click-through-rate prediction part, so as to form an end-to-end long- and short-term interest sequence recommendation algorithm;
and performing model parameter learning on the training data set by stochastic gradient descent until the model converges.
9. A long and short interest sequence recommendation device is characterized by comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-8.
10. A computer-readable storage medium, in which a program executable by a processor is stored, wherein the program executable by the processor is adapted to perform the method according to any one of claims 1 to 8 when executed by the processor.
CN202210575237.7A 2022-05-25 2022-05-25 Long-short interest sequence recommendation method, device and storage medium Active CN115099886B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210575237.7A CN115099886B (en) 2022-05-25 2022-05-25 Long-short interest sequence recommendation method, device and storage medium


Publications (2)

Publication Number Publication Date
CN115099886A true CN115099886A (en) 2022-09-23
CN115099886B CN115099886B (en) 2024-04-19

Family

ID=83288591

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210575237.7A Active CN115099886B (en) 2022-05-25 2022-05-25 Long-short interest sequence recommendation method, device and storage medium

Country Status (1)

Country Link
CN (1) CN115099886B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111932336A (en) * 2020-07-17 2020-11-13 重庆邮电大学 Commodity list recommendation method based on long-term and short-term interest preference
CN112990972A (en) * 2021-03-19 2021-06-18 华南理工大学 Recommendation method based on heterogeneous graph neural network
CN114528490A (en) * 2022-02-18 2022-05-24 哈尔滨工程大学 Self-supervision sequence recommendation method based on long-term and short-term interests of user
CN114519145A (en) * 2022-02-22 2022-05-20 哈尔滨工程大学 Sequence recommendation method for mining long-term and short-term interests of users based on graph neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
何昊晨; ***: "Deep graph neural network recommendation method based on multi-dimensional social relationship embedding", 计算机应用 (Journal of Computer Applications), vol. 40, no. 10, 10 October 2020 (2020-10-10), pages 2796 - 2800 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115374369A (en) * 2022-10-20 2022-11-22 暨南大学 News diversity recommendation method and device based on graph neural network
CN116562992A (en) * 2023-07-11 2023-08-08 数据空间研究院 Method, device and medium for recommending items for modeling uncertainty of new interests of user
CN116562992B (en) * 2023-07-11 2023-09-29 数据空间研究院 Method, device and medium for recommending items for modeling uncertainty of new interests of user

Also Published As

Publication number Publication date
CN115099886B (en) 2024-04-19

Similar Documents

Publication Publication Date Title
Zhuang et al. Representation learning via dual-autoencoder for recommendation
Pan et al. Study on convolutional neural network and its application in data mining and sales forecasting for E-commerce
CN110717098B (en) Meta-path-based context-aware user modeling method and sequence recommendation method
Liu et al. Detection of spam reviews through a hierarchical attention architecture with N-gram CNN and Bi-LSTM
CN115082147B (en) Sequence recommendation method and device based on hypergraph neural network
Biswas et al. A hybrid recommender system for recommending smartphones to prospective customers
CN115099886A (en) Long and short interest sequence recommendation method and device and storage medium
Kiran et al. OSLCFit (organic simultaneous LSTM and CNN Fit): a novel deep learning based solution for sentiment polarity classification of reviews
CN112100485B (en) Comment-based scoring prediction article recommendation method and system
Bouzidi et al. Deep learning-based automated learning environment using smart data to improve corporate marketing, business strategies, fraud detection in financial services, and financial time series forecasting
Khan et al. Comparative analysis on Facebook post interaction using DNN, ELM and LSTM
Fang Making recommendations using transfer learning
Hiriyannaiah et al. DeepLSGR: Neural collaborative filtering for recommendation systems in smart community
Li Cross‐Border E‐Commerce Intelligent Information Recommendation System Based on Deep Learning
Ren et al. A hierarchical neural network model with user and product attention for deceptive reviews detection
Liu Deep learning in marketing: a review and research agenda
Pughazendi et al. Graph sample and aggregate attention network optimized with barnacles mating algorithm based sentiment analysis for online product recommendation
Mawane et al. Unsupervised deep collaborative filtering recommender system for e-learning platforms
Jiang et al. A fusion recommendation model based on mutual information and attention learning in heterogeneous social networks
Vielma et al. Sentiment analysis with novel GRU based deep learning networks
Reddy et al. An approach for suggestion mining based on deep learning techniques
Shen et al. A deep embedding model for co-occurrence learning
Agarwal et al. Binarized spiking neural networks optimized with Nomadic People Optimization-based sentiment analysis for social product recommendation
CN116932862A (en) Cold start object recommendation method, cold start object recommendation device, computer equipment and storage medium
Zafar Ali Khan et al. Hybrid collaborative fusion based product recommendation exploiting sentiments from implicit and explicit reviews

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant