CN113761392B - Content recall method, computing device, and computer-readable storage medium - Google Patents

Content recall method, computing device, and computer-readable storage medium

Info

Publication number
CN113761392B
CN113761392B (application number CN202111072356.2A)
Authority
CN
China
Prior art keywords
content
vector
training
neural network
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111072356.2A
Other languages
Chinese (zh)
Other versions
CN113761392A (en)
Inventor
黄清纬
彭飞
唐文斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Renyimen Technology Co ltd
Original Assignee
Shanghai Renyimen Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Renyimen Technology Co ltd filed Critical Shanghai Renyimen Technology Co ltd
Priority to CN202111072356.2A
Publication of CN113761392A
Application granted
Publication of CN113761392B
Active legal status (Current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9014 Indexing; Data structures therefor; Storage structures hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a content recall method, a computing device, and a computer-readable storage medium. The method comprises the following steps: determining a training sample of a user based on a streaming log of a social platform; training a first neural network model based on the first training samples to generate a first training target vector; training a second neural network model based on the second training samples to produce a second training target vector; determining training outputs of the training samples based on the first and second training target vectors to update weight functions of the first and second neural network models, respectively; and calculating scores of a plurality of candidate contents based on the trained first neural network model and the trained second neural network model respectively to determine a recall result from the plurality of candidate contents.

Description

Content recall method, computing device, and computer-readable storage medium
Technical Field
The present invention relates generally to the field of machine learning, and more particularly to a content recall method, computing device, and computer-readable storage medium.
Background
With the continuous development of various social platforms, more and more users acquire all kinds of information through them. Generally, when a user logs in to a social platform, the platform's server recommends some content to the user and displays it on the user terminal. However, a social platform generates a huge amount of content at all times, while the amount of content a user terminal can present is limited; if the first few items in the head positions fail to arouse the user's interest, the user is likely to turn to other sections or other platforms. Therefore, displaying the content that best matches the user's interest in the limited head positions is an important means of improving the content recommendation effect.
At present, commonly seen information-flow recommendation scenarios, such as the feeds of Douyin and Kuaishou, and e-commerce products such as Taobao and JD.com, have a centralized characteristic: a small number of candidate contents are exposed to most users, so user behaviors easily accumulate on a single candidate item. In this case, the traditional personalization method, collaborative filtering, achieves such aggregation relatively easily.
On the basis of collaborative filtering, a user behavior sequence is further introduced to correct the user interest distribution: the user feature vector obtained through collaborative filtering is crossed with the feature vector of the user behavior sequence (computed as the sum, the average, or a neural-network mapping of the sequence), thereby obtaining a stronger personalized learning capability.
However, on some high-traffic social platforms, a huge amount of new content is generated within a short time interval, and the life cycle of each piece of new content is very short; for example, at least millions of new contents may be generated and released each day, and the life cycle of each content usually does not exceed two days. In this case, user behaviors are spread over a large number of candidate contents, so a single candidate content can hardly accumulate enough data; the traditional method then struggles to form effective collaborative filtering and cannot obtain effective feature vectors through it, which greatly increases the difficulty of learning user interests. For example, the inventors ran a simulation experiment in which content ranking and recommendation were performed by combining traditional collaborative filtering with user behavior sequence correction, and found that this approach learns only weak personalization and can hardly improve the recommendation effect at all.
Disclosure of Invention
In order to solve the above problems, the present invention provides a content recall scheme in which a content dense vector and a content sparse vector are spliced and trained through a neural network model to generate an interest feature vector of a user, and candidate contents are scored using the trained neural network model to determine a recall result from the candidate contents. In addition, a dedicated storage structure is designed to store and look up the interest feature vectors and to automatically delete outdated interest feature vectors. In these ways, the scheme of the invention solves the problem that user interests are difficult to learn in the recall stage on a high-traffic social platform in a decentralized scenario, and can provide more accurate content recommendation for the user.
According to one aspect of the present invention, a content recall method is provided. The method comprises the following steps: determining a training sample of a user based on a streaming log of a social platform, wherein the streaming log records user behaviors of the user on a plurality of contents of the social platform, the training sample comprises a first training sample and a second training sample, the first training sample comprises a first user dense vector and a first user behavior sequence vector, and the second training sample comprises a first content dense vector and a first content sparse vector; training a first neural network model based on the first training sample to generate a first training target vector, the first training target vector comprising a second user dense vector and a second user behavior sequence vector; training a second neural network model based on the second training sample to generate a second training target vector, the second training target vector comprising a second content dense vector and a second content sparse vector as interest feature vectors, the second neural network model having the same structure as the first neural network model; determining a training output of the training sample based on the first and second training target vectors to update weight functions of the first and second neural network models, respectively; and calculating scores of a plurality of candidate contents based on the trained first neural network model and the trained second neural network model, respectively, to determine a recall result from the plurality of candidate contents.
According to another aspect of the invention, a computing device is provided. The computing device includes: at least one processor; and at least one memory coupled to the at least one processor and storing instructions for execution by the at least one processor, the instructions when executed by the at least one processor causing the computing device to perform steps according to the above-described method.
According to yet another aspect of the present invention, a computer-readable storage medium is provided, having stored thereon computer program code, which when executed performs the method as described above.
In some embodiments, determining a training sample of a user based on a streaming log of a social platform comprises: obtaining a streaming log of the social platform; extracting a log feature set of the user from the streaming log, the log feature set comprising a user feature set of the user and a content feature set of a plurality of contents for which the user behavior is directed; determining the first user dense vector based on the user feature set; determining the first content dense vector based on the content feature set; randomly determining the first content sparse vector; obtaining a user behavior sequence, and determining the first user behavior sequence vector based on the first content dense vector of the content for which each user behavior is directed; splicing the first user dense vector and the first user behavior sequence vector into the first training sample; and splicing the first content dense vector and the first content sparse vector into the second training sample.
In some embodiments, wherein training a first neural network model based on the first training samples to generate a first training target vector comprises: inputting the first training sample into a first dense layer of the first neural network model to obtain the first training target vector.
In some embodiments, wherein training a second neural network model based on the second training samples to generate a second training target vector comprises: inputting the second training sample into a second dense layer of the second neural network model to obtain the second training target vector.
In some embodiments, wherein determining the training output of the training sample based on the first training target vector and the second training target vector comprises: performing a dot product operation on the first training target vector and the second training target vector to determine a training output of the training sample.
In some embodiments, wherein updating the weight functions of the first and second neural network models, respectively, based on the training output comprises: determining a score for the training output using an activation function; determining a gradient of the training output at a last layer of the first and second neural network models based on a score of the training output, a sample label of the training sample, and a loss function of the first and second neural network models; and updating the weight function of each layer of the first and second neural network models based on the gradient of the training output at the last layer of the first and second neural network models.
In some embodiments, wherein calculating scores for a plurality of candidate content based on the trained first and second neural network models, respectively, to determine recall results from the plurality of candidate content comprises: for each candidate content in the plurality of candidate contents, determining a score of the candidate content by using the trained activation functions of the first neural network model and the second neural network model; and determining a recall result from the plurality of candidate content based on the score for each of the plurality of candidate content.
In some embodiments, the method further comprises: and putting the second content dense vector and the second content sparse vector into an interest feature vector pool as the interest feature vector of the user, wherein the interest feature vector pool comprises a first-in first-out queue and a closed hash table, the first-in first-out queue comprises a list of a plurality of content IDs, the closed hash table comprises a plurality of entries, and each entry comprises a content ID as a key field and a content as a value field.
In some embodiments, wherein placing the second content dense vector and second content sparse vector as the interest feature vector of the user into an interest feature vector pool comprises: determining whether the first-in first-out queue is full; deleting a first content ID in the first-in-first-out queue and deleting an entry corresponding to the first content ID from the closed hash table if the first-in-first-out queue is full; adding the second content dense vector and the content ID corresponding to the second content sparse vector at the end of the first-in first-out queue and adding the content ID and the content corresponding to the second content dense vector and the second content sparse vector to the closed hash table; and if the first-in first-out queue is not full, directly adding the second content dense vector and the content ID corresponding to the second content sparse vector at the end of the first-in first-out queue and adding the content ID and the content corresponding to the second content dense vector and the second content sparse vector to the closed hash table.
Drawings
The invention will be better understood and other objects, details, features and advantages thereof will become more apparent from the following description of specific embodiments of the invention given with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of a system for implementing a content recall method according to an embodiment of the present invention.
FIG. 2 illustrates a flow diagram of a content recall method according to some embodiments of the invention.
FIG. 3 shows a flowchart of steps for determining training samples for a user, according to an embodiment of the invention.
Fig. 4 shows a schematic structural diagram of a first neural network model according to an embodiment of the present invention.
FIG. 5 shows a structural schematic diagram of a second neural network model, according to an embodiment of the invention.
FIG. 6 shows a flowchart of the steps of updating the weighting functions of the first and second neural network models, according to an embodiment of the present invention.
FIG. 7 shows a flowchart of steps for determining recall results from candidate content according to some embodiments of the present invention.
FIG. 8 shows a flowchart of the steps of placing an interest feature vector into a pool of interest feature vectors, according to an embodiment of the present invention.
FIG. 9 illustrates a block diagram of a computing device suitable for implementing embodiments of the present invention.
Detailed Description
Preferred embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
In the following description, for the purposes of illustrating various inventive embodiments, certain specific details are set forth in order to provide a thorough understanding of the various inventive embodiments. One skilled in the relevant art will recognize, however, that the embodiments may be practiced without one or more of the specific details. In other instances, well-known devices, structures and techniques associated with this application may not be shown or described in detail to avoid unnecessarily obscuring the description of the embodiments.
Throughout the specification and claims, the word "comprise" and variations thereof, such as "comprises" and "comprising," are to be understood as an open, inclusive meaning, i.e., as being interpreted to mean "including, but not limited to," unless the context requires otherwise.
Reference throughout this specification to "one embodiment" or "some embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least one embodiment. Thus, the appearances of the phrases "in one embodiment" or "in some embodiments" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the terms first, second and the like used in the description and the claims are used for distinguishing objects for clarity, and do not limit the size, other order and the like of the described objects.
Fig. 1 shows a schematic diagram of a system 1 for implementing a content recall method according to an embodiment of the invention. As shown in fig. 1, the system 1 includes a user terminal 10, a computing device 20, a server 30, and a network 40. User terminal 10, computing device 20, and server 30 may interact with data via network 40. Here, each user terminal 10 may be a mobile or fixed terminal of an end user, such as a mobile phone, a tablet computer, a desktop computer, or the like. The user terminal 10 may communicate with a server 30 of the social platform, for example, through a social platform application installed thereon, to send information to the server 30 and/or receive information from the server 30. The computing device 20 performs corresponding operations based on data from the user terminal 10 and/or the server 30. The computing device 20 may include at least one processor 210 and at least one memory 220 coupled to the at least one processor 210, the memory 220 having stored therein instructions 230 executable by the at least one processor 210, the instructions 230, when executed by the at least one processor 210, performing at least a portion of the method 100 as described below. Note that herein, computing device 20 may be part of server 30 or may be separate from server 30. The specific structure of computing device 20 or server 30 may be described, for example, in connection with FIG. 9, below.
FIG. 2 illustrates a flow diagram of a content recall method 100 according to some embodiments of the invention. The method 100 may be performed, for example, by the computing device 20 or the server 30 in the system 1 shown in fig. 1. The method 100 is described below in conjunction with fig. 1-9, with an example being performed in the computing device 20.
As shown in fig. 2, method 100 includes step 110, where computing device 20 may determine a training sample for a user based on a streaming log of a social platform. As previously described, the social platform may be a social platform with high traffic and short life cycle content, and thus, the streaming log may be a minute-level streaming log of the social platform in which user behavior of a plurality of users on a large amount of dispersed content is recorded. For a particular user, the streaming log records the user's behavior of the user with respect to a plurality of content on the social platform. Here, the content may refer to pictures, videos, texts, or a combination thereof uploaded by users of the social platform, and may also be referred to as "posts". The user behavior may include at least one of a click, a like, a comment, a favorite, a forward, and the like, for example. Further, the user behavior may also include not performing any user operation on the content.
In an aspect of the present invention, each training sample may include a first training sample and a second training sample, wherein the first training sample includes a first user dense vector and a first user behavior sequence vector, the second training sample includes a first content dense vector and a first content sparse vector, and different models are trained using the first training sample and the second training sample, as described in detail below.
FIG. 3 shows a flowchart of step 110 for determining training samples for a user, according to an embodiment of the invention.
As shown in fig. 3, step 110 may include sub-step 111, where computing device 20 may obtain a streaming log of the social platform. The streaming log may be stored, for example, in the server 30 of the system 1 shown in fig. 1 or in a database (not shown in the figure) associated with the server 30.
Next, at sub-step 112, computing device 20 may extract the user's log feature set from the streaming log obtained in sub-step 111. Here, the log feature set of the user includes a user feature set of the user and content feature sets of the plurality of contents for which the user's behaviors are directed. The user feature set may include attribute information of the user, such as age, gender, and city of the user. The content feature set may include attribute information of each content, such as the pictures, videos, and texts contained in the content, as well as its word count, language, author, and the like.
In sub-step 113, computing device 20 may determine a user dense vector (also referred to herein as a first user dense vector to distinguish it from the trained user dense vector) based on the user feature set. The user dense vector may be obtained, for example, by performing a splicing and dense-vectorization (dense) operation on the user features in the user feature set, and is used to indicate user features with a high occurrence probability (such as user name, gender, city to which the user belongs, and the like). Here, the dense operation refers to obtaining a fully connected transformation of the input vector, for example using the Dense function in TensorFlow.
In one embodiment, the user dense vector may be determined by equation (1) as follows:
e_u = dense(concat({e_f | f ∈ F_u}))    (1)
where e_u represents the user dense vector, e_f represents a user feature vector, F_u represents the user feature set, the concat() function concatenates multiple inputs (i.e., the user features in the user feature set), and the dense() function adjusts the dimension of the input x (i.e., concat({e_f | f ∈ F_u})).
Herein, the dense() function can generically be expressed as:
dense(x) = W·x + b    (2)
where W represents the weight of the dense() function, b represents the bias of the dense() function, and W and b are trainable parameters.
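By way of illustration only, the following is a minimal sketch of formulas (1) and (2) using the TensorFlow Keras API; the feature values, their dimensions, and the output width of 32 are hypothetical and not specified by this embodiment.

    import tensorflow as tf

    # Hypothetical user features, already vectorized (batch of one user).
    user_name_vec = tf.constant([[0.2, 0.7]])       # e_f for "user name"
    gender_vec    = tf.constant([[1.0, 0.0]])       # e_f for "gender"
    city_vec      = tf.constant([[0.1, 0.4, 0.5]])  # e_f for "city"

    # concat(): splice the user features in the user feature set F_u.
    concat_features = tf.concat([user_name_vec, gender_vec, city_vec], axis=-1)

    # dense(x) = W*x + b: a fully connected layer adjusts the dimension
    # of the concatenated input (output width 32 is an assumed example).
    dense_layer = tf.keras.layers.Dense(units=32)
    e_u = dense_layer(concat_features)  # user dense vector e_u, shape (1, 32)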
At sub-step 114, computing device 20 may determine a content dense vector (also referred to herein as a first content dense vector for purposes of differentiation from a trained content dense vector) based on the set of content features.
Similar to the user dense vector, the content dense vector may also be obtained by performing a splicing and dense-vectorization (dense) operation on the content features in the content feature set, and is used to indicate content features with a high occurrence probability (such as content length, number of likes, author gender, etc.). Generally, content features that have a high probability of sharing the same feature value within one training batch (Batch) are referred to as content dense features. For example, if within a training batch the content feature "author gender is female" appears with a very high probability across the whole batch, that content feature can be regarded as a content dense feature.
In one embodiment, the content dense vector may be determined by equation (3) as follows:
e_p = dense(concat({e_g | g ∈ F_p}))    (3)
where e_p represents the content dense vector, e_g represents a content feature vector, F_p represents the content feature set, the concat() function concatenates multiple inputs (i.e., the content features in the content feature set), and the dense() function adjusts the dimension of the input x (i.e., concat({e_g | g ∈ F_p})).
In sub-step 115, computing device 20 may randomly determine a content sparse vector (also referred to herein as a first content sparse vector for distinction from a trained content sparse vector).
As opposed to the content dense vector, the content sparse vector e_s is used to indicate content features with a low occurrence probability (e.g., content ID, content distribution city, content category tag, etc.). Generally, content features that have a low probability of sharing the same feature value within one training batch (Batch) are referred to as content sparse features. For example, within a training batch, the same content ID appears with a low probability across the whole batch, so this content feature can be used as a content sparse feature.
At sub-step 116, computing device 20 may obtain a sequence of user behaviors and determine a user behavior sequence vector (also referred to herein as a first user behavior sequence vector for the purpose of distinguishing from a trained user behavior sequence vector) based on a first content dense vector of content for which each user behavior is intended.
In one embodiment, the user behavior sequence vector may be determined by equation (4) as follows:
[Equation (4): e_r computed by aggregating the content dense vectors e_i of the contents in S_u]
where e_r represents the user behavior sequence vector, S_u represents the set of contents on which the user performed a user behavior, and e_i represents the content dense vector of a content on which the user performed a user behavior (i.e., the first content dense vector e_p described above).
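As an illustrative sketch only: the description allows the aggregation in equation (4) to be a sum, an average, or a neural-network mapping of the behavior sequence; the snippet below assumes average pooling over the content dense vectors of the behaved-on contents, with an assumed vector dimension of 32.

    import tensorflow as tf

    # Hypothetical content dense vectors e_i, one per content in S_u (|S_u| = 5).
    behaved_content_vectors = tf.random.normal(shape=(5, 32))

    # Assumed aggregation: average pooling over S_u (a sum would also fit the text).
    e_r = tf.reduce_mean(behaved_content_vectors, axis=0, keepdims=True)  # shape (1, 32)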
In sub-step 117, computing device 20 may splice the first user dense vector e_u and the first user behavior sequence vector e_r into a first training sample x_in1.
Here, the concat() function described above may be used to splice the first user dense vector e_u and the first user behavior sequence vector e_r, i.e.,
x_in1 = concat(e_u, e_r)    (5)
Similarly, in sub-step 118, computing device 20 may splice the first content dense vector e_p and the first content sparse vector e_s into a second training sample x_in2.
Here, the concat() function described above may be used to splice the first content dense vector e_p and the first content sparse vector e_s, i.e.,
x_in2 = concat(e_p, e_s)    (6)
Next, continuing with FIG. 2, at step 120, computing device 20 may train the first neural network model based on the first training sample x_in1 to generate a first training target vector.
Fig. 4 shows a schematic structural diagram of a first neural network model 400 according to an embodiment of the present invention. As shown in fig. 4, the first neural network model 400 may include an input layer 410, a hidden layer 420, and an output layer 430.
In the model training phase, the first neural network model 400 may take as input the plurality of first training samples x_in1 constructed as described above (each including the first user dense vector e_u and the first user behavior sequence vector e_r).
In the model use phase, the first neural network model 400 may take as input first input vectors constructed for a plurality of candidate contents in the same way as the first training sample x_in1, and output a first output target vector for each first input vector. The first neural network model 400 is described below mainly in terms of the model training phase.
In one embodiment, in step 120, computing device 20 may input the first training sample x_in1 through the input layer 410 into the hidden layer 420 of the first neural network model 400 to obtain the first training target vector.
In one embodiment, the hidden layer 420 may comprise a dense layer, and the first training target vector may be determined by equation (7) as follows:
x_out1 = dense(x_in1)    (7)
In other embodiments, the first training samples x_in1 are input in batch form (i.e., the training batch Batch described above), and thus the input X_in1 of the first neural network model 400 can be represented as a B×N matrix:
X_in1 = {x_in1}    (8)
where B is the number of first training samples x_in1 in the batch, and N is the length of each first training sample x_in1 (i.e., the sum of the lengths of the first user dense vector e_u and the first user behavior sequence vector e_r).
In this case, the first training target vector may be represented as:
X_out1 = dense(X_in1)    (9)
In one example, the hidden layer 420 may be an (N+1)×32 matrix (in addition to the length N of the first training sample x_in1, one dimension is included for the bias); the input X_in1 is matrix-multiplied with the hidden layer 420 to obtain a B×32 matrix X_out1.
Thus, one round of training the first neural network model 400 is equivalent, from the model's point of view, to
X_out1 = W_1 * X_in1 + b_1    (10)
where W_1 is the weight function of the first neural network model 400 and b_1 is the bias function of the first neural network model 400; the first neural network model 400 is trained so that its weight function W_1 and bias function b_1 are continuously updated toward a convergence value. Here, the initial value of the weight function W_1 may be set arbitrarily or empirically.
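The following sketch illustrates equations (8) to (10): a batch of first training samples X_in1 (a B×N matrix) passes through a single dense hidden layer to give a B×32 matrix X_out1. The values of B, N, and the layer width are assumed for illustration only.

    import tensorflow as tf

    B, N = 64, 96                     # assumed batch size and sample length
    X_in1 = tf.random.normal((B, N))  # batch of first training samples

    # Hidden layer 420: effectively an (N+1) x 32 parameter matrix
    # (N weights per output unit plus one bias dimension).
    user_tower = tf.keras.layers.Dense(units=32)  # X_out1 = W_1 * X_in1 + b_1
    X_out1 = user_tower(X_in1)                    # shape (B, 32)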
At step 130, computing device 20 may train the second neural network model based on the second training sample x_in2 to generate a second training target vector.
FIG. 5 shows a structural schematic diagram of a second neural network model 500, according to an embodiment of the present invention. The second neural network model 500 is substantially the same as the first neural network model 400 shown in fig. 4, and may also include an input layer 510, a hidden layer 520, and an output layer 530.
Similar to the first neural network model 400, in the model training phase the second neural network model 500 may take as input the plurality of second training samples x_in2 constructed as described above (each including the first content dense vector e_p and the first content sparse vector e_s) and output a second training target vector corresponding to each second training sample x_in2.
In the model use phase, the second neural network model 500 may take as input second input vectors constructed for a plurality of candidate contents in the same way as the second training sample x_in2, and output a second output target vector for each second input vector. The second neural network model 500 is described below mainly in terms of the model training phase.
In one embodiment, in step 130, computing device 20 may input the second training sample x_in2 through the input layer 510 into the hidden layer 520 of the second neural network model 500 to obtain the second training target vector.
In one embodiment, the hidden layer 520 may comprise a dense layer, and the second training target vector may be determined by equation (11) as follows:
X_out2 = dense(X_in2)    (11)
In other embodiments, the second training samples x_in2 are input in batch form (i.e., the training batch Batch described above), and thus the input X_in2 of the second neural network model 500 can be represented as a B×N matrix:
X_in2 = {x_in2}    (12)
where B is the number of second training samples x_in2 in the batch, and N is the length of each second training sample x_in2 (i.e., the sum of the lengths of the first content dense vector e_p and the first content sparse vector e_s); that is, the second training sample x_in2 and the first training sample x_in1 have the same length.
In this case, the second training target vector may be represented as:
X_out2 = dense(X_in2)    (13)
The hidden layer 520 has the same size as the hidden layer 420. Thus, in the case where the hidden layer 420 is an (N+1)×32 matrix, the hidden layer 520 should also be an (N+1)×32 matrix, and the input X_in2, after matrix multiplication with the hidden layer 520, yields a B×32 matrix X_out2.
Thus, one round of training the second neural network model 500 is equivalent, from the model's point of view, to
X_out2 = W_2 * X_in2 + b_2    (14)
where W_2 is the weight function of the second neural network model 500 and b_2 is the bias function of the second neural network model 500; the second neural network model 500 is trained so that its weight function W_2 and bias function b_2 are continuously updated toward a convergence value. Here, the initial value of the weight function W_2 may be set arbitrarily or empirically.
At step 140, computing device 20 may determine the training output x_out of the training sample based on the first training target vector x_out1/X_out1 and the second training target vector x_out2/X_out2, so as to update the weight functions W_1 and W_2 of the first neural network model 400 and the second neural network model 500, respectively.
In one embodiment, a dot product operation may be performed on the first training target vector x_out1/X_out1 and the second training target vector x_out2/X_out2 to determine the training output of the training sample. For example, the training output x_out can be expressed as:
x_out = x_out1 · x_out2    (15)
The result of the dot product of two vectors is a single number, so the training output x_out can be used to represent the score of the training sample.
Similarly, in the case of batch training, the row-wise dot product of the first training target vector X_out1 and the second training target vector X_out2 is a B×1 vector, each value of which represents the score of a corresponding one of the B training samples.
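A minimal sketch of equation (15) in batch form follows: the row-wise dot product of the two B×32 tower outputs gives one score per training sample. The shapes are assumed for illustration.

    import tensorflow as tf

    B = 64
    X_out1 = tf.random.normal((B, 32))  # first training target vectors
    X_out2 = tf.random.normal((B, 32))  # second training target vectors

    # Row-wise dot product: one scalar training output x_out per sample.
    X_out = tf.reduce_sum(X_out1 * X_out2, axis=1, keepdims=True)  # shape (B, 1)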
FIG. 6 shows a flowchart of step 140 of updating the weight functions W_1 and W_2 of the first neural network model 400 and the second neural network model 500, according to an embodiment of the present invention. Note that the updating processes for the weight functions W_1 and W_2 of the first and second neural network models 400 and 500 are substantially the same, so the following description takes the update of the weight function W_1 of the first neural network model 400 as an example.
As shown in FIG. 6, step 140 may include sub-step 142, in which computing device 20 determines a score of the training output x_out using an activation function.
In one embodiment, the activation function may be a Sigmoid function. The Sigmoid function can be expressed as:
Sigmoid(x) = 1 / (1 + e^(-x))    (16)
As previously described, the training output x_out is a numerical value representing the score of the corresponding training sample, and processing the training output x_out with the activation function makes the result better suited to binary classification.
For example, the score of the training output x_out can be determined by the following formula (17):
S = Sigmoid(x_out)    (17)
In sub-step 144, computing device 20 may determine the gradient of the training output x_out at the last layer of the first neural network model 400 and the second neural network model 500 based on the score S of the training output x_out, the sample label of the training sample x_in, and the loss functions of the first neural network model 400 and the second neural network model 500.
Here, the same loss function may be set for the first neural network model 400 and the second neural network model 500, and it may be a mean square error loss function or a cross-entropy loss function. Sub-step 144 is described below taking a cross-entropy loss function as an example; the combination of the cross-entropy loss function and the Sigmoid activation function works well and can avoid the drawback caused by the dispersion of the Sigmoid activation function. Those skilled in the art will appreciate that the same inventive concept is equally applicable to the mean square error loss function.
In one embodiment, the cross-entropy Loss function Loss may be expressed as:
Loss = -[y · log(S) + (1 - y) · log(1 - S)]    (18)
where y is the sample label of the training sample, i.e., it indicates whether the training output x_out should be 1. Here, the sample label is determined according to whether the user performed a user behavior. For example, if the user performed a click operation on a content, the sample label of the training sample corresponding to that content may be set to 1; if the content was presented but not clicked by the user, the sample label of the training sample corresponding to that content may be set to 0. In other words, such a sample label reflects the click-through of the content.
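A sketch of formulas (16) to (18): the training outputs are squashed by the Sigmoid function and compared with the click labels using binary cross-entropy. The numeric values of the outputs and labels below are hypothetical.

    import tensorflow as tf

    x_out = tf.constant([[2.3], [-1.1], [0.4]])  # training outputs (dot products)
    y     = tf.constant([[1.0], [0.0], [1.0]])   # labels: clicked = 1, shown but not clicked = 0

    S = tf.sigmoid(x_out)  # formula (17): S = Sigmoid(x_out)

    # Binary cross-entropy loss, averaged over the batch (formula (18)).
    loss = tf.reduce_mean(-(y * tf.math.log(S) + (1.0 - y) * tf.math.log(1.0 - S)))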
In some embodiments of the present invention, a backpropagation algorithm is used to update the weight function W_m and bias function b_m of each layer (m = 1, 2, ..., M) of the first and second neural network models 400 and 500, where M is the number of layers of the first and second neural network models 400 and 500. Although only one hidden layer 420, 520 is shown in FIGS. 4 and 5 (i.e., M = 1), those skilled in the art will appreciate that the first and second neural network models 400 and 500 may contain more hidden layers, or the hidden layers themselves may have more complex hierarchies.
Thus, in sub-step 144, the gradients ∂Loss/∂W_M and ∂Loss/∂b_M at the last layer (the Mth layer) of the first and second neural network models 400 and 500 may be determined based on the cross-entropy loss function Loss and the weight function W_M and bias function b_M of the Mth layer.
next, at sub-step 146, computing device 20 may output x based on the trainingoutThe weight function of each of the plurality of layers of the first and second neural network models 400 and 500 is updated for the gradient at the last layer of the first and second neural network models 400 and 500.
Specifically, the gradient of the mth layer based on the first and second neural network models 400 and 500 may be determined using any one of a Batch (Batch), mini-Batch (mini-Batch), or stochastic gradient descent method
Figure BDA0003260885700000153
And
Figure BDA0003260885700000154
the gradients of the M-1 st layer, the M-2 nd layer and the … … 1 st layer are determined in turn, and the weight function W of each layer is used for the layerm(and bias function bm) And (6) updating.
The above operation of step 140 is repeated based on the preset iteration step size until the maximum number of iterations or the threshold for stopping iteration is reached. At this point, the weight functions W (and bias functions b) of the first and second neural network models 400 and 500 have been trained to convergence values and can be used to score new content.
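The following is only an illustrative training step, assuming the two towers are Keras Dense layers and using TensorFlow's automatic differentiation in place of a hand-written backpropagation; the optimizer choice and learning rate are assumptions, not part of this embodiment.

    import tensorflow as tf

    user_tower = tf.keras.layers.Dense(32)  # first neural network model (hidden layer 420)
    item_tower = tf.keras.layers.Dense(32)  # second neural network model (hidden layer 520)
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)  # assumed optimizer

    def train_step(X_in1, X_in2, y):
        with tf.GradientTape() as tape:
            X_out1 = user_tower(X_in1)
            X_out2 = item_tower(X_in2)
            x_out = tf.reduce_sum(X_out1 * X_out2, axis=1, keepdims=True)  # dot product
            loss = tf.reduce_mean(
                tf.keras.losses.binary_crossentropy(y, tf.sigmoid(x_out)))
        variables = user_tower.trainable_variables + item_tower.trainable_variables
        grads = tape.gradient(loss, variables)            # backpropagated gradients
        optimizer.apply_gradients(zip(grads, variables))  # update W_1, b_1, W_2, b_2
        return loss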
Continuing with FIG. 2, at step 150, computing device 20 may calculate scores for a plurality of candidate content based on the trained first and second neural network models 400 and 500, respectively, to determine recall results from the candidate content.
FIG. 7 shows a flowchart of step 150 for determining recall results from candidate content according to some embodiments of the present invention.
As shown in fig. 7, step 150 may include sub-step 152, in which computing device 20 may determine, for each candidate content of the plurality of candidate contents, a score of the candidate content using the activation functions of the trained first and second neural network models 400 and 500. Here, the candidate contents may be new contents that are continually generated on the social platform over time. The candidate contents may be massive, and the purpose of the recall described herein is to preliminarily select some contents from them as the recall result. Further, the present invention may be combined with subsequent ranking or recommendation methods to obtain more accurate content recommendation results.
The process of determining the score of a candidate content is substantially the same as the process of determining the score S of the training output x_out described above in step 140, and thus is not described in detail.
Next, at sub-step 154, computing device 20 may determine a recall result from the plurality of candidate content based on the score of each of the candidate content.
For example, the computing device 20 may rank the scores of the candidate contents in order from high to low, and take the contents with the highest scores as the recall result, which may be directly displayed on the display screen of the user terminal 10, or may be combined with a subsequent recommendation or ranking method to further obtain and display a more accurate recommendation result.
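An illustrative sketch of sub-steps 152 and 154 follows: a user-side output vector is scored against every candidate's item-tower output using the activation function, and the highest-scoring candidates are recalled. The tower layers, candidate count, input dimension, and K are all assumed stand-ins for the trained models.

    import tensorflow as tf

    # Stand-ins for the trained first and second neural network models (assumed shapes).
    user_tower = tf.keras.layers.Dense(32)
    item_tower = tf.keras.layers.Dense(32)
    K = 100                                                    # assumed recall size

    user_vec = user_tower(tf.random.normal((1, 96)))           # first output target vector
    candidate_vecs = item_tower(tf.random.normal((5000, 96)))  # vectors of 5000 candidate contents

    # Score each candidate with the activation function, then keep the K highest scores.
    scores = tf.sigmoid(tf.reduce_sum(user_vec * candidate_vecs, axis=1))  # shape (5000,)
    top_scores, top_indices = tf.math.top_k(scores, k=K)       # recall result indices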
As previously mentioned, in social platforms with high-traffic and short-life-cycle content, the content updates too quickly to effectively train out features of interest. In this case, some embodiments of the present invention also design a specific interest feature vector pool to store and update the obtained interest feature vectors.
Specifically, after the training output x_out is obtained in step 140, or alternatively after the second training target vector is obtained in step 130, the method 100 may further include step 160, in which the computing device 20 puts the second content dense vector and the second content sparse vector, as the interest feature vectors of the user, into an interest feature vector pool. The interest feature vector pool comprises a first-in first-out queue and a closed hash table. The first-in first-out queue includes a list of a plurality of content IDs, and the closed hash table includes a plurality of entries, each entry including a content ID as a key field and a content as a value field.
FIG. 8 shows a flowchart of the step 160 of placing the interest feature vector into the interest feature vector pool according to an embodiment of the present invention.
As shown in fig. 8, step 160 may include sub-step 162, where computing device 20 may determine whether the fifo queue is full.
If it is determined that the fifo queue is full (yes determination of sub-step 162), then, in sub-step 164, computing device 20 may delete the first content ID in the fifo queue and delete the entry corresponding to the first content ID from the closed hash table.
Then, in sub-step 166, computing device 20 may add, at the end of the first-in first-out queue, the content ID corresponding to the second content dense vector and the second content sparse vector generated in step 130, and add the content ID and the content corresponding to the second content dense vector and the second content sparse vector to the closed hash table.
On the other hand, if it is determined that the first-in first-out queue is not full (determination of sub-step 162 is "no"), computing device 20 may proceed directly to sub-step 166, adding the content ID corresponding to the second content dense vector and the second content sparse vector at the end of the first-in first-out queue and adding the content ID and the corresponding content to the closed hash table.
In this case, in the updating process of step 140, at each iteration, the latest interest feature vector may be taken from the interest feature vector pool as the first content dense vector to generate a new training sample.
In this way, outdated interest feature vectors can be automatically eliminated, and interest feature vectors can be quickly looked up and computed under the high traffic of millions of daily active users, so that the model has a high-precision personalized learning capability.
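A minimal Python sketch of the interest feature vector pool described in step 160 is shown below; the capacity and the stored value type are assumptions, and a plain dict stands in for the closed (open-addressing) hash table.

    from collections import deque

    class InterestFeaturePool:
        """First-in first-out pool of interest feature vectors keyed by content ID."""

        def __init__(self, capacity=1_000_000):  # assumed capacity
            self.capacity = capacity
            self.fifo = deque()   # first-in first-out queue of content IDs
            self.table = {}       # stands in for the closed hash table: content ID -> content vectors

        def put(self, content_id, dense_vec, sparse_vec):
            # Sub-steps 162/164: if the queue is full, evict the oldest entry.
            if len(self.fifo) >= self.capacity:
                oldest_id = self.fifo.popleft()
                self.table.pop(oldest_id, None)
            # Sub-step 166: append the new content ID and store its vectors.
            self.fifo.append(content_id)
            self.table[content_id] = (dense_vec, sparse_vec)

        def get(self, content_id):
            return self.table.get(content_id)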
FIG. 9 illustrates a block diagram of a computing device 900 suitable for implementing embodiments of the present invention. Computing device 900 may be, for example, computing device 20 or server 30 as described above.
As shown in fig. 9, computing device 900 may include one or more Central Processing Units (CPUs) 910 (only one shown schematically) that may perform various suitable actions and processes in accordance with computer program instructions stored in a Read Only Memory (ROM)920 or loaded from a storage unit 980 into a Random Access Memory (RAM) 930. In the RAM 930, various programs and data required for operation of the computing device 900 may also be stored. The CPU 910, ROM 920, and RAM 930 are connected to each other via a bus 940. An input/output (I/O) interface 950 is also connected to bus 940.
A number of components in computing device 900 are connected to I/O interface 950, including: an input unit 960 such as a keyboard, a mouse, etc.; an output unit 970 such as various types of displays, speakers, and the like; a storage unit 980 such as a magnetic disk, optical disk, or the like; and a communication unit 990 such as a network card, a modem, a wireless communication transceiver, or the like. The communication unit 990 allows the computing device 900 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.
The method 100 described above may be performed, for example, by the CPU 910 of the computing device 900 (e.g., computing device 20 or server 30). For example, in some embodiments, the method 100 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 980. In some embodiments, some or all of the computer program can be loaded and/or installed onto computing device 900 via ROM 920 and/or communication unit 990. When the computer program is loaded into RAM 930 and executed by CPU 910, it may perform one or more operations of the method 100 described above. Further, the communication unit 990 may support wired or wireless communication functions.
Those skilled in the art will appreciate that the computing device 900 shown in FIG. 9 is merely illustrative. In some embodiments, computing device 20 or server 30 may contain more or fewer components than computing device 900.
A content recall method 100 and a computing device 900 that may be used as computing device 20 or server 30 in accordance with the present invention are described above in connection with the figures. However, it will be appreciated by those skilled in the art that the performance of the steps of the method 100 is not limited to the order shown in the figures and described above, but may be performed in any other reasonable order. Further, the computing device 900 need not include all of the components shown in FIG. 9, it may include only some of the components necessary to perform the functions described in the present invention, and the manner in which these components are connected is not limited to the form shown in the figures.
The present invention may be methods, apparatus, systems and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therein for carrying out aspects of the present invention.
In one or more exemplary designs, the functions described herein may be implemented in hardware, software, firmware, or any combination thereof. For example, if implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The units of the apparatus disclosed herein may be implemented using discrete hardware components, or may be integrally implemented on a single hardware component, such as a processor. For example, the various illustrative logical blocks, modules, and circuits described in connection with the invention may be implemented or performed with a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both.
The previous description of the invention is provided to enable any person skilled in the art to make or use the invention. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the present invention is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A content recall method, comprising:
determining a training sample of a user based on a streaming log of a social platform, wherein the streaming log records user behaviors of the user on a plurality of contents of the social platform, the training sample comprises a first training sample and a second training sample, the first training sample comprises a first user dense vector and a first user behavior sequence vector, and the second training sample comprises a first content dense vector and a first content sparse vector;
training a first neural network model based on the first training samples to generate a first training target vector, the first training target vector comprising a second user dense vector and a second user behavior sequence vector;
training a second neural network model based on the second training samples to generate a second training target vector, the second training target vector comprising a second content dense vector and a second content sparse vector as interest feature vectors;
determining training outputs of the training samples based on the first and second training target vectors to update weight functions of the first and second neural network models, respectively; and
calculating scores of a plurality of candidate contents based on the trained first neural network model and the trained second neural network model respectively to determine a recall result from the plurality of candidate contents,
wherein training a first neural network model based on the first training samples to generate a first training target vector comprises: inputting the first training sample into a first dense layer of the first neural network model to obtain the first training target vector;
wherein training a second neural network model based on the second training samples to generate a second training target vector comprises: inputting the second training sample into a second dense layer of the second neural network model to obtain the second training target vector;
wherein determining a training output for the training sample based on the first training target vector and the second training target vector comprises: performing a dot product operation on the first training target vector and the second training target vector to determine a training output of the training sample.
2. The method of claim 1, wherein determining a training sample for a user based on a streaming log of a social platform comprises:
obtaining a streaming log of the social platform;
extracting a log feature set of the user from the streaming log, the log feature set comprising a user feature set of the user and a content feature set of a plurality of contents for which the user behavior is directed;
determining the first user dense vector based on the set of user features;
determining the first content dense vector based on the set of content features;
randomly determining the first content sparse vector;
obtaining a user behavior sequence, and determining a first user behavior sequence vector based on a first content dense vector of content for which each user behavior is directed;
splicing the first user dense vector and the first user behavior sequence vector into the first training sample; and
stitching the first content dense vector and the first content sparse vector into the second training sample.
3. The method of claim 1, wherein updating the weight functions of the first and second neural network models, respectively, based on the training output comprises:
determining a score for the training output using an activation function;
determining a gradient of the training output at a last layer of the first and second neural network models based on a score of the training output, a sample label of the training sample, and a loss function of the first and second neural network models; and
updating a weight function for each layer of the first and second neural network models based on a gradient of the training output at a last layer of the first and second neural network models.
4. The method of claim 1, wherein calculating scores for a plurality of candidate content based on the trained first and second neural network models, respectively, to determine recall results from the plurality of candidate content comprises:
for each candidate content in the plurality of candidate contents, determining a score of the candidate content by using the trained activation functions of the first neural network model and the second neural network model; and
determining a recall result from the plurality of candidate content based on the score for each of the plurality of candidate content.
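A minimal sketch of the candidate scoring in claim 4 follows, assuming PyTorch and a fixed top-k cutoff for selecting the recall result; the claim only states that the recall result is determined from the per-candidate scores.

```python
# Sketch of claim 4 (PyTorch assumed; the fixed top-k cutoff is an
# assumption about how the recall result is chosen from the scores).
import torch

def recall_candidates(user_tower, content_tower, first_sample,
                      candidate_samples, candidate_ids, k=100):
    with torch.no_grad():
        u = user_tower(first_sample)             # shape: (dim,)
        c = content_tower(candidate_samples)     # shape: (num_candidates, dim)
        scores = torch.sigmoid(c @ u)            # activation of the dot products
    k = min(k, scores.numel())
    top = torch.topk(scores, k)
    # Return the highest-scoring candidate IDs with their scores.
    return [(candidate_ids[i], scores[i].item()) for i in top.indices.tolist()]
```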
5. The method of claim 1, further comprising:
putting the second content dense vector and the second content sparse vector into an interest feature vector pool as the interest feature vector of the user, wherein the interest feature vector pool comprises a first-in first-out queue and a closed hash table, the first-in first-out queue comprises a list of a plurality of content IDs, the closed hash table comprises a plurality of entries, and each entry comprises a content ID as a key field and the corresponding content as a value field.
6. The method of claim 5, wherein placing the second content dense vector and second content sparse vector as the user's interest feature vector into an interest feature vector pool comprises:
determining whether the first-in first-out queue is full;
if the first-in first-out queue is full, deleting the first content ID at the head of the first-in first-out queue and deleting the entry corresponding to that content ID from the closed hash table;
adding the content ID corresponding to the second content dense vector and the second content sparse vector to the tail of the first-in first-out queue, and adding that content ID and the corresponding content to the closed hash table; and
if the first-in first-out queue is not full, directly adding the content ID corresponding to the second content dense vector and the second content sparse vector to the tail of the first-in first-out queue, and adding that content ID and the corresponding content to the closed hash table.
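Claims 5 and 6 describe the interest feature vector pool as a bounded first-in first-out queue of content IDs paired with a closed hash table keyed by content ID. The sketch below substitutes Python's deque and dict for the queue and the hash table, which is an assumption made for brevity rather than a faithful closed-hashing implementation.

```python
# Sketch of the interest feature vector pool in claims 5-6.
# collections.deque plays the FIFO queue of content IDs; a plain dict
# stands in for the closed hash table (an assumption for brevity).
from collections import deque

class InterestFeaturePool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.fifo = deque()   # list of content IDs, oldest first
        self.table = {}       # content ID (key field) -> content (value field)

    def put(self, content_id, content_dense_vec, content_sparse_vec):
        # If the FIFO queue is full, evict the oldest content ID and
        # remove its entry from the hash table.
        if len(self.fifo) >= self.capacity:
            oldest = self.fifo.popleft()
            self.table.pop(oldest, None)
        # Append the new content ID at the tail of the queue and store
        # the corresponding vectors as the entry's value field.
        self.fifo.append(content_id)
        self.table[content_id] = (content_dense_vec, content_sparse_vec)

    def get(self, content_id):
        return self.table.get(content_id)
```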
7. A computing device, comprising:
at least one processor; and
at least one memory coupled to the at least one processor and storing instructions for execution by the at least one processor, the instructions when executed by the at least one processor causing the computing device to perform the steps of the method of any of claims 1-6.
8. A computer readable storage medium having stored thereon computer program code which, when executed, performs the method of any of claims 1 to 6.
CN202111072356.2A 2021-09-14 2021-09-14 Content recall method, computing device, and computer-readable storage medium Active CN113761392B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111072356.2A CN113761392B (en) 2021-09-14 2021-09-14 Content recall method, computing device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111072356.2A CN113761392B (en) 2021-09-14 2021-09-14 Content recall method, computing device, and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN113761392A CN113761392A (en) 2021-12-07
CN113761392B true CN113761392B (en) 2022-04-12

Family

ID=78795383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111072356.2A Active CN113761392B (en) 2021-09-14 2021-09-14 Content recall method, computing device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN113761392B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113761378B (en) * 2021-09-14 2022-04-08 上海任意门科技有限公司 Content ordering method, computing device and computer-readable storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10504004B2 (en) * 2016-09-16 2019-12-10 General Dynamics Mission Systems, Inc. Systems and methods for deep model translation generation
CN110751261B (en) * 2018-07-23 2024-05-28 第四范式(北京)技术有限公司 Training method and system and prediction method and system for neural network model
EP3637351A1 (en) * 2018-10-10 2020-04-15 Sandvine Corporation System and method for predicting and reducing subscriber churn
CN109816607A (en) * 2019-01-22 2019-05-28 北京师范大学 Two-dimensional optical fiber spectral image alignment method based on a multilayer feedforward neural network
CN110503152B (en) * 2019-08-26 2022-08-26 北京迈格威科技有限公司 Two-way neural network training method and image processing method for target detection
US10873533B1 (en) * 2019-09-04 2020-12-22 Cisco Technology, Inc. Traffic class-specific congestion signatures for improving traffic shaping and other network operations
CN111062775B (en) * 2019-12-03 2023-05-05 中山大学 Recommendation system recall method based on attention mechanism
CN112287238B (en) * 2020-12-30 2021-04-09 腾讯科技(深圳)有限公司 User characteristic determination method and device, storage medium and electronic equipment
CN113220974B (en) * 2021-05-31 2024-06-07 北京爱奇艺科技有限公司 Click rate prediction model training and search recall method, device, equipment and medium

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107491518A (en) * 2017-08-15 2017-12-19 北京百度网讯科技有限公司 Search recall method and apparatus, server, and storage medium
CN108985165A (en) * 2018-06-12 2018-12-11 东南大学 Video copy detection system and method based on convolutional and recurrent neural networks
CN109376242A (en) * 2018-10-18 2019-02-22 西安工程大学 Text classification algorithm based on recurrent neural network variants and convolutional neural networks
CN111444395A (en) * 2019-01-16 2020-07-24 阿里巴巴集团控股有限公司 Method, system and device for obtaining inter-entity relation expressions, and advertisement recall system
CN109829399A (en) * 2019-01-18 2019-05-31 武汉大学 Automatic classification method for vehicle-mounted road scene point clouds based on deep learning
CN109960759A (en) * 2019-03-22 2019-07-02 中山大学 Recommender system click-through rate prediction method based on deep neural networks
CN110162700A (en) * 2019-04-23 2019-08-23 腾讯科技(深圳)有限公司 Information recommendation and model training method, apparatus, device, and storage medium
CN110232109A (en) * 2019-05-17 2019-09-13 深圳市兴海物联科技有限公司 Internet public opinion analysis method and system
CN112949885A (en) * 2019-12-10 2021-06-11 顺丰科技有限公司 Method and device for predicting delivery mode, storage medium and electronic device
CN111310040A (en) * 2020-02-11 2020-06-19 腾讯科技(北京)有限公司 Artificial intelligence based recommendation method and device, electronic device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Simulation of hidden-layer models for two-stream network action recognition; Liu Songquan et al.; Computer Simulation; 2019-08-15; Vol. 36, No. 8; 394-398 *
Improved fuzzy decision tree algorithm based on neural networks; Zhang Min et al.; Computer Engineering and Applications; 2021-04-02; Vol. 57, No. 21; 174-179 *

Also Published As

Publication number Publication date
CN113761392A (en) 2021-12-07

Similar Documents

Publication Publication Date Title
CN107193792B (en) Method and device for generating article based on artificial intelligence
CN109101620B (en) Similarity calculation method, clustering method, device, storage medium and electronic equipment
US10592607B2 (en) Iterative alternating neural attention for machine reading
CN107220386A (en) Information-pushing method and device
CN105022754B (en) Object classification method and device based on social network
WO2022062523A1 (en) Artificial intelligence-based text mining method, related apparatus, and device
CN111581926B (en) Document generation method, device, equipment and computer readable storage medium
CN107145485B (en) Method and apparatus for compressing topic models
CN109992676B (en) Cross-media resource retrieval method and retrieval system
JP2017054214A (en) Determination device, learning device, information distribution device, determination method, and determination program
JPWO2012096388A1 (en) Unexpectedness determination system, unexpectedness determination method, and program
CN112506864B (en) File retrieval method, device, electronic equipment and readable storage medium
CN111557000A (en) Accuracy determination for media
CN113761392B (en) Content recall method, computing device, and computer-readable storage medium
CN113688310A (en) Content recommendation method, device, equipment and storage medium
JP2017201535A (en) Determination device, learning device, determination method, and determination program
CN111813993A (en) Video content expanding method and device, terminal equipment and storage medium
US11328218B1 (en) Identifying subjective attributes by analysis of curation signals
CN113449200B (en) Article recommendation method and device and computer storage medium
CN113761378B (en) Content ordering method, computing device and computer-readable storage medium
Bendouch et al. A visual-semantic approach for building content-based recommender systems
US20220318253A1 (en) Search Method, Apparatus, Electronic Device, Storage Medium and Program Product
CN116881432A (en) Text pushing method, text pushing device, electronic equipment and storage medium
CN115878761A (en) Event context generation method, apparatus, and medium
CN113988201B (en) Multi-mode emotion classification method based on neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant