CN112085093A - Training method and device of collaborative filtering model, readable medium and system - Google Patents

Training method and device of collaborative filtering model, readable medium and system

Info

Publication number
CN112085093A
Authority
CN
China
Prior art keywords
sample
negative
training
samples
sample buffer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010936001.2A
Other languages
Chinese (zh)
Inventor
Yao Quanming (姚权铭)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd filed Critical 4Paradigm Beijing Technology Co Ltd
Priority to CN202010936001.2A
Publication of CN112085093A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Pure & Applied Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Evolutionary Biology (AREA)
  • Accounting & Taxation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Operations Research (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and an apparatus for training a collaborative filtering model, a readable medium and a system are provided. The training method comprises the following steps: providing a sample buffer, wherein the sample buffer stores a portion of the samples in a sample space, the portion of samples including S1 negative samples, where S1 is a positive integer; and, in each iterative training process of at least part of the iterative training processes for training the collaborative filtering model: updating the sample buffer and then selecting the negative samples participating in the collaborative filtering model training from the updated sample buffer, or selecting the negative samples participating in the collaborative filtering model training from the sample buffer and then updating the sample buffer. The disclosed technical scheme can sample high-quality negative samples, thereby realizing efficient and robust collaborative filtering training.

Description

Training method and device of collaborative filtering model, readable medium and system
Technical Field
The present disclosure relates to the field of computer application technologies, and in particular, to a method and an apparatus for training a collaborative filtering model, a readable medium, and a system.
Background
Recommendation systems are widely used in a variety of scenarios. For example, an e-commerce website may use a recommender system to provide merchandise information and suggestions to customers, help users decide what products to purchase, and simulate sales personnel in helping customers complete the purchase process. Personalized recommendation recommends information and goods of interest to a user according to the user's interest characteristics and purchasing behavior. Objects that may be recommended include merchandise, advertisements, news, music, and the like.
Collaborative Filtering (CF) is a key technique for personalized recommendation systems, focusing on learning user preferences from observed user-item interactions. Current recommendation systems also focus on implicit user feedback, such as purchases on an e-commerce website or views on an online video platform. Implicit user feedback is easier to collect than explicit user feedback such as ratings. In the implicit CF setting, only what the user clicked on can be observed; what the user did not click on does not necessarily mean the user dislikes it, but such feedback data cannot be obtained. Therefore, when performing CF it is necessary to rank the known, observed (user, item) pairs ahead of the unknown pairs. However, since the number of unobserved pairs is huge, sampling a part of such (user, item) pairs is a necessary step for implicit CF model training, so finding proper negative samples is a very critical problem in the training of a CF model. Any unlabeled or unobserved pair can serve as a negative sample, so the number of potential negative samples is huge; how to select effective negative samples not only affects the efficiency of the CF algorithm, but effective negative samples can also significantly improve the effectiveness of the CF model.
In the prior art, negative samples are mostly extracted by uniform (random) sampling, and the CF model is trained on the extracted negative samples. However, with the uniform sampling approach adopted in the prior art, the extracted negative samples tend to be of poor quality, i.e., the evaluation function assigns them low scores.
Disclosure of Invention
According to an exemplary embodiment of the present disclosure, there is provided a training method of a collaborative filtering model, which may include: providing a sample buffer, wherein the sample buffer stores a portion of the samples in a sample space, the portion of samples including S1 negative samples, where S1 is a positive integer; and, in each iterative training process of at least part of the iterative training processes for training the collaborative filtering model: updating the sample buffer and then selecting the negative samples participating in the collaborative filtering model training from the updated sample buffer, or selecting the negative samples participating in the collaborative filtering model training from the sample buffer and then updating the sample buffer.
The sample space may include a positive sample space and a negative sample space, the samples therein being user-item interaction pairs, and the collaborative filtering model may be used to recommend corresponding items to users.
The step of setting the sample buffer may comprise: uniformly sampling S1 negative samples from the negative sample space and placing them in the sample buffer; or selecting, based on an evaluation function, S1 more highly scored negative samples from the negative sample space and placing them in the sample buffer.
The step of updating the sample buffer may comprise: uniformly sampling S2 negative samples from a specified sample space, where S2 is a positive integer; selecting S1 negative samples from the S1 negative samples already in the sample buffer and the sampled S2 negative samples; and updating the sample buffer with the selected S1 negative samples.
The step of selecting S1 negative samples from the S1 negative samples already in the sample buffer and the sampled S2 negative samples may include: for each positive sample in a batch sampled from the positive sample space, calculating the scores of the S1+S2 negative samples using an evaluation function; and selecting S1 negative samples from the S1+S2 negative samples according to the scores of the S1+S2 negative samples.
The step of selecting S1 negative samples according to the scores of the S1+S2 negative samples may include: selecting, from the S1+S2 negative samples, the top S1 negative samples with the highest scores; or, for each of the S1+S2 negative samples, calculating an extraction probability based on the scores of the S1+S2 negative samples, and extracting S1 negative samples in turn from the S1+S2 negative samples according to the extraction probabilities corresponding to the S1+S2 negative samples.
The step of selecting negative samples to participate in the collaborative filtering model training may include: calculating the prediction variance of each negative sample in the sample buffer; and selecting the negative sample with high prediction variance from the sample buffer as the negative sample participating in the collaborative filtering model training according to the calculated prediction variance of each negative sample.
The step of calculating the prediction variance of each negative sample in the sample buffer may comprise: obtaining the scores of the negative samples in the sample buffer at different training times, or the probabilities that the negative samples are positive samples; and calculating the prediction variance of each negative sample based on its scores at different training times or its probabilities of being a positive sample.
According to another exemplary embodiment of the present disclosure, there is provided a training apparatus of a collaborative filtering model, which may include: an initialization module configured to set a sample buffer, wherein the sample buffer stores a portion of the samples in a sample space, the portion of samples including S1 negative samples, where S1 is a positive integer; and a training module configured to, during each iterative training process of at least part of the iterative training processes for training the collaborative filtering model: update the sample buffer and then select the negative samples participating in the collaborative filtering model training from the updated sample buffer, or select the negative samples participating in the collaborative filtering model training from the sample buffer and then update the sample buffer.
The sample space may include a positive sample space and a negative sample space, the samples therein being user-item interaction pairs, and the collaborative filtering model may be used to recommend corresponding items to users.
The initialization module may uniformly sample S1 negative samples from the negative sample space and place them in the sample buffer; or it may select, based on an evaluation function, S1 more highly scored negative samples from the negative sample space and place them in the sample buffer.
The training module may uniformly sample S2 negative samples from a specified sample space, where S2 is a positive integer; select S1 negative samples from the S1 negative samples already in the sample buffer and the sampled S2 negative samples; and update the sample buffer with the selected S1 negative samples.
The training module may, for each positive sample in a batch sampled from the positive sample space, calculate the scores of the S1+S2 negative samples using an evaluation function, and select S1 negative samples from the S1+S2 negative samples according to the scores of the S1+S2 negative samples.
The training module may select, from the S1+S2 negative samples, the top S1 negative samples with the highest scores; or it may, for each of the S1+S2 negative samples, calculate an extraction probability based on the scores of the S1+S2 negative samples, and extract S1 negative samples in turn from the S1+S2 negative samples according to the extraction probabilities corresponding to the S1+S2 negative samples.
The training module may calculate the prediction variance of each negative sample in the sample buffer, and select negative samples with high prediction variance from the sample buffer as the negative samples participating in the collaborative filtering model training according to the calculated prediction variances.
The training module may obtain the scores of the negative samples in the sample buffer at different training times, or the probabilities that the negative samples are positive samples, and then calculate the prediction variance of each negative sample based on those scores or probabilities.
According to another exemplary embodiment of the present disclosure, a computer-readable storage medium storing instructions is provided, wherein the instructions, when executed by at least one computing device, cause the at least one computing device to perform the training method as previously described.
According to another exemplary embodiment of the present disclosure, a system is provided comprising at least one computing device and at least one storage device storing instructions, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the training method as previously described.
By applying the training method and the training device according to the exemplary embodiment of the disclosure, high-quality negative samples can be effectively sampled, and the robustness of the collaborative filtering model is enhanced, so that the accuracy and the efficiency of recommendation are improved.
Drawings
The above and other aspects, features and advantages of particular embodiments of the present disclosure will become more apparent from the following description when taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart illustrating a collaborative filtering model training method according to an exemplary embodiment of the present disclosure;
fig. 2 is a block diagram illustrating a collaborative filtering model training apparatus according to an exemplary embodiment of the present disclosure.
Throughout the drawings, it should be noted that the same reference numerals are used to designate the same or similar elements, features and structures.
Detailed Description
The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of the embodiments of the disclosure as defined by the claims and their equivalents. Various specific details are included to aid understanding, but these are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. In addition, descriptions of well-known functions and constructions are omitted for clarity and conciseness.
In the present disclosure, "at least one of the plurality of items" means a case where three types of parallel operations including "any one of the plurality of items", "a combination of any plurality of the plurality of items", and "the entirety of the plurality of items" are included. For example, "include at least one of a and B" includes the following three cases in parallel: (1) comprises A; (2) comprises B; (3) including a and B. For another example, "at least one of the first step and the second step is performed", which means that the following three cases are juxtaposed: (1) executing the step one; (2) executing the step two; (3) and executing the step one and the step two.
Training an implicit collaborative filtering model generally involves three main choices: a scoring function r, an objective function L, and a negative sampling distribution p_ns. The scoring function r(p_u, q_i; β), with learnable parameters β, computes the association/interaction between user u and item i based on the embedding vector p_u ∈ R^F of user u ∈ U and the embedding vector q_i ∈ R^F of item i ∈ I. The scoring function may be chosen as, among others, Matrix Factorization (MF), a Multi-Layer Perceptron (MLP), or a Graph Neural Network (GNN). The larger the value of r(p_u, q_i; β), the more interested user u is in item i. For simplicity, the association between user u and item i computed by the scoring function r(p_u, q_i; β) is denoted r_ui.
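As an illustration only, a minimal matrix-factorization scoring function of this kind could be sketched as follows (the class name MFScorer, the embedding dimension, and the dot-product form are assumptions for illustration, not the patent's own implementation):

import numpy as np

class MFScorer:
    """Minimal matrix-factorization scoring function: r(p_u, q_i; beta) = p_u . q_i."""

    def __init__(self, num_users, num_items, dim=64, seed=0):
        rng = np.random.default_rng(seed)
        # User embeddings p_u and item embeddings q_i, both in R^F with F = dim.
        self.P = 0.1 * rng.standard_normal((num_users, dim))
        self.Q = 0.1 * rng.standard_normal((num_items, dim))

    def score(self, u, items):
        """Return the scores r_ui for one user u and an array of item indices."""
        return self.Q[np.asarray(items)] @ self.P[u]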
Each observed interaction between a user u and an item i ∈ R_u may be considered a positive sample, while the remaining unobserved interactions, i.e., the pairs (u, j) with j ∈ I∖R_u, are candidate negative samples. The probability that (u, j) is a negative sample can be represented by the following equation (1):

P_neg(j|u,i) = sigmoid(r_ui − r_uj)    (1)
where P_neg(j|u,i) is close to 1 when r_ui is far greater than r_uj. In other words, when learning user preferences in implicit CF, more attention is paid to the pairwise relation between an observed interaction (u, i) and an unobserved interaction (u, j) than to the absolute values of r_ui and r_uj.
The learning objective may be formulated as minimizing the following loss function (i.e., an objective function), which may be expressed as the following equation (2):

L(β) = − Σ_{u∈U, i∈R_u} E_{j∼p_ns(j|u)} [ log sigmoid(r_ui − r_uj) ]    (2)

where the negative samples (u, j) are sampled according to a given distribution p_ns(j|u); typically p_ns(j|u) is taken to be a uniform distribution. Learning the above objective is equivalent to maximizing the likelihood of observing the pairwise relation r_ui > r_uj. Other objective functions for the implicit CF problem may be used instead of the above objective function.
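For concreteness, equations (1) and (2) can be written in code roughly as follows (a sketch that reuses the hypothetical MFScorer above and assumes p_ns(j|u) is uniform; it is not the patent's implementation):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_neg(scorer, u, i, j):
    """Equation (1): probability that (u, j) is a negative sample relative to (u, i)."""
    r_ui = scorer.score(u, [i])[0]
    r_uj = scorer.score(u, [j])[0]
    return sigmoid(r_ui - r_uj)

def bpr_loss(scorer, triples):
    """Equation (2) evaluated on a batch of (u, i, j) triples with j ~ p_ns(j|u)."""
    losses = []
    for u, i, j in triples:
        r_ui = scorer.score(u, [i])[0]
        r_uj = scorer.score(u, [j])[0]
        losses.append(-np.log(sigmoid(r_ui - r_uj) + 1e-12))
    return float(np.mean(losses))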
Uniform (random) sampling is the most commonly used negative sampling method. The prior art also discloses using a Generative Adversarial Network (GAN) in place of uniform sampling to achieve a better embedding effect. One of the most important observations here is that the scores of most negative samples (the scores obtained from the scoring function) are small. Therefore, under uniform sampling, the drawn negative samples are likely to have small scores, so that CF model training can only use low-quality negative samples and the gradient vanishes too quickly during training, which terminates the training of the CF model algorithm prematurely. GAN addresses this problem by training, alongside the original model, a generator constructed from a neural network. Specifically, the generator gives a score distribution over negative samples, and negative samples are drawn based on this distribution. The generator can model the distribution of negative samples well, and can thus effectively provide negative samples with higher scores, avoiding the problem of uniform sampling. However, although GAN can draw more highly scored negative samples, it cannot robustly handle false negative samples, because false negative samples may also have high scores. In addition, since the selection of negative samples is a discrete optimization problem, reinforcement learning must be used to train GAN-based CF models; because reinforcement learning itself is very unstable and slow to train, GAN-based approaches are also difficult to train.
Furthermore, a problem neglected by prior work is that there is no need to model the complex overall distribution of negative samples. Especially in the later stages of training, only negative samples with large scores are useful, and such samples account for only a small proportion of all negative samples. GAN models the distribution as a whole, but in doing so spends a large amount of time and parameters on low-scoring negative samples that are not useful. GAN-based models are therefore inefficient.
In view of the above, the present disclosure focuses on designing a simpler and more robust negative sampling method, considering the following three aspects:
1. How to capture the dynamic distribution of true negative samples with a simple model. In the implicit CF problem, true negative samples are hidden, together with false negative samples, in a large amount of unobserved/unlabeled data. While negative samples in other fields follow a skewed distribution and can be modeled by simple models, it is not known whether such prior art can be applied to implicit CF problems, where only true negative samples are expected to be used.
2. How to reliably measure the quality of negative samples. Given the risk of introducing false negative samples, a more reliable way to measure the quality of negative samples is needed, i.e., a discrimination criterion that can accurately identify high-quality true negative samples.
3. How to efficiently sample high-quality true negative samples. Although general machine learning methods (such as sample re-weighting methods) can learn useful information from unlabeled and noisy data, these methods are not suitable for implicit CF, because the huge number of unlabeled user-item interactions requires efficient modeling.
Based on the above problems and analysis, the present disclosure proposes a buffer-based model to capture the dynamic distribution of true negative samples in a simple way. For example, a buffer is used to retain only high-scoring negative samples, and high-quality negative samples are then selected from this buffer, so that the CF model can be learned efficiently and quickly. Specifically, a buffer mechanism is established for each user; during model training the buffer is dynamically updated as the algorithm iterates, and negative samples are drawn from the buffer. Meanwhile, negative samples with high variance are considered to be of high quality and more effective, so negative samples with high variance are drawn when extracting negative samples from the buffer, thereby avoiding the noise of false negative samples.
The CF model training scheme of the present disclosure is developed against the foregoing background; specific implementations are described in the following embodiments.
Fig. 1 is a flowchart illustrating a collaborative filtering model training method according to an exemplary embodiment of the present disclosure. The CF model training method of this embodiment may be executed by a CF model training apparatus, which may be an electronic device having a physical entity. In addition, the training apparatus of the CF model of this embodiment may also be an integrated software application.
Referring to fig. 1, in step S101, a sample buffer is provided, wherein the sample buffer stores a portion of the samples in a sample space, the portion of samples including S1 negative samples, where S1 is a positive integer. The sample space may include a positive sample space and a negative sample space, and the samples in the sample space are user-item interaction pairs.
When initializing the sample buffer, S1 negative samples may be uniformly sampled from the negative sample space and placed in the sample buffer. Alternatively, S1 more highly scored negative samples may be chosen from the negative sample space based on an evaluation function and placed in the sample buffer. For example, S1 negative samples may be selected using equations (1) and (2) above to initialize the sample buffer.
As an example, each user u may be assigned a sample buffer M_u of size S1, in which the negative samples available for sampling are stored, i.e., M_u = {(u, k_1), (u, k_2), ..., (u, k_S1)}, where S1 may be a hyper-parameter.
Furthermore, to improve efficiency, all positive samples of the same user u may be designed to share the sample buffer M_u.
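A minimal sketch of this per-user buffer initialization might look as follows (uniform sampling from each user's unobserved items; the function name init_buffers and the data structures are illustrative assumptions):

import numpy as np

def init_buffers(num_users, num_items, observed, s1, seed=0):
    """Initialize M_u for every user u with S1 uniformly sampled negative items.

    observed: dict mapping user u -> set R_u of items with observed interactions.
    Returns:  dict mapping user u -> np.ndarray of S1 candidate negative item ids.
    """
    rng = np.random.default_rng(seed)
    buffers = {}
    for u in range(num_users):
        # Negative sample space of user u: items the user has never interacted with.
        candidates = np.setdiff1d(np.arange(num_items), list(observed.get(u, set())))
        buffers[u] = rng.choice(candidates, size=s1, replace=len(candidates) < s1)
    return buffers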
In step S102, in each iterative training process of at least part of the iterative training processes for training the collaborative filtering model: the sample buffer is updated and the negative samples participating in the collaborative filtering model training are then selected from the updated sample buffer, or the negative samples participating in the collaborative filtering model training are selected from the sample buffer and the sample buffer is then updated.
Since the collaborative filtering model keeps changing during training, M_u needs to be dynamically updated so as to obtain the desired candidate samples for negative sampling. Specifically, S2 negative samples may be uniformly sampled from a specified sample space, where S2 is a positive integer; S1 negative samples are then selected from the S1 negative samples already in the sample buffer and the sampled S2 negative samples; and the sample buffer is updated with the selected S1 negative samples. Here, the specified sample space may be a space containing negative samples whose scores at different training times (such as over several epochs) have been recorded, for use in the subsequent prediction-variance calculation. For example, M_u may first be extended to a candidate set consisting of the S1 negative samples already in M_u together with S2 uniformly sampled candidate samples, and S1 hard-to-classify candidates are then selected from this candidate set to obtain the new M_u, where S2 may be a hyper-parameter. For example, the S2 negative samples selected from the specified sample space may carry the scores of the previous 5 training rounds, so that the prediction variance of the corresponding negative samples can be calculated.
To ensure that only samples providing useful information are retained in the sample buffer M_u, a score-based update strategy may be employed to dynamically update M_u so that M_u contains more difficult (hard) negative samples. Specifically, for each positive sample in a batch sampled from the positive sample space, an evaluation function may be used to calculate the scores of the S1+S2 negative samples, and S1 negative samples are then selected from the S1+S2 negative samples according to their scores. For example, the scores of the S1+S2 negative samples can be calculated according to equation (1) above. However, the above examples are merely exemplary, and the present disclosure is not limited thereto.
When selecting S1 negative samples from the S1+S2 negative samples, the top S1 negative samples with the highest scores may be selected from the S1+S2 negative samples. Alternatively, for each of the S1+S2 negative samples, an extraction probability may be calculated based on the scores of the S1+S2 negative samples, and S1 negative samples are then extracted in turn from the S1+S2 negative samples according to the corresponding extraction probabilities, thereby updating M_u.
For example, M_u may be updated by sampling S1 negative samples according to the following probability distribution (3):

P((u,k)) = exp(s_uk / τ) / Σ_{(u,j)} exp(s_uj / τ)    (3)

where s_uk denotes the evaluation-function score of the candidate negative sample (u, k), the sum runs over all S1+S2 candidates, and the temperature coefficient τ ∈ (0, +∞) controls the extent to which candidates with larger scores receive greater attention.
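A sketch of this score-based buffer update could look like the following (it reuses the hypothetical MFScorer and init_buffers above; the softmax-with-temperature form of equation (3) and the use of sigmoid(r_uk − r_ui) as the candidate score are reconstructions/assumptions rather than the patent's exact formulas):

import numpy as np

def update_buffer(scorer, buffers, u, i, num_items, observed, s2,
                  tau=1.0, use_softmax=True, seed=None):
    """Extend M_u with S2 uniform candidates, then keep S1 'hard' negatives."""
    rng = np.random.default_rng(seed)
    s1 = len(buffers[u])
    # Step 1: uniformly sample S2 extra candidates from the unobserved items.
    pool = np.setdiff1d(np.arange(num_items), list(observed.get(u, set())))
    extra = rng.choice(pool, size=s2, replace=len(pool) < s2)
    cand = np.concatenate([buffers[u], extra])                 # S1 + S2 candidates
    # Step 2: score the candidates against the positive item i (higher = harder).
    diffs = scorer.score(u, cand) - scorer.score(u, [i])[0]
    p_pos = 1.0 / (1.0 + np.exp(-diffs))
    if use_softmax:
        # Equation (3)-style: temperature-scaled probabilities favouring high scores.
        logits = p_pos / tau
        probs = np.exp(logits - logits.max())
        probs = probs / probs.sum()
        keep = rng.choice(len(cand), size=s1, replace=False, p=probs)
    else:
        # Alternative: simply keep the top-S1 highest-scored candidates.
        keep = np.argsort(-p_pos)[:s1]
    buffers[u] = cand[keep]
    return buffers[u]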
Furthermore, oversampling hard negative samples may increase the risk of introducing false negative samples, thereby weakening the robustness of the above score-based update strategy. To address this problem, a robust sampling strategy can be adopted: considering the low-variance characteristic of false negative samples, such noise can be effectively avoided by selecting negative samples with high variance.
The present disclosure relies on a variance-based criterion to sample, from the sample buffer M_u storing S1 high-quality candidates, the negative sample used for each training sample. Specifically, the prediction variance of each negative sample in the sample buffer is calculated, and negative samples with high prediction variance are selected from the sample buffer as the negative samples participating in the collaborative filtering model training according to the calculated prediction variances.
Assume a positive sample (u, i) and the sample buffer M_u of user u. For each negative sample (u, k) ∈ M_u in the sample buffer M_u, an effective negative sample is selected as the negative sample participating in the collaborative filtering model training by the following equation (4):

(u, j) = argmax_{(u,k)∈M_u} { mean([P_pos(k|u,i)]) + α_t · std([P_pos(k|u,i)]) }    (4)

where [P_pos(k|u,i)] denotes the set of probabilities, recorded at different training times, that the negative sample (u, k) is a positive sample with respect to the positive sample (u, i); α_t is a hyper-parameter controlling the importance of high variance in the t-th training round; mean(·) computes the mean of P_pos(k|u,i) up to the t-th training round, and std(·) computes the standard deviation (variance measure) of [P_pos(k|u,i)].
The step of calculating the prediction variance of each negative sample in the sample buffer may comprise: obtaining the scores of the negative samples in the sample buffer at different training times, or the probabilities that the negative samples are positive samples; and calculating the prediction variance of each negative sample based on its scores at different training times or its probabilities of being a positive sample. In the present disclosure, different training times may refer to different epochs.
As an example, the variance measure std[P_pos(k|u,i)] of each negative sample (u, k) stored in the sample buffer M_u may be calculated using the scores or probabilities of the negative sample (u, k) over the last 5 training rounds, for example using the following equations (5) and (6):

mean_t[P_pos(k|u,i)] = (1/5) · Σ_{t'=t−5}^{t−1} P_pos^(t')(k|u,i)    (5)

std_t[P_pos(k|u,i)] = sqrt( (1/5) · Σ_{t'=t−5}^{t−1} ( P_pos^(t')(k|u,i) − mean_t[P_pos(k|u,i)] )² )    (6)

where P_pos^(t')(k|u,i) denotes the probability recorded at training round t'.
with such an arrangement, the calculation results can be made more stable and the overhead can be kept constant for each sampling operation.
According to embodiments of the present disclosure, the sample uncertainty may be determined based on the last 5 rounds of training. However, the above number of rounds is merely exemplary, and the present disclosure is not limited thereto.
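The variance-based selection of equations (4)-(6) might be sketched as below, under the assumptions that a short score history of P_pos is kept for each buffered negative and that equation (4) takes the argmax form reconstructed above (the helper name select_negative is hypothetical):

import numpy as np

def select_negative(buffers, history, u, i, alpha_t, window=5):
    """Pick the negative in M_u maximising mean + alpha_t * std of its recent P_pos.

    history[(u, k)] is a list of P_pos(k|u,i) values recorded at past epochs;
    only the last `window` entries are used, as in equations (5) and (6).
    """
    best_item, best_crit = None, -np.inf
    for k in buffers[u]:
        recent = np.asarray(history.get((u, k), [0.0])[-window:])
        crit = recent.mean() + alpha_t * recent.std()   # equation (4), reconstructed
        if crit > best_crit:
            best_item, best_crit = int(k), crit
    return best_item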
In experiments on real data, the data set is very large, and it would be very time-consuming to calculate the prediction probabilities/scores for all user-item pairs at every training epoch. Therefore, in the present disclosure, the item space may be pruned for each update of a user's sample buffer, to avoid recording the prediction probabilities/scores of all items. Specifically, the candidate items are randomly sampled from the item set that was already generated at the (t-5)-th training round, and P_pos(k|u,i) is recorded during the last 5 training rounds for the items of that set. Thus, the prediction results from previous iterations can be used directly, without any additional forward or backward propagation through r.
In the sample buffer M_u of user u, in addition to the items remaining from previous training, the newly added items also have a P_pos history over the last 5 training rounds, so the variance calculation described above can be performed.
The training method of the CF model of this embodiment may be performed in a mini-batch mode.
Based on the embodiment shown in FIG. 1, the algorithm iterates on the basis of the sample buffer M_u described above, following the algorithm flow shown in Table 1 below. Specifically, the algorithm runs T iterations in total; in each iteration, a batch of sampled data R_batch of size B is first selected (step 4); negative sampling based on the sample buffer is then performed for each positive sample in R_batch (steps 6-7); the sample buffer is then updated (steps 8-9); and finally the user embedding vectors, item embedding vectors, and learning parameters are updated.
TABLE 1
(Table 1 presents the pseudocode of the buffer-based training algorithm described above.)
In the algorithm flow shown in table 1, during each iteration, a negative sample is selected from the sample buffer to perform the iterative training, and then the sample buffer is updated. However, in another embodiment of the present invention, the sample buffer may be updated first during each iteration, and then negative samples are selected from the sample buffer for the current iteration training.
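Putting the pieces together, the overall iteration of Table 1 could be sketched as follows (building on the hypothetical MFScorer, init_buffers, update_buffer and select_negative helpers above; the SGD update of the embeddings is a simplified BPR-style stand-in for the parameter-update step, and P_pos is taken as the complement of equation (1), which is an assumption):

import numpy as np

def p_pos(scorer, u, i, k):
    """P_pos(k|u,i) = sigmoid(r_uk - r_ui), taken as the complement of equation (1)."""
    r_ui = scorer.score(u, [i])[0]
    r_uk = scorer.score(u, [k])[0]
    return 1.0 / (1.0 + np.exp(-(r_uk - r_ui)))

def train(scorer, observed, num_items, T=100, B=256, s1=32, s2=64,
          lr=0.05, alpha_t=1.0, tau=1.0, seed=0):
    rng = np.random.default_rng(seed)
    positives = [(u, i) for u, items in observed.items() for i in items]
    buffers = init_buffers(len(observed), num_items, observed, s1, seed)
    history = {}                                    # (u, k) -> recent P_pos values
    for t in range(T):
        batch = [positives[idx] for idx in rng.choice(len(positives), size=B)]
        for u, i in batch:
            # Variance-based negative sampling from the buffer (steps 6-7).
            j = select_negative(buffers, history, u, i, alpha_t)
            # Score-based update of the buffer (steps 8-9), then refresh the history.
            update_buffer(scorer, buffers, u, i, num_items, observed, s2, tau)
            for k in buffers[u]:
                history.setdefault((u, k), []).append(p_pos(scorer, u, i, k))
            # Simplified BPR-style SGD step on (u, i, j): the parameter update.
            r_ui = scorer.score(u, [i])[0]
            r_uj = scorer.score(u, [j])[0]
            g = 1.0 / (1.0 + np.exp(r_ui - r_uj))   # gradient of -log sigmoid(r_ui - r_uj)
            scorer.P[u] += lr * g * (scorer.Q[i] - scorer.Q[j])
            scorer.Q[i] += lr * g * scorer.P[u]
            scorer.Q[j] -= lr * g * scorer.P[u]
    return scorer, buffers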
With this scheme, updating the sample buffer retains the negative samples with higher scores in the buffer, and negative samples with high variance are then selected from the updated buffer to participate in CF model training. As a result, negative samples with both high scores and high variance are used to train the CF model, improving the training efficiency and robustness of the CF model.
Fig. 2 is a block diagram illustrating a collaborative filtering model training apparatus according to an exemplary embodiment of the present disclosure. Referring to fig. 2, the training apparatus 200 may include an initialization module 201 and a training module 202. Each module in the training apparatus 200 may be implemented by one or more modules, and the name of the corresponding module may vary according to the type of the module. In various embodiments, some modules in the training apparatus 200 may be omitted, or additional modules may be included. Furthermore, modules/elements according to various embodiments of the present disclosure may be combined to form a single entity, which can equivalently perform the functions of the respective modules/elements before the combination.
The initialization module 201 may be configured to set a sample buffer, wherein the sample buffer stores a portion of the samples in a sample space, the portion of samples including S1 negative samples, where S1 is a positive integer. For example, the initialization module 201 may uniformly sample S1 negative samples from the negative sample space and place them in the sample buffer, or it may select, based on an evaluation function, S1 more highly scored negative samples from the negative sample space and place them in the sample buffer.
The training module 202 may, during each iterative training process of at least part of the iterative training processes for training the collaborative filtering model: update the sample buffer and then select the negative samples participating in the collaborative filtering model training from the updated sample buffer, or select the negative samples participating in the collaborative filtering model training from the sample buffer and then update the sample buffer.
The training module 202 may uniformly sample S2 negative samples from a specified sample space, where S2 is a positive integer; select S1 negative samples from the S1 negative samples already in the sample buffer and the sampled S2 negative samples; and update the sample buffer with the selected S1 negative samples.
The training module 202 may, for each positive sample in a batch sampled from the positive sample space, calculate the scores of the S1+S2 negative samples using an evaluation function, and select S1 negative samples from the S1+S2 negative samples according to the scores of the S1+S2 negative samples.
The training module may select, from the S1+S2 negative samples, the top S1 negative samples with the highest scores; or it may, for each of the S1+S2 negative samples, calculate an extraction probability based on the scores of the S1+S2 negative samples, and extract S1 negative samples in turn from the S1+S2 negative samples according to the extraction probabilities corresponding to the S1+S2 negative samples, thereby completing the update of the sample buffer.
The training module 202 may calculate the prediction variance of each negative sample in the sample buffer, and select negative samples with high prediction variance from the sample buffer as the negative samples participating in the collaborative filtering model training according to the calculated prediction variances.
The training module 202 may obtain scores of negative samples in the sample buffer at different training times or probabilities that the negative samples are positive samples, and then calculate the prediction variance of the negative samples based on the scores of the negative samples in the sample buffer at different training times or the probabilities that the negative samples are positive samples.
The implementation principle and technical effect of the CF model training device using the above modules are the same as those of the related method embodiments, and reference may be made to the description of the related method embodiments in detail, which is not repeated herein.
The training method and apparatus of the CF model according to the exemplary embodiment of the present disclosure are described above with reference to fig. 1 to 2. However, it should be understood that: the means shown in the figures may each be configured as software, hardware, firmware, or any combination of the preceding to perform a particular function. These means may correspond, for example, to a dedicated integrated circuit, to pure software code, or to a module combining software and hardware. Further, one or more functions implemented by these apparatuses may also be collectively performed by components in a physical entity device (e.g., a processor, a client, a server, or the like).
Further, the above method may be implemented by instructions recorded on a computer-readable storage medium. For example, according to an exemplary embodiment of the present disclosure, there may be provided a computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform the steps of: setting a sample buffer, wherein the sample buffer stores a portion of the samples in a sample space, the portion of samples including S1 negative samples, where S1 is a positive integer; and, in each iterative training process of at least part of the iterative training processes for training the collaborative filtering model: updating the sample buffer and then selecting the negative samples participating in the collaborative filtering model training from the updated sample buffer, or selecting the negative samples participating in the collaborative filtering model training from the sample buffer and then updating the sample buffer.
The instructions stored in the computer-readable storage medium can be executed in an environment deployed in a computer device such as a client, a host, a proxy device, a server, and the like, and it should be noted that the instructions can also be used to perform additional steps other than the above steps or perform more specific processing when the above steps are performed, and the contents of the additional steps and the further processing are mentioned in the description of the related method with reference to fig. 1 to 2, and therefore will not be described again here to avoid repetition.
It should be noted that the CF model training method and apparatus according to the exemplary embodiments of the present disclosure may fully depend on the execution of a computer program or instructions to implement the corresponding functions, that is, each apparatus corresponds to each step in the functional architecture of the computer program, so that the entire system is called by a special software package (e.g., lib library) to implement the corresponding functions.
On the other hand, when the apparatus shown in fig. 2 is implemented in software, firmware, middleware or microcode, program code or code segments to perform the corresponding operations may be stored in a computer-readable medium such as a storage medium, so that at least one processor or at least one computing device may perform the corresponding operations by reading and executing the corresponding program code or code segments.
For example, according to an exemplary embodiment of the present disclosure, a system may be provided comprising at least one computing device and at least one storage device storing instructions, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the steps of: setting a sample buffer, wherein the sample buffer stores a portion of the samples in a sample space, the portion of samples including S1 negative samples, where S1 is a positive integer; and, in each iterative training process of at least part of the iterative training processes for training the collaborative filtering model: updating the sample buffer and then selecting the negative samples participating in the collaborative filtering model training from the updated sample buffer, or selecting the negative samples participating in the collaborative filtering model training from the sample buffer and then updating the sample buffer.
In particular, the above described apparatus and systems may be deployed in servers or clients, as well as on nodes in a distributed network environment. Further, the system may be a PC computer, tablet device, personal digital assistant, smart phone, web application, or other device capable of executing the set of instructions. In addition, the system may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). In addition, all components of the system may be connected to each other via a bus and/or a network.
The system here need not be a single system, but can be any collection of devices or circuits capable of executing the above instructions (or sets of instructions) either individually or in combination. The system may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
In the system, the at least one computing device may comprise a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a dedicated processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, the at least one computing device may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like. The computing device may execute instructions or code stored in one of the storage devices, which may also store data. Instructions and data may also be transmitted and received over a network via a network interface device, which may employ any known transmission protocol.
The memory device may be integrated with the computing device, for example, by having RAM or flash memory disposed within an integrated circuit microprocessor or the like. Further, the storage device may comprise a stand-alone device, such as an external disk drive, storage array, or any other storage device usable by a database system. The storage device and the computing device may be operatively coupled or may communicate with each other, such as through I/O ports, network connections, etc., so that the computing device can read instructions stored in the storage device.
While various exemplary embodiments of the present disclosure have been described above, it should be understood that the above description is exemplary only, and not exhaustive, and that the present disclosure is not limited to the disclosed exemplary embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. Therefore, the protection scope of the present disclosure should be subject to the scope of the claims.

Claims (10)

1. A training method of a collaborative filtering model, wherein the training method comprises:
providing a sample buffer, wherein the sample buffer stores a portion of the samples in a sample space, the portion of samples including S1 negative samples, where S1 is a positive integer;
in each iterative training process of at least part of the iterative training processes for training the collaborative filtering model: updating the sample buffer and then selecting the negative samples participating in the collaborative filtering model training from the updated sample buffer, or selecting the negative samples participating in the collaborative filtering model training from the sample buffer and then updating the sample buffer.
2. The training method of claim 1, wherein the sample space comprises a positive sample space and a negative sample space and wherein the samples are pairs related to user-item interactions, the collaborative filtering model being used to recommend respective items to a user.
3. The training method of claim 1, wherein the step of setting a sample buffer comprises:
uniform sampling S from negative sample space1Putting a negative sample into the sample buffer; or
Selecting S from negative sample space based on evaluation function1The more highly scored negative samples are placed in the sample buffer.
4. The training method of claim 1, wherein the step of updating the sample buffer comprises:
uniform sampling S from a specified sample space2A negative sample, S2Is a positive integer;
from the S already in the sample buffer1Negative sample and sampled S2Selecting S from the negative sample1A negative sample;
using selected S1Negative samples to update the sample buffer.
5. The training method of claim 4, wherein the step of selecting S1 negative samples from the S1 negative samples already in the sample buffer and the sampled S2 negative samples comprises:
calculating, for each positive sample in a batch sampled from the positive sample space, the scores of the S1+S2 negative samples using an evaluation function;
selecting S1 negative samples from the S1+S2 negative samples according to the scores of the S1+S2 negative samples.
6. The training method of claim 5, wherein the step of selecting S1 negative samples according to the scores of the S1+S2 negative samples comprises:
selecting, from the S1+S2 negative samples, the top S1 negative samples with the highest scores; or
for each of the S1+S2 negative samples, calculating an extraction probability based on the scores of the S1+S2 negative samples, and extracting S1 negative samples in turn from the S1+S2 negative samples according to the extraction probabilities corresponding to the S1+S2 negative samples.
7. The training method of claim 1, wherein the step of selecting negative samples to participate in the collaborative filtering model training comprises:
calculating the prediction variance of each negative sample in the sample buffer;
and selecting the negative sample with high prediction variance from the sample buffer as the negative sample participating in the collaborative filtering model training according to the calculated prediction variance of each negative sample.
8. A training apparatus for collaborative filtering models, wherein the training apparatus comprises:
an initialization module configured to set a sample buffer, where the sample buffer stores a portion of samples in a sample space, and the portion of samples includes S1A negative sample, S1Is a positive integer;
a training module for, during each iterative training process of at least part of the iterative training processes for training the collaborative filtering model: and updating the sample buffer, and then selecting the negative sample participating in the collaborative filtering model training from the updated sample buffer, or selecting the negative sample participating in the collaborative filtering model training from the sample buffer, and then updating the sample buffer.
9. A computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform a training method as claimed in any one of claims 1 to 7.
10. A system comprising at least one computing device and at least one storage device storing instructions that, when executed by the at least one computing device, cause the at least one computing device to perform a training method as claimed in any one of claims 1 to 7.
CN202010936001.2A 2020-09-08 2020-09-08 Training method and device of collaborative filtering model, readable medium and system Pending CN112085093A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010936001.2A CN112085093A (en) 2020-09-08 2020-09-08 Training method and device of collaborative filtering model, readable medium and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010936001.2A CN112085093A (en) 2020-09-08 2020-09-08 Training method and device of collaborative filtering model, readable medium and system

Publications (1)

Publication Number Publication Date
CN112085093A true CN112085093A (en) 2020-12-15

Family

ID=73732720

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010936001.2A Pending CN112085093A (en) 2020-09-08 2020-09-08 Training method and device of collaborative filtering model, readable medium and system

Country Status (1)

Country Link
CN (1) CN112085093A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114841244A (en) * 2022-04-05 2022-08-02 西北工业大学 Target detection method based on robust sampling and mixed attention pyramid

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180330192A1 (en) * 2017-05-15 2018-11-15 International Business Machines Corporation Load-Balancing Training of Recommender System for Heterogeneous Systems
CN109086345A (en) * 2018-07-12 2018-12-25 北京奇艺世纪科技有限公司 A kind of content identification method, content distribution method, device and electronic equipment
CN109902708A (en) * 2018-12-29 2019-06-18 华为技术有限公司 A kind of recommended models training method and relevant apparatus
CN110889747A (en) * 2019-12-02 2020-03-17 腾讯科技(深圳)有限公司 Commodity recommendation method, commodity recommendation device, commodity recommendation system, computer equipment and storage medium
CN111080123A (en) * 2019-12-14 2020-04-28 支付宝(杭州)信息技术有限公司 User risk assessment method and device, electronic equipment and storage medium
CN111324776A (en) * 2018-12-13 2020-06-23 第四范式(北京)技术有限公司 Method and device for training graph embedding model, computing equipment and readable medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180330192A1 (en) * 2017-05-15 2018-11-15 International Business Machines Corporation Load-Balancing Training of Recommender System for Heterogeneous Systems
CN109086345A (en) * 2018-07-12 2018-12-25 北京奇艺世纪科技有限公司 A kind of content identification method, content distribution method, device and electronic equipment
CN111324776A (en) * 2018-12-13 2020-06-23 第四范式(北京)技术有限公司 Method and device for training graph embedding model, computing equipment and readable medium
CN109902708A (en) * 2018-12-29 2019-06-18 华为技术有限公司 A kind of recommended models training method and relevant apparatus
CN110889747A (en) * 2019-12-02 2020-03-17 腾讯科技(深圳)有限公司 Commodity recommendation method, commodity recommendation device, commodity recommendation system, computer equipment and storage medium
CN111080123A (en) * 2019-12-14 2020-04-28 支付宝(杭州)信息技术有限公司 User risk assessment method and device, electronic equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JINGTAO DING ET AL: "Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering", 《HTTPS://ARXIV.ORG/ABS/1709.03376》, 7 September 2020 (2020-09-07), pages 1 - 20 *
LI WEIQIAN; ZHANG YI; ZHENG ZHENFENG; WANG HAI; ZHANG ZIYUN: "Collaborative filtering recommendation algorithm based on multi-attribute dynamic sampling" (基于多属性的动态采样协同过滤推荐算法), Application Research of Computers (计算机应用研究), no. 09, 2 September 2019 (2019-09-02)
GAO HAOYUAN; XU JIANQIANG: "Distributed deep collaborative filtering model based on co-training" (基于协同训练的分布式深度协同过滤模型), Journal of Applied Technology (应用技术学报), no. 02, 30 June 2020 (2020-06-30)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114841244A (en) * 2022-04-05 2022-08-02 西北工业大学 Target detection method based on robust sampling and mixed attention pyramid
CN114841244B (en) * 2022-04-05 2024-03-12 西北工业大学 Target detection method based on robust sampling and mixed attention pyramid

Similar Documents

Publication Publication Date Title
US20220198289A1 (en) Recommendation model training method, selection probability prediction method, and apparatus
US10958748B2 (en) Resource push method and apparatus
US11995559B2 (en) Enhancing evolutionary optimization in uncertain environments by allocating evaluations via multi-armed bandit algorithms
US10002322B1 (en) Systems and methods for predicting transactions
CN110427560B (en) Model training method applied to recommendation system and related device
US11403532B2 (en) Method and system for finding a solution to a provided problem by selecting a winner in evolutionary optimization of a genetic algorithm
US11861464B2 (en) Graph data structure for using inter-feature dependencies in machine-learning
US9600581B2 (en) Personalized recommendations on dynamic content
CN113256367B (en) Commodity recommendation method, system, equipment and medium for user behavior history data
US10515378B2 (en) Extracting relevant features from electronic marketing data for training analytical models
EP3152640A1 (en) Systems and methods for serving product recommendations
CN110689110B (en) Method and device for processing interaction event
CN113744017A (en) E-commerce search recommendation method and device, equipment and storage medium
US11755979B2 (en) Method and system for finding a solution to a provided problem using family tree based priors in Bayesian calculations in evolution based optimization
CN110348947B (en) Object recommendation method and device
CN112055038B (en) Method for generating click rate estimation model and method for predicting click probability
CN110288444B (en) Method and system for realizing user related recommendation
CN112085093A (en) Training method and device of collaborative filtering model, readable medium and system
CN111402003B (en) System and method for realizing user-related recommendation
CN116977019A (en) Merchant recommendation method and device, electronic equipment and storage medium
Kruijswijk et al. Streamingbandit; experimenting with bandit policies
US20220405531A1 (en) Blackbox optimization via model ensembling
CN115718740A (en) Method and apparatus for data interpolation of sparse time series datasets
CN113672798B (en) Article recommendation method and system based on collaborative filtering model
Diqi Deeprec: Efficient product recommendation model for e-commerce using cnn

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination