CN107301247B - Method and device for establishing click rate estimation model, terminal and storage medium - Google Patents

Info

Publication number
CN107301247B
Authority
CN
China
Prior art keywords
value
user
click
evaluation index
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710578583.XA
Other languages
Chinese (zh)
Other versions
CN107301247A (en)
Inventor
潘岸腾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN201710578583.XA priority Critical patent/CN107301247B/en
Publication of CN107301247A publication Critical patent/CN107301247A/en
Application granted granted Critical
Publication of CN107301247B publication Critical patent/CN107301247B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of the present application disclose a method and a device for establishing a click rate estimation model, a terminal, and a storage medium. The method comprises: collecting features of a plurality of first users, wherein the first users are users to whom an object has been recommended and the object is recommended content to be clicked; setting, for each feature of each of the plurality of first users, an evaluation index for the object, and constructing the click rate estimation model based on the evaluation index; establishing, according to the click rate estimation model, an error function between the actual click value and the estimated click value of each first user for the object, and establishing an error loss function based on the error function; solving for the value of the evaluation index set for the object based on the error loss function and the pre-counted actual click value of each first user for the object; and determining the click rate estimation model according to the solved value of the evaluation index for the object.

Description

Method and device for establishing click rate estimation model, terminal and storage medium
Technical Field
The present application relates to the field of network technologies, and in particular, to a method and an apparatus for building a click rate estimation model, a terminal, and a storage medium.
Background
Network technology has developed to the point where more and more objects, such as news, articles, music, and pictures, need to be recommended to users. To obtain a good recommendation effect, suitable users must be found for each object so that personalized recommendation can be performed. The core technical difficulty of personalized recommendation is how to accurately determine which objects to recommend.
Disclosure of Invention
In view of the foregoing problems, an object of the present application is to provide a method and an apparatus for building a click rate estimation model, a terminal, and a storage medium, which improve accuracy of recommending an object to a user.
In one aspect, an embodiment of the present application provides a method for establishing a click rate prediction model, including:
collecting features of a plurality of first users, wherein the first users are users to whom an object has been recommended, and the object is recommended content to be clicked;
setting an evaluation index for the object for each feature of each user in the plurality of first users, and constructing the click rate estimation model based on the evaluation index;
establishing an error function of the actual click value and the estimated click value of each first user to the object according to the click rate estimation model, and establishing an error loss function based on the error function;
solving a numerical value of an evaluation index set for the object based on the error loss function and the pre-counted actual click value of each first user on the object;
and determining the click rate estimation model according to the value of the evaluation index of the object obtained by solving.
Optionally, the method further comprises:
and obtaining an estimated value of the click rate of a second user for the object by using the click rate estimation model based on the features of the second user, wherein the second user is a user to whom the object has not been recommended.
After collecting the characteristics of a plurality of first users, the method further comprises the following steps:
classifying the features, and dividing each class of features into a plurality of feature sets;
after setting the evaluation index for the object for each feature of each of the plurality of first users, the method further includes:
for the features in each feature set, assigning the same value to the same evaluation index for the object.
The evaluation index includes: the estimated click value r of each feature for the object and the reliability a of that estimated click value, where r ∈ [0,1] and a ∈ [0,1].
The step of solving the numerical value of the evaluation index set for the object based on the error loss function and the pre-counted actual click value of each user on the object comprises:
setting an initial value of the evaluation index;
performing iterative computation on the error loss function by taking the loss minimum of the error loss function as a target;
and stopping the iterative computation when the change rate of the error loss function is smaller than a preset threshold value, and taking the value of the evaluation index at the moment as the numerical value of the evaluation index.
The click rate estimation model is as follows:
$$\mathrm{ctr}_{u,i} = \frac{\sum_{f \in F_u} a_{f,i}\, r_{f,i}}{\sum_{f \in F_u} a_{f,i}}$$
the error function is:
Figure BDA0001350632910000022
the error loss function is:
Figure BDA0001350632910000023
wherein i represents an object; u represents a user; U represents the set of the first users; ctr_{u,i} represents the estimated click rate of user u for object i; f represents a feature of user u; F_u represents the feature set of user u; y_{u,i} represents the actual click value of user u for object i; r_{f,i} represents the estimated click value of feature f for object i; and a_{f,i} represents the reliability of the estimated click value of feature f for object i.
In another aspect, the present application further provides a device for establishing a click rate estimation model, comprising:
an acquisition module, configured to collect features of a plurality of first users, wherein the first users are users to whom an object has been recommended, and the object is recommended content to be clicked;
the first modeling module is used for setting an evaluation index of the object for each feature of each user in the plurality of first users and constructing the click rate estimation model based on the evaluation index;
the second modeling module is used for establishing an error function of the actual click value and the estimated click value of each first user to the object according to the click rate estimation model and establishing an error loss function based on the error function;
the solving module is used for solving the numerical value of the evaluation index set for the object based on the error loss function and the pre-counted actual click value of each first user to the object;
and the third modeling module is used for determining the click rate estimation model according to the numerical value of the evaluation index of the object obtained by solving.
Optionally, the apparatus further comprises:
and the calculation module is used for obtaining an estimated value of the click rate of a second user for the object by using the click rate estimation model based on the features of the second user, wherein the second user is a user to whom the object has not been recommended.
Optionally, the acquisition module is further configured to classify the features, and divide each class of features into a plurality of feature sets;
the first modeling module is further configured to assign the same evaluation index of the object to the same value for the features in each feature set.
The evaluation index of each feature for the object includes: the estimated click value r of each feature for the object and the reliability a of that estimated click value, where r ∈ [0,1] and a ∈ [0,1].
The solving module comprises:
the first solving submodule is used for setting an initial value of the evaluation index;
the second solving submodule is used for carrying out iterative calculation on the error loss function by taking the loss minimum of the error loss function as a target;
and the third solving submodule is used for stopping the iterative computation when the change rate of the error loss function is smaller than a preset threshold value and taking the value of the evaluation index at the moment as the numerical value of the evaluation index.
The click rate estimation model is as follows:
$$\mathrm{ctr}_{u,i} = \frac{\sum_{f \in F_u} a_{f,i}\, r_{f,i}}{\sum_{f \in F_u} a_{f,i}}$$
the error function is:
Figure BDA0001350632910000042
the error loss function is:
Figure BDA0001350632910000043
wherein i represents an object; u represents a user; U represents the set of the first users; ctr_{u,i} represents the estimated click rate of user u for object i; f represents a feature of user u; F_u represents the feature set of user u; y_{u,i} represents the actual click value of user u for object i; r_{f,i} represents the estimated click value of feature f for object i; and a_{f,i} represents the reliability of the estimated click value of feature f for object i.
In another aspect, the present application further provides a terminal, including: a processor and a memory storing computer instructions;
the processor reads the computer instructions and executes a method for establishing a click rate prediction model as described above.
In another aspect, the present application further provides a storage medium storing computer instructions, which when executed, implement a method for establishing a click through rate prediction model as described above.
According to the method and the device for establishing the click rate estimation model, the terminal and the storage medium, the features of each user are collected on the basis of the click behavior of the users to whom object i has been recommended, and the click rate estimation model is established with the evaluation index of each feature for object i as a parameter. Once the evaluation index of each feature for object i is determined, the click rate of a user to whom object i has not been recommended can be estimated with the click rate estimation model as soon as that user's feature set is determined. Object i can then be recommended to users with a higher estimated click rate. This improves the accuracy of pushing objects to users, reduces invalid recommendations in practical applications, improves network utilization, and enables personalized object recommendation for different users.
Drawings
The above and other objects, features and advantages of the present application will become apparent from the following detailed description, which proceeds with reference to the accompanying drawings. In the drawings:
fig. 1 is a flowchart of a method for establishing a click rate estimation model according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a method for establishing a click rate estimation model of a network advertisement according to an embodiment of the present application;
FIG. 3 is a scene diagram of an application of a network advertisement click-through rate estimation model according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for building an article click-through rate estimation model according to an embodiment of the present application;
FIG. 5 is a scene diagram illustrating an application of a click-through rate prediction model of an article according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an apparatus for building a click rate prediction model according to an embodiment of the present application.
Detailed Description
Various aspects of the present application are described below. The teachings herein may be embodied in a wide variety of forms and any specific structure, function, or both being disclosed herein is merely representative. Based on the teachings herein one skilled in the art should appreciate that an aspect disclosed herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented or a method may be practiced using any number of the aspects set forth herein. In addition, such an apparatus may be implemented or such a method may be practiced using other mechanisms, functions, or structures and functions in addition to or other than one or more of the aspects set forth herein. Furthermore, any aspect described herein may include at least one element of a claim.
The application provides a method and a device for establishing a click rate estimation model, a terminal and a storage medium. The following description of specific embodiments of the present application refers to the accompanying drawings.
Referring to fig. 1, a method for establishing a click through rate prediction model according to an embodiment of the present application includes steps 101 to 106.
Step 101: collecting features of a plurality of first users, the first users being users to whom the object has been recommended.
As described above, in the embodiment of the present application, the object is recommended content to be clicked, and may be, for example, a web advertisement, an article, an application, music or a picture, a movie, and the like.
For example, if a piece of music is recommended to a plurality of users, the users to whom the music has been recommended are the first users of that music. Collecting the features of the first users yields the features of the users to whom the music was recommended.
In this embodiment, the features of the first user may include various types, such as age, education level, city, occupation, and income.
In an embodiment of the present application, after acquiring features of a plurality of first users, the method may further include:
and classifying the features, and dividing each class of features into a plurality of feature sets.
Each class of features may be divided in some way into a number of different feature sets. For example, age may be divided into "children", "teenagers", "young adults", "middle-aged people" and "old people", or alternatively by decade of birth into "post-60s", "post-70s", "post-80s", "post-90s" and "post-00s".
The way each class of features is divided can be chosen according to actual needs; the present application does not limit it.
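As a minimal sketch of dividing one class of features into feature sets, the snippet below maps a raw age to one of the five age sets named above. The boundary ages and the names `AGE_UPPER_BOUNDS`, `AGE_SETS` and `age_feature_set` are assumptions made for illustration; the application leaves the division scheme open.

```python
from bisect import bisect_right

# Hypothetical boundaries (exclusive upper bounds of the first four sets).
# The source does not specify where one age set ends and the next begins.
AGE_UPPER_BOUNDS = [12, 18, 40, 60]
AGE_SETS = ["children", "teenagers", "young adults", "middle-aged people", "old people"]


def age_feature_set(age: int) -> str:
    """Return the name of the feature set that a raw age falls into."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts how many boundaries are <= age, which is the set index.
    return AGE_SETS[bisect_right(AGE_UPPER_BOUNDS, age)]


if __name__ == "__main__":
    for sample_age in (8, 15, 30, 32, 50, 75):
        print(sample_age, "->", age_feature_set(sample_age))
```

Because every age in the same set maps to the same name, any per-set parameter stored under that name is automatically shared by all users in the set.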
Step 102: and setting an evaluation index for the object for each characteristic of each user in the plurality of first users, and constructing the click rate estimation model based on the evaluation index.
In an embodiment of the present application, the evaluation index for the object may include: the estimated click value r of each feature for the object and the reliability a of that estimated click value, where r ∈ [0,1] and a ∈ [0,1]. In other words, in the embodiment of the present application, each feature evaluates an object through two indexes: the estimated click value r and the reliability a of the estimated click value.
When the features are classified and each class of features is divided into a plurality of feature sets, the method provided by the embodiment of the application further comprises:
the same evaluation index of the object is assigned the same value for the features in each feature set.
Taking the age class of features as an example, if it is further divided into the feature sets "children", "teenagers", "young people", "middle-aged people" and "old people", then the features in each set share the same values of the evaluation indexes r and a for the same object.
Still taking music recommendation as an example, suppose the recommended object is music i and the evaluation indexes r and a of the features in the "teenagers" feature set for music i take the values r_1 and a_1 respectively. Then, if the age feature of any user belongs to the "teenagers" feature set, the evaluation indexes of that user's age feature for music i are r_1 and a_1.
In an embodiment of the application, the click rate estimation model may be:
$$\mathrm{ctr}_{u,i} = \frac{\sum_{f \in F_u} a_{f,i}\, r_{f,i}}{\sum_{f \in F_u} a_{f,i}} \qquad (1)$$
wherein ctr represents the estimated click rate; u represents a user; i represents an object; f represents a feature of the user; F_u represents the feature set of user u; r_{f,i} represents the estimated click value of feature f for object i; and a_{f,i} represents the reliability of the estimated click value of feature f for object i.
According to this click rate estimation model, the click rate of a user u for an object i is determined by the estimated click value r of each feature in the user's feature set F for object i and the reliability a of that estimated click value.
Suppose the feature set F of user A has 4 features, "youth", "university student", "resident of Guangzhou" and "science and technology fan", and the two evaluation indexes of these 4 features for a certain object i are a_1 and r_1, a_2 and r_2, a_3 and r_3, and a_4 and r_4. Then the estimated click rate of user A for object i is ctr_{A,i} = (a_1*r_1 + a_2*r_2 + a_3*r_3 + a_4*r_4)/(a_1 + a_2 + a_3 + a_4). That is, once the feature set F of a user is determined and the evaluation index of each feature in F for object i is known, the click rate of the user for object i can be estimated from those evaluation indexes.
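The four-feature example for user A can be sketched in a few lines of Python. The function name `estimate_ctr`, the dictionary layout and all numeric values are assumptions chosen for illustration; only the reliability-weighted average itself comes from the text.

```python
from typing import Dict, Iterable, Tuple

# Maps a feature name to its (r, a) pair for one fixed object i.
EvalIndex = Dict[str, Tuple[float, float]]


def estimate_ctr(features: Iterable[str], index: EvalIndex) -> float:
    """Reliability-weighted mean of the per-feature estimated click values."""
    numerator = 0.0
    denominator = 0.0
    for f in features:
        r, a = index[f]
        numerator += a * r
        denominator += a
    if denominator == 0.0:
        raise ValueError("all reliabilities are zero; the estimate is undefined")
    return numerator / denominator


# Made-up evaluation indexes for the four features of user A.
index_i: EvalIndex = {
    "youth": (0.30, 0.8),
    "university student": (0.50, 0.4),
    "resident of Guangzhou": (0.10, 0.2),
    "science and technology fan": (0.70, 0.6),
}
ctr_A_i = estimate_ctr(index_i.keys(), index_i)
print(round(ctr_A_i, 4))  # (0.24 + 0.20 + 0.02 + 0.42) / 2.0 = 0.44
```

Features with a larger reliability a pull the estimate toward their own r, which is why the "science and technology fan" value dominates the low-reliability "resident of Guangzhou" value in this made-up case.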
Step 103: and establishing an error function of the actual click value and the estimated click value of each first user to the object according to the click rate estimation model, and establishing an error loss function based on the error function.
Step 104: and solving the numerical value of the evaluation index set for the object based on the error loss function and the pre-counted actual click value of each first user on the object.
In an embodiment of the present application, based on the error loss function and a difference between an actual click value and an estimated click value of each user on the object, which is counted in advance, solving a value of an evaluation index set for the object may include:
setting an initial value of the evaluation index;
performing iterative computation on the error loss function by taking the loss minimum of the error loss function as a target;
and stopping the iterative computation when the change rate of the error loss function is smaller than a preset threshold value, and taking the value of the evaluation index at the moment as the numerical value of the evaluation index.
In the embodiment of the application, the evaluation index of each feature for the object is determined from the actual click values and estimated click values of the users u to whom object i has been recommended.
In an embodiment of the present application, if the click rate estimation model is shown as equation 1, the corresponding error function is:
Figure BDA0001350632910000091
the error loss function is:
Figure BDA0001350632910000092
wherein u represents a user; i represents an object; f represents a feature of the user; ctr_{u,i} represents the estimated click rate of user u for object i; F_u represents the feature set of user u; y_{u,i} represents the actual click value of user u for object i; r_{f,i} represents the estimated click value of feature f for object i; a_{f,i} represents the reliability of the estimated click value of feature f for object i; and U represents the set of users to whom object i has been recommended.
If user u clicks on object i, then y_{u,i} = 1; otherwise y_{u,i} = 0.
As can be seen from equation 3, the embodiment of the present application determines the evaluation indexes a_{f,i} and r_{f,i} of each feature f for object i by analyzing the features and the actual click values of the users to whom object i has been recommended.
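The equation images for the error function and the error loss function are not reproduced in this text, so their exact form is not known here. Purely as an illustrative assumption, suppose the error is the plain difference between actual and estimated click values and the loss is the sum of squared errors. Under that assumption, the partial derivatives that a gradient-based solver would need follow from the quotient rule applied to the click rate estimation model:

```latex
% ASSUMED forms (not taken from the source):
e_{u,i} = y_{u,i} - \mathrm{ctr}_{u,i}, \qquad
L(r,a) = \sum_{u \in U} e_{u,i}^{2}, \qquad
D_u = \sum_{f \in F_u} a_{f,i}

% Derivatives of the model for a feature f \in F_u:
\frac{\partial\, \mathrm{ctr}_{u,i}}{\partial r_{f,i}} = \frac{a_{f,i}}{D_u}, \qquad
\frac{\partial\, \mathrm{ctr}_{u,i}}{\partial a_{f,i}} = \frac{r_{f,i} - \mathrm{ctr}_{u,i}}{D_u}

% Resulting gradients of the assumed loss:
\frac{\partial L}{\partial r_{f,i}} = -2 \sum_{u \in U:\, f \in F_u} e_{u,i}\, \frac{a_{f,i}}{D_u}, \qquad
\frac{\partial L}{\partial a_{f,i}} = -2 \sum_{u \in U:\, f \in F_u} e_{u,i}\, \frac{r_{f,i} - \mathrm{ctr}_{u,i}}{D_u}
```

The derivative with respect to a_{f,i} vanishes when r_{f,i} equals the user's current estimate, which matches the intuition that reliability only matters for features whose click value disagrees with the weighted average.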
In the embodiment of the application, a gradient descent method may be adopted: based on the actual click values of the users to whom object i has been recommended, the evaluation index of each feature f for object i is solved with minimizing the error L(r, a) as the target. The solving method may include the following steps:
step 1: randomly initialize the vectors r and a with decimal values between 0 and 1, denote them r^(0) and a^(0), and initialize the iteration counter k = 0;
step 2: iterative computation
Figure BDA0001350632910000093
Figure BDA0001350632910000101
where θ is the iteration step size, taken as 0.01.
step 3: determine whether the error loss function has converged
ΔL(r^(k+1), a^(k+1)) = |L(r^(k+1), a^(k+1)) - L(r^(k), a^(k))|
If |ΔL(r^(k+1), a^(k+1)) - ΔL(r^(k), a^(k))| < α, return r^(k+1) and a^(k+1) as the parameters of the model; otherwise, return to step 2 and continue the calculation, where α is a small value, for example 0.01·θ.
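The three steps can be sketched as follows. This is a hedged illustration, not the patented update rule: the loss is assumed to be the sum of squared errors (the source's loss formula is not reproduced in this text), the gradients are derived from that assumption, clipping of r and a to their ranges is an added safeguard, and the names `predict`, `loss`, `step`, `fit` and the `max_iters` cap are hypothetical. The step size θ = 0.01, the random initialization and the stopping test with α = 0.01·θ follow the text.

```python
import random
from typing import Dict, List, Tuple

User = Tuple[List[str], int]  # (feature names, actual click value y in {0, 1})
Params = Dict[str, float]


def predict(feats: List[str], r: Params, a: Params) -> float:
    """Click rate estimation model: reliability-weighted mean of r."""
    den = sum(a[f] for f in feats)
    return sum(a[f] * r[f] for f in feats) / den


def loss(users: List[User], r: Params, a: Params) -> float:
    # ASSUMPTION: sum of squared errors; the source's exact loss is not shown.
    return sum((y - predict(fs, r, a)) ** 2 for fs, y in users)


def step(users: List[User], r: Params, a: Params, theta: float) -> Tuple[Params, Params]:
    """One gradient descent update of r and a under the assumed loss."""
    grad_r = dict.fromkeys(r, 0.0)
    grad_a = dict.fromkeys(a, 0.0)
    for fs, y in users:
        den = sum(a[f] for f in fs)
        p = predict(fs, r, a)
        e = y - p
        for f in fs:
            grad_r[f] += -2.0 * e * a[f] / den
            grad_a[f] += -2.0 * e * (r[f] - p) / den
    # Clipping keeps r in [0, 1] and a in (0, 1]; this safeguard is not from the source.
    new_r = {f: min(1.0, max(0.0, r[f] - theta * grad_r[f])) for f in r}
    new_a = {f: min(1.0, max(1e-6, a[f] - theta * grad_a[f])) for f in a}
    return new_r, new_a


def fit(users: List[User], theta: float = 0.01, max_iters: int = 1000, seed: int = 0):
    rng = random.Random(seed)
    names = sorted({f for fs, _ in users for f in fs})
    # Step 1: random initial values between 0 and 1.
    r = {f: rng.uniform(0.05, 0.95) for f in names}
    a = {f: rng.uniform(0.05, 0.95) for f in names}
    alpha = 0.01 * theta
    losses = [loss(users, r, a)]
    prev_delta = None
    for _ in range(max_iters):
        r, a = step(users, r, a, theta)            # Step 2: iterate.
        losses.append(loss(users, r, a))
        delta = abs(losses[-1] - losses[-2])       # Step 3: delta-L of this iteration.
        if prev_delta is not None and abs(delta - prev_delta) < alpha:
            break                                  # change in delta-L is below alpha
        prev_delta = delta
    return r, a, losses


if __name__ == "__main__":
    data = [(["youth", "student"], 1), (["youth"], 0), (["student", "fan"], 1)]
    r_fit, a_fit, history = fit(data)
    print(len(history) - 1, "iterations, final loss", history[-1])
```

Note that the stopping test compares successive values of ΔL rather than ΔL itself, as written in step 3, so it can trigger while the loss is still decreasing slowly; a stricter test on ΔL alone would be a straightforward variation.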
Step 105: and determining the click rate estimation model according to the value of the evaluation index of the object obtained by solving.
Through the above steps, the evaluation indexes a_{f,i} and r_{f,i} of each feature f for the object i are calculated, and the click rate estimation model of the object i is thereby determined.
Step 106: and obtaining a pre-estimated value of the click rate of the second user to the object by using the click rate pre-estimation model based on the characteristics of the second user, wherein the second user is a user who has not been recommended to the object.
For a user B to whom the object i has not been recommended, the features of the user are collected and the corresponding evaluation indexes are looked up according to the collected features, so that the estimated click rate of user B for object i can be calculated with the click rate estimation model.
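A small sketch of this lookup step is given below. It assumes the solved evaluation indexes are stored as a dictionary from feature name to an (r, a) pair; the function name `estimate_for_new_user` and the decision to skip features that have no solved index are assumptions, since the text does not say how such features are handled.

```python
from typing import Dict, Iterable, Optional, Tuple


def estimate_for_new_user(
    features: Iterable[str],
    index: Dict[str, Tuple[float, float]],
) -> Optional[float]:
    """Estimate the click rate of a second user from solved (r, a) pairs.

    Features without a solved index are skipped (an assumption); None is
    returned when no usable feature remains.
    """
    known = [index[f] for f in features if f in index]
    denominator = sum(a for _, a in known)
    if denominator == 0:
        return None
    return sum(a * r for r, a in known) / denominator


# Made-up solved indexes for object i.
solved = {"teenagers": (0.2, 0.5), "Beijing": (0.6, 0.5)}
ctr_B_i = estimate_for_new_user(["teenagers", "Beijing", "unseen feature"], solved)
print(ctr_B_i)
```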
According to the method for establishing the click rate estimation model, the features of each user are collected on the basis of the click behavior of the users to whom object i has been recommended, and the click rate estimation model is established with the evaluation index of each feature for object i as a parameter. Once the evaluation index of each feature for object i is determined, the click rate of a user to whom object i has not been recommended can be estimated with the click rate estimation model as soon as that user's feature set is determined. Object i can then be recommended to users with a higher estimated click rate. This improves the accuracy of pushing objects to users, reduces invalid recommendations in practical applications, improves network utilization, and enables personalized object recommendation for different users.
Referring to fig. 2, in an embodiment of the present application, the object is a web advertisement c, and the method for establishing a click through rate prediction model provided by the present application includes steps 201 to 206.
Step 201: collecting the features of a plurality of users to whom the network advertisement has been recommended, and establishing feature sets characterizing those users.
In the embodiment of the application, a click rate estimation model of the network advertisement c is established by collecting the features of the users to whom network advertisement c has been recommended.
In the embodiment of the application, the features characterizing the users to whom the network advertisement has been recommended can be collected from multiple dimensions.
Dimension 1: the user's preferences regarding network advertisements; for example, a user who likes shopping is characterized as a "shopping fan".
Dimension 2: the user's region attribute, such as Beijing, Tianjin, or Shanghai.
Dimension 3: the user's natural attributes, such as age and gender.
Dimension 4: the user's social attributes, such as education level, occupation, and territory.
In practical applications, the selected dimension may be different according to different objects. This is not a limitation of the present application.
For each type of feature, the feature can be further divided into different feature sets.
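One way to picture the four dimensions is as a function that turns a user profile into a flat list of named features. Everything in this sketch is hypothetical: the profile keys (`interests`, `region`, `age_group`, `gender`, `education`, `occupation`), the `dimension:value` naming and the function `build_feature_set` are illustrative choices, not part of the application.

```python
from typing import Any, Dict, List


def build_feature_set(profile: Dict[str, Any]) -> List[str]:
    """Flatten a user profile into feature names, one per attribute value."""
    features: List[str] = []
    # Dimension 1: preferences, e.g. "shopping" for a shopping fan.
    for interest in profile.get("interests", []):
        features.append(f"preference:{interest}")
    # Dimension 2: region attribute.
    if "region" in profile:
        features.append(f"region:{profile['region']}")
    # Dimension 3: natural attributes (age is assumed to be pre-bucketed).
    for key in ("age_group", "gender"):
        if key in profile:
            features.append(f"{key}:{profile[key]}")
    # Dimension 4: social attributes.
    for key in ("education", "occupation"):
        if key in profile:
            features.append(f"{key}:{profile[key]}")
    return features


example_profile = {
    "interests": ["shopping"],
    "region": "Beijing",
    "age_group": "youth",
    "gender": "female",
    "occupation": "teacher",
}
print(build_feature_set(example_profile))
```

Prefixing each value with its dimension keeps features from different classes distinct even when their values happen to share a name.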
Step 202: and setting an evaluation index of the network advertisement for the characteristics of each characteristic set, and constructing the click rate estimation model based on the evaluation index.
In this embodiment of the present application, the evaluation index for the network advertisement may include: the estimated click value r of the features of each feature set for the network advertisement and the reliability a of that estimated click value, where r ∈ [0,1] and a ∈ [0,1].
It should be noted that the value of the evaluation index of the feature of each feature set to the network advertisement is the same.
Taking age as an example, if the age class of features is divided into the 5 feature sets "children", "teenagers", "youth", "middle-aged" and "old", then the features in each feature set have the same evaluation index values for the same network advertisement. For example, users aged 30 and 32 both belong to the "youth" feature set, so the evaluation index values of their age features are the same for the same network advertisement.
The click rate estimation model of the network advertisement established in the embodiment of the application is shown as a formula 4.
$$\mathrm{ctr}_{u,c} = \frac{\sum_{f \in F_u} a_{f,c}\, r_{f,c}}{\sum_{f \in F_u} a_{f,c}} \qquad (4)$$
wherein c represents the network advertisement; u represents a user; f represents a feature of the user; ctr_{u,c} represents the estimated click rate of user u for network advertisement c; F_u represents the feature set of user u; r_{f,c} represents the estimated click value of feature f for network advertisement c; and a_{f,c} represents the reliability of the estimated click value of feature f for network advertisement c.
Step 203: and establishing an error function of the actual click value and the estimated click value of each user recommended by the network advertisement to the network advertisement according to the network advertisement click rate estimation model, and establishing an error loss function based on the error function.
The error function in the embodiment of the present application is shown in equation 5:
Figure BDA0001350632910000122
wherein y_{u,c} represents the actual click value of user u for network advertisement c. If user u clicks the network advertisement c, then y_{u,c} = 1; otherwise y_{u,c} = 0.
Step 204: solving for the value of the evaluation index set for the network advertisement based on the error loss function and the pre-counted difference between the actual click value and the estimated click value of each user for the network advertisement.
The error loss function is shown in equation 6:
Figure BDA0001350632910000123
wherein L(r, a) represents the error loss function and U represents the set of users u to whom the network advertisement c has been recommended.
As can be seen from equation 6, in the embodiment of the present application, the actual click values of the users to whom network advertisement c has been recommended are combined with the features f of each user to determine the evaluation indexes a_{f,c} and r_{f,c} of each feature for network advertisement c.
A gradient descent method may be adopted: based on the actual click value y_{u,c} of each user u to whom the network advertisement c has been recommended, the evaluation index of each feature f of each user u in the user set U for the network advertisement c is solved with minimizing the error L(r, a) as the target. The solving method may include the following steps:
step 1: randomly initialize the vectors r and a with decimal values between 0 and 1, denote them r^(0) and a^(0), and initialize the iteration counter k = 0;
step 2: iterative computation
Figure BDA0001350632910000131
Figure BDA0001350632910000132
where θ is the iteration step size, taken as 0.01.
step 3: determine whether the error loss function has converged
ΔL(r^(k+1), a^(k+1)) = |L(r^(k+1), a^(k+1)) - L(r^(k), a^(k))|
If |ΔL(r^(k+1), a^(k+1)) - ΔL(r^(k), a^(k))| < α, return r^(k+1) and a^(k+1) as the parameters of the model; otherwise, return to step 2 and continue the calculation, where α is a small value, for example 0.01·θ.
Step 205: determine the click rate estimation model of the network advertisement according to the solved values of its evaluation indices.
Once the evaluation indices $a_{f,c}$ and $r_{f,c}$ of each feature f for the network advertisement c have been calculated through the above steps, the click rate estimation model of the network advertisement is determined.
Step 206: input the evaluation indices corresponding to the features of a user to whom the network advertisement has not been recommended into the click rate estimation model to obtain the estimated click rate of that user on the network advertisement.
FIG. 3 is a schematic diagram of an application scenario of the network advertisement click rate estimation model. For a user B to whom the network advertisement c has not been recommended, n features of the user are collected. The evaluation indices corresponding to these features, $a_1$ and $r_1$, $a_2$ and $r_2$, $a_3$ and $r_3$, ..., $a_n$ and $r_n$, have already been determined through the modeling process above. The estimated click rate of user B on the network advertisement is $ctr_{B,c} = (a_1 r_1 + a_2 r_2 + a_3 r_3 + \dots + a_n r_n) / (a_1 + a_2 + a_3 + \dots + a_n)$.
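The estimate for user B is a reliability-weighted average of the per-feature click values. A minimal sketch, assuming the trained indices are available as (a, r) pairs; the function name `estimate_ctr` and the zero-reliability check are illustrative additions, not part of the source:

```python
def estimate_ctr(indices):
    """indices: list of (a, r) pairs, one per collected feature of the user."""
    total_reliability = sum(a for a, _ in indices)
    if total_reliability == 0:
        # The weighted average is undefined when every reliability is zero.
        raise ValueError("at least one feature needs non-zero reliability")
    return sum(a * r for a, r in indices) / total_reliability
```

Because the weights are the reliabilities, a feature with a close to 0 barely moves the estimate, while a feature with a close to 1 pulls the estimate toward its own r.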
The click rate estimation model of the network advertisement is established from the features and actual click values of the users to whom the advertisement has been recommended. The model is then used to calculate estimated click rates for users to whom the advertisement has not yet been recommended, the users with high estimated click rates are selected, and the advertisement is sent to them. This can improve the accuracy of network advertisement delivery, avoid delivering advertisements with a low click rate, reduce advertisement delivery cost, and improve the economic benefit of the network advertisement.
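The selection step can be sketched as a simple ranking. The source only says that users with a high estimated click rate are selected; picking a fixed number `top_k` of users is an assumption made for this illustration, and `select_targets` is a hypothetical name.

```python
def select_targets(estimates, top_k):
    """estimates: dict mapping user id to estimated click rate.

    Returns the ids of the top_k users, highest estimate first.
    """
    ranked = sorted(estimates, key=estimates.get, reverse=True)
    return ranked[:top_k]
```

A threshold on the estimated click rate would serve the same purpose when the audience size is not fixed in advance.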
FIG. 4 is a diagram illustrating the application of the click rate estimation model provided in the present application to click rate estimation for an article; the method includes steps 401 to 406.
Step 401: collect the features of a plurality of users to whom the article has been recommended, and establish feature sets characterizing those users.
In the embodiment of the present application, the click rate estimation model of the article is established by collecting the features of the users to whom the article has been recommended.
The manner of determining the users' feature sets is similar to that of the previous embodiment and is not repeated here.
Step 402: set evaluation indices of the article for the features of each feature set, and construct the click rate estimation model of the article based on the evaluation indices.
In an embodiment of the present application, the evaluation indices of the article may include: the estimated click value r of the features of each feature set for the article, and the reliability a of that estimated click value, where $r \in [0,1]$ and $a \in [0,1]$.
Note that all features in the same feature set share the same evaluation index values for the article.
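Sharing index values within a feature set means that raw feature values are first mapped to the set they belong to, and the (a, r) pair is looked up per set. The sketch below assumes a numeric feature split by fixed boundaries; the bucketing rule, the name `feature_set_key`, the "age" feature class and all numbers are hypothetical.

```python
def feature_set_key(feature_class, value, boundaries):
    """Map a raw numeric feature to the feature set (bucket) it falls into.

    boundaries: ascending exclusive upper bounds; values at or above the
    last bound fall into a final open-ended set.
    """
    for index, upper in enumerate(boundaries):
        if value < upper:
            return f"{feature_class}:{index}"
    return f"{feature_class}:{len(boundaries)}"


# Hypothetical trained indices: one (a, r) pair per feature set.
indices = {"age:0": (0.7, 0.05), "age:1": (0.9, 0.20), "age:2": (0.4, 0.10)}

# Two users aged 19 and 27 land in the same set and share one (a, r) pair.
key_1 = feature_set_key("age", 19, [18, 30])
key_2 = feature_set_key("age", 27, [18, 30])
shared = indices[key_1]
```

Grouping features this way keeps the number of parameters equal to twice the number of feature sets rather than twice the number of distinct raw values.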
The click rate estimation model of the article established in the embodiment of the present application is shown in equation 7.
$$ctr_{u,d} = \frac{\sum_{f \in F_u} a_{f,d}\, r_{f,d}}{\sum_{f \in F_u} a_{f,d}} \tag{7}$$
where d represents the article, u represents the user, and f represents a feature; $ctr_{u,d}$ represents the estimated click rate of user u on article d; $F_u$ represents the feature set of user u; $r_{f,d}$ represents the estimated click value of feature f for article d; and $a_{f,d}$ represents the reliability of the estimated click value of feature f for article d.
Step 403: establish an error function between the actual click value and the estimated click value of each user on the article according to the click rate estimation model of the article, and establish an error loss function based on the error function.
The error function in the embodiment of the present application is shown in equation 8:
Figure BDA0001350632910000152
where $y_{u,d}$ represents the actual click value of user u on article d: $y_{u,d} = 1$ if user u clicked article d, and $y_{u,d} = 0$ otherwise.
Step 404: solve the values of the evaluation indices set for the article based on the error loss function and the difference between the pre-counted actual click value and the estimated click value of each user on the article.
The error loss function is shown in equation 9.
Figure BDA0001350632910000153
where $L(r, a)$ represents the error loss, and U represents the set of users u to whom article d has been recommended.
As can be seen from equation 9, in the embodiment of the present application the actual click values of the plurality of users to whom article d has been recommended are combined with the features of each of those users to determine the evaluation indices $a_{f,d}$ and $r_{f,d}$ of each feature for article d.
A gradient descent method may be adopted: based on the actual click value $y_{u,d}$ of each recommended user u of article d, the evaluation indices of each feature f in the feature sets for article d are solved with the goal of minimizing the error $L(r, a)$. The solving method may include the following steps:
Step 1: randomly generate a pair of vectors r, a whose components are decimals between 0 and 1, denote them $r^{(0)}, a^{(0)}$, and initialize the iteration counter k = 0;
Step 2: iterative computation
Figure BDA0001350632910000161
Figure BDA0001350632910000162
where θ is the iteration step size, taken as 0.01.
Step 3: determine whether the error loss function has converged
$\Delta L(r^{(k+1)}, a^{(k+1)}) = \left| L(r^{(k+1)}, a^{(k+1)}) - L(r^{(k)}, a^{(k)}) \right|$
If $\left| \Delta L(r^{(k+1)}, a^{(k+1)}) - \Delta L(r^{(k)}, a^{(k)}) \right| < \alpha$, return $r^{(k+1)}, a^{(k+1)}$ as the parameters of the model; otherwise, return to step 2 and continue the calculation. Here α is a small value, for example 0.01 · θ.
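For readers who want to see what the iteration in step 2 computes, the derivation below works out the gradients under one assumption: that the error loss is the sum of squared differences between actual and estimated click values. That form is a common choice and is not confirmed by the text, where the loss and the update formulas appear only as figures; the helper symbol $S_u$ is introduced here for brevity.

```latex
\begin{aligned}
S_u &= \sum_{f \in F_u} a_{f,d}, \qquad
ctr_{u,d} = \frac{1}{S_u} \sum_{f \in F_u} a_{f,d}\, r_{f,d}, \\
L(r, a) &= \sum_{u \in U} \left( y_{u,d} - ctr_{u,d} \right)^2
  \quad \text{(assumed form)}, \\
\frac{\partial L}{\partial r_{f,d}} &= -2 \sum_{u \in U:\, f \in F_u}
  \left( y_{u,d} - ctr_{u,d} \right) \frac{a_{f,d}}{S_u}, \\
\frac{\partial L}{\partial a_{f,d}} &= -2 \sum_{u \in U:\, f \in F_u}
  \left( y_{u,d} - ctr_{u,d} \right) \frac{r_{f,d} - ctr_{u,d}}{S_u}, \\
r^{(k+1)} &= r^{(k)} - \theta\, \frac{\partial L}{\partial r}\Big|_{(r^{(k)},\, a^{(k)})}, \qquad
a^{(k+1)} = a^{(k)} - \theta\, \frac{\partial L}{\partial a}\Big|_{(r^{(k)},\, a^{(k)})}.
\end{aligned}
```

The gradient with respect to $a_{f,d}$ vanishes when $r_{f,d}$ already equals the user's estimate, which matches the intuition that reliability only matters when a feature disagrees with the others.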
Step 405: determine the click rate estimation model of the article according to the solved values of its evaluation indices.
Once the evaluation indices $a_{f,d}$ and $r_{f,d}$ of each feature f for article d have been calculated through the above steps, the click rate estimation model of the article is determined.
Step 406: input the evaluation indices corresponding to the features of a user to whom the article has not been recommended into the click rate estimation model to obtain the estimated click rate of that user on the article.
FIG. 5 is a schematic diagram of an application scenario of the article click rate estimation model. For a user M to whom article d has not been recommended, 4 features of the user are collected (4 features are used as an example). The evaluation indices corresponding to these features, $a_1$ and $r_1$, $a_2$ and $r_2$, $a_3$ and $r_3$, and $a_4$ and $r_4$, have already been determined through the modeling process above. The estimated click rate of user M on article d is $ctr_{M,d} = (a_1 r_1 + a_2 r_2 + a_3 r_3 + a_4 r_4) / (a_1 + a_2 + a_3 + a_4)$.
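As a worked example with made-up numbers (the values below are purely illustrative and do not come from the source), take $a = (0.8, 0.6, 0.4, 0.2)$ and $r = (0.5, 0.25, 0.1, 0.05)$ for the four features of user M:

```latex
ctr_{M,d}
  = \frac{0.8 \cdot 0.5 + 0.6 \cdot 0.25 + 0.4 \cdot 0.1 + 0.2 \cdot 0.05}
         {0.8 + 0.6 + 0.4 + 0.2}
  = \frac{0.40 + 0.15 + 0.04 + 0.01}{2.0}
  = 0.30
```

The first feature contributes 0.40 of the 0.60 numerator because it has both the highest reliability and the highest estimated click value, so it dominates the estimate.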
The click rate estimation model of article d is established from the features and actual click values of the users to whom article d has been recommended. The model is then used to calculate estimated click rates for users to whom article d has not yet been recommended, the users with high estimated click rates are selected, and article d is sent to them. This can improve the recommendation accuracy for article d and increase the number of times the article is read.
Fig. 6 is a schematic structural diagram of an apparatus for building a click rate prediction model according to an embodiment of the present application. Because the apparatus embodiments are substantially similar to the method embodiments, reference may be made to some of the descriptions of the method embodiments for related points. The device embodiments described below are merely illustrative.
The apparatus for establishing the click rate estimation model comprises:
an acquisition module 601, configured to collect features of a plurality of first users, where the first users are users to whom an object has been recommended;
a first modeling module 602, configured to set an evaluation index for the object for each feature of each of the multiple first users, and construct the click rate estimation model based on the evaluation index;
the second modeling module 603 is configured to establish an error function between an actual click value and an estimated click value of each first user on the object according to the click rate estimation model, and establish an error loss function based on the error function;
a solving module 604, configured to solve a numerical value of an evaluation index set for the object based on the error loss function and a pre-counted actual click value of each first user on the object;
and the third modeling module 605 is configured to determine the click rate estimation model according to the value of the evaluation index of the object obtained through the solution.
In an embodiment of the present application, the apparatus further includes:
a calculating module 606, configured to obtain, based on characteristics of a second user, a pre-estimated value of the click rate of the second user to the object by using the click rate pre-estimating model, where the second user is a user for which the object is not recommended.
In an embodiment of the present application, the acquisition module 601 is further configured to classify the features, and divide each class of features into a plurality of feature sets;
the first modeling module 602 is further configured to assign the same value of each evaluation index of the object to the features in the same feature set.
In an embodiment of the present application, the evaluation indices of each feature for the object include: the estimated click value r of each feature for the object and the reliability a of the estimated click value of each feature for the object, where $r \in [0,1]$ and $a \in [0,1]$.
In an embodiment of the present application, the solving module 604 includes:
the first solving submodule is used for setting an initial value of the evaluation index;
the second solving submodule is used for carrying out iterative calculation on the loss function by taking the loss minimum of the error loss function as a target;
and the third solving submodule is used for stopping the iterative computation when the change rate of the error loss function is smaller than a preset threshold value and taking the value of the evaluation index at the moment as the numerical value of the evaluation index.
In an embodiment of the present application, the click rate estimation model is:
$$ctr_{u,i} = \frac{\sum_{f \in F_u} a_{f,i}\, r_{f,i}}{\sum_{f \in F_u} a_{f,i}}$$
the error function is:
Figure BDA0001350632910000182
the error loss function is:
Figure BDA0001350632910000191
where i represents an object, u represents a user, U represents the set of the first users, $ctr_{u,i}$ represents the estimated click rate of user u on object i, f represents a feature of user u, $F_u$ represents the feature set of user u, $y_{u,i}$ represents the actual click value of user u on object i, $r_{f,i}$ represents the estimated click value of feature f for object i, and $a_{f,i}$ represents the reliability of the estimated click value of feature f for object i.
According to the apparatus for establishing the click rate estimation model, the click behavior of the users to whom object i has been recommended is taken as the basis, the features of each user are collected, and the click rate estimation model is established with the evaluation indices of each feature for object i as parameters. After the evaluation indices of each feature for object i are determined, the click rate of a user to whom object i has not been recommended can be estimated with the model as soon as that user's feature set is determined. In this way, object i can be recommended to users with a high estimated click rate, which improves the accuracy of pushing objects to users, reduces invalid recommendations in practical applications, improves network utilization, and allows personalized object recommendation for different users.
The present application further provides a terminal, including: a processor and a memory storing computer instructions; the processor reads the computer instructions and executes a method for establishing a click rate pre-estimation model as described above.
The present application further provides a storage medium storing computer instructions that, when executed, implement a method for establishing a click-through rate estimation model as described above.
It should be noted that the apparatus for establishing the click rate estimation model, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a computer, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals.
It should be noted that, for the sake of simplicity, the above-mentioned method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present application is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present application disclosed above are intended only to aid in the explanation of the application. Alternative embodiments are not exhaustive and do not limit the application to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and the practical application, to thereby enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.
The preferred embodiments and examples of the present application have been described in detail with reference to the accompanying drawings, but the present application is not limited to the embodiments and examples described above, and various changes can be made within the knowledge of those skilled in the art without departing from the concept of the present application.

Claims (12)

1. A method for establishing a click-through rate prediction model is characterized by comprising the following steps:
collecting features of a plurality of first users, wherein the first users are users to whom an object has been recommended, and the object is recommended content to be clicked;
setting an evaluation index for the object for each feature of each user in the plurality of first users, and constructing the click rate estimation model based on the evaluation index;
establishing an error function of the actual click value and the estimated click value of each first user to the object according to the click rate estimation model, and establishing an error loss function based on the error function;
solving a numerical value of an evaluation index set for the object based on the error loss function and the pre-counted actual click value of each first user on the object;
determining the click rate pre-estimation model according to the value of the evaluation index of the object obtained by solving,
wherein the evaluation index includes: the estimated click value r of each feature for the object and the reliability a of the estimated click value of each feature for the object, wherein $r \in [0,1]$ and $a \in [0,1]$,
the click rate estimation model is as follows:
$$ctr_{u,i} = \frac{\sum_{f \in F_u} a_{f,i}\, r_{f,i}}{\sum_{f \in F_u} a_{f,i}}$$
wherein i represents an object, u represents a user, U represents the set of the first users, $ctr_{u,i}$ represents the estimated click rate of user u on object i, f represents a feature of user u, $F_u$ represents the feature set of user u, $r_{f,i}$ represents the estimated click value of feature f for object i, and $a_{f,i}$ represents the reliability of the estimated click value of feature f for object i.
2. The method of claim 1, further comprising:
obtaining an estimated click rate of a second user on the object by using the click rate estimation model based on features of the second user, wherein the second user is a user to whom the object has not been recommended.
3. The method of claim 1, wherein after collecting the characteristics of the plurality of first users, further comprising:
classifying the features, and dividing each class of features into a plurality of feature sets;
after setting the evaluation index for the object for each feature of each of the plurality of first users, the method further includes:
assigning the same value of each evaluation index of the object to the features in the same feature set.
4. The method according to claim 1, wherein the solving the value of the evaluation index set for the object based on the error loss function and the pre-counted actual click value of each user on the object comprises:
setting an initial value of the evaluation index;
performing iterative computation on the error loss function by taking the loss minimum of the error loss function as a target;
and stopping the iterative computation when the change rate of the error loss function is smaller than a preset threshold value, and taking the value of the evaluation index at the moment as the numerical value of the evaluation index.
5. The method of claim 4, wherein the error function is:
Figure FDA0002719475670000021
the error loss function is:
Figure FDA0002719475670000022
wherein $y_{u,i}$ represents the actual click value of user u on object i.
6. An apparatus for building a click-through rate prediction model, comprising:
an acquisition module, configured to collect features of a plurality of first users, wherein the first users are users to whom an object has been recommended, and the object is recommended content to be clicked;
the first modeling module is used for setting an evaluation index of the object for each feature of each user in the plurality of first users and constructing the click rate estimation model based on the evaluation index;
the second modeling module is used for establishing an error function of the actual click value and the estimated click value of each first user to the object according to the click rate estimation model and establishing an error loss function based on the error function;
the solving module is used for solving the numerical value of the evaluation index set for the object based on the error loss function and the pre-counted actual click value of each first user to the object;
a third modeling module for determining the click rate estimation model according to the value of the evaluation index of the object obtained by solving,
wherein the evaluation index includes: the estimated click value r of each feature for the object and the reliability a of the estimated click value of each feature for the object, wherein $r \in [0,1]$ and $a \in [0,1]$,
wherein the click rate estimation model is as follows:
$$ctr_{u,i} = \frac{\sum_{f \in F_u} a_{f,i}\, r_{f,i}}{\sum_{f \in F_u} a_{f,i}}$$
wherein i represents an object, u represents a user, U represents the set of the first users, $ctr_{u,i}$ represents the estimated click rate of user u on object i, f represents a feature of user u, $F_u$ represents the feature set of user u, $r_{f,i}$ represents the estimated click value of feature f for object i, and $a_{f,i}$ represents the reliability of the estimated click value of feature f for object i.
7. The apparatus of claim 6, further comprising:
a calculation module, configured to obtain an estimated click rate of a second user on the object by using the click rate estimation model based on features of the second user, wherein the second user is a user to whom the object has not been recommended.
8. The apparatus of claim 6,
the acquisition module is also used for classifying the features and dividing each class of features into a plurality of feature sets;
the first modeling module is further configured to assign the same value of each evaluation index of the object to the features in the same feature set.
9. The apparatus of claim 6, wherein the solving module comprises:
the first solving submodule is used for setting an initial value of the evaluation index;
the second solving submodule is used for carrying out iterative calculation on the loss function by taking the loss minimum of the error loss function as a target;
and the third solving submodule is used for stopping the iterative computation when the change rate of the error loss function is smaller than a preset threshold value and taking the value of the evaluation index at the moment as the numerical value of the evaluation index.
10. The apparatus of claim 9, wherein the error function is:
Figure FDA0002719475670000041
the error loss function is:
Figure FDA0002719475670000042
wherein $y_{u,i}$ represents the actual click value of user u on object i.
11. A terminal, comprising: a processor and a memory storing computer instructions;
the processor reads the computer instructions and executes a method of establishing a click through rate prediction model according to any one of claims 1-5.
12. A storage medium having stored thereon computer instructions which, when executed, implement a method of modeling click through rate estimates as claimed in any one of claims 1 to 5.
CN201710578583.XA 2017-07-14 2017-07-14 Method and device for establishing click rate estimation model, terminal and storage medium Active CN107301247B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710578583.XA CN107301247B (en) 2017-07-14 2017-07-14 Method and device for establishing click rate estimation model, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710578583.XA CN107301247B (en) 2017-07-14 2017-07-14 Method and device for establishing click rate estimation model, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN107301247A CN107301247A (en) 2017-10-27
CN107301247B true CN107301247B (en) 2020-12-18

Family

ID=60132942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710578583.XA Active CN107301247B (en) 2017-07-14 2017-07-14 Method and device for establishing click rate estimation model, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN107301247B (en)

Also Published As

Publication number Publication date
CN107301247A (en) 2017-10-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20200527
Address after: 310051 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province
Applicant after: Alibaba (China) Co.,Ltd.
Address before: 510627 unit 02, floor 15, Tower B, Pingyun Plaza, radio and television, No. 163, xipingyun Road, Huangpu Avenue, Tianhe District, Guangzhou City, Guangdong Province (for office use only)
Applicant before: GUANGZHOU UC NETWORK TECHNOLOGY Co.,Ltd.
GR01 Patent grant