CN111352976A - Search advertisement conversion rate prediction method and device for shopping nodes - Google Patents

Info

Publication number
CN111352976A
Authority
CN
China
Prior art keywords
shopping
data
festival
feature
conversion rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010146512.4A
Other languages
Chinese (zh)
Other versions
CN111352976B (en)
Inventor
赖粤
钱毅霖
余荣
吴茂强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202010146512.4A
Publication of CN111352976A
Application granted
Publication of CN111352976B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The application discloses a method and a device for predicting the search advertisement conversion rate for shopping festivals. The method comprises the following steps: acquiring shopping data sets for the day of a shopping festival and for the period before the shopping festival; taking the shopping data set before the shopping festival as a first training set, training an advertisement conversion rate model, and predicting a first prediction result for the day of the shopping festival; taking the shopping data set for one part of the day of the shopping festival as a second training set, the shopping data set for another part of the day as a test set, and part of the data in the second training set as a verification set, and adding the first prediction result as a new feature to the second training set, the verification set and the test set respectively; and training an advertisement conversion rate model to obtain a final prediction result for the day of the shopping festival. This solves the problem of inaccurate predictions that arises when a training set constructed in the daily period is used to predict a test set constructed in the shopping festival period.

Description

Search advertisement conversion rate prediction method and device for shopping nodes
Technical Field
The application relates to the technical field of data mining, and in particular to a method and a device for predicting the search advertisement conversion rate for shopping festivals.
Background
Search advertisements are advertisements related to the search terms that are presented in the results page according to the user's search behavior. An e-commerce platform is a complex system affected by many factors. During shopping festivals such as henry and 618, the various activities of merchants and platforms cause dramatic changes in traffic distribution; we refer to the advertisement-conversion data of such a period as special traffic. A model trained in the daily period is difficult to match effectively to special traffic.
The existing scheme closest to the present invention treats conversion rate prediction as a traditional regression problem: it divides the data set with a sliding time window, uses several different machine learning algorithms as models, and fuses them with the Stacking method to obtain the final result.
The prior art typically migrates a traditional regression method to the conversion rate prediction problem without modification, and rarely distinguishes between the two data distributions of the daily period and the shopping festival period. It therefore has the following defects: 1. a training set constructed in the daily period is not adapted but is used directly to predict a test set constructed in the shopping festival period; 2. high-dimensional features are encoded with one-hot vectors, or all features are encoded with the same method without distinction, and these encoding methods perform poorly and leave considerable room for improvement; 3. a fixed time window is used to extract features, although the shopping festival period and the days around it have several different data distributions, so another way of dividing the data set is needed to avoid wasting data; 4. conversion rate features are processed with a smoothing window when they are calculated, and when several models are combined, a different weighting method is applied to each, which reduces learning efficiency as more models are used.
Disclosure of Invention
The application provides a method and a device for predicting the search advertisement conversion rate for shopping festivals, which solve the problem of inaccurate predictions that arises when a training set constructed in the daily period is used to predict a test set constructed in the shopping festival period.
In view of the above, a first aspect of the present application provides a method for predicting the search advertisement conversion rate for a shopping festival, the method comprising:
acquiring a shopping data set on the current day of a shopping festival and before the shopping festival;
taking a shopping data set before the shopping festival as a first training set, training an advertisement conversion rate model, and predicting to obtain a first prediction result of the current shopping festival;
taking the shopping data set of a part of time of the day of the shopping festival as a second training set, taking the shopping data set of another part of time of the day of the shopping festival as a test set, taking part of data in the second training set as a verification set, and respectively adding the first prediction result serving as a new feature into the second training set, the verification set and the test set;
and training the advertisement conversion rate model by adopting the second training set, the verification set and the test set added with new features to obtain the final result of the current shopping festival.
Optionally, before the shopping data set before the shopping festival is taken as a first training set to train the advertisement conversion rate model and predict a first prediction result for the day of the shopping festival, the method further includes:
and preprocessing the data in the shopping data set to obtain data in accordance with the input format of the advertisement conversion rate model.
Optionally, the preprocessing the data in the shopping data set specifically includes:
processing missing values and abnormal values in the shopping data set;
processing the original attribute characteristics of the search advertisement categories in the shopping data set to obtain characteristic data meeting format requirements;
and coding the characteristic data by adopting a hierarchical coding method based on an information entropy principle.
Optionally, the encoding the feature data by using a hierarchical encoding method based on an information entropy principle specifically includes:
inputting the characteristic data;
comparing the characteristic data with data in a characteristic library, and if the characteristic library does not contain the characteristic data, calculating a score of the characteristic data by using an information entropy formula;
if the score is greater than a first threshold, discarding the feature data;
if the score is larger than a second threshold and smaller than a first threshold, encoding the characteristic data by adopting a one-hot encoding method;
if the score is smaller than a third threshold value and the feature data are not id features, encoding the feature data by adopting a mean encoding method; if the score is smaller than a third threshold value and the feature data are id features, encoding the feature data by adopting an Embedding encoding method;
and outputting the coded characteristic data.
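The threshold rules above can be sketched as a small dispatch function. This is a minimal illustration, not the patented implementation: the score is assumed to be the Shannon entropy of the feature's value distribution, the names `entropy_score` and `choose_encoding` and the threshold arguments `t1`, `t2`, `t3` are hypothetical, and the feature-library lookup is omitted. Score ranges that the text does not cover are reported as "unspecified".

```python
import math
from collections import Counter


def entropy_score(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`.

    Assumption: the text only says "an information entropy formula";
    plain Shannon entropy is used here for illustration.
    """
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def choose_encoding(values, is_id, t1, t2, t3):
    """Pick an encoding name from the score, following the stated rules."""
    score = entropy_score(values)
    if score > t1:
        return "discard"
    if t2 < score < t1:
        return "one-hot"
    if score < t3:
        # Low-score features: Embedding for id features, mean encoding otherwise.
        return "embedding" if is_id else "mean"
    return "unspecified"  # range not covered by the rules in the text


if __name__ == "__main__":
    print(choose_encoding(["a", "b", "a", "b"], is_id=False, t1=3.0, t2=0.5, t3=0.5))
```

The function only selects an encoding; applying one-hot, mean or Embedding encoding would be a separate step.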
Optionally, the feature data includes a conversion rate feature, a user feature, a commodity feature, an id feature, a search advertisement feature, a time feature, and a ranking feature.
Optionally, in the preprocessing of the data in the shopping data set, the method further includes smoothing the conversion rate feature by using a conversion rate smoothing method based on a priori value, specifically:
Figure BDA0002400937430000031
where, when the conversion rate is calculated over different features or feature combinations, B represents the number of purchases for the corresponding feature or feature combination and C represents the number of clicks; the symbols shown in Figure BDA0002400937430000032 and Figure BDA0002400937430000033 are, respectively, the average number of purchases and the average number of clicks for the corresponding feature or feature combination in the same time range; and λ, which is tuned between 0 and 1, expresses the confidence in the statistical value.
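Because the smoothing formula itself is only available as an image, the following sketch shows one plausible form of prior-based smoothing rather than the patented formula. It assumes that λ blends the observed counts B and C with the same-period averages, so that λ = 1 trusts the observed statistic fully and λ = 0 falls back to the prior. The function name `smoothed_cvr` and its parameters are hypothetical.

```python
def smoothed_cvr(purchases, clicks, mean_purchases, mean_clicks, lam):
    """Blend a group's raw conversion rate with a prior built from averages.

    ASSUMED form (not taken from the source):
        (lam * B + (1 - lam) * mean_B) / (lam * C + (1 - lam) * mean_C)
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie between 0 and 1")
    numerator = lam * purchases + (1.0 - lam) * mean_purchases
    denominator = lam * clicks + (1.0 - lam) * mean_clicks
    # Guard against groups with no clicks and no prior clicks.
    return numerator / denominator if denominator > 0 else 0.0


if __name__ == "__main__":
    # One click and one purchase would give a raw rate of 1.0; the prior pulls it down.
    print(smoothed_cvr(1, 1, mean_purchases=2, mean_clicks=100, lam=0.5))
```

The design intent of any such prior is the same: groups with very few clicks should not receive extreme conversion rates.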
Optionally, the evaluation criterion used for the shopping data sets acquired for the day of the shopping festival and for the period before the shopping festival is the logloss function:
$$\mathrm{logloss} = -\frac{1}{N}\sum_{i=1}^{N}\left(y_i \log p_i + (1 - y_i)\log(1 - p_i)\right)$$
where N represents the number of test set samples, $y_i$ represents the true label of the i-th sample in the test set, and $p_i$ represents the estimated conversion rate of the i-th sample.
A second aspect of the present application provides a search advertisement conversion rate prediction apparatus for shopping festivals, the apparatus comprising:
the data acquisition module is used for acquiring a shopping data set on the current shopping festival and before the shopping festival;
the first prediction module is used for taking a shopping data set before the shopping festival as a first training set, training an advertisement conversion rate model and predicting to obtain a first prediction result of the current shopping festival;
the data set processing module is used for taking the shopping data set of a part of time of the day of the shopping festival as a second training set, taking the shopping data set of another part of time of the day of the shopping festival as a test set, taking part of data in the second training set as a verification set, and respectively adding the first prediction result serving as a new feature into the second training set, the verification set and the test set;
and the merging prediction module is used for training the advertisement conversion rate model by the second training set, the verification set and the test set added with new features to obtain the final result of the current shopping festival.
Optionally, the apparatus further includes:
and the preprocessing module is used for preprocessing the data in the shopping data set to obtain data in accordance with the input format of the advertisement conversion rate model.
Optionally, the preprocessing module further includes:
the data restoration module is used for processing missing values and abnormal values in the shopping data set;
the characteristic extraction module is used for processing the original characteristics of the attributes of the search advertisement categories in the shopping data set to obtain characteristic data meeting the format requirement;
and the hierarchical coding module is used for hierarchically coding the characteristic data.
According to the technical scheme, the method has the following advantages:
The application provides a method and a device for predicting the search advertisement conversion rate for shopping festivals, wherein the method comprises the following steps: acquiring shopping data sets for the day of a shopping festival and for the period before the shopping festival; taking the shopping data set before the shopping festival as a first training set, training an advertisement conversion rate model, and predicting a first prediction result for the day of the shopping festival; taking the shopping data set for one part of the day of the shopping festival as a second training set, the shopping data set for another part of the day as a test set, and part of the data in the second training set as a verification set, and adding the first prediction result as a new feature to the second training set, the verification set and the test set respectively; and training an advertisement conversion rate model to obtain a final prediction result for the day of the shopping festival.
In this method, part of the data from the day of the shopping festival is used as a training set, another part as a test set, and part of the training data as a verification set for training the corresponding machine learning model, which ensures the accuracy of the model. In addition, the data before the shopping festival is used as a training set to train a corresponding machine learning model, with time windows divided accordingly, to obtain a feature that captures the influence of the pre-festival data on the day of the shopping festival. This compensates for the lack of temporal features that results from using only the festival-day data as training data.
Drawings
FIG. 1 is a flow chart of a method of an embodiment of a search advertisement conversion rate prediction method for shopping nodes according to the present application;
FIG. 2 is a schematic diagram of an embodiment of an apparatus for predicting conversion rate of search advertisements for shopping nodes according to the present application;
FIG. 3 is a line graph of click rate on the day of the shopping festival and 7 days before the shopping festival in the embodiment of the present application;
FIG. 4 is a line graph of the total conversion on the day of the shopping festival and 7 days before the shopping festival in accordance with an embodiment of the present invention;
FIG. 5 is a graphical illustration of the local conversion on the day of the shopping festival in an embodiment of the present invention;
FIG. 6 is a flow chart of hierarchical encoding of feature data in an embodiment of the present invention;
FIG. 7 is a diagram illustrating feature data obtained after layered coding according to an embodiment of the present invention;
FIG. 8 is a diagram illustrating the data of the current day of the shopping festival as a training set, a validation set, and a test set according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an advertisement conversion rate model according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of the feature groups obtained in the embodiment of the present invention.
Detailed Description
In existing shopping festival data, a training set formed from conversion-related data before the shopping festival has little predictive power for a test set formed from conversion-related data on the day of the shopping festival. Specifically, FIG. 3 is a line graph of click rate on the day of the shopping festival and the 7 days before it, and FIG. 4 is a line graph of the total conversion rate over the same days. Three types can be seen: day1 to day6 form one type with a medium conversion rate; day7 is a second type with a low conversion rate; and day8, the day of the shopping festival, is a third type with a high conversion rate. Here the conversion rate is the ratio of the number of purchases to the number of clicks. In practice, day1 to day6 are far from the shopping festival and are little affected by it; the invention calls this the daily period. Day7 is the day before the shopping festival: while waiting for the festival to arrive, users generate a large amount of click and collection behavior but few actual purchases. On day8, the day of the shopping festival, purchasing far exceeds the daily period, and the traffic is special traffic with a high conversion rate. Therefore, if data before the shopping festival is used as a training set to predict the festival-day conversion rate directly, the prediction is distorted.
Therefore, the present application predicts the festival-day conversion rate with a composite model built from different data set division methods. Part of the festival-day data is used as a training set, another part as a test set, and part of the training data as a verification set for training the corresponding machine learning model, which ensures the accuracy of the model. In addition, the data before the shopping festival is used as a training set to train a corresponding machine learning model, with time windows divided accordingly, to obtain a feature that captures the influence of the pre-festival data on the day of the shopping festival. This compensates for the lack of temporal features that results from using only the festival-day data as training data.
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to FIG. 1, FIG. 1 is a flowchart of an embodiment of the search advertisement conversion rate prediction method for shopping festivals according to the present application. As shown in FIG. 1, the method includes:
101. Acquire shopping data sets for the day of the shopping festival and for the period before the shopping festival.
It should be noted that, when acquiring the shopping data sets for the day of the shopping festival and for the period before it, data such as search advertisements and purchases before and after the shopping festival on an e-commerce platform may be selected. The data may cover multiple dimensions, such as user behavior, product attributes, store attributes and search-term-related data, and the conversion behavior is a purchase by the user. The obtained data is large in volume and can be used for modeling, and the evaluation criterion is the logloss function:
$$\mathrm{logloss} = -\frac{1}{N}\sum_{i=1}^{N}\left(y_i \log p_i + (1 - y_i)\log(1 - p_i)\right)$$
where N represents the number of test set samples, $y_i$ represents the true label of the i-th sample in the test set, and $p_i$ represents the predicted conversion rate of the i-th sample. A smaller logloss value indicates a more accurate prediction.
In a specific embodiment, data for the day of the shopping festival and the 7 days before it may be obtained, as shown in FIGS. 3 to 5: FIG. 3 is a line graph of click rate on the day of the shopping festival and the 7 days before it; FIG. 4 is a line graph of the total conversion rate over the same days; and FIG. 5 is a schematic diagram of the local conversion rate on the day of the shopping festival. It can be seen that the data distribution before the shopping festival differs greatly from the conversion rate distribution on the day of the shopping festival.
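The logloss criterion described above can be computed as in the following minimal sketch. The function name `logloss` and the clipping constant `eps` are assumptions added for illustration; clipping only protects against taking the logarithm of zero and is not mentioned in the text.

```python
import math


def logloss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted rates."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # assumed safeguard against log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


if __name__ == "__main__":
    labels = [1, 0, 0, 1]
    print(logloss(labels, [0.9, 0.1, 0.2, 0.8]))  # confident and correct
    print(logloss(labels, [0.6, 0.4, 0.5, 0.5]))  # less confident, larger loss
```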
102. And taking a shopping data set before the shopping festival as a first training set, training an advertisement conversion rate model, and predicting to obtain a first prediction result of the current shopping festival.
It should be noted that, because the purchase volume on the day of the shopping festival far exceeds the average daily purchase volume before the shopping festival, part of the festival-day data must be used as the training set for modeling user behavior so that the training set and the test set have the same data distribution. The data before the shopping festival should not simply be discarded, however, as that would waste a great deal of data; instead, it can be used to characterize users, commodities, shops, and so on. Specifically, one part of the pre-festival data set can be used to extract the features of users, commodities, shops, etc. from the historical data, and another part can be used as a training set to predict the festival-day data, with the prediction result used as a new feature.
In one specific embodiment, data from the day of the shopping festival and from the 7 days before it are used as experimental data. As shown by Model2 in the structural diagram of the advertisement conversion rate model in FIG. 9, the advertisement conversion rate model is trained on the data of day4 to day6 to predict the conversion rate on the day of the shopping festival, and the data of day1 to day3 are used to extract the features of users, commodities, stores, etc. from the historical data.
103. And taking the shopping data set of the shopping festival in a part of the current day as a second training set, taking the shopping data set of the shopping festival in another part of the current day as a test set, taking part of the data in the second training set as a verification set, and respectively adding the first prediction result serving as a new feature into the second training set, the verification set and the test set.
It should be noted that, because the purchase volume on the day of the shopping festival far exceeds the average daily purchase volume before the shopping festival, only the festival-day data is used as the main training set for modeling: the shopping data set for one part of the day is taken as the second training set, the shopping data set for another part of the day as the test set, and part of the data in the second training set as the verification set. However, considering only the festival-day data loses temporal features, because the data before the shopping festival may influence the prediction for the day of the shopping festival. It is therefore necessary to predict the festival-day conversion rate with the pre-festival data as a training set, and since that prediction is itself a festival-day conversion rate, it is added as a new feature to the second training set, the verification set and the test set respectively.
In a specific embodiment, according to the local conversion rate diagram for the day of the shopping festival shown in FIG. 5, the festival-day data from hour 0 to hour 10 may be selected as the training set, the data from hour 10 to hour 12 as the verification set, and the data from hour 12 to hour 24 as the test set. As shown in the structural diagram of the advertisement conversion rate model in FIG. 9, Dataset2 adopts a conventional sliding time window method: the data before the shopping festival is used as a training set to predict the festival-day conversion rate, and the prediction result is added as a new feature column to the second training set, the verification set and the test set.
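The hour-based division in this embodiment can be sketched as follows. The record layout (a dictionary with an "hour" key) and the function name `split_festival_day` are hypothetical, and treating the boundaries as half-open intervals (hour 10 goes to the verification set, hour 12 to the test set) is an assumption, since the text does not say which side each boundary belongs to.

```python
def split_festival_day(rows, hour_of=lambda row: row["hour"]):
    """Split festival-day records into train / verification / test by hour.

    Assumed boundaries: [0, 10) train, [10, 12) verification, [12, 24) test.
    """
    train, valid, test = [], [], []
    for row in rows:
        hour = hour_of(row)
        if hour < 10:
            train.append(row)
        elif hour < 12:
            valid.append(row)
        else:
            test.append(row)
    return train, valid, test


if __name__ == "__main__":
    day8 = [{"hour": h, "clicked_item": f"item{h}"} for h in range(24)]
    train, valid, test = split_festival_day(day8)
    print(len(train), len(valid), len(test))
```

Splitting by time rather than at random keeps the test set strictly later than the training data, which matches how the model would be used during the festival.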
104. And training an advertisement conversion rate model by adopting a second training set, a verification set and a test set added with new characteristics to obtain a final result of the current shopping festival.
It should be noted that, to make up for the missing temporal features, the feature obtained by training the advertisement conversion rate model on pre-festival data, which carries temporal information, is added to the festival-day training set, verification set and test set; these are then used to train the corresponding advertisement conversion rate model and obtain the final result for the day of the shopping festival.
In a specific embodiment, Dataset2 uses a conventional sliding time window method: day1 to day7 are taken as the training set to predict the conversion rate for all of day8, and the prediction result is added to Dataset1 as a new feature column. This new column carries temporal information and compensates for the lack of temporal features in Dataset1.
In this method, part of the data from the day of the shopping festival is used as a training set, another part as a test set, and part of the training data as a verification set for training the corresponding machine learning model, which ensures the accuracy of the model. In addition, the data before the shopping festival is used as a training set to train a corresponding machine learning model, with time windows divided accordingly, to obtain a feature that captures the influence of the pre-festival data on the day of the shopping festival. This compensates for the lack of temporal features that results from using only the festival-day data as training data.
The above is the method flow of one embodiment of the search advertisement conversion rate prediction method for shopping festivals according to the present application. The present application further includes another embodiment of the method, whose steps further include:
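Steps 101 to 104 can be sketched as a two-stage pipeline: a first model trained on pre-festival data scores the festival-day rows, and its score is appended as an extra feature column before the second model is trained. The tiny pure-Python logistic regression below is only a stand-in, since this passage does not specify the form of the advertisement conversion rate model; all function names are hypothetical. A verification set would be augmented in the same way as the training and test rows.

```python
import math


def predict_one(weights, x):
    """Logistic prediction; the last weight is the bias term."""
    z = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Batch gradient descent for logistic regression (stand-in model)."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            err = predict_one(weights, xi) - yi
            for j, xij in enumerate(xi):
                grad[j] += err * xij
            grad[-1] += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
    return weights


def add_first_stage_feature(first_model, X):
    """Append the first-stage prediction to every row as a new column."""
    return [row + [predict_one(first_model, row)] for row in X]


def two_stage_predict(pre_X, pre_y, train_X, train_y, test_X):
    first_model = fit_logistic(pre_X, pre_y)           # pre-festival data
    train_aug = add_first_stage_feature(first_model, train_X)
    test_aug = add_first_stage_feature(first_model, test_X)
    final_model = fit_logistic(train_aug, train_y)     # festival-day data
    return [predict_one(final_model, row) for row in test_aug]


if __name__ == "__main__":
    pre_X, pre_y = [[0.0], [1.0], [0.0], [1.0]], [0, 1, 0, 1]
    day_X, day_y = [[0.0], [1.0], [1.0], [0.0]], [0, 1, 1, 0]
    print(two_stage_predict(pre_X, pre_y, day_X, day_y, [[0.0], [1.0]]))
```

Passing the first model's output as a feature, rather than mixing the two periods into one training set, lets the second model decide how much weight the pre-festival signal deserves.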
201. and preprocessing the data in the shopping data set to obtain data in accordance with the input format of the advertisement conversion rate model.
Wherein the pretreatment specifically comprises:
2011. missing values and outliers in the shopping dataset are processed.
The processing of the missing values and the abnormal values in the shopping dataset includes filling the missing values, deleting or replacing the abnormal values, and denoising the abnormal click data.
Specifically, the missing values are filled: and carrying out mode filling on the discrete data, and carrying out median filling on the continuous data.
Deletion or substitution of outliers: and (4) performing data analysis by adopting the box type graph, and filling or deleting abnormal values in the box type graph by using a specific value.
And denoising the abnormal click data: through data exploration, the fact that the number of clicks of certain shops in a certain time period is large but the clicks are not converted into purchasing behaviors once can be found, so from the business perspective, the clicks are considered to belong to the bill swiping behaviors, the data needs to be subjected to noise reduction, and the specific method is to eliminate the users who have the largest number of clicks and do not purchase corresponding to suspicious merchants.
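The three preprocessing operations above can be sketched as follows, using only the Python standard library. This is an illustration under assumptions: the 1.5 × IQR whisker rule, the choice of the median as the "specific value" for replacement, and the field names `user_id`, `shop_id` and `is_trade` are not specified in the application.

```python
import statistics
from collections import Counter


def fill_missing(values, discrete):
    """Fill None entries with the mode (discrete) or the median (continuous)."""
    present = [v for v in values if v is not None]
    fill = statistics.mode(present) if discrete else statistics.median(present)
    return [fill if v is None else v for v in values]


def replace_outliers(values, k=1.5):
    """Replace values outside the box-plot whiskers with the median.

    The whiskers are assumed to sit k * IQR beyond the first and third quartiles.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    median = statistics.median(values)
    return [v if low <= v <= high else median for v in values]


def drop_top_non_buyer(rows, shop_id):
    """For one suspicious shop, drop the clicks of the non-buying user with most clicks."""
    clicks, buyers = Counter(), set()
    for r in rows:
        if r["shop_id"] == shop_id:
            clicks[r["user_id"]] += 1
            if r["is_trade"]:
                buyers.add(r["user_id"])
    non_buyers = {u: c for u, c in clicks.items() if u not in buyers}
    if not non_buyers:
        return list(rows)
    worst = max(non_buyers, key=non_buyers.get)
    return [r for r in rows
            if not (r["shop_id"] == shop_id and r["user_id"] == worst)]
```

How a shop is flagged as suspicious is left outside the sketch, since the application describes that step only as the outcome of data exploration.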
2012. Processing the original category and attribute features of the search advertisements in the shopping data set to obtain feature data that meets the format requirement.
It should be noted that the category and attribute features of a commodity belong to the raw data. In addition, most commercial platforms have a special column of data: the predicted category-attribute list, i.e., the commodity features that the platform predicts, from the search term entered by the user and by methods such as collaborative filtering, to be of possible interest to the user. This is data specific to the search advertisement scenario. The data formats of the original category attributes of the commodities and of the predicted category attributes of the search terms are shown in the following table; these features need to be split so that they are separated into multiple feature columns.
Figure BDA0002400937430000091
Figure BDA0002400937430000101
For item_properties, the number of occurrences of each commodity attribute is counted and the ten most frequent attributes are kept as new features; for prediction_category_property, the ten most frequent items are likewise kept as new features; for item_categories, only the leaf category is taken as a new feature, because the root categories are all the same.
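A minimal sketch of this splitting and counting step is given below. The actual string format of these columns appears only in the table figure, so the `;` separator used here is an assumption, as are the helper names `top_properties`, `property_flags` and `leaf_category`.

```python
from collections import Counter


def top_properties(property_strings, n=10, sep=";"):
    """Count every property over all rows and return the n most frequent ones."""
    counts = Counter()
    for s in property_strings:
        counts.update(p for p in s.split(sep) if p)
    return [p for p, _ in counts.most_common(n)]


def property_flags(property_string, kept, sep=";"):
    """Turn one row's property string into 0/1 columns for the kept properties."""
    present = set(property_string.split(sep))
    return [1 if p in present else 0 for p in kept]


def leaf_category(category_string, sep=";"):
    """Keep only the last (leaf) category, since the root is shared by all rows."""
    return category_string.split(sep)[-1]
```

For example, with the rows `"p1;p2"`, `"p2;p3"` and `"p2;p1"`, `top_properties(..., n=2)` keeps `p2` and `p1`, and each row is then represented by two indicator columns.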
2013. Encoding the feature data by a hierarchical encoding method based on the information entropy principle.
It should be noted that the present application adopts a hierarchical encoding method based on the information entropy principle to solve the feature encoding problem, in which usable high-dimensional features are processed by combining mean encoding with Embedding.
The specific process of hierarchical encoding is shown in fig. 6 and includes:
inputting the feature data; comparing the feature data with the data in a feature library, and if the feature library does not contain the feature data, calculating a score for the feature data; if the score is greater than a first threshold, discarding the feature data; if the score is greater than a second threshold and smaller than the first threshold, encoding the feature data with one-hot encoding; if the score is smaller than a third threshold and the feature data are not id-type features, encoding the feature data with mean encoding; if the score is smaller than the third threshold and the feature data are id-type features, encoding the feature data with Embedding; and outputting the encoded feature data.
It should be noted that features have two common attributes: dimensionality and sparsity. In the traditional method, dimensionality can be distinguished by setting a threshold, whereas sparsity is usually judged manually, by experience and according to the specific scenario; the present application distinguishes it by a quantitative method.
Information entropy is the expectation of the amount of information; it represents the uncertainty of information and describes the degree of disorder of the data. When the probabilities in the entropy are estimated from data, the corresponding entropy is called the empirical entropy. The present method uses the idea of information entropy to measure the degree of sparsity of a feature, that is, the empirical entropy is used to calculate the degree of dispersion of a column of feature data. Let the data set be D, with |D| samples; let the values of a given feature column be indexed by k (k = 0, 1, 2, …, K); and let the number of samples taking the k-th value be |C_k|. The empirical entropy of the feature is calculated as follows:
Figure BDA0002400937430000102
The smaller the entropy value H, the denser the distribution of the feature column; the larger the entropy value H, the sparser the distribution of the feature column. The feature dimension is denoted by K. The dimension and the entropy value are multiplied together, and the product is denoted Score, according to the following formula:
Figure BDA0002400937430000111
Analysis shows that the higher the feature dimension and the sparser the feature distribution, the larger the Score. Moreover, provided that all feature columns belong to the same training set and the training set is large enough, a low-dimensional feature cannot at the same time be sparsely distributed, so the features fall into three cases according to their Score: high-dimensional and overly sparse, high-dimensional and relatively dense, and low-dimensional.
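The Score described above can be computed along the following lines. The sketch uses the standard definition of empirical entropy, H(D) = -Σ_k (|C_k|/|D|) · log(|C_k|/|D|), and multiplies it by the number of distinct values K; the base-2 logarithm is an assumption, since the formula itself is given only in the figure, and the function names are hypothetical.

```python
import math
from collections import Counter


def empirical_entropy(column):
    """Empirical entropy of one feature column (base-2 logarithm assumed)."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())


def sparsity_score(column):
    """Score = K * H, where K is the number of distinct values in the column."""
    k = len(set(column))
    return k * empirical_entropy(column)
```

A balanced binary column such as `["a", "a", "b", "b"]` has H = 1 and K = 2, giving a Score of 2, whereas an id-like column with four distinct values has H = 2 and K = 4, giving a Score of 8, which matches the observation that high-dimensional, sparse features score highest.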
The present application selects thresholds a and b to distinguish these three cases. Initial values of a and b are first set on the basis of the actual business, according to the proportions of the three kinds of features among all features; during parameter tuning the thresholds are varied within a certain range around the initial values, and the values giving the best result are selected.
The specific coding method adopted is shown in the following table:
Figure BDA0002400937430000112
For usable high-dimensional features, it is further determined whether they are id-type features. When id-type features are processed, each id generally represents an entity; the feature values can be arranged in document form and encoded with Embedding, and the resulting vectors bring entities with similar attributes closer together in space, which greatly improves the prediction for user and commodity profile features. Mean encoding is applicable to all kinds of high-dimensional features, but because it relies on statistics alone, it encodes id-type features less well than Embedding; mean encoding is therefore used as a complement to Embedding, to process the other high-dimensional features.
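As an illustration of the statistical nature of mean encoding mentioned above, the following sketch replaces each category value with the mean label observed for that value. It shows the plain form of the technique, not necessarily the exact variant used in the application; the function names and the fallback to the global mean for unseen values are assumptions.

```python
from collections import defaultdict


def fit_mean_encoding(categories, labels):
    """Learn the mean label per category value, plus the global mean."""
    sums, counts = defaultdict(float), defaultdict(int)
    for c, y in zip(categories, labels):
        sums[c] += y
        counts[c] += 1
    global_mean = sum(labels) / len(labels)
    mapping = {c: sums[c] / counts[c] for c in counts}
    return mapping, global_mean


def apply_mean_encoding(categories, mapping, global_mean):
    """Encode values with the learned means; unseen values get the global mean."""
    return [mapping.get(c, global_mean) for c in categories]
```

In practice the mapping is usually learned on data that excludes the rows being encoded (for example out-of-fold), so that a row's own label does not leak into its feature.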
In addition, to save encoding time, a feature library can be built from the predictions made during ordinary (non-festival) periods, storing the name, structure and encoding method of each existing feature. When a feature is to be encoded, the feature library is first searched for it. If it is found, the encoding method stored for that feature is used directly; if not, the encoding method of the present invention is applied, and after encoding, the name, structure and encoding method of the new feature are stored in the feature library. The library thus becomes a new feature library suited to large shopping festivals and can be used again when the next shopping festival arrives.
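The lookup-then-store behaviour of the feature library can be sketched with a plain dictionary. This is only an illustration: the library layout, the use of the number of distinct values as the stored "structure", and the `choose_method` callback (standing in for the score-based selection) are all assumptions.

```python
def encode_with_library(name, column, library, choose_method):
    """Return the encoding method for a feature, consulting the library first.

    library maps a feature name to {"method": ..., "cardinality": ...};
    choose_method is a hypothetical callback that picks a method for a new feature.
    """
    entry = library.get(name)
    if entry is None:
        # Feature not seen before: decide once and remember the decision.
        entry = {"method": choose_method(column), "cardinality": len(set(column))}
        library[name] = entry
    return entry["method"]
```

On the second request for the same feature name the stored method is returned without calling `choose_method` again, which is where the time saving comes from.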
In the present application, the feature data may further include a conversion rate feature, a user feature, a commodity feature, an id feature, a search advertisement feature, a time feature, and a ranking feature.
It should be noted that the conversion rate feature is defined as the ratio of the number of purchases to the number of clicks. However, this simple calculation has the following two problems:
a. When the number of advertisement clicks is small, directly calculating the conversion rate may give too high a result. For example, if an advertisement is clicked only once and that click produces one purchase, then CVR = 1.0, which is an overestimate.
b. When the number of advertisement clicks is large but the number of purchases is small, directly calculating the conversion rate gives too low a result, even close to 0, which is an underestimate.
Therefore, the conversion rate needs to be smoothed. The traditional approach is Bayesian smoothing, a method widely used in click-through rate and conversion rate estimation. However, its parameters are cumbersome to compute, and with large data volumes frequent Bayesian smoothing occupies substantial computing resources and makes the whole process very inefficient. The present application proposes a conversion rate smoothing method based on a prior value. Although its precision theoretically differs somewhat from that of Bayesian smoothing, the effect on the final result is almost the same in practice, and the running speed is greatly improved. The specific method is as follows:
Figure BDA0002400937430000121
Here, when the conversion rate is calculated within groups of different features or feature combinations, B represents the number of purchases for the corresponding feature or feature combination, C represents the number of clicks for the corresponding feature or feature combination, $\bar{B}$ and $\bar{C}$ are respectively the average number of purchases and the average number of clicks for the corresponding features or feature combinations in the same time range, and λ, which is tuned between 0 and 1, represents the confidence of the statistical value.
Specifically, when the conversion rate is calculated within the item id group, B represents the number of purchases of a given item, C represents the number of clicks on that item, and $\bar{B}$ and $\bar{C}$ are respectively the average number of purchases and the average number of clicks over all items in the same time range.
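The effect of prior-based smoothing on the two problem cases can be seen in a small sketch. The exact formula of the application is given only in the figure and is not reproduced in this text; the additive form below, (B + λ·B̄) / (C + λ·C̄), is one common way of smoothing with a prior and is purely an assumption, as are the function and parameter names.

```python
def smoothed_cvr(purchases, clicks, mean_purchases, mean_clicks, lam=0.5):
    """Prior-smoothed conversion rate (assumed additive form, not the patent's formula).

    purchases, clicks           : B and C for one feature value (e.g. one item id)
    mean_purchases, mean_clicks : averages of B and C over the same time range
    lam                         : weight in (0, 1] given to the prior
    """
    return (purchases + lam * mean_purchases) / (clicks + lam * mean_clicks)
```

With an average of 2 purchases per 100 clicks and λ = 0.5, an advertisement with one click and one purchase is scored (1 + 1) / (1 + 50) ≈ 0.039 instead of 1.0, and an advertisement with no clicks at all falls back to the prior rate of 0.02.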
It should be noted that the user features describe the click behavior records of different users. However, the data set contains both users who click often but purchase rarely and users who click rarely but purchase often. For a low-frequency user, historical behavior is hard to characterize, and only basic statistics, such as the category attributes of the user's search terms and clicked commodities, can be computed; for a high-frequency user, more specific preference features can be characterized. The corresponding user features are as follows:
a. User frequency: whether the user appeared during the ordinary period, whether the user appeared on the previous day, whether the user clicked the corresponding commodity or shop on the previous day, and the like.
b. User behavior: the number of clicks, whether a click is the first or the last click, the time interval since the previous click on the shop, the proportion of clicks on the commodity, and the like.
It should be noted that the commodity features include the price ranking and promotion intensity of the commodity within its category, and the like.
It should be noted that the search advertisement features include the original commodity category, the search term with which the user expresses his or her intention, and the category-attribute list predicted from the search term; that is, they include the search term, the original commodity category, and the predicted category-attribute features.
It should be noted that the time features include time window features, features at different time granularities, time difference features, and the like. The time window features are relatively important. Specifically, a series of click, conversion and cross features are extracted for users, commodities and the like over two time periods, namely the ordinary days (day 1-day 6) and the day before the shopping festival (day 7); for the day of the shopping festival (day 8), statistics from the morning of that day are used to characterize the behavior of commodities, shops and the like on that day.
It should be noted that the ranking features include global ranking and local ranking, as shown in fig. 10:
Global ranking: ranking by the number of times a user clicks on commodities, by the number of times a user clicks on shops, by the number of times a commodity is clicked by different users, by the number of times a shop is clicked by different users, and the like.
Local ranking: the rank of the number of times a user clicks on a category/commodity/shop among all the categories/commodities/shops that the user has clicked, the rank of the number of times a commodity is purchased by different users, and the like.
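The difference between a global and a local ranking feature can be sketched as follows. Click records are assumed to be `(user_id, item_id)` pairs, and the use of a dense rank (equal counts share a rank) as well as the function names are choices made for this illustration only.

```python
from collections import Counter


def dense_rank_desc(counts):
    """Map each key to its rank (1 = largest count); equal counts share a rank."""
    ordered = sorted(set(counts.values()), reverse=True)
    rank_of = {value: i + 1 for i, value in enumerate(ordered)}
    return {key: rank_of[value] for key, value in counts.items()}


def global_user_rank(clicks):
    """Global ranking: rank all users by their total number of clicks."""
    return dense_rank_desc(Counter(user for user, _ in clicks))


def local_item_rank(clicks, user):
    """Local ranking: rank the items one user clicked by that user's click counts."""
    return dense_rank_desc(Counter(item for u, item in clicks if u == user))
```

For the records `[("u1", "a"), ("u1", "a"), ("u1", "b"), ("u2", "a")]`, the global ranking places `u1` first among users, while the local ranking for `u1` places item `a` ahead of item `b`.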
In addition, it should be noted that in the present application the advertisement conversion rate model is solved with the XGBoost algorithm. Specifically, in a conversion prediction scenario there are necessarily far more clicks than conversions, i.e., the positive and negative samples are imbalanced. The conventional approach is to resample the data before prediction, usually by down-sampling the majority class or up-sampling the minority class. When the ranking of positive and negative samples in the prediction result is the main concern, this prevents the predictions from being biased toward the majority class; in that case the scale_pos_weight parameter of the XGBoost algorithm can be set, which in principle amounts to up-sampling the minority class.
However, the prediction target of the present application is a specific probability value, and the evaluation metric is the logloss function, which places high demands on the accuracy of the probability value. If the ratio of positive to negative samples in the training set is changed by sampling, the accuracy of the predicted probability is affected, because the distribution of the original data is changed. Therefore, the data set is not processed with a traditional sampling method, i.e., the scale_pos_weight parameter is not specially set; instead, the max_delta_step parameter of the XGBoost algorithm is set to a finite number, the purpose of which is to prevent overfitting and help convergence.
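The evaluation metric and the parameter choice discussed above can be sketched as follows. The logloss function is the standard binary cross-entropy; the parameter dictionary uses XGBoost parameter names that exist in the library, but the concrete value of `max_delta_step` is a hypothetical choice, since the application only states that it is set to a finite number.

```python
import math


def logloss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(labels)


# Parameters that could be passed to XGBoost (for example to xgboost.train).
# scale_pos_weight is deliberately left at its default so that the predicted
# probabilities keep the class distribution of the original data.
xgb_params = {
    "objective": "binary:logistic",
    "eval_metric": "logloss",
    "max_delta_step": 1,  # finite value; the exact number is an assumption
}
```

Because logloss penalizes miscalibrated probabilities directly, a model trained on resampled data, whose outputs are shifted toward the minority class, would be penalized even if its ranking of samples were unchanged.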
It should be further noted that feature importance is measured with the feature scoring function of the XGBoost algorithm. The calculation shows that the following features are of greater importance: the search-term-related features form the feature group with the largest influence, with a measured effect on the result of 6 thousandths; the conversion rate features form the group with the second largest influence, with a measured effect on the result of 3 thousandths. Both findings fit the scenario of search advertisement conversion rate prediction well. The ranking features, the time features, the user and commodity profile feature groups and the like also have some influence on the result, between one ten-thousandth and one thousandth.
Finally, the reproduced result of the scheme most similar to the present application is compared with the result of the present application. As the following table shows, the loss value of the present scheme is smaller, i.e., its result is closer to the true value. This demonstrates that the method is an efficient and accurate search advertisement conversion rate prediction method that can adapt to special traffic.
                  Most similar scheme   This scheme
logloss value     0.14183               0.13990
The above are embodiments of the method of the present application. The present application further includes an embodiment of a device for predicting the search advertisement conversion rate for a shopping festival, shown in fig. 2, which includes:
The data acquisition module 301 is configured to acquire a shopping data set for the day of the shopping festival and for the period before the shopping festival.
The first prediction module 302 is configured to train an advertisement conversion rate model with the shopping data set from before the shopping festival as a first training set, and to predict a first prediction result for the day of the shopping festival.
The data set processing module 303 is configured to use the shopping data set from one part of the day of the shopping festival as a second training set, use the shopping data set from another part of the day of the shopping festival as a test set, use part of the data in the second training set as a verification set, and add the first prediction result as a new feature to the second training set, the verification set, and the test set, respectively.
The merging prediction module 304 is configured to train the advertisement conversion rate model with the second training set, the verification set and the test set to which the new feature has been added, to obtain the final result for the day of the shopping festival.
The embodiment of the application further comprises:
and the preprocessing module is used for preprocessing the data in the shopping data set to obtain data in accordance with the input format of the advertisement conversion rate model.
Wherein, the preprocessing module further comprises:
the data restoration module is used for processing missing values and abnormal values in the shopping data set;
the characteristic extraction module is used for processing the original characteristics of the attributes of the search advertisement categories in the shopping data set to obtain characteristic data meeting the format requirement;
and the hierarchical coding module is used for hierarchically coding the characteristic data.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical division, and other divisions may be realized in practice, for example, a plurality of modules may be combined or integrated into another system, or some features may be omitted, or not executed.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A method for predicting the conversion rate of search advertisements for shopping nodes is characterized by comprising the following steps:
acquiring a shopping data set on the current day of a shopping festival and before the shopping festival;
taking a shopping data set before the shopping festival as a first training set, training an advertisement conversion rate model, and predicting to obtain a first prediction result of the current shopping festival;
taking the shopping data set of a part of time of the day of the shopping festival as a second training set, taking the shopping data set of another part of time of the day of the shopping festival as a test set, taking part of data in the second training set as a verification set, and respectively adding the first prediction result serving as a new feature into the second training set, the verification set and the test set;
and training the advertisement conversion rate model by adopting the second training set, the verification set and the test set added with new features to obtain the final result of the current shopping festival.
2. The method of claim 1, wherein, before the shopping data set before the shopping festival is taken as the first training set to train the advertisement conversion rate model and predict the first prediction result of the current day of the shopping festival, the method further comprises:
and preprocessing the data in the shopping data set to obtain data in accordance with the input format of the advertisement conversion rate model.
3. The method for predicting search advertisement conversion rate for shopping nodes according to claim 2, wherein the preprocessing of the data in the shopping data set is specifically:
processing missing values and abnormal values in the shopping data set;
processing the original attribute characteristics of the search advertisement categories in the shopping data set to obtain characteristic data meeting format requirements;
and coding the characteristic data by adopting a hierarchical coding method based on an information entropy principle.
4. The method for predicting search advertisement conversion rate for shopping node according to claim 3, wherein said encoding the feature data by using a hierarchical encoding method based on information entropy principle specifically comprises:
inputting the characteristic data;
comparing the characteristic data with data in a characteristic library, and if the characteristic library does not contain the characteristic data, calculating a score of the characteristic data by using an information entropy formula;
if the score is greater than a first threshold, discarding the feature data;
if the score is larger than a second threshold and smaller than a first threshold, encoding the characteristic data by adopting a one-hot encoding method;
if the score is smaller than a third threshold value and the feature data are not id features, encoding the feature data by adopting a mean encoding method; if the score is smaller than a third threshold value and the feature data are id features, encoding the feature data by adopting an Embedding encoding method;
and outputting the coded characteristic data.
5. The method of claim 3, wherein the feature data includes a conversion feature, a user feature, a commodity feature, an id feature, a search advertisement feature, a time feature, and a ranking feature.
6. The method of claim 5, wherein the preprocessing the data in the shopping dataset further comprises smoothing the conversion characteristics by a conversion smoothing method based on a priori values, specifically:
Figure FDA0002400937420000021
when the conversion rate is calculated in different feature or feature combination groups, B represents the number of purchases of the corresponding feature or feature combination, C represents the number of clicks of the corresponding feature or feature combination, $\bar{B}$ and $\bar{C}$ respectively represent the average purchase number and the average click number of the corresponding features or feature combinations in the same time range, and λ, whose parameter adjustment range is between 0 and 1, represents the confidence of the statistical value.
7. The method of claim 1, wherein the evaluation criteria for obtaining the shopping data sets on the day of the shopping festival and before the shopping festival are:
Figure FDA0002400937420000024
where N represents the number of test set samples, y_i represents the true label of the i-th sample in the test set, and p_i represents the estimated conversion rate of the i-th sample.
8. A search advertisement conversion rate prediction apparatus for a shopping node, comprising:
the data acquisition module is used for acquiring a shopping data set on the current shopping festival and before the shopping festival;
the first prediction module is used for taking a shopping data set before the shopping festival as a first training set, training an advertisement conversion rate model and predicting to obtain a first prediction result of the current shopping festival;
the data set processing module is used for taking the shopping data set of a part of time of the day of the shopping festival as a second training set, taking the shopping data set of another part of time of the day of the shopping festival as a test set, taking part of data in the second training set as a verification set, and respectively adding the first prediction result serving as a new feature into the second training set, the verification set and the test set;
and the merging prediction module is used for training the advertisement conversion rate model by the second training set, the verification set and the test set added with new features to obtain the final result of the current shopping festival.
9. The apparatus for predicting search advertisement conversion rate for shopping nodes according to claim 8, further comprising:
and the preprocessing module is used for preprocessing the data in the shopping data set to obtain data in accordance with the input format of the advertisement conversion rate model.
10. The apparatus of claim 9, wherein the preprocessing module further comprises:
the data restoration module is used for processing missing values and abnormal values in the shopping data set;
the characteristic extraction module is used for processing the original characteristics of the attributes of the search advertisement categories in the shopping data set to obtain characteristic data meeting the format requirement;
and the hierarchical coding module is used for hierarchically coding the characteristic data.
CN202010146512.4A 2020-03-05 2020-03-05 Search advertisement conversion rate prediction method and device for shopping node Active CN111352976B (en)

Publications (2)

Publication Number Publication Date
CN111352976A true CN111352976A (en) 2020-06-30
CN111352976B CN111352976B (en) 2023-05-09

Family

ID=71197263

Country Status (1)

Country Link
CN (1) CN111352976B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255651A (en) * 2018-08-22 2019-01-22 重庆邮电大学 A kind of search advertisements conversion intelligent Forecasting based on big data
CN109741113A (en) * 2019-01-10 2019-05-10 博拉网络股份有限公司 A kind of user's purchase intention prediction technique based on big data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
朱琛刚;程光;: "基于收视行为的互联网电视节目流行度预测模型", 电子与信息学报 *

Also Published As

Publication number Publication date
CN111352976B (en) 2023-05-09

Similar Documents

Publication Publication Date Title
CN111352976A (en) Search advertisement conversion rate prediction method and device for shopping nodes
CN106485562B (en) Commodity information recommendation method and system based on user historical behaviors
CN109509033B (en) Big data prediction method for user purchasing behavior in consumption financial scene
CN108665311B (en) Electric commercial user time-varying feature similarity calculation recommendation method based on deep neural network
CN108550068B (en) Personalized commodity recommendation method and system based on user behavior analysis
TWI793412B (en) Consumption prediction system and consumption prediction method
CN112200601B (en) Item recommendation method, device and readable storage medium
CN109255651A (en) Intelligent prediction method for search advertisement conversion based on big data
CN102902691A (en) Recommending method and recommending system
CN110880127B (en) Consumption level prediction method and device, electronic equipment and storage medium
CN116431931B (en) Real-time incremental data statistical analysis method
CN111652735A (en) Insurance product recommendation method based on user behavior label characteristics and commodity characteristics
KR101435096B1 (en) Apparatus and method for prediction of merchandise demand using social network service data
CN113570398A (en) Promotion data processing method, model training method, system and storage medium
KR101658714B1 (en) Method and system for predicting online customer action based on online activity history
CN117237038A (en) Commodity accurate exposure processing system based on flow engine
CN115841345A (en) Cross-border big data intelligent analysis method, system and storage medium
CN110209944A (en) Stock analyst recommendation method, device, computer equipment and storage medium
CN115965468A (en) Transaction data-based abnormal behavior detection method, device, equipment and medium
Bhargavi et al. Comparative study of consumer purchasing and decision pattern analysis using pincer search based data mining method
Cho et al. Clustering method using weighted preference based on RFM score for personalized recommendation system in u-commerce
CN113065892A (en) Information pushing method, device, equipment and storage medium
CN112016582A (en) Dish recommending method and device
Tekin et al. Click and sales prediction for OTAs’ digital advertisements: Fuzzy clustering based approach
CN112182165B (en) New product quality planning method based on online comments

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant