CN106529721A - Advertisement click-through rate prediction system based on deep characteristic extraction and prediction method thereof - Google Patents

Advertisement click-through rate prediction system based on deep characteristic extraction and prediction method thereof

Info

Publication number
CN106529721A
Authority
CN
China
Prior art keywords
data
advertisement
depth characteristic
click
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610983314.7A
Other languages
Chinese (zh)
Other versions
CN106529721B (en)
Inventor
许荣斌
谢莹
张磊
张兴义
张以文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University
Priority to CN201610983314.7A
Publication of CN106529721A
Application granted
Publication of CN106529721B
Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an advertisement click-through rate prediction system based on deep feature extraction and a prediction method thereof. The system comprises: an advertisement log data acquisition subsystem, which acquires advertisement click log data; a partition detection subsystem, which performs partition detection on the advertisement click log data; a ten-layer sparse constraint feature extraction hidden-layer subsystem, which extracts the deep features of the advertisement data from the advertisement click log data after partition detection; and a space constraint model generation subsystem, which applies a space constraint to the deep features to obtain a prediction model. When new advertisement click log data are input, the prediction model produces the corresponding prediction result. The system and method can extract the deep features in advertisement click log data: for a large volume of advertisement click log data, partition module detection is performed first, sparse planning is then strengthened and combined with the space constraint, and the prediction model is generated.

Description

Advertisement click-through rate prediction system based on deep feature extraction and prediction method thereof
Technical field
The present invention relates to an advertisement click-through rate prediction system and prediction method in the field of advertisement delivery, and in particular to an advertisement click-through rate prediction system based on deep feature extraction and a prediction method thereof.
Background
In the patents found to date, most methods optimize and train their models with logistic regression, Bayesian methods and the like. Such linear models cannot learn the nonlinear feature information in the data, and their many parameters easily cause over-fitting. These models use maximum likelihood estimation and need large amounts of data to guarantee performance, so they are ill-suited to estimating sparse advertisement data. A small number of patents and techniques do involve deep learning, but in massive advertisement log data the display frequency and click-through rate of advertisements follow a power-law distribution, as does the frequency of search keywords. Massive advertisement click data call for large-scale data analysis; the click logs of most advertisements and queries are sparse, and the features are highly nonlinearly correlated. The methods proposed so far carry many preconditions in practical use, and their ability to extract deep features from advertisement click data is insufficient. As data volumes grow and the demand for analysis becomes more urgent, the methods adopted in existing patents and literature cannot meet current application needs.
For example, Chinese patent specification CN105787767A, "Method and system for obtaining an advertisement click-through rate prediction model", clusters and reduces the dimensionality of the user data, the query keyword data and the advertisement data separately, obtaining clustered user data, clustered query keyword data and clustered advertisement data. It builds a tensor and decomposes it with the Tucker tensor decomposition method to obtain an approximate tensor of reduced dimensionality. From other target attribute feature data and the approximate tensor, it trains a support vector machine (SVM) based on RBF to obtain the advertisement click-through rate prediction model. This method uses RBF-based SVM learning, which solves for the support vectors by quadratic programming. Solving the quadratic program involves computations on matrices of order N (the number of samples), which is difficult to carry out for the large-scale training samples met in practice.
As another example, Chinese patent specification CN105654200A, "Advertisement click-through rate prediction method and device based on deep learning", proposes a method and device comprising: obtaining a preset number of training advertisements, together with the training click-through rate and training features of each training advertisement; converting the training features of each training advertisement into a training vector, and training a deep learning model with the training vectors and training click-through rates, where the deep learning model is implemented with nonlinear functions; and converting the features of the advertisement under test into a test vector, which is used as input to the deep learning model to obtain the predicted click-through rate of that advertisement. This kind of method relies mainly on the nonlinear functions of general deep learning for the conversion and does not first perform partition pre-detection of effective attributes on the raw data. For large-scale advertisement click data, simply converting features into vectors as input to nonlinear functions cannot capture the internal feature structure of advertisement click log data well.
Summary of the invention
To remedy the above deficiencies, the present invention proposes an advertisement click-through rate prediction system based on deep feature extraction and a prediction method thereof. It extracts the deep features in advertisement click log data: for massive advertisement click log data it first performs partition module detection, then strengthens sparse planning and fuses a space constraint to generate the prediction model. It is a system and method capable of predicting potential advertisement clicks.
The solution of the present invention is an advertisement click-through rate prediction system based on deep feature extraction, comprising: an advertisement log data acquisition subsystem, which collects advertisement click log data; a partition detection subsystem, which performs partition detection on the advertisement click log data; a ten-layer sparse constraint feature extraction hidden-layer subsystem, which extracts the deep features of the advertisement data from the advertisement click log data after partition detection; and a space constraint model generation subsystem, which applies a space constraint to the deep features to obtain a prediction model. When new advertisement click log data are input, the prediction model produces the corresponding prediction result.
As a further improvement of the above scheme, the advertisement click log data come from ad slot data, geographic information data, page context data and cookie data. The main fields of the advertisement click log data are: the number of clicks on the advertisement, the number of impressions, advertisement link information, ad position information, query tag information, query keyword information, advertisement title information, user tag information, and information about the device used.
As a further improvement of the above scheme, the partition detection subsystem models the advertisement click data as an undirected, unweighted graph $G = (V, E)$, where $V = \{V_1, V_2, \ldots, V_N\}$ is the set of N data nodes and $E = [e_{ij}]$ is the set of edges connecting pairs of data nodes i and j in V. The purpose of the partition detection subsystem is to analyze the advertisement click data and thereby find K modules.
Further, the partition detection subsystem introduces an index matrix $H = [h_{ik}]$ of non-negative real probability values, in which $h_{ik}$ is the probability that data node i belongs to module k; each row of the index matrix H represents the distribution of data nodes belonging to the same module. $w_{ij}$ is designed as the probability that data nodes i and j are connected; this probability is regarded as the probability that the edge generated by data nodes i and j belongs to the same community. The connection probability of data nodes i and j is $\hat{w}_{ij} = \sum_{k=1}^{K} h_{ik} h_{jk}$. The adjacency matrix W is a symmetric matrix of non-negative elements, $W = [w_{ij}] \in \mathbb{R}_+^{N \times N}$, where $w_{ij} = 1$ if there is an edge between data nodes i and j and $w_{ij} = 0$ otherwise; $w_{ii} = 0$ for all $1 \le i \le N$. Then, setting $W \approx HH^T$, the partition detection method based on non-negative matrix factorization finds the index matrix H that rebuilds the adjacency matrix W, which yields the K modules of the data. The loss is measured by the L1 norm of the difference between the two matrices W and $HH^T$. The i-th row of the index matrix H represents the module to which node i belongs; with the help of an auxiliary matrix Z, this gives equation (1). Equation (1) is used to find the most representative new information in the low-dimensional space and to rebuild the adjacency matrix representation.
Further still, a partition module maximization model is built in the partition detection module. The modularity function S is defined as the difference between the number of edges inside the partitions and the expected number of such edges over all pairs of data nodes, and is given by equation (2), in which h is a column vector of H and Q is the modularity matrix. Simplifying with $h^T h = N$ and extending equation (2) to $K > 2$ modules gives $S = L_{KL}(H, Q) = \mathrm{Tr}(HQH^T)$ (3), where $\mathrm{Tr}(\cdot)$ is the trace of a matrix. By the Rayleigh quotient, the solution of equation (3) is the largest eigenvector of the modularity matrix Q, that is, the eigenvector belonging to its largest eigenvalue.
As a further improvement of the above scheme, the ten-layer sparse constraint feature extraction hidden-layer subsystem builds a ten-layer sparse stacked autoencoder in the nonlinear feature extraction model; each autoencoder layer uses a neural network to extract deep features.
Further, the space constraint model generation subsystem designs a spatial similarity constraint to generate the prediction model: pairwise space constraints are applied to the data nodes, and regularization of the reconstructed space graph is introduced to generate the final model.
Further still, the prior knowledge that data nodes i and j belong to the same space is incorporated. First, so that two similar data nodes are classified into the same space, the new representations $h_i$ and $h_j$ of data nodes i and j should be similar. Second, this prior knowledge is encoded into the model generation system so that it further influences the embedding of the other data nodes.
The present invention also provides an advertisement click-through rate prediction method based on deep feature extraction, applied in any of the advertisement click-through rate prediction systems described above. The method comprises the following steps: collecting advertisement click log data; performing partition detection on the advertisement click log data; extracting the deep features of the advertisement data from the advertisement click log data after partition detection; and applying a space constraint to the deep features to obtain a prediction model. When new advertisement click log data are input, the prediction model produces the corresponding prediction result.
Most of the related techniques found to date train their models with logistic regression functions, and their ability to extract deep features from advertisement click data is insufficient. Because massive advertisement click data call for large-scale data analysis, the feature extraction methods adopted in existing patents and literature are relatively simple, and the extracted features have weak representational power. As the volume of advertising business data grows and the demand for analysis becomes more urgent, current methods and systems cannot meet current application needs.
Brief description of the drawings
Fig. 1 is a structural diagram of the advertisement click-through rate prediction system based on deep feature extraction of the present invention.
Detailed description
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawing and the embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.
Referring to Fig. 1, the advertisement click-through rate prediction system based on deep feature extraction of the present invention comprises an advertisement log data acquisition subsystem, a partition detection subsystem, a ten-layer sparse constraint feature extraction hidden-layer subsystem, and a space constraint model generation subsystem.
The advertisement log data acquisition subsystem collects advertisement click log data. The partition detection subsystem performs partition detection on the advertisement click log data. The ten-layer sparse constraint feature extraction hidden-layer subsystem extracts the deep features of the advertisement data from the advertisement click log data after partition detection. The space constraint model generation subsystem applies a space constraint to the deep features to obtain a prediction model. When new advertisement click log data are input, the prediction model produces the corresponding prediction result.
Advertisement log data acquisition subsystem: collects ad slot data, geographic information data, page context data, cookie data and the like. The main fields of the advertisement click log data are: the number of clicks on the advertisement, the number of impressions, advertisement link information, ad position information, query tag information, query keyword information, advertisement title information, user tag information, and information about the device used.
Partition detection (subarea detection) subsystem: the collected advertisement click data are first modeled as an undirected, unweighted graph $G = (V, E)$, where $V = \{V_1, V_2, \ldots, V_N\}$ is the set of N data nodes and $E = [e_{ij}]$ is the set of edges connecting pairs of data nodes in V. The purpose of the partition detection subsystem is to analyze the advertisement click data and thereby find K modules, such that the data inside each of the K modules are more closely related to one another than to the data outside it.
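The graph modeling step can be illustrated with a short Python sketch (Python and numpy are the tools the description itself names). The description does not say how edges are derived from the click log, so the edge list below is purely hypothetical, as are the names `build_adjacency` and `EDGES`.

```python
import numpy as np

def build_adjacency(n_nodes, edges):
    """Return the symmetric 0/1 adjacency matrix of an undirected, unweighted graph."""
    W = np.zeros((n_nodes, n_nodes), dtype=float)
    for i, j in edges:
        if i == j:
            continue          # no self-loops: w_ii stays 0
        W[i, j] = 1.0
        W[j, i] = 1.0         # undirected: mirror every edge
    return W

# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5} joined by edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
W = build_adjacency(6, EDGES)
degrees = W.sum(axis=1)       # d_i = sum_j w_ij
```

A dense array keeps the sketch short; for click logs of realistic size a sparse matrix would be the more plausible choice.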
The system introduces an index matrix $H = [h_{ik}]$, in which $h_{ik}$ is the probability that data node i belongs to community (that is, module) k. The partition detection module captures the probability that data node i belongs to module k, and each row of H represents the distribution of data nodes belonging to the same module. $w_{ij}$ is designed as the probability that data nodes i and j are connected. This probability can further be regarded as the probability that the edge generated by data nodes i and j belongs to the same community. The connection probability of data nodes i and j is $\hat{w}_{ij} = \sum_{k=1}^{K} h_{ik} h_{jk}$.
The adjacency matrix W is a symmetric matrix of non-negative elements, $W = [w_{ij}] \in \mathbb{R}_+^{N \times N}$, where $w_{ij} = 1$ if there is an edge between data nodes i and j and $w_{ij} = 0$ otherwise; $w_{ii} = 0$ for all $1 \le i \le N$. The system then treats partition module detection as a non-negative matrix factorization problem, $W \approx HH^T$. The partition detection method based on non-negative matrix factorization finds the non-negative matrix H that rebuilds the adjacency matrix W, which yields the K partitions of the data. The loss is measured by the L1 norm of the difference between the two matrices W and $HH^T$. The i-th row of the index matrix H represents the module to which node i belongs; with the help of an auxiliary matrix Z, this gives equation (1).
Equation (1) is used to find the most representative new information in the low-dimensional space and to rebuild the adjacency matrix representation.
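As a rough illustration of the factorization $W \approx HH^T$, the following sketch uses a damped multiplicative update that minimizes the squared Frobenius norm of $W - HH^T$. This is a swapped-in, simpler objective: the description measures the loss with an L1 norm and an auxiliary matrix Z, which is not reproduced here. The function name `symmetric_nmf`, the damping factor `beta` and the toy graph are assumptions made for the example.

```python
import numpy as np

def symmetric_nmf(W, k, steps=300, beta=0.5, seed=0, eps=1e-9):
    """Find H >= 0 (shape N x k) with W ~= H @ H.T by damped multiplicative updates."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[0], k))                      # random non-negative start
    losses = []
    for _ in range(steps):
        losses.append(np.linalg.norm(W - H @ H.T) ** 2)  # squared Frobenius loss (not L1)
        ratio = (W @ H) / (H @ (H.T @ H) + eps)
        H = H * (1.0 - beta + beta * ratio)              # factor >= 1 - beta, so H stays >= 0
    return H, losses

# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5} joined by edge (2, 3).
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0

H, losses = symmetric_nmf(W, k=2)
modules = H.argmax(axis=1)    # assign each node to its most probable module
```

Taking the arg-max of each row of H is one simple way to turn the soft memberships $h_{ik}$ into a hard partition.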
Based on the above analysis, a partition module maximization model is built in the partition detection subsystem. The method maximizes a modularity function S, defined as the difference between the number of edges inside the partitions and the expected number of such edges over all pairs of data nodes. For example, for a network with two communities, S is:
$h_{i1} = 1$ when data node i belongs to the first community, and $h_{i2} = 1$ when it belongs to the second community. $d_i d_j / N$ is the expected number of edges between data nodes i and j, and $d_i$ is the degree of data node i, $d_i = \sum_j w_{ij}$. Define the modularity matrix $Q = [q_{ij}] \in \mathbb{R}^{N \times N}$ with elements $q_{ij} = w_{ij} - d_i d_j / N$. The modularity function S can then be written as equation (2).
Maximizing equation (2) is an NP-hard problem, for which many optimization algorithms have been proposed, such as extremal optimization. In practice, this method simplifies the problem by letting $h^T h = N$. Extending equation (2) to $K > 2$ modules gives:
$S = L_{KL}(H, Q) = \mathrm{Tr}(HQH^T)$ (3)
$\mathrm{Tr}(\cdot)$ is the trace of a matrix. By the Rayleigh quotient, the solution of equation (3) is the largest eigenvector of the modularity matrix Q, that is, the eigenvector belonging to its largest eigenvalue.
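The eigenvector-based relaxation can be sketched as follows for the two-module case. The modularity matrix follows the element definition in the description, $q_{ij} = w_{ij} - d_i d_j / N$; the widely used Newman definition divides by the sum of all degrees instead, so the normalizer is left as a parameter. The function names and the toy graph are hypothetical.

```python
import numpy as np

def modularity_matrix(W, normalizer=None):
    """Q = W - d d^T / normalizer, with d the vector of node degrees."""
    d = W.sum(axis=1)
    if normalizer is None:
        normalizer = W.shape[0]          # N, as written in the description
    return W - np.outer(d, d) / normalizer

def two_way_split(Q):
    """Split nodes by the sign of the eigenvector with the largest eigenvalue."""
    eigenvalues, eigenvectors = np.linalg.eigh(Q)   # eigh returns ascending eigenvalues
    leading = eigenvectors[:, -1]
    return leading > 0

# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5} joined by edge (2, 3).
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0

Q = modularity_matrix(W)
labels = two_way_split(Q)
labels_newman = two_way_split(modularity_matrix(W, normalizer=W.sum()))
```

The overall sign of an eigenvector is arbitrary, so only the grouping of nodes is meaningful, not which group is labelled `True`.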
Ten-layer sparse constraint feature extraction hidden-layer subsystem: for advertisement click-through rate prediction, the desired result is a predicted value between 0 and 1 that represents the predicted probability. Unlike traditional linear models and general deep learning, this method and system introduce a ten-layer stacked autoencoder into the feature extraction system, with an added sparsity constraint, to find the nonlinear embedded representation that best reconstructs the features. After the partition detection module, the method and system extract deep features with the ten-layer sparse constraint feature extraction hidden-layer subsystem.
The ten-layer autoencoder is a neural network structure used to learn a new representation that approximates the original data as closely as possible. In this system the modularity matrix $Q = [q_{ij}] \in \mathbb{R}^{N \times N}$ of the advertisement click log data, with elements $q_{ij} = w_{ij} - d_i d_j / N$, is the input to the autoencoder. The autoencoder has two main parts: encoding and decoding. Encoding maps the original data Q to a low-dimensional embedding $H = [h_{ij}] \in \mathbb{R}^{d \times N}$ with $d < N$, where $h_i$ is the hidden-layer representation of data node i. Encoding gives:
$h_i = s(F_H q_i + c_H)$;
with the added sparsity constraint s.t. $\mathrm{rank}(F_{H1} F_{H2}) \le r$, where $F_H \in \mathbb{R}^{d \times 1}$ and $c_H \in \mathbb{R}^{d \times 1}$ are the parameters to be learned during encoding and $s(\cdot)$ is the nonlinear mapping function tanh, $s(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$.
Decoding maps the hidden-layer representation H back to the original data space, reconstructing the original data from the hidden-layer representation: $m_i = s(F_M h_i + c_M)$, with the added sparsity constraint:
s.t. $\mathrm{rank}(F_{M1} F_{M2}) \le r$, where $F_M \in \mathbb{R}^{N \times 1}$ and $c_M \in \mathbb{R}^{N \times 1}$ are the parameters learned in decoding. In this system the ten-layer stacked autoencoder learns, under the parameters $\theta = \{F_H, c_H, F_M, c_M\}$, a low-dimensional nonlinear representation H that can reconstruct the original data Q, by minimizing the reconstruction error between Q and the reconstructed data M, as expressed by equation (4).
Here $L_\theta(q_i, s(f(q_i)))$ is the distance function that measures the reconstruction error. This method uses the sigmoid cross entropy as the distance function. The sigmoid cross entropy uses the sigmoid function $\sigma(x) = \frac{1}{1 + e^{-x}}$ to map $q_i = [q_{ji}] \in \mathbb{R}^{N \times 1}$ and $m_i = [m_{ji}] \in \mathbb{R}^{N \times 1}$ into the required range [0, 1], and then computes their cross entropy:
The autoencoder is then trained, and the resulting $F_H$ and $c_H$ are used to produce the new representation of all nodes.
Equation (4) can be solved with the back-propagation algorithm using stochastic gradient descent. In each iteration, the parameters $\theta = \{F_H, c_H, F_M, c_M\}$ are updated as follows:
where $\alpha \in \{H, M\}$. Defining $A_\alpha = F_\alpha x + c_\alpha$, we obtain:
The quantity defined here represents the contribution of node j to the reconstruction error.
Here $s'(x)$ is the derivative of $s(x)$.
The system trains the first autoencoder layer by reconstructing the original modularity matrix Q, obtaining the first new representation; it then trains the i-th autoencoder layer by reconstructing the output of the (i-1)-th autoencoder, obtaining the next new representation. In the autoencoders generally used in industry, the number of parameters grows exponentially as the number of layers increases, which makes optimization inefficient. To extract the deep features in advertisement data, this method builds a ten-layer sparse stacked autoencoder in the nonlinear feature extraction model, and each autoencoder layer uses a neural network to extract deep features.
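The layer-wise training just described can be sketched in numpy as follows. This is a minimal illustration under several assumptions, not the patented implementation: the weight matrices are given shapes (d, n) and (n, d) so that the products are defined, full-batch gradient descent stands in for stochastic gradient descent, the rank (sparsity) constraint is not implemented, and the input is a random symmetric matrix rather than a real modularity matrix. All function names and hyperparameters are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_cross_entropy(X, M):
    """Cross entropy between sigmoid(X) (target) and sigmoid(M) (reconstruction)."""
    t, p = sigmoid(X), sigmoid(M)
    return -np.sum(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))

def train_layer(X, d, lr=0.01, steps=100, seed=0):
    """Train one tanh autoencoder layer on the columns of X; return encoder and losses."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    F_H, c_H = rng.normal(scale=0.1, size=(d, n)), np.zeros((d, 1))
    F_M, c_M = rng.normal(scale=0.1, size=(n, d)), np.zeros((n, 1))
    losses = []
    for _ in range(steps):
        H = np.tanh(F_H @ X + c_H)                        # encode: h_i = s(F_H q_i + c_H)
        M = np.tanh(F_M @ H + c_M)                        # decode: m_i = s(F_M h_i + c_M)
        losses.append(sigmoid_cross_entropy(X, M))
        dM = (sigmoid(M) - sigmoid(X)) * (1.0 - M ** 2)   # gradient at decoder pre-activation
        dH = (F_M.T @ dM) * (1.0 - H ** 2)                # back-propagated to encoder pre-activation
        F_M -= lr * (dM @ H.T)
        c_M -= lr * dM.sum(axis=1, keepdims=True)
        F_H -= lr * (dH @ X.T)
        c_H -= lr * dH.sum(axis=1, keepdims=True)
    return (F_H, c_H), losses

def stacked_encode(Q, layer_dims):
    """Greedy layer-wise training: each layer reconstructs the previous layer's output."""
    X, first_losses = Q, None
    for d in layer_dims:
        (F_H, c_H), losses = train_layer(X, d)
        if first_losses is None:
            first_losses = losses
        X = np.tanh(F_H @ X + c_H)
    return X, first_losses

rng = np.random.default_rng(1)
A = rng.normal(size=(16, 16))
Q = (A + A.T) / 2.0                       # hypothetical stand-in for a modularity matrix
layer_dims = list(range(15, 5, -1))       # ten layers: 15, 14, ..., 6 hidden units
H_deep, first_losses = stacked_encode(Q, layer_dims)
```

Each column of `H_deep` is then the deep representation of one data node, which is what the space constraint stage operates on.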
Space constraint model generation subsystem: after the deep features of the advertisement data have been extracted as above, the method and system design a spatial similarity constraint to generate the prediction model. Pairwise space constraints are applied to the data nodes, and regularization of the reconstructed space graph is introduced to generate the final model. The method incorporates the prior knowledge that data nodes i and j belong to the same space. First, so that two similar data nodes are classified into the same space, the new representations $h_i$ and $h_j$ of data nodes i and j should be similar. Second, this prior knowledge is encoded into the model generation system so that it further influences the embedding of the other vertices.
The pairwise constraint matrix is defined as $O = [o_{ij}] \in \mathbb{R}^{N \times N}$: if data nodes i and j belong to the same space, $o_{ij} = 1$; otherwise $o_{ij} = 0$. The space constraint $R_{LSE}(O, H)$ is written as equation (5), which is expressed with the trace $\mathrm{Tr}(\cdot)$ of a matrix, a diagonal matrix, and the regularization Laplacian matrix. By minimizing $R_{LSE}(O, H)$, two data nodes i and j whose corresponding element is $o_{ij} = 1$ are made similar in the new representation space.
Merging the pairwise constraint of equation (5) with the reconstruction loss function of equation (4) gives the objective function of the advertisement click prediction model:
where $\lambda$ is the parameter that trades off the reconstruction error (the first term, $L(Q, M)$) against the new representation with prior information (the second term).
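One standard way to realize such a pairwise constraint is graph Laplacian regularization, and the sketch below assumes that form: the penalty $\frac{1}{2}\sum_{i,j} o_{ij}\lVert h_i - h_j\rVert^2$ equals $\mathrm{Tr}(H L H^T)$ with $L = D - O$ and $D$ the diagonal matrix of row sums of O, when the $h_i$ are the columns of H. Whether the patent's equations (5) and (6) have exactly this form (for example, whether the Laplacian is normalized) cannot be confirmed from the text, so the function names, the unnormalized Laplacian and the additive objective are assumptions.

```python
import numpy as np

def pairwise_penalty(O, H):
    """0.5 * sum_ij o_ij * ||h_i - h_j||^2, with h_i the i-th column of H."""
    diff = H[:, :, None] - H[:, None, :]             # shape (d, N, N)
    sq_dist = np.sum(diff ** 2, axis=0)              # squared distances, shape (N, N)
    return 0.5 * np.sum(O * sq_dist)

def laplacian_penalty(O, H):
    """Same quantity written as Tr(H L H^T) with the unnormalized Laplacian L = D - O."""
    L = np.diag(O.sum(axis=1)) - O
    return np.trace(H @ L @ H.T)

def objective(reconstruction_loss, O, H, lam):
    """Reconstruction error plus lambda times the pairwise-constraint term."""
    return reconstruction_loss + lam * laplacian_penalty(O, H)

rng = np.random.default_rng(0)
H = rng.normal(size=(3, 5))                          # hypothetical 3-dim embeddings of 5 nodes
O = np.zeros((5, 5))
O[0, 1] = O[1, 0] = 1.0                              # prior: nodes 0 and 1 share a space
O[2, 4] = O[4, 2] = 1.0                              # prior: nodes 2 and 4 share a space

p_sum = pairwise_penalty(O, H)
p_trace = laplacian_penalty(O, H)

H_tied = H.copy()
H_tied[:, 1] = H_tied[:, 0]                          # make constrained pairs identical
H_tied[:, 4] = H_tied[:, 2]
p_tied = laplacian_penalty(O, H_tied)
```

The penalty vanishes exactly when every constrained pair has identical embeddings, which is the sense in which minimizing it pulls $h_i$ and $h_j$ together.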
The solution of equation (6) is:
Raw advertisement click log data are cleaned and passed to the partition detection system, which performs partition detection on them. The detected data are then fed to the ten-layer sparse constraint hidden-layer feature extraction system to obtain the deep features of the advertisement data, and the space constraint is applied to obtain the system's prediction model. When new advertisement click log data are input to the prediction model, the corresponding prediction result is obtained.
The beneficial effects of this method and system are as follows. Most of the related techniques found to date train their models with logistic regression functions, and their ability to extract deep features from advertisement click data is insufficient. Because massive advertisement click data call for large-scale data analysis, the feature extraction methods adopted in existing patents and literature are relatively simple, and the extracted features have weak representational power. As the volume of advertising business data grows and the demand for analysis becomes more urgent, current methods and systems cannot meet current application needs.
The key inventions of this method and system are:
A) Partition detection (subarea detection);
B) Hidden-layer factorization with sparse constraint (hidden level factorization sparse constrained);
C) Space constraint generation technique (subspace constrained generative technology).
The models and algorithms of the system run on a GTX980 GPU and are written in Python. The modules are based on numpy and on sklearn, the scipy toolkit. Multiple groups of comparative analyses were carried out against the existing factorization machine (FM) model, the logistic regression (LR) model, and deep learning methods. LR is the common linear model for advertisement click-through rate prediction; it is relatively simple, easy to extend, and easy to update online. FM is based on factorization and can capture correlations between features in order to handle high-dimensional data. Compared with the deep learning methods in current use, the deep features extracted by this method not only improve the model's click-through rate prediction but also reduce the cost and time of feature extraction. Unlike previous methods, which screen the features in a feature library or combine features of different categories, our method mines deep features with higher discriminative power and applies sparse nonlinear transformation and a space constraint to them, eliminating the interference of noise.
The system evaluates the advertisement click-through rate prediction model with the AUC (area under the ROC curve) index; the closer this index is to 1, the better the model under test. The system first performs partition detection on the raw advertisement click log data, which speeds up later processing. In the ten nonlinear hidden layers it applies a sparsity constraint suited to the characteristics of advertisement click log data. When the prediction model is generated, the space constraint model is incorporated so that partial prior information yields better predictive ability. The method and system can be widely applied in the Internet advertising industry and have good application prospects.
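For reference, AUC can be computed directly from its pairwise-ranking interpretation: it is the fraction of (clicked, not clicked) pairs in which the clicked impression receives the higher predicted score, with ties counted as one half. The sketch below is a plain numpy illustration with hypothetical labels and scores; it is not taken from the patent.

```python
import numpy as np

def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Hypothetical click labels (1 = clicked) and predicted click probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auc_score(labels, scores)                           # 3 of 4 pairs ordered correctly
perfect = auc_score([0, 0, 1, 1], [0.1, 0.2, 0.7, 0.9])
```

The pairwise form needs memory proportional to the product of the positive and negative counts, so a rank-based computation is the more practical choice for large click logs.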
The above are only preferred embodiments of the present invention and do not limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. the ad click rate prognoses system that a kind of depth characteristic is extracted, it is characterised in that:Which includes:
Advertisement log data acquisition subsystem, which is used for gathering advertisement click logs data;
Subarea detecting subsystem, which is used for carrying out Subarea detecting to the advertisement click logs data;
Ten layers of sparse constraint feature extraction hidden layer subsystem, which is used in the advertisement click logs data after Subarea detecting Extract the depth characteristic of ad data;
Space constraint model generates subsystem, and which is used for carrying out space constraint according to the depth characteristic obtaining forecast model;
Wherein, when there is new advertisement click logs data input, the forecast model can just obtain corresponding predicting the outcome.
2. the ad click rate prognoses system that depth characteristic as claimed in claim 1 is extracted, it is characterised in that:The advertisement point Daily record data is hit from advertising space data, geographic information data, page context data, Cookie data, the advertisement point The primary fields for hitting daily record data have:The number of clicks of advertisement, exposure frequency, advertisement link information, advertisement position information, inquiry Label information, key word of the inquiry information, advertisement title information, user label information, device therefor information.
3. the ad click rate prognoses system that depth characteristic as claimed in claim 1 is extracted, it is characterised in that:The subregion inspection It is undirected without weight graph G, G=(V, E), wherein, V={ V that subsystem is surveyed by the ad click data modeling1,V2,…VNIt is N number of The set of back end;E=[eij] it is two data node is and j connect side in V set;Subarea detecting subsystem purpose exists K module is found further in the ad click data analysiss are directed to
4. The advertisement click-through rate prediction system based on deep feature extraction as claimed in claim 3, characterized in that: the sub-area detection subsystem introduces an index matrix $H$ [definition formula missing in source], whose element $h_{ik}$ is the probability that data node $i$ belongs to module $k$, and where [symbol missing in source] denotes the positive real matrices of probability values of size $N \times N$; to capture the probability that data node $i$ belongs to module $k$, each row of the index matrix $H$ represents the distribution of the data nodes belonging to the same module; $w_{ij}$ is designed as the probability that data nodes $i$ and $j$ are connected, which is regarded as the probability that the edge generated by data nodes $i$ and $j$ belongs to the same community; the connection probability of data nodes $i$ and $j$ is: [formula missing in source]; the adjacency matrix $W$ is expressed as a symmetric matrix with non-negative elements [formula missing in source], meaning that $w_{ij} = 1$ if there is an edge between data nodes $i$ and $j$ and $w_{ij} = 0$ otherwise, with $w_{ii} = 0$ for all $1 \le i \le N$; then, letting [formula missing in source], the index matrix $H$ is found so as to reconstruct the adjacency matrix $W$ using a sub-area detection method based on non-negative matrix factorization, which yields the $K$ modules of the data; the loss is measured by the L1 norm between the two matrices $W$ and $HH^T$; the $i$-th row of the index matrix $H$ represents the module to which node $i$ belongs; by means of an auxiliary matrix $Z$, equation (1) is obtained: [equation (1) missing in source]; the most representative new information is found in the low-dimensional space, and the adjacency matrix is reconstructed using the representation of equation (1).
5. The advertisement click-through rate prediction system based on deep feature extraction as claimed in claim 4, characterized in that: a module-partition maximization model is established in the sub-area detection module; the modularity is defined as the difference between the number of edges inside a sub-area and the expected number of such edges over all pairs of data nodes, and the modularity function $S$ is designed as: [equation (2) missing in source]
where $h$ is a column vector of $H$ and $Q$ is the modularity matrix; simplifying with $h^T h = N$ and extending equation (2) to a number of modules $K > 2$ gives: $S = \mathrm{LKL}(H, Q) = \mathrm{Tr}(H Q H^T)$ (3), where $\mathrm{Tr}(\cdot)$ is the trace of a matrix; based on the Rayleigh quotient, the solution of equation (3) is given by the leading eigenvectors of the modularity matrix $Q$.
6. The advertisement click-through rate prediction system based on deep feature extraction as claimed in claim 1, characterized in that: the ten-layer sparse-constraint feature extraction hidden-layer subsystem builds a ten-layer sparse stacked autoencoder within a nonlinear feature extraction model, and each autoencoder layer uses a neural network to extract deep features.
7. The advertisement click-through rate prediction system based on deep feature extraction as claimed in claim 6, characterized in that: the spatial constraint model generation subsystem introduces a reconstructed spatial graph regularization to generate the final model.
8. The advertisement click-through rate prediction system based on deep feature extraction as claimed in claim 6, characterized in that: the spatial constraint model generation subsystem designs spatial similarity constraints to generate the prediction model, and pairwise spatial constraints are applied to the data nodes.
9. The advertisement click-through rate prediction system based on deep feature extraction as claimed in claim 8, characterized in that: prior knowledge that data nodes $i$ and $j$ belong to the same space is incorporated: first, in order to classify two similar data nodes into the same space, the new representation rows $h_i$ and $h_j$ of data nodes $i$ and $j$ should be similar; second, this prior knowledge is encoded into the model generation system so that it further influences the embedding of the other data nodes.
10. An advertisement click-through rate prediction method based on deep feature extraction, applied in the advertisement click-through rate prediction system based on deep feature extraction as claimed in any one of claims 1 to 9, characterized in that the method comprises the following steps:
collecting advertisement click log data;
performing sub-area detection on the advertisement click log data;
extracting deep features of the advertisement data from the advertisement click log data after sub-area detection;
applying spatial constraints according to the deep features to obtain a prediction model;
wherein, when new advertisement click log data are input, the prediction model outputs the corresponding prediction result.
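
The sub-area detection described in claims 3 to 5 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`adjacency_from_edges`, `symmetric_nmf`, `modularity`), the toy graph, the multiplicative update rule, and the use of a squared Frobenius loss in place of the L1 norm named in claim 4 are all assumptions made for illustration. The sketch builds the 0/1 adjacency matrix W, looks for a non-negative H with W ≈ HH^T, and scores a partition with the trace form of the modularity.

```python
import numpy as np

def adjacency_from_edges(n, edges):
    """Symmetric 0/1 adjacency matrix W with a zero diagonal (w_ii = 0)."""
    W = np.zeros((n, n))
    for i, j in edges:
        if i != j:
            W[i, j] = W[j, i] = 1.0
    return W

def symmetric_nmf(W, k, n_iter=500, beta=0.5, seed=0, eps=1e-10):
    """Find a non-negative N x K matrix H such that W is close to H @ H.T.

    Assumption: a damped multiplicative update on the squared Frobenius
    loss is used here; the claims name an L1-norm loss instead.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[0], k)) + 0.1  # strictly positive start
    for _ in range(n_iter):
        numer = W @ H
        denom = H @ (H.T @ H) + eps
        H *= (1.0 - beta) + beta * numer / denom
    return H

def modularity(W, labels):
    """Modularity of a hard partition, computed through a trace.

    H is stored here as N x K (one-hot rows), so the trace is written
    Tr(H^T Q H); Q = W - d d^T / (2m) is the modularity matrix.
    """
    d = W.sum(axis=1)
    two_m = d.sum()
    Q = W - np.outer(d, d) / two_m
    H = np.eye(labels.max() + 1)[labels]
    return np.trace(H.T @ Q @ H) / two_m

# Hypothetical toy graph: two 4-node cliques joined by the single edge (3, 4).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4)]
W = adjacency_from_edges(8, edges)
H = symmetric_nmf(W, k=2)
labels = H.argmax(axis=1)  # module with the largest membership per node
print(labels, modularity(W, labels))
```

Each row of `H` plays the role of the membership distribution of one data node, and taking the row-wise argmax turns it into a hard module assignment. The result depends on the random initialisation, so in practice several seeds might be compared by their modularity score.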
CN201610983314.7A 2016-11-08 2016-11-08 Advertisement click-through rate prediction system based on deep feature extraction and prediction method thereof Active CN106529721B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610983314.7A CN106529721B (en) 2016-11-08 2016-11-08 Advertisement click-through rate prediction system based on deep feature extraction and prediction method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610983314.7A CN106529721B (en) 2016-11-08 2016-11-08 Advertisement click-through rate prediction system based on deep feature extraction and prediction method thereof

Publications (2)

Publication Number Publication Date
CN106529721A true CN106529721A (en) 2017-03-22
CN106529721B CN106529721B (en) 2018-12-25

Family

ID=58350151

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610983314.7A Active CN106529721B (en) 2016-11-08 2016-11-08 Advertisement click-through rate prediction system based on deep feature extraction and prediction method thereof

Country Status (1)

Country Link
CN (1) CN106529721B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107168854A (en) * 2017-06-01 2017-09-15 北京京东尚科信息技术有限公司 Internet advertisement abnormal click detection method, device, equipment and readable storage medium
CN107239970A (en) * 2017-05-12 2017-10-10 百川通联(北京)网络技术有限公司 Method and system for determining advertisement click-through rate based on behavior logs
CN108629630A (en) * 2018-05-08 2018-10-09 广州太平洋电脑信息咨询有限公司 Advertisement recommendation method based on feature cross-combination deep neural network
CN108829763A (en) * 2018-05-28 2018-11-16 电子科技大学 Attribute prediction method for film review website users based on deep neural network
CN108875916A (en) * 2018-06-27 2018-11-23 北京工业大学 Advertisement click rate prediction method based on GRU neural network
CN109299976A (en) * 2018-09-07 2019-02-01 深圳大学 Click rate prediction method, electronic device and computer-readable storage medium
CN109993559A (en) * 2017-12-29 2019-07-09 北京京东尚科信息技术有限公司 Model training method and system
CN111126614A (en) * 2018-11-01 2020-05-08 百度在线网络技术(北京)有限公司 Attribution method, attribution device and storage medium
WO2020140632A1 (en) * 2019-01-04 2020-07-09 平安科技(深圳)有限公司 Hidden feature extraction method, apparatus, computer device and storage medium
CN111798018A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Behavior prediction method, behavior prediction device, storage medium and electronic equipment
CN112530598A (en) * 2020-12-11 2021-03-19 万达信息股份有限公司 Health risk self-measurement table recommendation method and system based on health data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103310003A (en) * 2013-06-28 2013-09-18 华东师范大学 Method and system for predicting click rate of new advertisement based on click log
CN104951965A (en) * 2015-06-26 2015-09-30 深圳市腾讯计算机***有限公司 Advertisement delivery method and device
CN105654200A (en) * 2015-12-30 2016-06-08 上海珍岛信息技术有限公司 Deep learning-based advertisement click-through rate prediction method and device

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239970A (en) * 2017-05-12 2017-10-10 百川通联(北京)网络技术有限公司 Method and system for determining advertisement click-through rate based on behavior logs
CN107168854A (en) * 2017-06-01 2017-09-15 北京京东尚科信息技术有限公司 Internet advertisement abnormal click detection method, device, equipment and readable storage medium
CN109993559A (en) * 2017-12-29 2019-07-09 北京京东尚科信息技术有限公司 Model training method and system
CN108629630A (en) * 2018-05-08 2018-10-09 广州太平洋电脑信息咨询有限公司 Advertisement recommendation method based on feature cross-combination deep neural network
CN108629630B (en) * 2018-05-08 2020-05-12 广州太平洋电脑信息咨询有限公司 Advertisement recommendation method based on feature cross-combination deep neural network
CN108829763A (en) * 2018-05-28 2018-11-16 电子科技大学 Attribute prediction method for film review website users based on deep neural network
CN108875916A (en) * 2018-06-27 2018-11-23 北京工业大学 Advertisement click rate prediction method based on GRU neural network
CN108875916B (en) * 2018-06-27 2021-07-16 北京工业大学 Advertisement click rate prediction method based on GRU neural network
CN109299976A (en) * 2018-09-07 2019-02-01 深圳大学 Click rate prediction method, electronic device and computer-readable storage medium
CN109299976B (en) * 2018-09-07 2021-03-23 深圳大学 Click rate prediction method, electronic device and computer-readable storage medium
CN111126614A (en) * 2018-11-01 2020-05-08 百度在线网络技术(北京)有限公司 Attribution method, attribution device and storage medium
CN111126614B (en) * 2018-11-01 2024-01-16 百度在线网络技术(北京)有限公司 Attribution method, attribution device and storage medium
WO2020140632A1 (en) * 2019-01-04 2020-07-09 平安科技(深圳)有限公司 Hidden feature extraction method, apparatus, computer device and storage medium
CN111798018A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Behavior prediction method, behavior prediction device, storage medium and electronic equipment
CN112530598A (en) * 2020-12-11 2021-03-19 万达信息股份有限公司 Health risk self-measurement table recommendation method and system based on health data
CN112530598B (en) * 2020-12-11 2023-07-25 万达信息股份有限公司 Health risk self-measuring table recommendation method based on health data

Also Published As

Publication number Publication date
CN106529721B (en) 2018-12-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant