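WO2015124024A1 - Method and apparatus for increasing the exposure rate of information, and method and apparatus for determining the value of a search term - Google Patents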


Info

Publication number
WO2015124024A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
presentation
search
information
data
Prior art date
Application number
PCT/CN2014/094298
Other languages
English (en)
French (fr)
Inventor
王超
邓钦华
许晟
Original Assignee
北京奇虎科技有限公司
奇智软件(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201410063058.0A (CN104866493B)
Priority claimed from CN201410098737.1A (CN104933047B)
Application filed by 北京奇虎科技有限公司, 奇智软件(北京)有限公司
Publication of WO2015124024A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising

Definitions

  • the present invention relates to the field of computer technology, and more particularly to a method and apparatus for increasing the exposure of information, and a method and apparatus for determining the value of a search term.
  • the present invention proposes a solution for improving the exposure rate of information.
  • the present invention mainly solves the following problems in view of the shortcomings of the existing solutions:
  • Presentation opportunities are adjusted dynamically, so that the information exposure rate is increased while the loss is kept to a minimum.
  • The present invention provides a method of increasing the exposure rate of information, comprising: determining whether to perform exposure rate promotion processing for information related to a received query term; if so, checking whether the query frequency in the historical query data associated with the query term is greater than or equal to a first threshold; if so, determining candidate information based on historical presentation data related to the query term; estimating the presentation quality parameters of all candidate information based on the basic data of all candidate information and the historical presentation data of the information groups to which the candidate information belongs; and recommending the candidate information having the highest estimated presentation quality parameter to the candidate presentation queue as recommendation information, so that overall presentation competition processing is performed with the highest estimated presentation quality parameter as the presentation quality parameter of the recommendation information.
  • The present invention also provides an apparatus for increasing the exposure rate of information, comprising: a first determining module, configured to determine whether to perform exposure rate promotion processing for undisplayed information related to the received query word; a checking module, configured to check whether the query frequency in the historical query data associated with the query word is greater than or equal to a first threshold; a second determining module, configured to determine candidate information based on historical presentation data related to the query term; an estimation module, configured to estimate the presentation quality parameters of all candidate information based on the basic data of all candidate information and the historical presentation data of the information groups to which the candidate information belongs; and a recommendation module, configured to recommend the candidate information having the highest estimated presentation quality parameter to the candidate presentation queue as recommendation information, so that overall presentation competition processing is performed with the highest estimated presentation quality parameter as the presentation quality parameter of the recommendation information.
  • A method for determining the value of a search term, comprising: inputting feature data of a search term to be tested into a value regression model; and acquiring the value data of the search term to be tested based on the value regression model;
  • wherein the value regression model is obtained by: clustering existing search words based on click relationship data and/or presentation relationship data to obtain clustered search word sets; classifying the search word sets into search word sets of different values; and performing model training using the search word sets of different values to obtain the value regression model.
  • An apparatus for determining the value of a search term, comprising: an input module, configured to input feature data of a search term to be tested into a value regression model; and an acquisition module, configured to obtain the value data of the search term to be tested based on the value regression model; wherein the value regression model is obtained by the following modules: a clustering module, configured to cluster existing search words based on click relationship data and/or presentation relationship data to obtain clustered search word sets; a classification module, configured to classify the search word sets into search word sets of different values; and a model acquisition module, configured to perform model training using the search word sets of different values to obtain the value regression model.
  • A computer program comprising computer readable code which, when run by an electronic device, causes the method of increasing the exposure rate of information and the method of determining the value of a search term to be performed.
  • a computer readable medium storing a computer program as described above is provided.
  • The technical solution of the method and apparatus for increasing the exposure rate of information according to the present invention has the following beneficial effects: presentation opportunities are adjusted dynamically according to historical presentation data, the information exposure rate is increased while the loss is kept to a minimum, information is given more opportunities to be shown per unit time, and the user experience is improved.
  • The value of a search term can be determined more accurately, and valuable data (such as advertisements) can be selected based on the search term value data, which improves the user experience, the information click rate, and the information exposure rate.
  • FIG. 1 is a flow chart of a method of increasing exposure of information according to an embodiment of the present invention
  • FIG. 2 is a structural diagram of an apparatus for increasing exposure of information according to an embodiment of the present invention.
  • FIG. 3 shows a flow chart of a method of obtaining a value regression model in accordance with one embodiment of the present invention
  • FIG. 4 shows a flow chart of a method of determining the value of a search term in accordance with one embodiment of the present invention
  • FIG. 5 is a block diagram showing an apparatus for determining a value of a search term according to an embodiment of the present invention
  • Figure 6 shows a block diagram of an electronic device for performing the method of the present invention
  • Figure 7 shows a schematic diagram of a memory unit for holding or carrying program code implementing a method in accordance with the present invention.
  • FIG. 1 is a flowchart of a method of increasing exposure of information according to an embodiment of the present invention.
  • At step S110, it is determined whether to perform exposure rate promotion processing for the undisplayed information related to the received query word.
  • The method of the present invention first needs to determine whether to perform exposure rate promotion processing for the undisplayed information related to the received query word. The processing could also be performed for every query request, but such an implementation is less efficient: it may give too many presentation opportunities to results whose expected performance is poor, thereby reducing the efficiency of the entire system. Therefore, a determination can be made for each query request so that the proportion of requests undergoing exposure rate promotion processing is kept within a range; for example, the ratio of query requests selected for exposure rate promotion processing to total query requests is kept at no more than 5%. It should be understood that this ratio can be adjusted as needed.
  • The server has a historical query database storing historical query data. The historical query data in the database provides the historical request information of each query word, from which the adjustment parameter used to decide whether to perform exposure rate promotion processing for the undisplayed information related to the query word can be obtained.
  • Specifically, an adjustment parameter is acquired based on the historical request data related to the query; then, based on the adjustment parameter and a random number generated by the system, it is determined whether to perform exposure rate promotion processing for the undisplayed information related to the received query.
  • an adjustment parameter is acquired based on historical query data related to the query.
  • The baseline of the adjustment parameter can be set to 1.0. Starting from this baseline, and according to the historical request data (which may include, but is not limited to, the query frequency and the click rate), the adjustment parameter may be estimated using, but not limited to, the following formula:
  • adjustment parameter = 1.0 + alpha * click rate + beta * log(gamma / frequency)    (Formula 1)
  • Next, based on the adjustment parameter and the random number generated by the system, it is determined whether to perform exposure rate promotion processing for the undisplayed information related to the received query.
  • the random number can be generated using a uniform distribution (for example, a target ratio of 5%, that is, a uniform distribution with a parameter of 20), so that its final result satisfies the target ratio (5%) described above.
  • random numbers can also be generated in other ways.
  • judgment threshold = target ratio * adjustment parameter.
  • When the judgment threshold is greater than or equal to 5%, it is determined that exposure rate promotion processing is performed for the undisplayed information related to the received query; when the judgment threshold is less than 5%, it is determined that exposure rate promotion processing is not performed for that information. It should be understood that other thresholds may be selected as desired without being limited to the specific values described above.
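The threshold rule above can be sketched as follows. This is only an illustration under stated assumptions: the coefficient values for `alpha`, `beta` and `gamma` are hypothetical, the logarithm is taken to be the natural logarithm (the text does not state the base), and the role of the system-generated random number is not modeled because the text does not specify how it enters the comparison.

```python
import math

def adjustment_parameter(click_rate, frequency, alpha=0.5, beta=0.1, gamma=2.0):
    """Formula 1: 1.0 + alpha * click rate + beta * log(gamma / frequency).

    alpha, beta and gamma are hypothetical values chosen for illustration.
    """
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 + alpha * click_rate + beta * math.log(gamma / frequency)

def should_promote(click_rate, frequency, target_ratio=0.05, **coeffs):
    """Apply the stated rule: promote when target ratio * adjustment >= target ratio."""
    threshold = target_ratio * adjustment_parameter(click_rate, frequency, **coeffs)
    return threshold >= target_ratio

# A query with a 10% click rate seen twice per hour passes the check;
# a query with no clicks seen 20 times per hour does not.
print(should_promote(0.1, 2.0))
print(should_promote(0.0, 20.0))
```

With these assumed coefficients, a higher click rate raises the threshold and a higher query frequency lowers it, which is the direction Formula 1 implies.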
  • The information to be presented may be advertisement information, in which case it is determined whether to perform exposure rate promotion processing for the undisplayed advertisement information related to the received query word. That is, an adjustment parameter is first obtained based on the historical request data related to the query; then, based on the adjustment parameter and the random number generated by the system, it is determined whether to perform exposure rate promotion processing for the undisplayed advertisement information related to the received query.
  • The information may include at least one of the following: information whose number of presentations is below a predetermined value, information of a predetermined area, and information of a predetermined time period.
  • For example, the information may be advertisement information that has been presented fewer than 10 times, advertisement information for Beijing, and the like. It should be understood that the information of the present invention may also be other types of information.
  • If it is determined at step 110 that exposure rate promotion processing is to be performed for the undisplayed information associated with the received query term, then at step 120 it is checked whether the query frequency in the historical query data associated with the query term is greater than or equal to the first threshold.
  • The first threshold can be, for example, twice per hour.
  • The query frequency in the historical query data associated with the query term is checked. If the query frequency is high, for example greater than or equal to twice per hour, the method proceeds to step 130. If the query frequency is low, for example less than twice per hour, the method ends.
  • the first threshold is not limited to the above values, but any suitable value may be selected as the first threshold as needed.
  • candidate information is determined based on historical presentation data associated with the query term.
  • The historical presentation data related to the query word is searched, and the historical presentation data whose daily number of presentations is less than the second threshold is identified.
  • The second threshold can be 10 times. That is, the historical presentation data associated with the query word that shows fewer than 10 presentations per day is found, and the information corresponding to that historical presentation data is determined as the candidate information.
  • Historical presentation data associated with the query term whose number of presentations over another time span is below a certain threshold may also be used. For example, fewer than 70 presentations per week (seven days), or fewer than 25 presentations per 60 hours, and so on.
  • For example, the daily number of presentations of the information ID "A1234123", the information ID "A1231312", and the information ID "A1343141" is less than the second threshold (for example, 10 times), so these three pieces of information are determined as candidate information.
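As a minimal sketch of the candidate selection just described, the filter below keeps every information ID whose daily presentation count is under the second threshold. The counts and the extra ID "A9999999" are made up for illustration; only the three candidate IDs come from the text.

```python
def select_candidates(daily_presentations, second_threshold=10):
    """Return the IDs whose daily presentation count is below the threshold."""
    return [
        info_id
        for info_id, count in daily_presentations.items()
        if count < second_threshold
    ]

# Hypothetical daily presentation counts for one query word.
history = {
    "A1234123": 3,
    "A1231312": 7,
    "A1343141": 0,
    "A9999999": 42,  # hypothetical ID that is already shown often
}
candidates = select_candidates(history)
print(candidates)
```

The same function covers the other time spans mentioned above if the counts and threshold are changed accordingly, for example weekly counts with a threshold of 70.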
  • the presentation quality parameters of all candidate information are estimated based on the basic data of all candidate information and the historical presentation data of the information group to which all candidate information belongs.
  • For example, the information ID "A1234123" and the information ID "A1231312" belong to one information group "G111223", and the information ID "A1343141" belongs to another information group "G222121".
  • The basic data of the above three pieces of candidate information are queried, together with the historical presentation data of the other information IDs in the two information groups to which they belong. The historical presentation data of each information ID includes its presentation quality parameter, and the highest presentation quality parameter in each group is used as the estimated presentation quality parameter for the candidate information in that group.
  • The highest presentation quality parameter a (the presentation quality parameter of information A) in the information group "G111223" is taken as the estimated presentation quality parameter a of the candidate information (information ID "A1234123", information ID "A1231312").
  • The highest presentation quality parameter b (the presentation quality parameter of information B) in the information group "G222121" is taken as the estimated presentation quality parameter b of the candidate information (information ID "A1343141").
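The group-based estimate can be illustrated with a short sketch. The group membership follows the example above; the member names "A" to "D" and their quality values are hypothetical placeholders for the historical presentation data of the other information IDs in each group.

```python
def estimate_quality(candidate_groups, group_history):
    """Give each candidate the highest quality parameter seen in its group."""
    return {
        info_id: max(group_history[group_id].values())
        for info_id, group_id in candidate_groups.items()
    }

candidate_groups = {
    "A1234123": "G111223",
    "A1231312": "G111223",
    "A1343141": "G222121",
}
# Hypothetical historical presentation quality parameters per group member.
group_history = {
    "G111223": {"A": 0.8, "C": 0.3},
    "G222121": {"B": 0.6, "D": 0.2},
}
estimates = estimate_quality(candidate_groups, group_history)
print(estimates)
```

Borrowing the group maximum gives rarely shown information an optimistic starting value, which is what lets it compete before it has presentation history of its own.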
  • The candidate information having the highest estimated presentation quality parameter is recommended to the candidate presentation queue as recommendation information, so that overall presentation competition processing is performed with the highest estimated presentation quality parameter as the presentation quality parameter of the recommendation information.
  • the candidate information having the predicted highest presentation quality parameter (parameter a) is recommended as recommendation information into the candidate presentation queue.
  • the information ID "A1234123” and the information ID "A1231312" of the candidate information having the predicted highest presentation quality parameter a can be recommended into the candidate presentation queue.
  • only one of the plurality of candidate information may be recommended to the candidate presentation queue at a time by polling.
  • Overall presentation competition processing selects the candidate results to present based on their ranking.
  • Taking search advertisement information as an example, the advertisement information that enters the overall competition processing is first scored and sorted according to a predetermined rule, for example in descending order of score.
  • sort score = ad creative quality * keyword bid price.
  • The advertisement information is then presented. However, not all ads will be shown. For example, search ads are generally shown in 3 positions on the left and 8 on the right.
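The scoring and slot limits above can be sketched as follows. The ad records and their field names (`id`, `quality`, `bid`) are hypothetical, and filling the left positions before the right ones is an assumption; the text only gives the score formula and the 3-left, 8-right layout.

```python
def rank_ads(ads):
    """Sort ads by sort score = creative quality * keyword bid, highest first."""
    return sorted(ads, key=lambda ad: ad["quality"] * ad["bid"], reverse=True)

def assign_slots(ranked, left=3, right=8):
    """Fill the left positions first, then the right; the rest are not shown."""
    return {
        "left": ranked[:left],
        "right": ranked[left:left + right],
        "not_shown": ranked[left + right:],
    }

# Hypothetical ads entering the overall competition.
ads = [
    {"id": "ad1", "quality": 0.8, "bid": 1.0},  # score 0.8
    {"id": "ad2", "quality": 0.5, "bid": 3.0},  # score 1.5
    {"id": "ad3", "quality": 0.9, "bid": 0.5},  # score 0.45
]
ranked_ids = [ad["id"] for ad in rank_ads(ads)]
print(ranked_ids)
```

A recommended candidate enters this ranking with its estimated presentation quality parameter in place of a measured one.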
  • FIG. 2 is a structural diagram 200 of an apparatus for increasing the exposure rate of information according to an embodiment of the present invention.
  • The device 200 may include: a first determining module 210, a checking module 220, a second determining module 230, an estimation module 240, and a recommendation module 250.
  • the first determining module 210 may be configured to determine whether to perform an exposure promotion process for the undisplayed information related to the received query word.
  • the first determining module 210 may further include: an obtaining submodule 211 and a first determining submodule 212.
  • The obtaining sub-module 211 can be configured to obtain an adjustment parameter based on the historical query data related to the query term, and the first determining sub-module 212 can be configured to determine, based on the adjustment parameter and the random number generated by the system, whether to perform exposure rate promotion processing for the undisplayed information related to the received query word.
  • the checking module 220 can be configured to check whether the query frequency of the historical query data associated with the query word is greater than or equal to a first threshold.
  • the checking module 220 may be further configured to abandon the exposure rate promotion process if the query frequency of the historical query data associated with the query term is less than the first threshold.
  • The first threshold can be, for example, twice per hour.
  • the second determining module 230 can be configured to determine candidate information based on historical presentation data related to the query term.
  • the second determining module 230 may further include: a searching submodule 231 and a second determining submodule 232.
  • The searching sub-module 231 can be configured to search for historical presentation data associated with the query word whose daily number of presentations is less than a second threshold, and the second determining sub-module 232 can be configured to determine the information corresponding to the found historical presentation data as candidate information.
  • the estimation module 240 can be configured to estimate the presentation quality parameters of all candidate information based on the basic data of all candidate information and the historical presentation data of the information group to which all candidate information belongs.
  • the recommendation module 250 may be configured to recommend candidate information having the predicted highest presentation quality parameter as recommendation information to the candidate presentation queue to perform overall presentation contention processing with the estimated highest presentation quality parameter as the presentation quality parameter of the recommendation information.
  • The apparatus 200 may further include a presentation quality parameter determination module (not shown), which may be configured to: if the recommendation information wins presentation in the overall presentation competition processing, determine the presentation quality parameter that the recommendation information obtains during presentation as the initial presentation quality parameter of the recommendation information.
  • a method for determining the value of a search term mainly includes the following steps:
  • Step 1: count the number of ad impressions and the number of ad clicks for every search term in the ad impression log;
  • Step 3: if the search term click rate is less than a threshold and the number of ad impressions is greater than a threshold, the search term is of low value; conversely, if the search term click rate is greater than the threshold and the number of ad impressions is greater than the threshold, the search term is of high value.
  • A specific example: suppose the click rate threshold is 5% and the impression threshold is 50. The search term "prose of the sunset" has 100 ad impressions and 1 click, so the term is of low value.
  • The search term "laptop" has 10,000 ad impressions and 1,000 clicks, so the term is of high value.
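The threshold-based approach described above can be sketched in a few lines. Two points are assumptions: the click rate is computed as clicks divided by impressions, and the label "undetermined" is a placeholder for cases the rule does not cover (too few impressions, or a click rate exactly at the threshold).

```python
def classify_by_threshold(impressions, clicks,
                          ctr_threshold=0.05, impression_threshold=50):
    """Label a search term 'low' or 'high' using the two manual thresholds."""
    if impressions <= impression_threshold:
        return "undetermined"  # assumed label: not enough impressions to judge
    click_rate = clicks / impressions
    if click_rate < ctr_threshold:
        return "low"
    if click_rate > ctr_threshold:
        return "high"
    return "undetermined"  # assumed label: click rate equals the threshold

print(classify_by_threshold(100, 1))        # "prose of the sunset"
print(classify_by_threshold(10000, 1000))   # "laptop"
```

The sketch makes the limitation visible: the output is a coarse label that depends entirely on two hand-picked numbers.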
  • In this implementation, the click rate threshold and the impression threshold have to be specified manually, and the effect depends heavily on the operator's experience. The implementation can only judge whether the value is high or low and cannot give a specific value, so it is not smooth enough in practical applications. Moreover, because it relies mainly on statistics, it generalizes poorly, its coverage is relatively low, and its accuracy also has room for improvement, so it cannot fully meet the needs of a search advertising system.
  • FIG. 3 is a flow chart of a method of obtaining a value regression model in accordance with one embodiment of the present invention.
  • the existing search words are clustered based on the click relationship data and/or the presentation relationship data to obtain a clustered search word set.
  • the number of common presentations of different search terms can be obtained and the presentation relationship data can be calculated based on the number of common presentations.
  • For example, suppose one input search word is Q1 and the data presented by the search engine for it are D1, D2, D3, D4, and another input search word is Q2 and the data presented by the search engine for it are D2, D3, D5, D7. Their common presentation count is then 2 (D2, D3). At this point, a correlation can be used to describe the relationship between Q1 and Q2.
  • The correlation may be defined, for example, as common presentation count / number of presentations of Q2, or as common presentation count / (number of presentations of Q1 + number of presentations of Q2), and so on.
  • the presentation relationship data between the search terms can be obtained.
  • Similarly, suppose one input search word is Q1 and the data presented by the search engine for it and clicked by the user are D1, D2, D3, D4, and another input search word is Q2 and the data presented by the search engine for it and clicked by the user are D2, D3, D4, D7. Their common click count is then 3 (D2, D3, D4). At this point, a correlation can be used to describe the relationship between Q1 and Q2.
  • The correlation may be defined, for example, as common click count / number of clicks of Q2, or as common click count / (number of clicks of Q1 + number of clicks of Q2), and so on.
  • In this way, the click relationship data between the search terms can be obtained.
  • The common click count, the common presentation count, the click relationship data, and the presentation relationship data each describe a pair of search words. That is to say, these parameters are correlation parameters between two search terms.
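The two example correlations can be checked with a small sketch. It assumes that the "number of presentations" (or clicks) of a search word means the number of distinct items presented (or clicked) for it, which the text does not state explicitly; the function names are illustrative.

```python
def common_count(items_a, items_b):
    """Number of items shared by two search words."""
    return len(set(items_a) & set(items_b))

def correlation_vs_second(items_a, items_b):
    """common count / count of the second search word."""
    total = len(set(items_b))
    return common_count(items_a, items_b) / total if total else 0.0

def correlation_vs_sum(items_a, items_b):
    """common count / (count of the first + count of the second)."""
    total = len(set(items_a)) + len(set(items_b))
    return common_count(items_a, items_b) / total if total else 0.0

# Presentation data from the example.
q1_shown, q2_shown = ["D1", "D2", "D3", "D4"], ["D2", "D3", "D5", "D7"]
# Click data from the example.
q1_clicked, q2_clicked = ["D1", "D2", "D3", "D4"], ["D2", "D3", "D4", "D7"]

print(common_count(q1_shown, q2_shown), correlation_vs_sum(q1_shown, q2_shown))
print(common_count(q1_clicked, q2_clicked), correlation_vs_sum(q1_clicked, q2_clicked))
```

Either correlation, or the raw common counts, can then feed the clustering distance between search words.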
  • The clustering distance between search words may be calculated based on at least one of the click relationship data, the presentation relationship data, the common presentation count, and the common click count.
  • For example, the presentation data of Q1 is expressed as <D1, D2, D3, D4> and the presentation data of Q2 is expressed as <D2, D3, D5, D7>, and the clustering distance between the search words Q1 and Q2 is calculated using the clustering algorithm. The clustering distances between all search words are calculated in a similar way, thereby clustering the search words.
  • the clustering distance between the search terms may be calculated based on at least one of the click relationship data, the presentation relationship data, the common click count, and the common presentation times, using a spectral clustering or kmeans clustering algorithm, thereby implementing the search term Clustering, and thus obtaining a clustered set of search terms.
  • the set of search words is classified into a set of search words of different values.
  • all collections can be classified into a predetermined number of collections of search terms.
  • For example, the sets may be classified into three categories: a high-value search word set, a medium-value search word set, and a low-value search word set, wherein the value data of the search words in the high-value set is greater than the value data of the search words in the medium-value set, and the value data of the search words in the medium-value set is greater than the value data of the search words in the low-value set.
  • All search word sets are classified into a predetermined number of search word sets according to certain rules.
  • For the existing search words, log data has been used to determine their value data in advance.
  • The value of a search term can be measured by the value it brings per thousand searches, which reflects the profitability of the search term per unit of search, that is, its value.
  • In this way, the value data of each search term can be obtained, and each search term is assigned to one of, for example, three levels (high, medium, and low) according to the distribution of the value data.
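A minimal sketch of this measurement follows, assuming that the "value brought" by a search term is its logged revenue and that the three levels are separated by two cut points. The cut values and the figures in the example are hypothetical; the text only says that levels follow the distribution of the value data.

```python
def value_per_thousand(revenue, searches):
    """Value brought per thousand searches (revenue is an assumed proxy)."""
    if searches <= 0:
        raise ValueError("searches must be positive")
    return 1000.0 * revenue / searches

def value_level(value, low_cut, high_cut):
    """Map a per-thousand value to 'high', 'medium' or 'low'."""
    if value >= high_cut:
        return "high"
    if value >= low_cut:
        return "medium"
    return "low"

# Hypothetical log totals for one search term: 50 units of revenue from 10,000 searches.
v = value_per_thousand(50, 10000)
print(v, value_level(v, low_cut=1.0, high_cut=4.0))
```

In practice the cut points would be derived from the observed distribution, for example from quantiles, rather than fixed by hand.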
  • the aggregated value data of the clustered search word set can be obtained.
  • On that basis, each clustered search word set can be assigned to one of the search word sets of different values.
  • Search terms can also be divided into more or fewer grades, and search word sets can likewise be divided into more or fewer grades.
  • model training is performed using a set of search words of different values to obtain a value regression model.
  • the model training is carried out using a set of search words of different values, and finally the value regression model is obtained.
  • Each search word in each search word set can be used as a sample labeled with the value data corresponding to that set. Specifically, taking the above example, each search word in the high-value search word set is used as a 1-sample, each search word in the medium-value search word set is used as a 0.5-sample, and each search word in the low-value search word set is used as a 0-sample, and training is performed using a logistic regression algorithm to form the value regression model.
  • For example, the search words in cluster 1 are "laptop", "mac air", "thinkpad", etc., and the commercial value is marked as 1 (higher business value);
  • the search words in cluster 2 are "Andy Lau", "Zhang Xueyou", "Andy Lau's album", etc., and the commercial value is marked as 0 (low business value);
  • the search words in cluster 3 are "how big is a 5 inch mobile phone", "is the android phone smooth", etc., and the commercial value is marked as 0.5 (medium business value). That is to say, the parameters of the value regression model are obtained through training, so that the value regression model can then be used to predict the value data of a search term.
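The training step can be illustrated with a small logistic regression fitted by gradient descent on the labels 1, 0.5 and 0. This is a sketch under stated assumptions, not the patent's implementation: the two binary features (whether the term names a product, whether it is phrased as a question) are hypothetical stand-ins for real feature data, and the learning rate and epoch count are arbitrary.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on cross-entropy loss.

    Labels may be fractional (1, 0.5, 0), matching the three value levels.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    m = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the loss with respect to the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

def predict(model, x):
    """Return value data in (0, 1) for one feature vector."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical features: [names a product, phrased as a question].
samples = [[1, 0]] * 3 + [[0, 0]] * 3 + [[1, 1]] * 3   # clusters 1, 2, 3
labels = [1.0] * 3 + [0.0] * 3 + [0.5] * 3
model = train(samples, labels)
print(predict(model, [1, 0]), predict(model, [0, 0]), predict(model, [1, 1]))
```

Because the output is a continuous number rather than a high or low label, a new search term can receive an intermediate value.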
  • FIG. 4 is a flow chart of a method of determining the value of a search term in accordance with an embodiment of the present invention.
  • the feature data of the search term to be tested is input to the value regression model.
  • In the value regression model established by the method shown in FIG. 3, the parameters have already been obtained through the model training shown in FIG. 3.
  • The feature data of the search term to be tested is input into this model.
  • the feature data of the search term may include, for example, but is not limited to, the length of the search term, the category of the search term, the result of the search term segmentation, and the like.
  • For example, the search words in cluster 1 are "laptop", "mac air", "thinkpad", etc., and the commercial value is marked as 1 (higher business value);
  • the search words in cluster 2 are "Andy Lau", "Zhang Xueyou", "Andy Lau's album", etc., and the commercial value is marked as 0 (low business value);
  • the search words in cluster 3 are "how big is a 5 inch mobile phone", "is the android phone smooth", etc., and the commercial value is marked as 0.5 (medium business value).
  • the feature data of the search term "Toshiba Notebook" to be tested is input into a value regression model.
  • At step S420, the value data of the search term to be tested is obtained based on the value regression model.
  • For example, after the feature data of the search term "Toshiba notebook" is input into the value regression model, the value data that the trained model gives for "Toshiba notebook" is, for example, 0.8 (a number greater than 0.5 and less than or equal to 1).
  • Likewise, the value data obtained for the search term "Li Lianjie" is, for example, 0.1 (a number greater than 0 and less than 0.5).
  • FIG. 5 is a block diagram showing the structure of an apparatus 500 for determining the value of a search term according to an embodiment of the present invention.
  • Apparatus 500 can include an input module 510 and an acquisition module 520.
  • the input module 510 can be used to input the search term to be tested into a value regression model.
  • The acquisition module 520 can be configured to obtain value data of the search term to be tested based on a value regression model.
  • the value regression model can be obtained by the following module:
  • a clustering module (not shown), which can be used to cluster existing search words based on click relationship data and/or presentation relationship data to obtain a clustered search word set;
  • a classification module (not shown) that can be used to classify a collection of search terms into a collection of search terms of different values
  • a model acquisition module (not shown) that can be used to model training with a collection of search terms of different values to obtain a value regression model.
  • the set of search words of different values may include a high value search word set, a medium value search word set, and a low value search word set, wherein the value data of the search word in the high value search word set The value data of the search term in the set of search words greater than the medium value; and the value data of the search term in the set of search words of the medium value is greater than the value data of the search term in the set of search words of the low value.
  • the value data of the search word in the high-value search word set is 1.
  • the value data of the search word in the set of the search value of the medium value is 0.5
  • the value data of the search word in the low-value search word set is 0.
  • the clustering module may further include a relational data acquisition sub-module, a calculation sub-module, and an acquisition sub-module.
  • the relationship data obtaining sub-module may be configured to obtain the number of common clicks of different search terms and calculate click relationship data based on it, and/or obtain the number of common presentations of different search terms and calculate presentation relationship data based on it.
  • the calculating submodule may be configured to calculate a clustering distance between the existing search words based on at least one of the click relationship data, the presentation relationship data, the common presentation times, and the common click times;
  • the obtaining sub-module may be configured to cluster existing search words based on the cluster distance to obtain a clustered search word set.
  • the common clicks, the common presentation times, the click relationship data, and the presentation relationship data respectively represent the number of common clicks, the common presentation times, the click relationship data, and the presentation relationship data between the two search words.
  • the model acquisition module may be further configured to:
  • each search term in the high-value search term set is used as one "2" sample, each search term in the medium-value search term set as one "1" sample, and each search term in the low-value search term set as one "0" sample;
  • these samples are trained with the logistic regression algorithm to form the value regression model.
  • modules in the client in the embodiment can be adaptively changed and placed in one or more clients different from the embodiment.
  • the modules in the embodiments can be combined into one module, and further they can be divided into a plurality of sub-modules or sub-units or sub-components.
  • except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings), and all processes or units of any method or client so disclosed, may be combined in any combination.
  • Each feature disclosed in this specification may be replaced by alternative features that provide the same, equivalent or similar purpose.
  • the various component embodiments of the present invention may be implemented in hardware, or in a software module running on one or more processors, or in a combination thereof.
  • a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the apparatus for increasing the exposure of information and the apparatus for determining the value of a search term according to embodiments of the present invention.
  • the invention can also be implemented as a device or device program (e.g., a computer program and a computer program product) for performing some or all of the methods described herein.
  • Such a program implementing the invention may be stored on a computer readable medium or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
  • FIG. 6 illustrates an electronic device that can implement the method of increasing the exposure of information and the method of determining the value of a search term of the present invention.
  • the electronic device conventionally includes a processor 610 and a computer program product or computer readable medium in the form of a memory 620.
  • the memory 620 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read Only Memory), an EPROM, a hard disk, or a ROM.
  • Memory 620 has a memory space 630 for program code 631 for performing any of the method steps described above.
  • storage space 630 for program code may include various program code 631 for implementing various steps in the above methods, respectively.
  • the program code can be read from or written to one or more computer program products.
  • These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
  • Such a computer program product is typically a portable or fixed storage unit as described with reference to FIG. 7.
  • the storage unit may have a storage section or a storage space or the like arranged similarly to the memory 620 in the electronic device of FIG. 6.
  • the program code can be compressed, for example, in an appropriate form.
  • the storage unit comprises a program 631' for performing the steps of the method according to the invention, i.e. code readable by a processor such as 610, which, when executed by the electronic device, causes the electronic device to perform each step of the method described above.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

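A method and apparatus for increasing the exposure rate of information, and a method and apparatus for determining the value of a search term. The method for increasing the exposure rate of information includes: determining whether to perform exposure-rate boosting for information related to a received query term (S110); if so, checking whether the query frequency in the historical query data associated with the query term is greater than or equal to a first threshold (S120); if so, determining candidate information based on historical presentation data related to the query term (S130); estimating the presentation quality parameters of all candidate information based on the basic data of all candidate information and the historical presentation data of the information groups to which the candidates belong (S140); and recommending the candidate information with the highest estimated presentation quality parameter to a candidate presentation queue as recommended information, so that it enters the overall presentation competition with that highest estimated parameter as its presentation quality parameter (S150). This increases the exposure rate of information, gives more presentation opportunities to different information per unit time, and improves user experience.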

Description

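Method and apparatus for increasing the exposure rate of information, and method and apparatus for determining the value of a search term
Technical Field
The present invention relates to the field of computer technology, and more particularly to a method and apparatus for increasing the exposure rate of information, and a method and apparatus for determining the value of a search term.
Background
With the development of Internet services, more and more kinds of services appear on the Internet, such as advertising information services. For information services on the Internet, the exposure or presentation of information is the basic guarantee that an information owner (for example, an advertiser) achieves the intended effect of its advertising information, the main reason search information owners customize creatives and bid competitively, and the basis on which information owners realize value. In the design of a real information bidding system, however, efficiency and fairness must be balanced, because fairness is in essence a recognition of potential and will improve future efficiency. The practical problem is that the creatives customized by information owners differ greatly in how they are presented. On the one hand, some newly designed creatives with better expected revenue efficiency are not presented effectively; on the other hand, some information that was presented in the past, but whose revenue efficiency is declining over time, keeps being presented by the system. Both effects harm the maximization of efficiency and revenue.
To address these problems of Internet information services, the present invention proposes a solution for increasing the exposure rate of information. In view of the shortcomings of existing solutions, the present invention mainly solves the following problems:
it balances efficiency and fairness by giving more presentation opportunities to different information per unit time, so that no single item dominates;
it offers more room for choice and creates a more varied user experience through the diversity of information, avoiding the drop in conversion rate caused by repetition fatigue; and
it dynamically adjusts presentation opportunities according to historical presentation data, increasing the exposure rate of information while keeping the loss minimal.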
发明内容
为了提升未被展现过的信息的曝光率,本发明的主要目的在于提供一种提升信息的曝光率的方法和装置、确定搜索词的价值的方法和装置、计算机程序以及计算机可读介质。
本发明提供了一种提升信息的曝光率的方法,包括:确定是否针对与接收到的查询词相关的信息执行曝光率提升处理;如果是,则检查与该查询词相关联的历史查询数据的查询频率是否大于等于第一阈值;如果是,则基于与所述查询词相关的历史展现数据,确定候选信息;基于所有候选信息的基本数据及所有候选信息所属的信息组的历史展现数据,预估所有候选信息的展现质量 参数;以及将具有预估的最高展现质量参数的候选信息作为推荐信息推荐到候选展现队列,以便以该预估的最高展现质量参数作为该推荐信息的展现质量参数进行整体展现竞争处理。
本发明还提供了一种提升信息的曝光率的装置,包括:第一确定模块,用于确定是否针对与接收到的查询词相关的未被展现过的信息执行曝光率提升处理;检查模块,用于检查与该查询词相关联的历史查询数据的查询频率是否大于等于第一阈值;第二确定模块,用于基于与所述查询词相关的历史展现数据,确定候选信息;预估模块,用于基于所有候选信息的基本数据及所有候选信息所属的信息组的历史展现数据,预估所有候选信息的展现质量参数;推荐模块,用于将具有预估的最高展现质量参数的候选信息作为推荐信息推荐到候选展现队列以便以该预估的最高展现质量参数作为该推荐信息的展现质量参数进行整体展现竞争处理。
依据本发明的一个方面,提供了一种确定搜索词的价值的方法,其特征在于,包括:将待测搜索词的特征数据输入价值回归模型;基于价值回归模型,获取所述待测搜索词的价值数据。;
其中,所述价值回归模型是通过如下方式获取的:将已有搜索词基于点击关系数据和/或展现关系数据而进行聚类,以获得聚类后的搜索词集合;将搜索词集合分类为不同价值的搜索词集合;利用不同价值的搜索词集合进行模型训练以获取价值回归模型。
依据本发明的另一个方面,提供了一种确定搜索词的价值的装置,其特征在于,包括:输入模块,用于将待测搜索词的特征数据输入价值回归模型;获取模块,用于基于价值回归模型,获取所述待测搜索词的价值数据;其中,所述价值回归模型是通过如下模块获取的:聚类模块,用于将已有搜索词基于点击关系数据和/或展现关系数据而进行聚类,以获得聚类后的搜索词集合;分类模块,用于将搜索词集合分类为不同价值的搜索词集合;模型获取模块,用于利用不同价值的搜索词集合进行模型训练以获取价值回归模型。
根据本发明的另一个方面,提供了一种计算机程序,其包括计算机可读代码,当电子设备运行所述计算机可读代码时,导致所述的提升信息的曝光率的方法和确定搜索词的价值的方法被执行。
根据本发明的再一个方面,提供了一种计算机可读介质,其中存储了如上所述的计算机程序。
与现有技术相比,根据本发明提升信息的曝光率的方法和装置的技术方案存在以下有益效果:根据历史展现数据,对展现机会做动态调整,在保证最小损失的前提下,增加信息曝光率,同时在单位时间内给出更多的展现机会给不 同的信息并且提高了用户体验。
根据本发明的确定搜索词的价值的方法和装置,可以更加准确地确定搜索词的价值并基于搜索词价值数据选择展现其中有价值的数据信息(例如广告)从而提高用户体验并提高信息点击率,提升信息曝光率。
上述说明仅是本发明技术方案的概述,为了能够更清楚了解本发明的技术手段,而可依照说明书的内容予以实施,并且为了让本发明的上述和其它目的、特征和优点能够更明显易懂,以下特举本发明的具体实施方式。
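Brief Description of the Drawings
... will become apparent to those of ordinary skill in the art. The drawings are only for the purpose of illustrating preferred embodiments and are not to be considered limiting of the present invention. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
FIG. 1 is a flowchart of a method for increasing the exposure rate of information according to an embodiment of the present invention;
FIG. 2 is a structural diagram of an apparatus for increasing the exposure rate of information according to an embodiment of the present invention;
FIG. 3 shows a flowchart of a method for obtaining a value regression model according to an embodiment of the present invention;
FIG. 4 shows a flowchart of a method for determining the value of a search term according to an embodiment of the present invention;
FIG. 5 shows a structural diagram of an apparatus for determining the value of a search term according to an embodiment of the present invention;
FIG. 6 shows a block diagram of an electronic device for performing the method of the present invention; and
FIG. 7 shows a schematic diagram of a storage unit for holding or carrying program code that implements the method according to the present invention.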
具体实施方式
下面将参照附图更详细地描述本公开的示例性实施例。虽然附图中显示了本公开的示例性实施例,然而应当理解,可以以各种形式实现本公开而不应被这里阐述的实施例所限制。相反,提供这些实施例是为了能够更透彻地理解本公开,并且能够将本公开的范围完整的传达给本领域的技术人员。
下面将参考附图,详细描述本发明改进的技术方案。
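FIG. 1 is a flowchart of a method for increasing the exposure rate of information according to an embodiment of the present invention.
At step S110, it is determined whether to perform exposure-rate boosting for not-yet-presented information related to the received query term.
Specifically, after receiving a query term, the method first determines whether to perform exposure-rate boosting for not-yet-presented information related to that query term. Boosting could be performed for every query request, but monetization efficiency would then be low: results with a large expected regret could receive too many presentation opportunities, lowering the efficiency of the whole system. A decision can therefore be made for each query request so that the proportion of requests that undergo boosting stays within a range; for example, the ratio of query requests selected for boosting to total query requests is kept at no more than 5%. It should be understood that this ratio can be adjusted as needed.
Specifically, the server side has a historical query database that stores historical query data. The historical query data in this database provides the historical request information of each query term, from which an adjustment parameter can be obtained for deciding whether to perform boosting for not-yet-presented information related to the query term.
Specifically, an adjustment parameter is obtained based on historical request data related to the query; then, based on the adjustment parameter and a system-generated random number, it is determined whether to perform exposure-rate boosting for not-yet-presented information related to the received query.
More specifically, the adjustment parameter is first obtained based on historical query data related to the query. For example, the baseline of the adjustment parameter can be set to 1.0 and then adjusted according to the historical request data (which may be, but is not limited to, the query frequency and the click-through rate). The following formula may be used, without limitation, to estimate the adjustment parameter:
adjustment parameter = 1.0 + alpha × click-through rate + beta × log(gama / frequency)    (Formula 1)
In the formula, for example, alpha = 0.2, beta = 0.3, gama = 1000.
Then, based on the adjustment parameter and a system-generated random number, it is determined whether to perform exposure-rate boosting for not-yet-presented information related to the received query.
Specifically, the random number can, for example, be generated from a uniform distribution (for example, with a target ratio of 5%, a uniform distribution with parameter 20), so that the final result satisfies the target ratio (5%) described above. It should of course be understood that the random number can also be generated in other ways.
Then, as described above, the random number makes the final result satisfy the target ratio, and the adjustment parameter yields a judgment threshold for deciding whether to perform boosting for not-yet-presented information related to the received query: judgment threshold = target ratio × adjustment parameter. The decision is then made based on this judgment threshold.
For example, when the judgment threshold is greater than or equal to 5%, boosting is performed for not-yet-presented information related to the received query; when the judgment threshold is less than 5%, boosting is not performed. It should be understood that other threshold values can be chosen as needed and that the method is not limited to the specific value above.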
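As a rough illustration of step S110, the sketch below evaluates Formula 1 with the example constants and then applies the two checks described above. Several points are assumptions rather than statements of the source: the natural logarithm is used because the base is not given, the "uniform distribution with parameter 20" is read as a one-in-twenty draw, and the way the random draw and the judgment threshold are combined in `should_boost` is only one plausible reading. All function names are hypothetical.

```python
import math
import random

ALPHA, BETA, GAMA = 0.2, 0.3, 1000  # example constants given with Formula 1
TARGET_RATIO = 0.05                  # example target ratio (5%)


def adjustment_parameter(click_through_rate, frequency):
    """Formula 1. Assumption: natural logarithm (the source does not state the base)."""
    return 1.0 + ALPHA * click_through_rate + BETA * math.log(GAMA / frequency)


def judgment_threshold(adjustment):
    """Judgment threshold = target ratio * adjustment parameter."""
    return TARGET_RATIO * adjustment


def should_boost(click_through_rate, frequency, rng=random):
    """Hypothetical combination of the threshold check and the random draw."""
    threshold = judgment_threshold(adjustment_parameter(click_through_rate, frequency))
    if threshold < TARGET_RATIO:  # below 5%: boosting is not performed
        return False
    # Assumed reading of "uniform distribution with parameter 20":
    # on average one request in twenty passes this gate.
    return rng.randrange(20) == 0
```

With these constants a rarely issued query gets a larger adjustment parameter than a very frequent one, because the `log(gama / frequency)` term turns negative once the frequency exceeds `gama`.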
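As an example, the information to be presented may be advertising information, and it can be determined whether to perform exposure-rate boosting for not-yet-presented advertising information related to the received query term. That is, an adjustment parameter is first obtained based on historical request data related to the query; then, based on the adjustment parameter and a system-generated random number, it is determined whether to perform boosting for not-yet-presented advertising information related to the received query.
According to an embodiment of the present invention, the information may include at least one of the following: information whose number of presentations is below a predetermined value, information of a predetermined region, and information of a predetermined time period. For example, the information may be advertising information presented fewer than 10 times, advertising information for Beijing, and so on. It should be understood that the information of the present invention may also be of other types.
If it is determined at step S110 that boosting is to be performed for not-yet-presented information related to the received query term, then at step S120 it is checked whether the query frequency in the historical query data associated with the query term is greater than or equal to a first threshold. The first threshold may be, for example, twice per hour.
Specifically, the query frequency in the historical query data associated with the query term is checked. If the query frequency is high, for example greater than or equal to twice per hour, the method proceeds to step S130. If the query frequency is low, for example less than twice per hour, the method ends.
It should be understood that the first threshold is not limited to the above value; any suitable value can be chosen as the first threshold as needed.
Next, at step S130, candidate information is determined based on historical presentation data related to the query term.
Specifically, a pre-built database of historical presentation data is searched for historical presentation data related to the query term, and the historical presentation data associated with the query term whose daily number of presentations is less than a second threshold is identified. For example, the second threshold may be 10. In other words, historical presentation data associated with the query term with fewer than 10 presentations per day is found, and the information corresponding to that data is determined to be candidate information. It should be understood that presentation counts over other time spans can also be compared against a threshold, for example fewer than 70 presentations per week (seven days) or fewer than 25 presentations in 60 hours.
For example, in the historical presentation data related to the query term "New Year gift pack", the daily number of presentations of information ID "A1234123", information ID "A1231312" and information ID "A1343141" is less than the second threshold (for example, 10), so these three items are determined to be candidate information.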
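The candidate selection of step S130 amounts to a filter on daily presentation counts. The following sketch assumes the counts are already available as a mapping from information ID to daily presentations; the counts themselves and the ID "A9999999" are made up for illustration.

```python
def select_candidates(daily_presentations, second_threshold=10):
    """Return the IDs whose daily presentation count is below the second threshold."""
    return [info_id for info_id, count in daily_presentations.items()
            if count < second_threshold]


# Hypothetical daily presentation counts for the query term in the example.
history = {"A1234123": 3, "A1231312": 0, "A1343141": 7, "A9999999": 250}
candidates = select_candidates(history)
```

The same function covers the weekly or 60-hour variants mentioned above if the counts and the threshold are taken over that time span instead.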
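After the candidate information has been determined, at step S140 the presentation quality parameters of all candidate information are estimated based on the basic data of all candidate information and the historical presentation data of the information groups to which the candidates belong.
Specifically, the estimate uses both the basic data of every candidate and the historical presentation data of the information group each candidate belongs to.
For example, information ID "A1234123" and information ID "A1231312" belong to one information group "G111223", and information ID "A1343141" belongs to another information group "G222121". The basic data of these three candidates is queried, together with the historical presentation data of the other information IDs in the two groups they belong to. The historical presentation data of each information ID includes its presentation quality parameter, and the highest presentation quality parameter in each group is taken as the estimated presentation quality parameter of the candidates in that group. For example, the highest presentation quality parameter a in group "G111223" (the presentation quality parameter of information A) is used as the estimated presentation quality parameter a of the candidates with information ID "A1234123" and information ID "A1231312". The highest presentation quality parameter b in group "G222121" (the presentation quality parameter of information b) is used as the estimated presentation quality parameter b of the candidate with information ID "A1343141".
Then, at step S150, the candidate information with the highest estimated presentation quality parameter is recommended to the candidate presentation queue as recommended information, so that it enters the overall presentation competition with that highest estimated parameter as its presentation quality parameter.
Continuing the example, if parameter a is greater than parameter b, the candidate information with the highest estimated presentation quality parameter (parameter a) is recommended to the candidate presentation queue as recommended information.
Specifically, both information ID "A1234123" and information ID "A1231312", which have the highest estimated presentation quality parameter a, can be recommended to the candidate presentation queue.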
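Steps S140 and S150 can be sketched as follows, assuming each group's history is a mapping from information ID to presentation quality parameter. The numeric values 0.8, 0.6 and 0.3 and the peer IDs `info_A`, `info_b` and `info_C` are hypothetical stand-ins for parameters a and b in the example; the function names are likewise illustrative.

```python
def estimate_quality(candidates, group_of, group_history):
    """Give each candidate the highest quality parameter seen in its group."""
    return {info_id: max(group_history[group_of[info_id]].values())
            for info_id in candidates}


def recommend(estimates):
    """Return every candidate that carries the highest estimated parameter."""
    best = max(estimates.values())
    return [info_id for info_id, quality in estimates.items() if quality == best]


group_of = {"A1234123": "G111223", "A1231312": "G111223", "A1343141": "G222121"}
group_history = {
    "G111223": {"info_A": 0.8, "info_C": 0.3},  # parameter a = 0.8 (made up)
    "G222121": {"info_b": 0.6},                 # parameter b = 0.6 (made up)
}
estimates = estimate_quality(list(group_of), group_of, group_history)
queue = recommend(estimates)
```

Because a > b here, both candidates from group "G111223" end up in the queue, matching the example above; a round-robin variant would pick just one of them per request.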
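Optionally, round-robin polling can be used so that only one of the multiple candidates is recommended to the candidate presentation queue at a time.
It should be understood that other suitable ways of recommending one of the multiple candidates to the candidate presentation queue can also be used, so that the candidate enters the overall presentation competition with the highest estimated presentation quality parameter as its presentation quality parameter.
The overall presentation competition is the selection of candidate results on the basis of a ranking. Taking search advertising as an example, the advertisements that enter the overall presentation competition are first scored according to predetermined rules and sorted, for example in descending order of score. The ranking criterion for search advertisements can, for example, be determined by two factors: the quality of the ad creative and the bid price of the keyword. That is, ranking score = ad creative quality × keyword bid price. Next, the push-left and filter results are computed. This step resembles classification: high-quality advertisements are pushed to the left side and poor-quality advertisements are filtered out. Finally, the advertisements are presented according to business requirements. Not every advertisement is necessarily presented; for example, search advertising typically shows 3 results on the left and 8 results on the right.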
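The ranking rule above (ranking score = ad creative quality × keyword bid price) can be illustrated with a short sketch. The push-left and filtering logic is not specified in the text, so this example simply assumes that the top 3 ranked ads go to the left and the next 8 to the right; the ad records and their values are invented.

```python
def rank_ads(ads, left_slots=3, right_slots=8):
    """Sort ads by quality * bid (descending) and split them into left/right slots."""
    ranked = sorted(ads, key=lambda ad: ad["quality"] * ad["bid"], reverse=True)
    return ranked[:left_slots], ranked[left_slots:left_slots + right_slots]


# Hypothetical ads taking part in the overall presentation competition.
ads = [
    {"id": "ad1", "quality": 0.9, "bid": 2.0},
    {"id": "ad2", "quality": 0.5, "bid": 5.0},
    {"id": "ad3", "quality": 0.8, "bid": 1.0},
    {"id": "ad4", "quality": 0.2, "bid": 3.0},
]
left, right = rank_ads(ads)
```

A boosted candidate would enter this ranking with its estimated presentation quality parameter in place of a measured quality value.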
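The solution of the present invention for increasing the exposure rate of information thus gives information a better chance of entering the overall presentation competition and, ultimately, of being presented.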
本发明还提供了一种提升信息的曝光率的装置。如图2所示,图2是根据本发明一实施例的提升信息的曝光率的装置的结构图200。
装置200可以包括:第一确定模块210、检查模块220、第二确定模块230、 预估模块240以及推荐模块250。
其中,第一确定模块210可以用于确定是否针对与接收到的查询词相关的未被展现过的信息执行曝光率提升处理。
根据本申请的实施例,第一确定模块210可以进一步包括:获取子模块211和第一确定子模块212。其中获取子模块211可以用于基于与所述查询词相关的历史查询数据,获取调整参数;并且第一确定子模块212可以用于基于所述调整参数和***产生的随机数,确定是否针对与接收到的查询词相关的未被展现过的信息执行曝光率提升处理。
检查模块220可以用于检查与该查询词相关联的历史查询数据的查询频率是否大于等于第一阈值。
在一种实施例中,检查模块220进一步可以被配置成:如果与该查询词相关联的历史查询数据的查询频率小于第一阈值,则放弃曝光率提升处理。第一阈值例如可以为:两次每小时。
第二确定模块230可以用于基于与所述查询词相关的历史展现数据,确定候选信息。
根据本申请的实施例,第二确定模块230可以进一步包括:查找子模块231和第二确定子模块232。其中,查找子模块231可以用于查找天级展现次数小于第二阈值的与该查询词相关联的历史展现数据;并且,第二确定子模块232可以用于将与查找到的历史展现数据对应的信息确定为候选信息。
预估模块240可以用于基于所有候选信息的基本数据及所有候选信息所属的信息组的历史展现数据,预估所有候选信息的展现质量参数。
推荐模块250可以用于将具有预估的最高展现质量参数的候选信息作为推荐信息推荐到候选展现队列以便以该预估的最高展现质量参数作为该推荐信息的展现质量参数进行整体展现竞争处理。
根据本申请的实施例,装置200还可以包括进一步包括:展现质量参数确定模块(未示出),该模块可以用于如果该推荐信息在整体展现竞争处理中获得了展现,则将该推荐信息在该展现过程中获得的展现质量参数确定为该推荐信息的初始展现质量参数。
由于图2所描述的本发明的装置所包括的各个模块的具体实施方式与本发明的方法中的步骤的具体实施方式是相对应的,由于已经对图1进行了详细的描述,所以为了不模糊本发明,在此不再对各个模块的具体细节进行描述。
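One implementation of a method for determining the value of a search term mainly includes the following steps:
Step 1: count the number of ad presentations and the number of ad clicks of every search term in the ad presentation log;
Step 2: compute the ad click-through rate of the search term = number of ad clicks of the search term / number of ad presentations of the search term;
Step 3: if the ad click-through rate of the search term is below a threshold and the number of ad presentations is above a threshold, the search term is low-value; conversely, if the ad click-through rate is above a threshold and the number of ad presentations is above a threshold, the search term is high-value. A concrete example: suppose the click-through-rate threshold is 5% and the presentation threshold is 50. The search term "prose about the afterglow of the setting sun" has 100 ad presentations and 1 click, so it is low-value; the search term "laptop" has 10000 ad presentations and 1000 clicks, so it is high-value.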
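The three steps above reduce to a pair of threshold comparisons. A minimal sketch, using the example thresholds (5% click-through rate, 50 presentations) as defaults; returning `None` when the rule does not apply is an assumption, since the text only defines the high and low cases.

```python
def classify_by_ctr(presentations, clicks, ctr_threshold=0.05, presentation_threshold=50):
    """Label a search term 'high' or 'low'; None when the rule does not apply."""
    if presentations <= presentation_threshold:
        return None  # too few presentations to judge
    ctr = clicks / presentations
    if ctr < ctr_threshold:
        return "low"
    if ctr > ctr_threshold:
        return "high"
    return None


prose_label = classify_by_ctr(100, 1)        # "prose about the afterglow of the setting sun"
laptop_label = classify_by_ctr(10000, 1000)  # "laptop"
```

The output is a coarse label rather than a number, which is the limitation the value regression model is meant to address.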
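In this implementation, the click-through-rate threshold and the presentation threshold must be specified manually, so the quality of the result depends heavily on the practitioner's experience. The implementation can only judge whether the value is high or low and cannot give a concrete numeric value, which is not smooth enough for practical use. Moreover, because it relies mainly on statistics, it generalizes poorly, its coverage is low, and its accuracy leaves room for improvement, so it cannot fully meet the needs of a search advertising system.
The improved technical solution of the present invention for determining the value of a search term is described in detail below with reference to the drawings.
For a better understanding of the technical solution, the method for obtaining the value regression model is introduced first. FIG. 3 is a flowchart of a method for obtaining a value regression model according to an embodiment of the present invention.
At step S310, existing search terms are clustered based on click relationship data and/or presentation relationship data to obtain clustered search term sets.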
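Specifically, the number of common clicks of different search terms is first obtained and click relationship data is calculated from it, and/or the number of common presentations of different search terms is obtained and presentation relationship data is calculated from it.
For example, the number of common presentations of different search terms can be obtained and presentation relationship data calculated from it.
Suppose one input search term is Q1 and the data presented by the search engine for it is D1, D2, D3, D4, while another input search term is Q2 and the data presented by the search engine for it is D2, D3, D5, D7. Their number of common presentations is then 2 (D2, D3). Some measure of relatedness can be used to describe the presentation relationship between Q1 and Q2. For example, if relatedness is defined as the number of common presentations divided by the number of presentations of Q1, the presentation relationship of Q1 and Q2 can be expressed as a presentation relatedness of 2/4 = 0.5.
It should be understood that any other suitable way of expressing the presentation relationship between two search terms can be used. For example, relatedness can also be defined as the number of common presentations divided by the number of presentations of Q2, or divided by (the number of presentations of Q1 + the number of presentations of Q2), and so on.
Similarly, presentation relationship data can be obtained for every pair of search terms.
In addition, the number of common clicks of different search terms can be obtained and click relationship data calculated from it.
Suppose one input search term is Q1 and the data presented by the search engine for it and clicked by users is D1, D2, D3, D4, while another input search term is Q2 and the data presented for it and clicked by users is D2, D3, D4, D7. Their number of common clicks is then 3 (D2, D3, D4). Some measure of relatedness can be used to describe the click relationship between Q1 and Q2. For example, if relatedness is defined as the number of common clicks divided by the number of clicks of Q1, the click relationship of Q1 and Q2 can be expressed as a click relatedness of 3/4 = 0.75.
Similarly, click relationship data can be obtained for every pair of search terms.
It should be understood that any other suitable way of expressing the click relationship between two search terms can be used. For example, relatedness can also be defined as the number of common clicks divided by the number of clicks of Q2, or divided by (the number of clicks of Q1 + the number of clicks of Q2), and so on.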
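Both relatedness measures in the examples above are the size of the overlap divided by the size of Q1's own set. The sketch below reproduces the two worked figures; the function name `relatedness` is illustrative and the result lists are treated as sets of distinct items.

```python
def relatedness(items_q1, items_q2):
    """Number of items shared by Q1 and Q2 divided by the number of Q1's items."""
    q1, q2 = set(items_q1), set(items_q2)
    return len(q1 & q2) / len(q1)


# Presentation example: shared D2, D3 out of Q1's four presented items.
presentation_rel = relatedness(["D1", "D2", "D3", "D4"], ["D2", "D3", "D5", "D7"])
# Click example: shared D2, D3, D4 out of Q1's four clicked items.
click_rel = relatedness(["D1", "D2", "D3", "D4"], ["D2", "D3", "D4", "D7"])
```

Note that this definition is asymmetric: swapping the arguments divides by Q2's count instead, which is one of the alternative definitions the text allows.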
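It should be understood that the number of common clicks, the number of common presentations, the click relationship data and the presentation relationship data each refer to a quantity between two search terms. In other words, these parameters are pairwise relatedness parameters between search terms.
After at least one of the click relationship data, presentation relationship data, number of common clicks and number of common presentations has been obtained, the clustering distance between existing search terms can be calculated based on at least one of them. The existing search terms are then clustered based on the clustering distance to obtain clustered search term sets.
Continuing the example above, the presentation data of Q1 is represented as <D1, D2, D3, D4> and the presentation data of Q2 as <D2, D3, D5, D7>, and a clustering algorithm is used to calculate the clustering distance between the search terms Q1 and Q2. The clustering distances of all search terms are calculated in a similar way, which yields the clustering of the search terms. For example, spectral clustering or the k-means clustering algorithm can be used to calculate the clustering distance between search terms based on at least one of the click relationship data, presentation relationship data, number of common clicks and number of common presentations, thereby clustering the search terms and obtaining the clustered search term sets.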
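At step S320, the search term sets are classified into search term sets of different values.
Specifically, all sets can be classified into a predetermined number of search term sets. Optionally, in a preferred embodiment of the present invention, the sets are classified into three classes: a high-value search term set, a medium-value search term set and a low-value search term set, where the value data of the search terms in the high-value set is greater than that of the search terms in the medium-value set, and the value data of the search terms in the medium-value set is greater than that of the search terms in the low-value set. All search term sets are classified into the predetermined number of search term sets according to certain rules. More specifically, the value data of each search term has been determined in advance from log statistics. For example, the value generated per thousand searches can be used as an approximate measure of the value data of a search term; it reflects the earning power of the search term per unit of search, that is, its value. In this way the value data of search terms can be obtained from log statistics, and each search term can be assigned to, for example, a high, medium or low tier according to the distribution of the value data. The set-level value data of a clustered search term set can then be derived from the value data of its individual search terms, and the clustered search term sets can likewise be assigned to search term sets of different values.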
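It should be understood that the rules for dividing search terms and/or search term sets into different values are flexible and can be adjusted according to system requirements. For example, search terms can be divided into more or fewer tiers, and search term sets can likewise be divided into more or fewer tiers. All such divisions fall within the scope of protection of the present invention.
At step S330, model training is performed with the search term sets of different values to obtain the value regression model.
After the search terms have been classified, model training is performed with the search term sets of different values, finally yielding the value regression model.
Specifically, each search term in each search term set can be used as one sample corresponding to the value data of that set. Continuing the example above, each search term in the high-value search term set is used as one "2" sample, each search term in the medium-value search term set as one "1" sample, and each search term in the low-value search term set as one "0" sample, and these samples are trained with the logistic regression algorithm to form the value regression model. For example, suppose the value regression model has labeled data for 3 clusters: the search terms in cluster 1 are, for example, "laptop", "mac air", "thinkpad", etc., with the commercial value labeled 1 (high commercial value); the search terms in cluster 2 are "Andy Lau", "Zhang Xueyou", "Andy Lau's album", etc., with the commercial value labeled 0 (low commercial value); the search terms in cluster 3 are "How big is a 5-inch mobile phone", "Is the android phone smooth", etc., with the commercial value labeled 0.5 (medium commercial value). In other words, the parameters of the value regression model are obtained through training, and the model is then used to predict the value data of a search term to be tested.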
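To make the training step more concrete, here is a small self-contained sketch of logistic regression fitted by gradient descent. It is only an illustration under several assumptions: the labels 1, 0.5 and 0 from the cluster example are used directly as soft targets, and each search term is reduced to three hand-made binary features (`is_product`, `is_person`, `is_question`) that do not come from the source, which only mentions features such as term length, category and word segmentation results.

```python
import math


def predict(weights, bias, features):
    """Sigmoid of the linear score: a value between 0 and 1."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-score))


def train(samples, epochs=2000, learning_rate=0.5):
    """Batch gradient descent on the cross-entropy loss with soft targets."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, target in samples:
            error = predict(weights, bias, features) - target
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(samples)
    return weights, bias


# Hypothetical features: [is_product, is_person, is_question]
samples = [
    ([1, 0, 0], 1.0), ([1, 0, 0], 1.0), ([1, 0, 0], 1.0),  # "laptop", "mac air", "thinkpad"
    ([0, 1, 0], 0.0), ([0, 1, 0], 0.0), ([0, 1, 0], 0.0),  # cluster 2 (celebrity names)
    ([1, 0, 1], 0.5), ([1, 0, 1], 0.5),                    # cluster 3 (product questions)
]
weights, bias = train(samples)
toshiba_score = predict(weights, bias, [1, 0, 0])    # "Toshiba notebook"
li_lianjie_score = predict(weights, bias, [0, 1, 0]) # "Li Lianjie"
```

With such crude features a product-like term scores above 0.5 and a person-name term below 0.5; a realistic model would need the richer features the description lists to produce graded values such as 0.8 or 0.1.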
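It should be understood that the search terms in the search term sets of different values can be turned into samples in any other suitable way and that the method is not limited to the way described above.
This completes the description, with reference to FIG. 3, of how the value regression model is built.
Next, the method of the present invention for determining the value of a search term is described using the resulting value regression model and with reference to FIG. 4. FIG. 4 is a flowchart of a method for determining the value of a search term according to an embodiment of the present invention.
At step S410, the feature data of the search term to be tested is input into the value regression model. Specifically, to predict the value data of a search term with the value regression model built by the method of FIG. 3, the feature data of the search term to be tested must first be extracted and input into the model. The parameters of the value regression model have already been obtained through the model training shown in FIG. 3, and the feature data of the search term to be tested is now input into the model. The feature data of a search term may include, for example but without limitation, the length of the search term, the category of the search term, and the result of segmenting the search term into words.
As an example, suppose the value regression model has labeled data for 3 clusters: the search terms in cluster 1 are, for example, "laptop", "mac air", "thinkpad", etc., with the commercial value labeled 1 (high commercial value); the search terms in cluster 2 are "Andy Lau", "Zhang Xueyou", "Andy Lau's album", etc., with the commercial value labeled 0 (low commercial value); the search terms in cluster 3 are "How big is a 5-inch mobile phone", "Is the android phone smooth", etc., with the commercial value labeled 0.5 (medium commercial value). First, the feature data of the search term to be tested, "Toshiba notebook", is input into the value regression model.
At step S420, the value data of the search term to be tested is obtained based on the value regression model.
Continuing the example, when the feature data of the search term "Toshiba notebook" is input into the value regression model, the value data that the trained model gives for "Toshiba notebook" is, for example, 0.8 (a number greater than 0.5 and less than or equal to 1). As another example, the value data obtained from the value regression model for the search term "Li Lianjie" is, for example, 0.1 (a number greater than 0 and less than 0.5).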
本发明还提供了一种确定搜索词的价值的装置。如图5所示,图5是根据本发明一实施例的确定搜索词的价值的装置500的结构框图。
装置500可以包括输入模块510以及获取模块520。其中,输入模块510可以用于将待测搜索词输入价值回归模型。获取模块530可以用于基于价值回归模型,获取所述待测搜索词的价值数据。
根据本发明的实施例,价值回归模型可以是通过如下模块获取的:
聚类模块(未示出),其可以用于将已有搜索词基于点击关系数据和/或展现关系数据而进行聚类,以获得聚类后的搜索词集合;
分类模块(未示出),其可以用于将搜索词集合分类为不同价值的搜索词集合;
模型获取模块(未示出),其可以用于利用不同价值的搜索词集合进行模型训练以获取价值回归模型。
根据本发明的实施例,上述不同价值的搜索词集合可以包括高价值的搜索词集合、中价值的搜索词集合以及低价值的搜索词集合,其中高价值的搜索词集合中搜索词的价值数据大于中价值的搜索词集合中搜索词的价值数据;以及中价值的搜索词集合中搜索词的价值数据大于低价值的搜索词集合中搜索词的价值数据。
其中,高价值的搜索词集合中搜索词的价值数据为1、中价值的搜索词集合中搜索词的价值数据为0.5以及低价值的搜索词集合中搜索词的价值数据为0。
根据本发明的实施例,其中,聚类模块可以进一步包括关系数据获取子模块、计算子模块以及获取子模块。
其中,关系数据获取子模块,可以用于获取不同搜索词的共同点击次数并基于所述共同点击次数计算点击关系数据和/或不同搜索词的共同展现次数基于所述共同展现次数计算展现关系数据;
计算子模块,可以用于基于所述点击关系数据、展现关系数据、共同展现次数和共同点击次数中的至少一个,计算已有搜索词之间的聚类距离;
获取子模块,可以用于基于所述聚类距离将已有搜索词进行聚类,以获得聚类后的搜索词集合。
其中,共同点击次数、共同展现次数、点击关系数据、展现关系数据分别表示两个搜索词之间的共同点击次数、共同展现次数、点击关系数据、展现关系数据。
根据本发明的实施例,模型获取模块可以进一步被配置成:
将高价值的搜索词集合中的每个搜索词作为一份2样本、中价值的搜索词集合中的每个搜索词作为一份1样本并且低价值的搜索词集合中的每个搜索词作为一份0样本利用所述逻辑回归算法进行训练以形成所述价值回归模型。
由于本实施例的装置所实现的功能基本相应于前述图3和图4所示的方法实施例,故本实施例的描述中未详尽之处,可以参见前述实施例中的相关说明,在此不做赘述。
在此提供的算法和显示不与任何特定计算机、虚拟***或者其它设备固有相关。各种通用***也可以与基于在此的示教一起使用。根据上面的描述,构造这类***所要求的结构是显而易见的。此外,本发明也不针对任何特定编程语言。应当明白,可以利用各种编程语言实现在此描述的本发明的内容,并且上面对特定语言所做的描述是为了披露本发明的最佳实施方式。
在此处所提供的说明书中,说明了大量具体细节。然而,能够理解,本发明的实施例可以在没有这些具体细节的情况下实践。在一些实例中,并未详细示出公知的方法、结构和技术,以便不模糊对本说明书的理解。
类似地,应当理解,为了精简本公开并帮助理解各个发明方面中的一个或多个,在上面对本发明的示例性实施例的描述中,本发明的各个特征有时被一起分组到单个实施例、图、或者对其的描述中。然而,并不应将该公开的方法解释成反映如下意图:即所要求保护的本发明要求比在每个权利要求中所明确 记载的特征更多的特征。更确切地说,如下面的权利要求书所反映的那样,发明方面在于少于前面公开的单个实施例的所有特征。因此,遵循具体实施方式的权利要求书由此明确地并入该具体实施方式,其中每个权利要求本身都作为本发明的单独实施例。
本领域那些技术人员可以理解,可以对实施例中的客户端中的模块进行自适应性地改变并且把它们设置在与该实施例不同的一个或多个客户端中。可以把实施例中的模块组合成一个模块,以及此外可以把它们分成多个子模块或子单元或子组件。除了这样的特征和/或过程或者单元中的至少一些是相互排斥之外,可以采用任何组合对本说明书(包括伴随的权利要求、摘要和附图)中公开的所有特征以及如此公开的任何方法或者客户端的所有过程或单元进行组合。除非另外明确陈述,本说明书(包括伴随的权利要求、摘要和附图)中公开的每个特征可以由提供相同、等同或相似目的的替代特征来代替。
此外,本领域的技术人员能够理解,尽管在此所述的一些实施例包括其它实施例中所包括的某些特征而不是其它特征,但是不同实施例的特征的组合意味着处于本发明的范围之内并且形成不同的实施例。例如,在下面的权利要求书中,所要求保护的实施例的任意之一都可以以任意的组合方式来使用。
本发明的各个部件实施例可以以硬件实现,或者以在一个或者多个处理器上运行的软件模块实现,或者以它们的组合实现。本领域的技术人员应当理解,可以在实践中使用微处理器或者数字信号处理器(DSP)来实现根据本发明实施例的提升信息的曝光率的装置和确定搜索词的价值的装置中的一些或者全部部件的一些或者全部功能。本发明还可以实现为用于执行这里所描述的方法的一部分或者全部的设备或者装置程序(例如,计算机程序和计算机程序产品)。这样的实现本发明的程序可以存储在计算机可读介质上,或者可以具有一个或者多个信号的形式。这样的信号可以从因特网网站上下载得到,或者在载体信号上提供,或者以任何其他形式提供。
例如,图6示出了可以实现本发明的提升信息的曝光率的方法和确定搜索词的价值的方法的电子设备。该电子设备传统上包括处理器610和以存储器620形式的计算机程序产品或者计算机可读介质。存储器620可以是诸如闪存、EEPROM(电可擦除可编程只读存储器)、EPROM、硬盘或者ROM之类的电子存储器。存储器620具有用于执行上述方法中的任何方法步骤的程序代码631的存储空间630。例如,用于程序代码的存储空间630可以包括分别用于实现上面的方法中的各种步骤的各个程序代码631。这些程序代码可以从一个或者多个计算机程序产品中读出或者写入到这一个或者多个计算机程序产品中。这些计算机程序产品包括诸如硬盘,紧致盘(CD)、存储卡或者软盘之类的程序代码载体。 这样的计算机程序产品通常为如参考图7所述的便携式或者固定存储单元。该存储单元可以具有与图6的电子设备中的存储器620类似布置的存储段或者存储空间等。程序代码可以例如以适当形式进行压缩。通常,存储单元包括用于执行根据本发明的方法步骤的程序631’,即可以由例如诸如610之类的处理器读取的代码,这些代码当由电子设备运行时,导致该电子设备执行上面所描述的方法中的各个步骤。
本文中所称的“一个实施例”、“实施例”或者“一个或者多个实施例”意味着,结合实施例描述的特定特征、结构或者特性包括在本发明的至少一个实施例中。此外,请注意,这里“在一个实施例中”的词语例子不一定全指同一个实施例。
在此处所提供的说明书中,说明了大量具体细节。然而,能够理解,本发明的实施例可以在没有这些具体细节的情况下被实践。在一些实例中,并未详细示出公知的方法、结构和技术,以便不模糊对本说明书的理解。
应该注意的是,上述实施例对本发明进行的详细说明并不是对本发明进行限制,并且本领域技术人员在不脱离所附权利要求的范围的情况下可设计出替换实施例。在权利要求中,不应将位于括号之间的任何参考符号构造成对权利要求的限制。单词“包含”或“包括”不排除存在未列在权利要求中的元件或步骤。位于元件之前的单词“一”或“一个”不排除存在多个这样的元件。本发明可以借助于包括有若干不同元件的硬件以及借助于适当编程的计算机来实现。在列举了若干装置的单元权利要求中,这些装置中的若干个可以是通过同一个硬件项来具体体现。单词第一、第二、以及第三等的使用不表示任何顺序。可将这些单词解释为名称。
此外,还应当注意,本说明书中使用的语言主要是为了可读性和教导的目的而选择的,而不是为了解释或者限定本发明的主题而选择的。因此,在不偏离所附权利要求书的范围和精神的情况下,对于本技术领域的普通技术人员来说许多修改和变更都是显而易见的。对于本发明的范围,对本发明所做的公开是说明性的,而非限制性的,本发明的范围由所附权利要求书限定。

Claims (25)

  1. 一种提升信息的曝光率的方法,其特征在于,包括:
    确定是否针对与接收到的查询词相关的信息执行曝光率提升处理;
    如果是,则检查与该查询词相关联的历史查询数据的查询频率是否大于等于第一阈值;
    如果是,则基于与所述查询词相关的历史展现数据,确定候选信息;
    基于所有候选信息的基本数据及所有候选信息所属的信息组的历史展现数据,预估所有候选信息的展现质量参数;以及
    将具有预估的最高展现质量参数的候选信息作为推荐信息推荐到候选展现队列,以便以该预估的最高展现质量参数作为该推荐信息的展现质量参数进行整体展现竞争处理。
  2. 根据权利要求1所述的方法,其特征在于,进一步包括:
    如果与该查询词相关联的历史查询数据的查询频率小于第一阈值,则放弃曝光率提升处理。
  3. 根据权利要求1所述的方法,其特征在于,进一步包括:
    如果该推荐信息在整体展现竞争处理中获得了展现,则将该推荐信息在该展现过程中获得的展现质量参数确定为该推荐信息的初始展现质量参数。
  4. 根据权利要求1所述的方法,其特征在于,确定是否针对与接收到的查询词相关的未被展现过的信息执行曝光率提升处理进一步包括:
    基于与所述查询词相关的历史查询数据,获取调整参数;以及
    基于所述调整参数和***产生的随机数,确定是否针对与接收到的查询词相关的未被展现过的信息执行曝光率提升处理。
  5. 根据权利要求1所述的方法,其特征在于,基于与所述查询词相关的历史展现数据,确定候选信息,进一步包括:
    查找天级展现次数小于第二阈值的与该查询词相关联的历史展现数据;并且,
    将与查找到的历史展现数据对应的信息确定为候选信息。
  6. 根据权利要求1-5任一项所述的方法,其特征在于,所述信息包括以下至少之一:展现次数在预定值以下的信息、预定地域的信息、预定时段的信息。
  7. 一种提升信息的曝光率的装置,其特征在于,包括:
    第一确定模块,用于确定是否针对与接收到的查询词相关的未被展现过的信息执行曝光率提升处理;
    检查模块,用于检查与该查询词相关联的历史查询数据的查询频率是否大于等于第一阈值;
    第二确定模块,用于基于与所述查询词相关的历史展现数据,确定候选信息;
    预估模块,用于基于所有候选信息的基本数据及所有候选信息所属的信息组的历史展现数据,预估所有候选信息的展现质量参数;
    推荐模块,用于将具有预估的最高展现质量参数的候选信息作为推荐信息推荐到候选展现队列以便以该预估的最高展现质量参数作为该推荐信息的展现质量参数进行整体展现竞争处理。
  8. 根据权利要求7所述的装置,其特征在于,检查模块进一步被配置成:
    如果与该查询词相关联的历史查询数据的查询频率小于第一阈值,则放弃曝光率提升处理。
  9. 根据权利要求7所述的装置,其特征在于,进一步包括:
    展现质量参数确定模块,用于如果该推荐信息在整体展现竞争处理中获得了展现,则将该推荐信息在该展现过程中获得的展现质量参数确定为该推荐信息的初始展现质量参数。
  10. 根据权利要求7所述的装置,其特征在于,第一确定模块进一步包括:
    获取子模块,用于基于与所述查询词相关的历史查询数据,获取调整参数;以及
    第一确定子模块,用于基于所述调整参数和***产生的随机数,确定是否针对与接收到的查询词相关的未被展现过的信息执行曝光率提升处理。
  11. 根据权利要求7所述的装置,其特征在于,第二确定模块进一步包括:
    查找子模块,用于查找天级展现次数小于第二阈值的与该查询词相关联的历史展现数据;并且,
    第二确定子模块,用于将与查找到的历史展现数据对应的信息确定为候选信息。
  12. 一种确定搜索词的价值的方法,其特征在于,包括:
    将待测搜索词的特征数据输入价值回归模型;
    基于价值回归模型,获取所述待测搜索词的价值数据;
    其中,所述价值回归模型是通过如下方式获取的:
    将已有搜索词基于点击关系数据和/或展现关系数据而进行聚类,以获得聚类后的搜索词集合;
    将搜索词集合分类为不同价值的搜索词集合;
    利用不同价值的搜索词集合进行模型训练以获取价值回归模型。
  13. 根据权利要求12所述的方法,其特征在于,所述不同价值的搜索词集合包括高价值的搜索词集合、中价值的搜索词集合以及低价值的搜索词集合, 其中高价值的搜索词集合中搜索词的价值数据大于中价值的搜索词集合中搜索词的价值数据;以及中价值的搜索词集合中搜索词的价值数据大于低价值的搜索词集合中搜索词的价值数据。
  14. 根据权利要求13所述的方法,其特征在于,高价值的搜索词集合中搜索词的价值数据为1、中价值的搜索词集合中搜索词的价值数据为0.5以及低价值的搜索词集合中搜索词的价值数据为0。
  15. 根据权利要求12所述的方法,其特征在于,将已有搜索词基于所述已有搜索词之间的点击关系数据和展现关系数据而进行聚类,以获得聚类后的搜索词集合,进一步包括:
    获取不同搜索词的共同点击次数并基于所述共同点击次数计算点击关系数据和/或获取不同搜索词的共同展现次数并基于所述共同展现次数计算展现关系数据;
    基于所述点击关系数据、展现关系数据、共同展现次数和共同点击次数中的至少一个,计算已有搜索词之间的聚类距离;
    基于所述聚类距离将已有搜索词进行聚类,以获得聚类后的搜索词集合。
  16. 根据权利要求15所述的方法,其特征在于,共同点击次数、共同展现次数、点击关系数据、展现关系数据分别表示两个搜索词之间的共同点击次数、共同展现次数、点击关系数据、展现关系数据。
  17. 根据权利要求13所述的方法,其特征在于,利用不同价值的搜索词集合进行模型训练以获取价值回归模型,进一步包括:将每个搜索词集合中的每个搜索词作为一份对应该搜索词集合的价值数据的样本,具体地,
    将高价值的搜索词集合中的每个搜索词作为一份2样本、中价值的搜索词集合中的每个搜索词作为一份1样本并且低价值的搜索词集合中的每个搜索词作为一份0样本利用所述逻辑回归算法进行训练以形成所述价值回归模型。
  18. 一种确定搜索词的价值的装置,其特征在于,包括:
    输入模块,用于将待测搜索词的特征数据输入价值回归模型;
    获取模块,用于基于价值回归模型,获取所述待测搜索词的价值数据;
    其中,所述价值回归模型是通过如下模块获取的:
    聚类模块,用于将已有搜索词基于点击关系数据和/或展现关系数据而进行聚类,以获得聚类后的搜索词集合;
    分类模块,用于将搜索词集合分类为不同价值的搜索词集合;
    模型获取模块,用于利用不同价值的搜索词集合进行模型训练以获取价值回归模型。
  19. 根据权利要求18所述的装置,其特征在于,所述不同价值的搜索词集 合包括高价值的搜索词集合、中价值的搜索词集合以及低价值的搜索词集合,其中高价值的搜索词集合中搜索词的价值数据大于中价值的搜索词集合中搜索词的价值数据;以及中价值的搜索词集合中搜索词的价值数据大于低价值的搜索词集合中搜索词的价值数据。
  20. 根据权利要求19所述的装置,其特征在于,高价值的搜索词集合中搜索词的价值数据为1、中价值的搜索词集合中搜索词的价值数据为0.5以及低价值的搜索词集合中搜索词的价值数据为0。
  21. 根据权利要求18所述的装置,其特征在于,聚类模块进一步包括:
    关系数据获取子模块,用于获取不同搜索词的共同点击次数并基于所述共同点击次数计算点击关系数据和/或获取不同搜索词的共同展现次数基于所述共同展现次数计算展现关系数据;
    计算子模块,用于基于所述点击关系数据、展现关系数据、共同展现次数和共同点击次数中的至少一个,计算已有搜索词之间的聚类距离;以及
    获取子模块,用于基于所述聚类距离将已有搜索词进行聚类,以获得聚类后的搜索词集合。
  22. 根据权利要求21所述的装置,其特征在于,共同点击次数、共同展现次数、点击关系数据、展现关系数据分别表示两个搜索词之间的共同点击次数、共同展现次数、点击关系数据、展现关系数据。
  23. 根据权利要求19所述的装置,其特征在于,模型获取模块进一步被配置成:
    将每个搜索词集合中的每个搜索词作为一份对应该搜索词集合的价值数据的样本,具体地,
    将高价值的搜索词集合中的每个搜索词作为一份2样本、中价值的搜索词集合中的每个搜索词作为一份1样本并且低价值的搜索词集合中的每个搜索词作为一份0样本利用所述逻辑回归算法进行训练以形成所述价值回归模型。
  24. 一种计算机程序,包括计算机可读代码,当电子设备运行所述计算机可读代码运行时,导致权利要求1-6和12-17中的任一项权利要求所述的提升信息的曝光率的方法和确定搜索词的价值的方法被执行。
  25. 一种计算机可读介质,其中存储了如权利要求24所述的计算机程序。
PCT/CN2014/094298 2014-02-24 2014-12-19 一种提升信息的曝光率的方法和装置、确定搜索词的价值的方法和装置 WO2015124024A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201410063058.0 2014-02-24
CN201410063058.0A CN104866493B (zh) 2014-02-24 2014-02-24 一种提升信息的曝光率的方法和装置
CN201410098737.1 2014-03-17
CN201410098737.1A CN104933047B (zh) 2014-03-17 2014-03-17 一种确定搜索词的价值的方法和装置

Publications (1)

Publication Number Publication Date
WO2015124024A1 true WO2015124024A1 (zh) 2015-08-27

Family

ID=53877618

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/094298 WO2015124024A1 (zh) 2014-02-24 2014-12-19 一种提升信息的曝光率的方法和装置、确定搜索词的价值的方法和装置

Country Status (1)

Country Link
WO (1) WO2015124024A1 (zh)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105447724A (zh) * 2015-12-15 2016-03-30 腾讯科技(深圳)有限公司 内容项推荐方法及装置
US20170293934A1 (en) * 2015-05-11 2017-10-12 Tencent Technology (Shenzhen) Company Limited Method for determining validity of delivering of promotion information, monitoring server and terminal
CN110210882A (zh) * 2018-03-21 2019-09-06 腾讯科技(深圳)有限公司 推广位匹配方法和装置、推广信息展示方法和装置
CN111428125A (zh) * 2019-01-10 2020-07-17 北京三快在线科技有限公司 排序方法、装置、电子设备及可读存储介质
CN112749333A (zh) * 2020-07-24 2021-05-04 腾讯科技(深圳)有限公司 资源搜索方法、装置、计算机设备和存储介质
CN112765452A (zh) * 2020-12-31 2021-05-07 北京百度网讯科技有限公司 搜索推荐方法、装置及电子设备
CN113111085A (zh) * 2021-04-08 2021-07-13 达而观信息科技(上海)有限公司 基于流式数据的自动化层级探索方法和装置

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1877581A (zh) * 2006-07-12 2006-12-13 百度在线网络技术(北京)有限公司 应用于互联网搜索引擎的广告展现***及广告展现方法
CN101331475A (zh) * 2005-12-14 2008-12-24 微软公司 在线商业意图的自动检测
CN101980210A (zh) * 2010-11-12 2011-02-23 百度在线网络技术(北京)有限公司 一种标的词分类分级方法及***
CN101980211A (zh) * 2010-11-12 2011-02-23 百度在线网络技术(北京)有限公司 一种机器学习模型及其建立方法
US20110231241A1 (en) * 2010-03-18 2011-09-22 Yahoo! Inc. Real-time personalization of sponsored search based on predicted click propensity
CN102387411A (zh) * 2010-09-06 2012-03-21 康佳集团股份有限公司 一种机顶盒及其播放广告的方法
CN103164454A (zh) * 2011-12-15 2013-06-19 百度在线网络技术(北京)有限公司 关键词分组方法及***

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101331475A (zh) * 2005-12-14 2008-12-24 微软公司 在线商业意图的自动检测
CN1877581A (zh) * 2006-07-12 2006-12-13 百度在线网络技术(北京)有限公司 应用于互联网搜索引擎的广告展现***及广告展现方法
US20110231241A1 (en) * 2010-03-18 2011-09-22 Yahoo! Inc. Real-time personalization of sponsored search based on predicted click propensity
CN102387411A (zh) * 2010-09-06 2012-03-21 康佳集团股份有限公司 一种机顶盒及其播放广告的方法
CN101980210A (zh) * 2010-11-12 2011-02-23 百度在线网络技术(北京)有限公司 一种标的词分类分级方法及***
CN101980211A (zh) * 2010-11-12 2011-02-23 百度在线网络技术(北京)有限公司 一种机器学习模型及其建立方法
CN103164454A (zh) * 2011-12-15 2013-06-19 百度在线网络技术(北京)有限公司 关键词分组方法及***

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170293934A1 (en) * 2015-05-11 2017-10-12 Tencent Technology (Shenzhen) Company Limited Method for determining validity of delivering of promotion information, monitoring server and terminal
US10719847B2 (en) * 2015-05-11 2020-07-21 Tencent Technology (Shenzhen) Company Limited Method for determining validity of delivering of promotion information, monitoring server and terminal
CN105447724A (zh) * 2015-12-15 2016-03-30 腾讯科技(深圳)有限公司 内容项推荐方法及装置
CN110210882A (zh) * 2018-03-21 2019-09-06 腾讯科技(深圳)有限公司 推广位匹配方法和装置、推广信息展示方法和装置
CN111428125A (zh) * 2019-01-10 2020-07-17 北京三快在线科技有限公司 排序方法、装置、电子设备及可读存储介质
CN111428125B (zh) * 2019-01-10 2023-05-30 北京三快在线科技有限公司 排序方法、装置、电子设备及可读存储介质
CN112749333A (zh) * 2020-07-24 2021-05-04 腾讯科技(深圳)有限公司 资源搜索方法、装置、计算机设备和存储介质
CN112749333B (zh) * 2020-07-24 2024-01-16 腾讯科技(深圳)有限公司 资源搜索方法、装置、计算机设备和存储介质
CN112765452A (zh) * 2020-12-31 2021-05-07 北京百度网讯科技有限公司 搜索推荐方法、装置及电子设备
CN112765452B (zh) * 2020-12-31 2024-02-27 北京百度网讯科技有限公司 搜索推荐方法、装置及电子设备
CN113111085A (zh) * 2021-04-08 2021-07-13 达而观信息科技(上海)有限公司 基于流式数据的自动化层级探索方法和装置
CN113111085B (zh) * 2021-04-08 2024-01-30 达观数据有限公司 基于流式数据的自动化层级探索方法和装置

Similar Documents

Publication Publication Date Title
WO2015124024A1 (zh) 一种提升信息的曝光率的方法和装置、确定搜索词的价值的方法和装置
Ibrahim et al. Decoding the sentiment dynamics of online retailing customers: Time series analysis of social media
CN104281622B (zh) 一种社交媒体中的信息推荐方法和装置
WO2021174944A1 (zh) 基于目标对象活跃度的消息推送方法及相关设备
US10348550B2 (en) Method and system for processing network media information
CN105760400B (zh) 一种基于搜索行为的推送消息排序方法及装置
TWI512653B (zh) Information providing method and apparatus, method and apparatus for determining the degree of comprehensive relevance
US8341101B1 (en) Determining relationships between data items and individuals, and dynamically calculating a metric score based on groups of characteristics
WO2018053966A1 (zh) 点击率预估
CN110457577B (zh) 数据处理方法、装置、设备和计算机存储介质
WO2019169978A1 (zh) 资源推荐方法及装置
WO2018040069A1 (zh) 信息推荐***及方法
WO2018149337A1 (zh) 一种信息投放方法、装置及服务器
CN108921398B (zh) 店铺质量评价方法及装置
CN107526810B (zh) 建立点击率预估模型的方法及装置、展示方法及装置
US8234230B2 (en) Data classification tool using dynamic allocation of attribute weights
CN108369674B (zh) 使用目标聚类方法对具有混合属性类型的客户进行细分的***和方法
TWI803823B (zh) 資源資訊推送方法、裝置、伺服器及存儲介質
CN105956882A (zh) 一种获取采购需求的方法及装置
WO2022081267A1 (en) Product evaluation system and method of use
CN109740036B (zh) Ota平台酒店排序方法及装置
WO2019242453A1 (zh) 信息处理方法及装置、存储介质、电子装置
CN106445965B (zh) 信息推广处理方法及装置
CN111626767A (zh) 资源数据的发放方法、装置及设备
CN117541322B (zh) 一种基于大数据分析的广告内容智能生成方法及***

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14883023

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14883023

Country of ref document: EP

Kind code of ref document: A1