WO2020047861A1 - Method and apparatus for generating a ranking model - Google Patents

Method and apparatus for generating a ranking model

Info

Publication number
WO2020047861A1
WO2020047861A1 (PCT/CN2018/104683)
Authority
WO
WIPO (PCT)
Prior art keywords
sample
document
click
initial model
deviation
Prior art date
Application number
PCT/CN2018/104683
Other languages
English (en)
French (fr)
Inventor
Yang Wang (汪洋)
Ziniu Hu (胡子牛)
Qu Peng (彭曲)
Hang Li (李航)
Original Assignee
Beijing ByteDance Network Technology Co., Ltd. (北京字节跳动网络技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co., Ltd.
Priority to PCT/CN2018/104683 (WO2020047861A1)
Priority to US16/980,897 (US11403303B2)
Publication of WO2020047861A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G06F16/248 Presentation of query results
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

Definitions

  • Embodiments of the present application relate to the field of computer technology, and in particular, to a method and an apparatus for generating a ranking model.
  • Learning to rank is a supervised-learning approach to ordering a set of documents. Its goal is to design algorithms that use manually labeled data to mine the rules hidden in the data, so that for an arbitrary query the documents can be ordered by their relevance to it.
  • In practice, click data is used to train the ranking model, and the search results are then sorted by that model.
  • The existing way of training a ranking model is to first estimate the position bias of clicks from the click data, and then use the single-document method (pointwise approach) to train the ranking model based on the click data and the position bias.
  • the embodiments of the present application provide a method and device for generating a ranking model.
  • an embodiment of the present application provides a method for generating a ranking model.
  • The method includes: obtaining a sample set, where the samples in the sample set include query information, a clicked first-position document in the query result, and an unclicked second-position document; and performing the following training steps: for each sample in the sample set, inputting the query information, the first-position document and the second-position document in the sample into the initial model to obtain scores for the input documents; determining the target value of the sample based on the obtained scores, the click bias of the first position and the non-click bias of the second position, where the click bias and the non-click bias are used to characterize, respectively, the influence of a document's position in the query result on the probability of the document being clicked and the probability of it not being clicked; updating the initial model based on the target values of the samples; determining whether training of the initial model is complete; and, in response to determining that training is complete, determining the updated initial model to be the ranking model.
  • The training steps further include: re-estimating the click bias and non-click bias of each position based on the updated initial model and the sample set, so as to update the click bias and non-click bias of each position.
  • The method further comprises: in response to determining that training of the initial model is not complete, continuing the training steps using the updated initial model and the updated click bias and non-click bias of each position.
  • Determining the target value of the sample based on the obtained scores, the click bias of the first position and the non-click bias of the second position includes: inputting the obtained scores, the click bias of the first position and the non-click bias of the second position into a pre-established gradient calculation formula, and determining the gradient calculation result as the target value of the sample.
  • The initial model is a decision tree; and updating the initial model based on the target values of the samples includes: creating a decision tree to fit the target values of the samples, and updating the initial model based on the created decision tree.
  • Determining whether training of the initial model is complete includes: determining the number of decision trees that have been created, comparing that number with a preset number, and determining whether training is complete according to the comparison result.
  • Determining the target value of the sample based on the obtained scores, the click bias of the first position and the non-click bias of the second position includes: inputting the obtained scores, the click bias of the first position and the non-click bias of the second position into a pre-established loss function to obtain a loss value, and determining the loss value as the target value of the sample.
  • Determining whether training of the initial model is complete includes: determining the average of the target values of the samples, comparing that average with a preset value, and determining whether training is complete according to the comparison result.
  • an embodiment of the present application provides an apparatus for generating a ranking model.
  • The apparatus includes: an obtaining unit configured to obtain a sample set, where the samples in the sample set include query information, a clicked first-position document in the query result, and an unclicked second-position document; and a first training unit configured to perform the following training steps: for each sample in the sample set, input the query information, the first-position document and the second-position document into the initial model to obtain scores for the input documents; determine the target value of the sample based on the obtained scores, the click bias of the first position and the non-click bias of the second position, where the click bias and the non-click bias are used to characterize the influence of a document's position in the query result on the probability of the document being clicked and the probability of it not being clicked; update the initial model based on the target values of the samples; determine whether training of the initial model is complete; and, in response to determining that training is complete, determine the updated initial model to be the ranking model.
  • The first training unit is further configured to: after updating the initial model based on the target values of the samples, re-estimate the click bias and non-click bias of each position based on the updated initial model and the sample set, so as to update the click bias and non-click bias of each position.
  • The apparatus further includes: a second training unit configured to, in response to determining that training of the initial model is not complete, continue the training steps using the updated initial model and the updated click bias and non-click bias of each position.
  • The first training unit is further configured to: input the obtained scores, the click bias of the first position and the non-click bias of the second position into a pre-established gradient calculation formula, and determine the gradient calculation result as the target value of the sample.
  • the initial model is a decision tree; and the first training unit is further configured to: create a decision tree to fit the target values of each sample; and update the initial model based on the created decision tree.
  • the first training unit is further configured to: determine the number of decision trees that have been created, compare the number with a preset number; and determine whether the initial model has been trained according to the comparison result.
  • The first training unit is further configured to: input the obtained scores, the click bias of the first position and the non-click bias of the second position into a pre-established loss function to obtain a loss value, and determine the loss value as the target value of the sample.
  • the first training unit is further configured to: determine an average value of the target values of each sample, compare the average value with a preset value; and determine whether the initial model has been trained according to the comparison result.
  • An embodiment of the present application provides a method for generating information, including: in response to receiving a query request containing target query information, retrieving candidate documents matching the target query information and collecting them into a candidate document set; inputting the candidate documents in the candidate document set into a ranking model generated by the method described in any one of the embodiments of the first aspect above, to obtain the score of each candidate document; and sorting the candidate documents in the candidate document set in descending order of score and returning the sorted result.
  • An embodiment of the present application provides an apparatus for generating information, including: a retrieval unit configured to, in response to receiving a query request containing target query information, retrieve candidate documents matching the target query information and collect them into a candidate document set; an input unit configured to input the candidate documents in the candidate document set into a ranking model generated by the method described in any one of the embodiments of the first aspect above, to obtain the score of each candidate document; and a ranking unit configured to sort the candidate documents in the candidate document set in descending order of score and return the sorted result.
  • An embodiment of the present application provides an electronic device including: one or more processors; and a storage device storing one or more programs thereon, where the one or more programs, when executed by the one or more processors, cause the following to be performed: a sample set is obtained, where the samples in the sample set include query information, the clicked first-position document and the unclicked second-position document in the query result; and the following training steps are performed: for each sample in the sample set, the query information, the first-position document and the second-position document are input into the initial model to obtain scores for the input documents, and the target value of the sample is determined based on the obtained scores, the click bias of the first position and the non-click bias of the second position.
  • an embodiment of the present application provides a computer-readable medium having a computer program stored thereon.
  • When the program is executed by a processor, it causes the processor to: obtain a sample set, where the samples in the sample set include query information, the clicked first-position document and the unclicked second-position document in the query result; and perform the following training steps: for each sample in the sample set, input the query information, the first-position document and the second-position document into the initial model to obtain scores for the input documents; determine the target value of the sample based on the obtained scores, the click bias of the first position and the non-click bias of the second position, where the click bias and the non-click bias are used to characterize the influence of a document's position in the query result on the probability of the document being clicked and the probability of it not being clicked; update the initial model based on the target values of the samples; determine whether training of the initial model is complete; and, in response to determining that training is complete, determine the updated initial model to be the ranking model.
  • the method and device for generating a ranking model can obtain the sample set and use the samples in the sample set to train the initial model.
  • the samples in the sample set may include query information, a first-position document clicked in the query result, and a second-position document not clicked.
  • the query information, the first position document, and the second position document in the sample are input to the initial model, and the scores of the first position document and the second position document can be obtained.
  • the target value of the sample can be determined.
  • the initial model can be updated based on the target values of each sample. Finally, it can be determined whether the initial model has been trained.
  • the trained initial model can be determined as the ranking model.
  • In this way, a model for ranking can be obtained, which helps enrich the ways in which such models can be generated.
  • The model takes into account not only the click bias but also the non-click bias.
  • The resulting ranking model therefore improves the accuracy of the ranking.
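Taken together, the training steps summarized above can be sketched end to end. The following toy run is illustrative only: the "initial model" is reduced to a single-weight linear scorer, a bias-weighted pairwise gradient stands in for the target-value formula, and all data and bias values are invented.

```python
import math

def train_ranking_model(samples, click_bias, nonclick_bias,
                        n_steps=200, lr=0.1):
    """Toy version of the training steps: score each document pair with a
    one-weight linear model, weight the pairwise gradient by the positions'
    click/non-click biases, and repeatedly update the model."""
    w = 0.0  # the "initial model": score(d) = w * d
    for _ in range(n_steps):
        grad = 0.0
        for (p1, d1), (p2, d2) in samples:
            s1, s2 = w * d1, w * d2                    # document scores
            lam = -1.0 / (1.0 + math.exp(s1 - s2))     # pairwise gradient
            lam /= click_bias[p1] * nonclick_bias[p2]  # debias by position
            grad += lam * (d1 - d2)                    # chain rule w.r.t. w
        w -= lr * grad / len(samples)                  # update the model
    return w

# Each sample pairs a clicked (position, feature) with an unclicked one;
# clicked documents happen to have the larger feature value here.
samples = [((1, 2.0), (2, 1.0)), ((2, 3.0), (3, 0.5))]
click_bias = {1: 1.0, 2: 0.8, 3: 0.6}
nonclick_bias = {1: 1.0, 2: 0.9, 3: 0.7}
w = train_ranking_model(samples, click_bias, nonclick_bias)
assert w > 0  # the model learned to score clicked documents higher
```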
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present application can be applied;
  • FIG. 2 is a flowchart of an embodiment of a method for generating a ranking model according to the present application.
  • FIG. 3 is a schematic diagram of an application scenario of a method for generating a ranking model according to the present application.
  • FIG. 4 is a flowchart of another embodiment of a method for generating a ranking model according to the present application.
  • FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for generating a ranking model according to the present application.
  • FIG. 6 is a flowchart of an embodiment of a method for generating information according to the present application.
  • FIG. 7 is a schematic structural diagram of an embodiment of an apparatus for generating information according to the present application.
  • FIG. 8 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
  • FIG. 1 illustrates an exemplary system architecture 100 to which the method for generating a ranking model or the apparatus for generating a ranking model of the present application can be applied.
  • the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105.
  • the network 104 is a medium for providing a communication link between the terminal devices 101, 102, 103 and the server 105.
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, and so on.
  • the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like.
  • Various communication client applications can be installed on the terminal devices 101, 102, and 103, such as information browsing applications, search applications, instant messaging tools, email clients, social platform software, and the like.
  • the terminal devices 101, 102, and 103 may be hardware or software.
  • the terminal devices 101, 102, and 103 can be various electronic devices with a display screen, including but not limited to smart phones, tablet computers, laptop computers, and desktop computers.
  • When the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above. They can be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. This is not specifically limited here.
  • the server 105 may be a server that provides various services, and may be, for example, a processing server that supports a search engine.
  • the processing server may store a sample set or obtain a sample set from another device.
  • a sample set can contain multiple samples.
  • the sample may include query information, a first-position document clicked and a second-position document not clicked in the query result.
  • The processing server can use the samples in the sample set to train the initial model, and can store the training result (such as the generated ranking model). In this way, after a user sends a query request using the terminal devices 101, 102, and 103, the server 105 can sort the query results and then return the sorted query results to the terminal devices 101, 102, and 103.
  • the server 105 may be hardware or software.
  • the server can be implemented as a distributed server cluster consisting of multiple servers or as a single server.
  • the server can be implemented as multiple software or software modules (for example, to provide distributed services), or it can be implemented as a single software or software module. It is not specifically limited here.
  • the method for generating the ranking model provided by the embodiment of the present application is generally executed by the server 105, and accordingly, the apparatus for generating the ranking model is generally set in the server 105.
  • The numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative; there can be any number of terminal devices, networks, and servers according to implementation needs.
  • a flowchart 200 of one embodiment of a method for generating a ranking model according to the present application is shown.
  • the method for generating a ranking model includes the following steps:
  • Step 201: Obtain a sample set.
  • The execution subject of the method for generating a ranking model (for example, the server 105 shown in FIG. 1) may obtain a sample set in multiple ways.
  • For example, the execution subject may obtain an existing sample set from another server that stores samples (for example, a database server) through a wired or wireless connection.
  • a user may collect samples through a terminal device (such as the terminal devices 101, 102, and 103 shown in FIG. 1), and store these samples locally to generate a sample set.
  • Wireless connection methods may include, but are not limited to, 3G/4G, WiFi, Bluetooth, WiMAX, ZigBee, UWB (ultra-wideband), and other wireless connection methods now known or developed in the future.
  • the samples in the sample set may be obtained in advance from historical behavior information (for example, click data, query requests, etc.) of the user.
  • a sample set can include a large number of samples.
  • the samples in the sample set may include query information, a first-position document clicked in the query result, and a second-position document not clicked.
  • the query information may be a characteristic representation of a query string in a query request sent by a user (for example, a feature vector may be used for representation).
  • The first-position document may be any clicked document in the query result; the position of that document in the query result is referred to as the first position.
  • The second-position document may be any unclicked document in the query result; the position of that document in the query result is referred to as the second position.
  • the document here can be expressed in the form of a feature vector of the document.
  • As an example, a user enters the string "machine learning" into a search engine; the string "machine learning" is a query string.
  • In the query result returned by the search engine, the user clicked on the document ranked fifth but did not click on the document ranked sixth.
  • The fifth-ranked document clicked by the user may be referred to as the first-position document, with the fifth position as the first position.
  • The sixth-ranked document not clicked by the user may be referred to as the second-position document, with the sixth position as the second position.
  • After a user sends a query request, the user usually clicks on only a small number of documents in the returned query result. Therefore, a single query request can give rise to multiple samples.
  • For example, if the returned query result contains 10 documents and the user clicked on two of them, 2 × 8 = 16 samples can be formed.
  • The documents in a sample can be labeled in advance.
  • The first-position document in the sample may carry a label indicating that the document was clicked.
  • The second-position document in the sample may carry a label indicating that the document was not clicked.
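The pairing rule described above can be sketched as follows. This is a minimal illustration; the function name and the (position, document, clicked) tuple layout are our own, not from the application.

```python
from itertools import product

def build_samples(query_info, results):
    """Form one sample per (clicked, unclicked) document pair in a query
    result. `results` holds (position, document, was_clicked) tuples."""
    clicked = [(pos, doc) for pos, doc, c in results if c]
    unclicked = [(pos, doc) for pos, doc, c in results if not c]
    # Each sample: query info, clicked first-position document (with its
    # position), and unclicked second-position document (with its position).
    return [(query_info, first, second)
            for first, second in product(clicked, unclicked)]

# A query result of 10 documents with 2 clicks yields 2 * 8 = 16 samples.
results = [(i, "doc%d" % i, i in (5, 7)) for i in range(1, 11)]
samples = build_samples("machine learning", results)
assert len(samples) == 16
```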
  • the above-mentioned execution subject may perform the training steps of steps 202 to 204.
  • Step 202: For each sample in the sample set, input the query information, the first-position document and the second-position document in the sample into the initial model to obtain scores for the input documents, and determine the target value of the sample based on the obtained scores, the click bias of the first position and the non-click bias of the second position.
  • In this embodiment, the execution subject may proceed as follows:
  • the query information, the first position document and the second position document in the sample are input to the initial model, and scores of the input documents are obtained.
  • the initial model may output the score of the first position document and the score of the second position document by performing feature extraction and analysis on the query information, the first position document, and the second position document.
  • the output score can be used to characterize the relevance of the document calculated by the initial model to the query information. The higher the score of a document, the more relevant the document is to the query information.
  • The initial model may be any of various existing model structures created based on machine-learning techniques that are applicable to the document-pair method (pairwise approach), such as RankNet, LambdaRank, SVMrank, LambdaMART, or decision trees.
  • the initial model can perform feature extraction on the document and query information, then analyze and process the extracted features, and finally output the score of the document.
  • The document-pair method (pairwise approach) is one class of learning-to-rank algorithms.
  • the target value of the sample is determined based on the obtained score, the click deviation of the first position and the non-click deviation of the second position.
  • the obtained score, the click deviation of the first position and the non-click deviation of the second position can be input into a pre-established target value calculation formula to obtain the target value of the sample.
  • the above-mentioned target value calculation formula may be a function or formula related to document score, click deviation, and non-click deviation, which is established in advance.
  • it may be a pre-established gradient calculation formula, a pre-established loss function, a partial derivative of the pre-established loss function, and the like.
  • the value output by the above target value calculation formula is the target value.
  • the above-mentioned target value calculation formula may also be other forms of functions or formulas established in advance, and is not limited to the above list.
  • the click bias can be used to characterize the impact of the position of the document in the query result on the probability of the document being clicked.
  • the non-click bias can be used to characterize the impact of the position of the document in the query result on the probability that the document has not been clicked.
  • the click deviation and non-click deviation of each position can be represented by numerical values.
  • the initial values of the click deviation and the non-click deviation of each position can be set in advance (for example, the initial values are set to 1).
  • click bias can also be referred to as position bias.
  • the execution body may input the obtained score, the click deviation of the first position, and the non-click deviation of the second position into a pre-established loss function to obtain a loss value.
  • the above-mentioned loss value (that is, the value of the objective function) is determined as the target value of the sample.
  • the loss function can be used to estimate the degree of inconsistency between the predicted value of the initial model and the true value. It is a non-negative real-valued function. In general, the smaller the loss value, the better the model's robustness.
  • the loss function here may be based on an existing loss function (such as a cross-entropy loss function), and combined with a click bias and a non-click bias, and established in advance.
  • For example, the product of the click bias and the non-click bias can be used as the denominator and the cross-entropy loss function as the numerator to establish the loss function here.
  • For each sample, the denominator of the loss function is the product of the click bias of the first position (where the sample's first-position document is located) and the non-click bias of the second position (where the sample's second-position document is located).
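A minimal sketch of such a loss, assuming a pairwise cross-entropy numerator over the score difference and the bias product as denominator (the application describes this structure but not an exact formula, so the details here are our assumptions):

```python
import math

def debiased_pair_loss(s_first, s_second, click_bias, nonclick_bias):
    """Pairwise cross-entropy loss for one sample, divided by the product
    of the first position's click bias and the second position's
    non-click bias (illustrative form)."""
    # Probability that the clicked document outranks the unclicked one.
    p = 1.0 / (1.0 + math.exp(-(s_first - s_second)))
    return -math.log(p) / (click_bias * nonclick_bias)

# With unit biases the loss is plain pairwise cross-entropy; a larger
# bias product down-weights the pair.
assert debiased_pair_loss(1.0, 0.0, 1.0, 1.0) > 0
assert debiased_pair_loss(1.0, 0.0, 2.0, 2.0) < debiased_pair_loss(1.0, 0.0, 1.0, 1.0)
```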
  • the above-mentioned execution body may input the obtained score, the click deviation of the first position, and the non-click deviation of the second position into a gradient calculation formula established in advance, and the gradient calculation result Determine the target value for the sample.
  • the gradient calculation formula here may be based on an existing gradient calculation formula (for example, a gradient calculation formula used by a lambdaRank model and a lambdaMART model), and is established in advance by combining click deviation and non-click deviation.
  • the product of click deviation and non-click deviation can be used as the denominator
  • the existing gradient calculation formula used by models such as lambdaRank, lambdaMART, etc. can be used as the numerator to establish the gradient calculation formula here.
  • For each sample, the denominator of the gradient calculation formula is the product of the click bias of the first position (where the sample's first-position document is located) and the non-click bias of the second position (where the sample's second-position document is located).
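A corresponding sketch of the debiased pairwise gradient, assuming a RankNet/LambdaRank-style numerator (again, the exact formula is not given in the text, so this form is an assumption):

```python
import math

def debiased_lambda(s_first, s_second, click_bias, nonclick_bias, sigma=1.0):
    """LambdaRank-style pairwise gradient for one sample, divided by the
    product of the first position's click bias and the second position's
    non-click bias (illustrative form)."""
    lam = -sigma / (1.0 + math.exp(sigma * (s_first - s_second)))
    return lam / (click_bias * nonclick_bias)

# The gradient magnitude shrinks as the clicked document pulls ahead,
# and is damped further by a larger bias product.
assert abs(debiased_lambda(2.0, 0.0, 1.0, 1.0)) < abs(debiased_lambda(0.5, 0.0, 1.0, 1.0))
assert abs(debiased_lambda(1.0, 0.0, 3.0, 1.0)) < abs(debiased_lambda(1.0, 0.0, 1.0, 1.0))
```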
  • Step 203: Update the initial model based on the target value of each sample.
  • the execution body may update the initial model based on the target values of each sample.
  • the initial model can be updated in different ways.
  • As an example, the execution subject may first determine the average of the loss values of the samples. Then, the backpropagation algorithm can be used to find the gradient of this average loss with respect to the parameters of the initial model, and the gradient descent algorithm can then be used to update the parameters based on that gradient. It should be noted that the backpropagation algorithm, the gradient descent algorithm, and machine-learning methods in general are well-known technologies that are widely studied and applied at present, and are not repeated here. In practice, the initial model can adopt model structures such as RankNet and SVMrank.
  • the execution body may directly use a gradient descent algorithm to update the initial model parameters based on the gradient.
  • the initial model can use model structures such as lambdaRank.
  • the initial model may be a Decision Tree
  • the target value of each sample may be a gradient.
  • the above-mentioned execution body may first create a new decision tree and fit the target values of each sample.
  • the initial model can then be updated based on the decision tree created.
  • In this case, the MART (Multiple Additive Regression Tree) algorithm may be used.
  • MART can also be called GBDT (Gradient Boosting Decision Tree), GBRT (Gradient Boosting Regression Tree), TreeNet (Decision Tree Network), and so on. It should be noted that the MART algorithm is a well-known technology that is widely studied and applied at present, and will not be repeated here.
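The MART-style update can be sketched with depth-1 regression trees (stumps) fitted to the per-sample target values. This toy uses squared loss on one-dimensional data, so the targets are simply the residuals; it illustrates the boosting loop, not the application's implementation.

```python
def fit_stump(xs, targets):
    """Fit a depth-1 regression tree (stump) to the per-sample targets."""
    best = None
    for t in sorted(set(xs)):
        left = [g for x, g in zip(xs, targets) if x <= t]
        right = [g for x, g in zip(xs, targets) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((g - lm) ** 2 for g in left)
               + sum((g - rm) ** 2 for g in right))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, n_trees=20, lr=0.1):
    """MART-style loop: each new tree fits the current residuals (the
    per-sample target values), then is added into the model."""
    preds = [0.0] * len(ys)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]         # targets to fit
        tree = fit_stump(xs, residuals)                        # new decision tree
        preds = [p + lr * tree(x) for p, x in zip(preds, xs)]  # update model
    return preds

xs = list(range(10))
ys = [0.0] * 5 + [1.0] * 5
preds = boost(xs, ys)
assert sum((p - y) ** 2 for p, y in zip(preds, ys)) < 0.1
```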
  • Step 204: Determine whether training of the initial model is complete.
  • the above-mentioned execution subject may use various methods to determine whether the initial model has been trained.
  • the number of executions of a training step may be determined.
  • the initial model may be a decision tree.
  • the execution body can record the number of decision trees created. Each time a decision tree is created, the above execution body can update the recorded quantity. After the initial model is updated in step 203, the execution body may determine the number of decision trees created. Based on the comparison result between this number and a preset number, it is determined whether the initial model has been trained. For example, in response to determining that the number of decision trees created is not less than a preset number, it may be determined that initial model training is complete. In response to determining that the number of decision trees created is less than a preset number, it may be determined that the initial model is not trained.
  • As an example, the execution subject may first determine the average of the target values of the samples. Then, this average can be compared with a preset value, and based on the comparison result it is determined whether training of the initial model is complete. For example, in response to determining that the average is less than or equal to the preset value, it may be determined that training of the initial model is complete; in response to determining that the average is greater than the preset value, it may be determined that training is not complete.
  • The preset value can generally represent the ideal degree of inconsistency between the predicted value and the true value; that is, when the average target value is less than or equal to the preset value, the predicted value can be considered close to the true value. In practice, the preset value can be set according to actual needs.
  • the execution entity may compare the loss value of each sample with a preset value.
  • The execution body may then determine the proportion of samples whose loss value is less than or equal to the preset value among the samples in the sample set. When the proportion reaches a preset sample ratio (such as 95%), it can be determined that training of the initial model is complete.
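The loss-ratio test above can be sketched as follows; the threshold and ratio values are illustrative, not prescribed by the application:

```python
# Hypothetical sketch of the sample-ratio stopping test: training is judged
# complete when the fraction of samples whose loss is at or below a preset
# value reaches a preset sample ratio (e.g. 95%). Names are illustrative.
def ratio_below_threshold(losses, preset_value):
    # Fraction of samples whose loss does not exceed the preset value.
    hits = sum(1 for loss in losses if loss <= preset_value)
    return hits / len(losses)

def training_complete(losses, preset_value=0.1, preset_ratio=0.95):
    return ratio_below_threshold(losses, preset_value) >= preset_ratio
```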
  • The execution body may also determine whether the initial model has been trained in other ways; implementations are not limited to those described above.
  • Step 205 In response to determining that the initial model training is completed, determine the updated initial model as a ranking model.
  • the above-mentioned execution body may determine the initial model updated in step 203 as the ranking model.
  • The execution body may further re-estimate the click deviation and non-click deviation of each position based on the updated initial model and the sample set, so as to update the click deviation and non-click deviation of each position.
  • the specific implementation is as follows:
  • the execution subject may first input the query information, the first location document, and the second location document in each sample in the sample set into the updated initial model.
  • the score of each document in each sample is thus given.
  • Then, with the current non-click deviation of each position held fixed, the obtained scores can be substituted into the gradient calculation formula, and the formula can be set equal to zero, thereby estimating the click deviation of each position.
  • Next, with the estimated click deviation of each position held fixed, the obtained scores can be substituted into the gradient calculation formula, and the formula can be set equal to zero, thereby estimating the non-click deviation of each position.
  • Thereby, the click deviation and the non-click deviation of each position are updated.
  • In practice, the execution body may first input the query information, the first position document, and the second position document of each sample in the sample set into the updated initial model, thereby obtaining the score of each document in each sample. Then, partial derivatives of the loss function can be taken to obtain the gradient calculation formula of the loss function. After that, with the current non-click deviation of each position held fixed, the obtained scores are substituted into the gradient calculation formula, and the formula is set equal to zero, thereby estimating the click deviation of each position.
  • the estimation may be performed sequentially according to the position order. That is, the click deviation of the first position is estimated first; then the click deviation of the second position is estimated; and so on.
  • the estimation of the non-click deviation of each position may be performed sequentially according to the position order. That is, first estimate the unclicked deviation of the first position; then estimate the unclicked deviation of the second position; and so on.
  • When estimating the non-click deviation of a position, samples whose query results contain a document that was not clicked at that position can be used. Thereby, the click deviation and non-click deviation of each position can be updated.
  • the above-mentioned execution body may use the updated initial model and the click bias and non-click bias of each updated position to continue performing the training step.
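A schematic sketch of one half of the alternating re-estimation (updating the click deviations with the non-click deviations held fixed), assuming a RankNet-style pairwise gradient and a simplified normalization rule; both are stand-ins, since the application does not fix the exact formulae:

```python
import math

# Schematic sketch (NOT the patent's exact formulae): with the model's
# scores fixed, the click bias of each position is re-estimated while the
# non-click biases stay fixed. The update rule is a simplified stand-in:
# each position's click bias is set proportional to the accumulated
# pairwise gradient mass observed at that position, normalized so the
# first position has bias 1.

def pairwise_lambda(s_clicked, s_unclicked):
    # RankNet/LambdaMART-style gradient magnitude for one
    # (clicked, unclicked) document pair.
    return 1.0 / (1.0 + math.exp(s_clicked - s_unclicked))

def reestimate_click_bias(samples, scores, t_minus):
    # samples: list of (click_position, unclick_position) pairs;
    # scores:  matching (clicked_score, unclicked_score) pairs;
    # t_minus: current (fixed) non-click bias per position.
    acc = {}
    for (pos_c, pos_u), (s_c, s_u) in zip(samples, scores):
        lam = pairwise_lambda(s_c, s_u) / t_minus[pos_u]
        acc[pos_c] = acc.get(pos_c, 0.0) + lam
    base = acc.get(1, 1.0)  # normalize so position 1 has bias 1
    return {pos: value / base for pos, value in acc.items()}
```

The non-click deviations would then be re-estimated symmetrically with the click deviations held fixed.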
  • FIG. 3 is a schematic diagram of an application scenario of the method for generating a ranking model according to this embodiment.
  • A terminal device 301 used by a user may have a model training application installed.
  • the server 302 that provides background support for the application can run a method for generating a ranking model, including:
  • the samples in the sample set may include query information 303, a first position document 304 that was clicked in the query result, and a second position document 305 that was not clicked.
  • The following training steps can be performed based on the sample set: for a sample in the sample set, the query information, the first position document, and the second position document in the sample are input into the initial model 306 to obtain the scores of the input first position document and second position document. Then, based on the obtained scores, the click deviation of the first position, and the non-click deviation of the second position, a target value 307 of the sample is determined. After that, the initial model can be updated based on the target values of each sample. Finally, it can be determined whether the initial model has been trained; if training is complete, the trained initial model can be determined as the ranking model.
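The training steps in this scenario can be sketched end to end. Here a tiny linear model updated by gradient steps stands in for the decision-tree ensemble, and all names, features, and the simple update rule are illustrative assumptions rather than the patent's exact procedure:

```python
import math

# End-to-end sketch of the training loop: score both documents of each
# sample, form a debiased target value, and update the model with it.

def score(w, features):
    # Linear stand-in for the initial model's scoring function.
    return sum(wi * fi for wi, fi in zip(w, features))

def train(samples, t_plus, t_minus, rounds=5, lr=0.5):
    # samples: (features_clicked, features_unclicked, pos_clicked, pos_unclicked)
    # t_plus / t_minus: click / non-click deviation per position.
    w = [0.0, 0.0]
    for _ in range(rounds):
        grad = [0.0, 0.0]
        for fc, fu, pc, pu in samples:
            lam = 1.0 / (1.0 + math.exp(score(w, fc) - score(w, fu)))
            target = lam / (t_plus[pc] * t_minus[pu])  # debiased target value
            for j in range(2):
                grad[j] += target * (fc[j] - fu[j])
        # Gradient step pushes clicked documents above unclicked ones.
        w = [wj + lr * gj for wj, gj in zip(w, grad)]
    return w
```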
  • the samples in the sample set can be used to train the initial model.
  • the samples in the sample set may include query information, a first-position document clicked in the query result, and a second-position document not clicked.
  • the query information, the first position document, and the second position document in the sample are input to the initial model, and the scores of the first position document and the second position document can be obtained.
  • the target value of the sample can be determined.
  • the initial model can be updated based on the target values of each sample.
  • The ranking model trained by the method provided in the foregoing embodiment of the present application considers not only the click bias but also the non-click bias. Therefore, this ranking model is applicable to the document pair method (PairWise Approach). Because the document pair method generally achieves a better ranking effect than the single document method (PointWise Approach), the ranking model trained using the method provided by the above embodiments can improve ranking accuracy.
  • a flowchart 400 of yet another embodiment of a method for generating a ranking model is shown.
  • the process 400 of the method for generating a ranking model includes the following steps:
  • Step 401 Obtain a sample set.
  • An execution body of the method (for example, the server 105 shown in FIG. 1) may obtain the sample set.
  • a sample set can include a large number of samples.
  • the samples in the sample set may include query information, a first-position document clicked in the query result, and a second-position document not clicked.
  • the first position document may be any clicked document in the query result.
  • the position of the document in the query results may be referred to as the first position.
  • the second position document may be any unclicked document in the query result.
  • the position of the document in the query result may be referred to as a second position.
  • the above-mentioned execution subject may perform the training steps of steps 402 to 405.
  • Step 402: For each sample in the sample set, input the query information, the first position document, and the second position document in the sample into the initial model to obtain the scores of the input documents respectively; then input the obtained scores, the click deviation of the first position, and the non-click deviation of the second position into a pre-established gradient calculation formula, and determine the gradient calculation result as the target value of the sample.
  • the foregoing execution subject may be executed as follows:
  • the query information, the first position document and the second position document in the sample are input to the initial model, and scores of the input documents are obtained.
  • the initial model can use a decision tree.
  • the obtained score, the click deviation of the first position and the non-click deviation of the second position are input into a gradient calculation formula established in advance, and the gradient calculation result is determined as the target value of the sample.
  • The gradient calculation formula here may take an existing gradient calculation formula (for example, the gradient calculation formula used by the LambdaMART model) as the numerator and the product of a click deviation and a non-click deviation as the denominator.
  • Specifically, the denominator of the gradient calculation formula is the product of the click deviation of the first position (where the clicked first position document in the sample is located) and the non-click deviation of the second position (where the unclicked second position document in the sample is located).
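This construction can be sketched as follows; the standard RankNet/LambdaMART cross-entropy gradient is an assumed base formula, since the patent leaves the numerator open:

```python
import math

# Sketch of the target-value computation in step 402: a pairwise
# LambdaMART-style gradient divided by the product of the click bias of the
# clicked document's position and the non-click bias of the unclicked
# document's position. The cross-entropy lambda is one common choice.

def lambda_gradient(score_clicked, score_unclicked, sigma=1.0):
    # Standard RankNet/LambdaMART pairwise gradient magnitude.
    return sigma / (1.0 + math.exp(sigma * (score_clicked - score_unclicked)))

def target_value(score_clicked, score_unclicked, t_plus, t_minus):
    # t_plus:  click bias of the first (clicked) position;
    # t_minus: non-click bias of the second (unclicked) position.
    return lambda_gradient(score_clicked, score_unclicked) / (t_plus * t_minus)
```

Dividing by the bias product inflates the contribution of pairs observed at heavily biased positions, which is what corrects the click data.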
  • Step 403 Create a decision tree, fit the target values of each sample, and update the initial model based on the created decision tree.
  • the execution body may first create a new decision tree and fit the target values of each sample.
  • the initial model can then be updated using the MART algorithm based on the decision tree created.
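One boosting round of this kind can be sketched with a hand-rolled regression stump standing in for a full decision tree; real MART implementations use deeper trees and more elaborate leaf values, so this is only illustrative:

```python
# Minimal sketch of one MART boosting round (step 403): a regression stump
# (a one-split decision tree) is fit to the per-sample target values and
# added to the ensemble with a learning rate.

def fit_stump(xs, targets):
    # Try every midpoint split on the single feature and keep the one with
    # the lowest squared error; each leaf predicts the mean target value.
    best = None
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    for k in range(1, len(xs)):
        thr = (xs[order[k - 1]] + xs[order[k]]) / 2.0
        left = [targets[i] for i in range(len(xs)) if xs[i] <= thr]
        right = [targets[i] for i in range(len(xs)) if xs[i] > thr]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm

def boost_round(ensemble, xs, targets, lr=0.1):
    # ensemble: list of fitted trees; the model score is their sum.
    tree = fit_stump(xs, targets)
    ensemble.append(lambda x, t=tree: lr * t(x))
    return ensemble

def model_score(ensemble, x):
    return sum(tree(x) for tree in ensemble)
```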
  • Step 404 Based on the updated initial model and the above sample set, re-estimate the click deviation and non-click deviation of each position to update the click deviation and non-click deviation of each position.
  • The execution body may re-estimate the click deviation and non-click deviation of each position based on the updated initial model and the sample set, so as to update the click deviation and non-click deviation of each position.
  • the execution body may first input the query information, the first location document, and the second location document in each sample in the sample set to the updated initial model. The score of each document in each sample is thus given. Then, the current non-click deviation of each location can be fixed, the obtained score can be input into the gradient calculation formula, and the gradient calculation formula used can be equal to zero, thereby estimating the click deviation of each location.
  • the estimated click deviation of each position can be fixed, the obtained score can be input into the gradient calculation formula, and the gradient calculation formula used can be equal to zero, thereby estimating the unclick deviation of each location. Thereby, the click deviation and the non-click deviation of each position are updated.
  • Step 405 Determine whether the number of decision trees that have been created is less than a preset number.
  • the execution body may record the number of decision trees created. Each time a decision tree is created, the above execution body can update the recorded quantity.
  • The execution body may determine whether the number of decision trees created is less than the preset number. If the number is not less than the preset number, it can be determined that training of the initial model is complete; otherwise, it can be determined that training is not complete.
  • Step 406 In response to determining that the number of decision trees that have been created is not less than a preset number, it is determined that the initial model training is completed, and the updated initial model is determined as a ranking model.
  • the model updated in step 403 may be determined as the ranking model.
  • the execution subject may continue to execute the training step by using the updated initial model and the click deviation and non-click deviation of the updated positions.
  • The process 400 of the method for generating a ranking model in this embodiment adds the step of updating the click deviation and the non-click deviation and, when training is not complete, the step of continuing training with the updated click deviation, non-click deviation, and updated initial model to obtain the ranking model. Therefore, the solution described in this embodiment can learn the ranking model offline from click data and estimate the click deviation and non-click deviation of each position during model learning. Compared with the previous ranking learning method (that is, first estimating the click bias and then, treating the estimated click bias as a fixed value, learning the ranking model with a single document method), the method in this embodiment not only improves ranking accuracy but also performs the correction of click data and model training at the same time, which improves training efficiency.
  • this application provides an embodiment of an apparatus for generating a ranking model.
  • This apparatus embodiment corresponds to the method embodiment shown in FIG. 2.
  • the device can be applied to various electronic devices.
  • The apparatus 500 for generating a ranking model includes: an obtaining unit 501 configured to obtain a sample set, where a sample in the sample set includes query information, a first position document that was clicked in the query result, and a second position document that was not clicked; and a first training unit 502 configured to perform the following training steps: for a sample in the sample set, input the query information, the first position document, and the second position document in the sample into the initial model to obtain the scores of the input documents; determine the target value of the sample based on the obtained scores, the click deviation of the first position, and the non-click deviation of the second position, where the click deviation and the non-click deviation are used to characterize the degree of influence of a document's position in the query result on the probability of the document being clicked and the probability of it not being clicked; update the initial model based on the target values of the samples; and, in response to determining that training of the initial model is complete, determine the updated initial model as the ranking model.
  • The first training unit 502 may be further configured to, after updating the initial model based on the target values of the samples, re-estimate the click deviation and non-click deviation of each position based on the updated initial model and the sample set, so as to update the click deviation and non-click deviation of each position.
  • the device may further include a second training unit (not shown in the figure).
  • the second training unit may be configured to continue to perform the training step in response to determining that the initial model is not trained and using the updated initial model and the click bias and non-click bias of each updated position.
  • The above-mentioned first training unit 502 may be further configured to: input the obtained score, the click deviation of the first position, and the non-click deviation of the second position into a pre-established gradient calculation formula, and determine the gradient calculation result as the target value of the sample.
  • the initial model may be a decision tree.
  • the above-mentioned first training unit 502 may be further configured to: create a decision tree and fit the target values of each sample; and update the initial model based on the created decision tree.
  • The above-mentioned first training unit 502 may be further configured to: determine the number of decision trees that have been created and compare the number with a preset number; and determine, according to the comparison result, whether the initial model has been trained.
  • The above-mentioned first training unit 502 may be further configured to: input the obtained score, the click deviation of the first position, and the non-click deviation of the second position into a pre-established loss function to obtain a loss value, and determine the loss value as the target value of the sample.
  • The first training unit 502 may be further configured to: determine the average of the target values of the samples and compare the average with a preset value; and determine, according to the comparison result, whether the initial model has been trained.
  • the apparatus can acquire the sample set to use the samples in the sample set to train the initial model.
  • the samples in the sample set may include query information, a first-position document clicked in the query result, and a second-position document not clicked.
  • the query information, the first position document, and the second position document in the sample are input to the initial model, and the scores of the first position document and the second position document can be obtained.
  • the initial model can be updated based on the target values of each sample.
  • it can be determined whether the initial model has been trained. If the initial model training is completed, the trained initial model can be determined as the ranking model. Thereby, a model for sorting can be obtained, which is helpful for enriching the generation mode of the model.
  • The ranking model trained by the method provided in the foregoing embodiment of the present application considers not only the click bias but also the non-click bias. Therefore, this ranking model is applicable to the document pair method (PairWise Approach). Because the document pair method generally achieves a better ranking effect than the single document method (PointWise Approach), the ranking model trained using the method provided by the above embodiments can improve ranking accuracy.
  • FIG. 6 illustrates a process 600 of an embodiment of a method for generating information provided in this application.
  • the method for generating information may include the following steps:
  • Step 601: In response to receiving a query request containing target query information, candidate documents matching the target query information are retrieved and summarized into a candidate document set.
  • an execution subject of the method for generating information may receive a query request including the target query information through a wired connection or a wireless connection. Then, the candidate documents matching the above target query information can be retrieved and summarized into a candidate document set.
  • the query request may be sent by a terminal device (for example, the terminal devices 101, 102, and 103 shown in FIG. 1).
  • Step 602 The candidate documents in the candidate document set are input into a ranking model to obtain a score of each candidate document.
  • the above-mentioned execution body may input the candidate documents in the above candidate document set into the ranking model to obtain the score of each candidate document.
  • the above-mentioned sorting model may be generated by using the method described in the foregoing embodiment of FIG. 2. For a specific generation process, refer to the related description of the embodiment in FIG. 2, and details are not described herein again.
  • step 603 the candidate documents in the candidate document set are sorted according to the scores in descending order, and the ranking results are returned.
  • the execution entity may sort the candidate documents in the candidate document set according to the scores obtained in step 602 in descending order, and return the ranking result.
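Steps 601 to 603 can be sketched as below; the substring match and the scoring function are placeholders for a real retrieval system and the trained ranking model:

```python
# Sketch of steps 601-603: retrieve candidate documents for a query, score
# each candidate with the ranking model, and return them sorted by score in
# descending order. retrieve() and score_fn are illustrative stand-ins.

def retrieve(query, corpus):
    # Placeholder match test: keep documents containing the query string.
    return [doc for doc in corpus if query.lower() in doc.lower()]

def rank(query, corpus, score_fn):
    candidates = retrieve(query, corpus)
    # Sort candidates by model score, largest first, and return the result.
    return sorted(candidates, key=score_fn, reverse=True)
```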
  • the method for generating information in this embodiment may be used to test the ranking models generated in the foregoing embodiments. Based on the test results, the ranking model can be continuously optimized. This method may also be a practical application method of the ranking model generated by the foregoing embodiments.
  • the ranking models generated by the above embodiments are used to score documents and then sort, which helps to improve the performance of sorting.
  • the present application provides an embodiment of an apparatus for generating information.
  • This device embodiment corresponds to the method embodiment shown in FIG. 6, and the device can be specifically applied to various electronic devices.
  • The apparatus 700 for generating information includes: a retrieval unit 701 configured to, in response to receiving a query request containing target query information, retrieve candidate documents matching the target query information and summarize them into a candidate document set.
  • the input unit 702 is configured to input the candidate documents in the candidate document set into a ranking model generated by using the method described in the embodiment of FIG. 2 to obtain a score of each candidate document.
  • The sorting unit 703 is configured to sort the candidate documents in the above candidate document set in descending order of score, and return the ranking result.
  • FIG. 8 illustrates a schematic structural diagram of a computer system 800 suitable for implementing an electronic device according to an embodiment of the present application.
  • the electronic device shown in FIG. 8 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
  • The computer system 800 includes a central processing unit (CPU) 801, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage portion 808 into a random access memory (RAM) 803.
  • The RAM 803 also stores various programs and data required for the operation of the system 800.
  • the CPU 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
  • An input / output (I / O) interface 805 is also connected to the bus 804.
  • The following components are connected to the I/O interface 805: an input portion 806 including a touch screen, a touch pad, and the like; an output portion 807 including a liquid crystal display (LCD), a speaker, and the like; a storage portion 808 including a hard disk and the like; and a communication portion 809 including a network interface card such as a LAN card or a modem.
  • The communication portion 809 performs communication processing via a network such as the Internet.
  • A drive 810 is also connected to the I/O interface 805 as needed.
  • A removable medium 811, such as a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read therefrom is installed into the storage portion 808 as necessary.
  • the process described above with reference to the flowchart may be implemented as a computer software program.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable medium, the computer program containing program code for performing a method shown in a flowchart.
  • the computer program may be downloaded and installed from a network through the communication section 809, and / or installed from a removable medium 811.
  • the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium or any combination of the foregoing.
  • The computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable medium may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Each block in the flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the blocks may also occur in a different order than those marked in the drawings. For example, two successively represented boxes may actually be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved.
  • Each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present application may be implemented by software or hardware.
  • the described unit may also be provided in a processor, for example, it may be described as: a processor includes an acquisition unit and a first training unit. Among them, the names of these units do not constitute a limitation on the unit itself in some cases.
  • the obtaining unit may also be described as a “unit obtaining a sample set”.
  • the present application also provides a computer-readable medium, which may be included in the device described in the foregoing embodiments; or may exist alone without being assembled into the device.
  • the computer-readable medium carries one or more programs.
  • When the one or more programs are executed by the device, the device is caused to: obtain a sample set; and perform the following training steps: for a sample in the sample set, input the query information, the first position document, and the second position document in the sample into the initial model to obtain the scores of the input documents respectively; determine the target value of the sample based on the obtained scores, the click deviation of the first position, and the non-click deviation of the second position, where the click deviation and the non-click deviation are used to characterize the degree of influence of a document's position in the query result on the probability of the document being clicked and the probability of it not being clicked; update the initial model based on the target values of the samples; determine whether training of the initial model is complete; and, in response to determining that training of the initial model is complete, determine the updated initial model as the ranking model.
  • When the one or more programs are executed by the device, the device may further be caused to: after updating the initial model based on the target values of the samples, re-estimate the click deviation and non-click deviation of each position based on the updated initial model and the sample set, so as to update the click deviation and non-click deviation of each position.
  • The device may also be caused to: in response to determining that training of the initial model is not complete, continue to perform the above training steps using the updated initial model and the updated click deviation and non-click deviation of each position.
  • The above-mentioned determining the target value of the sample based on the obtained score, the click deviation of the first position, and the non-click deviation of the second position may include: inputting the obtained score, the click deviation of the first position, and the non-click deviation of the second position into a pre-established gradient calculation formula, and determining the gradient calculation result as the target value of the sample.
  • the initial model may be a decision tree; and updating the initial model based on the target values of each sample may include: creating a decision tree and fitting the target values of each sample; based on the created decision tree, Update the initial model.
  • the above-mentioned determining whether the initial model has been trained may include: determining the number of decision trees that have been created, comparing the above number with a preset number; and determining whether the initial model has been trained according to the comparison result.
  • The above-mentioned determining the target value of the sample based on the obtained score, the click deviation of the first position, and the non-click deviation of the second position may include: inputting the obtained score, the click deviation of the first position, and the non-click deviation of the second position into a pre-established loss function to obtain a loss value, and determining the loss value as the target value of the sample.
  • the above-mentioned determining whether the initial model is completely trained may include: determining an average value of target values of each sample, and comparing the average value with a preset value; and determining whether the initial model is completely trained according to the comparison result.
  • The one or more programs may also cause the device to: in response to receiving a query request containing target query information, retrieve candidate documents matching the target query information and summarize them into a candidate document set; input the candidate documents in the candidate document set into the ranking model generated by the method described in the above embodiments to obtain the score of each candidate document; and sort the candidate documents in the candidate document set in descending order of score, and return the ranking result.

Abstract

A method and apparatus for generating a ranking model. The method specifically includes: obtaining a sample set (201); for a sample in the sample set, inputting the query information, the first position document, and the second position document in the sample into an initial model to obtain the scores of the input documents respectively, and determining a target value of the sample based on the obtained scores, the click deviation of the first position, and the non-click deviation of the second position (202); updating the initial model based on the target values of the samples (203); determining whether training of the initial model is complete (204); and determining the updated initial model as the ranking model (205).

Description

Method and apparatus for generating a ranking model. Technical field:
Embodiments of the present application relate to the field of computer technology, and in particular to a method and apparatus for generating a ranking model.
Background:
Learning to rank (L2R or LTR) is a ranking method based on supervised learning. Its task is to rank a set of documents: by designing algorithms over manually annotated data, it aims to mine the regularities hidden in the data, so as to produce, for any query, a document ranking that reflects relevance.
For search ranking, a ranking model is usually trained on click data, and the search results are then ranked by the ranking model. The existing way of training a ranking model is usually to first estimate the position bias of clicks based on the click data, and then, based on the click data and that position bias, train the ranking model using the single document method (PointWise Approach).
发明内容
本申请实施例提出了用于生成排序模型的方法和装置。
第一方面,本申请实施例提供了一种用于生成排序模型的方法,该方法包括:获取样本集,其中,样本集中的样本包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档;执行如下训练步骤:对于样本集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,其中,点击偏差、未点击偏差分别用于表征文档在查询结果中的位置对文档被点击的概率、未被点击的概率的影响程度;基于各样本的目标值,对初始模型进行更新;确定初始模型是否训练完成;响应于确定初始模型训练完成,将更新后的初始 模型确定为排序模型。
在一些实施例中,在基于各样本的目标值,对初始模型进行更新之后,训练步骤还包括:基于更新后的初始模型和样本集,重新估计各个位置的点击偏差和未点击偏差,以对各个位置的点击偏差和未点击偏差进行更新。
在一些实施例中,该方法还包括:响应于确定初始模型未训练完成,使用更新后的初始模型以及更新后的各个位置的点击偏差和未点击偏差,继续执行训练步骤。
在一些实施例中,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,包括:将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的梯度计算公式,将梯度计算结果确定为该样本的目标值。
在一些实施例中,初始模型为决策树;以及基于各样本的目标值,对初始模型进行更新,包括:创建决策树,对各样本的目标值进行拟合;基于所创建的决策树,更新初始模型。
在一些实施例中,确定初始模型是否训练完成,包括:确定已创建的决策树的数量,将数量与预设数量进行比较;根据比较结果,确定初始模型是否训练完成。
在一些实施例中,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,包括:将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的损失函数,得到损失值,将损失值确定为该样本的目标值。
在一些实施例中,确定初始模型是否训练完成,包括:确定各样本的目标值的平均值,将平均值与预设值进行比较;根据比较结果,确定初始模型是否训练完成。
第二方面,本申请实施例提供了一种用于生成排序模型的装置,该装置包括:获取单元,被配置成获取样本集,其中,样本集中的样本包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档;第一训练单元,被配置成执行如下训练步骤:对于样本集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档 输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,其中,点击偏差、未点击偏差分别用于表征文档在查询结果中的位置对文档被点击的概率、未被点击的概率的影响程度;基于各样本的目标值,对初始模型进行更新;确定初始模型是否训练完成;响应于确定初始模型训练完成,将更新后的初始模型确定为排序模型。
在一些实施例中,第一训练单元进一步被配置成:在基于各样本的目标值,对初始模型进行更新之后,基于更新后的初始模型和样本集,重新估计各个位置的点击偏差和未点击偏差,以对各个位置的点击偏差和未点击偏差进行更新。
在一些实施例中,该装置还包括:第二训练单元,被配置成响应于确定初始模型未训练完成,使用更新后的初始模型以及更新后的各个位置的点击偏差和未点击偏差,继续执行训练步骤。
在一些实施例中,第一训练单元,进一步被配置成:将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的梯度计算公式,将梯度计算结果确定为该样本的目标值。
在一些实施例中,初始模型为决策树;以及第一训练单元,进一步被配置成:创建决策树,对各样本的目标值进行拟合;基于所创建的决策树,更新初始模型。
在一些实施例中,第一训练单元,进一步被配置成:确定已创建的决策树的数量,将数量与预设数量进行比较;根据比较结果,确定初始模型是否训练完成。
在一些实施例中,第一训练单元,进一步被配置成:将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的损失函数,得到损失值,将损失值确定为该样本的目标值。
在一些实施例中,第一训练单元,进一步被配置成:确定各样本的目标值的平均值,将平均值与预设值进行比较;根据比较结果,确定初始模型是否训练完成。
第三方面,本申请实施例提供了一种用于生成信息的方法,包括:响应于接收到包含目标查询信息的查询请求,检索与目标查询信息相 匹配的候选文档,汇总为候选文档集合;将候选文档集合中的候选文档输入采用如上述第一方面中任一实施例所描述的方法生成的排序模型中,得到各候选文档的得分;按照得分由大到小的顺序,对候选文档集合中的候选文档进行排序,返回排序结果。
第四方面,本申请实施例提供了一种用于生成信息的装置,包括:检索单元,被配置成响应于接收到包含目标查询信息的查询请求,检索与目标查询信息相匹配的候选文档,汇总为候选文档集合;输入单元,被配置成将候选文档集合中的候选文档输入采用如上述第一方面中任一实施例所描述的方法生成的排序模型中,得到各候选文档的得分;排序单元,被配置成按照得分由大到小的顺序,对候选文档集合中的候选文档进行排序,返回排序结果。
第五方面,本申请实施例提供了一种电子设备,包括:一个或多个处理器;存储装置,其上存储有一个或多个程序,当一个或多个程序被一个或多个处理器执行时:获取样本集,其中,样本集中的样本包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档;执行如下训练步骤:对于样本集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,其中,点击偏差、未点击偏差分别用于表征文档在查询结果中的位置对文档被点击的概率、未被点击的概率的影响程度;基于各样本的目标值,对初始模型进行更新;确定初始模型是否训练完成;响应于确定初始模型训练完成,将更新后的初始模型确定为排序模型。
第六方面,本申请实施例提供了一种计算机可读介质,其上存储有计算机程序,该程序被处理器执行时,使得所述处理器:获取样本集,其中,样本集中的样本包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档;执行如下训练步骤:对于样本集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目 标值,其中,点击偏差、未点击偏差分别用于表征文档在查询结果中的位置对文档被点击的概率、未被点击的概率的影响程度;基于各样本的目标值,对初始模型进行更新;确定初始模型是否训练完成;响应于确定初始模型训练完成,将更新后的初始模型确定为排序模型。
本申请实施例提供的用于生成排序模型的方法和装置,通过获取样本集,可以利用样本集中的样本进行初始模型的训练。其中,样本集中的样本可以包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档。这样,将样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,便可以得到第一位置文档和第二位置文档的得分。基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,即可确定样本的目标值。之后,可以基于各样本的目标值,对初始模型进行更新。最后,可以确定初始模型是否训练完成,若初始模型训练完成,就可以将训练后的初始模型确定为排序模型。从而能够得到一种用于排序的模型,有助于丰富模型的生成方式。同时,该模型不仅考虑了点击偏差,还考虑了未点击偏差,由此,该排序模型提高了排序的准确性。
附图说明
通过阅读参照以下附图所作的对非限制性实施例所作的详细描述,本申请的其它特征、目的和优点将会变得更明显:
图1是本申请的一个实施例可以应用于其中的示例性***架构图;
图2是根据本申请的用于生成排序模型的方法的一个实施例的流程图;
图3是根据本申请的用于生成排序模型的方法的一个应用场景的示意图;
图4是根据本申请的用于生成排序模型的方法的又一个实施例的流程图;
图5是根据本申请的用于生成排序模型的装置的一个实施例的结构示意图;
图6是根据本申请用于生成信息的方法的一个实施例的流程图;
图7是根据本申请用于生成信息的装置的一个实施例的结构示意图;
图8是适于用来实现本申请实施例的电子设备的计算机***的结构示意图。
具体实施方式
下面结合附图和实施例对本申请作进一步的详细说明。可以理解的是,此处所描述的具体实施例仅仅用于解释相关发明,而非对该发明的限定。另外还需要说明的是,为了便于描述,附图中仅示出了与有关发明相关的部分。
需要说明的是,在不冲突的情况下,本申请中的实施例及实施例中的特征可以相互组合。下面将参考附图并结合实施例来详细说明本申请。
图1示出了可以应用本申请的用于生成排序模型的方法或用于生成排序模型的装置的示例性***架构100。
如图1所示,***架构100可以包括终端设备101、102、103,网络104和服务器105。网络104用以在终端设备101、102、103和服务器105之间提供通信链路的介质。网络104可以包括各种连接类型,例如有线、无线通信链路或者光纤电缆等等。
用户可以使用终端设备101、102、103通过网络104与服务器105交互,以接收或发送消息等。终端设备101、102、103上可以安装有各种通讯客户端应用,例如资讯浏览类应用、搜索类应用、即时通信工具、邮箱客户端、社交平台软件等。
终端设备101、102、103可以是硬件,也可以是软件。当终端设备101、102、103为硬件时,可以是具有显示屏的各种电子设备,包括但不限于智能手机、平板电脑、膝上型便携计算机和台式计算机等等。当终端设备101、102、103为软件时,可以安装在上述所列举的电子设备中。其可以实现成多个软件或软件模块(例如用来提供分布式服务),也可以实现成单个软件或软件模块。在此不做具体限定。
服务器105可以是提供各种服务的服务器,例如,可以是对搜索引擎提供支持的处理服务器。处理服务器可以存储有样本集或者从其他设备中获取样本集。样本集中可以包含多个样本。其中,样本可以包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档。此外,处理服务器可以利用样本集中的样本,对初始模型进行训练,并可以将训练结果(如生成的排序模型)进行存储。这样,在用户利用终端设备101、102、103发送查询请求后,服务器105可以确定对查询结果进行排序,进而,将排序后的查询结果返回给终端设备101、102、103。
需要说明的是,服务器105可以是硬件,也可以是软件。当服务器为硬件时,可以实现成多个服务器组成的分布式服务器集群,也可以实现成单个服务器。当服务器为软件时,可以实现成多个软件或软件模块(例如用来提供分布式服务),也可以实现成单个软件或软件模块。在此不做具体限定。
需要说明的是,本申请实施例所提供的用于生成排序模型的方法一般由服务器105执行,相应地,用于生成排序模型的装置一般设置于服务器105中。
应该理解,图1中的终端设备、网络和服务器的数目仅仅是示意性的。根据实现需要,可以具有任意数目的终端设备、网络和服务器。
继续参考图2,示出了根据本申请的用于生成排序模型的方法的一个实施例的流程200。该用于生成排序模型的方法,包括以下步骤:
步骤201,获取样本集。
在本实施例中,该用于生成排序模型的方法的执行主体(例如图1所示的服务器105)可以通过多种方式来获取样本集。例如,执行主体可以通过有线连接方式或无线连接方式,从用于存储样本的另一服务器(例如数据库服务器)中获取存储于其中的现有的样本集。再例如,用户可以通过终端设备(例如图1所示的终端设备101、102、103)来收集样本,并将这些样本存储在本地,从而生成样本集。需要指出的是,上述无线连接方式可以包括但不限于3G/4G连接、WiFi连接、蓝牙连接、WiMAX连接、Zigbee连接、UWB(ultra wideband)连接、 以及其他现在已知或将来开发的无线连接方式。
此处,样本集中的样本可以预先从用户的历史行为信息(例如,可以包括点击数据、查询请求等)中获取。样本集中可以包括大量的样本。其中,上述样本集中的样本可以包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档。这里,查询信息可以是用户所发送的查询请求中的查询字符串的特征表示(例如可以使用特征向量进行表示)。第一位置文档可以是查询结果中的任一被点击的文档。可以将该文档在查询结果中所在的位置称为第一位置。第二位置文档可以是查询结果中任一未被点击的文档。可以将该文档在查询结果中所在的位置称为第二位置。需要说明的是,这里的文档可以用文档的特征向量等形式进行表示。
作为示例,用户在搜索引擎中输入字符串“机器学习”,则字符串“机器学习”即为查询字符串。在搜索引擎返回的查询结果中,用户点击了排在查询结果的第五位的文档,没有点击排在查询结果的第六位的文档。则可以将上述用户点击的排在第五位的文档称为第一位置文档,将第五位作为第一位置。同时,可以将上述用户未点击的排在第六位的文档称为第二位置文档,将第六位作为第二位置。
可以理解的是,在用户发送某个查询请求后,对于所返回的查询结果,用户通常只点击查询结果中的少量文档。因而,对于某一个查询请求,可以构成多个样本。作为示例,返回的查询结果包含10个文档。用户点击了其中2个。则可以组成16个样本。
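As an illustrative sketch of how such pairwise samples could be assembled from one query's click log (all names here are assumptions; in practice each document would be represented by a feature vector rather than a bare position):

```python
# Hypothetical sketch: build one training sample per
# (clicked document, unclicked document) pair from a single query's
# click log. Positions stand in for the documents themselves.
def build_pairs(clicked, unclicked):
    """Return one sample per (clicked, unclicked) pair."""
    return [(c, u) for c in clicked for u in unclicked]

# A result list of 10 documents in which ranks 5 and 7 were clicked
clicked_positions = [5, 7]
unclicked_positions = [p for p in range(1, 11) if p not in clicked_positions]
pairs = build_pairs(clicked_positions, unclicked_positions)
# 2 clicked x 8 unclicked documents -> 16 samples, as in the example above
```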
可以理解的是,为区分样本中的文档是否被点击,可以预先对样本中的文档进行标注。对于每一个样本,该样本中的第一位置文档可以对应有用于指示该文档被点击的标注,该样本中的第二位置文档可以对应有用于指示该文档未被点击的标注。
在本实施例中,在获取样本集之后,上述执行主体可以执行步骤202至步骤204的训练步骤。
步骤202,对于样本集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得分、第一位置的点击偏差和第二位置的未点 击偏差,确定该样本的目标值。
在本实施例中,对于样本集中的样本,上述执行主体可以按照如下步骤执行:
第一步,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分。其中,初始模型可以通过对查询信息、第一位置文档和第二位置文档进行特征提取、分析等,输出第一位置文档的得分和第二位置文档的得分。所输出的得分可以用于表征初始模型所计算出的文档与查询信息的相关程度。文档的得分越高,该文档与查询信息的相关程度越大。
此处,初始模型可以是基于机器学习技术而创建的各种适用于文档对方法(PairWise Approach)的现有的模型结构(例如Ranknet、lambdaRank、SVM Rank、lambdaMart、决策树等模型结构)。初始模型可以对文档和查询信息进行特征提取,而后对所提取的特征进行分析等处理,最终输出文档的分数。实践中,文档对方法是排序学习算法中的一种方法。
第二步,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值。
此处,可以将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入到预先建立的目标值计算式中,得到该样本的目标值。此处,上述目标值计算式可以是预先建立的与文档得分、点击偏差和未点击偏差相关的函数或者公式。例如,可以是预先建立的梯度计算公式、预先建立的损失函数,预先建立的损失函数的偏导式等。上述目标值计算公式所输出的值即为目标值。需要说明的是,上述目标值计算式还可以是预先建立的其他形式的函数或公式,不限于上述列举。
此处,点击偏差可以用于表征文档在查询结果中的位置对文档被点击的概率的影响程度。未点击偏差可以用于表征文档在查询结果中的位置对文档未被点击的概率的影响程度。这里,各个位置的点击偏差和未点击偏差均可以用数值来表示。并且,各个位置的点击偏差和未点击偏差的初始值可以被预先设置(例如初始值均被设置为1)。
可以理解的是,理论上,当文档与查询信息的相关性越高时,该文档被点击的概率越大,未被点击的概率越小。然而,由于文档在查询结果中的位置不同,会对文档被点击的概率、未被点击的概率产生影响。例如,在两个文档与查询信息的相关性相同时,用户通常首先浏览排序靠前的位置的文档。通常认为排序靠前的文档与检索信息的相关性较大。由此导致排序越靠前的文档被用户点击的概率越大,未被用户点击的概率越小。因而,在排序学习过程中,仅利用点击数据所训练的模型不能准确地反映出文档与查询信息的相关性,还需要考虑到文档在排序结果中的位置对于该文档被点击的概率、未被点击的概率产生的影响。实践中,点击偏差也可以称为位置偏向性(position bias)。
在本实施例的一些可选的实现方式中,上述执行主体可以将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的损失函数,得到损失值,并将上述损失值(即目标函数的值)确定为该样本的目标值。实践中,损失函数可以用来估量初始模型的预测值与真实值的不一致程度。它是一个非负实值函数。一般情况下,损失值越小,模型的鲁棒性就越好。这里的损失函数可以是基于现有的损失函数(例如交叉熵损失函数),并结合点击偏差、未点击偏差而预先建立的。作为示例,可以将点击偏差与未点击偏差的乘积作为分母,将交叉熵损失函数作为分子,建立此处的损失函数。此时,对于某个样本,所使用的损失函数的分母即为该样本中的第一位置文档所在的第一位置的点击偏差与该样本中的第二位置文档所在的第二位置的未点击偏差的乘积。
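A minimal sketch of the loss construction just described, assuming a RankNet-style pairwise cross-entropy as the numerator and the product of the two position biases as the denominator; the function name and concrete bias values are illustrative assumptions, not the patent's formula:

```python
import math

def debiased_pairwise_loss(score_clicked, score_unclicked,
                           click_bias, nonclick_bias):
    # Pairwise cross-entropy for "clicked document should outrank
    # unclicked document" (RankNet-style, sigma = 1)
    cross_entropy = math.log(1.0 + math.exp(-(score_clicked - score_unclicked)))
    # Divide by the product of the click bias at the clicked document's
    # position and the non-click bias at the unclicked document's position
    return cross_entropy / (click_bias * nonclick_bias)

plain = debiased_pairwise_loss(2.0, 1.0, 1.0, 1.0)     # biases of 1: no correction
debiased = debiased_pairwise_loss(2.0, 1.0, 2.0, 1.0)  # strong click bias halves the loss
```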
在本实施例的一些可选的实现方式中,上述执行主体可以将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的梯度计算公式,将梯度计算结果确定为该样本的目标值。这里的梯度计算公式可以是基于现有的梯度计算公式(例如lambdaRank模型、lambdaMART模型所使用的梯度计算公式),并结合点击偏差、未点击偏差而预先建立的。作为示例,可以将点击偏差与未点击偏差的乘积作为分母,将如lambdaRank、lambdaMART等模型所使用的现有的梯度计算公式作为分子,建立此处的梯度计算公式。此时,对于 某个样本,所使用的梯度计算公式的分母即为该样本中的第一位置文档所在的第一位置的点击偏差与该样本中的第二位置文档所在的第二位置的未点击偏差的乘积。
步骤203,基于各样本的目标值,对初始模型进行更新。
在本实施例中,上述执行主体可以基于各样本的目标值,对初始模型进行更新。此处,针对不同的初始模型和不同的目标值(例如损失值或者梯度等),可以使用不同的方式进行初始模型的更新。
在本实施例的一些可选的实现方式中,当样本的目标值是损失值时,上述执行主体可以首先确定各样本的损失值的平均值。而后,可以利用反向传播算法求得损失值的平均值相对于初始模型参数的梯度,而后利用梯度下降算法基于梯度更新初始模型参数。需要说明的是,上述反向传播算法、梯度下降算法以及机器学习方法是目前广泛研究和应用的公知技术,在此不再赘述。实践中,初始模型可以采用Ranknet、SVM Rank等模型结构。
在本实施例的一些可选的实现方式中,当样本的目标值是梯度时,上述执行主体可以直接利用梯度下降算法基于梯度更新初始模型参数。实践中,初始模型可以采用lambdaRank等模型结构。
在本实施例的一些可选的实现方式中,初始模型可以是决策树(Decision Tree),各样本的目标值可以是梯度。在得到各样本的目标值后,上述执行主体可以首先创建一个新的决策树,对各样本的目标值进行拟合。而后,可以基于所创建的决策树,更新初始模型。此处,可以使用MART(Multiple Additive Regression Tree,多重增量回归树)算法进行初始模型的更新。此处,MART也可以称为GBDT(Gradient Boosting Decision Tree,梯度渐进决策树)、GBRT(Gradient Boosting Regression Tree,梯度渐进回归树)、TreeNet(决策树网络)等。需要说明的是,MART算法是目前广泛研究和应用的公知技术,在此不再赘述。
步骤204,确定初始模型是否训练完成。
在本实施例中,上述执行主体可以利用各种方式确定初始模型是否训练完成。作为示例,可以确定训练步骤的执行次数。响应于确定 执行次数达到预设次数,可以确定训练完成。响应于确定执行次数未达到预设次数,可以确定未训练完成。
在本实施例的一些可选的实现方式中,初始模型可以是决策树。上述执行主体可以记录所创建的决策树的数量。在每创建一个决策树时,上述执行主体可以对所记录的数量进行更新。在步骤203对初始模型更新后,上述执行主体可以确定所创建的决策树的数量。基于该数量与预设数量的比较结果,确定初始模型是否训练完成。例如,响应于确定所创建的决策树的数量不小于预设数量,可以确定初始模型训练完成。响应于确定所创建的决策树的数量小于预设数量,可以确定初始模型未训练完成。
在本实施例的一些可选的实现方式中,当目标值为损失值时,上述执行主体可以首先确定各样本的目标值的平均值。而后,可以将上述平均值与预设值进行比较,基于比较结果,确定初始模型是否训练完成。例如,响应于确定上述目标损失值小于或等于上述预设值,可以确定初始模型训练完成。响应于确定上述目标损失值大于上述预设值,可以确定初始模型未训练完成。需要说明的是,预设值一般可以用于表示预测值与真实值之间的不一致程度的理想情况。也就是说,当目标损失值小于或等于预设值时,可以认为预测值接近或近似真实值。实践中,预设值可以根据实际需求来设置。
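The completion check in this paragraph can be sketched as follows; the threshold value is an arbitrary assumption:

```python
def training_done(sample_losses, preset_value=0.05):
    # Training is considered complete once the mean of the per-sample
    # loss (target) values drops to or below the preset value.
    mean_loss = sum(sample_losses) / len(sample_losses)
    return mean_loss <= preset_value
```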
在本实施例的一些可选的实现方式中,当目标值为损失值时,上述执行主体可以分别将各样本的损失值与预设值进行比较。上述执行主体可以确定损失值小于或等于预设值的样本占样本集中的样本的比例,且在该比例达到预设样本比例(如95%)时,可以确定初始模型训练完成。
需要说明的是,上述执行主体还可以通过其他方式确定初始模型是否训练完成,不限于上述各种实现方式。
步骤205,响应于确定初始模型训练完成,将更新后的初始模型确定为排序模型。
在本实施例中,响应于确定初始模型训练完成,上述执行主体可以将步骤203更新后的初始模型确定为排序模型。
在本实施例的一些可选的实现方式中,在步骤203对初始模型进行更新之后,上述执行主体还可以基于更新后的初始模型和上述样本集,重新估计各个位置的点击偏差和未点击偏差,以对各个位置的点击偏差和未点击偏差进行更新。具体执行如下:
当步骤202中所确定的各样本的目标值为梯度时,上述执行主体可以首先将样本集中的各个样本中的查询信息、第一位置文档、第二位置文档输入至更新后的初始模型,从而给出各个样本中的各文档的得分。而后,可以固定各个位置当前的未点击偏差,将所得到的得分输入至梯度计算公式,并使所使用的梯度计算公式等于零,从而估计出各个位置的点击偏差。之后,可以固定所估计出的各个位置的点击偏差,将所得到的得分输入至梯度计算公式,并使所使用的梯度计算公式等于零,从而估计出各个位置的未点击偏差。由此,实现对各个位置的点击偏差和未点击偏差的更新。
当步骤202中所确定的各样本的目标值为损失值时,上述执行主体可以首先将样本集中的各个样本中的查询信息、第一位置文档、第二位置文档输入至更新后的初始模型,从而给出各个样本中的各文档的得分。而后,可以对损失函数求偏导,得到损失函数的梯度计算公式。之后,可以固定各个位置当前的未点击偏差,将所得到的得分输入至所得到的梯度计算公式,并使所使用的梯度计算公式等于零,从而估计出各个位置的点击偏差。此处,对各个位置的点击偏差进行估计,可以是按照位置顺序依次进行估计。即,首先估计第一个位置的点击偏差;而后估计第二个位置的点击偏差;以此类推。对每一个位置的点击偏差进行估计时,可以使用包含处于查询结果的该位置且被点击的文档的样本。在估计出各个位置的点击偏差之后,可以固定所估计出的各个位置的点击偏差,将所得到的得分输入至所得到的梯度计算公式,并使所使用的梯度计算公式等于零,从而估计出各个位置的未点击偏差。此处,对各个位置的未点击偏差进行估计,可以是按照位置顺序依次进行估计。即,首先估计第一个位置的未点击偏差;而后估计第二个位置的未点击偏差;以此类推。对每一个位置的未点击偏差进行估计时,可以使用包含处于查询结果的该位置且未被点击的文档的样本。由此,可以实现对各个位置的点击偏差和未点击偏差的更新。
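Schematically, the alternation described above (fix one set of biases, solve the zeroed gradient formula for the other, then swap roles) might look like the following. The ratio-of-statistics closed form and all names are illustrative assumptions, not the patent's actual derivation; the per-position "stats" stand in for the quantities obtained by scoring the samples with the updated model:

```python
def normalize_to_first(stats):
    # Biases are relative quantities; anchor position 1 at 1.0.
    return {pos: value / stats[1] for pos, value in stats.items()}

def reestimate_biases(click_stats, nonclick_stats):
    """One alternation round over per-position accumulated statistics."""
    # Step 1: hold the non-click biases fixed (initially all 1.0) and
    # solve for the click bias at each position.
    click_bias = normalize_to_first(click_stats)
    # Step 2: hold the new click biases fixed and solve for the
    # non-click bias at each position.
    nonclick_bias = normalize_to_first(
        {p: nonclick_stats[p] / click_bias[p] for p in nonclick_stats})
    return click_bias, nonclick_bias

click_stats = {1: 1.0, 2: 0.6, 3: 0.4}     # e.g. gradient mass of clicked docs
nonclick_stats = {1: 1.0, 2: 1.1, 3: 1.2}  # e.g. gradient mass of unclicked docs
click_bias, nonclick_bias = reestimate_biases(click_stats, nonclick_stats)
```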
在本实施例的一些可选的实现方式中,响应于确定初始模型未训练完成,上述执行主体可以使用更新后的初始模型以及更新后的各个位置的点击偏差和未点击偏差,继续执行上述训练步骤。
继续参见图3,图3是根据本实施例的用于生成排序模型的方法的应用场景的一个示意图。在图3的应用场景中,用户(例如技术人员)所使用的终端设备301上可以安装有模型训练类应用。当用户打开该应用,并上传样本集或样本集的存储路径后,对该应用提供后台支持的服务器302可以运行用于生成排序模型的方法,包括:
首先,可以获取样本集。其中,样本集中的样本可以包括查询信息303、查询结果中被点击的第一位置文档304和未被点击的第二位置文档305。之后,可以基于样本集执行如下训练步骤:对于训练集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型306,得到所输入的第一位置文档和第二位置文档的得分。而后,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定样本的目标值307。之后,可以基于各样本的目标值,对初始模型进行更新。最后,可以确定初始模型是否训练完成,若初始模型训练完成,就可以将训练后的初始模型确定为排序模型。
本申请的上述实施例提供的方法,通过获取样本集,可以利用样本集中的样本进行初始模型的训练。其中,样本集中的样本可以包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档。这样,将样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,便可以得到第一位置文档和第二位置文档的得分。基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,即可确定样本的目标值。之后,可以基于各样本的目标值,对初始模型进行更新。最后,可以确定初始模型是否训练完成,若初始模型训练完成,就可以将训练后的初始模型确定为排序模型。从而能够得到一种用于排序的模型,有助于丰富模型的生成方式。
此外,在以往的排序学习中,通常仅考虑点击偏差,未对未点击 偏差进行考虑,导致不能直接适用于文档对方法(PairWise Approach)进行排序学习。本申请的上述实施例提供的方法所训练得到的排序模型,不仅考虑了点击偏差,还考虑了未点击偏差。由此,该排序模型适用于文档对方法进行排序学习。由于文档对方法相较于单文档方法(PointWise Approach),具有更好的排序效果,因而,使用本申请的上述实施例提供的方法所训练得到的排序模型,可以使排序的准确率提高。
进一步参考图4,其示出了用于生成排序模型的方法的又一个实施例的流程400。该用于生成排序模型的方法的流程400,包括以下步骤:
步骤401,获取样本集。
在本实施例中,该用于生成排序模型的方法的执行主体(例如图1所示的服务器105)可以获取样本集。样本集中可以包括大量的样本。其中,上述样本集中的样本可以包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档。第一位置文档可以是查询结果中的任一被点击的文档。可以将该文档在查询结果中所在的位置称为第一位置。第二位置文档可以是查询结果中任一未被点击的文档。可以将该文档在查询结果中所在的位置称为第二位置。
在获取样本集之后,上述执行主体可以执行步骤402至步骤405的训练步骤。
步骤402,对于样本集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的梯度计算公式,将梯度计算结果确定为该样本的目标值。
在本实施例中,对于样本集中的样本,上述执行主体可以按照如下步骤执行:
第一步,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分。其中,初始模型可以使用决策树。
第二步,将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的梯度计算公式,将梯度计算结果确定为该样本的目标值。这里的梯度计算公式可以是以现有的梯度计算公式(例如lambdaMART模型所使用的梯度计算公式)作为分子,以点击偏差与未点击偏差的乘积作为分母所建立的计算公式。对于某个样本,所使用的梯度计算公式的分母即为该样本中的第一位置文档所在的第一位置的点击偏差与该样本中的第二位置文档所在的第二位置的未点击偏差的乘积。
步骤403,创建决策树,对各样本的目标值进行拟合,基于所创建的决策树,更新初始模型。
在本实施例中,上述执行主体可以首先创建一个新的决策树,对各样本的目标值进行拟合。而后,可以基于所创建的决策树,使用MART算法更新初始模型。
步骤404,基于更新后的初始模型和上述样本集,重新估计各个位置的点击偏差和未点击偏差,以对各个位置的点击偏差和未点击偏差进行更新。
在本实施例中,对初始模型进行更新之后,上述执行主体可以基于更新后的初始模型和上述样本集,重新估计各个位置的点击偏差和未点击偏差,以对各个位置的点击偏差和未点击偏差进行更新。具体地,上述执行主体可以首先将样本集中的各个样本中的查询信息、第一位置文档、第二位置文档输入至更新后的初始模型。从而给出各个样本中的各文档的得分。而后,可以固定各个位置当前的未点击偏差,将所得到的得分输入至梯度计算公式,并使所使用的梯度计算公式等于零,从而估计出各个位置的点击偏差。之后,可以固定所估计出的各个位置的点击偏差,将所得到的得分输入至梯度计算公式,并使所使用的梯度计算公式等于零,从而估计出各个位置的未点击偏差。由此,实现对各个位置的点击偏差和未点击偏差的更新。
步骤405,确定已创建的决策树的数量是否小于预设数量。
在本实施例中,上述执行主体可以记录所创建的决策树的数量。在每创建一个决策树时,上述执行主体可以对所记录的数量进行更新。 此处,上述执行主体可以确定所创建的决策树的数量是否小于预设数量。若不小于,可以确定初始模型训练完成;反之,可以确定初始模型未训练完成。
步骤406,响应于确定已创建的决策树的数量不小于预设数量,确定初始模型训练完成,将更新后的初始模型确定为排序模型。
在本实施例中,响应于确定所创建的决策树的数量不小于预设数量,可以确定初始模型训练完成。此时,可以将步骤403更新后的模型确定为排序模型。
在本实施例中,响应于确定初始模型未训练完成,上述执行主体可以使用更新后的初始模型以及更新后的各个位置的点击偏差和未点击偏差,继续执行上述训练步骤。
从图4中可以看出,与图2对应的实施例相比,本实施例中的用于生成排序模型的方法的流程400涉及了对点击偏差、未点击偏差进行更新的步骤,以及,在训练未完成时,使用更新后的点击偏差、未点击偏差以及更新后的初始模型继续训练,得到排序模型的步骤。由此,本实施例描述的方案可以在线下从点击数据中学习出排序模型,并在模型学习过程中估计位置的点击偏差和未点击偏差。相对于以往的排序学习的方式(即首先根据对点击偏差进行估计,而后将所估计的点击偏差作为固定值,利用单文档方式学习排序模型),本实施例中的用于生成排序模型的方法,在提高了排序的准确性的基础上,可以使点击数据的纠偏和模型训练同时进行,提高了训练效率。
进一步参考图5,作为对上述各图所示方法的实现,本申请提供了一种用于生成排序模型的装置的一个实施例,该装置实施例与图2所示的方法实施例相对应,该装置具体可以应用于各种电子设备中。
如图5所示,本实施例所述的用于生成排序模型的装置500包括:获取单元501,被配置成获取样本集,其中,上述样本集中的样本包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档;第一训练单元502,被配置成执行如下训练步骤:对于样本集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得 分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,其中,点击偏差、未点击偏差分别用于表征文档在查询结果中的位置对文档被点击的概率、未被点击的概率的影响程度;基于各样本的目标值,对初始模型进行更新;响应于确定初始模型训练完成,将更新后的初始模型确定为排序模型。
在本实施例的一些可选的实现方式中,上述第一训练单元502可以进一步被配置成在上述基于各样本的目标值,对初始模型进行更新之后,基于更新后的初始模型和上述样本集,重新估计各个位置的点击偏差和未点击偏差,以对各个位置的点击偏差和未点击偏差进行更新。
在本实施例的一些可选的实现方式中,该装置还可以包括第二训练单元(图中未示出)。其中,上述第二训练单元可以被配置成响应于确定初始模型未训练完成,使用更新后的初始模型以及更新后的各个位置的点击偏差和未点击偏差,继续执行上述训练步骤。
在本实施例的一些可选的实现方式中,上述第一训练单元502可以进一步被配置成:将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的梯度计算公式,将梯度计算结果确定为该样本的目标值。
在本实施例的一些可选的实现方式中,初始模型可以为决策树。以及上述第一训练单元502可以进一步被配置成:创建决策树,对各样本的目标值进行拟合;基于所创建的决策树,更新初始模型。
在本实施例的一些可选的实现方式中,上述第一训练单元502可以进一步被配置成:确定已创建的决策树的数量,将上述数量与预设数量进行比较;根据比较结果,确定初始模型是否训练完成。
在本实施例的一些可选的实现方式中,上述第一训练单元502可以进一步被配置成:将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的损失函数,得到损失值,将上述损失值确定为该样本的目标值。
在本实施例的一些可选的实现方式中,上述第一训练单元502可以进一步被配置成:确定各样本的目标值的平均值,将上述平均值与 预设值进行比较;根据比较结果,确定初始模型是否训练完成。
本申请的上述实施例提供的装置,通过获取样本集,可以利用样本集中的样本进行初始模型的训练。其中,样本集中的样本可以包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档。这样,将样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,便可以得到第一位置文档和第二位置文档的得分。之后,可以基于各样本的目标值,对初始模型进行更新。最后,可以确定初始模型是否训练完成,若初始模型训练完成,就可以将训练后的初始模型确定为排序模型。从而能够得到一种用于排序的模型,有助于丰富模型的生成方式。
此外,在以往的排序学习中,通常仅考虑点击偏差,未对未点击偏差进行考虑,导致不能直接适用于文档对方法(PairWise Approach)进行排序学习。本申请的上述实施例提供的方法所训练得到的排序模型,不仅考虑了点击偏差,还考虑了未点击偏差。由此,该排序模型适用于文档对方法进行排序学习。由于文档对方法相较于单文档方法(PointWise Approach),具有更好的排序效果,因而,使用本申请的上述实施例提供的方法所训练得到的排序模型,可以使排序的准确率提高。
请参见图6,其示出了本申请提供的用于生成信息的方法的一个实施例的流程600。该用于生成信息的方法可以包括以下步骤:
步骤601,响应于接收到包含目标查询信息的查询请求,检索与目标查询信息相匹配的候选文档,汇总为候选文档集合。
在本实施例中,用于生成信息的方法的执行主体(例如图1所示的服务器105)可以通过有线连接或者无线连接方式,接收包含目标查询信息的查询请求。而后,可以检索与上述目标查询信息相匹配的候选文档,汇总为候选文档集合。其中,上述查询请求可以由终端设备(例如图1所示的终端设备101、102、103)发送。
步骤602,将候选文档集合中的候选文档输入排序模型中,得到各候选文档的得分。
在本实施例中,上述执行主体可以将上述候选文档集合中的候选文档输入排序模型中,得到各候选文档的得分。此处,上述排序模型可以是采用如上述图2实施例所描述的方法而生成的。具体生成过程可以参见图2实施例的相关描述,此处不再赘述。
步骤603,按照得分由大到小的顺序,对候选文档集合中的候选文档进行排序,返回排序结果。
在本实施例中,上述执行主体可以按照步骤602所得到的得分由大到小的顺序,对上述候选文档集合中的候选文档进行排序,并返回排序结果。
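Steps 601 to 603 amount to scoring every candidate and sorting by score; a minimal sketch, with a toy term-overlap scorer standing in (as an assumption) for the trained ranking model:

```python
def rank_candidates(query, candidates, model):
    # Score every candidate document, then sort in descending order
    # of score and return the ranked documents.
    scored = [(model(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]

def toy_model(query, doc):
    # Stand-in for the trained ranking model: count query terms in the doc.
    return sum(term in doc for term in query.split())

ranked = rank_candidates(
    "machine learning",
    ["deep learning", "machine learning basics", "cooking"],
    toy_model,
)
```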
需要说明的是,本实施例用于生成信息的方法可以用于测试上述各实施例所生成的排序模型。进而根据测试结果可以不断地优化排序模型。该方法也可以是上述各实施例所生成的排序模型的实际应用方法。采用上述各实施例所生成的排序模型,来进行文档打分,进而进行排序,有助于提高排序的性能。
继续参见图7,作为对上述图6所示方法的实现,本申请提供了一种用于生成信息的装置的一个实施例。该装置实施例与图6所示的方法实施例相对应,该装置具体可以应用于各种电子设备中。
如图7所示,本实施例所述的用于生成信息的装置700包括:检索单元701,被配置成响应于接收到包含目标查询信息的查询请求,检索与上述目标查询信息相匹配的候选文档,汇总为候选文档集合。输入单元702,被配置成将上述候选文档集合中的候选文档输入采用如上述图2实施例所描述的方法生成的排序模型中,得到各候选文档的得分。排序单元703,被配置成按照得分由大到小的顺序,对上述候选文档集合中的候选文档进行排序,返回排序结果。
可以理解的是,该装置700中记载的诸单元与参考图6描述的方法中的各个步骤相对应。由此,上文针对方法描述的操作、特征以及产生的有益效果同样适用于装置700及其中包含的单元,在此不再赘述。
下面参考图8,其示出了适于用来实现本申请实施例的电子设备的计算机***800的结构示意图。图8示出的电子设备仅仅是一个示例,不应对本申请实施例的功能和使用范围带来任何限制。
如图8所示,计算机***800包括中央处理单元(CPU)801,其可以根据存储在只读存储器(ROM)802中的程序或者从存储部分808加载到随机访问存储器(RAM)803中的程序而执行各种适当的动作和处理。在RAM 803中,还存储有***800操作所需的各种程序和数据。CPU 801、ROM 802以及RAM 803通过总线804彼此相连。输入/输出(I/O)接口805也连接至总线804。
以下部件连接至I/O接口805:包括触摸屏、触摸板等的输入部分806;包括诸如液晶显示器(LCD)等以及扬声器等的输出部分807;包括硬盘等的存储部分808;以及包括诸如LAN卡、调制解调器等的网络接口卡的通信部分809。通信部分809经由诸如因特网的网络执行通信处理。驱动器810也根据需要连接至I/O接口805。可拆卸介质811,诸如半导体存储器等等,根据需要安装在驱动器810上,以便于从其上读出的计算机程序根据需要被安装入存储部分808。
特别地,根据本公开的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在计算机可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信部分809从网络上被下载和安装,和/或从可拆卸介质811被安装。在该计算机程序被中央处理单元(CPU)801执行时,执行本申请的方法中限定的上述功能。需要说明的是,本申请所述的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的***、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本申请中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行***、装置或者器件使用或者 与其结合使用。而在本申请中,计算机可读的信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读的信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读介质可以发送、传播或者传输用于由指令执行***、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:无线、电线、光缆、RF等等,或者上述的任意合适的组合。
附图中的流程图和框图,图示了按照本申请各种实施例的***、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的***来实现,或者可以用专用硬件与计算机指令的组合来实现。
描述于本申请实施例中所涉及到的单元可以通过软件的方式实现,也可以通过硬件的方式来实现。所描述的单元也可以设置在处理器中,例如,可以描述为:一种处理器包括获取单元和第一训练单元。其中,这些单元的名称在某种情况下并不构成对该单元本身的限定,例如,获取单元还可以被描述为“获取样本集的单元”。
作为另一方面,本申请还提供了一种计算机可读介质,该计算机可读介质可以是上述实施例中描述的装置中所包含的;也可以是单独存在,而未装配入该装置中。上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该装置执行时,使得该装置:获取样本集;执行如下训练步骤:对于该样本集中的样本,将该样本中 的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,其中,点击偏差、未点击偏差分别用于表征文档在查询结果中的位置对文档被点击的概率、未被点击的概率的影响程度;基于各样本的目标值,对初始模型进行更新;确定初始模型是否训练完成;响应于确定初始模型训练完成,将更新后的初始模型确定为排序模型。
可选的,当上述一个或者多个程序被该装置执行时,还可以使得该装置:在上述基于各样本的目标值,对初始模型进行更新之后,基于更新后的初始模型和上述样本集,重新估计各个位置的点击偏差和未点击偏差,以对各个位置的点击偏差和未点击偏差进行更新。
可选的,当上述一个或者多个程序被该装置执行时,还可以使得该装置:响应于确定初始模型未训练完成,使用更新后的初始模型以及更新后的各个位置的点击偏差和未点击偏差,继续执行上述训练步骤。
可选的,上述基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,可以包括:将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的梯度计算公式,将梯度计算结果确定为该样本的目标值。
可选的,初始模型可以为决策树;以及上述基于各样本的目标值,对初始模型进行更新,可以包括:创建决策树,对各样本的目标值进行拟合;基于所创建的决策树,更新初始模型。
可选的,上述确定初始模型是否训练完成,可以包括:确定已创建的决策树的数量,将上述数量与预设数量进行比较;根据比较结果,确定初始模型是否训练完成。
可选的,上述基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,可以包括:将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的损失函数,得到损失值,将上述损失值确定为该样本的目标值。
可选的,上述确定初始模型是否训练完成,可以包括:确定各样 本的目标值的平均值,将上述平均值与预设值进行比较;根据比较结果,确定初始模型是否训练完成。
此外,上述计算机可读介质承载的一个或者多个程序,当上述一个或者多个程序被该装置执行时,也可以使得该装置:响应于接收到包含目标查询信息的查询请求,检索与上述目标查询信息相匹配的候选文档,汇总为候选文档集合;将上述候选文档集合中的候选文档输入采用上述实施例所描述的方法所生成的排序模型中,得到各候选文档的得分;按照得分由大到小的顺序,对上述候选文档集合中的候选文档进行排序,返回排序结果。
以上描述仅为本申请的较佳实施例以及对所运用技术原理的说明。本领域技术人员应当理解,本申请中所涉及的发明范围,并不限于上述技术特征的特定组合而成的技术方案,同时也应涵盖在不脱离上述发明构思的情况下,由上述技术特征或其等同特征进行任意组合而形成的其它技术方案。例如上述特征与本申请中公开的(但不限于)具有类似功能的技术特征进行互相替换而形成的技术方案。

Claims (20)

  1. 一种用于生成排序模型的方法,包括:
    获取样本集,其中,所述样本集中的样本包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档;
    执行如下训练步骤:对于样本集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,其中,点击偏差、未点击偏差分别用于表征文档在查询结果中的位置对文档被点击的概率、未被点击的概率的影响程度;基于各样本的目标值,对初始模型进行更新;确定初始模型是否训练完成;响应于确定初始模型训练完成,将更新后的初始模型确定为排序模型。
  2. 根据权利要求1所述的用于生成排序模型的方法,其中,在所述基于各样本的目标值,对初始模型进行更新之后,所述训练步骤还包括:
    基于更新后的初始模型和所述样本集,重新估计各个位置的点击偏差和未点击偏差,以对各个位置的点击偏差和未点击偏差进行更新。
  3. 根据权利要求2所述的用于生成排序模型的方法,其中,所述方法还包括:
    响应于确定初始模型未训练完成,使用更新后的初始模型以及更新后的各个位置的点击偏差和未点击偏差,继续执行所述训练步骤。
  4. 根据权利要求1所述的用于生成排序模型的方法,其中,所述基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,包括:
    将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的梯度计算公式,将梯度计算结果确定为该样本的目 标值。
  5. 根据权利要求4所述的用于生成排序模型的方法,其中,初始模型为决策树;以及
    所述基于各样本的目标值,对初始模型进行更新,包括:
    创建决策树,对各样本的目标值进行拟合;
    基于所创建的决策树,更新初始模型。
  6. 根据权利要求5所述的用于生成排序模型的方法,其中,所述确定初始模型是否训练完成,包括:
    确定已创建的决策树的数量,将所述数量与预设数量进行比较;
    根据比较结果,确定初始模型是否训练完成。
  7. 根据权利要求1所述的用于生成排序模型的方法,其中,所述基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,包括:
    将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的损失函数,得到损失值,将所述损失值确定为该样本的目标值。
  8. 根据权利要求7所述的用于生成排序模型的方法,其中,所述确定初始模型是否训练完成,包括:
    确定各样本的目标值的平均值,将所述平均值与预设值进行比较;
    根据比较结果,确定初始模型是否训练完成。
  9. 一种用于生成排序模型的装置,包括:
    获取单元,被配置成获取样本集,其中,所述样本集中的样本包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档;
    第一训练单元,被配置成执行如下训练步骤:对于样本集中的样 本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,其中,点击偏差、未点击偏差分别用于表征文档在查询结果中的位置对文档被点击的概率、未被点击的概率的影响程度;基于各样本的目标值,对初始模型进行更新;确定初始模型是否训练完成;响应于确定初始模型训练完成,将更新后的初始模型确定为排序模型。
  10. 根据权利要求9所述的用于生成排序模型的装置,其中,所述第一训练单元进一步被配置成:
    在所述基于各样本的目标值,对初始模型进行更新之后,基于更新后的初始模型和所述样本集,重新估计各个位置的点击偏差和未点击偏差,以对各个位置的点击偏差和未点击偏差进行更新。
  11. 根据权利要求10所述的用于生成排序模型的装置,其中,所述装置还包括:
    第二训练单元,被配置成响应于确定初始模型未训练完成,使用更新后的初始模型以及更新后的各个位置的点击偏差和未点击偏差,继续执行所述训练步骤。
  12. 根据权利要求9所述的用于生成排序模型的装置,其中,所述第一训练单元,进一步被配置成:
    将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的梯度计算公式,将梯度计算结果确定为该样本的目标值。
  13. 根据权利要求12所述的用于生成排序模型的装置,其中,初始模型为决策树;以及
    所述第一训练单元,进一步被配置成:
    创建决策树,对各样本的目标值进行拟合;
    基于所创建的决策树,更新初始模型。
  14. 根据权利要求13所述的用于生成排序模型的装置,其中,所述第一训练单元,进一步被配置成:
    确定已创建的决策树的数量,将所述数量与预设数量进行比较;
    根据比较结果,确定初始模型是否训练完成。
  15. 根据权利要求9所述的用于生成排序模型的装置,其中,所述第一训练单元,进一步被配置成:
    将所得到的得分、第一位置的点击偏差和第二位置的未点击偏差输入至预先建立的损失函数,得到损失值,将所述损失值确定为该样本的目标值。
  16. 根据权利要求15所述的用于生成排序模型的装置,其中,所述第一训练单元,进一步被配置成:
    确定各样本的目标值的平均值,将所述平均值与预设值进行比较;
    根据比较结果,确定初始模型是否训练完成。
  17. 一种用于生成信息的方法,包括:
    响应于接收到包含目标查询信息的查询请求,检索与所述目标查询信息相匹配的候选文档,汇总为候选文档集合;
    将所述候选文档集合中的候选文档输入采用如权利要求1-8之一所述的方法生成的排序模型中,得到各候选文档的得分;
    按照得分由大到小的顺序,对所述候选文档集合中的候选文档进行排序,返回排序结果。
  18. 一种用于生成信息的装置,包括:
    检索单元,被配置成响应于接收到包含目标查询信息的查询请求,检索与所述目标查询信息相匹配的候选文档,汇总为候选文档集合;
    输入单元,被配置成将所述候选文档集合中的候选文档输入采用 如权利要求1-8之一所述的方法生成的排序模型中,得到各候选文档的得分;
    排序单元,被配置成按照得分由大到小的顺序,对所述候选文档集合中的候选文档进行排序,返回排序结果。
  19. 一种电子设备,包括:
    一个或多个处理器;
    存储装置,其上存储有一个或多个程序,
    当所述一个或多个程序被所述一个或多个处理器执行时:
    获取样本集,其中,所述样本集中的样本包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档;
    执行如下训练步骤:对于样本集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,其中,点击偏差、未点击偏差分别用于表征文档在查询结果中的位置对文档被点击的概率、未被点击的概率的影响程度;基于各样本的目标值,对初始模型进行更新;确定初始模型是否训练完成;响应于确定初始模型训练完成,将更新后的初始模型确定为排序模型。
  20. 一种计算机可读介质,其上存储有计算机程序,其中,该程序被处理器执行时,使得所述处理器:
    获取样本集,其中,所述样本集中的样本包括查询信息、查询结果中被点击的第一位置文档和未被点击的第二位置文档;
    执行如下训练步骤:对于样本集中的样本,将该样本中的查询信息、第一位置文档和第二位置文档输入至初始模型,分别得到所输入的各文档的得分,基于所得到的得分、第一位置的点击偏差和第二位置的未点击偏差,确定该样本的目标值,其中,点击偏差、未点击偏差分别用于表征文档在查询结果中的位置对文档被点击的概率、未被点击的概率的影响程度;基于各样本的目标值,对初始模型进行更新; 确定初始模型是否训练完成;响应于确定初始模型训练完成,将更新后的初始模型确定为排序模型。
PCT/CN2018/104683 2018-09-07 2018-09-07 用于生成排序模型的方法和装置 WO2020047861A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/104683 WO2020047861A1 (zh) 2018-09-07 2018-09-07 用于生成排序模型的方法和装置
US16/980,897 US11403303B2 (en) 2018-09-07 2018-09-07 Method and device for generating ranking model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/104683 WO2020047861A1 (zh) 2018-09-07 2018-09-07 用于生成排序模型的方法和装置

Publications (1)

Publication Number Publication Date
WO2020047861A1 true WO2020047861A1 (zh) 2020-03-12

Family

ID=69722088

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/104683 WO2020047861A1 (zh) 2018-09-07 2018-09-07 用于生成排序模型的方法和装置

Country Status (2)

Country Link
US (1) US11403303B2 (zh)
WO (1) WO2020047861A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111563361B (zh) * 2020-04-01 2024-05-14 北京小米松果电子有限公司 文本标签的提取方法及装置、存储介质

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428125B (zh) * 2019-01-10 2023-05-30 北京三快在线科技有限公司 排序方法、装置、电子设备及可读存储介质
US11973743B2 (en) 2019-12-13 2024-04-30 TripleBlind, Inc. Systems and methods for providing a systemic error in artificial intelligence algorithms
US10924460B2 (en) 2019-12-13 2021-02-16 TripleBlind, Inc. Systems and methods for dividing filters in neural networks for private data computations
US11528259B2 (en) 2019-12-13 2022-12-13 TripleBlind, Inc. Systems and methods for providing a systemic error in artificial intelligence algorithms
US11431688B2 (en) 2019-12-13 2022-08-30 TripleBlind, Inc. Systems and methods for providing a modified loss function in federated-split learning
WO2022109215A1 (en) 2020-11-20 2022-05-27 TripleBlind, Inc. Systems and methods for providing a blind de-identification of privacy data
US11625377B1 (en) 2022-02-03 2023-04-11 TripleBlind, Inc. Systems and methods for enabling two parties to find an intersection between private data sets without learning anything other than the intersection of the datasets
US11539679B1 (en) 2022-02-04 2022-12-27 TripleBlind, Inc. Systems and methods for providing a quantum-proof key exchange

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090037401A1 (en) * 2007-07-31 2009-02-05 Microsoft Corporation Information Retrieval and Ranking
CN106445979A (zh) * 2015-08-13 2017-02-22 北京字节跳动网络技术有限公司 一种智能频道排序方法和装置
CN107402954A (zh) * 2017-05-26 2017-11-28 百度在线网络技术(北京)有限公司 建立排序模型的方法、基于该模型的应用方法和装置

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7505964B2 (en) * 2003-09-12 2009-03-17 Google Inc. Methods and systems for improving a search ranking using related queries
US20090265290A1 (en) * 2008-04-18 2009-10-22 Yahoo! Inc. Optimizing ranking functions using click data
US8671093B2 (en) * 2008-11-18 2014-03-11 Yahoo! Inc. Click model for search rankings
US20100318531A1 (en) * 2009-06-10 2010-12-16 Microsoft Corporation Smoothing clickthrough data for web search ranking
US20110208735A1 (en) * 2010-02-23 2011-08-25 Microsoft Corporation Learning Term Weights from the Query Click Field for Web Search
US8326815B2 (en) * 2010-03-16 2012-12-04 Yahoo! Inc. Session based click features for recency ranking
US20110270828A1 (en) * 2010-04-29 2011-11-03 Microsoft Corporation Providing search results in response to a search query
US20120271806A1 (en) * 2011-04-21 2012-10-25 Microsoft Corporation Generating domain-based training data for tail queries
US9064016B2 (en) * 2012-03-14 2015-06-23 Microsoft Corporation Ranking search results using result repetition
US20130246383A1 (en) * 2012-03-18 2013-09-19 Microsoft Corporation Cursor Activity Evaluation For Search Result Enhancement
US9104733B2 (en) * 2012-11-29 2015-08-11 Microsoft Technology Licensing, Llc Web search ranking
US20150332169A1 (en) * 2014-05-15 2015-11-19 International Business Machines Corporation Introducing user trustworthiness in implicit feedback based search result ranking
US10657556B1 (en) * 2015-06-09 2020-05-19 Twitter, Inc. Click-through prediction for targeted content
US11170005B2 (en) * 2016-10-04 2021-11-09 Verizon Media Inc. Online ranking of queries for sponsored search
US10558687B2 (en) * 2016-10-27 2020-02-11 International Business Machines Corporation Returning search results utilizing topical user click data when search queries are dissimilar
US20190019157A1 (en) * 2017-07-13 2019-01-17 Linkedin Corporation Generalizing mixed effect models for personalizing job search
US10762092B2 (en) * 2017-08-16 2020-09-01 International Business Machines Corporation Continuous augmentation method for ranking components in information retrieval
RU2694001C2 (ru) * 2017-11-24 2019-07-08 Общество С Ограниченной Ответственностью "Яндекс" Способ и система создания параметра качества прогноза для прогностической модели, выполняемой в алгоритме машинного обучения
RU2733481C2 (ru) * 2018-12-13 2020-10-01 Общество С Ограниченной Ответственностью "Яндекс" Способ и система генерирования признака для ранжирования документа
RU2744029C1 (ru) * 2018-12-29 2021-03-02 Общество С Ограниченной Ответственностью "Яндекс" Система и способ формирования обучающего набора для алгоритма машинного обучения
US10956430B2 (en) * 2019-04-16 2021-03-23 International Business Machines Corporation User-driven adaptation of rankings of navigation elements
US11487797B2 (en) * 2020-09-22 2022-11-01 Dell Products L.P. Iterative application of a machine learning-based information extraction model to documents having unstructured text data
US11693897B2 (en) * 2020-10-20 2023-07-04 Spotify Ab Using a hierarchical machine learning algorithm for providing personalized media content

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090037401A1 (en) * 2007-07-31 2009-02-05 Microsoft Corporation Information Retrieval and Ranking
CN106445979A (zh) * 2015-08-13 2017-02-22 北京字节跳动网络技术有限公司 一种智能频道排序方法和装置
CN107402954A (zh) * 2017-05-26 2017-11-28 百度在线网络技术(北京)有限公司 建立排序模型的方法、基于该模型的应用方法和装置

Also Published As

Publication number Publication date
US11403303B2 (en) 2022-08-02
US20210026860A1 (en) 2021-01-28

Similar Documents

Publication Publication Date Title
WO2020047861A1 (zh) 用于生成排序模型的方法和装置
US10846643B2 (en) Method and system for predicting task completion of a time period based on task completion rates and data trend of prior time periods in view of attributes of tasks using machine learning models
US20190362222A1 (en) Generating new machine learning models based on combinations of historical feature-extraction rules and historical machine-learning models
JP7343568B2 (ja) 機械学習のためのハイパーパラメータの識別および適用
US10671812B2 (en) Text classification using automatically generated seed data
US11182433B1 (en) Neural network-based semantic information retrieval
US10387430B2 (en) Geometry-directed active question selection for question answering systems
US20230139783A1 (en) Schema-adaptable data enrichment and retrieval
US20190050487A1 (en) Search Method, Search Server and Search System
US20140280238A1 (en) Systems and methods for classifying electronic information using advanced active learning techniques
US10489800B2 (en) Discovery of new business openings using web content analysis
US11216855B2 (en) Server computer and networked computer system for evaluating, storing, and managing labels for classification model evaluation and training
US20180101617A1 (en) Ranking Search Results using Machine Learning Based Models
US20210406993A1 (en) Automated generation of titles and descriptions for electronic commerce products
US10146872B2 (en) Method and system for predicting search results quality in vertical ranking
US11675764B2 (en) Learned data ontology using word embeddings from multiple datasets
US20220076157A1 (en) Data analysis system using artificial intelligence
CN110059172B (zh) 基于自然语言理解的推荐答案的方法和装置
EP4030355A1 (en) Neural reasoning path retrieval for multi-hop text comprehension
US20190065987A1 (en) Capturing knowledge coverage of machine learning models
WO2022265782A1 (en) Blackbox optimization via model ensembling
CN114676227A (zh) 样本生成方法、模型的训练方法以及检索方法
US20230229957A1 (en) Subcomponent model training
EP3968182A1 (en) Computerized smart inventory search methods and systems using classification and tagging
US20240152933A1 (en) Automatic mapping of a question or compliance controls associated with a compliance standard to compliance controls associated with another compliance standard

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18932499

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.06.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18932499

Country of ref document: EP

Kind code of ref document: A1