CN115545276A - Order receiving rate prediction method and system for abnormal orders of online taxi appointment - Google Patents


Publication number
CN115545276A
Authority
CN
China
Prior art keywords
order
model
rate prediction
sample data
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211071447.9A
Other languages
Chinese (zh)
Inventor
李玉柱
史彬
凌国沈
田舟贤
史何富
强琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Geely Holding Group Co Ltd
Hangzhou Youxing Technology Co Ltd
Original Assignee
Zhejiang Geely Holding Group Co Ltd
Hangzhou Youxing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Geely Holding Group Co Ltd, Hangzhou Youxing Technology Co Ltd
Priority to CN202211071447.9A
Publication of CN115545276A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0633Lists, e.g. purchase orders, compilation or processing
    • G06Q30/0635Processing of requisition or of purchase orders
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40Business processes related to the transportation industry

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Tourism & Hospitality (AREA)
  • Accounting & Taxation (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Finance (AREA)
  • Educational Administration (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method and a system for predicting the order taking rate of abnormal orders of online taxi appointment. The method comprises: a data acquisition step, in which an order sample data set is acquired, the order sample data set comprising database data of a passenger-side APP and a driver-side APP; a step of determining model-input features, in which order-taking labels are assigned based on the order sample data set and the model-input features are determined, including screening of the model-input features; a model training step, in which an order taking rate prediction model is obtained through training and evaluation; and a model application step, in which the information of an order to be predicted is input into the order taking rate prediction model, a predicted order taking rate is output, and the risk-prevention decision is optimized according to the prediction result. The system comprises a sample acquisition module, a sample labeling module, a feature development module, a feature screening module, a model training module, a model evaluation module and an identification module. The method and the system improve the accuracy of predicting the probability that an abnormal order is taken, and improve the experience of users whose normal orders are misjudged by risk control.

Description

Order receiving rate prediction method and system for abnormal orders of online taxi appointment
Technical Field
The invention relates to the field of online taxi appointment in the Internet, in particular to a method and a system for predicting the order taking rate of an abnormal order of the online taxi appointment.
Background
With the combination of mobile communication technology and travel services, online taxi appointment on mobile terminals has become one of the main ways people travel.
Most current online taxi appointment services operate on a ride-first, pay-later basis. This mode inevitably produces a large number of abnormal orders that remain unpaid for a long time, causing substantial financial losses to the platform. The usual penalty for an order identified as abnormal is to require prepayment or an account recharge. This penalty more or less affects the ride experience of normal users who are wrongly identified, reducing their willingness to request a ride again and lowering the platform's transaction revenue. It is therefore particularly important for a travel service to provide passengers with better service and experience while reducing the platform's losses from unpaid orders.
Research in the prior art generally aims to improve the accuracy of abnormal-order identification as much as possible in order to reduce the impact on normal orders. For example, policy rules are formulated from expert experience to judge whether an order is abnormal, or a machine learning model is trained on order attributes and user behavior to identify abnormal orders, and a prepayment or recharge penalty is then applied directly to the identified abnormal orders. This approach does not consider whether the order will actually be accepted by a driver, that is, whether the penalty is actually effective. If the order is not accepted by a driver, it causes no unpaid loss to the platform and does not need a prepayment or recharge penalty. How to accurately and effectively identify the abnormal orders that will be taken by drivers, and to penalize only those orders, is therefore an urgent problem in current travel services.
Disclosure of Invention
In view of the above disadvantages of the prior art, an object of the present invention is to provide a method and a system for predicting the order taking rate of abnormal orders of online taxi appointment, which solve the problem in the prior art that the probability of an abnormal order being taken is not predicted accurately enough, resulting in a poor experience for users whose normal orders are misjudged by risk control.
To achieve the above and other related objects, the present invention provides a method and a system for predicting the order taking rate of abnormal orders of online taxi appointment, in which the order taking rate of an abnormal order is predicted based on driver status information and user order information. The method can effectively identify the abnormal orders that will be taken by a driver and improves the accuracy of predicting the probability that an abnormal order is taken, thereby improving the experience of users whose normal orders are misjudged by risk control.
In an embodiment of the present invention, a method for predicting the order taking rate of abnormal orders of online taxi appointment includes:
a data acquisition step, in which an order sample data set is acquired, the order sample data set comprising database data of a passenger-side APP and a driver-side APP;
a step of determining model-input features, in which order-taking labels are assigned based on the order sample data set and the model-input features are determined, including screening of the model-input features;
a model training step, in which an order taking rate prediction model is obtained through training and evaluation;
and a model application step, in which the information of an order to be predicted is input into the order taking rate prediction model, a predicted order taking rate is output, and the risk-prevention decision is optimized according to the prediction result.
In an embodiment of the invention, in the data obtaining step, the database data of the passenger side APP and the driver side APP includes one or more of the following information: target order attribute information, user historical behavior information, peripheral driver information, and environmental information.
In an embodiment of the present invention, the step of determining the model-input features includes:
labeling each order in the sample data set with an order-taking label according to the specific situation of its order-taking result;
determining the model-input features according to target order related information, wherein the target order related information comprises target order attribute information, user historical behavior information, peripheral driver information, environment information and the like;
and screening the model-input features based on relevant indexes.
In an embodiment of the present invention, in the step of labeling each order in the sample data set according to the specific situation of its order-taking result, if a first specific situation occurs, that is, after the target user places the order it is taken by a driver and is not subsequently cancelled, the order is labeled 0 in the sample data set; if a second specific situation occurs, that is, the order is not completed after the target user places it, either because it is not taken by any driver or because it is cancelled after being taken, the order is labeled 1 in the sample data set.
In an embodiment of the invention, the relevant indexes include an availability index, an interpretability index, an information amount index, a correlation index and a stability index.
In an embodiment of the present invention:
the availability index is used for evaluating whether the feature can be repeatedly developed online;
the interpretability index is used for evaluating whether the feature can explain the final result;
the correlation index is the Pearson correlation coefficient computed for the features and is used for evaluating the correlation among the features;
the information amount index is the information value (IV) computed for the feature and is used for evaluating the predictive ability of the feature;
the stability index is the population stability index (PSI) computed for the feature and is used for evaluating the stability of the feature.
In an embodiment of the present invention, the model training step includes:
training the order taking rate prediction model by adopting a machine learning algorithm according to the screened model-input features;
evaluating the order taking rate prediction model, and judging the accuracy with which it predicts whether an order will be completed.
In an embodiment of the present invention, the model-input features obtained after the screening include: order characteristics, user characteristics, driver data, traffic conditions, and weather characteristics.
In an embodiment of the present invention, the machine learning algorithm may be a random forest algorithm, an XGBoost algorithm, or a decision tree algorithm.
In an embodiment of the present invention, in the model training step, the order sample data set is divided into a training set and a verification set according to a preset proportion; the order taking rate prediction model is trained on the training set with the machine learning algorithm according to the screened model-input features; and the verification set is used to verify the output of the trained order taking rate prediction model so as to judge whether it meets the preset requirement.
In an embodiment of the present invention, an order taking rate prediction system for abnormal orders of online taxi appointment, which executes the method, includes:
an order sample data set acquisition module, which acquires an order sample data set;
a sample data set labeling module, which assigns an order-taking label to each order in the order sample data set according to the specific situation;
a feature design and development module, which designs and develops the model-input features according to the target order attribute information, the user historical behavior information, the peripheral driver information, the environmental information and other data in the order sample data set;
a feature screening module, which screens the model-input features, using the model-input features and the labeled order sample data set, based on one or more of the following indexes: an availability index, an interpretability index, an information amount index, a correlation index and a stability index;
an order taking rate prediction model training module, which divides the order sample data set into a training set and a verification set according to a preset proportion, then trains an order taking rate prediction model on the screened model-input features with a machine learning algorithm, and optimizes the model parameters to obtain the final order taking rate prediction model;
an order taking rate prediction model evaluation module, which verifies the output of the trained order taking rate prediction model using the verification set divided from the order sample data set, and judges whether the recognition accuracy of the order taking rate prediction model for various risk users reaches a preset threshold;
and an abnormal order penalty decision optimization module, which optimizes the risk-prevention decision according to the prediction result of the order taking rate prediction model.
As described above, the method and system for predicting the order taking rate of abnormal orders of online taxi appointment have the following advantages: driver state information, order information, user information, environmental factors and other data are considered comprehensively when constructing the model-input features, which improves the prediction accuracy of the order taking rate prediction model. Based on the order taking rate prediction model, the decision on whether to apply a prepayment or recharge penalty to an order is optimized, which improves the experience of users whose normal orders are misjudged by risk control.
Drawings
FIG. 1 is a schematic diagram illustrating the steps of the method for predicting the order taking rate of abnormal orders of online taxi appointment according to the present invention.
FIG. 2 is a schematic diagram illustrating the steps of the method for predicting the order taking rate of abnormal orders of online taxi appointment according to a preferred embodiment of the present invention.
FIG. 3 is a schematic data flow diagram of the order taking rate prediction system for abnormal orders of online taxi appointment.
FIG. 4 is an application block diagram of the system for predicting the order taking rate of abnormal orders of online taxi appointment according to the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the features in the following embodiments and examples may be combined with each other without conflict. It is also to be understood that the terminology used in the examples herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention. Test methods in which specific conditions are not specified in the following examples are generally carried out under conventional conditions or under conditions recommended by the respective manufacturers.
Please refer to FIGS. 1 to 4. It should be understood that the structures, proportions, sizes and the like shown in the drawings are only intended to accompany the disclosure of the specification so that those skilled in the art can understand and read it. They are not intended to limit the conditions under which the present invention can be implemented and therefore have no substantive technical significance. Any structural modification, change of proportion or adjustment of size that does not affect the efficacy and achievable purpose of the present invention still falls within the scope of the present invention. In addition, terms such as "upper", "lower", "left", "right", "middle" and "one" are used in this specification only for clarity of description and are not intended to limit the scope of the present invention; their relative relationships may be changed or adjusted without substantive change to the technical content.
Referring to FIGS. 1 to 4, as a preferred embodiment, the present invention provides a method for predicting the order taking rate of abnormal orders of online taxi appointment. As shown in FIG. 1, the method mainly includes the following steps:
S1: a data acquisition step, in which an order sample data set is acquired, the order sample data set comprising database data of a passenger-side APP and a driver-side APP;
S2: a step of determining model-input features, in which order-taking labels are assigned based on the order sample data set and the model-input features are determined, including screening of the model-input features;
S3: a model training step, in which an order taking rate prediction model is obtained through training and evaluation;
S4: a model application step, in which the information of an order to be predicted is input into the order taking rate prediction model, a predicted order taking rate is output, and the risk-prevention decision is optimized according to the prediction result.
In the data acquisition step S1, user data is generated when a user requests a ride on the platform using the passenger-side APP or applet. Similarly, driver data is generated when a driver takes orders using the driver-side APP. Features can be designed and developed from these data to obtain feature data.
In step S2 of determining the model-input features, features are designed and developed from the driver information, order information and the like around the target order. The features may be designed from the available data based on expert experience. For example, a hexagonal area is drawn around the order, the number of users who requested a ride within the area in the past 10 minutes and the number of drivers within the area are counted, and features such as "number of users requesting rides around the order" and "number of idle drivers" are designed.
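The neighborhood counting described for S2 can be sketched as follows. This is only an illustration under stated assumptions: a circular radius computed with the haversine formula stands in for the hexagonal area, and the record fields (`lat`, `lon`, `ts`, `user_id`, `idle`), the 1 km radius and the function names are hypothetical, not taken from the patent.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearby_counts(order, calls, drivers, radius_km=1.0, window_s=600):
    """Count distinct calling users (last 10 minutes) and idle drivers near an order.

    A circular radius is an assumed stand-in for the hexagonal area in the text.
    """
    def near(p):
        return haversine_km(order["lat"], order["lon"], p["lat"], p["lon"]) <= radius_km

    users = {
        c["user_id"]
        for c in calls
        if order["ts"] - window_s <= c["ts"] <= order["ts"] and near(c)
    }
    idle = sum(1 for d in drivers if d["idle"] and near(d))
    return {"calling_users_nearby": len(users), "idle_drivers_nearby": idle}

# Toy data: timestamps are seconds, coordinates are degrees.
order = {"lat": 30.0, "lon": 120.0, "ts": 1000}
calls = [
    {"user_id": "u1", "lat": 30.001, "lon": 120.001, "ts": 900},
    {"user_id": "u1", "lat": 30.001, "lon": 120.001, "ts": 950},  # same user, counted once
    {"user_id": "u2", "lat": 30.001, "lon": 120.0, "ts": 300},    # outside the time window
    {"user_id": "u3", "lat": 30.1, "lon": 120.0, "ts": 990},      # too far away
    {"user_id": "u4", "lat": 30.0, "lon": 120.002, "ts": 500},
]
drivers = [
    {"lat": 30.002, "lon": 120.0, "idle": True},
    {"lat": 30.0, "lon": 120.001, "idle": False},  # busy driver, not counted
    {"lat": 30.05, "lon": 120.0, "idle": True},    # too far away
]
features = nearby_counts(order, calls, drivers)
```

A production system would more likely index positions with a hexagonal grid so that counts per cell can be maintained incrementally; the brute-force scan here is only meant to make the two example features concrete.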
In the model training step S3, the sample data set is divided into a training set and a verification set according to a preset proportion. The training set is used to train the order taking rate prediction model, and the verification set is used to verify the output of the trained model and judge whether it meets the preset requirement. The order taking rate prediction model is then trained on the screened features with a machine learning algorithm such as random forest, XGBoost (eXtreme Gradient Boosting) or decision tree, and the final order taking rate prediction model is obtained after optimization.
In the model application step S4, the order taking rate prediction model is applied to a specific scenario: an order to be predicted is input into the model to obtain a prediction result, and the prediction result determines whether a risk-prevention measure (such as a prepayment or recharge penalty on the order) needs to be taken.
The method in a preferred embodiment of the invention (as shown in FIG. 2) comprises the steps of:
S21: acquiring an order sample data set, wherein the order sample data set comprises target order attribute information, user historical behavior information, peripheral driver information and environment information;
S22: labeling the orders in the sample data set according to the specific situations;
S23: designing and developing features according to driver information, order information and the like around the target order;
S24: screening the model-input features based on relevant indexes;
S25: training an order taking rate prediction model on the screened features with a machine learning algorithm;
S26: evaluating the order taking rate prediction model, and judging the accuracy with which it predicts whether an order will be completed;
S27: optimizing the decision on whether to penalize an order according to the model's prediction of whether the order will be completed.
In the first step S21, an order sample set is obtained, consisting of a number of order samples within a certain time period.
Specifically, the sample set mainly includes target order attribute information, user historical behavior information, information on drivers around the target order, and environment information. The related information includes but is not limited to: peripheral driver position information, driver service state information, driver online time, order placing time, user position when placing the order, order price, order start and end positions, traffic conditions, weather and the like. Here the user is the passenger who places the order. The data comes from the user data generated when users request rides on the platform through the passenger-side APP or applet, and from the driver data generated when drivers take orders through the driver-side APP.
In a second step S22, each order in the sample data set is labeled according to the specific situation of the order taking result.
Specifically, a label column is added to the sample data set. If the first specific situation occurs, that is, after the target user places the order it is taken by a driver and is not subsequently cancelled, the order is labeled 0 in the sample data set; if the second specific situation occurs, that is, the order is not completed after the target user places it, either because it is not taken by any driver or because it is cancelled after being taken, the order is labeled 1 in the sample data set.
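The labeling rule of S22 can be written down in a few lines. The sketch below is illustrative only; the record type and its field names (`accepted_by_driver`, `cancelled_after_accept`) are assumptions, since the patent does not name the underlying columns.

```python
from dataclasses import dataclass

@dataclass
class OrderRecord:
    # Hypothetical fields; the patent does not specify the column names.
    order_id: str
    accepted_by_driver: bool
    cancelled_after_accept: bool

def order_taking_label(order: OrderRecord) -> int:
    """Return 0 if the order was taken and not cancelled, otherwise 1."""
    if order.accepted_by_driver and not order.cancelled_after_accept:
        return 0
    return 1

orders = [
    OrderRecord("a", accepted_by_driver=True, cancelled_after_accept=False),   # completed
    OrderRecord("b", accepted_by_driver=False, cancelled_after_accept=False),  # never taken
    OrderRecord("c", accepted_by_driver=True, cancelled_after_accept=True),    # taken, then cancelled
]
labels = [order_taking_label(o) for o in orders]
```

Note that under this convention the positive class (label 1) is the uncompleted order, which is why the evaluation in S26 reports precision and recall for label 1.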
In the third step S23, the model-input features are designed and developed from data such as target order attribute information, user historical behavior information, peripheral driver information and environmental information; that is, the model-input features are determined according to the target order related information.
Specifically, the features developed include, but are not limited to: order features such as the number of users requesting rides around the order, the number of drivers around the order, the order-taking ratio around the order, and the order amount after discount; user features such as the user's number of ride requests, number of completed orders, order completion ratio, number of cancellations, and historical waiting time before cancellation; driver data such as the number of peripheral drivers, the online time of peripheral drivers, the number of orders taken by peripheral drivers, the ratio of dispatched orders to taken orders for peripheral drivers, and the distance between peripheral drivers and the order start point; as well as traffic conditions, weather features and the like.
In the fourth step S24, the model-input features are screened based on relevant indexes including, but not limited to, an availability index, an interpretability index, an information amount index, a correlation index and a stability index.
Specifically, the availability index requires comprehensive consideration of product flow design, user authorization agreements, compliance requirements, the stage at which the model is applied and the like, to determine whether the feature data is continuously available. In the present invention, it refers to whether the feature can be repeatedly developed online; for example, a feature whose development cost is too high will most likely be discarded.
The interpretability index means that the business logic of the feature must be clear and explainable in business terms. In the present invention, it refers to whether the feature can explain the final result; for example, if the results of a feature do not accord with common sense (mostly as judged by the business side), the feature is discarded.
The information amount index is the information value (IV) computed for the feature, which is used to evaluate the predictive ability of the feature. Generally, the higher the IV, the stronger the predictive ability. When the IV of a feature is greater than a set threshold (generally 0.02), the feature has predictive ability and meets the requirement for entering the model.
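As a rough illustration of the IV screening, the sketch below uses the common definition IV = Σ (p_i − q_i) · ln(p_i / q_i), where p_i and q_i are the shares of label-1 and label-0 samples falling in bin i. The patent does not give the formula or the binning method, so the pre-binned input, the additive smoothing constant `eps` and the function name are assumptions.

```python
import math
from collections import Counter

def information_value(bins, labels, eps=0.5):
    """IV of a pre-binned feature against 0/1 labels.

    eps is an assumed smoothing constant that avoids log(0) for empty cells.
    """
    pos = Counter(b for b, y in zip(bins, labels) if y == 1)
    neg = Counter(b for b, y in zip(bins, labels) if y == 0)
    keys = sorted(set(bins))
    n_pos = sum(pos.values()) + eps * len(keys)
    n_neg = sum(neg.values()) + eps * len(keys)
    iv = 0.0
    for b in keys:
        p = (pos[b] + eps) / n_pos  # share of label-1 samples in this bin
        q = (neg[b] + eps) / n_neg  # share of label-0 samples in this bin
        iv += (p - q) * math.log(p / q)
    return iv

# Toy feature: "few" vs "many" idle drivers near the order.
bins = ["few"] * 50 + ["many"] * 50
labels_strong = [1] * 40 + [0] * 10 + [0] * 40 + [1] * 10  # label depends on the bin
labels_flat = ([1] * 25 + [0] * 25) * 2                    # label independent of the bin

iv_strong = information_value(bins, labels_strong)
iv_flat = information_value(bins, labels_flat)
keep_strong = iv_strong > 0.02  # threshold quoted in the text
keep_flat = iv_flat > 0.02
```

With these toy labels the first feature clears the 0.02 threshold and the second does not, which is the behavior the screening rule relies on.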
The correlation index is the Pearson correlation coefficient computed between features, used to evaluate the correlation between them. The closer the correlation coefficient of two features is to 0, the weaker their linear correlation; the closer it is to 1 or -1, the stronger. When the correlation coefficient between two features is greater than a set threshold (generally 0.6), the feature with the lower IV is removed.
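One way to apply this rule is a greedy pass over the features in descending IV order, keeping a feature only if it is not strongly correlated with any feature already kept. The sketch below makes two assumptions not stated in the patent: the threshold is applied to the absolute value of the coefficient, and the greedy ordering is used to resolve chains of correlated features. Feature names and values are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(features, iv, threshold=0.6):
    """Keep features in descending IV order, skipping any that correlate
    (in absolute value, an assumption) above the threshold with a kept feature."""
    kept = []
    for name in sorted(features, key=lambda f: iv[f], reverse=True):
        if all(abs(pearson(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

features = {
    "drivers_nearby": [1, 2, 3, 4, 5],
    "idle_drivers_nearby": [2, 4, 6, 8, 11],  # nearly a multiple of drivers_nearby
    "order_price": [5, 1, 4, 2, 3],
}
iv = {"drivers_nearby": 0.30, "idle_drivers_nearby": 0.10, "order_price": 0.05}
kept = drop_correlated(features, iv)
```

Here `idle_drivers_nearby` is dropped because it is almost perfectly correlated with `drivers_nearby` and has the lower IV, while `order_price` survives.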
The stability index is the Population Stability Index (PSI) computed for the feature, used to evaluate the stability of the feature. When the PSI is within the set threshold range (generally 0 to 0.1), the feature has changed little or not at all and meets the stability requirement.
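The PSI check can be illustrated with the usual definition PSI = Σ (a_i − e_i) · ln(a_i / e_i), where e_i and a_i are the shares of samples in bin i for a baseline period and a later period. The patent does not spell out the formula, so this definition, the floor value `eps` and the toy bin counts are assumptions.

```python
import math

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two binned distributions.

    eps is an assumed floor that keeps empty bins from producing log(0).
    """
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        pe = max(e / e_total, eps)  # baseline share of this bin
        pa = max(a / a_total, eps)  # later-period share of this bin
        total += (pa - pe) * math.log(pa / pe)
    return total

baseline = [250, 250, 250, 250]
psi_same = psi(baseline, [250, 250, 250, 250])     # identical distribution
psi_small = psi(baseline, [240, 260, 250, 250])    # slight drift
psi_large = psi(baseline, [100, 100, 100, 700])    # strong drift
stable = [v <= 0.1 for v in (psi_same, psi_small, psi_large)]
```

Only the strongly shifted distribution exceeds the 0.1 bound quoted in the text, so only that feature would fail the stability requirement.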
In the fifth step S25, the sample data set is first divided into a training set and a verification set according to a preset proportion. The training set is used to train the order taking rate prediction model, and the verification set is used to verify the output of the trained model and judge whether it meets the preset requirement. The order taking rate prediction model is then trained on the screened features with a machine learning algorithm such as random forest, XGBoost or decision tree, and the final order taking rate prediction model is obtained after optimization.
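The division into a training set and a verification set can be sketched as a seeded shuffle followed by a cut at the preset proportion. The 80/20 ratio, the seed and the function name are assumptions; the patent only says the proportion is preset.

```python
import random

def split_samples(samples, train_ratio=0.8, seed=0):
    """Shuffle a copy of the samples reproducibly and cut it at train_ratio."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Each sample would be a (features, label) pair; integers stand in for them here.
samples = list(range(10))
train_set, verification_set = split_samples(samples)
```

Because order data is time-dependent, splitting by order time instead of shuffling may give a more realistic estimate of online performance; that is a design choice the patent leaves open.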
Training the model means learning and determining the ideal values of all weights and biases from the labeled samples. The training objective is to minimize the loss function: during training, the machine learning algorithm examines multiple samples and tries to find a model that minimizes the loss.
The training parameters are the parameters used by the machine learning algorithm. Taking a decision tree as an example, they include the depth of the tree, the number of leaf nodes, the minimum number of samples required to split a node, and the like. The parameters are tuned to obtain a better model (a smaller loss) and to improve the accuracy with which the model predicts uncompleted orders.
To obtain the optimal parameters, the simplest approach is to evaluate the loss function over the entire data set for each candidate parameter value and find the convergence point: 1. Compute the loss: the loss under the current parameters (bias, weight) is calculated with the loss function. 2. Update the parameters: the value of the loss function is examined and new values are generated for the parameters (e.g., bias and weight) so as to reduce the loss. The model under the optimal parameters is the optimal model, and its loss is the minimum.
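The compute-loss / update-parameters loop can be made concrete with a one-feature logistic model trained by gradient descent. This is a teaching sketch, not the patent's training procedure: the patent names tree-based algorithms, and the logistic model, learning rate, step count and toy data below are all assumptions chosen because they expose an explicit weight and bias.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(w, b, xs, ys):
    """Mean cross-entropy loss of the model p = sigmoid(w * x + b)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)

def fit(xs, ys, lr=0.1, steps=500):
    """Repeat: 1. compute the loss, 2. update weight and bias along the gradient."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(steps):
        history.append(log_loss(w, b, xs, ys))                       # step 1
        errs = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errs, xs)) / len(xs)
        grad_b = sum(errs) / len(xs)
        w -= lr * grad_w                                             # step 2
        b -= lr * grad_b
    return w, b, history

# Toy data: x = idle drivers nearby, y = 1 if the order was NOT completed.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1, 1, 1, 0, 0, 0]
w, b, history = fit(xs, ys)
```

The learned weight is negative, matching the intuition that more idle drivers nearby make an uncompleted order less likely, and the recorded loss falls from its starting value of ln 2.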
In the sixth step S26, the verification set divided from the order sample data set is used to verify the output of the trained order taking rate prediction model, and it is determined whether the recognition accuracy of the order taking rate prediction model for various risk users reaches a preset threshold. The accuracy of predicting orders that are not completed is measured mainly by two indexes: the precision and the recall for order samples labeled 1. The two indexes are defined as follows: Precision = TP/(TP + FP), Recall = TP/(TP + FN), where TP is the number of samples labeled 1 and predicted to be 1, FP is the number of samples labeled 0 and predicted to be 1, and FN is the number of samples labeled 1 and predicted to be 0.
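The two definitions above can be computed directly from the labels and the model predictions. The following is a minimal sketch; the function name and the example label and prediction lists are hypothetical, and returning 0.0 when a denominator is zero is a convention assumed for the example.

```python
def precision_recall(labels, predictions):
    """Precision and recall for the class labeled 1 (order not completed)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # labeled 1, predicted 1
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # labeled 0, predicted 1
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # labeled 1, predicted 0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical verification-set labels and model predictions.
labels      = [1, 1, 1, 0, 0, 0, 1, 0]
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
precision, recall = precision_recall(labels, predictions)
```

In this example TP = 3, FP = 1 and FN = 1, so both indexes equal 0.75.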
In the seventh step S27, the decision on whether to apply a prepayment or recharge penalty to the order is optimized according to the prediction result of the order taking rate prediction model, so as to improve the experience of normal users whose orders were misjudged by risk control.
Specifically, for the online strategies and models that penalize order calls from users judged to be risky, the order taking rate prediction model is used to judge whether the order will subsequently be completed. If so, the original penalty is kept; otherwise, the original penalty is cancelled or replaced by another penalty. In this way the model-driven penalties are optimized, invalid penalties are reduced, and the users' order-calling experience is improved.
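The keep-or-cancel rule described above can be sketched as a small decision function. The function name, the returned strings, and the 0.5 threshold on the predicted completion probability are assumptions made for illustration; the disclosure does not specify how the prediction is thresholded.

```python
def optimize_penalty(flagged_by_risk_control, p_completed, threshold=0.5):
    """Decide what to do with an existing risk-control penalty.

    flagged_by_risk_control: True if the online strategy already penalizes the order.
    p_completed: predicted probability that the order is taken and not cancelled.
    """
    if not flagged_by_risk_control:
        return "no_penalty"                # nothing to optimize
    if p_completed >= threshold:
        return "keep_original_penalty"     # order is expected to be completed
    return "cancel_or_replace_penalty"     # penalty would be an invalid one

decision_a = optimize_penalty(True, p_completed=0.9)
decision_b = optimize_penalty(True, p_completed=0.1)
```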
FIG. 3 is a schematic data flow diagram of the system for predicting the order taking rate of abnormal orders of online taxi appointment according to the present invention. The system of the present invention is used for executing the method of the present invention and includes, but is not limited to, the following seven modules:
The order sample data set acquisition module 31 is responsible for acquiring, from the mobile phone terminal, order sample data in a plurality of time periods specified by the user. It executes step S21 to obtain the order sample set, which consists of a number of order samples within a certain time period.
The sample data set labeling module 32 labels each order in the sample data set according to a specific situation, that is, executes step S22.
The feature design and development module 33 performs design and development of the model-entry features according to the data such as the target order attribute information, the user historical behavior information, the peripheral driver information, and the environmental information in the order sample set, that is, performs step S23.
The feature screening module 34 screens the model-entry features according to the model-entry features and the labeled sample data set, based on related indexes such as the availability index, the interpretability index, the information amount index, the correlation index, and the stability index, that is, executes step S24. One or more of the related indexes may be used, and the indexes are not limited to those listed above.
The order taking rate prediction model training module 35 trains the order taking rate prediction model with a machine learning algorithm according to the screened features, that is, executes step S25. The training may use GBDT (Gradient Boosting Decision Tree), neural networks, and other algorithms.
The order taking rate prediction model evaluation module 36 verifies the output of the trained order taking rate prediction model, that is, executes step S26. The evaluation may use indexes such as the ROC curve (Receiver Operating Characteristic curve) and the F1 score.
The abnormal order penalty optimization decision module 37 decides whether to apply an order penalty according to the prediction result of the order taking rate prediction model, that is, executes step S27.
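As an illustration of the screening performed by the feature screening module 34, the information amount index and the stability index can be computed from per-bin sample counts. The sketch below uses the commonly used formulas for information value (IV) and population stability index (PSI); the binning, the floor value that avoids a logarithm of zero, and the thresholds `min_iv` and `max_psi` are assumptions made for the example and are not specified by the disclosure.

```python
import math

def _shares(counts, floor=1e-6):
    """Convert per-bin counts to proportions, floored to avoid log(0)."""
    total = sum(counts)
    return [max(c / total, floor) for c in counts]

def information_value(label0_counts, label1_counts):
    """IV of a binned feature: sum over bins of (p1 - p0) * ln(p1 / p0)."""
    p0, p1 = _shares(label0_counts), _shares(label1_counts)
    return sum((b - a) * math.log(b / a) for a, b in zip(p0, p1))

def population_stability_index(expected_counts, actual_counts):
    """PSI between two periods: sum over bins of (a - e) * ln(a / e)."""
    e, a = _shares(expected_counts), _shares(actual_counts)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

def keep_feature(iv, psi, min_iv=0.02, max_psi=0.25):
    """Hypothetical screening rule: predictive enough and stable enough."""
    return iv >= min_iv and psi <= max_psi

# Hypothetical two-bin counts for one feature (e.g. few vs. many nearby drivers).
iv = information_value(label0_counts=[80, 20], label1_counts=[20, 80])
psi = population_stability_index(expected_counts=[50, 50], actual_counts=[50, 50])
selected = keep_feature(iv, psi)
```

A feature whose distribution is identical in both periods has a PSI of zero, and a feature whose bins separate the two labels well has a large IV.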
Fig. 4 shows an exemplary embodiment of the method of the present invention. Applying the present invention to solve a risk control algorithm problem may specifically involve the following four points:
1. obtaining the original data domain, mainly the user event-tracking data, behavior data, device data, order data and the like returned by the passenger side APP and the driver side APP;
2. designing and developing the model-entry features based on the original data domain, including the transformation and screening of the model-entry features;
3. selecting a proper algorithm and training a model;
4. applying the model to a specific scenario.
Specifically, assume that a company's system contains a database from which the original sample data can be obtained, and that every order is recorded in this database. When a method for predicting the order taking rate of abnormal orders of online taxi appointment needs to be developed and such a model needs to be trained, the database can be exported and used as original sample data containing, for example, ten million records. This sample database contains a large amount of order data that can be used to train the model. In training the model, the more sample data is input, the better. A machine (such as a computer) can analyze a large amount of sample data and find the patterns by itself, thereby designing (learning) an order taking rate prediction model that could not be worked out manually.
The order taking rate prediction model is evaluated by selecting a number of original records and inputting them into the model. The output of the model is then compared with what actually happened: for example, if the model predicts that an order will not be completed, this prediction is compared with the actual outcome. If the predicted outcome (completed or not completed) is consistent with the actual outcome, the prediction of the model is correct. For example, by verifying 1000 orders in this way, the correct rate of the model can be determined.
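The comparison of model output with actual outcomes can be sketched as follows, together with the area under the ROC curve that the evaluation module 36 may use. The AUC is computed here by its pairwise definition (the share of label-1 and label-0 sample pairs that the score ranks correctly, ties counting one half). The 0.5 decision threshold and the example scores are hypothetical.

```python
def correct_rate(labels, predictions):
    """Share of orders whose predicted outcome matches the actual outcome."""
    return sum(1 for y, p in zip(labels, predictions) if y == p) / len(labels)

def roc_auc(labels, scores):
    """Pairwise AUC: chance that a label-1 sample outscores a label-0 sample."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical actual labels (1 = not completed) and model scores for five orders.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.8]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

rate = correct_rate(labels, predictions)
auc = roc_auc(labels, scores)
```

Here four of the five predictions match the actual outcome, and five of the six label-1/label-0 pairs are ranked correctly.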
After the order taking rate prediction model is determined and the method is applied in actual operation, when an order comes in, the user's name, telephone number, past order behavior, background information and the like can be looked up in the database. Together with the time, place, actual circumstances and the like of the current order, this information forms the complete information of the real-time order.
The real-time order information is input into the order taking rate prediction model for judgment, so that the success rate of order taking can be predicted and it can be decided whether the order should be taken. If the risk predicted by the order taking rate prediction model is very high, the order is not taken directly (several handling options are possible: 1. the user must first recharge; 2. the user must first pay a deposit; 3. the order is not taken; and so on). If the risk predicted by the model is very low, the order is taken immediately. This is the actual application.
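The real-time flow described above (assembling the complete order information, scoring it, and handling the order according to the predicted risk) can be sketched as follows. All names, field keys, the two thresholds, and the stand-in scoring function are hypothetical; in particular, the handling of risk values between "very low" and "very high" is not described in the disclosure and is shown here only as an assumed default path.

```python
def build_realtime_record(user_history, current_order):
    """Merge stored user information with the details of the incoming order."""
    record = dict(user_history)
    record.update(current_order)
    return record

def route_order(predicted_risk, low=0.2, high=0.8):
    """Map the predicted risk to a handling option."""
    if predicted_risk >= high:
        return "restrict"            # e.g. recharge first, deposit first, or decline
    if predicted_risk <= low:
        return "take_immediately"
    return "default_flow"            # assumption: middle band follows the normal flow

def stub_model(record):
    """Hypothetical stand-in for the trained order taking rate prediction model."""
    return 0.9 if record["past_cancellations"] > 5 else 0.1

record = build_realtime_record(
    {"user_id": "u1", "past_cancellations": 7},   # looked up in the database
    {"pickup": "A", "hour": 23},                  # details of the current order
)
action = route_order(stub_model(record))
```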
In summary, the present invention designs and develops features based on driver status information and user order information. It uses driver data (such as driver position information, driver service status information, driver online time, the number of orders taken by the driver, the driver's ratio of orders taken to orders dispatched, and the distance from the driver to the order starting point), order data (such as order price, order start and end positions, the number of users calling orders around the order, the number of drivers around the order, and the order taking rate around the order), and environmental factors (such as traffic conditions and weather) as model-entry features that describe and influence order taking, so as to improve the accuracy of the order taking rate prediction model.
Based on the developed features, a machine learning algorithm is used to train the order taking rate prediction model, which predicts the probability that an order is taken by a driver and is not subsequently cancelled. The decision on whether to apply a prepayment or recharge penalty to the order is thereby optimized, and the experience of normal users whose orders were misjudged by risk control is improved. Therefore, the invention effectively overcomes various defects in the prior art and has high industrial utilization value.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims (11)

1. A method for predicting the order receiving rate of abnormal orders of online taxi appointment, comprising the following steps:
a data acquisition step, wherein an order sample data set is acquired, and the order sample data set comprises database data of a passenger side APP and a driver side APP;
a step of determining model-entry features, wherein labels are marked based on the order sample data set and the model-entry features are determined, including screening of the model-entry features;
a model training step, wherein an order taking rate prediction model is obtained through training and evaluation;
and a model application step, wherein order information to be predicted is input into the order taking rate prediction model, a prediction of the order taking rate is output, and the decision on risk prevention measures is optimized according to the prediction result.
2. The method for predicting the order taking rate of the online taxi appointment abnormal order according to claim 1, wherein in the data obtaining step, the database data of the passenger side APP and the driver side APP comprises one or more of the following information: target order attribute information, user historical behavior information, peripheral driver information, and environmental information.
3. The method of claim 1, wherein the step of determining the model-entry features comprises:
marking each order with an order taking label according to the specific situation of its order taking result in the sample data set;
determining the model-entry features according to target order related information, wherein the target order related information comprises target order attribute information, user historical behavior information, peripheral driver information, environment information and the like;
and screening the model-entry features based on the relevant indexes.
4. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 3, wherein: in the step of marking each order with a label according to the specific situation of its order taking result in the sample data set, if a first specific situation occurs, namely after the target user places the order, the order is taken by a driver and is not subsequently cancelled, the order is marked 0 in the corresponding sample data set; if a second specific situation occurs, namely the order is not completed after the target user places it, that is, the order is not taken by a driver or is cancelled after being taken by a driver, the order is marked 1 in the corresponding sample data set.
5. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 3, wherein: the relevant indexes include an availability index, an interpretability index, an information content index, a relevance index and a stability index.
6. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 5, wherein:
the availability index is used for evaluating whether the feature can be reproduced online;
the interpretability index is used for evaluating whether the feature can explain the final result;
the correlation index is the Pearson correlation coefficient calculated for the features and is used for evaluating the correlation among the features;
the information quantity index is the information value (IV) calculated for the feature and is used for evaluating the predictive capability of the feature;
the stability index is the population stability index (PSI) calculated for the feature and is used for assessing the stability of the feature.
7. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 1, wherein: the training model step comprises:
training the order taking rate prediction model by adopting a machine learning algorithm according to the screened model-entry features;
evaluating the order taking rate prediction model, and judging the accuracy of the order taking rate prediction model in predicting whether the order is completed.
8. The method of claim 7, wherein the model-entry features obtained after screening comprise: order features, user features, driver data, traffic conditions, and weather features.
9. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 7, wherein: the machine learning algorithm can be a random forest algorithm, an XGBoost algorithm or a decision tree algorithm.
10. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 7, wherein: in the step of training the model, the order sample data set is divided into a training set and a verification set according to a preset proportion; the order taking rate prediction model is trained on the data of the training set by adopting the machine learning algorithm according to the screened model-entry features; and the data of the verification set is used for verifying the output result of the trained order taking rate prediction model so as to judge whether the trained order taking rate prediction model meets the preset requirement.
11. A pick-up rate prediction system for network taxi appointment exception orders, the system performing the method of claims 1 to 10, comprising:
the order sample data set acquisition module acquires an order sample data set;
the sample data set marking module is used for marking order receiving labels on the orders in the order sample data set according to a specific situation;
the feature design and development module is used for designing and developing the model-entry features according to the target order attribute information, the user historical behavior information, the peripheral driver information, the environmental information and other data in the order sample data set;
the feature screening module is used for screening the model-entry features according to the model-entry features and the labeled order sample data set, based on one or more of the following indexes: an availability index, an interpretability index, an information content index, a correlation index and a stability index;
the order taking rate prediction model training module divides the order sample data set into a training set and a verification set according to a preset proportion, then trains an order taking rate prediction model according to the screened model-entry features by adopting a machine learning algorithm, optimizes parameters of the order taking rate prediction model and finally obtains the order taking rate prediction model;
the order taking rate prediction model evaluation module is used for verifying the output result of the trained order taking rate prediction model by using the verification set divided by the order sample data set and judging whether the recognition accuracy of the order taking rate prediction model on various risk users reaches a preset threshold value or not;
and the abnormal order penalty optimization decision module optimizes the decision on risk prevention measures according to the prediction result of the order taking rate prediction model.
CN202211071447.9A 2022-09-02 2022-09-02 Order receiving rate prediction method and system for abnormal orders of online taxi appointment Pending CN115545276A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211071447.9A CN115545276A (en) 2022-09-02 2022-09-02 Order receiving rate prediction method and system for abnormal orders of online taxi appointment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211071447.9A CN115545276A (en) 2022-09-02 2022-09-02 Order receiving rate prediction method and system for abnormal orders of online taxi appointment

Publications (1)

Publication Number Publication Date
CN115545276A true CN115545276A (en) 2022-12-30

Family

ID=84726552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211071447.9A Pending CN115545276A (en) 2022-09-02 2022-09-02 Order receiving rate prediction method and system for abnormal orders of online taxi appointment

Country Status (1)

Country Link
CN (1) CN115545276A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116384989A (en) * 2023-06-05 2023-07-04 北京龙驹易行科技有限公司 Order payment method, device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112700072B (en) Traffic condition prediction method, electronic device, and storage medium
US20220340148A1 (en) Method for estimating an accident risk of an autonomous vehicle
CN116504076A (en) Expressway traffic flow prediction method based on ETC portal data
CN111582559A (en) Method and device for estimating arrival time
CN114547128A (en) False order identification method, false order identification system, computer equipment and storage medium
CN110910180A (en) Information pushing method and device, electronic equipment and storage medium
Karamizadeh et al. Using the clustering algorithms and rule-based of data mining to identify affecting factors in the profit and loss of third party insurance, insurance company auto
CN115622203A (en) Analysis reminding method and system based on charging data of vehicle-mounted wireless charger
CN111145006A (en) Automobile financial anti-fraud model training method and device based on user portrait
CN115545276A (en) Order receiving rate prediction method and system for abnormal orders of online taxi appointment
CN116432810A (en) Traffic flow prediction model determination method, device, apparatus and readable storage medium
CN113096405B (en) Construction method of prediction model, and vehicle accident prediction method and device
CN114418748A (en) Vehicle credit evaluation method, device, equipment and storage medium
US11176502B2 (en) Analytical model training method for customer experience estimation
CN116882755A (en) Method for predicting oil theft risk of trucking route based on vehicle-mounted Tbox data
CN115907898A (en) Method for recommending financial products to reinsurance client and related equipment
CN114066288B (en) Intelligent data center-based emergency detection method and system for operation road
CN112560953B (en) Private car illegal operation identification method, system, equipment and storage medium
CN111489171B (en) Riding travel matching method and device based on two-dimensional code, electronic equipment and medium
CN115271826A (en) Logistics line price interval prediction method and device
CN112070570B (en) Intelligent driver end network vehicle-closing operation method
CN113870020A (en) Overdue risk control method and device
CN114548463A (en) Line information prediction method, line information prediction device, computer equipment and storage medium
Siaminamini et al. Generating a risk profile for car insurance policyholders: A deep learning conceptual model
CN115456713A (en) Unpaid reminding method and system for abnormal order of online taxi appointment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination