CN115545276A

CN115545276A - Order receiving rate prediction method and system for abnormal orders of online taxi appointment

Info

Publication number: CN115545276A
Application number: CN202211071447.9A
Authority: CN
Inventors: 李玉柱; 史彬; 凌国沈; 田舟贤; 史何富; 强琦
Original assignee: Zhejiang Geely Holding Group Co Ltd; Hangzhou Youxing Technology Co Ltd
Current assignee: Zhejiang Geely Holding Group Co Ltd; Hangzhou Youxing Technology Co Ltd
Priority date: 2022-09-02
Filing date: 2022-09-02
Publication date: 2022-12-30

Abstract

The invention provides an order-taking rate prediction method and system for abnormal online ride-hailing orders. The method comprises: a data acquisition step, in which an order sample data set is acquired, the order sample data set comprising database data from a passenger-side APP and a driver-side APP; a step of determining model-input features, in which order-taking labels are marked based on the order sample data set and the model-input features are determined, including screening of the model-input features; a model training step, in which an order-taking rate prediction model is obtained through training and evaluation; and a model application step, in which order information to be predicted is input into the order-taking rate prediction model, a predicted order-taking rate is output, and decisions on risk-prevention measures are optimized according to the prediction result. The system comprises a sample acquisition module, a sample labeling module, a feature development module, a feature screening module, a model training module, a model evaluation module and an identification module. The method and system improve the accuracy of predicting the probability that an abnormal order is taken, and improve the experience of users whose normal orders are misjudged by risk control.

Description

Order-taking rate prediction method and system for abnormal online ride-hailing orders

Technical Field

The invention relates to the field of Internet-based online ride-hailing, and in particular to a method and a system for predicting the order-taking rate of abnormal online ride-hailing orders.

Background

With the combination of mobile communication technology and travel services, online ride-hailing on a mobile terminal has become one of the important choices for people when they travel.

Current online ride-hailing services mostly adopt a ride-first, pay-later operating mode. This service mode inevitably produces a large number of abnormal orders that remain unpaid for a long time, causing substantial capital losses for the platform. The usual penalty for an order identified as abnormal is to require prepayment or to impose a recharge penalty. Such penalties more or less degrade the ride-hailing experience of normal users who have been wrongly identified, reducing their willingness to request a ride again and thereby reducing the platform's transaction revenue. Providing passengers with better travel service and experience while reducing the platform's capital loss from unpaid orders is therefore particularly important in travel services.

Research in the prior art generally aims to raise the accuracy of abnormal-order identification as far as possible so as to reduce the impact on normal orders. For example, policy rules are formulated from expert experience to judge whether an order is abnormal, or a machine learning model is trained to identify abnormal orders based on order attributes and user behavior, and prepayment or recharge penalties are then applied directly to the identified abnormal orders. This way of penalizing does not consider whether the current order will be taken by a driver, that is, whether the penalty is actually effective. If the order is not taken by a driver, it causes no unpaid capital loss to the platform, and there is no need to apply a prepayment or recharge penalty to it. How to accurately and effectively identify the abnormal orders that will be taken by drivers, and penalize only those, is therefore an urgent problem in current travel services.

Disclosure of Invention

In view of the above disadvantages of the prior art, an object of the present invention is to provide a method and a system for predicting the order-taking rate of abnormal online ride-hailing orders, so as to solve the problem in the prior art that the probability of an abnormal order being taken is not predicted accurately enough, which results in a poor experience for users whose normal orders are misjudged by risk control.

To achieve the above object and other related objects, the present invention provides a method and a system for predicting the order-taking rate of abnormal online ride-hailing orders, in which the order-taking rate of an abnormal order is predicted based on driver status information and user order information. The method can effectively identify the abnormal orders that will be taken by a driver and improves the accuracy of predicting the probability that an abnormal order is taken, thereby improving the experience of users whose normal orders are misjudged by risk control.

In an embodiment of the present invention, a method for predicting the order-taking rate of abnormal online ride-hailing orders includes:

a data acquisition step, in which an order sample data set is acquired, the order sample data set comprising database data from a passenger-side APP and a driver-side APP;

a step of determining model-input features, in which order-taking labels are marked based on the order sample data set and the model-input features are determined, including screening of the model-input features;

a model training step, in which an order-taking rate prediction model is obtained through training and evaluation;

and a model application step, in which order information to be predicted is input into the order-taking rate prediction model, a predicted order-taking rate is output, and decisions on risk-prevention measures are optimized according to the prediction result.

In an embodiment of the invention, in the data acquisition step, the database data of the passenger-side APP and the driver-side APP includes one or more of the following: target order attribute information, user historical behavior information, surrounding driver information, and environmental information.

In an embodiment of the present invention, the step of determining the model-input features includes:

labeling each order in the sample data set with an order-taking label according to the specific situation of its order-taking result;

determining the model-input features according to information related to the target order, the information comprising target order attribute information, user historical behavior information, surrounding driver information, environmental information and the like;

and screening the model-input features based on relevant indexes.

In an embodiment of the present invention, in the step of labeling each order in the sample data set according to the specific situation of its order-taking result: if a first specific situation occurs, that is, after the target user places the order it is taken by a driver and is not subsequently cancelled, the order is labeled 0 in the corresponding sample data set; if a second specific situation occurs, that is, the order is not completed after the target user places it, either because no driver takes it or because it is cancelled after a driver has taken it, the order is labeled 1 in the corresponding sample data set.

In an embodiment of the invention, the relevant indexes include an availability index, an interpretability index, an information value index, a correlation index and a stability index.

In an embodiment of the present invention:

the availability index is used to evaluate whether the feature can be reproduced online;

the interpretability index is used to evaluate whether the feature can explain the final result;

the correlation index is the Pearson correlation coefficient computed between features and is used to evaluate the correlation among the features;

the information value index is the information value (IV) computed for the feature and is used to evaluate the predictive power of the feature;

the stability index is the population stability index (PSI) computed for the feature and is used to evaluate the stability of the feature.

In an embodiment of the present invention, the model training step includes:

training the order-taking rate prediction model with a machine learning algorithm on the screened model-input features;

evaluating the order-taking rate prediction model to determine how accurately it predicts whether an order is completed.

In an embodiment of the present invention, the model-input features obtained after screening include: order features, user features, driver data, traffic conditions, and weather features.

In an embodiment of the present invention, the machine learning algorithm may be a random forest algorithm, an XGBoost algorithm, or a decision tree algorithm.

In an embodiment of the present invention, in the model training step, the order sample data set is divided into a training set and a verification set according to a preset proportion; the order-taking rate prediction model is trained on the training set with the machine learning algorithm, using the screened model-input features; and the verification set is used to verify the output of the trained order-taking rate prediction model so as to judge whether the trained model meets the preset requirement.

In an embodiment of the present invention, a system for predicting the order-taking rate of abnormal online ride-hailing orders, which executes the above method, includes:

an order sample data set acquisition module, which acquires an order sample data set;

a sample data set labeling module, which marks order-taking labels on the orders in the order sample data set according to the specific situation;

a feature design and development module, which designs and develops the model-input features according to the target order attribute information, user historical behavior information, surrounding driver information, environmental information and other data in the order sample data set;

a feature screening module, which screens the model-input features, based on the model-input features and the labeled order sample data set, using one or more of the following indexes: an availability index, an interpretability index, an information value index, a correlation index and a stability index;

an order-taking rate prediction model training module, which divides the order sample data set into a training set and a verification set according to a preset proportion, trains an order-taking rate prediction model with a machine learning algorithm on the screened model-input features, tunes the parameters of the model, and finally obtains the order-taking rate prediction model;

an order-taking rate prediction model evaluation module, which verifies the output of the trained order-taking rate prediction model using the verification set divided from the order sample data set, and judges whether the identification accuracy of the model for the various types of risk users reaches a preset threshold;

and an abnormal order penalty decision optimization module, which optimizes decisions on risk-prevention measures according to the prediction result of the order-taking rate prediction model.

As described above, the method and system for predicting the order-taking rate of abnormal online ride-hailing orders have the following advantages: driver status information, order information, user information, environmental factors and other data are all taken into account when constructing the model-input features, which improves the prediction accuracy of the order-taking rate prediction model. In addition, the decision on whether to apply a prepayment or recharge penalty to an order is optimized based on the order-taking rate prediction model, which improves the experience of users whose normal orders are misjudged by risk control.

Drawings

FIG. 1 is a schematic diagram of the steps of the method for predicting the order-taking rate of abnormal online ride-hailing orders according to the present invention.

FIG. 2 is a schematic diagram of the steps of the method for predicting the order-taking rate of abnormal online ride-hailing orders according to a preferred embodiment of the present invention.

FIG. 3 is a schematic data flow diagram of the system for predicting the order-taking rate of abnormal online ride-hailing orders according to the present invention.

FIG. 4 is a block diagram of an application of the system for predicting the order-taking rate of abnormal online ride-hailing orders according to the present invention.

Detailed Description

The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the features in the following embodiments and examples may be combined with each other without conflict. It is also to be understood that the terminology used in the examples herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention. Test methods in which specific conditions are not specified in the following examples are generally carried out under conventional conditions or under conditions recommended by the respective manufacturers.

Please refer to FIG. 1 to FIG. 4. It should be understood that the structures, ratios, sizes and the like shown in the drawings of this specification are only intended to accompany the content disclosed in the specification so that it can be understood and read by those skilled in the art; they are not intended to limit the conditions under which the present invention can be implemented and therefore have no substantive technical significance. Any structural modification, change of ratio or adjustment of size that does not affect the effects and purposes achievable by the present invention still falls within the scope covered by the technical content disclosed herein. In addition, terms such as "upper", "lower", "left", "right", "middle" and "one" are used in this specification only for clarity of description and are not intended to limit the implementable scope of the present invention; changes or adjustments of their relative relationships, without substantive change to the technical content, are likewise regarded as within the implementable scope of the present invention.

Referring to FIG. 1 to FIG. 4, as a preferred embodiment, the present invention provides a method for predicting the order-taking rate of abnormal online ride-hailing orders. As shown in FIG. 1, the method mainly includes the following steps:

S1: a data acquisition step, in which an order sample data set is acquired, the order sample data set comprising database data from a passenger-side APP and a driver-side APP;

S2: a step of determining model-input features, in which order-taking labels are marked based on the order sample data set and the model-input features are determined, including screening of the model-input features;

S3: a model training step, in which an order-taking rate prediction model is obtained through training and evaluation;

S4: a model application step, in which order information to be predicted is input into the order-taking rate prediction model, a predicted order-taking rate is output, and decisions on risk-prevention measures are optimized according to the prediction result.

In the data acquisition step S1, a user hails a ride on the platform using the passenger-side APP or mini-program, which generates user data. Likewise, a driver taking orders through the driver-side APP generates driver data. Features can be designed and developed from these data to obtain feature data.

In the step S2 of determining the model-input features, features are designed and developed from the driver information, order information and the like around the target order. Features may be designed from the available data on the basis of expert experience. For example, a hexagonal area is drawn around the order, the number of users who requested a ride within that area in the past 10 minutes is counted, and the number of drivers within the area is counted, yielding features such as "number of users requesting rides around the order" and "number of idle drivers around the order".
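The counting features described above can be sketched as follows. This is only an illustrative sketch: the record types, the field names and the assumption that every event already carries the ID of the hexagonal cell it falls in (the hypothetical `cell` field) are not taken from the specification.

```python
from dataclasses import dataclass

@dataclass
class RideRequest:
    user_id: str
    cell: str   # hypothetical ID of the hexagonal cell containing the pickup point
    ts: float   # request time in seconds

@dataclass
class DriverState:
    driver_id: str
    cell: str   # hypothetical ID of the hexagonal cell the driver is in
    idle: bool

def nearby_features(order_cell, order_ts, requests, drivers, window_s=600):
    """Count distinct requesting users in the last `window_s` seconds
    and idle drivers, both restricted to the order's cell."""
    users = {
        r.user_id
        for r in requests
        if r.cell == order_cell and order_ts - window_s <= r.ts <= order_ts
    }
    idle = sum(1 for d in drivers if d.cell == order_cell and d.idle)
    return {"nearby_requesting_users": len(users), "nearby_idle_drivers": idle}
```

The 600-second default corresponds to the 10-minute window mentioned in the text; a production system would more likely maintain these counts incrementally than rescan all events.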

In the model training step S3, the sample data set is divided into a training set and a verification set according to a preset proportion. The training set is used to train the order-taking rate prediction model, and the verification set is used to verify the output of the trained model and judge whether it meets the preset requirement. The order-taking rate prediction model is then trained on the screened features with a machine learning algorithm such as random forest, XGBoost (eXtreme Gradient Boosting) or a decision tree, and the final order-taking rate prediction model is obtained after tuning.

In the model application step S4, the order-taking rate prediction model is applied to a specific scenario. An order to be predicted is first input into the model to obtain a prediction result, and the prediction result is then used to decide whether risk-prevention measures are needed, that is, whether measures such as a prepayment or recharge penalty should be applied to the order.

The method in a preferred embodiment of the invention (as shown in fig. 2) comprises the steps of:

S21: acquiring an order sample data set, wherein the order sample data set comprises target order attribute information, user historical behavior information, surrounding driver information and environmental information;

S22: labeling the sample data set according to the specific situation;

S23: designing and developing features according to the driver information, order information and the like around the target order;

S24: screening the model-input features based on relevant indexes;

S25: training an order-taking rate prediction model with a machine learning algorithm on the screened features;

S26: evaluating the order-taking rate prediction model to determine how accurately it predicts whether an order is completed;

S27: optimizing the decision on whether to penalize the order according to the model's prediction of whether the order will be completed.

In the first step S21, an order sample set is acquired, consisting of a number of order samples from a certain time period.

Specifically, the sample set mainly includes target order attribute information, user historical behavior information, information on drivers around the target order, and environmental information. The related information includes but is not limited to: positions of surrounding drivers, driver service status, driver online time, order placement time, the user's position when placing the order, order price, order origin and destination, traffic conditions, weather and the like. Here the user refers to the passenger placing the order. The data come from the user data generated when users hail rides on the platform through the passenger-side APP or mini-program, and from the driver data generated when drivers take orders through the driver-side APP.

In a second step S22, each order in the sample data set is labeled according to the specific situation of the order taking result.

Specifically, a label column is added to the sample data set. If the first specific situation occurs, that is, after the target user places the order it is taken by a driver and is not subsequently cancelled, the order is labeled 0 in the corresponding sample data set; if the second specific situation occurs, that is, the order is not completed after the target user places it, either because no driver takes it or because it is cancelled after a driver has taken it, the order is labeled 1 in the corresponding sample data set.
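The labeling rule can be written as a small function. The sketch below assumes each order record is a dictionary with two hypothetical boolean fields, `taken` and `cancelled`; these names are illustrative only.

```python
def order_label(taken_by_driver: bool, cancelled_after_taking: bool) -> int:
    """Return 0 for an order that was taken and not cancelled, 1 otherwise."""
    return 0 if taken_by_driver and not cancelled_after_taking else 1

def add_label_column(orders):
    """Return copies of the order records with a `label` field added."""
    return [
        {**o, "label": order_label(o["taken"], o["cancelled"])}
        for o in orders
    ]
```

Note that label 1 covers both outcomes in which the order is not completed: never taken, or taken and then cancelled.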

In the third step S23, the model-input features are designed and developed from data such as target order attribute information, user historical behavior information, surrounding driver information and environmental information; that is, the model-input features are determined according to the information related to the target order.

Specifically, the features developed include but are not limited to: order features such as the number of users requesting rides around the order, the number of drivers around the order, the order-taking ratio around the order, and the discounted order amount; user features such as the user's number of ride requests, number of completed orders, order completion ratio, number of cancellations, and historical waiting time before cancellation; driver data such as the number of surrounding drivers, the online time of surrounding drivers, the number of orders taken by surrounding drivers, the ratio of orders dispatched to orders taken for surrounding drivers, and the distance between surrounding drivers and the order origin; as well as traffic conditions, weather features and the like.

In the fourth step S24, the model-input features are screened based on relevant indexes including, but not limited to, an availability index, an interpretability index, an information value index, a correlation index and a stability index.

Specifically, the availability index requires comprehensive consideration of aspects such as product flow design, user authorization agreements, compliance requirements and the stage at which the model is applied, in order to determine whether the feature data will remain continuously available. In the present invention it refers to whether the feature can be reproduced online; for example, a feature whose development cost is too high will most likely be discarded.

The interpretability index means that the business logic of the feature must be clear and the feature must be interpretable in business terms. In the present invention it refers to whether the feature can explain the final result; for example, if the effect of a feature runs contrary to common sense (usually as judged by the business side), the feature may be discarded.

The information value index is the information value (IV) computed for the feature and is used to evaluate the predictive power of the feature. In general, the higher the IV, the stronger the predictive power. When the IV of a feature is greater than a set threshold (generally 0.02), the feature has predictive power and meets the requirement for entering the model.
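As an illustration, IV is commonly computed over a binned feature as the sum over bins of (p_i - q_i) * ln(p_i / q_i), where p_i and q_i are the shares of label-1 and label-0 samples falling in bin i. The sketch below assumes the feature has already been binned and adds a small smoothing constant `eps` so that empty bins do not cause a division by zero; the binning and the smoothing are assumptions, not details given in the specification.

```python
import math
from collections import Counter

def information_value(bins, labels, eps=0.5):
    """IV of a binned feature. `bins` holds a bin ID per sample, `labels` holds 0/1."""
    pos = Counter(b for b, y in zip(bins, labels) if y == 1)
    neg = Counter(b for b, y in zip(bins, labels) if y == 0)
    keys = set(pos) | set(neg)
    # Smoothed totals so that the smoothed shares still sum to 1.
    n_pos = sum(pos.values()) + eps * len(keys)
    n_neg = sum(neg.values()) + eps * len(keys)
    iv = 0.0
    for k in keys:
        p = (pos[k] + eps) / n_pos   # share of label-1 samples in bin k
        q = (neg[k] + eps) / n_neg   # share of label-0 samples in bin k
        iv += (p - q) * math.log(p / q)
    return iv

def passes_iv(bins, labels, threshold=0.02):
    """Screening rule from the text: keep the feature when IV exceeds the threshold."""
    return information_value(bins, labels) > threshold
```

Every term of the sum is non-negative, so IV is zero only when the two label groups are distributed identically across the bins.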

The correlation index is the Pearson correlation coefficient computed between features and is used to evaluate the correlation between them. The closer the correlation coefficient of two features is to 0, the weaker their linear correlation; the closer it is to 1 or -1, the stronger their linear correlation. When the correlation coefficient between two features is greater than a set threshold (generally 0.6), the feature with the lower IV is removed.
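A minimal sketch of this screening rule follows. It assumes the IV of every feature has already been computed and compares the absolute value of the coefficient with the threshold, so that strong negative correlation is treated like strong positive correlation; that reading, and the greedy order of processing, are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def drop_correlated(columns, iv, threshold=0.6):
    """Greedy filter: visit features from highest to lowest IV and keep a feature
    only if it is not strongly correlated with any feature already kept.

    columns: feature name -> list of values; iv: feature name -> IV.
    """
    kept = []
    for name in sorted(columns, key=lambda n: iv[n], reverse=True):
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

Visiting features in descending IV order means that, of any strongly correlated pair, the one with the lower IV is the one removed.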

The stability index is the Population Stability Index (PSI) computed for the feature and is used to evaluate the stability of the feature. When the PSI falls within the set threshold range (generally 0 to 0.1), the distribution of the feature has changed little or not at all, and the stability requirement is met.
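PSI is commonly computed by comparing the binned distribution of a feature in a baseline sample with its distribution in a later sample: the sum over bins of (a_i - e_i) * ln(a_i / e_i), where e_i and a_i are the baseline and later shares in bin i. The sketch below assumes the feature is already binned and floors empty-bin shares at a small `eps`; both choices are assumptions.

```python
import math
from collections import Counter

def psi(expected_bins, actual_bins, eps=1e-6):
    """Population stability index between a baseline and a later sample,
    each given as a list of bin IDs (one per record)."""
    e, a = Counter(expected_bins), Counter(actual_bins)
    total = 0.0
    for k in set(e) | set(a):
        pe = max(e[k] / len(expected_bins), eps)  # baseline share of bin k
        pa = max(a[k] / len(actual_bins), eps)    # later share of bin k
        total += (pa - pe) * math.log(pa / pe)
    return total

def is_stable(expected_bins, actual_bins, upper=0.1):
    """Screening rule from the text: PSI within 0 to 0.1 counts as stable."""
    return psi(expected_bins, actual_bins) <= upper
```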

In the fifth step S25, the sample data set is first divided into a training set and a verification set according to a preset proportion. The training set is used to train the order-taking rate prediction model, and the verification set is used to verify the output of the trained model and judge whether it meets the preset requirement. The order-taking rate prediction model is then trained on the screened features with a machine learning algorithm such as a random forest algorithm, an XGBoost algorithm or a decision tree algorithm, and the final order-taking rate prediction model is obtained after tuning.
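The split by a preset proportion might look like the following sketch. The 80/20 ratio and the fixed random seed are illustrative assumptions; the specification does not state the proportion.

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle a copy of the samples and split it into (training, verification) lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

The training list would then be passed to whichever learner is chosen (random forest, XGBoost, decision tree), and the verification list is held back for the evaluation step.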

Training the model means learning, from labeled samples, the ideal values of all weights and biases. The training objective is to minimize the loss function: during training, the machine learning algorithm examines many samples and tries to find the model that minimizes the loss.

The training parameters are the parameters used by the machine learning algorithm. Taking a decision tree as an example, they include the depth of the tree, the number of leaf nodes, the minimum number of samples required to split a node, and so on. The parameters are tuned to obtain a better model (a smaller loss), which improves the accuracy with which the model predicts orders that will not be completed.

To obtain the optimal parameters, the simplest approach is to evaluate the loss function over the entire data set for each candidate parameter value until a convergence point is found: 1. Compute the loss: the loss under the current parameters (bias, weight) is computed with the loss function. 2. Update the parameters: the value of the loss function is examined and new values are generated for the parameters (for example the bias and weight) so as to reduce the loss. The model under the optimal parameters is the optimal model, and its loss is the minimum.
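The two-step loop of computing the loss and updating the parameters can be illustrated with a deliberately simple model. The sketch below uses a one-feature logistic regression trained by gradient descent, chosen only because it has an explicit weight and bias; it is not one of the tree-based algorithms named in the specification, and the learning rate and step count are arbitrary.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(w, b, xs, ys):
    """Mean logistic loss of the model p = sigmoid(w * x + b) on the data."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)

def fit(xs, ys, lr=0.5, steps=200):
    """Repeat: (1) compute the loss, (2) update weight and bias to reduce it."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(steps):
        history.append(log_loss(w, b, xs, ys))            # step 1: compute the loss
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / len(xs)
        grad_b = sum(errors) / len(xs)
        w -= lr * grad_w                                   # step 2: update parameters
        b -= lr * grad_b
    return w, b, history
```

The returned `history` records the loss before each update, which makes it easy to check that the loss is actually falling toward a convergence point.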

In the sixth step S26, the verification set divided from the order sample data set is used to verify the output of the trained order-taking rate prediction model, and it is judged whether the identification accuracy of the model for the various types of risk users reaches a preset threshold. The accuracy of predicting orders that will not be completed is measured mainly by two indexes: the precision and the recall on the order samples labeled 1. They are defined as follows: precision = TP/(TP + FP), recall = TP/(TP + FN), where TP is the number of samples labeled 1 and predicted as 1; FP is the number of samples labeled 0 and predicted as 1; and FN is the number of samples labeled 1 and predicted as 0.
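These two definitions translate directly into code. In the sketch below, returning 0.0 when a denominator is zero is a convention chosen for the example, not something stated in the specification.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the class labeled 1 (order not completed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For example, with true labels [1, 1, 1, 0, 0] and predictions [1, 1, 0, 1, 0], TP = 2, FP = 1 and FN = 1, so precision and recall are both 2/3.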

In the seventh step S27, the decision on whether to apply a prepayment or recharge penalty to an order is optimized according to the prediction result of the order-taking rate prediction model, so as to improve the experience of users whose normal orders are misjudged by risk control.

Specifically, for the strategies and models already deployed online that penalize ride requests from users judged to be risky, the order-taking rate prediction model is used to judge whether the order will subsequently be completed. If so, the original penalty is kept; otherwise, the original penalty is cancelled or replaced with another penalty. This optimizes the penalty decision, reduces ineffective penalties, and improves the user's ride-hailing experience.
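The decision logic can be sketched as below. It assumes the model outputs the probability of label 1 (order not completed) and that a cut-off of 0.5 separates the two predictions; the function name, the cut-off and the returned strings are all illustrative.

```python
def decide_penalty(flagged_abnormal: bool, p_not_completed: float, threshold: float = 0.5) -> str:
    """Combine the existing risk-control flag with the predicted completion outcome."""
    if not flagged_abnormal:
        return "no_penalty"                 # the existing risk control did not flag the order
    if p_not_completed < threshold:
        return "keep_penalty"               # order is predicted to be completed
    return "waive_or_replace_penalty"       # order is unlikely to be completed
```

An order that is flagged but unlikely to be completed causes no unpaid loss, so penalizing it would only harm the user experience.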

FIG. 3 is a schematic data flow diagram of the system for predicting the order-taking rate of abnormal online ride-hailing orders according to the present invention. The system is used to execute the method of the present invention and includes, but is not limited to, the following seven modules:

The order sample data set acquisition module 31 is responsible for acquiring, from the mobile phone terminal, order sample data for the time periods specified by the user; that is, it executes step S21 to obtain an order sample set consisting of a number of order samples from a certain time period.

The sample data set labeling module 32 labels each order in the sample data set according to the specific situation, that is, executes step S22.

The feature design and development module 33 designs and develops the model-input features from data such as the target order attribute information, user historical behavior information, surrounding driver information and environmental information in the order sample set, that is, executes step S23.

The feature screening module 34 screens the model-input features, based on the model-input features and the labeled sample data set, using relevant indexes such as the availability index, interpretability index, information value index, correlation index and stability index, that is, executes step S24. One or more indexes may be used, and they are not limited to those listed above.

The order-taking rate prediction model training module 35 trains the order-taking rate prediction model on the screened features with a machine learning algorithm, that is, executes step S25. Algorithms such as GBDT (Gradient Boosting Decision Tree) or neural networks may be used for the training.

The order-taking rate prediction model evaluation module 36 verifies the output of the trained order-taking rate prediction model, that is, executes step S26. The evaluation may use indexes such as the ROC curve (Receiver Operating Characteristic curve) and the F1 score.
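For reference, the F1 score is the harmonic mean of precision and recall, and the area under the ROC curve equals the probability that a randomly chosen label-1 sample receives a higher score than a randomly chosen label-0 sample. The sketch below computes both in plain Python; the pairwise AUC computation is quadratic in the sample count and is meant only to show the definition.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```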

The abnormal order penalty decision optimization module 37 decides whether to penalize an order according to the prediction result of the order-taking rate prediction model, that is, executes step S27.

FIG. 4 shows an exemplary embodiment of the method of the present invention. The approach of applying the present invention to a risk-control algorithm problem may specifically include the following four points:

1. obtaining the raw data domain, mainly the event-tracking data, behavior data, device data, order data and the like returned by the passenger-side APP and the driver-side APP;

2. designing and developing the model-input features based on the raw data domain, including transformation and screening of the features;

3. selecting a proper algorithm, and training a model;

4. the model is applied to a specific scene.

Specifically, assume that a company's system contains a database from which original sample data can be obtained, and that every order is recorded in this database. When a method for predicting the order taking rate of abnormal online taxi appointment orders is to be developed and such a model is to be trained, the database can be exported and used as original sample data containing ten million records. This sample database contains a large amount of order data that can be used to train the model. For training, the more sample data is input, the better. A machine (such as a computer) can analyze a large amount of sample data and find the regularities by itself, thereby learning an order taking rate prediction model that could not be designed manually by human beings.

The order taking rate prediction model is evaluated by selecting a number of original data records and inputting them into the model. The output of the model is then compared with the actual outcome. For example, if the model predicts that an order will not be completed, this prediction is compared with what actually happened to the order. If the predicted result (completed or not completed) is consistent with the actual result, the prediction is correct. For example, by taking 1000 orders and verifying them in this way, the accuracy of the model can be determined.
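As a non-limiting illustration of this check, the following Python sketch draws a random sample of historical orders, compares the predicted label of each with the actual outcome, and reports the share of matching predictions. The function name, the record layout, the rule used as a stand-in predictor and the toy data are assumptions made for illustration.

```python
import random


def accuracy_on_sample(records, predict, sample_size=1000, seed=0):
    """records: list of (order_features, actual_label); predict: features -> label."""
    if not records:
        raise ValueError("at least one record is needed")
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    sample = rng.sample(records, min(sample_size, len(records)))
    correct = sum(1 for feats, actual in sample if predict(feats) == actual)
    return correct / len(sample)


# Toy history: label 1 = not completed, assumed here to happen when fewer than
# three drivers are nearby.
records = [({"nearby_drivers": n}, 0 if n >= 3 else 1) for n in range(10)] * 200


def rule_model(feats):
    """Hypothetical stand-in for the trained model."""
    return 0 if feats["nearby_drivers"] >= 3 else 1


def always_completed(feats):
    """Naive baseline that predicts every order is completed."""
    return 0


print(accuracy_on_sample(records, rule_model))
print(accuracy_on_sample(records, always_completed, sample_size=len(records)))
```

Comparing the model against a naive baseline, as in the second call, helps to show whether the measured accuracy reflects real predictive power.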

After the order taking rate prediction model is determined and the method is applied in actual operation, when an order comes in, the user's name, telephone number, previous order behavior, background information and the like can be looked up in the database. This information, together with the time, place, actual circumstances and the like of the current order, forms the complete information of the real-time order.
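As a non-limiting illustration, the following Python sketch assembles the complete real-time order information by merging a stored user profile with the context of the current order. The function name, the field names and the in-memory dictionary standing in for the database are assumptions made for illustration.

```python
def build_realtime_order(user_db, user_id, current_order):
    """Merge stored user information with the context of the incoming order."""
    record = dict(user_db.get(user_id, {}))  # copy, so the stored profile is untouched
    record.update(current_order)             # current order context takes precedence
    record["user_id"] = user_id
    return record


# Hypothetical stored profile and incoming order.
user_db = {
    "u42": {"phone": "000-0000", "past_orders": 17, "past_cancellations": 2},
}
current_order = {
    "time": "2022-09-02T08:30:00",
    "start": (30.27, 120.15),
    "end": (30.30, 120.20),
}
order_info = build_realtime_order(user_db, "u42", current_order)
print(sorted(order_info))
```

A user with no stored history yields a record that contains only the current order context, which the model would then have to handle as missing features.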

The real-time order information is input into the order taking rate prediction model, which predicts the success rate of order taking and thus whether the order should be taken. If the risk predicted by the model is very high, the order is not taken directly; several ways of handling it are possible, for example: 1. the user must top up the account in advance; 2. the user must pay first; 3. the order is not taken unless a deposit is paid first; and the like. If the risk predicted by the model is very low, the order is taken immediately. This is the actual application of the model.
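As a non-limiting illustration of this decision, the following Python sketch maps a predicted risk to a handling action. The function name, the action names and the thresholds 0.8 and 0.2 are assumptions made for illustration; in particular, the handling of intermediate risk values is not specified above and is represented here only by a placeholder.

```python
def decide_order_handling(risk, high=0.8, low=0.2):
    """Map a predicted risk in [0, 1] to a handling action (hypothetical thresholds)."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    if risk >= high:
        # Very high risk: top-up, payment or deposit is required before taking.
        return "require_prepayment_or_deposit"
    if risk <= low:
        # Very low risk: the order is taken immediately.
        return "take_immediately"
    # Intermediate risk: placeholder, since no handling is specified for this range.
    return "default_handling"


for risk in (0.05, 0.5, 0.95):
    print(risk, decide_order_handling(risk))
```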

In summary, the present invention designs and develops features based on driver status information and user order information. The model-entry features that describe and influence whether an order is taken are built from driver data (such as driver position information, driver service status information, driver online time, the number of orders taken by the driver, the ratio of orders dispatched to the driver to orders taken by the driver, and the distance from the driver to the order starting point), order data (such as order price, the positions of the order starting point and end point, the number of users calling for rides around the order, the number of drivers around the order, and the order taking ratio around the order), and environmental factors (such as traffic conditions and weather), so as to improve the accuracy of the order taking rate prediction model.
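As a non-limiting illustration of two of these features, the following Python sketch computes the great-circle distance from each driver to the order starting point with the haversine formula and derives the number of drivers around the order and the distance to the nearest driver. The function names, the 3 km radius, the spherical Earth radius and the coordinates are assumptions made for illustration; a road-network distance could equally be used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_driver_features(order_start, driver_positions, radius_km=3.0):
    """Count drivers within radius_km of the order start and find the nearest one."""
    distances = [haversine_km(*order_start, *pos) for pos in driver_positions]
    return {
        "drivers_nearby": sum(1 for d in distances if d <= radius_km),
        "nearest_driver_km": min(distances) if distances else None,
    }


order_start = (30.0, 120.0)
drivers = [(30.0, 120.0), (30.0, 120.01), (31.0, 120.0)]
driver_features = nearby_driver_features(order_start, drivers)
print(driver_features)
```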

Based on the developed features, a machine learning algorithm is used to train an order taking rate prediction model that predicts the probability that an order is taken by a driver and not subsequently cancelled. This optimizes the decision on whether to subject the order to a prepayment or top-up penalty and improves the experience of users whose normal orders are misjudged by risk control. Therefore, the invention effectively overcomes various defects in the prior art and has high industrial utilization value.

The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims

1. A method for predicting the order receiving rate of abnormal orders of online taxi appointment comprises the following steps:

a step of determining model-entry features, namely applying labels based on the order sample data set and determining the model-entry features, including screening of the model-entry features;

2. The method for predicting the order taking rate of the online taxi appointment abnormal order according to claim 1, wherein in the data obtaining step, the database data of the passenger side APP and the driver side APP comprises one or more of the following information: target order attribute information, user historical behavior information, peripheral driver information, and environmental information.

3. The method of claim 1, wherein the step of determining the model-entry features comprises:

and screening the model-entry features based on the relevant indexes.

4. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 3, wherein: in the step of labeling each order according to the order taking result in the sample data set, if a first situation occurs, namely after the target user places the order, the order is taken by a driver and is not subsequently cancelled, the order is labeled 0 in the corresponding sample data set; if a second situation occurs, namely the order is not completed after the target user places it, that is, the order is not taken by any driver or is cancelled after being taken by a driver, the order is labeled 1 in the corresponding sample data set.

5. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 3, wherein: the relevant indexes include an availability index, an interpretability index, an information content index, a correlation index and a stability index.

6. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 5, wherein:

the information content index is the information value (IV) calculated for a feature and is used for evaluating the predictive power of the feature;

7. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 1, wherein: the training model step comprises:

8. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 7, wherein: the model-entry features obtained after screening comprise: order features, user features, driver data, traffic conditions, and weather features.

9. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 7, wherein: the machine learning algorithm can be a random forest algorithm, an XGBoost algorithm or a decision tree algorithm.

10. The method for predicting the pick-up rate of the abnormal orders for online taxi appointment as claimed in claim 7, wherein: in the step of training the model, the order sample data set is divided into a training set and a verification set according to a preset proportion; the order taking rate prediction model is trained on the data of the training set, using the screened model-entry features and the machine learning algorithm; and the data of the verification set is used to verify the output of the trained order taking rate prediction model, so as to judge whether the trained order taking rate prediction model meets the preset requirement.

11. A pick-up rate prediction system for abnormal orders for online taxi appointment, the system performing the method of any one of claims 1 to 10, comprising:

the order sample data set acquisition module acquires an order sample data set;

the feature screening module, which screens the model-entry features according to the model-entry features and the labeled order sample data set, based on one or more of the following indexes: an availability index, an interpretability index, an information content index, a correlation index and a stability index;