CN112561187A - Network taxi booking target order prediction method based on CNN-LSTM - Google Patents

Network taxi booking target order prediction method based on CNN-LSTM

Info

Publication number
CN112561187A
Authority
CN
China
Prior art keywords
order
data
cnn
lstm
time period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011530817.1A
Other languages
Chinese (zh)
Other versions
CN112561187B (en)
Inventor
黄妙华
张昊天
柳子晗
贾昌昊
王玉玖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Technology WUT
Priority to CN202011530817.1A
Publication of CN112561187A
Application granted
Publication of CN112561187B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G06Q30/0635 Processing of requisition or of purchase orders
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Molecular Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of order data processing, and in particular to a CNN-LSTM-based method for predicting target orders for online ride-hailing (network taxi booking). The method comprises the following steps: 1. partitioning a preset area into a plurality of sub-areas; 2. acquiring the original order data of each sub-area in the preset area; 3. obtaining target order data from the original order data, where the target order data comprise the total order quantity, the average order price, the POI features, the weather features and the time features of the same area in the same time period; 4. predicting the order quantity of each area in the next time period with the CNN-LSTM model, by inputting the total order quantity, POI features, weather features and time features from the target order data into the CNN-LSTM model to obtain the predicted order quantity of each area in the next time period; 5. establishing a regional PVD model to obtain a value heat map of each sub-area. The invention can predict target order data comprehensively and accurately.

Description

Network taxi booking target order prediction method based on CNN-LSTM
Technical Field
The invention relates to the technical field of order data processing, and in particular to a CNN-LSTM-based method for predicting target orders for online ride-hailing.
Background
In recent years, with the rapid development of China's economy and the continuous expansion of its cities, residents' demand for daily travel has kept increasing, and online ride-hailing has become an important part of the intelligent transportation system. Ride-hailing platforms offer several travel modes, such as express rides and hitch rides, and give passengers more choice in pick-up time and vehicle type, which largely satisfies residents' daily travel needs. However, ride-hailing companies also face a series of problems: the target order quantity is difficult to predict, and vehicle scheduling is difficult to optimize. These problems seriously reduce the companies' profit. In particular, target order quantity prediction has in recent years become a bottleneck restricting their development and puts great pressure on daily management and operation. Drivers often have to travel long distances without passengers and are often far from the pick-up point, both of which result from inaccurate order quantity prediction and poor vehicle scheduling. A method that accurately predicts target order data is therefore needed.
Disclosure of Invention
To solve the above problems, the present invention provides a CNN-LSTM-based method for predicting target orders for online ride-hailing, which can predict target order data comprehensively and accurately.
To achieve this purpose, the invention provides a CNN-LSTM-based method for predicting target orders for online ride-hailing, characterized by comprising the following steps:
S1: partitioning a preset area into a plurality of sub-areas;
S2: acquiring the original order data of each sub-area in the preset area;
S3: obtaining target order data from the original order data:
the target order data comprise the total order quantity, the average order price, the POI features, the weather features and the time features of the same area in the same time period;
S4: predicting the order quantity of each area in the next time period with the CNN-LSTM model: the total order quantity, POI features, weather features and time features in the target order data are input into the CNN-LSTM model to obtain the predicted order quantity of each area in the next time period;
S5: establishing a regional PVD model and obtaining a value heat map of each sub-area, so that a driver can refer to the value heat map and drive to an order-taking area for the next time period:
the calculation formula (1) of the PVD model is:
[Formula (1) is reproduced only as an image in the source: Figure BDA0002852069300000021]
where $V_{total}$ is the output of the PVD model, $P_t$ is the order quantity prediction coefficient, $V_t$ is the price coefficient, and $D_t$ is the distance coefficient;
the calculation formula (2) of the order quantity prediction coefficient $P_t$ is:
[Formula (2) is reproduced only as an image in the source: Figure BDA0002852069300000022]
where $P_t$ is the order quantity prediction coefficient, $predict$ is the order quantity predicted by the CNN-LSTM model, $st$ is the start time of the time period, and $ed$ is the end time of the time period;
the calculation formula (3) of the price coefficient $V_t$ is:
[Formula (3) is reproduced only as an image in the source: Figure BDA0002852069300000023]
where $V_t$ is the price coefficient, $P_t$ is the order quantity in the current time period, $P_{t-1}$ is the order quantity in the previous time period, and $V_{loc}$ is the historical average order price of the sub-area for that time period;
the calculation formula (4) of the distance coefficient $D_t$ is:
$D_t = |d_x + d_y|$    (4)
where $D_t$ is the distance coefficient, $d_x$ is the transverse distance from the vehicle to the sub-area, and $d_y$ is the longitudinal distance from the vehicle to the sub-area.
Preferably, the original order data in step S2 include the user ID, the driver ID, the order number, the time at which the user placed the order, the longitude and latitude of the driver's order-taking position, the amount paid by the user, and the driver's final income.
Preferably, the target order data in step S3 are obtained as follows:
S3.1: grouping and aggregating the original order data by sub-area and time sequence
First, the original historical order data are sliced in time, using a time period of 30 to 60 minutes as the interval, and sorted in ascending order of time; the sliced original order data are then grouped and aggregated to obtain the original historical order data of each sub-area in each time period;
S3.2: obtaining the total order quantity of each sub-area in each time period
The grouped and aggregated original order data are summed to obtain the total order quantity of each sub-area in each time period;
S3.3: obtaining the average order price of each sub-area in each time period
The average order price of the 108 areas of Wuhan in each time period is calculated from the users' actual payment amounts and the drivers' final income in the original order data;
S3.4: obtaining the POI features of each historical order
A crawler queries the map API provided by Gaode (AutoNavi) with the longitude and latitude of the driver's order-taking position and the longitude and latitude of the destination in the grouped and aggregated original historical order data, to obtain the POI features of the order-taking position and of the destination;
S3.5: obtaining the weather features of each historical order
A crawler collects historical weather data from a historical weather website, and the weather features of each historical order are obtained from the area and time of that order;
S3.6: obtaining the time features of each historical order
The time features are obtained from the time of each historical order;
S3.7: the total order quantity, average order price, POI features, weather features and time features are joined to the original order data to obtain an order data set, and the order data of the same time period in the same area are extracted from the order data set to obtain the target order data. The target order data comprise the order data of the current time period and of the same time period in the same area one day, two days and seven days earlier.
Preferably, the order quantity of each area in the next time period is predicted with the CNN-LSTM model in step S4 as follows:
S4.1: establishing a preliminary CNN-LSTM model
First, the total order quantities in the target order data are filled into a matrix to obtain a five-dimensional array;
second, a five-layer preliminary CNN-LSTM model is constructed;
S4.2: correcting the preliminary CNN-LSTM model to obtain the CNN-LSTM model
The coefficient of determination is calculated from the order quantity predicted by the preliminary CNN-LSTM model and the actual order quantity, and the parameters are adjusted and corrected repeatedly to obtain the CNN-LSTM model.
Preferably, the five-layer preliminary CNN-LSTM model is constructed as follows: a five-layer preliminary CNN-LSTM model is built, and a BatchNormalization layer is added after each CNN-LSTM layer to avoid overfitting. The CNN-LSTM structure used by the model is calculated as follows:
Input gate:
$i_t = \sigma(W_{xi} * \chi_t + W_{hi} * H_{t-1} + W_{ci} \circ c_{t-1} + b_i)$
where $i_t$ is the output of the input gate; $W_{xi}$, $W_{hi}$, $W_{ci}$ and $b_i$ are the coefficients and bias of the linear relationship; $H_{t-1}$ is the hidden state of the previous sequence step; $\chi_t$ is the data of the current sequence step; $c_{t-1}$ is the cell state of the previous sequence step; $*$ denotes the convolution operator, $\circ$ denotes the Hadamard product, and $\sigma$ denotes the sigmoid activation function;
Forget gate:
$f_t = \sigma(W_{xf} * \chi_t + W_{hf} * H_{t-1} + W_{cf} \circ c_{t-1} + b_f)$
where $f_t$ is the output of the forget gate; $W_{xf}$, $W_{hf}$, $W_{cf}$ and $b_f$ are the coefficients and bias of the linear relationship; $H_{t-1}$ is the hidden state of the previous sequence step; $\chi_t$ is the data of the current sequence step; $c_{t-1}$ is the cell state of the previous sequence step; $*$ denotes the convolution operator, $\circ$ denotes the Hadamard product, and $\sigma$ denotes the sigmoid activation function;
Cell state:
$c_t = f_t \circ c_{t-1} + i_t \circ \tanh(W_{xc} * \chi_t + W_{hc} * H_{t-1} + b_c)$
where $c_t$ is the cell state; $W_{xc}$, $W_{hc}$ and $b_c$ are the coefficients and bias of the linear relationship; $H_{t-1}$ is the hidden state of the previous sequence step; $\chi_t$ is the data of the current sequence step; $c_{t-1}$ is the cell state of the previous sequence step; $\circ$ denotes the Hadamard product, and $\tanh$ denotes the tanh activation function;
Output gate:
$o_t = \sigma(W_{xo} * \chi_t + W_{ho} * H_{t-1} + W_{co} \circ c_t + b_o)$
where $o_t$ is the output of the output gate; $W_{xo}$, $W_{ho}$, $W_{co}$ and $b_o$ are the coefficients and bias of the linear relationship; $H_{t-1}$ is the hidden state of the previous sequence step; $\chi_t$ is the data of the current sequence step; $c_t$ is the cell state; $*$ denotes the convolution operator, $\circ$ denotes the Hadamard product, and $\sigma$ denotes the sigmoid activation function;
Hidden layer:
$H_t = o_t \circ \tanh(c_t)$
where $H_t$ is the hidden state, $o_t$ is the output of the output gate, $\circ$ denotes the Hadamard product, $\tanh$ denotes the tanh activation function, and $c_t$ is the cell state.
Preferably, in step S1 the preset area is divided into a Geohash-based grid and the grid cells are numbered.
Compared with conventional methods for predicting target orders for online ride-hailing, the invention has the following advantages:
1. Compared with an LSTM model, the CNN-LSTM model captures spatio-temporal correlation better, and by stacking several ConvLSTM layers into an encoding-forecasting structure it can model more general spatio-temporal sequence prediction problems.
2. The invention uses the total order quantity, POI features, weather features and time features in the target order data as inputs to the CNN-LSTM model, which yields more accurate predictions.
3. A PVD model is established to obtain a visual value heat map of each sub-area, and the driver refers to the value heat map to drive to an order-taking area for the next time period. The PVD model takes into account the predicted order quantity of each sub-area in the next time period, the predicted order price of each sub-area in the next time period, and the driver's distance to each sub-area, and therefore gives the driver a more comprehensive basis for the decision.
Drawings
FIG. 1 is a flow chart of the CNN-LSTM-based method for predicting target orders for online ride-hailing according to the invention.
Detailed Description
For a better understanding of the present invention, it is described in detail below with reference to the accompanying drawing.
Referring to FIG. 1, and taking Wuhan as an example:
S1: partitioning a preset area into a plurality of sub-areas;
Wuhan is divided into a Geohash-based grid and the grid cells are numbered. Each Geohash code represents a rectangular area, and the longer the code, the smaller the area it represents. Wuhan is divided using 5-character Geohash codes, and the areas are numbered from west to east and from north to south;
the Geohash grid divides the map of Wuhan into 108 areas, each identified by a number.
S2: collecting the original order data of the 108 areas of Wuhan:
a ride-hailing company provides the original order data for Wuhan, which include the user ID, the driver ID, the order number, the time at which the user placed the order, the longitude and latitude of the driver's order-taking position, the amount paid by the user, and the driver's final income; the original order data are cleaned by removing null values and abnormal data;
S3: obtaining target order data from the original order data
S3.1: grouping and aggregating the original order data by sub-area and time sequence
First, the original historical order data are sliced in time, using a time period of 30 minutes as the interval, and sorted in ascending order of time; a groupby operation is then applied to the sliced original order data to obtain the original historical order data of the 108 areas of Wuhan in each time period;
S3.2: obtaining the total order quantity of the 108 areas of Wuhan in each time period
The grouped and aggregated original order data are summed to obtain the total order quantity of the 108 areas of Wuhan in each time period;
S3.3: obtaining the average order price of the 108 areas of Wuhan in each time period
The average order price of the 108 areas of Wuhan in each time period is calculated from the users' actual payment amounts and the drivers' final income in the original order data;
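Steps S3.1 to S3.3 can be sketched with the standard library alone. The field names `region`, `order_time` and `user_paid` are hypothetical, and the average price here is computed from the user payment only, which is an assumption: the patent does not state how the user payment and the driver income are combined.

```python
from collections import defaultdict
from datetime import datetime

SLOT_MINUTES = 30  # the embodiment uses 30-minute time periods


def slot_start(ts, minutes=SLOT_MINUTES):
    """Round a datetime down to the start of its time period."""
    floored = (ts.hour * 60 + ts.minute) // minutes * minutes
    return ts.replace(hour=floored // 60, minute=floored % 60,
                      second=0, microsecond=0)


def aggregate_orders(orders):
    """Return {(region, slot_start): (order_total, average_price)}, sorted by key."""
    counts = defaultdict(int)
    paid = defaultdict(float)
    for order in orders:
        key = (order["region"], slot_start(order["order_time"]))
        counts[key] += 1
        paid[key] += order["user_paid"]  # assumption: price = amount paid by the user
    return {key: (counts[key], paid[key] / counts[key]) for key in sorted(counts)}


# Hypothetical cleaned orders (region numbers and amounts are made up).
orders = [
    {"region": 17, "order_time": datetime(2020, 10, 1, 8, 5), "user_paid": 20.0},
    {"region": 17, "order_time": datetime(2020, 10, 1, 8, 25), "user_paid": 30.0},
    {"region": 17, "order_time": datetime(2020, 10, 1, 8, 40), "user_paid": 12.0},
    {"region": 42, "order_time": datetime(2020, 10, 1, 8, 10), "user_paid": 18.0},
]
summary = aggregate_orders(orders)
```

With a data-frame library the same result would typically come from a single groupby over the region and the floored timestamp; the dictionary version is used here only to keep the sketch dependency-free.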
S3.4: obtaining the POI features of each historical order
A crawler queries the map API provided by Gaode (AutoNavi) with the longitude and latitude of the driver's order-taking position and the longitude and latitude of the destination in the grouped and aggregated original historical order data, to obtain the POI features of the order-taking position and of the destination. The POI features include categories such as hotel, residential district, general market, factory, company, primary school, general hospital, machinery and electronics, middle school, and disease prevention, and are One-Hot encoded;
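The One-Hot encoding of POI categories mentioned above can be sketched as follows. The category list is a subset taken from the text, its ordering is an arbitrary assumption, and `one_hot` is a hypothetical helper name.

```python
# Assumed category vocabulary; the order of the entries is arbitrary.
POI_CATEGORIES = [
    "hotel",
    "residential district",
    "general market",
    "factory",
    "company",
    "primary school",
    "general hospital",
    "middle school",
]


def one_hot(category, categories=POI_CATEGORIES):
    """Return a 0/1 vector with a single 1 at the position of `category`."""
    if category not in categories:
        raise ValueError(f"unknown POI category: {category!r}")
    return [1 if name == category else 0 for name in categories]


pickup_poi = one_hot("factory")
```

Encoding the categories as separate 0/1 columns avoids implying an order between, say, a hotel and a factory, which an integer label would do.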
S3.5: obtaining the weather features of each historical order
A crawler collects historical weather data from a historical weather website, and the weather features of each historical order are obtained from the area and time of that order;
S3.6: obtaining the time features of each historical order
The time features are obtained from the time of each historical order, for example whether the order falls on a Saturday, during the National Day holiday, during the New Year's Day holiday, in the commuting peak hours, at the time students return to school, or at a grand rehearsal time;
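A minimal sketch of deriving a few of these time features from an order timestamp is given below. The holiday dates and peak-hour windows are placeholders chosen for illustration; the patent does not specify them, and `time_features` is a hypothetical name.

```python
from datetime import date, datetime

# Placeholder calendars: these values are assumptions, not taken from the patent.
HOLIDAYS = {date(2020, 10, 1), date(2021, 1, 1)}
PEAK_HOURS = ((7, 9), (17, 19))  # [start, end) hours of the commuting peaks


def time_features(ts):
    """Return binary time features for one order timestamp."""
    return {
        "is_saturday": int(ts.weekday() == 5),  # Monday is 0, Saturday is 5
        "is_holiday": int(ts.date() in HOLIDAYS),
        "is_peak": int(any(start <= ts.hour < end for start, end in PEAK_HOURS)),
    }


features = time_features(datetime(2020, 10, 1, 8, 15))
```

Further flags, such as school-return times, would follow the same pattern once their calendars are known.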
S3.7: the total order quantity, average order price, POI features, weather features and time features are joined to the original order data to obtain an order data set, and the order data of the same time period on four days are extracted from the order data set to obtain the target order data. The target order data cover the current time period and the same time period in the same area one day, two days and seven days earlier;
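The extraction of the four related time periods can be sketched as a lookup with day offsets of 0, 1, 2 and 7. The dictionary layout and the choice to treat a missing period as 0 orders are assumptions made for this example.

```python
from datetime import datetime, timedelta

LAG_DAYS = (0, 1, 2, 7)  # current period, one day, two days and seven days earlier


def lagged_totals(totals, region, slot):
    """Return the order totals of `region` for `slot` and its lagged counterparts.

    `totals` maps (region, slot_start) to an order total; a missing entry is
    treated as 0 orders (an assumption, the patent does not cover this case).
    """
    return [totals.get((region, slot - timedelta(days=days)), 0) for days in LAG_DAYS]


# Hypothetical totals for region 17 in the 08:00 period.
totals = {
    (17, datetime(2020, 10, 8, 8, 0)): 12,
    (17, datetime(2020, 10, 7, 8, 0)): 9,
    (17, datetime(2020, 10, 1, 8, 0)): 15,
}
history = lagged_totals(totals, 17, datetime(2020, 10, 8, 8, 0))
```

The one-day and two-day lags capture short-term trends, while the seven-day lag lines up the same weekday.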
S4: predicting the order quantity of the 108 areas in the next time period with the CNN-LSTM model
S4.1: establishing a preliminary CNN-LSTM model
First, the total order quantities in the target order data are filled into a matrix: the order totals of the current time period and of the same time period in the same area one day, two days and seven days earlier are filled into the matrix by area number to obtain a five-dimensional NumPy array;
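The matrix-filling step can be sketched as below. The 9 x 12 grid layout, the row-major numbering starting at 1, and the axis order (samples, time steps, rows, columns, channels) are all assumptions: the patent states only that there are 108 areas and that the result is five-dimensional. Nested lists stand in for the NumPy array to keep the sketch dependency-free.

```python
ROWS, COLS = 9, 12  # hypothetical layout: 9 * 12 = 108 areas


def fill_grid(totals_by_region):
    """Place per-area order totals on a ROWS x COLS grid by area number (1-based)."""
    grid = [[0.0] * COLS for _ in range(ROWS)]
    for region, total in totals_by_region.items():
        row, col = divmod(region - 1, COLS)
        grid[row][col] = float(total)
    return grid


def build_sample(lagged):
    """Stack one grid per lagged period into a five-dimensional nested list.

    Axes: (1 sample, len(lagged) time steps, ROWS, COLS, 1 channel).
    """
    return [[[[[value] for value in row] for row in fill_grid(step)]
             for step in lagged]]


# Hypothetical totals for the four periods (current, -1 day, -2 days, -7 days).
sample = build_sample([{1: 5, 108: 7}, {13: 2}, {}, {}])
```

Passing the nested list to `numpy.array` would give an array of shape (1, 4, 9, 12, 1), the kind of layout that convolutional LSTM layers commonly expect.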
second, a five-layer preliminary CNN-LSTM model is constructed, and a BatchNormalization layer is added after each CNN-LSTM layer to avoid overfitting. The CNN-LSTM structure used by the model is calculated as follows:
Input gate:
$i_t = \sigma(W_{xi} * \chi_t + W_{hi} * H_{t-1} + W_{ci} \circ c_{t-1} + b_i)$
where $i_t$ is the output of the input gate; $W_{xi}$, $W_{hi}$, $W_{ci}$ and $b_i$ are the coefficients and bias of the linear relationship; $H_{t-1}$ is the hidden state of the previous sequence step; $\chi_t$ is the data of the current sequence step; $c_{t-1}$ is the cell state of the previous sequence step; $*$ denotes the convolution operator, $\circ$ denotes the Hadamard product, and $\sigma$ denotes the sigmoid activation function;
Forget gate:
$f_t = \sigma(W_{xf} * \chi_t + W_{hf} * H_{t-1} + W_{cf} \circ c_{t-1} + b_f)$
where $f_t$ is the output of the forget gate; $W_{xf}$, $W_{hf}$, $W_{cf}$ and $b_f$ are the coefficients and bias of the linear relationship; $H_{t-1}$ is the hidden state of the previous sequence step; $\chi_t$ is the data of the current sequence step; $c_{t-1}$ is the cell state of the previous sequence step; $*$ denotes the convolution operator, $\circ$ denotes the Hadamard product, and $\sigma$ denotes the sigmoid activation function;
Cell state:
$c_t = f_t \circ c_{t-1} + i_t \circ \tanh(W_{xc} * \chi_t + W_{hc} * H_{t-1} + b_c)$
where $c_t$ is the cell state; $W_{xc}$, $W_{hc}$ and $b_c$ are the coefficients and bias of the linear relationship; $H_{t-1}$ is the hidden state of the previous sequence step; $\chi_t$ is the data of the current sequence step; $c_{t-1}$ is the cell state of the previous sequence step; $\circ$ denotes the Hadamard product, and $\tanh$ denotes the tanh activation function;
Output gate:
$o_t = \sigma(W_{xo} * \chi_t + W_{ho} * H_{t-1} + W_{co} \circ c_t + b_o)$
where $o_t$ is the output of the output gate; $W_{xo}$, $W_{ho}$, $W_{co}$ and $b_o$ are the coefficients and bias of the linear relationship; $H_{t-1}$ is the hidden state of the previous sequence step; $\chi_t$ is the data of the current sequence step; $c_t$ is the cell state; $*$ denotes the convolution operator, $\circ$ denotes the Hadamard product, and $\sigma$ denotes the sigmoid activation function;
Hidden layer:
$H_t = o_t \circ \tanh(c_t)$
where $H_t$ is the hidden state, $o_t$ is the output of the output gate, $\circ$ denotes the Hadamard product, $\tanh$ denotes the tanh activation function, and $c_t$ is the cell state;
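The gate definitions above can be made concrete with a single-channel, dependency-free sketch of one recurrence step. It assumes the standard peephole ConvLSTM formulation that matches the symbols defined above, uses zero padding, and implements the convolution as cross-correlation, as deep-learning libraries usually do. The helper names and the parameter dictionary `p` are hypothetical, and this is not the five-layer model of the embodiment.

```python
import math


def conv_same(x, k):
    """Zero-padded 2-D cross-correlation of grid x with an odd-sized square kernel k."""
    h, w, r = len(x), len(x[0]), len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    if 0 <= i + di < h and 0 <= j + dj < w:
                        out[i][j] += x[i + di][j + dj] * k[di + r][dj + r]
    return out


def grid_map(f, *grids):
    """Apply f element-wise across equally sized grids."""
    return [[f(*cell) for cell in zip(*rows)] for rows in zip(*grids)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def convlstm_step(x, h_prev, c_prev, p):
    """One ConvLSTM step; p holds kernels W_x*, W_h*, peephole grids W_c*, scalar biases b_*."""
    def pre(gate):
        # W_x(gate) * x + W_h(gate) * h_prev + b_(gate)
        return grid_map(lambda a, b: a + b + p["b_" + gate],
                        conv_same(x, p["W_x" + gate]),
                        conv_same(h_prev, p["W_h" + gate]))

    i = grid_map(lambda z, w, c: sigmoid(z + w * c), pre("i"), p["W_ci"], c_prev)
    f = grid_map(lambda z, w, c: sigmoid(z + w * c), pre("f"), p["W_cf"], c_prev)
    c = grid_map(lambda fg, cp, ig, z: fg * cp + ig * math.tanh(z),
                 f, c_prev, i, pre("c"))
    o = grid_map(lambda z, w, cc: sigmoid(z + w * cc), pre("o"), p["W_co"], c)
    h = grid_map(lambda og, cc: og * math.tanh(cc), o, c)
    return h, c


# Tiny example with all weights and biases set to zero (illustrative only).
H, W = 2, 3
zeros = [[0.0] * W for _ in range(H)]
ones = [[1.0] * W for _ in range(H)]
k0 = [[0.0] * 3 for _ in range(3)]
params = {f"W_{s}{g}": k0 for s in "xh" for g in "ifco"}
params.update({f"W_c{g}": zeros for g in "ifo"})
params.update({f"b_{g}": 0.0 for g in "ifco"})
h1, c1 = convlstm_step(ones, zeros, ones, params)
```

With every weight at zero each gate evaluates to 0.5, so the new cell state is half the previous one; this makes the arithmetic of the recurrence easy to check by hand. In practice a library layer such as a convolutional LSTM followed by batch normalization would replace this loop-based version.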
S4.2: correcting the preliminary CNN-LSTM model to obtain the CNN-LSTM model
The total order quantity, POI features, weather features and time features in the target order data are input into the preliminary CNN-LSTM model to obtain the predicted order quantity of each area in the next time period. The predictions of the preliminary model are taken as the training set and the actual order quantities as the validation set to evaluate the performance of the preliminary model during training. The coefficient of determination is calculated from the model's predictions and the actual order quantities, and after the parameters have been adjusted and corrected repeatedly, a CNN-LSTM model capable of short-term order prediction is obtained;
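The coefficient of determination used for this evaluation is commonly defined as $R^2 = 1 - SS_{res}/SS_{tot}$. A minimal sketch follows, assuming that standard definition; the function name `r2_score` and the sample values are illustrative.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted order quantities for three areas.
score = r2_score([10, 20, 30], [10, 20, 30])
```

A value of 1 means the predictions match the actual order quantities exactly, while a value of 0 means the model does no better than always predicting the mean.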
S5: establishing a regional PVD model for the value heat map and obtaining the value heat map of each sub-area, so that a driver can refer to the value heat map and drive to an order-taking area for the next time period;
the calculation formula (1) of the PVD model is:
[Formula (1) is reproduced only as an image in the source: Figure BDA0002852069300000088]
where $V_{total}$ is the output of the PVD model, $P_t$ is the order quantity prediction coefficient, $V_t$ is the price coefficient, and $D_t$ is the distance coefficient;
the calculation formula (2) of the order quantity prediction coefficient $P_t$ is:
[Formula (2) is reproduced only as an image in the source: Figure BDA0002852069300000091]
where $P_t$ is the order quantity prediction coefficient, $predict$ is the order quantity predicted by the CNN-LSTM model, $st$ is the start time of the time period, and $ed$ is the end time of the time period;
the calculation formula (3) of the price coefficient $V_t$ is:
[Formula (3) is reproduced only as an image in the source: Figure BDA0002852069300000092]
where $V_t$ is the price coefficient, $P_t$ is the order quantity in the current time period, $P_{t-1}$ is the order quantity in the previous time period, and $V_{loc}$ is the historical average order price of the sub-area for that time period;
the calculation formula (4) of the distance coefficient $D_t$ is:
$D_t = |d_x + d_y|$    (4)
where $D_t$ is the distance coefficient, $d_x$ is the transverse distance from the vehicle to the sub-area, and $d_y$ is the longitudinal distance from the vehicle to the sub-area.
The above embodiment describes only one implementation of the present invention. Although the description is specific and detailed, it should not be construed as limiting the scope of the invention. A person skilled in the art can make several variations and modifications without departing from the inventive concept, and these all fall within the scope of protection of the invention. The scope of protection of this patent is therefore defined by the appended claims.

Claims (6)

1. A CNN-LSTM-based method for predicting target orders for online ride-hailing, characterized by comprising the following steps:
S1: partitioning a preset area into a plurality of sub-areas;
S2: acquiring the original order data of each sub-area in the preset area;
S3: obtaining target order data from the original order data:
the target order data comprise the total order quantity, the average order price, the POI features, the weather features and the time features of the same area in the same time period;
S4: predicting the order quantity of each area in the next time period with the CNN-LSTM model: the total order quantity, POI features, weather features and time features in the target order data are input into the CNN-LSTM model to obtain the predicted order quantity of each area in the next time period;
S5: establishing a regional PVD model and obtaining a value heat map of each sub-area, so that a driver can refer to the value heat map and drive to an order-taking area for the next time period:
the calculation formula (1) of the PVD model is:
[Formula (1) is reproduced only as an image in the source: Figure FDA0002852069290000011]
where $V_{total}$ is the output of the PVD model, $P_t$ is the order quantity prediction coefficient, $V_t$ is the price coefficient, and $D_t$ is the distance coefficient;
the calculation formula (2) of the order quantity prediction coefficient $P_t$ is:
[Formula (2) is reproduced only as an image in the source: Figure FDA0002852069290000012]
where $P_t$ is the order quantity prediction coefficient, $predict$ is the order quantity predicted by the CNN-LSTM model, $st$ is the start time of the time period, and $ed$ is the end time of the time period;
the calculation formula (3) of the price coefficient $V_t$ is:
[Formula (3) is reproduced only as an image in the source: Figure FDA0002852069290000013]
where $V_t$ is the price coefficient, $P_t$ is the order quantity in the current time period, $P_{t-1}$ is the order quantity in the previous time period, and $V_{loc}$ is the historical average order price of the sub-area for that time period;
the calculation formula (4) of the distance coefficient $D_t$ is:
$D_t = |d_x + d_y|$    (4)
where $D_t$ is the distance coefficient, $d_x$ is the transverse distance from the vehicle to the sub-area, and $d_y$ is the longitudinal distance from the vehicle to the sub-area.
2. The CNN-LSTM-based network appointment target order prediction method of claim 1, wherein the original order data in step S2 comprises user ID, driver ID, order number, user placing time, driver pick-up location longitude, driver pick-up location latitude, user payment amount, and driver final income.
3. The CNN-LSTM-based network appointment target order prediction method according to claim 2, wherein the step S3 includes the following specific steps:
s3.1: grouping and aggregating the original order data by sub-region and time sequence
first, slicing the original historical order data into time periods of 30 to 60 minutes and sorting the data in ascending chronological order; then grouping and aggregating the sliced original order data to obtain the original historical order data of each sub-region in each time period;
s3.2: obtaining the total order quantity data of each sub-region in each time period
accumulating the grouped and aggregated original order data to obtain the total order quantity data of each sub-region in each time period;
s3.3: obtaining the average order price of each sub-region in each time period
calculating the average order price of the 108 areas of Wuhan in each time period based on the user payment amount and the driver final income in the original order data;
s3.4: obtaining the POI features of each historical order
using a crawler and the map API provided by AutoNavi (Gaode), looking up the driver pick-up location longitude and latitude and the destination longitude and latitude in the grouped and aggregated original historical order data to obtain the POI features of the driver pick-up location and of the destination;
s3.5: obtaining the weather features of each historical order
using a crawler to collect historical weather data from a historical weather website, and obtaining the weather features of each historical order based on its area and time;
s3.6: obtaining the time features of each historical order
obtaining the time features based on the time of each historical order;
s3.7: merging the total order quantity data, average order price, POI features, weather features and time features into the original order data to obtain an order data set, and extracting the order data of the same time period in the same area from the order data set to obtain target order data, wherein the target order data comprises the order data of the current time period and of the same time period in the same area one day before, two days before and seven days before.
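The preprocessing of claim 3 can be sketched roughly as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not the patented implementation: the record layout, the 30-minute slot length, the choice to average only the user payment amount, and the particular time features are all hypothetical, and the POI and weather lookups are omitted because they depend on external services.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SLOT_MINUTES = 30          # claim 3 allows 30 to 60 minutes
LAG_DAYS = (0, 1, 2, 7)    # current period, then 1, 2 and 7 days earlier


def slot_start(ts, minutes=SLOT_MINUTES):
    """Floor a timestamp to the start of its time period (s3.1)."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = int((ts - midnight).total_seconds() // 60)
    return midnight + timedelta(minutes=(elapsed // minutes) * minutes)


def aggregate(orders):
    """Return order totals (s3.2) and mean payment (s3.3) per (region, slot)."""
    totals, paid = defaultdict(int), defaultdict(float)
    for region, ts, amount in sorted(orders, key=lambda o: o[1]):
        key = (region, slot_start(ts))
        totals[key] += 1
        paid[key] += amount
    avg_price = {key: paid[key] / totals[key] for key in totals}
    return dict(totals), avg_price


def time_features(slot):
    """Simple calendar features for a time period (s3.6); Monday is 0."""
    return {"hour": slot.hour, "weekday": slot.weekday(),
            "is_weekend": slot.weekday() >= 5}


def target_order_data(totals, region, slot):
    """Order totals for the same region and period on the lagged days (s3.7).

    Periods with no history yield None; how to fill them is left open.
    """
    return [totals.get((region, slot - timedelta(days=d))) for d in LAG_DAYS]


# Hypothetical records: (sub-region ID, order placement time, user payment).
orders = [
    ("A", datetime(2020, 12, 22, 8, 5), 12.0),
    ("A", datetime(2020, 12, 22, 8, 29), 18.0),
    ("A", datetime(2020, 12, 22, 8, 31), 30.0),
    ("A", datetime(2020, 12, 21, 8, 10), 20.0),
    ("A", datetime(2020, 12, 15, 8, 20), 16.0),
]
totals, avg_price = aggregate(orders)
slot = datetime(2020, 12, 22, 8, 0)
lags = target_order_data(totals, "A", slot)
print(lags, avg_price[("A", slot)], time_features(slot))
```

Keying every table by the pair (sub-region, slot start) keeps the later merge in s3.7 a plain dictionary lookup.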
4. The CNN-LSTM-based network taxi booking target order prediction method according to claim 3, wherein the specific process in step S4 of predicting the order quantity data of each area in the next time period based on the CNN-LSTM model comprises:
s4.1: establishing a CNN-LSTM initial model
first, filling the total order quantity data of the target order data into a matrix to obtain a five-dimensional array;
second, constructing a five-layer CNN-LSTM initial model;
s4.2: correcting the CNN-LSTM initial model to obtain the CNN-LSTM model
calculating the coefficient of determination from the order quantity predicted by the CNN-LSTM initial model and the actual order quantity, and continuously adjusting and correcting the parameters to obtain the CNN-LSTM model.
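The two data-handling steps of claim 4 can be sketched as below: arranging the total order quantities into a five-dimensional array, and scoring predictions with the coefficient of determination (R²). The axis order (samples, time steps, rows, columns, channels) is an assumption borrowed from common convolutional LSTM implementations, since the claim does not state it; the 9 x 12 grid for 108 sub-regions and all sample numbers are likewise hypothetical.

```python
ROWS, COLS = 9, 12  # hypothetical grid layout for 108 sub-regions


def to_grid(counts):
    """Place counts (row-major region index -> total) on a ROWS x COLS grid.

    Each cell is a one-element list, which serves as the channel axis.
    """
    return [[[counts.get(r * COLS + c, 0)] for c in range(COLS)]
            for r in range(ROWS)]


def to_5d(samples):
    """Nest samples as (samples, time steps, rows, cols, channels)."""
    return [[to_grid(period) for period in seq] for seq in samples]


def shape(nested):
    """Shape of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# One hypothetical sample: totals 7, 2 and 1 days ago, then the current period.
sample = [{0: 42, 13: 7}, {0: 35}, {0: 38, 107: 3}, {0: 40}]
array = to_5d([sample])
print(shape(array))

# Hypothetical actual vs. predicted order quantities for four sub-regions.
score = r_squared([10, 20, 30, 40], [12, 18, 33, 37])
print(round(score, 3))
```

An R² close to 1 means the predicted quantities explain most of the variance in the actual ones, which is one plausible stopping signal for the parameter adjustment in s4.2.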
5. The CNN-LSTM-based network taxi booking target order prediction method as claimed in claim 4, wherein the specific process for constructing a five-layer CNN-LSTM initial model comprises: constructing a five-layer CNN-LSTM initial model and adding a BatchNormalization layer after each CNN-LSTM layer to avoid overfitting, wherein the CNN-LSTM structural calculation formulas used by the model are as follows:
an input gate:
Figure FDA0002852069290000031
wherein i_t is the output of the input gate; W_xi, W_hi, W_ci and b_i are the coefficients and bias of the linear relationship; H_{t-1} is the hidden state of the previous sequence step; χ_t is the data of the current sequence step; c_{t-1} is the cell state of the previous sequence step; * denotes the convolution operator; ∘ denotes the Hadamard product; and σ denotes the sigmoid activation function;
a forget gate:
Figure FDA0002852069290000033
wherein f_t is the output of the forget gate; W_xf, W_hf, W_cf and b_f are the coefficients and bias of the linear relationship; H_{t-1} is the hidden state of the previous sequence step; χ_t is the data of the current sequence step; c_{t-1} is the cell state of the previous sequence step; * denotes the convolution operator; ∘ denotes the Hadamard product; and σ denotes the sigmoid activation function;
cell state:
Figure FDA0002852069290000042
wherein c_t is the cell state; W_xc, W_hc and b_c are the coefficients and bias of the linear relationship; H_{t-1} is the hidden state of the previous sequence step; χ_t is the data of the current sequence step; c_{t-1} is the cell state of the previous sequence step; ∘ denotes the Hadamard product; and tanh denotes the tanh activation function;
an output gate:
Figure FDA0002852069290000044
wherein o_t is the output of the output gate; W_xo, W_ho, W_co and b_o are the coefficients and bias of the linear relationship; H_{t-1} is the hidden state of the previous sequence step; χ_t is the data of the current sequence step; c_t is the cell state; * denotes the convolution operator; ∘ denotes the Hadamard product; and σ denotes the sigmoid activation function;
a hidden layer:
Figure FDA0002852069290000046
wherein H_t is the hidden state; o_t is the output of the output gate; ∘ denotes the Hadamard product; tanh denotes the tanh activation function; and c_t is the cell state.
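The symbol lists of claim 5 match the standard convolutional LSTM cell with peephole connections. Assuming the formula figures follow that standard form, which the text itself does not confirm, the five updates can be written out as follows, with * for convolution and ∘ (`\circ`) for the Hadamard product:

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * \chi_t + W_{hi} * H_{t-1} + W_{ci} \circ c_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * \chi_t + W_{hf} * H_{t-1} + W_{cf} \circ c_{t-1} + b_f\right) \\
c_t &= f_t \circ c_{t-1} + i_t \circ \tanh\left(W_{xc} * \chi_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * \chi_t + W_{ho} * H_{t-1} + W_{co} \circ c_t + b_o\right) \\
H_t &= o_t \circ \tanh\left(c_t\right)
\end{aligned}
```

Note that the output gate uses the updated cell state c_t while the input and forget gates use c_{t-1}, which agrees with the symbols listed for each gate.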
6. The CNN-LSTM-based network taxi booking target order prediction method according to any one of claims 1-5, wherein in the step S1, the grids are divided and numbered based on Geohash.
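For readers unfamiliar with Geohash, the sketch below shows the standard public encoding, in which longitude and latitude bits are interleaved and every five bits become one base-32 character. Points that share a Geohash prefix fall in the same grid cell, so the string can serve as a grid number. The precision the patent uses is not stated in this section; the values here are illustrative only.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a Geohash string."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, value = [], 0, 0
    use_lon = True  # bits are interleaved, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid   # keep the upper half of the interval
        else:
            value <<= 1
            rng[1] = mid   # keep the lower half of the interval
        use_lon = not use_lon
        bits += 1
        if bits == 5:      # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


cell = geohash_encode(57.64911, 10.40744, 3)
print(cell)
```

Each extra character narrows the cell, so a longer Geohash always starts with the shorter one for the same point.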
CN202011530817.1A 2020-12-22 2020-12-22 Network taxi booking target order prediction method based on CNN-LSTM Active CN112561187B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011530817.1A CN112561187B (en) 2020-12-22 2020-12-22 Network taxi booking target order prediction method based on CNN-LSTM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011530817.1A CN112561187B (en) 2020-12-22 2020-12-22 Network taxi booking target order prediction method based on CNN-LSTM

Publications (2)

Publication Number Publication Date
CN112561187A true CN112561187A (en) 2021-03-26
CN112561187B CN112561187B (en) 2022-06-03

Family

ID=75031648

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011530817.1A Active CN112561187B (en) 2020-12-22 2020-12-22 Network taxi booking target order prediction method based on CNN-LSTM

Country Status (1)

Country Link
CN (1) CN112561187B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108985475A (en) * 2018-06-13 2018-12-11 厦门大学 Net based on deep neural network about vehicle car service needing forecasting method
CN110135624A (en) * 2019-04-15 2019-08-16 武汉科技大学 A kind of data predication method of the combination LSTM model based on 2-D data stream
CN110599767A (en) * 2019-09-04 2019-12-20 广东工业大学 Long-term and short-term prediction method based on network taxi appointment travel demands
CN110633871A (en) * 2019-09-25 2019-12-31 大连理工大学 Regional traffic demand prediction method based on convolution long-term and short-term memory network
CN111429235A (en) * 2020-04-17 2020-07-17 汉海信息技术(上海)有限公司 Method, device and equipment for acquiring order thermodynamic information and storage medium
CN111476588A (en) * 2019-01-24 2020-07-31 北京嘀嘀无限科技发展有限公司 Order demand prediction method and device, electronic equipment and readable storage medium
EP3716164A1 (en) * 2019-03-28 2020-09-30 Accenture Global Solutions Limited Predictive power usage monitoring

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
ANDRE DANTAS: "Neural network for travel demand forecast using GIS and remote sensing", 《IEEE-INNS-ENNS INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS》 *
JINTAO KE: "Short-term forecasting of passenger demand under on-demand ride services:A spatio-temporal deep learning approach", 《TRANSPORTATION RESEARCH》 *
MIAOHUA HUANG: "Prediction of remaining useful life of lithium-ion battery based on multi-kernel support vector machine with particle swarm optimization", 《JOURNAL OF POWER ELECTRONICS》 *
张逢笑: "基于深度神经网络的网约车出行需求预测方法研究", 《中国优秀硕士学位论文全文数据库 工程科技II辑》 *
李浩: "融合VGG与FCN的智能出租车订单预测模型", 《计算机工程》 *
谷远利: "基于深度学习的网约车供需缺口短时预测研究", 《交通运输系统工程与信息》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113222373A (en) * 2021-04-28 2021-08-06 广州宸祺出行科技有限公司 Driver scheduling method and system based on value selection
CN114066076A (en) * 2021-11-22 2022-02-18 北京白龙马云行科技有限公司 Network taxi appointment prediction method and device based on multiple tenants
CN114331011A (en) * 2021-11-30 2022-04-12 中国科学院深圳先进技术研究院 Multi-queue model dispatching system and method and dispatching algorithm based on network flow
CN114418606A (en) * 2021-12-01 2022-04-29 武汉大学 Network taxi appointment order demand prediction method based on space-time convolutional network
CN114418606B (en) * 2021-12-01 2024-05-28 武汉大学 Network vehicle order demand prediction method based on space-time convolution network
CN114819414A (en) * 2022-06-24 2022-07-29 北京阿帕科蓝科技有限公司 Block demand prediction method, system and computer storage medium
CN116822916A (en) * 2023-08-31 2023-09-29 北京阿帕科蓝科技有限公司 Order quantity acquisition method, order quantity acquisition device, computer equipment and storage medium
CN116822916B (en) * 2023-08-31 2024-01-26 北京阿帕科蓝科技有限公司 Order quantity acquisition method, order quantity acquisition device, computer equipment and storage medium
CN117575546A (en) * 2024-01-17 2024-02-20 北京白龙马云行科技有限公司 Background management system for network vehicle-restraining platform
CN117575546B (en) * 2024-01-17 2024-04-05 北京白龙马云行科技有限公司 Background management system for network vehicle-restraining platform

Also Published As

Publication number Publication date
CN112561187B (en) 2022-06-03

Similar Documents

Publication Publication Date Title
CN112561187B (en) Network taxi booking target order prediction method based on CNN-LSTM
CN110570651B (en) Road network traffic situation prediction method and system based on deep learning
CN108985475B (en) Network taxi appointment and taxi calling demand prediction method based on deep neural network
WO2021212866A1 (en) Vehicle travel volume prediction model construction method, and prediction method and system
CN110503104B (en) Short-time remaining parking space quantity prediction method based on convolutional neural network
CN110390349A (en) Bus passenger flow volume based on XGBoost model predicts modeling method
CN110210664B (en) Deep learning method for short-term prediction of using behaviors of multiple individual vehicles
CN112863182B (en) Cross-modal data prediction method based on transfer learning
CN113380025B (en) Vehicle driving quantity prediction model construction method, prediction method and system
CN110991607B (en) Subway passenger flow prediction method and device, electronic equipment and storage medium
CN106910199A (en) Towards the car networking mass-rent method of city space information gathering
CN113780665B (en) Private car stay position prediction method and system based on enhanced recurrent neural network
CN112488185A (en) Method, system, electronic device and readable storage medium for predicting vehicle operating parameters including spatiotemporal characteristics
CN114692984A (en) Traffic prediction method based on multi-step coupling graph convolution network
Tavafoghi et al. A queuing approach to parking: Modeling, verification, and prediction
CN111815956B (en) Expressway traffic flow prediction method
CN116187591A (en) Method for predicting number of remaining parking spaces in commercial parking lot based on dynamic space-time trend
CN112559585A (en) Traffic space-time sequence single-step prediction method, system and storage medium
CN110490365B (en) Method for predicting network car booking order quantity based on multi-source data fusion
CN114418606B (en) Network vehicle order demand prediction method based on space-time convolution network
CN113821547B (en) Rapid and efficient short-time prediction method, system and storage medium for occupancy of parking lot
CN114372830A (en) Network taxi booking demand prediction method based on space-time multi-graph neural network
CN116307293B (en) Urban space-time data prediction method based on hybrid perception and causal depolarization
CN111985731A (en) Method and system for predicting number of people at urban public transport station
CN116543528A (en) Regional landslide hazard early warning method based on rainfall threshold

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant