CN113222645A - Urban hot spot area peak trip demand prediction method based on multi-source data fusion
- Publication number
- CN113222645A (CN202110443653.7A)
- Authority
- CN
- China
- Prior art keywords
- trip
- travel
- influence
- data
- peak
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0202—Market predictions or forecasting for commercial activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Strategic Management (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Finance (AREA)
- Entrepreneurship & Innovation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Accounting & Taxation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Game Theory and Decision Science (AREA)
- Mathematical Physics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Traffic Control Systems (AREA)
Abstract
The invention discloses a method for predicting peak travel demands in an urban hot spot area based on multi-source data fusion, which comprises the following steps: acquiring a travel influence feature set of an area to be predicted, wherein the travel influence feature set comprises travel influence features and/or a travel influence feature combination; respectively acquiring peak real-time data corresponding to each trip influence characteristic and/or each trip influence characteristic combination in a trip influence characteristic set; respectively inputting peak real-time data corresponding to each trip influence characteristic and/or each trip influence characteristic combination into the corresponding trained GRU model to obtain a first trip prediction result corresponding to each trip influence characteristic and/or each trip influence characteristic combination; and inputting the first trip prediction results corresponding to all trip influence characteristics and/or all trip influence characteristic combinations into the trained random forest model, and obtaining the peak trip demand prediction result of the area to be predicted. The invention can obtain a more accurate travel demand prediction result.
Description
Technical Field
The invention relates to the technical field of traffic data processing, in particular to a method for predicting peak travel demands in an urban hot spot area based on multi-source data fusion.
Background
With the development of technologies such as big data and the Internet of Things, smart cities and intelligent transportation have become the direction of urban construction. As urbanization spreads and urban populations grow rapidly, the travel of urban residents has become a key concern in urban traffic construction, and predicting residents' travel demand has become a research direction in the related technical fields.
At present, urban travel demand is growing sharply, and the travel purposes and travel modes of urban residents are increasingly diverse. The resulting travel data come from more and more sources, including GPS positioning data, mobile phone signaling data, bus card swiping data and the like. Urban traffic data extracted from these sources helps to analyze and predict urban travel demand more comprehensively. Traffic flow data extracted from the GPS positioning data of urban taxis represents a convenient and rapid travel mode that provides flexible service for citizens' travel demand. Traffic flow data extracted from the mobile phone signaling data of urban residents covers the various travel modes in the city, and analyzing it makes it possible to recognize and study people's daily activity patterns and thus reflect the overall travel situation of the city. Analyzing these data reveals and helps solve many traffic problems, but each data source tends to capture characteristics from only one particular perspective. How to effectively associate and match these massive data sets, and how to extract and mine urban travel demand from them, has become a major research trend. Existing travel demand prediction technology does not make full use of these data and suffers from low prediction accuracy.
Disclosure of Invention
The invention solves the problem that the prediction accuracy is not high in the existing travel demand prediction technology.
The invention provides a multi-source data fusion-based urban hot spot region peak travel demand prediction method, which comprises the following steps:
acquiring a travel influence feature set of the area to be predicted, wherein the travel influence feature set comprises travel influence features and/or a travel influence feature combination;
respectively acquiring peak real-time data corresponding to each trip influence characteristic and/or each trip influence characteristic combination in the trip influence characteristic set;
respectively inputting peak real-time data corresponding to each trip influence characteristic and/or each trip influence characteristic combination into a corresponding trained GRU model to obtain a first trip prediction result corresponding to each trip influence characteristic and/or each trip influence characteristic combination;
and inputting the first trip prediction results corresponding to all the trip influence characteristics and/or all the trip influence characteristic combinations into a trained random forest model, and obtaining a peak trip demand prediction result of the area to be predicted.
Optionally, before the obtaining of the travel influence feature set of the region to be predicted, the method further includes:
acquiring travel speed data for the roads associated with the area to be predicted, and performing cluster statistics on the travel speed data to obtain the distribution pattern of travel speed on those roads over time;
and determining the peak travel time of the area to be predicted according to the travel speed distribution pattern.
Optionally, the travel influence characteristic comprises at least one of:
land planning, date characteristics, traffic planning, weather changes, special events, road operating conditions.
Optionally, the travel influence feature combination includes at least one of:
land planning and traffic planning; date features and special events; date features and weather changes; weather changes and road operating conditions.
Optionally, the training process of the GRU model and the random forest model includes:
acquiring historical trip data of the peak trip time of the area to be predicted, and generating a training data set according to the historical trip data;
predicting by adopting the GRU model based on the training data set to obtain a preliminary prediction result, inputting the preliminary prediction result into the random forest model, and outputting a final prediction result by the random forest model;
and calculating the error between the final prediction result and the actual trip volume, and adjusting the model parameters of the GRU model and the random forest model based on the error until the loss function converges.
Optionally, the loss function comprises:
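$$L(w, b) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - f(wx_i + b)\bigr)^2$$
wherein $n$ is the number of samples, $y_i$ represents the actual trip value of the sample data, and $f(wx_i + b)$ represents the value predicted by the prediction function $f$ from the sample data $x_i$. The weight $w$ and the bias $b$ in $f(wx_i + b)$ are adjusted continuously with a learning rate $\alpha$, where $w_i$ represents the initial value of the weight and $w_{i+1}$ represents the updated weight value, by:
$$w_{i+1} = w_i - \alpha \frac{\partial L}{\partial w_i}$$
and obtaining the optimal w value and the optimal b value.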
Optionally, the historical travel data includes: taxi GPS data, bus GPS data, mobile phone signaling data, bus IC card swiping data and/or subway IC card swiping data.
The invention also provides a travel demand prediction device, which comprises:
an obtaining unit, configured to obtain a travel influence feature set of the area to be predicted, where the travel influence feature set includes a travel influence feature and/or a travel influence feature combination; respectively acquiring peak real-time data corresponding to each trip influence characteristic and/or each trip influence characteristic combination in the trip influence characteristic set;
the first prediction unit is used for respectively inputting peak real-time data corresponding to each trip influence characteristic and/or each trip influence characteristic combination into the corresponding trained GRU model to obtain a first trip prediction result corresponding to each trip influence characteristic and/or each trip influence characteristic combination;
and the second prediction unit is used for inputting the first trip prediction results corresponding to all the trip influence characteristics and/or all the trip influence characteristic combinations into the trained random forest model to obtain the peak trip demand prediction result of the area to be predicted.
The invention further provides a travel demand prediction terminal, which comprises a computer readable storage medium and a processor, wherein the computer readable storage medium is used for storing a computer program, and when the computer program is read and operated by the processor, the travel demand prediction method for the peak travel demand in the urban hot spot area based on the multi-source data fusion is realized.
The invention further provides a computer-readable storage medium, which stores a computer program, and when the computer program is read and executed by a processor, the method for predicting the peak travel demand in the urban hot spot area based on multi-source data fusion is realized.
Drawings
Fig. 1 is a schematic diagram of an embodiment of a method for predicting peak travel demand in an urban hot spot area based on multi-source data fusion according to the present invention;
Fig. 2 is a schematic diagram of another embodiment of the urban hot spot area peak travel demand prediction method based on multi-source data fusion according to the present invention;
Fig. 3 is a schematic diagram of an embodiment of a travel impact characteristic data set in the method for predicting peak travel demand in an urban hot spot area based on multi-source data fusion according to the present invention;
Fig. 4 is a schematic diagram illustrating an example of a prediction method for peak travel demand in an urban hot spot area based on multi-source data fusion according to the present invention;
Fig. 5 is a schematic diagram illustrating comparison of effects of the urban hot spot area peak travel demand prediction method based on multi-source data fusion and other methods;
Fig. 6 is a schematic diagram of an embodiment of a trip demand forecasting apparatus according to the present invention;
Fig. 7 is a schematic diagram of an embodiment of a travel demand forecasting terminal according to the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
The invention provides a multi-source data fusion-based urban hot spot area peak travel demand prediction method.
Fig. 1 is a schematic diagram of an embodiment of a method for predicting peak travel demand in an urban hot spot area based on multi-source data fusion. The urban hot spot area peak travel demand prediction method based on multi-source data fusion comprises the following steps:
step S10, obtaining a travel influence feature set of the area to be predicted, where the travel influence feature set includes a travel influence feature and/or a travel influence feature combination.
The area to be predicted is an urban hotspot area, such as a school, a hospital, a scenic spot, an amusement park or a market. Different areas to be predicted have different travel influence feature sets, and the correspondence between area type and travel influence feature set can be preset. When the trip volume of an area is to be predicted, the area type is determined first, the preset correspondence is then queried with that area type, and the travel influence feature set corresponding to the area to be predicted is thereby determined.
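The preset correspondence described above can be sketched as a simple lookup table. This is only an illustration: the region type keys, the feature names and the function `feature_set_for` are assumptions introduced here and do not come from the patent text.

```python
# Hypothetical mapping from hotspot region type to its preset travel influence
# feature set. Keys and feature names are illustrative assumptions; a tuple
# stands for a feature combination.
REGION_FEATURE_SETS = {
    "school": ["date", "weather", ("date", "special_event")],
    "hospital": ["date", "road_condition"],
    "scenic_spot": ["weather", "special_event", ("date", "weather")],
}


def feature_set_for(region_type):
    """Return the preset feature set for a region type, or raise if unknown."""
    try:
        return REGION_FEATURE_SETS[region_type]
    except KeyError:
        raise ValueError(f"no feature set preset for region type {region_type!r}")
```

Raising on an unknown region type, rather than returning an empty set, is a design choice of this sketch so that a missing preset is noticed early.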
The travel influence features are factors that influence the travel demand of the area to be predicted. Optionally, they may be spatiotemporal features such as road type and date; features such as weather, special events and road conditions; or household composition, occupation category, household income, car ownership and home address features.
Optionally, as shown in Fig. 3, the spatiotemporal characteristics of urban travel demand are analyzed for different time periods and different environments, and the travel influence feature data is divided into a static influence feature set, a dynamic influence feature set and a social influence feature set. These are stored as a static factor data set, a dynamic factor data set and a social factor data set respectively, so that each feature data set is managed in its own way. The social influence feature set is the basic factor determining the traffic volume, while the static and dynamic influence feature sets are factors that influence it.
The static influence feature set comprises features strongly tied to the area to be predicted, such as traffic planning features, road types and time. A database can be preset to store these features for each area. When an area becomes the area to be predicted, its static influence features are read directly from the database.
The dynamic influence feature set comprises frequently changing features such as weather, special events and road conditions. In one embodiment, a database is provided with weather fields whose values are updated in real time, and the data of the dynamic influence feature set is read directly from the database when needed; the weather data can be retrieved through a weather API (application programming interface). In one embodiment, holidays and major events (concerts, marathons, etc.) are set as special events, the time of each special event is stored, and the degree of association between each area and each special event is preset. A positive degree of association indicates that the special event increases the travel demand of the area, a negative degree indicates that it decreases the travel demand, and the larger the absolute value, the greater the influence of the special event on the travel demand of the area. For example, when the special event is an outbreak of an infectious disease, the degree of association between a market area and the event is negative with a large absolute value; when the special event is a national holiday, the degree of association between a scenic spot and the event is positive with a large absolute value.
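The signed degree of association can be illustrated with a short sketch. The table `ASSOCIATION`, its numeric values and the function `describe_event_effect` are hypothetical and chosen only to mirror the two examples in the text; the patent does not specify a value range or storage format.

```python
# Hypothetical preset association degrees for (region type, special event).
# The numbers are illustrative assumptions, not values from the patent.
ASSOCIATION = {
    ("market", "epidemic"): -0.9,
    ("scenic_spot", "national_holiday"): 0.8,
}


def describe_event_effect(region, event):
    """Interpret a signed association degree as (direction, strength)."""
    degree = ASSOCIATION.get((region, event), 0.0)  # 0.0: no preset association
    if degree > 0:
        direction = "increase"
    elif degree < 0:
        direction = "decrease"
    else:
        direction = "none"
    return direction, abs(degree)
```

The sign gives the direction of the effect on travel demand and the absolute value its strength, as the paragraph above describes.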
In one embodiment, road conditions are stored in the database. The road condition indicates, for each road, whether there are obstacles, road construction and the like, and can be stored and updated in real time. An association between each area and its roads is established so that road conditions are linked to areas, and the road conditions affecting the travel demand of the area to be predicted can then be read directly when road condition data is needed.
The social influence feature set comprises household composition, occupation category, household income, car ownership and home address features and the like.
The travel influence feature set of the region to be predicted may include a single travel influence feature, or may include a combination of travel influence features that are combined from a plurality of travel influence features. For example, the travel influence feature set may include a single feature such as "time feature", "weather feature" and "special event feature", and the travel influence feature set may further include a feature combination such as "time feature and weather feature", "time feature and special event feature", and "weather feature and special event feature".
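A feature set that mixes single features with feature combinations can be sketched as follows. The helper `build_feature_set` is hypothetical and enumerates every pair, which happens to reproduce the three combinations in the example above; in practice only a preset subset of combinations might be kept.

```python
from itertools import combinations


def build_feature_set(features, combo_size=2):
    """Return single features (as 1-tuples) followed by all combinations."""
    singles = [(f,) for f in features]
    combos = list(combinations(features, combo_size))
    return singles + combos


feature_set = build_feature_set(["time", "weather", "special_event"])
```

Representing every entry as a tuple lets later code treat single features and combinations uniformly, for example as dictionary keys that select the matching model.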
Optionally, the travel influence features comprise at least one of: land planning features, date features, traffic planning features, weather change features, special event features, and road operating condition features. By including these travel influence features in the travel influence feature set of the area to be predicted, the factors influencing its travel demand are considered comprehensively, so that a more accurate travel demand prediction result is obtained.
Optionally, the travel influence feature combination includes at least one of: land planning and traffic planning; date features and special events; date features and weather changes; weather changes and road operating conditions. By including these feature combinations in the travel influence feature set of the area to be predicted, the weight of the combinations in the prediction is strengthened, so that a more accurate travel demand prediction result can be obtained.
Step S20, respectively obtaining peak real-time data corresponding to each trip influence feature and/or each trip influence feature combination in the trip influence feature set.
The peak travel time of the area to be predicted is determined, the peak real-time data of the area is screened out, and the peak real-time data corresponding to each travel influence feature and/or each travel influence feature combination in the travel influence feature set is acquired.
Step S30, respectively inputting peak real-time data corresponding to each trip influence feature and/or each trip influence feature combination into the corresponding trained GRU model, and obtaining a first trip prediction result corresponding to each trip influence feature and/or each trip influence feature combination.
A corresponding GRU model is preset for each travel influence feature or travel influence feature combination. The peak real-time data corresponding to each feature or feature combination is input into its GRU model to obtain the corresponding first trip prediction result, and the first trip prediction results serve as the input of the random forest model. This makes full use of the GRU model's strength in handling the dynamic patterns of traffic data and capturing temporal characteristics: the travel demand prediction under each feature or feature combination is determined first, and the final prediction result is then obtained through further analysis by the subsequent random forest model.
And step S40, inputting the first trip prediction results corresponding to all the trip influence characteristics and/or all the trip influence characteristic combinations into a trained random forest model, and obtaining a peak trip demand prediction result of the area to be predicted.
After the first trip prediction results corresponding to the travel influence features and/or travel influence feature combinations are obtained, all of them are input into the random forest model, which predicts the travel demand across those features and/or feature combinations. This makes full use of the advantages of both the GRU model and the random forest model and improves the accuracy of the travel demand prediction model. In addition, various kinds of data can be easily accessed and analyzed with good timeliness and comprehensiveness, correlation analysis across travel influence features is supported, and travel demand under multiple influence features is predicted more realistically.
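The second stage can be sketched with a toy ensemble. This is not the patent's implementation: the code below bags depth-1 regression trees (stumps) as a simplified stand-in for a random forest, and the first-stage values, which stand in for GRU outputs, are invented toy numbers. All function names are assumptions.

```python
import random


def fit_stump(rows, targets, feature):
    """Depth-1 regression tree: split one feature at its mean value."""
    threshold = sum(r[feature] for r in rows) / len(rows)
    left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
    right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
    overall = sum(targets) / len(targets)
    left_mean = sum(left) / len(left) if left else overall
    right_mean = sum(right) / len(right) if right else overall
    return feature, threshold, left_mean, right_mean


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Bag stumps: bootstrap the rows and pick one random feature per tree."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        feature = rng.randrange(len(rows[0]))
        forest.append(
            fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature)
        )
    return forest


def predict_forest(forest, row):
    """Average the predictions of all stumps."""
    preds = [lm if row[f] <= th else rm for f, th, lm, rm in forest]
    return sum(preds) / len(preds)


# Each row holds the first-stage predictions of two per-feature models for one
# peak period (toy stand-ins for GRU outputs); targets are observed volumes.
first_stage = [[100, 110], [120, 125], [200, 190], [220, 230]]
observed = [105, 122, 195, 225]
forest = fit_forest(first_stage, observed)
estimate = predict_forest(forest, [210, 205])
```

A production random forest would grow deeper trees and search for the best split at each node; the sketch only shows how per-feature predictions become the input columns of the ensemble.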
Here, the GRU (Gated Recurrent Unit) is a type of recurrent neural network (RNN). The travel influence features and/or travel influence feature combinations are input into the trained GRU models to obtain preliminary travel demand prediction results (namely the first trip prediction results). These preliminary results are input into the random forest model, which outputs the final travel demand prediction result. The combination exploits two strengths: the GRU model learns the dynamic patterns of traffic data and captures its temporal characteristics, while the random forest model combines a large number of weak models into a strong model and makes better use of the spatiotemporal correlation in traffic flow data. By combining the GRU model and the random forest model, correlations among different travel influence features can be mined and data patterns identified quickly, yielding strongly correlated data fusion with small errors and high prediction accuracy.
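For readers unfamiliar with the GRU, the following minimal sketch shows one common formulation of a GRU cell with a scalar input and a scalar hidden state. The patent does not disclose its network dimensions or parameterization, so the parameter names in `p` and the functions `gru_step` and `gru_forward` are assumptions for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, p):
    """One step of a scalar GRU cell; x and h_prev are floats, p holds weights."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    # Candidate state: the reset gate scales how much of the past is used.
    n = math.tanh(p["wn"] * x + p["un"] * (r * h_prev) + p["bn"])
    # Convention used here: z close to 1 keeps the previous state.
    return (1.0 - z) * n + z * h_prev


def gru_forward(sequence, p, h0=0.0):
    """Run the cell over a sequence and return the final hidden state."""
    h = h0
    for x in sequence:
        h = gru_step(x, h, p)
    return h
```

The update gate decides how much of the previous state to keep, which is what lets the cell track slowly changing patterns such as daily peaks. Some references swap the roles of `z` and `1 - z`; the two conventions are equivalent up to relabeling.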
Optionally, before step S10, the method further includes: acquiring travel speed data for the roads associated with the area to be predicted, performing cluster statistics on the travel speed data to obtain the distribution pattern of travel speed on those roads over time, and determining the peak travel time of the area to be predicted according to that distribution pattern.
The heavier the traffic, the slower the travel speed. For example, during commuting peak hours the traffic flow on the road is larger and speeds are slower, whereas during the off-peak hours in the middle of the night the traffic flow is smaller and speeds are faster. The peak travel time of the area to be predicted can therefore be determined from the travel speed distribution pattern of its associated roads. It can be determined by performing cluster statistics on the movement speeds in GPS data; alternatively, the movement speeds of vehicles and people can be measured by sensors installed beside the road in advance, and the peak travel time determined by performing cluster statistics on those speeds. As shown in Fig. 4, taking the travel distribution of Shenzhen citizens as an example, 24-hour statistics on GPS data combined with a division by building function show that the travel peak periods around "hospitals" in Shenzhen are concentrated on working days at 7:00-9:00 and 13:00-15:00, around "schools" on working days at 7:00-8:00 and 16:00-18:00, around "colons" on non-working days at 10:00-20:00, around "hubs and ports" on non-working days at 8:00-22:00, and around "scenic spots" on non-working days at 10:00-18:00.
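The text does not name a clustering algorithm, so the sketch below makes an assumption: it averages speed per hour and then splits the hours into a slow and a fast group with one-dimensional 2-means, treating the slow group as peak hours. The function names and the `(hour, speed_kmh)` record layout are hypothetical.

```python
from collections import defaultdict


def hourly_mean_speed(records):
    """records: iterable of (hour, speed_kmh); returns {hour: mean speed}."""
    sums, counts = defaultdict(float), defaultdict(int)
    for hour, speed in records:
        sums[hour] += speed
        counts[hour] += 1
    return {h: sums[h] / counts[h] for h in sums}


def peak_hours(records, iterations=20):
    """Cluster hourly mean speeds into slow/fast groups; return the slow hours."""
    means = hourly_mean_speed(records)
    lo, hi = min(means.values()), max(means.values())  # initial centroids
    for _ in range(iterations):
        slow = [v for v in means.values() if abs(v - lo) <= abs(v - hi)]
        fast = [v for v in means.values() if abs(v - lo) > abs(v - hi)]
        lo = sum(slow) / len(slow)
        hi = sum(fast) / len(fast) if fast else hi
    return sorted(h for h, v in means.items() if abs(v - lo) <= abs(v - hi))


# Toy speed samples in km/h, invented for illustration.
records = [(7, 18), (7, 22), (8, 20), (12, 45), (3, 60), (3, 58)]
print(peak_hours(records))  # → [7, 8]
```

With these toy samples, hours 7 and 8 have the lowest mean speeds and are returned as the peak period. Real data would also need to be separated into working and non-working days, as the Shenzhen example indicates.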
Optionally, the training process of the GRU model and the random forest model includes:
acquiring historical trip data of the peak trip time of the area to be predicted, and generating a training data set according to the historical trip data; predicting by adopting the GRU model based on the training data set to obtain a preliminary prediction result, inputting the preliminary prediction result into the random forest model, and outputting a final prediction result by the random forest model; and calculating the error between the final prediction result and the actual trip volume, and adjusting the model parameters of the GRU model and the random forest model based on the error until the loss function converges.
The training data set comprises trip influence characteristic data (trip influence characteristics or trip influence characteristic combinations) and corresponding actual trip condition data. The travel influence characteristic data is characteristic data with strong correlation with travel demands, so that the training speed can be improved by mainly learning low-gradient information gain during model training.
The historical trip data comprises taxi GPS data, bus GPS data, mobile phone signaling data, bus IC card swiping data and/or subway IC card swiping data. As shown in Fig. 2, data cleaning and data preprocessing are performed before the training data set is generated. After the historical trip data is collected, it is cleaned to remove duplicate data, data with important fields missing, irrelevant data, unrecoverable or obviously wrong data (for example, the traffic flow of one vehicle type being clearly larger than the total traffic flow while the total traffic flow is normal), abnormal data that violates conventional thresholds, abnormal data that contradicts traffic flow theory, and drifting data (for example, GPS drift). Depending on the condition of the data, the sample is then expanded either directly (the data volume at a given time is inferred from other data sources) or by features (historical features are simulated to capture data characteristics, and the data volume at a given time is inferred by a given rule). The cleaned and expanded data is converted into valid data in a unified format, and the actual travel condition data is generated from this valid data. By using traffic data (such as taxi GPS data and bus GPS data) together with non-traffic data (such as road conditions and special events) as training data for the GRU model and the random forest model, the models generalize better across multi-source data, the comprehensiveness of the data supports the accuracy of the final travel prediction, and the fluctuation in prediction accuracy that existing travel demand prediction technology suffers for lack of non-traffic data is avoided.
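Some of the cleaning rules listed above can be sketched as a single filtering pass. The record layout (a dict with `vehicle_id`, `timestamp`, `speed_kmh`, `type_flow`, `total_flow`) and the 120 km/h drift threshold are assumptions made for this example; the patent does not define field names or thresholds.

```python
def clean_records(records, max_speed_kmh=120.0):
    """Drop duplicates, rows missing key fields, inconsistent flows and drift."""
    required = ("vehicle_id", "timestamp", "speed_kmh", "type_flow", "total_flow")
    seen, cleaned = set(), []
    for rec in records:
        if any(rec.get(k) is None for k in required):
            continue  # important field missing
        key = (rec["vehicle_id"], rec["timestamp"])
        if key in seen:
            continue  # repeated record
        if rec["type_flow"] > rec["total_flow"]:
            continue  # one vehicle type cannot exceed the total flow
        if not 0.0 <= rec["speed_kmh"] <= max_speed_kmh:
            continue  # implausible speed, treated here as GPS drift
        seen.add(key)
        cleaned.append(rec)
    return cleaned
```

Sample expansion and format unification would follow this step; they depend on the other data sources available and are not sketched here.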
The GRU model and the random forest model are trained using the mean squared error as the loss function to obtain the trained GRU model and the trained random forest model with optimal parameters. The mean squared error (MSE) is the expected value of the squared difference between the estimated value and the true value, and the models can be fitted by minimizing a mean squared error loss function with the gradient descent method. The mean squared error loss function can be expressed as:
$$L(w, b) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - f(wx_i + b)\bigr)^2$$
wherein $n$ is the number of samples, $y_i$ represents the actual trip value of the sample data, and $f(wx_i + b)$ represents the value predicted by the prediction function $f$ from the sample data $x_i$. The weight $w$ and the bias $b$ in $f(wx_i + b)$ are adjusted continuously with a learning rate $\alpha$, where $w_i$ represents the initial value of the weight and $w_{i+1}$ represents the updated weight value, by the following equation:
$$w_{i+1} = w_i - \alpha \frac{\partial L}{\partial w_i}$$
and obtaining the most appropriate w and b values so as to obtain an optimal function model.
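The update rule can be illustrated on the simplest possible case. The sketch below assumes that the prediction function f is the identity, so the model is y ≈ w·x + b, and it updates the bias b with the same rule as the weight w. The function name `fit_linear_mse`, the learning rate and the toy data are assumptions for illustration.

```python
def fit_linear_mse(xs, ys, alpha=0.05, epochs=2000):
    """Fit y ≈ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))  # dL/dw
        grad_b = 2.0 / n * sum(errors)                             # dL/db
        w -= alpha * grad_w  # w_{i+1} = w_i - alpha * dL/dw
        b -= alpha * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should approach w=2, b=1.
w, b = fit_linear_mse([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

A neural network such as the GRU applies the same idea to many parameters at once, with the gradients obtained by backpropagation rather than by hand.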
The GRU model and the random forest model are trained by using the mean square error as a loss function, an optimal function model with optimal parameters is obtained, and a fast model training speed can be obtained, so that a travel prediction result finally output by the model is infinitely close to an actual value, and a relatively accurate travel prediction result is obtained.
In conceiving the solution of the invention, the applicant analyzed the drawbacks of the prior art in depth. The prior art generally uses a single model for travel prediction, for example a prediction model based on statistical principles and calculus, or a prediction model based on a machine learning algorithm. A prediction model based on statistical principles and calculus predicts future traffic flow from the statistical characteristics of historical flow data. Its drawback is that influence factors such as weather and regional function are not considered, so situations with strongly random traffic flow are difficult to predict accurately. The drawback of a prediction model based on a machine learning algorithm is that some characteristics inherent in traffic flow data are ignored. A single prediction model therefore has difficulty accounting for both the inherent characteristics of traffic flow data and the external influences of season, climate or human factors.
To overcome the drawbacks of a single model, a traffic flow prediction model that takes multiple traffic flow influence factors into account is needed. On this basis, the applicant designed a combined model based on a GRU model and a random forest (RF) model: the GRU model learns how the traffic data changes dynamically with each factor and captures the temporal characteristics, and the model-building capability of the RF model is used to fuse the dynamic change rules of the multiple factors, which finally yields a traffic flow prediction model with higher accuracy. The applicant ran simulated predictions with the GRU-RF combined model, the SARIMA-RF combined model and the SVR-RF combined model, taking a scenic area of Shenzhen (that is, a hot spot area) as the area to be predicted and predicting the traffic around the peak period of that area.
The trip demand data around the peak period of the scenic area is processed, and the characteristic factors of the study area are each taken as a conditional input (the dynamic factor is the weather, the static factor is whether the day is a working day, and the social factors are the resident population and the floating population around the hot spot area). Taking one influence factor, the weather, as an example: when this factor is input into a specific model (any one of GRU, SARIMA or SVR), the corresponding weather data set is screened out of the mass data set. For example, to study how rainy days affect travel at the scenic area, historical peak-period travel data from multiple rainy days and peak-period travel data from sunny days at the same frequency are selected and compared in the model. This gives the change rule of travel demand under that factor, from which the dynamic change rule of travel demand under a single-factor condition can be calculated. The characteristic rules of several single factors are obtained in this step, and these output rules are used as the input of the RF (random forest) model, which outputs the travel demand under multiple conditions. Fig. 5 schematically shows the prediction result of each combined model and the actual trip participation (that is, the actual trips). Curve 1 represents the prediction result of the SARIMA-RF combined model, curve 2 the prediction result of the SVR-RF combined model, curve 3 the prediction result of the GRU-RF combined model, and curve 4 the actual trip participation. As can be seen from fig. 5, the GRU-RF combined model is the closest to the actual trip participation; it identifies the data rules more quickly, forms a strong association in the data fusion, and has a smaller error.
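The single-factor comparison in the rainy-day example can be illustrated with a deliberately simple stand-in. Where the source compares the two data sets inside a GRU, SARIMA or SVR model, the sketch below only takes the ratio of mean peak demand under the factor to mean peak demand under the baseline, per time slot. The tuple layout and the function name `factor_effect` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def factor_effect(observations, factor_value, baseline_value):
    """Per-slot ratio of mean demand under factor_value to mean demand under baseline_value.

    observations: iterable of (time_slot, factor, demand) tuples (assumed layout).
    Slots that lack data for either condition are skipped.
    Baseline demand is assumed to be positive, so the division is safe.
    """
    by_slot = defaultdict(lambda: defaultdict(list))
    for slot, factor, demand in observations:
        by_slot[slot][factor].append(demand)

    effect = {}
    for slot, groups in sorted(by_slot.items()):
        if groups[factor_value] and groups[baseline_value]:
            effect[slot] = mean(groups[factor_value]) / mean(groups[baseline_value])
    return effect
```

A ratio below 1 for a slot means that demand in that slot was lower under the factor (for example, rain) than under the baseline (for example, sunny weather) in the selected history.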
The invention further provides a travel demand forecasting device. Fig. 6 is a schematic diagram of an embodiment of a travel demand prediction apparatus according to the present invention, where the travel demand prediction apparatus includes:
an obtaining unit 101, configured to obtain a trip influence feature set of the area to be predicted, where the trip influence feature set includes trip influence features and/or trip influence feature combinations, and to acquire the peak real-time data corresponding to each trip influence feature and/or each trip influence feature combination in the trip influence feature set;
a first prediction unit 102, configured to input peak real-time data corresponding to each trip influence feature and/or each trip influence feature combination to a corresponding trained GRU model, respectively, so as to obtain a first trip prediction result corresponding to each trip influence feature and/or each trip influence feature combination;
and a second prediction unit 103, configured to input the first trip prediction results corresponding to all the trip influence features and/or all trip influence feature combinations into the trained random forest model, and obtain a peak trip demand prediction result of the area to be predicted.
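The data flow through the three units can be sketched as a two-stage function. This is an illustration, not the apparatus itself: the type aliases, the function `predict_peak_demand`, and the stand-in models `last_value` and `average` are hypothetical, and any callables with the same shape (for example, wrappers around a trained GRU and a trained random forest) could be passed in their place.

```python
from typing import Callable, Dict, List, Sequence

# A stage-one model maps a peak-period series to one forecast value.
# In the source this role is played by a trained GRU model per feature.
StageOneModel = Callable[[Sequence[float]], float]
# The stage-two model maps the vector of stage-one forecasts to the final demand.
# In the source this role is played by the trained random forest model.
StageTwoModel = Callable[[List[float]], float]


def predict_peak_demand(
    realtime_data: Dict[str, Sequence[float]],
    stage_one: Dict[str, StageOneModel],
    stage_two: StageTwoModel,
) -> float:
    """Run each feature's series through its own model, then fuse the outputs."""
    feature_names = sorted(realtime_data)   # fixed order for the fused vector
    first_results = [stage_one[name](realtime_data[name]) for name in feature_names]
    return stage_two(first_results)


# Stand-in models, purely illustrative.
def last_value(series: Sequence[float]) -> float:
    return float(series[-1])


def average(values: List[float]) -> float:
    return sum(values) / len(values)
```

With `data = {"weather": [100, 110, 120], "date": [90, 95, 100]}` and `last_value` used for both features, `predict_peak_demand(data, {"weather": last_value, "date": last_value}, average)` fuses the two first-stage results 100.0 and 120.0 into 110.0. Sorting the feature names keeps the order of the fused vector the same between training and prediction.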
Optionally, the travel demand prediction apparatus further includes:
a processing unit, configured to acquire, before the travel influence feature set of the area to be predicted is obtained, running speed data of the roads associated with the area to be predicted, to perform cluster statistics on the running speed data so as to obtain the distribution rule of the running speed of those roads over time, and to determine the peak running time of the area to be predicted according to the running speed distribution rule.
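One way to picture the cluster statistics on running speed is a two-cluster split of the hourly mean speeds, with the low-speed cluster read as the peak period. The source does not name a clustering algorithm, so the use of one-dimensional k-means with k = 2, the hourly granularity, and the rule "low speed means peak" are all assumptions of this sketch.

```python
def two_means_1d(values, iterations=50):
    """Split 1-D values into a low and a high cluster with Lloyd's algorithm (k = 2)."""
    low, high = min(values), max(values)          # initial centroids
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        if not high_group:                        # all values identical
            break
        new_low = sum(low_group) / len(low_group)
        new_high = sum(high_group) / len(high_group)
        if (new_low, new_high) == (low, high):    # centroids stopped moving
            break
        low, high = new_low, new_high
    return low, high


def peak_hours(speed_by_hour):
    """Return the hours whose mean speed falls in the low-speed cluster.

    speed_by_hour: dict mapping an hour (0-23) to the mean running speed in km/h.
    """
    low, high = two_means_1d(list(speed_by_hour.values()))
    if low == high:
        return []                                 # no speed variation, no peak
    return sorted(h for h, v in speed_by_hour.items()
                  if abs(v - low) <= abs(v - high))
```

For hourly speeds such as `{7: 45, 8: 20, 9: 22, 12: 48, 17: 25, 18: 18, 22: 55}`, the low-speed cluster contains hours 8, 9, 17 and 18, which would be taken as the peak period under these assumptions.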
Optionally, the travel influence characteristic comprises at least one of: land planning, date characteristics, traffic planning, weather changes, special events, road operating conditions.
Optionally, the travel influence feature combination includes at least one of: land planning and traffic planning; date characteristics and special events; date characteristics and weather changes; weather changes and road operating conditions.
Optionally, the training process of the GRU model and the random forest model includes:
acquiring historical trip data of the peak trip time of the area to be predicted, and generating a training data set according to the historical trip data; predicting by adopting the GRU model based on the training data set to obtain a preliminary prediction result, inputting the preliminary prediction result into the random forest model, and outputting a final prediction result by the random forest model; and calculating the error between the final prediction result and the actual travel volume, and adjusting the model parameters of the GRU model and the random forest model based on the error until the loss function converges.
Optionally, the loss function comprises:
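$$L(w, b) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - f(w x_i + b) \right)^2$$
where $n$ is the number of samples, $y_i$ is the actual trip value of the sample data, and $f(w x_i + b)$ is the value predicted by the prediction function $f$ from the sample data $x_i$. The weight $w$ and the bias $b$ in $f(w x_i + b)$ are adjusted continuously, with a learning rate $\alpha$ introduced. With $w_i$ denoting the initial value of the weight and $w_{i+1}$ the updated weight, the update is:
$$w_{i+1} = w_i - \alpha \frac{\partial L}{\partial w_i}$$
and the optimal w value and the optimal b value are obtained.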
Optionally, the historical travel data includes: taxi GPS data, bus GPS data, mobile phone signaling data, bus IC card swiping data and/or subway IC card swiping data.
The advantages of the travel demand prediction device over the prior art are the same as those of the urban hot spot area peak travel demand prediction method based on multi-source data fusion, and are not repeated here.
The present invention further provides a travel demand forecasting terminal, as shown in fig. 7, the travel demand forecasting terminal includes a computer readable storage medium 201 storing a computer program and a processor 202, and when the computer program is read and executed by the processor 202, the travel demand forecasting method for peak travel in urban hot spot areas based on multi-source data fusion is implemented.
The advantages of the travel demand prediction terminal over the prior art are the same as those of the urban hot spot area peak travel demand prediction method based on multi-source data fusion, and are not repeated here.
The invention further provides a computer-readable storage medium, which stores a computer program, and when the computer program is read and executed by a processor, the method for predicting the peak travel demand in the urban hot spot area based on multi-source data fusion is realized.
Compared with the prior art, the beneficial effects of the computer-readable storage medium of the invention are consistent with the above prediction method of peak travel demand in urban hot spot areas based on multi-source data fusion, and are not repeated here.
Although the present disclosure has been described above, the scope of the present disclosure is not limited thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present disclosure, and these changes and modifications are intended to be within the scope of the present disclosure.
Claims (10)
1. A multi-source data fusion-based urban hot spot area peak travel demand prediction method is characterized by comprising the following steps:
acquiring a travel influence feature set of the area to be predicted, wherein the travel influence feature set comprises travel influence features and/or a travel influence feature combination;
respectively acquiring peak real-time data corresponding to each trip influence characteristic and/or each trip influence characteristic combination in the trip influence characteristic set;
respectively inputting peak real-time data corresponding to each trip influence characteristic and/or each trip influence characteristic combination into a corresponding trained GRU model to obtain a first trip prediction result corresponding to each trip influence characteristic and/or each trip influence characteristic combination;
and inputting the first trip prediction results corresponding to all the trip influence characteristics and/or all the trip influence characteristic combinations into a trained random forest model, and obtaining a peak trip demand prediction result of the area to be predicted.
2. The multi-source data fusion-based urban hot spot area peak travel demand prediction method according to claim 1, wherein before obtaining the travel influence feature set of the area to be predicted, the method further comprises:
acquiring running speed data of the to-be-predicted area associated road, and performing cluster statistics on the running speed data to obtain a running speed distribution rule of the to-be-predicted area associated road along with time variation;
and determining the peak running time of the area to be predicted according to the running speed distribution rule.
3. The multi-source data fusion-based urban hot spot regional peak travel demand prediction method according to claim 1, wherein the travel impact characteristics include at least one of:
land planning, date characteristics, traffic planning, weather changes, special events, road operating conditions.
4. The multi-source data fusion-based urban hot spot regional peak travel demand prediction method according to claim 1, wherein the travel impact feature combination comprises at least one of the following:
land planning and traffic planning; date characteristics and special events; date characteristics and weather changes; weather changes and road operating conditions.
5. The multi-source data fusion-based urban hot spot area peak travel demand prediction method according to claim 1, wherein the training process of the GRU model and the random forest model comprises:
acquiring historical trip data of the peak trip time of the area to be predicted, and generating a training data set according to the historical trip data;
predicting by adopting the GRU model based on the training data set to obtain a preliminary prediction result, inputting the preliminary prediction result into the random forest model, and outputting a final prediction result by the random forest model;
and calculating the error between the final prediction result and the actual travel volume, and adjusting the model parameters of the GRU model and the random forest model based on the error until the loss function converges.
6. The multi-source data fusion-based urban hot spot area peak travel demand prediction method according to claim 5, wherein the loss function comprises:
$$L(w, b) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - f(w x_i + b) \right)^2$$
wherein $n$ is the number of samples, $y_i$ is the actual trip value of the sample data, and $f(w x_i + b)$ is the value predicted by the prediction function $f$ from the sample data $x_i$; the weight $w$ and the bias $b$ in $f(w x_i + b)$ are adjusted continuously, with a learning rate $\alpha$ introduced; $w_i$ represents the initial value of the weight, and $w_{i+1}$ represents the weight updated by:
$$w_{i+1} = w_i - \alpha \frac{\partial L}{\partial w_i}$$
and obtaining the optimal w value and the optimal b value.
7. The multi-source data fusion-based urban hot spot regional peak travel demand prediction method according to claim 5, wherein the historical travel data comprises: taxi GPS data, bus GPS data, mobile phone signaling data, bus IC card swiping data and/or subway IC card swiping data.
8. A travel demand prediction apparatus, comprising:
an obtaining unit, configured to obtain a travel influence feature set of the area to be predicted, where the travel influence feature set includes a travel influence feature and/or a travel influence feature combination; respectively acquiring peak real-time data corresponding to each trip influence characteristic and/or each trip influence characteristic combination in the trip influence characteristic set;
the first prediction unit is used for respectively inputting peak real-time data corresponding to each trip influence characteristic and/or each trip influence characteristic combination into the corresponding trained GRU model to obtain a first trip prediction result corresponding to each trip influence characteristic and/or each trip influence characteristic combination;
and the second prediction unit is used for inputting the first trip prediction results corresponding to all the trip influence characteristics and/or all the trip influence characteristic combinations into the trained random forest model to obtain the peak trip demand prediction result of the area to be predicted.
9. A travel demand forecasting terminal, comprising a computer readable storage medium storing a computer program and a processor, wherein the computer program is read by the processor and executed to implement the multi-source data fusion-based urban hot spot area peak travel demand forecasting method according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, which when read and executed by a processor, implements the method for forecasting peak travel demand in urban hot spot area based on multi-source data fusion according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110443653.7A CN113222645A (en) | 2021-04-23 | 2021-04-23 | Urban hot spot area peak trip demand prediction method based on multi-source data fusion |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110443653.7A CN113222645A (en) | 2021-04-23 | 2021-04-23 | Urban hot spot area peak trip demand prediction method based on multi-source data fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113222645A (en) | 2021-08-06 |
Family
ID=77088555
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110443653.7A Pending CN113222645A (en) | 2021-04-23 | 2021-04-23 | Urban hot spot area peak trip demand prediction method based on multi-source data fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113222645A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140114556A1 (en) * | 2012-10-23 | 2014-04-24 | University Of Southern California | Traffic prediction using real-world transportation data |
CN108877223A (en) * | 2018-07-13 | 2018-11-23 | 南京理工大学 | Short-term traffic flow prediction method based on temporal correlation |
CN109063911A (en) * | 2018-08-03 | 2018-12-21 | 天津相和电气科技有限公司 | Load aggregator regrouping prediction method based on gated recurrent unit networks |
CN109886387A (en) * | 2019-01-07 | 2019-06-14 | 北京大学 | Traffic time-series prediction method based on gated networks and gradient boosting regression |
US20190265950A1 (en) * | 2018-02-27 | 2019-08-29 | New York University | System, method, and apparatus for recurrent neural networks |
US20190273510A1 (en) * | 2018-03-01 | 2019-09-05 | Crowdstrike, Inc. | Classification of source data by neural network processing |
CN110458337A (en) * | 2019-07-23 | 2019-11-15 | 内蒙古工业大学 | C-GRU-based supply and demand prediction method for online ride-hailing vehicles |
CN112001740A (en) * | 2020-06-19 | 2020-11-27 | 南京理工大学 | Combined prediction method based on adaptive neural network |
CN112489420A (en) * | 2020-11-17 | 2021-03-12 | 中国科学院深圳先进技术研究院 | Road traffic state prediction method, system, terminal and storage medium |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140114556A1 (en) * | 2012-10-23 | 2014-04-24 | University Of Southern California | Traffic prediction using real-world transportation data |
US20190265950A1 (en) * | 2018-02-27 | 2019-08-29 | New York University | System, method, and apparatus for recurrent neural networks |
US20190273510A1 (en) * | 2018-03-01 | 2019-09-05 | Crowdstrike, Inc. | Classification of source data by neural network processing |
CN108877223A (en) * | 2018-07-13 | 2018-11-23 | 南京理工大学 | Short-term traffic flow prediction method based on temporal correlation |
CN109063911A (en) * | 2018-08-03 | 2018-12-21 | 天津相和电气科技有限公司 | Load aggregator regrouping prediction method based on gated recurrent unit networks |
CN109886387A (en) * | 2019-01-07 | 2019-06-14 | 北京大学 | Traffic time-series prediction method based on gated networks and gradient boosting regression |
CN110458337A (en) * | 2019-07-23 | 2019-11-15 | 内蒙古工业大学 | C-GRU-based supply and demand prediction method for online ride-hailing vehicles |
CN112001740A (en) * | 2020-06-19 | 2020-11-27 | 南京理工大学 | Combined prediction method based on adaptive neural network |
CN112489420A (en) * | 2020-11-17 | 2021-03-12 | 中国科学院深圳先进技术研究院 | Road traffic state prediction method, system, terminal and storage medium |
Non-Patent Citations (4)
Title |
---|
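于德新; 邱实; 周户星; 王卓睿: "Research on short-term traffic flow prediction at intersections based on the GRU-RNN model", 公路工程, no. 04 * |
张振; 曾献辉: "Expressway traffic volume prediction based on the CNN-LightGBM model", 信息技术与网络安全, no. 02 * |
张翔宇; 张强; 吕明琪: "Traffic congestion index prediction based on evolutionary pattern mining and cost-sensitive learning", 高技术通讯, no. 09 * |
陈明露; 江伟炜: "Research on demand forecasting for automotive spare parts based on multiple value chains", 现代计算机, no. 24 * |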
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104484993B (en) | Processing method of cell phone signaling information for dividing traffic zones | |
CN102799897B (en) | Computer recognition method of GPS (Global Positioning System) positioning-based transportation mode combined travelling | |
CN109410577B (en) | Self-adaptive traffic control subarea division method based on space data mining | |
CN110555544B (en) | Traffic demand estimation method based on GPS navigation data | |
CN112819340B (en) | Urban flood disaster dynamic evaluation method based on multi-source data | |
CN105206048A (en) | Urban resident traffic transfer mode discovery system and method based on urban traffic OD data | |
CN110716935A (en) | Track data analysis and visualization method and system based on online taxi appointment travel | |
CN113570867A (en) | Urban traffic state prediction method, device, equipment and readable storage medium | |
CN113505521B (en) | Urban waterlogging rapid forecasting method based on neural network-numerical simulation | |
CN113806419B (en) | Urban area function recognition model and recognition method based on space-time big data | |
CN112863182A (en) | Cross-modal data prediction method based on transfer learning | |
CN115565369A (en) | Hypergraph-based time-space hypergraph convolution traffic flow prediction method and system | |
CN115293570A (en) | GIS-based territorial space planning system and method | |
Fafoutellis et al. | Dilated LSTM networks for short-term traffic forecasting using network-wide vehicle trajectory data | |
CN115412857A (en) | Resident travel information prediction method | |
CN108053646B (en) | Traffic characteristic obtaining method, traffic characteristic prediction method and traffic characteristic prediction system based on time sensitive characteristics | |
CN109800903A (en) | A kind of profit route planning method based on taxi track data | |
Kurte et al. | Regional-scale spatio-temporal analysis of impacts of weather on traffic speed in Chicago using probe data | |
CN113222645A (en) | Urban hot spot area peak trip demand prediction method based on multi-source data fusion | |
CN115456238A (en) | Urban trip demand prediction method based on dynamic multi-view coupling graph convolution | |
Kim et al. | Examining the socio-spatial patterns of bus shelters with deep learning analysis of street-view images: A case study of 20 cities in the US | |
CN106529778A (en) | Bus ride comfort index construction method based on smart phone | |
Ning | Prediction and detection of urban trajectory using data mining and deep neural network | |
Wang et al. | Route selection for opportunity-sensing and prediction of waterlogging | |
Jiang et al. | Spatio-temporal prediction of crime based on Data Mining and LSTM network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||