CN116796904A - Method, system, electronic equipment and medium for predicting new line passenger flow of rail transit - Google Patents

Publication number
CN116796904A
CN116796904A CN202310793330.XA CN202310793330A CN116796904A CN 116796904 A CN116796904 A CN 116796904A CN 202310793330 A CN202310793330 A CN 202310793330A CN 116796904 A CN116796904 A CN 116796904A
Authority
CN
China
Prior art keywords
passenger flow
station
new line
type
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310793330.XA
Other languages
Chinese (zh)
Inventor
许心越
张鹏羽
李海鹰
孔庆雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Original Assignee
Beijing Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University
Priority to CN202310793330.XA
Publication of CN116796904A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method, a system, electronic equipment and a medium for predicting new line passenger flow of rail transit, relating to the field of rail transit passenger flow prediction. The method comprises the following steps: first, acquiring the passenger flow related built environment characteristics of a new line station whose passenger flow is to be predicted, and determining the station type of the new line station according to its built environment characteristics; preliminarily predicting the passenger flow of the new line station using the passenger flow prediction sub-model corresponding to the station type, where the sub-model is obtained by training a machine learning model on a training data set comprising the built environment characteristics of existing stations and the corresponding passenger flow data for different time periods; determining, using a geographically weighted regression model, the prediction residual of the sub-model's preliminary prediction for the new line station; and finally, summing the preliminary prediction and the prediction residual to obtain the final passenger flow of the new line station. The invention improves the accuracy of station entry and exit passenger flow prediction.

Description

Method, system, electronic equipment and medium for predicting new line passenger flow of rail transit
Technical Field
The invention relates to the field of rail transit passenger flow prediction, in particular to a rail transit new line passenger flow prediction method, a system, electronic equipment and a medium.
Background
To meet the operational needs of new line stations and to provide a scientific basis for urban rail transit planning and management, an accurate and fast-responding passenger flow prediction method is urgently needed in the early period after a new line opens. Current methods for predicting new line passenger flow of rail transit mainly have the following shortcomings. In selecting the factors that influence rail transit passenger flow, the built environment characteristics considered are not comprehensive enough, and the granularity and coverage of the data need to be improved. Among prediction models, the four-stage method involves many parameters at each stage, responds slowly and has low prediction accuracy, while other direct demand prediction methods do not jointly account for the spatio-temporal heterogeneity and the nonlinear influence of the built environment characteristics, and therefore also suffer from low accuracy and poor adaptability.
Disclosure of Invention
The invention aims to provide a method, a system, electronic equipment and a medium for predicting new line passenger flow of rail transit, so as to improve the accuracy of predicting the passenger flow of the rail transit.
In order to achieve the above object, the present invention provides the following solutions:
a new line passenger flow prediction method for rail transit comprises the following steps:
acquiring the passenger flow related built environment characteristics of a new line station whose passenger flow is to be predicted; the passenger flow related built environment characteristics are determined by feature screening of the built environment characteristics; the built environment characteristics comprise land use characteristics, regional socioeconomic characteristics, station attribute characteristics, external traffic characteristics and spatial location characteristics; the land use characteristics comprise the residential area, office area, government POIs, hospital POIs, entertainment POIs, tourism POIs, education POIs and land use mix within the station attraction range; the regional socioeconomic characteristics comprise the employment density, residential density and housing price within the station attraction range; the station attribute characteristics comprise the degree, the betweenness, the distance from the station to the city center, the number of entrances and exits, and the number of rail transit stations within the attraction range; the external traffic characteristics comprise the road density and the number of bus lines within the station attraction range; the spatial location characteristics comprise the spatial orientation of the station and its suburban attribute;
determining the station type of the new line station according to the built environment characteristics of the new line station of the passenger flow to be predicted; the station type is suburban residence type, mixed employment type, job-oriented residence type, employment-oriented residence type, tourist attraction type, transportation junction type, comprehensive type or commercial entertainment type;
preliminarily predicting the passenger flow of the new line station by using the passenger flow prediction sub-model corresponding to the station type to obtain the preliminary predicted passenger flow; the passenger flow prediction sub-model is obtained by training a machine learning model by using a training data set; the training data set comprises the built environment characteristics of existing stations and the corresponding passenger flow data for different time periods;
determining a prediction residual error of the passenger flow prediction sub-model by using a geographic weighted regression model;
and determining the final passenger flow of the new line station according to the preliminary predicted passenger flow and the predicted residual error.
Optionally, training the machine learning model with the training data set specifically includes:
acquiring the training data set;
grouping the training data sets according to the space location and the station type to obtain a plurality of training data subsets; one of the training data subsets corresponds to at least one station type;
feature screening is carried out on each training data subset to obtain a plurality of screened training data subsets;
respectively training the machine learning model by utilizing each screened training data subset to obtain a plurality of passenger flow predictor models; one passenger flow predictor model corresponds to at least one station type.
Optionally, determining the station type of the new line station according to the built environmental characteristics of the new line station of the passenger flow to be predicted, specifically including:
determining the Euclidean distance between the built environmental characteristics of the new line station and the built environmental characteristics of the existing station;
determining the station type of the new line station according to the Euclidean distance; the station type of the new line station is the station type of the existing station corresponding to the minimum Euclidean distance.
Optionally, determining the prediction residual error of the passenger flow prediction sub-model by using a geographic weighted regression model specifically includes:
determining the prediction residual of the passenger flow prediction sub-model using the geographically weighted regression model $E_i = \beta_0(u_i, v_i) + \sum_{k=1}^{K} \beta_k(u_i, v_i) X_{ik} + \varepsilon_i$; wherein $(u_i, v_i)$ are the spatial coordinates of station $i$; $\beta_0(u_i, v_i)$ is the intercept; $\beta_k(u_i, v_i)$ is the regression coefficient between the dependent variable and the $k$-th explanatory variable; $K$ is the number of built environment characteristics; $E_i$ is the prediction residual of the passenger flow prediction sub-model at station $i$; $X_{ik}$ is the value of the $k$-th built environment characteristic of station $i$; $\varepsilon_i$ is the error term of the prediction residual at station $i$.
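As an illustration of how a geographically weighted regression of this form can be evaluated at one location, the following minimal Python sketch fits a locally weighted regression of the sub-model residual on a single built environment characteristic (K = 1). The Gaussian kernel, the `bandwidth` parameter, the function name `gwr_predict` and the sample data are assumptions made for this sketch; the source does not specify a kernel or bandwidth.

```python
import math

def gwr_predict(train, target_xy, target_x, bandwidth):
    """Estimate the residual at one location with a locally weighted regression.

    train: list of (u, v, x, e) tuples for existing stations, where (u, v) are
    coordinates, x is one built environment characteristic and e is the
    sub-model residual. All names are illustrative.
    """
    u0, v0 = target_xy
    # Gaussian distance-decay weights (assumed kernel): nearer stations count more.
    w = [math.exp(-0.5 * (math.hypot(u - u0, v - v0) / bandwidth) ** 2)
         for u, v, _, _ in train]
    sw = sum(w)
    x_bar = sum(wi * x for wi, (_, _, x, _) in zip(w, train)) / sw
    e_bar = sum(wi * e for wi, (_, _, _, e) in zip(w, train)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, (_, _, x, _) in zip(w, train))
    sxe = sum(wi * (x - x_bar) * (e - e_bar) for wi, (_, _, x, e) in zip(w, train))
    beta1 = sxe / sxx if sxx > 0 else 0.0  # local slope beta_1(u, v)
    beta0 = e_bar - beta1 * x_bar          # local intercept beta_0(u, v)
    return beta0 + beta1 * target_x

# Hypothetical existing stations whose residuals follow e = 3 + 2x exactly.
stations = [(0.0, 0.0, 1.0, 5.0), (1.0, 0.0, 2.0, 7.0),
            (0.0, 1.0, 3.0, 9.0), (1.0, 1.0, 4.0, 11.0)]
estimate = gwr_predict(stations, target_xy=(0.5, 0.5), target_x=5.0, bandwidth=1.0)
```

Because the coefficients are refitted at every target location, stations close to the new line station dominate the estimate, which is how the spatial heterogeneity enters the correction.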
Optionally, determining the final passenger flow of the new line station according to the preliminary predicted passenger flow and the prediction residual error specifically includes:
determining the final passenger flow of the new line station by using the formula $V(s, t) = M(s, t) + E(s, t)$; $M(s, t)$ is the preliminary predicted passenger flow of station $s$ at time $t$; $E(s, t)$ is the prediction residual of station $s$ at time $t$; $V(s, t)$ is the final passenger flow of the new line station.
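A small sketch of this summation, assuming the preliminary predictions and residuals are held in dictionaries keyed by (station, time period). The station identifier, time periods and numbers are hypothetical.

```python
def final_flow(preliminary, residual):
    """Return V(s, t) = M(s, t) + E(s, t) for every (station, period) key."""
    return {key: preliminary[key] + residual[key] for key in preliminary}

# Hypothetical values for one new line station in two time periods.
preliminary = {("S1", "07:00-08:00"): 1250.0, ("S1", "08:00-09:00"): 1900.0}
residual = {("S1", "07:00-08:00"): -75.0, ("S1", "08:00-09:00"): 120.0}
final = final_flow(preliminary, residual)
```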
A rail transit new line passenger flow prediction system, comprising:
the data acquisition module is used for acquiring the passenger flow related built environment characteristics of a new line station whose passenger flow is to be predicted; the passenger flow related built environment characteristics are determined by feature screening of the built environment characteristics; the built environment characteristics comprise land use characteristics, regional socioeconomic characteristics, station attribute characteristics, external traffic characteristics and spatial location characteristics; the land use characteristics comprise the residential area, office area, government POIs, hospital POIs, entertainment POIs, tourism POIs, education POIs and land use mix within the station attraction range; the regional socioeconomic characteristics comprise the employment density, residential density and housing price within the station attraction range; the station attribute characteristics comprise the degree, the betweenness, the distance from the station to the city center, the number of entrances and exits, and the number of rail transit stations within the attraction range; the external traffic characteristics comprise the road density and the number of bus lines within the station attraction range; the spatial location characteristics comprise the spatial orientation of the station and its suburban attribute;
the station type determining module is used for determining the station type of the new line station according to the built environment characteristics of the new line station of the passenger flow to be predicted; the station type is suburban residence type, mixed employment type, job-oriented residence type, employment-oriented residence type, tourist attraction type, transportation junction type, comprehensive type or commercial entertainment type;
the preliminary prediction module is used for preliminarily predicting the passenger flow of the new line station by utilizing the passenger flow prediction sub-model corresponding to the station type to obtain preliminary predicted passenger flow; the passenger flow prediction sub-model is obtained by training a machine learning model by using a training data set; the training data set comprises the built environment characteristics of the existing station and corresponding passenger flow data of different time periods;
the prediction residual determination module is used for determining the prediction residual of the passenger flow predictor model by using a geographic weighted regression model;
and the passenger flow determining module is used for determining the final passenger flow of the new line station according to the preliminary predicted passenger flow and the predicted residual error.
An electronic device, comprising: the system comprises a memory and a processor, wherein the memory is used for storing a computer program, and the processor runs the computer program to enable the electronic equipment to execute the rail transit new line passenger flow prediction method.
A computer readable storage medium storing a computer program which when executed by a processor implements the rail transit new line passenger flow prediction method described above.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
The invention provides a method, a system, electronic equipment and a medium for predicting new line passenger flow of rail transit. First, the passenger flow related built environment characteristics of the station are acquired, and the passenger flow of the new line station is preliminarily predicted using a passenger flow prediction sub-model, where the sub-model is obtained by training a machine learning model on a training data set comprising the built environment characteristics of existing stations and the corresponding passenger flow data for different time periods. Next, the prediction residual of the sub-model's preliminary prediction for the new line station is determined using the geographically weighted regression model. Finally, the preliminary prediction and the prediction residual are summed to obtain the final passenger flow of the new line station. Based on multi-source rail transit data, the method explores the relationship between passenger flow and built environment characteristics and, by fusing a linear model with the nonlinear model, captures the spatio-temporal heterogeneity in the nonlinear influence of the built environment on passenger flow. This improves the prediction accuracy of station entry and exit passenger flow and effectively supports the operation and management of new stations.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the method for predicting new line passenger flow of rail transit provided by the invention;
FIG. 2 is a flow chart of the training of the passenger flow prediction sub-model of the invention;
FIG. 3 shows the residual distributions before and after correction by the GWR model;
FIG. 4 is a flow chart of a new line passenger flow prediction method of rail transit;
FIG. 5 shows the feature screening and importance ranking for predicting the weekday and weekend inbound passenger volume of one station group.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide a method, a system, electronic equipment and a medium for predicting new line passenger flow of rail transit, so as to improve the accuracy of predicting the passenger flow of the rail transit.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
Example 1
As shown in fig. 1, the method for predicting the new line passenger flow of the rail transit provided by the invention comprises the following steps:
Step 101: acquiring the passenger flow related built environment characteristics of a new line station whose passenger flow is to be predicted; the passenger flow related built environment characteristics are determined by feature screening of the built environment characteristics; the built environment characteristics comprise land use characteristics, regional socioeconomic characteristics, station attribute characteristics, external traffic characteristics and spatial location characteristics; the land use characteristics comprise the residential area, office area, government POIs, hospital POIs, entertainment POIs, tourism POIs, education POIs and land use mix within the station attraction range; the regional socioeconomic characteristics comprise the employment density, residential density and housing price within the station attraction range; the station attribute characteristics comprise the degree, the betweenness, the distance from the station to the city center, the number of entrances and exits, and the number of rail transit stations within the attraction range; the external traffic characteristics comprise the road density and the number of bus lines within the station attraction range; the spatial location characteristics comprise the spatial orientation of the station and its suburban attribute.
Step 102: determining the station type of the new line station according to the built environment characteristics of the new line station of the passenger flow to be predicted; the station type is suburban residence type, mixed employment type, job-oriented residence type, employment-oriented residence type, tourist attraction type, transportation junction type, comprehensive type or business entertainment type.
Step 103: preliminarily predicting the passenger flow of the new line station by using the passenger flow prediction sub-model corresponding to the station type to obtain the preliminary predicted passenger flow; the passenger flow prediction sub-model is obtained by training a machine learning model by using a training data set; the training data set comprises the built environment characteristics of existing stations and the corresponding passenger flow data for different time periods.
Step 104: and determining the prediction residual error of the passenger flow prediction sub-model by using a geographic weighted regression model.
Step 105: and determining the final passenger flow of the new line station according to the preliminary predicted passenger flow and the predicted residual error.
In practical application, as shown in fig. 2, training the machine learning model by using the training data set specifically includes:
s1: the training data set is acquired. In practical application, firstly, the built-up environment characteristics of the existing station and the passenger flow data of the existing station are obtained, and a built-up environment characteristic system of the station is built.
Specifically, the five categories of built environment characteristics are extracted with Python through the Amap (Gaode Map) and Lianjia website APIs, and GIS spatial processing is used to screen, fuse and aggregate them. POI (point of interest) counts are used to characterize land use, including the numbers of government agencies, office buildings, residential communities, scenic spots, leisure and entertainment venues and the like; invalid entries such as "gate" POIs and duplicate entries are removed so that only POIs meeting the requirements remain. Residential POIs and office building POIs, which relate to residence and employment, are joined with building area data by longitude and latitude to form vector features carrying area, type and name, from which the building area data are obtained. The collected residential community price information is matched to the residential POI information by location and name, with the matching threshold set to 90%. A topological network of the rail transit stations is built in Python, and the betweenness and degree of each station are calculated with the built-in functions of NetworkX. The distance from each station to the city center is calculated with the Point Distance tool under the proximity analysis options in GIS: the coordinates of the center point of Beijing are converted into a shp file and used as the input feature, and the shp file converted from the station coordinates is used as the near feature. The number of rail transit stations within the attraction range is counted through a spatial join in GIS. All crawled bus stop POIs are imported into GIS, and the number of bus lines within the range is then counted. The length attributes of the city's first-, second-, third- and fourth-level roads are summed in GIS to obtain the total road length within the range, and the road density is the ratio of the total road length to the area of the range.
Meanwhile, the AFC (automatic fare collection) data are cleaned and aggregated to obtain the passenger flow data. To satisfy the model assumptions and make the data easier to process, the normalized passenger flow data are preprocessed with a square root or logarithmic transformation, which corrects left-skewed or right-skewed distributions.
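The two topological indices can be illustrated with a dependency-free Python sketch. It computes the degree and the unnormalized betweenness of each station on an undirected, unweighted station graph using Brandes' algorithm. This is a stand-in for the library call, not the source's code, and the four-station line `A-B-C-D` is hypothetical. In NetworkX, `degree` and `betweenness_centrality` would be the likely counterparts, though the source does not name the functions used.

```python
from collections import deque

def degree_and_betweenness(adj):
    """adj: dict mapping each station to the set of adjacent stations."""
    degree = {s: len(nbrs) for s, nbrs in adj.items()}
    bc = {s: 0.0 for s in adj}
    for src in adj:
        # Breadth-first search that counts shortest paths from src.
        order = []
        preds = {s: [] for s in adj}
        sigma = {s: 0 for s in adj}
        sigma[src] = 1
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies, starting from the farthest stations.
        delta = {s: 0.0 for s in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != src:
                bc[w] += delta[w]
    # Every undirected station pair was visited from both endpoints.
    return degree, {s: b / 2 for s, b in bc.items()}

# Hypothetical line with four stations A-B-C-D.
network = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
degree, betweenness = degree_and_betweenness(network)
```

Adding a new line changes the adjacency sets, so both indices have to be recomputed for the whole network rather than for the new stations alone.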
In this embodiment, the station built environment characteristic system describes the human-made environment that supports human activity, including the wider urban environment around the station, and mainly comprises land use characteristics, regional socioeconomic characteristics, station attribute characteristics, external traffic characteristics and spatial location characteristics.
Further, in this embodiment, the station attraction range is the ring-shaped catchment area, delineated by isochrones on the street network, that is covered jointly by walking access and bus feeder access. The value of a given characteristic variable around station $m$ is expressed as:
$C_m = C_{m1} + \theta \cdot C_{m2}$
where $C_{m1}$ is the quantity of the characteristic within the walking access range, $C_{m2}$ is the quantity of the characteristic in the area between the walking access range and the bus feeder access range, and $\theta$ depends on the scale, building layout, development level and the like of each urban rail transit service; $\theta$ is 0.35 in this embodiment.
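A worked example of this weighting in Python, with θ = 0.35 as in this embodiment. The counts 120 and 200 are hypothetical.

```python
def attraction_value(walk_count, bus_ring_count, theta=0.35):
    """C_m = C_m1 + theta * C_m2 for one characteristic around station m."""
    return walk_count + theta * bus_ring_count

# E.g. 120 office POIs within walking range and 200 in the surrounding bus feeder ring.
value = attraction_value(walk_count=120, bus_ring_count=200)  # 120 + 0.35 * 200
```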
Further, in this embodiment, the employment and residential densities are obtained from the real-time population activity shown on the Baidu heat map, corrected with LandScan global population data, and are distinguished between weekdays and weekends.
Fourteen heat maps covering the station attraction range are captured at 10:00 and 23:00 on each day of one week. After coordinate registration, they are merged and averaged separately for weekdays and weekends using the Mosaic To New Raster command in the raster tools.
Using the raster calculator, an expression over the first, second and third bands of the loaded TIF image is evaluated, and the result is reclassified into 7 classes corresponding to the value intervals of the seven colors red, orange, yellow, green, cyan, blue and purple. The Value field of the resulting raster is then replaced with the corresponding aggregation density, namely >60, 40-60, 20-40, 10-20 and <10 persons/hm², with the default value for the remaining two classes.
The raster is converted into a vector layer with the Raster to Polygon tool, and a fishnet vector with a cell size of 100 m × 100 m is created over the vector layer.
The fishnet vector file is spatially joined with the converted polygon layer on the density attribute so that each fishnet cell is assigned a density value, and the area of each cell is generated with Calculate Geometry. The population activity number of each cell is:
$A_i = D_c \cdot S_i$
where $A_i$ is the population activity number, $D_c$ is the aggregation density and $S_i$ is the cell area. The values of all cells within the influence range of the station are summed to obtain the population activity number of the station.
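The aggregation over fishnet cells can be sketched as follows. The density assigned to each class (for example 50 persons/hm² for the 40-60 class) is an assumption of this sketch, since the source does not state which representative value is used, and the cell list is hypothetical. A full 100 m × 100 m cell has an area of 1 hm².

```python
def population_activity(cells):
    """Sum density * area over the fishnet cells inside a station's influence range.

    cells: iterable of (density in persons/hm^2, area in hm^2) pairs.
    """
    return sum(density * area for density, area in cells)

# Two full cells and one cell clipped to half its area by the range boundary.
cells = [(50.0, 1.0), (30.0, 1.0), (15.0, 0.5)]
total = population_activity(cells)
```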
The LandScan population TIFF file is converted into vector format, and the population within the influence range of each station is counted through a spatial join.
LandScan is a global population data set produced by Oak Ridge National Laboratory of the U.S. Department of Energy. It uses spatial data, image analysis techniques and a multivariable dasymetric modeling algorithm to disaggregate the census data of each country or region within its administrative boundaries, and its TIFF files can be loaded into GIS software.
The population activity number is then corrected with the processed population ratio to obtain the residential and employment densities on weekdays and weekends.
Examples of residential and employment densities extracted from the heat maps are shown in Table 1.
Table 1 Examples of residential and employment densities
Station | Residential density (persons/hm²) | Employment density (persons/hm²)
Chongwenmen | 37.75 | 10.18
Xidan | 66.36 | 5.48
Huaxiang Dongqiao | 3.54 | 0.54
In this embodiment, the spatial orientation of a station is assigned as follows: taking the city center point as the center of the rail network, the azimuth from the center point to each station is calculated from their longitude and latitude coordinates, and each station is given one of eight location labels: east, south, west, north, northeast, southeast, northwest or southwest.
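A minimal Python sketch of this labeling rule, assuming each label covers a 45° sector centered on its compass direction. The sector boundaries and the coordinates below are assumptions made for illustration, since the source does not state them.

```python
import math

LABELS = ["north", "northeast", "east", "southeast",
          "south", "southwest", "west", "northwest"]

def location_label(center, station):
    """center, station: (latitude, longitude) in degrees."""
    lat1, lon1 = map(math.radians, center)
    lat2, lon2 = map(math.radians, station)
    dlon = lon2 - lon1
    # Initial great-circle bearing from the center point to the station.
    x = math.sin(dlon) * math.cos(lat2)
    y = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    bearing = math.degrees(math.atan2(x, y)) % 360.0  # 0 = north, clockwise
    # Shift by half a sector so that "north" spans 337.5 to 22.5 degrees.
    return LABELS[int(((bearing + 22.5) % 360.0) // 45.0)]

center = (39.9, 116.4)  # hypothetical city center point
labels = [location_label(center, s)
          for s in [(40.0, 116.4), (39.9, 116.6), (39.8, 116.4), (40.0, 116.5)]]
```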
In addition, in this embodiment, the specific format of the station built environment characteristic system is shown in Table 2:
Table 2 Format of the station built environment characteristic system
In this embodiment, the station built environment characteristic system is updated as follows. The data are collected in the year before the new line opens, and the prediction year is the first year after opening. Because the interval is short, land use around the stations is regarded as essentially unchanged. However, the new line stations change the topology of the whole rail transit network, so the corresponding topological indices, degree and betweenness, are updated. In addition, the recent residential and employment densities can be refreshed periodically from the Baidu heat map as required.
S2: grouping the training data sets according to the space location and the station type to obtain a plurality of training data subsets; one of the training data subsets corresponds to at least one station type. In practical application, existing stations are classified based on passenger flow sequences and built environments, various built environment features are obtained, and new stations are matched to corresponding groups according to the built environments.
The station type is determined jointly by the passenger flow sequence and the built environment, and each station has both a weekday type and a weekend type. The method is as follows:
Knowledge-driven rules are used to pick out the stations associated with tourist attractions and transportation hubs, which define two types directly.
The standardized weekday passenger flow data of the remaining stations are divided into types with the K-means clustering method. The number of clusters K is first estimated from the sum of squared errors (SSE); the silhouette coefficient and the CH index (Calinski-Harabasz index) are also calculated, and the three indices together give a cluster number of 6. The weekend passenger flow data are processed in the same way, and the cluster number is likewise determined to be 6.
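The cluster-number selection can be sketched in plain Python as below. The sketch runs a basic K-means for several values of K and reports the SSE and the CH index; the silhouette coefficient is omitted for brevity. Initializing the centers with the first K points, the function names and the six sample passenger flow vectors are all simplifications or assumptions of this sketch.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, iters=100):
    centers = [list(p) for p in points[:k]]  # simplistic deterministic start
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous center.
            updated.append(centroid(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return labels, centers

def sse(points, labels, centers):
    """Within-cluster sum of squared errors."""
    return sum(dist2(p, centers[lab]) for p, lab in zip(points, labels))

def ch_index(points, labels, centers):
    """Calinski-Harabasz index: between-cluster over within-cluster dispersion."""
    n, k = len(points), len(centers)
    overall = centroid(points)
    between = sum(labels.count(j) * dist2(centers[j], overall) for j in range(k))
    return (between / (k - 1)) / (sse(points, labels, centers) / (n - k))

# Hypothetical standardized passenger flow vectors forming two obvious groups.
flows = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
scores = {}
for k in (2, 3):
    labels, centers = kmeans(flows, k)
    scores[k] = {"labels": labels,
                 "sse": sse(flows, labels, centers),
                 "ch": ch_index(flows, labels, centers)}
```

In practice the SSE curve is inspected for an elbow, while larger silhouette and CH values indicate better separated clusters.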
The built environment characteristics of each cluster are then analyzed: they are standardized with the Min-Max method so that the values are mapped to [0,1] and divided into grades, and the weekday and weekend types of the stations are named according to the land use functions reflected by the built environment.
Specifically, the corresponding station workday types are: suburban living, mixed preferential employment, job mix, living dominant, employment dominant, tourist attraction, and transportation hub; the corresponding types of the weekends of the stations are as follows: comprehensive, commercial entertainment, employment oriented, mixed employment oriented, living oriented, mixed living oriented, tourist attraction oriented, and transportation hub oriented.
S3: and carrying out feature screening on each training data subset to obtain a plurality of screened training data subsets.
For the station groups, which are divided mainly by the spatial location label and secondarily by the type label, a GBRT (gradient boosted regression tree) machine learning sub-model is trained for the passenger flow of each time period on weekdays and weekends. The built environment features of each group are ranked and screened by feature importance, and the hyperparameters are tuned to obtain the preliminary prediction result.
Because there are many built environment features, correlation among them is unavoidable, and redundant features add noise to the model. Feature selection, that is, screening of influence factors, is therefore performed by calculating feature importance. Feature importance represents the degree to which each built environment feature influences the passenger flow in a given sub-model and is calculated as permutation importance: with the other features held unchanged, the values of one feature are randomly shuffled to replace their original order, the model is retrained and used to predict on the shuffled data set, and the corresponding accuracy score is obtained. The drop in score relative to the model before shuffling is the permutation importance of that feature, and this is repeated for each feature in turn. The optimal number of features and the specific features required by each sub-model are then obtained through recursive feature elimination.
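The permutation idea can be sketched in Python as follows. For brevity this sketch scores an already fitted model on the shuffled data instead of retraining it, which is the more common variant and a simplification of the procedure described above. The model `2 * x0`, the mean squared error score and the data are hypothetical.

```python
import random

def mse(model, X, y):
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when one feature column is shuffled."""
    rng = random.Random(seed)
    base = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - base)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical fitted sub-model that depends only on the first feature.
model = lambda row: 2.0 * row[0]
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0],
     [5.0, 7.0], [6.0, 2.0], [7.0, 6.0], [8.0, 4.0]]
y = [2.0 * row[0] for row in X]
importances = permutation_importance(model, X, y)
```

Here the second feature gets an importance of zero because the model ignores it, so it would be the first candidate to drop during feature elimination.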
S4: respectively training the machine learning model by utilizing each screened training data set to obtain a plurality of passenger flow predictor models; one passenger flow predictor model corresponds to at least one station type.
The GBRT algorithm updates the approximation function $f(x)$ by minimizing the expected value of the loss function $L(y, f(x))$, where $y$ is the actual passenger flow value. The squared error loss function is:
$L(y, f(x)) = (y - f(x))^{2}$
Based on the gradient descent direction, $f(x)$ is updated using $m$ regression trees, and the formula is:
$f_m(x) = f_{m-1}(x) + \zeta \, \rho_m \sum_{j} c_{jm} \, I(x \in R_{jm})$
where $\zeta$ is a shrinkage parameter between 0 and 1 that scales the contribution of each tree as a simple regularization to prevent overfitting; $c_{jm}$ is the constant value for the corresponding region $R_{jm}$; $R_{1m}, R_{2m}, \ldots, R_{jm}$ are the disjoint regions into which the tree divides the input space; $I(\cdot)$ is the indicator function; and $\rho_m$ is estimated by minimizing the expected value of the loss function, and the formula is:
$\rho_m = \arg\min_{\rho} \sum_{i} L\left(y_i, \, f_{m-1}(x_i) + \rho \sum_{j} c_{jm} \, I(x_i \in R_{jm})\right)$
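As a non-limiting illustration, the boosting procedure with a shrinkage parameter and region constants described above may be sketched as follows for the squared error loss. The sketch assumes single-split regression trees (two regions per tree) whose region constants are the mean residuals; under the squared error loss the separately estimated step size then equals 1 and is therefore omitted. All function names, parameter values and toy data are hypothetical.

```python
def fit_stump(X, residuals):
    """Single-split regression tree: two disjoint regions, each with a constant value.
    Assumes at least one feature takes two or more distinct values."""
    best = None
    for k in range(len(X[0])):
        values = sorted(set(row[k] for row in X))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[k] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[k] > threshold]
            c_left = sum(left) / len(left)     # constant for the left region
            c_right = sum(right) / len(right)  # constant for the right region
            sse = (sum((r - c_left) ** 2 for r in left)
                   + sum((r - c_right) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, k, threshold, c_left, c_right)
    _, k, threshold, c_left, c_right = best
    return lambda row: c_left if row[k] <= threshold else c_right

def fit_gbrt(X, y, n_trees=100, shrinkage=0.1):
    """Gradient boosted regression trees for the squared error loss."""
    f0 = sum(y) / len(y)  # initial constant approximation
    pred = [f0] * len(y)
    trees = []
    for _ in range(n_trees):
        # The residual is the negative gradient of the squared error loss (up to a factor of 2).
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(X, residuals)
        trees.append(tree)
        # Each tree's contribution is scaled by the shrinkage parameter.
        pred = [pi + shrinkage * tree(row) for pi, row in zip(pred, X)]
    return lambda row: f0 + shrinkage * sum(t(row) for t in trees)

# Made-up toy data: passenger flow jumps from 5 to 20 when the single feature exceeds 4.5.
X = [[float(i)] for i in range(10)]
y = [5.0] * 5 + [20.0] * 5
model = fit_gbrt(X, y)
```

A smaller `shrinkage` value generally requires more trees to reach the same fit, which is the regularization trade-off referred to above.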
As an alternative embodiment, step 102 specifically includes:
1) Determine the Euclidean distance between the built environment features of the new line station and the built environment features of the existing stations.
In practical application, the Euclidean distance between the new line station and the mean built environment feature data of each type of existing station is calculated, and the formula is as follows:
$D_j = \sqrt{\sum_{i} \left(x_{i\mathrm{new}} - x_{ij\mathrm{old}}\right)^{2}}$
where $D_j$ is the comprehensive distance, $x_{i\mathrm{new}}$ represents the i-th built environment feature value of the new line station, and $x_{ij\mathrm{old}}$ represents the mean of the i-th built environment feature of the j-th type of existing station.
2) Determine the station type of the new line station according to the Euclidean distance; the station type of the new line station is the station type of the existing stations corresponding to the minimum Euclidean distance.
In practical application, $D_j$ is used as the proximity criterion, and the station category whose built environment features have the minimum Euclidean distance to those of the new line station is taken as the category of the new line station.
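As a non-limiting illustration, the minimum-distance matching described above may be sketched as follows. The sketch assumes that the features have already been scaled to comparable ranges (an assumption, since the scaling is not specified here); the type names, function names and feature values are hypothetical.

```python
import math

def type_centroids(stations):
    """Mean built environment feature vector of each existing station type.
    `stations` is a list of (station_type, feature_vector) pairs."""
    sums, counts = {}, {}
    for station_type, features in stations:
        acc = sums.setdefault(station_type, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[station_type] = counts.get(station_type, 0) + 1
    return {t: [s / counts[t] for s in acc] for t, acc in sums.items()}

def classify_new_station(new_features, centroids):
    """Return the type with the minimum Euclidean distance, plus all distances."""
    distances = {t: math.dist(new_features, c) for t, c in centroids.items()}
    return min(distances, key=distances.get), distances

# Made-up toy data with two scaled features per station.
existing = [
    ("residential", [0.9, 0.1]),
    ("residential", [0.7, 0.3]),
    ("employment", [0.2, 0.8]),
    ("employment", [0.0, 1.0]),
]
centroids = type_centroids(existing)
station_type, distances = classify_new_station([0.6, 0.3], centroids)
```

The returned `distances` dictionary corresponds to one row of a distance statistics table, and `station_type` is the entry with the smallest value.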
In this embodiment, for example, the comprehensive distances between a new station and the mean built environment feature data of each type of existing station are shown in Table 3, and according to the minimum-distance principle the station is classified into the mixed residence-leaning type, mixed employment-leaning type and residence-dominant type.
TABLE 3 Euclidean distances between the built environment features of a certain station and those of each class of existing stations
As an alternative embodiment, step 104 specifically includes:
The prediction residual of the passenger flow prediction sub-model is determined using the geographically weighted regression model
$E_i = \beta_0(u_i, v_i) + \sum_{k=1}^{K} \beta_k(u_i, v_i) X_{ik} + \varepsilon_i$
where $(u_i, v_i)$ are the spatial coordinates of site i; $\beta_0(u_i, v_i)$ is the intercept; $\beta_k(u_i, v_i)$ are the regression coefficients between the dependent variable and the explanatory variables; $K$ is the number of built environment features (the number of independent variables, corresponding to the screened features); $E_i$ is the passenger flow error estimated by the machine learning sub-model at site i (the prediction residual of the passenger flow prediction sub-model at site i); $X_{ik}$ is the built environment feature matrix of site i; and $\varepsilon_i$ is the error term of the prediction residual at site i.
In practical application, each built environment feature shows obvious spatial autocorrelation, and a GWR model (geographic weighted regression model) is used to capture the prediction residual. The formula is as follows:
$\beta_k(u_i, v_i) = \left[X^{T} W(u_i, v_i) X\right]^{-1} X^{T} W(u_i, v_i) E$
where W is the spatial weight matrix and E is the vector of passenger flow errors estimated by the machine learning sub-model. The distribution of the residuals before and after correction is shown in fig. 3; after GWR correction, the residuals are significantly reduced and behave as random disturbances.
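As a non-limiting illustration, the locally weighted least squares estimate described above may be sketched as follows. The sketch assumes a Gaussian distance-decay kernel with a fixed bandwidth for the spatial weights (the kernel and bandwidth are assumptions and are not specified here); the function names, coordinates and residual values are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gwr_coefficients(target, coords, X, E, bandwidth):
    """Local coefficients [beta_0, beta_1, ...] at the location `target` = (u, v)."""
    # Diagonal of the spatial weight matrix: Gaussian kernel (an assumption).
    w = [math.exp(-0.5 * (math.dist(target, c) / bandwidth) ** 2) for c in coords]
    D = [[1.0] + list(row) for row in X]  # design matrix with an intercept column
    p = len(D[0])
    XtWX = [[sum(wi * d[a] * d[b] for wi, d in zip(w, D)) for b in range(p)]
            for a in range(p)]
    XtWE = [sum(wi * d[a] * e for wi, d, e in zip(w, D, E)) for a in range(p)]
    return solve(XtWX, XtWE)

def gwr_predict_residual(target, features, coords, X, E, bandwidth):
    """Estimated prediction residual at a new station located at `target`."""
    beta = gwr_coefficients(target, coords, X, E, bandwidth)
    return beta[0] + sum(b * x for b, x in zip(beta[1:], features))

# Made-up toy data: residuals of existing stations follow E = 2 + 3 * x everywhere.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
X = [[1.0], [2.0], [0.5], [3.0], [1.5]]
E = [2.0 + 3.0 * row[0] for row in X]
beta = gwr_coefficients((0.5, 0.5), coords, X, E, bandwidth=1.0)
residual = gwr_predict_residual((0.5, 0.5), [4.0], coords, X, E, bandwidth=1.0)
```

Because the toy relationship is the same at every location, the local estimate recovers it regardless of the weights; with real residuals the coefficients would vary with `target`.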
As an alternative embodiment, step 105 specifically includes:
The final passenger flow of the new line station is determined using the formula $V(s, t) = M(s, t) + E(s, t)$, where $M(s, t)$ is the preliminary predicted passenger flow of station s at time t, $E(s, t)$ is the prediction residual of station s at time t, and $V(s, t)$ is the final passenger flow of the new line station.
The final prediction result (final passenger flow) is obtained by summing the preliminary prediction result of the machine learning sub-model (passenger flow prediction sub-model) and the residual value (prediction residual) estimated by GWR. A flow chart of the specific inputs and outputs of the prediction model is shown in fig. 4.
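As a non-limiting illustration, the summation of the preliminary prediction and the estimated residual may be sketched as follows. The dictionary layout keyed by (station, period), the function name and all values are hypothetical.

```python
def final_passenger_flow(preliminary, residual):
    """V(s, t) = M(s, t) + E(s, t) for every (station, period) key."""
    return {key: preliminary[key] + residual[key] for key in preliminary}

# Made-up values for one hypothetical station in two periods.
M = {("S1", "am_peak"): 1200.0, ("S1", "all_day"): 8000.0}
E = {("S1", "am_peak"): -85.5, ("S1", "all_day"): 240.0}
V = final_passenger_flow(M, E)
```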
In this embodiment, for example, the workday experimental results of applying the grouped prediction-correction model to a new line are shown in Table 4. The predicted label is the mixed residence-leaning type in the southwest urban area. A data set consisting of the built environment features and the passenger flow of the corresponding group and period is input for training, and the data of the employment-oriented and mixed commerce-leaning stations far from the new line are removed. The best prediction result is output after feature screening and hyperparameter tuning. The feature screening and importance ranking for the all-day inbound volume prediction of this group on workdays and weekends are shown in fig. 5. Combining the geographic weighted regression model improves the average error by 15%. For the vast majority of the other stations, the all-day inbound and outbound volume errors are concentrated in (-10%, 10%), and the morning and evening peak inbound and outbound volume errors are concentrated in (-35%, 35%).
TABLE 4 Workday passenger flow prediction results for a certain new line and its stations
The rail transit new line passenger flow prediction method provided by the invention explores the correspondence between passenger flow and built environment features on the basis of multi-source rail transit data and establishes rules for matching stations by built environment features, thereby effectively supporting the functional positioning and operation management of stations on newly opened lines. The method comprehensively considers and identifies the multi-dimensional built environment factors that affect passenger flow, captures the nonlinear influence of these factors on passenger flow, and, by fusing a linear model, takes advantage of that model's ability to account for spatio-temporal heterogeneity, so that a highly accurate passenger flow prediction result is obtained that meets the needs of practical production applications. The invention improves the prediction accuracy of new line passenger flow, raises the level of statistics, analysis and visualization of multi-source station data, and advances the construction of the digital infrastructure of rail transit.
Example two
In order to implement the method of the above embodiment and achieve the corresponding functions and technical effects, a rail transit new line passenger flow prediction system is provided below, which includes:
the data acquisition module is used for acquiring the passenger-flow-related built environment characteristics of the new line station whose passenger flow is to be predicted; the passenger-flow-related built environment characteristics are determined by feature screening of the built environment characteristics; the built environment characteristics comprise land use characteristics, regional socioeconomic characteristics, station intrinsic characteristics, external traffic characteristics and spatial location characteristics; the land use characteristics comprise residential area, office area, government POIs, hospital POIs, entertainment POIs, tourism POIs, education POIs and land use mix within the station attraction range; the regional socioeconomic characteristics comprise employment density, residential density and housing price within the station attraction range; the station intrinsic characteristics comprise degree, betweenness, distance from the station to the city center, number of entrances and exits, and number of local rail stations; the external traffic characteristics comprise road density and number of buses within the station attraction range; the spatial location characteristics comprise the spatial orientation of the station and its suburban attribute.
The station type determining module is used for determining the station type of the new line station according to the built environment characteristics of the new line station whose passenger flow is to be predicted; the station type is suburban residence type, mixed employment type, job-oriented residence type, employment-oriented residence type, tourist attraction type, transportation junction type, comprehensive type or commercial entertainment type.
The preliminary prediction module is used for preliminarily predicting the passenger flow of the new line station by using the passenger flow prediction sub-model corresponding to the station type to obtain the preliminary predicted passenger flow; the passenger flow prediction sub-model is obtained by training a machine learning model with a training data set; the training data set comprises the built environment characteristics of the existing stations and the corresponding passenger flow data of different periods.
The prediction residual determination module is used for determining the prediction residual of the passenger flow prediction sub-model by using a geographic weighted regression model.
And the passenger flow determining module is used for determining the final passenger flow of the new line station according to the preliminary predicted passenger flow and the predicted residual error.
The intelligent system for predicting the new line passenger flow of the rail transit improves the prediction precision of the new line passenger flow, improves the statistics and analysis visualization level of the station multisource data, and improves the construction of the digital infrastructure of the rail transit.
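As a non-limiting illustration, the cooperation of the modules listed above may be sketched as follows. The sketch assumes that each module is supplied as a callable and that the acquired features are passed in directly; the class name, field names and the stand-in callables are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

@dataclass
class NewLineFlowPredictor:
    # Station type determining module: features -> station type.
    classify: Callable[[Sequence[float]], str]
    # Preliminary prediction module: one sub-model per station type.
    sub_models: Dict[str, Callable[[Sequence[float]], float]]
    # Prediction residual determination module: (coordinates, features) -> residual.
    residual_model: Callable[[Tuple[float, float], Sequence[float]], float]

    def predict(self, coords: Tuple[float, float], features: Sequence[float]) -> float:
        station_type = self.classify(features)
        preliminary = self.sub_models[station_type](features)
        residual = self.residual_model(coords, features)
        # Passenger flow determining module: preliminary prediction plus residual.
        return preliminary + residual

# Made-up stand-ins for trained components.
predictor = NewLineFlowPredictor(
    classify=lambda features: "residential",
    sub_models={"residential": lambda features: 1000.0},
    residual_model=lambda coords, features: -50.0,
)
flow = predictor.predict((0.0, 0.0), [1.0, 2.0])
```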
Example three
The invention provides an electronic device comprising a memory and a processor, wherein the memory is used for storing a computer program, and the processor runs the computer program to cause the electronic device to execute the rail transit new line passenger flow prediction method of the first embodiment.
Example four
The present invention provides a computer-readable storage medium storing a computer program which when executed by a processor implements the rail transit new line passenger flow prediction method of the first embodiment.
In the present specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the identical and similar parts the embodiments may be referred to one another. Since the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and for the relevant points reference may be made to the description of the method.
The principles and implementations of the present invention have been described herein with reference to specific examples, and the description of the above embodiments is intended only to assist in understanding the method of the present invention and its core ideas; meanwhile, those of ordinary skill in the art may, in light of the ideas of the present invention, make changes to the specific implementations and the scope of application. In view of the foregoing, the contents of this description should not be construed as limiting the invention.

Claims (8)

1. A rail transit new line passenger flow prediction method, characterized by comprising the following steps:
acquiring the passenger-flow-related built environment characteristics of a new line station whose passenger flow is to be predicted; the passenger-flow-related built environment characteristics are determined by feature screening of the built environment characteristics; the built environment characteristics comprise land use characteristics, regional socioeconomic characteristics, station intrinsic characteristics, external traffic characteristics and spatial location characteristics; the land use characteristics comprise residential area, office area, government POIs, hospital POIs, entertainment POIs, tourism POIs, education POIs and land use mix within the station attraction range; the regional socioeconomic characteristics comprise employment density, residential density and housing price within the station attraction range; the station intrinsic characteristics comprise degree, betweenness, distance from the station to the city center, number of entrances and exits, and number of local rail stations; the external traffic characteristics comprise road density and number of buses within the station attraction range; the spatial location characteristics comprise the spatial orientation of the station and its suburban attribute;
determining the station type of the new line station according to the built environment characteristics of the new line station of the passenger flow to be predicted; the station type is suburban residence type, mixed employment type, job-oriented residence type, employment-oriented residence type, tourist attraction type, transportation junction type, comprehensive type or commercial entertainment type;
primarily predicting the passenger flow of the new line station by using a passenger flow prediction sub-model corresponding to the station type to obtain primarily predicted passenger flow; the passenger flow prediction sub-model is obtained by training a machine learning model by using a training data set; the training data set comprises the built environment characteristics of the existing station and corresponding passenger flow data of different time periods;
determining a prediction residual error of the passenger flow prediction sub-model by using a geographic weighted regression model;
and determining the final passenger flow of the new line station according to the preliminary predicted passenger flow and the predicted residual error.
2. The method for predicting new line passenger flow of rail transit of claim 1, wherein training the machine learning model with the training data set specifically comprises:
acquiring the training data set;
grouping the training data set according to spatial location and station type to obtain a plurality of training data subsets; one of the training data subsets corresponds to at least one station type;
performing feature screening on each training data subset to obtain a plurality of screened training data subsets;
training the machine learning model with each screened training data subset respectively to obtain a plurality of passenger flow prediction sub-models; one passenger flow prediction sub-model corresponds to at least one station type.
3. The method for predicting the new line passenger flow of the rail transit according to claim 2, wherein the determining the station type of the new line station according to the built environment characteristics of the new line station of the passenger flow to be predicted specifically comprises:
determining the Euclidean distance between the built environmental characteristics of the new line station and the built environmental characteristics of the existing station;
determining the station type of the new line station according to the Euclidean distance; the station type of the new line station is the station type of the existing station corresponding to the minimum Euclidean distance.
4. The method for predicting new line passenger flow of rail transit according to claim 1, wherein determining the prediction residual of the passenger flow prediction sub-model by using a geographic weighted regression model specifically comprises:
determining the prediction residual of the passenger flow prediction sub-model by using the geographic weighted regression model $E_i = \beta_0(u_i, v_i) + \sum_{k=1}^{K} \beta_k(u_i, v_i) X_{ik} + \varepsilon_i$; where $(u_i, v_i)$ are the spatial coordinates of site i; $\beta_0(u_i, v_i)$ is the intercept; $\beta_k(u_i, v_i)$ are the regression coefficients between the dependent variable and the explanatory variables; $K$ is the number of built environment features; $E_i$ is the prediction residual of the passenger flow prediction sub-model at site i; $X_{ik}$ is the built environment feature matrix of site i; and $\varepsilon_i$ is the error term of the prediction residual at site i.
5. The method for predicting new line passenger flow of rail transit according to claim 1, wherein determining the final passenger flow of the new line station according to the preliminary predicted passenger flow and the prediction residual error comprises:
determining the final passenger flow of the new line station by using the formula V (s, t) =m (s, t) +e (s, t); m (s, t) is the preliminary predicted passenger flow of the station s at the moment t; e (s, t) is the prediction residual of station s at time t; v (s, t) is the final passenger flow of the new line station.
6. A rail transit new line passenger flow prediction system, comprising:
the data acquisition module is used for acquiring the passenger-flow-related built environment characteristics of the new line station whose passenger flow is to be predicted; the passenger-flow-related built environment characteristics are determined by feature screening of the built environment characteristics; the built environment characteristics comprise land use characteristics, regional socioeconomic characteristics, station intrinsic characteristics, external traffic characteristics and spatial location characteristics; the land use characteristics comprise residential area, office area, government POIs, hospital POIs, entertainment POIs, tourism POIs, education POIs and land use mix within the station attraction range; the regional socioeconomic characteristics comprise employment density, residential density and housing price within the station attraction range; the station intrinsic characteristics comprise degree, betweenness, distance from the station to the city center, number of entrances and exits, and number of local rail stations; the external traffic characteristics comprise road density and number of buses within the station attraction range; the spatial location characteristics comprise the spatial orientation of the station and its suburban attribute;
the station type determining module is used for determining the station type of the new line station according to the built environment characteristics of the new line station of the passenger flow to be predicted; the station type is suburban residence type, mixed employment type, job-oriented residence type, employment-oriented residence type, tourist attraction type, transportation junction type, comprehensive type or commercial entertainment type;
the preliminary prediction module is used for preliminarily predicting the passenger flow of the new line station by utilizing the passenger flow prediction sub-model corresponding to the station type to obtain preliminary predicted passenger flow; the passenger flow prediction sub-model is obtained by training a machine learning model by using a training data set; the training data set comprises the built environment characteristics of the existing station and corresponding passenger flow data of different time periods;
the prediction residual determination module is used for determining the prediction residual of the passenger flow prediction sub-model by using a geographic weighted regression model;
and the passenger flow determining module is used for determining the final passenger flow of the new line station according to the preliminary predicted passenger flow and the predicted residual error.
7. An electronic device, comprising: a memory for storing a computer program, and a processor that runs the computer program to cause the electronic device to perform the rail transit new line passenger flow prediction method of any one of claims 1-5.
8. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the rail transit new line passenger flow prediction method of any one of claims 1-5.
CN202310793330.XA 2023-06-30 2023-06-30 Method, system, electronic equipment and medium for predicting new line passenger flow of rail transit Pending CN116796904A (en)

Publications (1)

Publication Number Publication Date
CN116796904A true CN116796904A (en) 2023-09-22

Family

ID=88041809

Country Status (1)

Country Link
CN (1) CN116796904A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117407774A (en) * 2023-12-15 2024-01-16 深圳市地铁集团有限公司 Traffic data processing method and system based on artificial intelligence
CN117688456A (en) * 2024-02-04 2024-03-12 四川轻化工大学 Machine learning auxiliary mixing method for rail transit station classification dynamics

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117407774A (en) * 2023-12-15 2024-01-16 深圳市地铁集团有限公司 Traffic data processing method and system based on artificial intelligence
CN117407774B (en) * 2023-12-15 2024-03-26 深圳市地铁集团有限公司 Traffic data processing method and system based on artificial intelligence
CN117688456A (en) * 2024-02-04 2024-03-12 四川轻化工大学 Machine learning auxiliary mixing method for rail transit station classification dynamics
CN117688456B (en) * 2024-02-04 2024-04-23 四川轻化工大学 Machine learning auxiliary mixing method for rail transit station classification dynamics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination