CN114444795A - Single-line bus passenger travel data generation method - Google Patents
- Publication number
- CN114444795A; CN202210081034.2A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06Q10/047—Optimisation of routes or paths, e.g. travelling salesman problem
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
Abstract
The invention discloses a method for generating single-line bus passenger travel data. The method randomly obtains G candidate lines; obtains space vectors of the stations of all candidate lines from station spatial information; obtains, from the space vectors, the total spatial similarity value between each candidate line and a target line; selects the candidate lines corresponding to the first Q total spatial similarity values as preferred candidate lines; calculates a data similarity index and a similarity to obtain the similarity between each preferred candidate line and the target line; selects the preferred candidate line with the highest similarity as the optimal candidate line; and finally uses the optimal candidate line as a learning sample to generate passenger travel data of the target sample through a cycle generative adversarial network algorithm. The method can conveniently, quickly and inexpensively generate a large amount of bus passenger travel data for the target line based on real data; the generated data conforms to real patterns, and the method can be applied in fields such as bus analysis.
Description
Technical Field
The invention relates to the field of transportation engineering, belongs to the category of urban public transport, and particularly relates to a single-line bus passenger trip data generation method.
Background
The bus passenger travel data refers to a series of data records generated when passengers travel by using buses, and the data records comprise passenger boarding station information, passenger alighting station information, corresponding boarding and alighting time, line numbers and the like. Passenger flow analysis and passenger classification of the urban public transport system can be carried out through public transport passenger travel data, and public transport travel information is widely applied to urban public transport operation scheduling.
With the rapid development of the mobile internet, bus fare payment has seen a series of innovations. Urban buses in China use two fare systems: a segmented (distance-based) fare system and a flat-fare (one-ticket) system. The segmented system calculates the fare by recording each passenger's boarding and alighting information, whereas the flat-fare system records only the time and location of boarding and not those of alighting. For urban buses using the flat-fare system, alighting data is therefore missing and the bus passenger data is incomplete. At present, the data is commonly completed by the vehicle-following method and the alighting-station inference method. However, the alighting-station inference method relies on assumptions about the data and has low accuracy, while the vehicle-following method has a complex and tedious workflow, relies on manual tabulation, is inefficient and is costly.
Disclosure of Invention
The invention provides a method for generating single-line bus passenger travel data. Based on an existing complete travel data set of a similar single bus line, the method can generate passenger travel data for a target single bus line under the flat-fare (one-ticket) system conveniently, quickly and at low cost, providing the data needed for application scenarios such as bus analysis.
To achieve this purpose, the technical solution of the invention is as follows:
A single-line bus passenger travel data generation method comprises the following steps:
S1, randomly selecting G bus lines as candidate lines, and acquiring historical ridership data and station spatial information of all the candidate lines;
S2, determining space vectors of all stations in the G candidate lines according to the station spatial information;
S3, calculating, according to the space vectors of the stations, spatial similarity indexes between all stations in the target line and all stations in any candidate line, to obtain the total spatial similarity value of that candidate line and the target line;
S4, obtaining, for each candidate line, a total spatial similarity value with the target line in the manner of step S3, sorting the total spatial similarity values of all the candidate lines in descending order, and selecting the candidate lines corresponding to the first Q total spatial similarity values as preferred candidate lines;
S5, calculating data similarity indexes between all stations in the target line and all stations in any preferred candidate line, to obtain the total data similarity value of that preferred candidate line and the target line;
S6, obtaining, for each preferred candidate line, a total data similarity value with the target line in the manner of step S5, calculating the similarity between each preferred candidate line and the target line from the obtained total data similarity value, and selecting the preferred candidate line with the maximum similarity as the optimal candidate line;
S7, taking the optimal candidate line as the learning sample and the target line as the target sample;
S8, generating passenger travel data of the target sample from the line travel data of the learning sample through a cycle generative adversarial network algorithm.
The method first randomly obtains G bus lines as candidate lines and obtains the space vectors of all stations of all candidate lines from the station spatial information. From the space vectors, it obtains the total spatial similarity value between each candidate line and a target line selected in advance, and selects the candidate lines corresponding to the first Q total spatial similarity values (Q < G, both positive integers) as preferred candidate lines. For the Q preferred candidate lines, it then calculates the data similarity index and the similarity to obtain the similarity between each preferred candidate line and the target line, and selects the preferred candidate line with the highest similarity as the optimal candidate line. Finally, with the optimal candidate line as the learning sample and the target line as the target sample, it generates the passenger travel data of the target sample from the travel data of the learning sample through a cycle generative adversarial network algorithm.
Further, in step S2, the station spatial information includes longitude and latitude information and point-of-interest classification information, and the process of determining the station space vectors of a candidate line according to the station spatial information is as follows:
The longitude and latitude of each station and the number of points of interest within the tolerance radius are determined. The longitude and latitude of each station are recorded as a 2-dimensional vector l; the tolerance radius of the station is denoted r; the numbers of points of interest in each of k categories within the tolerance radius r of the station are counted and recorded as a k-dimensional vector h; the space vector of each station is then represented by the (2+k)-dimensional vector (l, h).
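As an illustration only, the construction of a station's space vector (l, h) could be sketched in Python as follows. The function names, the use of the haversine distance, the `(lon, lat, category)` layout of the point-of-interest records and the radius used in the example are assumptions made for this sketch; the text does not specify how distances are measured or how categories are encoded.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed distance model)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def station_space_vector(lon, lat, pois, categories, r):
    """Build the (2 + k)-dimensional vector (l, h) for one station.

    pois       -- iterable of (lon, lat, category) records (hypothetical layout)
    categories -- ordered list of the k point-of-interest categories
    r          -- tolerance radius in metres
    """
    counts = {c: 0 for c in categories}
    for poi_lon, poi_lat, category in pois:
        if category in counts and haversine_m(lon, lat, poi_lon, poi_lat) <= r:
            counts[category] += 1
    l = [lon, lat]                       # 2-dimensional location part
    h = [counts[c] for c in categories]  # k-dimensional point-of-interest counts
    return l + h


example_pois = [
    (113.000, 23.0, "school"),  # at the station itself
    (113.001, 23.0, "shop"),    # roughly 100 m away
    (114.000, 23.0, "shop"),    # roughly 100 km away, outside the radius
]
example_vector = station_space_vector(
    113.0, 23.0, example_pois, ["school", "shop", "hospital"], r=500.0
)
```

Keeping the category order fixed across all stations matters here, since the k-dimensional parts of two stations are only comparable component by component.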
Further, in step S3, the calculation formula of the spatial similarity index is as follows:
In the formula, $s_i$ is the i-th station in the target line; $b_j$ is the j-th station in the candidate line; $P(s_i, b_j)$ is the spatial similarity index of station $s_i$ and station $b_j$; $l_i$ and $h_i$ are the longitude/latitude and point-of-interest information of station $s_i$; $l_j$ and $h_j$ are the longitude/latitude and point-of-interest information of station $b_j$; $\alpha$ and $\beta$ are the influence coefficients of the longitude/latitude and of the point-of-interest information on the spatial similarity index, respectively. The value range of the spatial similarity index P is [-1, 1], and the larger the value, the higher the spatial similarity of the two stations.
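The formula itself is not reproduced in this text. Purely as an illustration of how an index with the stated properties could behave, the sketch below uses a weighted sum of two cosine similarities with α + β = 1, which stays within [-1, 1]. This form, the function names and the default weights are assumptions for the sketch and should not be read as the patented formula.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def spatial_similarity(l_i, h_i, l_j, h_j, alpha=0.5, beta=0.5):
    """Hypothetical P(s_i, b_j): alpha * cos(l_i, l_j) + beta * cos(h_i, h_j).

    With alpha + beta = 1 and both weights non-negative, the result lies in [-1, 1].
    """
    return alpha * cosine(l_i, l_j) + beta * cosine(h_i, h_j)


# Same location direction, unrelated point-of-interest profiles.
p_example = spatial_similarity([1.0, 0.0], [2.0, 0.0], [1.0, 0.0], [0.0, 3.0])
```

Under this assumed form, raw longitude/latitude vectors of stations in the same city would have cosines very close to 1, so in practice the location part would probably need to be centred or rescaled before it could discriminate between stations.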
Further, in step S3, the specific process of obtaining the total spatial similarity value between the candidate line and the target line is as follows:
Let the target line have n stations, i.e. $s_i \in \{s_1, s_2, \ldots, s_n\}$, $i = 1, 2, \ldots, n$, and let the candidate line have m stations, i.e. $b_j \in \{b_1, b_2, \ldots, b_m\}$, $j = 1, 2, \ldots, m$;
First, the spatial similarity index between station $s_1$ of the target line and each of the m stations of the candidate line is calculated, giving m spatial similarity indexes for station $s_1$; the maximum of these is taken as $P_1$, the maximum spatial similarity index of station $s_1$ with respect to the candidate line. By analogy, the set of maximum spatial similarity indexes of the n stations of the target line with respect to the candidate line is obtained and denoted $\{P_1, P_2, \ldots, P_n\}$. Finally, $P_1, P_2, \ldots, P_n$ are summed to obtain the total spatial similarity value of the candidate line and the target line.
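The max-then-sum aggregation described above can be sketched as follows. The function name `total_similarity` and the idea of passing the pairwise index as a callable are choices made for this sketch; the toy index in the example is a stand-in, not the spatial similarity index of the method.

```python
def total_similarity(target_stations, candidate_stations, index):
    """Sum, over the n target stations, of the best index value among the m candidate stations.

    index -- callable index(s, b) returning the pairwise similarity index
    """
    total = 0.0
    for s in target_stations:
        # P_i: maximum index of target station s with respect to the candidate line
        best = max(index(s, b) for b in candidate_stations)
        total += best
    return total


# Toy example with stations represented by numbers and a stand-in index.
toy_index = lambda s, b: -abs(s - b)
toy_total = total_similarity([1, 5], [2, 4, 9], toy_index)
```

Because each target station contributes exactly one term, the total grows with n, so totals are directly comparable across candidate lines for one fixed target line.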
Further, in step S5, the data similarity index is calculated as follows:
In the formula, $E(s_i, b_j)$ is the data similarity index of station $s_i$ and station $b_j$; $R_s$ and $R_b$ are the total numbers of passengers of the target line and of the preferred candidate line, respectively; $T_u$ and $T_d$ are the numbers of passengers boarding and alighting at station $s_i$ of the target line, respectively; $C_u$ and $C_d$ are the numbers of passengers boarding and alighting at station $b_j$ of the preferred candidate line, respectively. The value range of the data similarity index E is [0, 1], and the larger the value, the lower the data similarity of the two stations.
Further, in step S5, the specific process of obtaining the data similarity total value of the preferred candidate route and the target route is as follows:
First, the data similarity index between station $s_1$ of the target line and each of the m stations of the preferred candidate line is calculated, giving m data similarity indexes for station $s_1$; the maximum of these is taken as $E_1$, the maximum data similarity index of station $s_1$ with respect to the preferred candidate line. By analogy, the set of maximum data similarity indexes of the n stations of the target line with respect to the preferred candidate line is obtained and denoted $\{E_1, E_2, \ldots, E_n\}$. Finally, $E_1, E_2, \ldots, E_n$ are summed to obtain the total data similarity value of the preferred candidate line and the target line.
Further, in step S6, the calculation formula of the similarity is as follows:
In the formula, $V(S, b)$ is the similarity between the preferred candidate line and the target line; $E_i \in \{E_1, E_2, \ldots, E_n\}$, where $E_i$ is the maximum data similarity index of station $s_i$ of the target line with respect to the preferred candidate line, and $\sum_{i=1}^{n} E_i$ is the total data similarity value of the preferred candidate line and the target line. The value range of the similarity V is [0, 1], and the larger the value, the more similar the preferred candidate line and the target line.
Further, in step S7, the flat-fare (one-ticket) ridership data set of the target line is used as the data source of the target sample, and the complete passenger ridership data set of the optimal candidate line is used as the data source of the learning sample.
Further, in step S8, a cycle generative adversarial network model is used as the generator. The travel data matrix of the learning sample is input to the generator, which iteratively generates simulation data; the generated simulation data is passed to a decision device for judgment, and if the similarity of the simulation data is not lower than a set threshold, the generated simulation data is considered valid and is output as the passenger travel data of the target sample.
Further, in the decision device, the similarity between the simulation data and the real travel data of the learning sample is first calculated and then compared with the set threshold; if the similarity of the simulation data is not lower than the set threshold, the generated simulation data is considered valid and is output;
The set threshold is the similarity between the optimal candidate line obtained in step S6 and the target line.
The invention has the beneficial effects that:
The method can generate passenger travel data of the target line from the spatial information of the bus stations and the real data of the optimal candidate line. Compared with the traditional vehicle-following method and the traditional alighting-station inference method, it generates bus passenger travel data of a target line under the flat-fare (one-ticket) system conveniently, quickly and at low cost, based on the historical real data of the optimal candidate line. Only the spatial information of the stations is required, so the method is widely applicable; the obtained data has high accuracy, and the method can be applied in fields such as bus analysis.
Drawings
Fig. 1 is a flow chart of a single-line bus passenger travel data generation method of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent; for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted. The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent.
Example 1:
As shown in Fig. 1, a single-line bus passenger travel data generation method comprises the following steps:
S1, randomly selecting G bus lines as candidate lines, and acquiring historical ridership data and station spatial information of all the candidate lines;
S2, determining space vectors of all stations in the G candidate lines according to the station spatial information;
S3, calculating, according to the space vectors of the stations, spatial similarity indexes between all stations in the target line and all stations in any candidate line, to obtain the total spatial similarity value of that candidate line and the target line;
S4, obtaining, for each candidate line, a total spatial similarity value with the target line in the manner of step S3, sorting the total spatial similarity values of all the candidate lines in descending order, and selecting the candidate lines corresponding to the first Q total spatial similarity values as preferred candidate lines;
S5, calculating data similarity indexes between all stations in the target line and all stations in any preferred candidate line, to obtain the total data similarity value of that preferred candidate line and the target line;
S6, obtaining, for each preferred candidate line, a total data similarity value with the target line in the manner of step S5, calculating the similarity between each preferred candidate line and the target line from the obtained total data similarity value, and selecting the preferred candidate line with the maximum similarity as the optimal candidate line;
S7, taking the optimal candidate line as the learning sample and the target line as the target sample;
S8, generating passenger travel data of the target sample from the line travel data of the learning sample through a cycle generative adversarial network algorithm.
The method first randomly obtains G bus lines as candidate lines and obtains the space vectors of all stations of all candidate lines from the station spatial information. From the space vectors, it obtains the total spatial similarity value between each candidate line and a target line selected in advance, and selects the candidate lines corresponding to the first Q total spatial similarity values (Q < G, both positive integers) as preferred candidate lines. For the Q preferred candidate lines, it then calculates the data similarity index and the similarity to obtain the similarity between each preferred candidate line and the target line, and selects the preferred candidate line with the highest similarity as the optimal candidate line. Finally, with the optimal candidate line as the learning sample and the target line as the target sample, it generates the passenger travel data of the target sample from the travel data of the learning sample through a cycle generative adversarial network algorithm.
In the present embodiment, the values of G and Q can be set according to actual conditions. For example, 100 public transportation lines may be randomly selected as candidate lines, that is, G is 100, and after the spatial similarity total values of all the candidate lines and the target line selected in advance are obtained, the candidate lines corresponding to the top 10 spatial similarity total values are selected as preferred candidate lines, that is, Q is 10.
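The choice of G = 100 and Q = 10 in this example can be sketched as a random draw followed by a descending sort. The function names, the pool of 500 line identifiers, the fixed seed and the placeholder totals are assumptions made for the sketch; real totals would come from the spatial similarity calculation of step S3.

```python
import random


def sample_candidates(all_line_ids, g, seed=None):
    """Randomly draw g distinct bus lines as candidate lines (step S1)."""
    rng = random.Random(seed)
    return rng.sample(list(all_line_ids), g)


def select_preferred(totals, q):
    """Return the q line ids with the largest total spatial similarity value (step S4).

    totals -- dict mapping line id to its total spatial similarity value
    """
    ranked = sorted(totals, key=totals.get, reverse=True)  # descending order
    return ranked[:q]


candidates = sample_candidates(range(500), g=100, seed=0)
# Placeholder totals for demonstration only.
placeholder_totals = {line_id: float(line_id) for line_id in candidates}
preferred = select_preferred(placeholder_totals, q=10)
```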
In step S2 of this embodiment, the station spatial information includes longitude and latitude information and point-of-interest classification information, and the process of determining the station space vectors of a candidate line according to the station spatial information is as follows:
The longitude and latitude of each station and the number of points of interest within the tolerance radius are determined. The longitude and latitude of each station are recorded as a 2-dimensional vector l; the tolerance radius of the station is denoted r; the numbers of points of interest in each of k categories within the tolerance radius r of the station are counted and recorded as a k-dimensional vector h; the space vector of each station is then represented by the (2+k)-dimensional vector (l, h).
In step S3 of the present embodiment, the calculation formula of the spatial similarity index is as follows:
In the formula, $s_i$ is the i-th station in the target line; $b_j$ is the j-th station in the candidate line; $P(s_i, b_j)$ is the spatial similarity index of station $s_i$ and station $b_j$; $l_i$ and $h_i$ are the longitude/latitude and point-of-interest information of station $s_i$; $l_j$ and $h_j$ are the longitude/latitude and point-of-interest information of station $b_j$; $\alpha$ and $\beta$ are the influence coefficients of the longitude/latitude and of the point-of-interest information on the spatial similarity index, respectively. The value range of the spatial similarity index P is [-1, 1], and the larger the value, the higher the spatial similarity of the two stations.
According to the spatial similarity index, the specific process of obtaining the spatial similarity total value of the candidate line and the target line is as follows:
Let the target line have n stations, i.e. $s_i \in \{s_1, s_2, \ldots, s_n\}$, $i = 1, 2, \ldots, n$, and let the candidate line have m stations, i.e. $b_j \in \{b_1, b_2, \ldots, b_m\}$, $j = 1, 2, \ldots, m$;
First, the spatial similarity index between station $s_1$ of the target line and each of the m stations of the candidate line is calculated, giving m spatial similarity indexes for station $s_1$; the maximum of these is taken as $P_1$, the maximum spatial similarity index of station $s_1$ with respect to the candidate line. By analogy, the set of maximum spatial similarity indexes of the n stations of the target line with respect to the candidate line is obtained and denoted $\{P_1, P_2, \ldots, P_n\}$. Finally, $P_1, P_2, \ldots, P_n$ are summed to obtain the total spatial similarity value of the candidate line and the target line.
In step S5 of the present embodiment, the calculation formula of the data similarity index is as follows:
In the formula, $E(s_i, b_j)$ is the data similarity index of station $s_i$ and station $b_j$; $R_s$ and $R_b$ are the total numbers of passengers of the target line and of the preferred candidate line, respectively; $T_u$ and $T_d$ are the numbers of passengers boarding and alighting at station $s_i$ of the target line, respectively; $C_u$ and $C_d$ are the numbers of passengers boarding and alighting at station $b_j$ of the preferred candidate line, respectively. The value range of the data similarity index E is [0, 1], and the larger the value, the lower the data similarity of the two stations.
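The formula itself is not reproduced in this text. As a purely hypothetical illustration of an index built from the same quantities and satisfying the stated properties (range [0, 1], larger meaning less similar), the sketch below averages the absolute differences of the boarding and alighting shares of the two stations. This form and the function name are assumptions for the sketch and should not be read as the patented formula.

```python
def data_dissimilarity(t_u, t_d, r_s, c_u, c_d, r_b):
    """Hypothetical E(s_i, b_j) from boarding/alighting shares.

    t_u, t_d -- passengers boarding / alighting at target-line station s_i
    r_s      -- total passengers of the target line
    c_u, c_d -- passengers boarding / alighting at candidate-line station b_j
    r_b      -- total passengers of the preferred candidate line
    Returns a value in [0, 1]; 0 means identical shares.
    """
    if r_s <= 0 or r_b <= 0:
        raise ValueError("line passenger totals must be positive")
    boarding_gap = abs(t_u / r_s - c_u / r_b)
    alighting_gap = abs(t_d / r_s - c_d / r_b)
    return (boarding_gap + alighting_gap) / 2


# Same shares on lines of different size give 0; opposite extremes give 1.
e_same = data_dissimilarity(10, 5, 100, 20, 10, 200)
e_opposite = data_dissimilarity(100, 0, 100, 0, 100, 100)
```

Normalising by the line totals $R_s$ and $R_b$ is what allows stations on a busy line and a quiet line to be compared at all; any alternative form would need some equivalent scaling.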
According to the data similarity index, the specific process of obtaining the data similarity total value of the preferred candidate line and the target line is as follows:
First, the data similarity index between station $s_1$ of the target line and each of the m stations of the preferred candidate line is calculated, giving m data similarity indexes for station $s_1$; the maximum of these is taken as $E_1$, the maximum data similarity index of station $s_1$ with respect to the preferred candidate line. By analogy, the set of maximum data similarity indexes of the n stations of the target line with respect to the preferred candidate line is obtained and denoted $\{E_1, E_2, \ldots, E_n\}$. Finally, $E_1, E_2, \ldots, E_n$ are summed to obtain the total data similarity value of the preferred candidate line and the target line.
In step S6 of the present embodiment, the calculation formula of the similarity is as follows:
In the formula, $V(S, b)$ is the similarity between the preferred candidate line and the target line; $E_i \in \{E_1, E_2, \ldots, E_n\}$, where $E_i$ is the maximum data similarity index of station $s_i$ of the target line with respect to the preferred candidate line, and $\sum_{i=1}^{n} E_i$ is the total data similarity value of the preferred candidate line and the target line. The value range of the similarity V is [0, 1], and the larger the value, the more similar the preferred candidate line and the target line.
In step S7 of the present embodiment, the flat-fare (one-ticket) ridership data set of the target line is used as the data source of the target sample, and the complete passenger ridership data set of the optimal candidate line is used as the data source of the learning sample.
In step S8 of the present embodiment, a cycle generative adversarial network model is used as the generator. The travel data matrix of the learning sample is input to the generator, which iteratively generates simulation data; the generated simulation data is passed to a decision device for judgment, and if the similarity of the simulation data is not lower than a set threshold, the generated simulation data is considered valid and is output as the passenger travel data of the target sample.
In the decision device, the similarity between the simulation data and the real travel data of the learning sample is calculated, then the similarity is compared with a set threshold, and if the similarity of the simulation data is not lower than the set threshold, the generated simulation data is considered to be valid and output. The set threshold is the similarity between the optimal candidate route obtained in step S6 and the target route.
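The generate-then-judge loop of step S8 can be sketched as follows. The generator and the similarity measure are passed in as callables because the text does not fix their implementation; the names `generate_valid` and `max_iters`, the iteration cap and the stub functions in the example are all assumptions made for this sketch, and no actual network is trained here.

```python
def generate_valid(generator, similarity, learning_data, threshold, max_iters=100):
    """Repeatedly generate simulation data until the decision device accepts it.

    generator  -- callable producing simulation data from the learning sample
    similarity -- callable scoring simulation data against the real learning data
    threshold  -- set threshold (the similarity of the optimal candidate line
                  and the target line obtained in step S6)
    Returns the first accepted simulation data, or None if none is accepted.
    """
    for _ in range(max_iters):
        simulated = generator(learning_data)
        # Decision device: accept only if similarity is not lower than the threshold.
        if similarity(simulated, learning_data) >= threshold:
            return simulated
    return None


# Stub generator that improves on its second attempt, and a stub similarity.
attempts = iter([[0, 0], [1, 2]])
stub_generator = lambda data: next(attempts)
stub_similarity = lambda a, b: 1.0 if a == b else 0.0
accepted = generate_valid(stub_generator, stub_similarity, [1, 2], threshold=0.9)
```

The iteration cap is a practical safeguard added in the sketch so that a generator that never reaches the threshold does not loop forever.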
The method can generate passenger travel data of the target line from the spatial information of the bus stations and the real data of the optimal candidate line. Compared with the traditional vehicle-following method and the traditional alighting-station inference method, it generates bus passenger travel data of a target line under the flat-fare (one-ticket) system conveniently, quickly and at low cost, based on the historical real data of the optimal candidate line. Only the spatial information of the stations is required, so the method is widely applicable; the obtained data has high accuracy, and the method can be applied in fields such as bus analysis.
Example 2:
A system applying the single-line bus passenger travel data generation method of Example 1 is provided, the system comprising:
the data acquisition module is used for randomly selecting G bus lines as candidate lines and acquiring historical bus taking data and station space information of all the candidate lines;
the space vector module is used for determining space vectors of all stations in the G candidate lines according to the station space information;
the spatial similarity total value calculation module is used for calculating spatial similarity indexes of all stations in the target line and all stations in any candidate line according to the spatial vectors of the stations to obtain a spatial similarity total value of the candidate line and the target line, and then each candidate line obtains a spatial similarity total value with the target line according to the method;
the preferred candidate line screening module is used for sorting the spatial similarity total values of all candidate lines and the target line from large to small, and selecting the candidate lines corresponding to the first Q spatial similarity total values as preferred candidate lines;
the data similarity total value calculation module is used for calculating data similarity indexes of all stations in the target line and all stations in any preferred candidate line to obtain data similarity total values of the preferred candidate line and the target line, and then each preferred candidate line obtains a data similarity total value with the target line according to the mode;
the optimal candidate line screening module is used for calculating the similarity between each preferred candidate line and the target line according to the obtained data similarity total value, and selecting the preferred candidate line corresponding to the maximum similarity as the optimal candidate line;
the generation module is used for taking the optimal candidate line as a learning sample and the target line as a target sample, and generating passenger travel data of the target sample from the line travel data of the learning sample through a cycle generative adversarial network algorithm;
the data acquisition module, the space vector module, the space similarity total value calculation module, the preferred candidate line screening module, the data similarity total value calculation module, the optimal candidate line screening module and the generation module are in mutual communication in a wireless or wired mode.
Example 3:
This embodiment is similar to Example 2, except that the data acquisition module, the space vector module, the spatial similarity total value calculation module, the preferred candidate line screening module, the data similarity total value calculation module, the optimal candidate line screening module and the generation module of the system in Example 2 are integrated into one processor, and the simulation data finally output by the processor is displayed on a display screen so that an operator can observe the output result directly.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.
Claims (10)
1. A single-line bus passenger travel data generation method is characterized by comprising the following steps:
S1, randomly selecting G bus lines as candidate lines, and acquiring historical ridership data and station spatial information of all the candidate lines;
S2, determining the space vectors of all stations in the G candidate lines according to the station spatial information;
S3, calculating spatial similarity indexes between all stations in the target line and all stations in any candidate line according to the space vectors of the stations, to obtain the spatial similarity total value of that candidate line and the target line;
S4, obtaining, for each candidate line, a spatial similarity total value with the target line in the manner of step S3, sorting the spatial similarity total values of all the candidate lines in descending order, and selecting the candidate lines corresponding to the first Q spatial similarity total values as preferred candidate lines;
S5, calculating data similarity indexes between all stations in the target line and all stations in any preferred candidate line, to obtain the data similarity total value of that preferred candidate line and the target line;
S6, obtaining, for each preferred candidate line, a data similarity total value with the target line in the manner of step S5, calculating the similarity between each preferred candidate line and the target line according to the obtained data similarity total value, and selecting the preferred candidate line corresponding to the maximum similarity as the optimal candidate line;
S7, taking the optimal candidate line and the target line as a learning sample and a target sample, respectively;
and S8, generating passenger travel data of the target sample from the line travel data of the learning sample through a cycle-consistent generative adversarial network (CycleGAN) algorithm.
2. The single-line bus passenger travel data generation method according to claim 1, wherein in step S2, the station spatial information includes longitude and latitude information and interest point classification information, and the process of determining the station spatial vector of the candidate line according to the station spatial information is as follows:
determining the longitude and latitude of each station and the number of points of interest within its tolerance radius: the longitude and latitude of each station are recorded as a 2-dimensional vector l, the tolerance radius of the station is denoted r, and the numbers of points of interest in each of the k categories within the radius r of the station are counted and recorded as a k-dimensional vector h, so that the space vector of each station is represented by the (2+k)-dimensional vector (l, h).
3. The single-line bus passenger travel data generation method according to claim 2, wherein in step S3, the calculation formula of the spatial similarity index is as follows:
in the formula, $s_i$ is the i-th station in the target line, $b_j$ is the j-th station in the candidate line, $P(s_i, b_j)$ is the spatial similarity index of station $s_i$ and station $b_j$, $l_i$ and $h_i$ are the longitude-latitude and point-of-interest information of station $s_i$, $l_j$ and $h_j$ are the longitude-latitude and point-of-interest information of station $b_j$, and $\alpha$ and $\beta$ are the influence coefficients of the longitude-latitude and point-of-interest information, respectively, on the spatial similarity index; the value range of the spatial similarity index $P$ is $[-1, 1]$, and the larger the value, the higher the spatial similarity of the two stations.
4. The single-line bus passenger travel data generation method according to claim 3, wherein in step S3, the specific process of obtaining the total spatial similarity value between the candidate line and the target line is as follows:
the target line has n stations, i.e. $s_i \in \{s_1, s_2, \ldots, s_n\}$, $i = 1, 2, \ldots, n$, and the candidate line has m stations, i.e. $b_j \in \{b_1, b_2, \ldots, b_m\}$, $j = 1, 2, \ldots, m$;
first, the spatial similarity index between station $s_1$ in the target line and each of the m stations in the candidate line is calculated, giving m spatial similarity indexes for station $s_1$; the maximum of these is taken as $P_1$, the maximum spatial similarity index of station $s_1$ with respect to the candidate line; proceeding in the same way for every station yields the set of maximum spatial similarity indexes of the n stations in the target line with respect to the candidate line, denoted $\{P_1, P_2, \ldots, P_n\}$; finally, $P_1, P_2, \ldots, P_n$ are accumulated to obtain the spatial similarity total value of the candidate line and the target line.
5. The single-line bus passenger travel data generation method according to claim 4, wherein in step S5, the calculation formula of the data similarity index is as follows:
in the formula, $E(s_i, b_j)$ is the data similarity index of station $s_i$ and station $b_j$; $R_s$ and $R_b$ are the total numbers of passengers of the target line and the preferred candidate line, respectively; $T_u$ and $T_d$ are the numbers of passengers boarding and alighting at station $s_i$ of the target line; $C_u$ and $C_d$ are the numbers of passengers boarding and alighting at station $b_j$ of the preferred candidate line; the value range of the data similarity index is $[0, 1]$, and the larger the value, the lower the data similarity of the two stations.
6. The single-line bus passenger travel data generation method according to claim 5, wherein in step S5, the specific process of obtaining the data similarity total value of the preferred candidate route and the target route is as follows:
first, the data similarity index between station $s_1$ in the target line and each of the m stations in the preferred candidate line is calculated, giving m data similarity indexes for station $s_1$; the maximum of these is taken as $E_1$, the maximum data similarity index of station $s_1$ with respect to the preferred candidate line; proceeding in the same way for every station yields the set of maximum data similarity indexes of the n stations in the target line with respect to the preferred candidate line, denoted $\{E_1, E_2, \ldots, E_n\}$; finally, $E_1, E_2, \ldots, E_n$ are accumulated to obtain the data similarity total value of the preferred candidate line and the target line.
7. The single-line bus passenger travel data generation method according to claim 6, wherein in step S6, the calculation formula of the similarity is as follows:
where $V(S, b)$ is the similarity between the preferred candidate line and the target line; $E_i \in \{E_1, E_2, \ldots, E_n\}$, where $E_i$ is the maximum data similarity index of station $s_i$ in the target line with respect to the preferred candidate line, so that the sum of $E_1, E_2, \ldots, E_n$ is the data similarity total value of the preferred candidate line and the target line; the value range of the similarity $V$ is $[0, 1]$, and the larger the value, the more similar the preferred candidate line and the target line.
8. The single-line bus passenger travel data generation method according to claim 1, wherein in step S7, the ticketing data set of the target line is used as the data source of the target sample, and the complete passenger data set of the optimal candidate line is used as the data source of the learning sample.
9. The single-line bus passenger travel data generation method according to claim 1, wherein in step S8, a cycle-consistent generative adversarial network (CycleGAN) model is used as the generator, the travel data matrix of the learning sample is input into the generator, simulation data is generated iteratively in the generator, and the generated simulation data is passed to a decision device for evaluation; if the similarity of the simulation data is not lower than a set threshold, the generated simulation data is considered valid and is output as the passenger travel data of the target sample.
10. The single-line bus passenger travel data generation method according to claim 9, wherein in the decision device, the similarity between the simulation data and the real travel data of the learning sample is calculated first, and then the similarity is compared with a set threshold, and if the similarity of the simulation data is not lower than the set threshold, the generated simulation data is considered to be valid and output;
the set threshold value is the similarity between the optimal candidate route obtained in step S6 and the target route.
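The station space vector of claim 2, the per-station maximum followed by accumulation in claims 4 and 6, and the threshold decision of claims 9 and 10 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: all function names are hypothetical, and the similarity index is passed in as a callable rather than hard-coded. The `cosine` function used in the demonstration is a stand-in chosen for this example only and is not the spatial similarity index of claim 3.

```python
import math
from typing import Callable, Sequence, Tuple

Vector = Tuple[float, ...]


def space_vector(lon: float, lat: float, poi_counts: Sequence[int]) -> Vector:
    """Build the (2+k)-dimensional vector (l, h) from location and k POI counts."""
    return (lon, lat, *poi_counts)


def total_similarity(
    target: Sequence[Vector],
    candidate: Sequence[Vector],
    index: Callable[[Vector, Vector], float],
) -> float:
    """For each target station keep the maximum index over all candidate
    stations, then accumulate the n maxima into a single total value."""
    return sum(max(index(s, b) for b in candidate) for s in target)


def is_valid(simulated_similarity: float, threshold: float) -> bool:
    """Accept simulation data whose similarity is not lower than the threshold."""
    return simulated_similarity >= threshold


def cosine(a: Vector, b: Vector) -> float:
    """Placeholder index (assumption): cosine similarity, range [-1, 1]."""
    norm = math.hypot(*a) * math.hypot(*b)
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


# Hypothetical station: longitude, latitude, and counts for k = 3 POI categories.
vec = space_vector(113.3, 23.1, [4, 0, 2])

# Toy 2-D vectors keep the arithmetic easy to follow.
target_line = [(1.0, 0.0), (0.0, 1.0)]
candidate_line = [(1.0, 0.0), (1.0, 1.0)]
total = total_similarity(target_line, candidate_line, cosine)
```

In this toy case the first target station matches a candidate station exactly (maximum 1) and the second reaches at most 1/√2, so the total is 1 + 1/√2. Per claim 10, the `threshold` passed to `is_valid` would be the similarity between the optimal candidate line and the target line obtained in step S6.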
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210081034.2A CN114444795A (en) | 2022-01-24 | 2022-01-24 | Single-line bus passenger travel data generation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114444795A (en) | 2022-05-06 |
Family
ID=81370221
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210081034.2A Pending CN114444795A (en) | 2022-01-24 | 2022-01-24 | Single-line bus passenger travel data generation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114444795A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116663338A (en) * | 2023-08-02 | 2023-08-29 | 中国电子信息产业集团有限公司第六研究所 | Simulation analysis method, device, equipment and medium based on similar calculation example |
CN116663338B (en) * | 2023-08-02 | 2023-10-20 | 中国电子信息产业集团有限公司第六研究所 | Simulation analysis method, device, equipment and medium based on similar calculation example |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Qi et al. | Analysis and prediction of regional mobility patterns of bus travellers using smart card data and points of interest data | |
CN108446470B (en) | Medical facility accessibility analysis method based on vehicle trajectory data and population distribution | |
CN108009972B (en) | Multi-mode travel O-D demand estimation method based on multi-source data check | |
CN112016605B (en) | Target detection method based on corner alignment and boundary matching of bounding box | |
WO2021013190A1 (en) | Meteorological parameter-based high-speed train positioning method and system in navigation blind zone | |
CN116628455B (en) | Urban traffic carbon emission monitoring and decision support method and system | |
CN111931998B (en) | Individual travel mode prediction method and system based on mobile positioning data | |
CN113380043B (en) | Bus arrival time prediction method based on deep neural network calculation | |
CN115049534A (en) | Knowledge distillation-based real-time semantic segmentation method for fisheye image | |
CN112258029B (en) | Demand prediction method for sharing bicycles around subway station | |
CN106846214A (en) | Method of the analysis transport hub accessibility to region public transportation mode competitive influence | |
CN113298314A (en) | Rail transit passenger flow prediction method considering dynamic space-time correlation | |
CN112884235A (en) | Travel recommendation method, and training method and device of travel recommendation model | |
CN116523093A (en) | Grid demand sensing system and method of energy system based on random source load prediction | |
Zhou et al. | Support vector machine and back propagation neutral network approaches for trip mode prediction using mobile phone data | |
CN114444795A (en) | Single-line bus passenger travel data generation method | |
CN115995149A (en) | Multi-source data-based parking supply and demand characteristic dynamic evaluation method and system | |
CN117333669A (en) | Remote sensing image semantic segmentation method, system and equipment based on useful information guidance | |
Xu et al. | A taxi dispatch system based on prediction of demand and destination | |
CN112101132B (en) | Traffic condition prediction method based on graph embedding model and metric learning | |
CN110704789B (en) | Population dynamic measurement and calculation method and system based on 'urban superconcephalon' computing platform | |
CN116128160B (en) | Method, system, equipment and medium for predicting peak passenger flow of railway station | |
CN117172461A (en) | Automatic driving bus dispatching system and bus dispatching method based on passenger flow prediction | |
Chen et al. | Customized bus line design model based on multi-source data | |
CN114757447B (en) | Multi-model mixed passenger transport hub station passenger flow prediction method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||