CN114444795A - Single-line bus passenger travel data generation method - Google Patents


Info

Publication number
CN114444795A
Authority
CN
China
Prior art keywords
line
similarity
data
candidate
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210081034.2A
Other languages
Chinese (zh)
Inventor
李军
区静怡
骆刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN202210081034.2A
Publication of CN114444795A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q10/047: Optimisation of routes or paths, e.g. travelling salesman problem
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40: Business processes related to the transportation industry

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Evolutionary Computation (AREA)
  • Development Economics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a single-line bus passenger travel data generation method. G candidate lines are randomly obtained; space vectors of the stations of all candidate lines are obtained from the station spatial information; the total spatial similarity value between each candidate line and a target line is obtained from the space vectors; the candidate lines corresponding to the first Q total spatial similarity values are selected as preferred candidate lines; the similarity between each preferred candidate line and the target line is obtained by calculating the data similarity index and the similarity; the preferred candidate line with the highest similarity is selected as the optimal candidate line; and finally, with the optimal candidate line as the learning sample, passenger travel data of the target sample are generated by a cycle generative adversarial network algorithm. The method can generate a large amount of bus passenger travel data for the target line conveniently, quickly, and at low cost on the basis of real data; the generated data conform to real-world patterns, and the method can be applied to fields such as bus analysis.

Description

Single-line bus passenger travel data generation method
Technical Field
The invention relates to the field of transportation engineering, belongs to the category of urban public transport, and particularly relates to a single-line bus passenger trip data generation method.
Background
The bus passenger travel data refers to a series of data records generated when passengers travel by using buses, and the data records comprise passenger boarding station information, passenger alighting station information, corresponding boarding and alighting time, line numbers and the like. Passenger flow analysis and passenger classification of the urban public transport system can be carried out through public transport passenger travel data, and public transport travel information is widely applied to urban public transport operation scheduling.
With the rapid development of the mobile internet, bus fare payment has seen a series of innovations. Urban buses in China use two fare systems: a segmented fare system and a flat-fare (one-ticket) system. The segmented system computes the fare by recording both the boarding and the alighting information of each passenger, whereas the one-ticket system records only the time and location of boarding and not those of alighting. Urban buses operating under the one-ticket system therefore lack alighting information, and their passenger data are incomplete. At present, the missing data are commonly completed by the ride-along (vehicle-following) survey method or by the alighting-station inference method. However, the inference method relies on assumptions about the data and has low accuracy, while the ride-along method has a complicated and tedious workflow, relies on manual tabulation, and is inefficient and costly.
Disclosure of Invention
The invention provides a single-line bus passenger travel data generation method that, based on an existing complete travel data set of a similar single bus line, can generate passenger travel data for a single target bus line under the one-ticket system conveniently, quickly, and at low cost, and that provides the necessary data for application scenarios such as bus analysis.
To achieve this purpose, the technical scheme of the invention is as follows:
A single-line bus passenger travel data generation method comprises the following steps:
S1. randomly selecting G bus lines as candidate lines, and acquiring the historical ridership data and station spatial information of all candidate lines;
S2. determining the space vectors of all stations in the G candidate lines according to the station spatial information;
S3. calculating, according to the station space vectors, the spatial similarity indexes between all stations in the target line and all stations in any candidate line, to obtain the total spatial similarity value of that candidate line and the target line;
S4. obtaining, for each candidate line, its total spatial similarity value with the target line in the manner of step S3, sorting the total spatial similarity values of all candidate lines in descending order, and selecting the candidate lines corresponding to the first Q values as preferred candidate lines;
S5. calculating the data similarity indexes between all stations in the target line and all stations in any preferred candidate line, to obtain the data similarity total value of that preferred candidate line and the target line;
S6. obtaining, for each preferred candidate line, its data similarity total value with the target line in the manner of step S5, calculating the similarity between each preferred candidate line and the target line from the data similarity total value, and selecting the preferred candidate line with the maximum similarity as the optimal candidate line;
S7. taking the optimal candidate line as the learning sample and the target line as the target sample;
S8. generating passenger travel data of the target sample from the line travel data of the learning sample through a cycle generative adversarial network algorithm.
The method of the invention first randomly obtains G bus lines as candidate lines and obtains the space vectors of all stations of all candidate lines from the station spatial information. It then obtains, from the space vectors, the total spatial similarity value between each candidate line and a target line selected in advance, and selects the candidate lines corresponding to the first Q total spatial similarity values (Q < G, both positive integers) as preferred candidate lines. Next, the data similarity index and the similarity are calculated for the Q preferred candidate lines to obtain the similarity between each preferred candidate line and the target line, and the preferred candidate line with the highest similarity is selected as the optimal candidate line. Finally, with the optimal candidate line as the learning sample and the target line as the target sample, the passenger travel data of the target sample are generated from the travel data of the learning sample through a cycle generative adversarial network algorithm.
Further, in step S2, the station spatial information includes longitude and latitude information and point-of-interest (POI) classification information, and the process of determining the station space vectors of a candidate line according to the station spatial information is as follows:
The longitude and latitude of each station and the number of points of interest within its tolerance radius are determined. The longitude and latitude of each station are recorded as a 2-dimensional vector l; the tolerance radius of the station is denoted r; the numbers of points of interest of each of the k POI categories within the radius r are counted and recorded as a k-dimensional vector h; the space vector of each station is then represented by the (2 + k)-dimensional vector (l, h).
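The construction of the station space vector (l, h) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `haversine_m` and `station_space_vector`, the use of the haversine great-circle distance to test whether a point of interest lies within the radius r, and the sample coordinates and categories are all hypothetical choices that the source text does not specify.

```python
import math
from typing import List, Sequence, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres

# A point of interest as (longitude, latitude, category); hypothetical layout.
Poi = Tuple[float, float, str]


def haversine_m(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def station_space_vector(
    station: Tuple[float, float],
    pois: Sequence[Poi],
    categories: Sequence[str],
    radius_m: float,
) -> Tuple[List[float], List[int]]:
    """Return (l, h): l is the 2-d (lon, lat) vector, h the k-d POI count vector."""
    lon, lat = station
    index = {category: i for i, category in enumerate(categories)}
    counts = [0] * len(categories)
    for p_lon, p_lat, category in pois:
        # Count only POIs of a known category that fall within the tolerance radius.
        if category in index and haversine_m(lon, lat, p_lon, p_lat) <= radius_m:
            counts[index[category]] += 1
    return [lon, lat], counts


# Hypothetical example: one station, k = 2 categories, r = 500 m.
example_pois = [
    (113.0, 23.0, "school"),   # at the station itself
    (113.0, 23.001, "mall"),   # roughly 111 m away
    (113.0, 24.0, "mall"),     # roughly 111 km away, outside the radius
    (113.0, 23.0, "park"),     # category not among the k categories, ignored
]
l_vec, h_vec = station_space_vector((113.0, 23.0), example_pois, ["school", "mall"], 500.0)
```

Keeping l and h as two separate parts of the vector mirrors the description, in which the longitude/latitude part and the point-of-interest part are weighted separately when the spatial similarity index is computed.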
Further, in step S3, the calculation formula of the spatial similarity index is as follows:
[Formula image BDA0003485897000000021: spatial similarity index $P(s_i, b_j)$; not reproduced in this text]
In the formula, $s_i$ is the i-th station of the target line; $b_j$ is the j-th station of the candidate line; $P(s_i, b_j)$ is the spatial similarity index of stations $s_i$ and $b_j$; $l_i$ and $h_i$ are the longitude/latitude and point-of-interest information of station $s_i$; $l_j$ and $h_j$ are the longitude/latitude and point-of-interest information of station $b_j$; and $\alpha$ and $\beta$ are the influence coefficients of the longitude/latitude and of the point-of-interest information, respectively, on the spatial similarity index. The spatial similarity index P takes values in [-1, 1]; the larger the value, the higher the spatial similarity of the two stations.
Further, in step S3, the specific process of obtaining the total spatial similarity value of the candidate line and the target line is as follows:
Let the target line have n stations, i.e. $s_i \in \{s_1, s_2, \ldots, s_n\}$, $i = 1, 2, \ldots, n$, and let the candidate line have m stations, i.e. $b_j \in \{b_1, b_2, \ldots, b_m\}$, $j = 1, 2, \ldots, m$;
First, the spatial similarity index is calculated between station $s_1$ of the target line and each of the m stations of the candidate line, giving m spatial similarity indexes for station $s_1$; the maximum of these is taken as $P_1$, the maximum spatial similarity index of station $s_1$ with respect to the candidate line. By analogy, the set of maximum spatial similarity indexes of the n stations of the target line with respect to the candidate line is obtained and denoted $\{P_1, P_2, \ldots, P_n\}$. Finally, $P_1, P_2, \ldots, P_n$ are summed to obtain the total spatial similarity value of the candidate line and the target line.
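The max-then-sum aggregation described above can be sketched as follows. Because the formula of the spatial similarity index is given only as an image in the source, the sketch takes the station-to-station index as a caller-supplied function; the name `total_similarity` and the toy lookup table of index values are hypothetical and serve only to make the example runnable.

```python
from typing import Callable, Dict, Sequence, Tuple, TypeVar

S = TypeVar("S")  # type of a target-line station
B = TypeVar("B")  # type of a candidate-line station


def total_similarity(
    target: Sequence[S],
    candidate: Sequence[B],
    index: Callable[[S, B], float],
) -> float:
    """Sum over target stations of the maximum index against any candidate station."""
    if not candidate:
        raise ValueError("candidate line must have at least one station")
    # For each s_i take P_i = max_j index(s_i, b_j), then accumulate P_1..P_n.
    return sum(max(index(s, b) for b in candidate) for s in target)


# Toy index values (assumed, not from the source), chosen within [-1, 1].
toy_table: Dict[Tuple[str, str], float] = {
    ("s1", "b1"): 0.25, ("s1", "b2"): 0.75, ("s1", "b3"): -0.5,
    ("s2", "b1"): 0.5, ("s2", "b2"): 0.25, ("s2", "b3"): 0.125,
}
toy_total = total_similarity(
    ["s1", "s2"], ["b1", "b2", "b3"], lambda s, b: toy_table[(s, b)]
)
```

Here $P_1 = 0.75$ and $P_2 = 0.5$, so the total is 1.25. The description uses the same maximum-per-station and accumulation pattern for the data similarity index in step S5, so the same helper could be reused there with a different index function.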
Further, in step S5, the data similarity index is calculated as follows:
[Formula image BDA0003485897000000031: data similarity index $E(s_i, b_j)$; not reproduced in this text]
In the formula, $E(s_i, b_j)$ is the data similarity index of stations $s_i$ and $b_j$; $R_s$ and $R_b$ are the total numbers of passengers of the target line and of the preferred candidate line, respectively; $T_u$ and $T_d$ are the numbers of passengers boarding and alighting, respectively, at station $s_i$ of the target line; and $C_u$ and $C_d$ are the numbers of passengers boarding and alighting, respectively, at station $b_j$ of the preferred candidate line. The data similarity index takes values in [0, 1]; the larger the value, the lower the data similarity of the two stations.
Further, in step S5, the specific process of obtaining the data similarity total value of the preferred candidate line and the target line is as follows:
First, the data similarity index is calculated between station $s_1$ of the target line and each of the m stations of the preferred candidate line, giving m data similarity indexes for station $s_1$; the maximum of these is taken as $E_1$, the maximum data similarity index of station $s_1$ with respect to the preferred candidate line. By analogy, the set of maximum data similarity indexes of the n stations of the target line with respect to the preferred candidate line is obtained and denoted $\{E_1, E_2, \ldots, E_n\}$. Finally, $E_1, E_2, \ldots, E_n$ are summed to obtain the data similarity total value of the preferred candidate line and the target line.
Further, in step S6, the calculation formula of the similarity is as follows:
[Formula image BDA0003485897000000041: similarity $V(S, b)$; not reproduced in this text]
In the formula, $V(S, b)$ is the similarity between the preferred candidate line and the target line; $E_i \in \{E_1, E_2, \ldots, E_n\}$, where $E_i$ is the maximum data similarity index of station $s_i$ of the target line with respect to the preferred candidate line; the expression shown in formula image BDA0003485897000000042 (not reproduced in this text) is the data similarity total value of the preferred candidate line and the target line, i.e. the accumulated sum of $E_1, E_2, \ldots, E_n$. The similarity V takes values in [0, 1]; the larger the value, the more similar the preferred candidate line and the target line.
Further, in step S7, the one-ticket ridership data set of the target line is used as the data source of the target sample, and the complete passenger ridership data set of the optimal candidate line is used as the data source of the learning sample.
Further, in step S8, a cycle generative adversarial network model is used as the generator. The travel data matrix of the learning sample is input to the generator, which iteratively generates simulated data; the generated simulated data are passed to a decision device, and if their similarity is not lower than a set threshold, they are considered valid and output as the passenger travel data of the target sample.
Further, in the decision device, the similarity between the simulated data and the real travel data of the learning sample is first calculated and then compared with the set threshold; if the similarity is not lower than the threshold, the generated simulated data are considered valid and output;
the set threshold is the similarity between the optimal candidate line obtained in step S6 and the target line.
The beneficial effects of the invention are as follows:
The method generates passenger travel data for the target line from the spatial information of the bus stations and the real data of the optimal candidate line. Compared with the traditional ride-along (vehicle-following) survey and alighting-station inference methods, it can generate bus passenger travel data for a target line under the one-ticket system conveniently, quickly, and at low cost from the historical real data of the optimal candidate line, requiring only the spatial information of the stations. The method is widely applicable, the data obtained are highly accurate, and the method can be applied to fields such as bus analysis.
Drawings
Fig. 1 is a flow chart of a single-line bus passenger travel data generation method of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent; for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted. The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent.
Example 1:
As shown in Fig. 1, a single-line bus passenger travel data generation method comprises the following steps:
S1. randomly selecting G bus lines as candidate lines, and acquiring the historical ridership data and station spatial information of all candidate lines;
S2. determining the space vectors of all stations in the G candidate lines according to the station spatial information;
S3. calculating, according to the station space vectors, the spatial similarity indexes between all stations in the target line and all stations in any candidate line, to obtain the total spatial similarity value of that candidate line and the target line;
S4. obtaining, for each candidate line, its total spatial similarity value with the target line in the manner of step S3, sorting the total spatial similarity values of all candidate lines in descending order, and selecting the candidate lines corresponding to the first Q values as preferred candidate lines;
S5. calculating the data similarity indexes between all stations in the target line and all stations in any preferred candidate line, to obtain the data similarity total value of that preferred candidate line and the target line;
S6. obtaining, for each preferred candidate line, its data similarity total value with the target line in the manner of step S5, calculating the similarity between each preferred candidate line and the target line from the data similarity total value, and selecting the preferred candidate line with the maximum similarity as the optimal candidate line;
S7. taking the optimal candidate line as the learning sample and the target line as the target sample;
S8. generating passenger travel data of the target sample from the line travel data of the learning sample through a cycle generative adversarial network algorithm.
The method of the invention first randomly obtains G bus lines as candidate lines and obtains the space vectors of all stations of all candidate lines from the station spatial information. It then obtains, from the space vectors, the total spatial similarity value between each candidate line and a target line selected in advance, and selects the candidate lines corresponding to the first Q total spatial similarity values (Q < G, both positive integers) as preferred candidate lines. Next, the data similarity index and the similarity are calculated for the Q preferred candidate lines to obtain the similarity between each preferred candidate line and the target line, and the preferred candidate line with the highest similarity is selected as the optimal candidate line. Finally, with the optimal candidate line as the learning sample and the target line as the target sample, the passenger travel data of the target sample are generated from the travel data of the learning sample through a cycle generative adversarial network algorithm.
In the present embodiment, the values of G and Q can be set according to actual conditions. For example, 100 public transportation lines may be randomly selected as candidate lines, that is, G is 100, and after the spatial similarity total values of all the candidate lines and the target line selected in advance are obtained, the candidate lines corresponding to the top 10 spatial similarity total values are selected as preferred candidate lines, that is, Q is 10.
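The screening of preferred candidate lines (sorting in descending order and keeping the first Q) can be sketched as below. The function name `select_preferred`, the line identifiers, and the total values are hypothetical; a small G = 5, Q = 2 case is used instead of G = 100, Q = 10 purely to keep the example short.

```python
from typing import Dict, List


def select_preferred(totals: Dict[str, float], q: int) -> List[str]:
    """Return the q candidate lines with the largest total spatial similarity values."""
    # The description requires Q < G, with both being positive integers.
    if not 0 < q < len(totals):
        raise ValueError("q must satisfy 0 < q < number of candidate lines")
    ranked = sorted(totals, key=lambda name: totals[name], reverse=True)
    return ranked[:q]


# Hypothetical total spatial similarity values for G = 5 candidate lines.
candidate_totals = {
    "line_A": 3.5,
    "line_B": 7.25,
    "line_C": 1.0,
    "line_D": 6.0,
    "line_E": 4.0,
}
preferred = select_preferred(candidate_totals, q=2)
```

With these values the two preferred candidate lines are `line_B` and `line_D`, in that order. Python's `sorted` is stable, so candidate lines with equal totals keep their original relative order; the source does not say how ties should be broken.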
In step S2 of this embodiment, the station spatial information includes longitude and latitude information and point-of-interest (POI) classification information, and the process of determining the station space vectors of a candidate line according to the station spatial information is as follows:
The longitude and latitude of each station and the number of points of interest within its tolerance radius are determined. The longitude and latitude of each station are recorded as a 2-dimensional vector l; the tolerance radius of the station is denoted r; the numbers of points of interest of each of the k POI categories within the radius r are counted and recorded as a k-dimensional vector h; the space vector of each station is then represented by the (2 + k)-dimensional vector (l, h).
In step S3 of the present embodiment, the calculation formula of the spatial similarity index is as follows:
[Formula image BDA0003485897000000061: spatial similarity index $P(s_i, b_j)$; not reproduced in this text]
In the formula, $s_i$ is the i-th station of the target line; $b_j$ is the j-th station of the candidate line; $P(s_i, b_j)$ is the spatial similarity index of stations $s_i$ and $b_j$; $l_i$ and $h_i$ are the longitude/latitude and point-of-interest information of station $s_i$; $l_j$ and $h_j$ are the longitude/latitude and point-of-interest information of station $b_j$; and $\alpha$ and $\beta$ are the influence coefficients of the longitude/latitude and of the point-of-interest information, respectively, on the spatial similarity index. The spatial similarity index P takes values in [-1, 1]; the larger the value, the higher the spatial similarity of the two stations.
According to the spatial similarity index, the specific process of obtaining the total spatial similarity value of the candidate line and the target line is as follows:
Let the target line have n stations, i.e. $s_i \in \{s_1, s_2, \ldots, s_n\}$, $i = 1, 2, \ldots, n$, and let the candidate line have m stations, i.e. $b_j \in \{b_1, b_2, \ldots, b_m\}$, $j = 1, 2, \ldots, m$;
First, the spatial similarity index is calculated between station $s_1$ of the target line and each of the m stations of the candidate line, giving m spatial similarity indexes for station $s_1$; the maximum of these is taken as $P_1$, the maximum spatial similarity index of station $s_1$ with respect to the candidate line. By analogy, the set of maximum spatial similarity indexes of the n stations of the target line with respect to the candidate line is obtained and denoted $\{P_1, P_2, \ldots, P_n\}$. Finally, $P_1, P_2, \ldots, P_n$ are summed to obtain the total spatial similarity value of the candidate line and the target line.
In step S5 of the present embodiment, the calculation formula of the data similarity index is as follows:
[Formula image BDA0003485897000000071: data similarity index $E(s_i, b_j)$; not reproduced in this text]
In the formula, $E(s_i, b_j)$ is the data similarity index of stations $s_i$ and $b_j$; $R_s$ and $R_b$ are the total numbers of passengers of the target line and of the preferred candidate line, respectively; $T_u$ and $T_d$ are the numbers of passengers boarding and alighting, respectively, at station $s_i$ of the target line; and $C_u$ and $C_d$ are the numbers of passengers boarding and alighting, respectively, at station $b_j$ of the preferred candidate line. The data similarity index takes values in [0, 1]; the larger the value, the lower the data similarity of the two stations.
According to the data similarity index, the specific process of obtaining the data similarity total value of the preferred candidate line and the target line is as follows:
First, the data similarity index is calculated between station $s_1$ of the target line and each of the m stations of the preferred candidate line, giving m data similarity indexes for station $s_1$; the maximum of these is taken as $E_1$, the maximum data similarity index of station $s_1$ with respect to the preferred candidate line. By analogy, the set of maximum data similarity indexes of the n stations of the target line with respect to the preferred candidate line is obtained and denoted $\{E_1, E_2, \ldots, E_n\}$. Finally, $E_1, E_2, \ldots, E_n$ are summed to obtain the data similarity total value of the preferred candidate line and the target line.
In step S6 of the present embodiment, the calculation formula of the similarity is as follows:
[Formula image BDA0003485897000000072: similarity $V(S, b)$; not reproduced in this text]
In the formula, $V(S, b)$ is the similarity between the preferred candidate line and the target line; $E_i \in \{E_1, E_2, \ldots, E_n\}$, where $E_i$ is the maximum data similarity index of station $s_i$ of the target line with respect to the preferred candidate line; the expression shown in formula image BDA0003485897000000073 (not reproduced in this text) is the data similarity total value of the preferred candidate line and the target line, i.e. the accumulated sum of $E_1, E_2, \ldots, E_n$. The similarity V takes values in [0, 1]; the larger the value, the more similar the preferred candidate line and the target line.
In step S7 of the present embodiment, the one-ticket ridership data set of the target line is used as the data source of the target sample, and the complete passenger ridership data set of the optimal candidate line is used as the data source of the learning sample.
In step S8 of the present embodiment, a cycle generative adversarial network model is used as the generator. The travel data matrix of the learning sample is input to the generator, which iteratively generates simulated data; the generated simulated data are passed to a decision device, and if their similarity is not lower than a set threshold, they are considered valid and output as the passenger travel data of the target sample.
In the decision device, the similarity between the simulated data and the real travel data of the learning sample is first calculated and then compared with the set threshold; if the similarity is not lower than the threshold, the generated simulated data are considered valid and output. The set threshold is the similarity between the optimal candidate line obtained in step S6 and the target line.
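The generate-then-decide loop of step S8 can be sketched as follows. This is only an outline of the acceptance logic under stated assumptions: the generator and the similarity measure are passed in as plain callables rather than an actual cycle generative adversarial network, and the name `generate_until_valid` and the `max_iters` cap are hypothetical additions that the source does not describe.

```python
from typing import Callable, Iterator, Optional, TypeVar

T = TypeVar("T")  # type of one batch of simulated travel data


def generate_until_valid(
    generate: Callable[[], T],
    similarity: Callable[[T], float],
    threshold: float,
    max_iters: int = 100,
) -> Optional[T]:
    """Return the first generated sample whose similarity is not below the threshold."""
    for _ in range(max_iters):
        sample = generate()
        # Decision device: accept when similarity >= threshold ("not lower than").
        if similarity(sample) >= threshold:
            return sample
    return None  # assumed fallback: no valid sample within max_iters


# Stub generator standing in for the trained model: each call yields a sample
# that is represented here simply by its own similarity score.
def make_stub_generator() -> Callable[[], float]:
    scores: Iterator[float] = iter([0.25, 0.5, 0.75])
    return lambda: next(scores)


accepted = generate_until_valid(make_stub_generator(), lambda x: x, threshold=0.5)
rejected = generate_until_valid(make_stub_generator(), lambda x: x, threshold=0.5, max_iters=1)
```

In the described method the threshold would be the similarity V between the optimal candidate line and the target line obtained in step S6, and `similarity` would compare the simulated data with the real travel data of the learning sample.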
The method generates passenger travel data for the target line from the spatial information of the bus stations and the real data of the optimal candidate line. Compared with the traditional ride-along (vehicle-following) survey and alighting-station inference methods, it can generate bus passenger travel data for a target line under the one-ticket system conveniently, quickly, and at low cost from the historical real data of the optimal candidate line, requiring only the spatial information of the stations. The method is widely applicable, the data obtained are highly accurate, and the method can be applied to fields such as bus analysis.
Example 2:
a system to which the method for generating single-line bus passenger travel data in embodiment 1 above is applied is provided, and the system includes:
the data acquisition module is used for randomly selecting G bus lines as candidate lines and acquiring historical bus taking data and station space information of all the candidate lines;
the space vector module is used for determining space vectors of all stations in the G candidate lines according to the station space information;
the spatial similarity total value calculation module is used for calculating spatial similarity indexes of all stations in the target line and all stations in any candidate line according to the spatial vectors of the stations to obtain a spatial similarity total value of the candidate line and the target line, and then each candidate line obtains a spatial similarity total value with the target line according to the method;
the preferred candidate line screening module is used for sorting the spatial similarity total values of all candidate lines and the target line from large to small, and selecting the candidate lines corresponding to the first Q spatial similarity total values as preferred candidate lines;
the data similarity total value calculation module is used for calculating data similarity indexes of all stations in the target line and all stations in any preferred candidate line to obtain data similarity total values of the preferred candidate line and the target line, and then each preferred candidate line obtains a data similarity total value with the target line according to the mode;
the optimal candidate line screening module is used for calculating the similarity between each preferred candidate line and the target line according to the obtained data similarity total value, and selecting the preferred candidate line corresponding to the maximum similarity value as the optimal candidate line;
the generation module is used for taking the optimal candidate line as the learning sample and the target line as the target sample, and generating passenger travel data of the target sample from the line travel data of the learning sample through a cycle generative adversarial network algorithm;
the data acquisition module, the space vector module, the space similarity total value calculation module, the preferred candidate line screening module, the data similarity total value calculation module, the optimal candidate line screening module and the generation module are in mutual communication in a wireless or wired mode.
Example 3:
The present embodiment is similar to embodiment 2, except that the data acquisition module, the space vector module, the spatial similarity total value calculation module, the preferred candidate line screening module, the data similarity total value calculation module, the optimal candidate line screening module, and the generation module of the system in embodiment 2 are integrated into one processor, and the simulated data finally output by the processor are displayed on a display screen so that an operator can inspect the output directly.
It should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the invention clearly and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (10)

1. A single-line bus passenger travel data generation method is characterized by comprising the following steps:
S1, randomly selecting G bus lines as candidate lines, and acquiring historical ridership data and station spatial information of all the candidate lines;
S2, determining the space vectors of all stations in the G candidate lines according to the station spatial information;
S3, calculating, from the space vectors of the stations, the spatial similarity indexes between all stations in the target line and all stations in any one candidate line, to obtain the total spatial similarity value of that candidate line and the target line;
S4, obtaining, in the manner of step S3, the total spatial similarity value of each candidate line with the target line, sorting these total spatial similarity values in descending order, and selecting the candidate lines corresponding to the first Q total spatial similarity values as preferred candidate lines;
S5, calculating the data similarity indexes between all stations in the target line and all stations in any one preferred candidate line, to obtain the data similarity total value of that preferred candidate line and the target line;
S6, obtaining, in the manner of step S5, the data similarity total value of each preferred candidate line with the target line, calculating the similarity between each preferred candidate line and the target line from the obtained data similarity total value, and selecting the preferred candidate line with the maximum similarity as the optimal candidate line;
S7, taking the optimal candidate line and the target line as a learning sample and a target sample, respectively;
and S8, generating passenger travel data of the target sample from the line travel data of the learning sample by means of a cycle generative adversarial network (CycleGAN) algorithm.
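The two screening stages of steps S4 and S6 can be sketched as follows. This is a minimal illustration, not the applicants' implementation: it assumes the total spatial similarity values and the similarity values V have already been computed, and the line names, numeric values, and function names are all hypothetical.

```python
from typing import Dict, List


def top_q_by_spatial_total(spatial_totals: Dict[str, float], q: int) -> List[str]:
    """Step S4: rank candidate lines by total spatial similarity (largest first), keep the first Q."""
    ranked = sorted(spatial_totals, key=spatial_totals.get, reverse=True)
    return ranked[:q]


def best_by_similarity(similarities: Dict[str, float]) -> str:
    """Step S6: return the preferred candidate line with the maximum similarity."""
    return max(similarities, key=similarities.get)


# Hypothetical, made-up total spatial similarity values for G = 4 candidate lines.
spatial_totals = {"line_A": 7.2, "line_B": 5.1, "line_C": 8.4, "line_D": 6.0}
preferred = top_q_by_spatial_total(spatial_totals, q=2)

# Hypothetical similarity values V, computed only for the preferred candidates.
similarities = {"line_C": 0.62, "line_A": 0.81}
optimal = best_by_similarity({name: similarities[name] for name in preferred})
```

Screening on spatial similarity first keeps the more expensive data-similarity comparison limited to Q lines rather than all G.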
2. The single-line bus passenger travel data generation method according to claim 1, wherein in step S2, the station spatial information includes longitude and latitude information and point-of-interest classification information, and the process of determining the station space vectors of the candidate lines according to the station spatial information is as follows:
determining the longitude and latitude of each station and the number of points of interest within its tolerance radius: the longitude and latitude of each station are recorded as a 2-dimensional vector l, the tolerance radius of the station is denoted r, and the numbers of points of interest in each of the k categories within the tolerance radius r of the station are counted and recorded as a k-dimensional vector h, so that the space vector of each station is represented by the (2+k)-dimensional vector (l, h).
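A space vector of this kind might be assembled as in the sketch below. It rests on assumptions that the claim does not state: points of interest are given as (longitude, latitude, category) tuples, the tolerance radius is measured as a great-circle (haversine) distance in metres, and h holds one count per category. All function names, coordinates, and categories are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def station_space_vector(
    station: Tuple[float, float],
    pois: Sequence[Tuple[float, float, str]],
    categories: Sequence[str],
    radius_m: float,
) -> List[float]:
    """Return (l, h): [lon, lat] followed by the per-category POI counts within radius_m."""
    lon, lat = station
    position = {name: i for i, name in enumerate(categories)}
    counts = [0] * len(categories)
    for poi_lon, poi_lat, category in pois:
        # Count a POI only if its category is tracked and it lies inside the tolerance radius.
        if category in position and haversine_m(lat, lon, poi_lat, poi_lon) <= radius_m:
            counts[position[category]] += 1
    return [lon, lat, *counts]


# Hypothetical station and POIs; k = 3 categories, r = 500 m.
categories = ["school", "shop", "hospital"]
pois = [
    (113.301, 23.101, "school"),  # roughly 150 m from the station
    (113.300, 23.1005, "shop"),   # roughly 55 m from the station
    (113.400, 23.100, "shop"),    # roughly 10 km away, outside the radius
]
vec = station_space_vector((113.30, 23.10), pois, categories, radius_m=500.0)
```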
3. The single-line bus passenger travel data generation method according to claim 2, wherein in step S3, the calculation formula of the spatial similarity index is as follows:
Figure FDA0003485896990000011
in the formula, s_i is the i-th station in the target line, b_j is the j-th station in the candidate line, P(s_i, b_j) is the spatial similarity index of station s_i and station b_j, l_i and h_i are the longitude and latitude and the point-of-interest information of station s_i, l_j and h_j are the longitude and latitude and the point-of-interest information of station b_j, α and β are the influence coefficients of the longitude and latitude and of the point-of-interest information, respectively, on the spatial similarity index, the value range of the spatial similarity index P is [-1, 1], and the larger the value, the higher the spatial similarity of the two stations.
4. The single-line bus passenger travel data generation method according to claim 3, wherein in step S3, the specific process of obtaining the total spatial similarity value between the candidate line and the target line is as follows:
let the target line have n stations, i.e. s_i ∈ {s_1, s_2, ..., s_n}, i = 1, 2, …, n, and let the candidate line have m stations, i.e. b_j ∈ {b_1, b_2, ..., b_m}, j = 1, 2, …, m;
first, the spatial similarity index between station s_1 in the target line and each of the m stations in the candidate line is calculated, giving m spatial similarity indexes for station s_1, and the maximum of these is selected as P_1, the maximum spatial similarity index of station s_1 with respect to the candidate line; by analogy, the set of maximum spatial similarity indexes of the n stations in the target line with respect to the candidate line is obtained and denoted {P_1, P_2, ..., P_n}; finally, P_1, P_2, ..., P_n are accumulated to obtain the total spatial similarity value of the candidate line and the target line.
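The aggregation in claim 4 (take, for each target station, the maximum index over the candidate's stations, then sum those maxima) can be sketched as below. The per-station index is passed in as a function because the formula for P appears only as a figure; the `toy_index` used here is a made-up stand-in on scalar "stations" and is not the index defined by the claims.

```python
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")


def total_similarity(
    target: Sequence[S],
    candidate: Sequence[S],
    index: Callable[[S, S], float],
) -> float:
    """Sum over target stations of the maximum index against any candidate station.

    Raises ValueError if `candidate` is empty, since no maximum exists.
    """
    return sum(max(index(s, b) for b in candidate) for s in target)


def toy_index(s: float, b: float) -> float:
    """Hypothetical stand-in index: 1 minus the absolute difference of two scalars."""
    return 1.0 - abs(s - b)


# n = 2 target "stations", m = 3 candidate "stations" (made-up scalars).
total = total_similarity([0.0, 1.0], [0.25, 1.0, 3.0], toy_index)
```

Here the per-station maxima are 0.75 and 1.0, so the total is 1.75. Claim 6 describes the same max-then-sum pattern for the data similarity index, so the same helper would apply with a different `index` function.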
5. The single-line bus passenger travel data generation method according to claim 4, wherein in step S5, the calculation formula of the data similarity index is as follows:
Figure FDA0003485896990000021
in the formula, E(s_i, b_j) is the data similarity index of station s_i and station b_j, R_s and R_b are the total numbers of passengers of the target line and the preferred candidate line, respectively, T_u and T_d are the numbers of passengers boarding and alighting, respectively, at station s_i of the target line, C_u and C_d are the numbers of passengers boarding and alighting, respectively, at station b_j of the preferred candidate line, the value range of the data similarity index E is [0, 1], and the larger the value, the lower the data similarity of the two stations.
6. The single-line bus passenger travel data generation method according to claim 5, wherein in step S5, the specific process of obtaining the data similarity total value of the preferred candidate route and the target route is as follows:
first, the data similarity index between station s_1 in the target line and each of the m stations in the preferred candidate line is calculated, giving m data similarity indexes for station s_1, and the maximum of these is selected as E_1, the maximum data similarity index of station s_1 with respect to the preferred candidate line; by analogy, the set of maximum data similarity indexes of the n stations in the target line with respect to the preferred candidate line is obtained and denoted {E_1, E_2, ..., E_n}; finally, E_1, E_2, ..., E_n are accumulated to obtain the data similarity total value of the preferred candidate line and the target line.
7. The single-line bus passenger travel data generation method according to claim 6, wherein in step S6, the calculation formula of the similarity is as follows:
Figure FDA0003485896990000031
where V(S, b) is the similarity between the preferred candidate line and the target line, E_i ∈ {E_1, E_2, ..., E_n}, E_i is the maximum data similarity index of station s_i in the target line with respect to the preferred candidate line,
Figure FDA0003485896990000032
is the data similarity total value of the preferred candidate line and the target line, the value range of the similarity V is [0, 1], and the larger the value, the more similar the preferred candidate line and the target line are.
8. The single-line bus passenger travel data generation method according to claim 1, wherein in step S7, the ticketing data set of the target line is used as the data source of the target sample, and the complete passenger travel data set of the optimal candidate line is used as the data source of the learning sample.
9. The single-line bus passenger travel data generation method according to claim 1, wherein in step S8, a cycle generative adversarial network (CycleGAN) model is used as the generator, the travel data matrix of the learning sample is input into the generator, simulated data are generated iteratively in the generator, the generated simulated data are passed to a discriminator for judgment, and if the similarity of the simulated data is not lower than a set threshold, the generated simulated data are considered valid and are output as the passenger travel data of the target sample.
10. The single-line bus passenger travel data generation method according to claim 9, wherein in the discriminator, the similarity between the simulated data and the real travel data of the learning sample is calculated first and then compared with the set threshold, and if the similarity of the simulated data is not lower than the set threshold, the generated simulated data are considered valid and are output;
the set threshold is the similarity between the optimal candidate line obtained in step S6 and the target line.
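The accept-or-retry check described in claims 9 and 10 can be sketched as a simple loop. This is only an illustration of the thresholding logic: the generator and the similarity measure are passed in as stubs, no network is trained, and the `max_iters` cap is an added assumption (the claims do not say when iteration stops if no sample passes).

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def generate_until_valid(
    generate: Callable[[], T],
    similarity: Callable[[T], float],
    threshold: float,
    max_iters: int = 100,
) -> Optional[T]:
    """Return the first generated sample whose similarity is not lower than the threshold.

    Returns None if no sample passes within max_iters attempts (assumed stopping rule).
    """
    for _ in range(max_iters):
        sample = generate()
        if similarity(sample) >= threshold:
            return sample
    return None


# Stub generator: replays made-up "samples" that double as their own similarity scores.
scores = iter([0.2, 0.5, 0.9])
accepted = generate_until_valid(lambda: next(scores), lambda x: x, threshold=0.8)

# A run in which nothing reaches the threshold within the iteration cap.
low_scores = iter([0.1, 0.2])
rejected = generate_until_valid(lambda: next(low_scores), lambda x: x, threshold=0.8, max_iters=2)
```

In the claimed method the threshold would be the similarity obtained in step S6 between the optimal candidate line and the target line, so generated data are required to match the learning sample at least as closely as the two real lines match each other.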
CN202210081034.2A 2022-01-24 2022-01-24 Single-line bus passenger travel data generation method Pending CN114444795A (en)

Publications (1)

Publication Number Publication Date
CN114444795A true CN114444795A (en) 2022-05-06

Family

ID=81370221

Country Status (1)

Country Link
CN (1) CN114444795A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116663338A (en) * 2023-08-02 2023-08-29 中国电子信息产业集团有限公司第六研究所 Simulation analysis method, device, equipment and medium based on similar calculation example
CN116663338B (en) * 2023-08-02 2023-10-20 中国电子信息产业集团有限公司第六研究所 Simulation analysis method, device, equipment and medium based on similar calculation example

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination