CN108847037B - Non-global information oriented urban road network path planning method - Google Patents

Info

Publication number
CN108847037B
Authority
CN
China
Prior art keywords
road
time
intersection
node
road network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810677156.1A
Other languages
Chinese (zh)
Other versions
CN108847037A (en)
Inventor
胡征兵
胡岑诺
唐传慧
蒋玲
杨琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central China Normal University
Original Assignee
Central China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central China Normal University filed Critical Central China Normal University
Priority to CN201810677156.1A priority Critical patent/CN108847037B/en
Publication of CN108847037A publication Critical patent/CN108847037A/en
Application granted granted Critical
Publication of CN108847037B publication Critical patent/CN108847037B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/07Controlling traffic signals
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/07Controlling traffic signals
    • G08G1/08Controlling traffic signals according to detected number or speed of vehicles
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/09Arrangements for giving variable traffic instructions
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/09Arrangements for giving variable traffic instructions
    • G08G1/0962Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages
    • G08G1/0968Systems involving transmission of navigation instructions to the vehicle

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a non-global information oriented urban road network path planning method, which improves the efficiency of an urban traffic network by balancing traffic flow across the network. Existing path planning methods either can only calculate the optimal route for a single vehicle, or require more information and additional assumptions. To address these problems, the method periodically monitors state changes of road sections, intelligently learns and evaluates the congestion index of each road section, constructs a PST model, and uses the A*R algorithm to select paths so that the road network stays in a flow-balanced state. Extensive experiments show that, in scenarios with relatively many vehicles, the method balances traffic network flow more effectively, alleviates traffic congestion, and performs well on key indicators such as reducing the average travel time and average travel distance of vehicles. Moreover, even when a large number of vehicles in the road network do not follow the planned routes, the method still noticeably reduces traffic congestion.

Description

Non-global information oriented urban road network path planning method
Technical Field
The invention belongs to the technical field of computer science, intelligent traffic and machine learning, and relates to a non-global information-oriented urban road network path planning method.
Background
To improve the utilization of urban road resources, relieve congestion, and reduce vehicle delay, scholars at home and abroad have proposed many methods. Current research on path optimization mainly covers the A* algorithm, Dijkstra's shortest path algorithm, the SPFA algorithm, dynamic programming, and the like. These are deterministic algorithms: for a given road network state they compute a unique path. Their counterparts are non-deterministic algorithms such as the PSO algorithm, genetic algorithms, ant colony optimization, and neural network algorithms, also called intelligent algorithms, which provide an optimal or suboptimal route probabilistically. In addition, optimization algorithms based on current traffic conditions, such as TomTom and Google navigation, provide the optimal path and several suboptimal paths for the current situation. These methods solve the path optimization problem for a single vehicle and share the following problems: the optimal route is calculated only for an individual vehicle, competition from other vehicles for road resources is not considered, and the highly dynamic nature of a modern urban traffic network is ignored. When congestion occurs, giving the same alternative route to many vehicles at the same time can cause new congestion.
Therefore, scholars have also proposed many new methods for uncertain, dynamic, multi-vehicle scenarios. To balance traffic across the network, Khodadadi et al. combine the ant colony algorithm with fuzzy logic to calculate the instantaneous state of the traffic network and distribute traffic according to minimum travel time, improving traffic management as far as possible; Joger et al. use a hidden Markov model to predict the trajectories of moving objects from massive movement trajectories in a big data environment; Liang Z et al. propose proactive path planning based on congestion prediction; Yan et al. establish a multi-intersection path selection model that distributes traffic flow evenly over the selectable paths while respecting vehicle preferences, maximizing the utilization of urban road resources; Wang L et al. guide vehicles' route choices through pricing based on the congestion level of each route. However, these methods either cannot handle high-traffic scenarios effectively, or require more information and assumptions, such as the OD set of all vehicles obtained in advance, road impedance functions, and so on.
Reinforcement learning is widely applied in the field of intelligent transportation. Wiering et al. apply Q-learning to traffic signal control: aiming to minimize the accumulated waiting time at the signal lights passed by all vehicles entering and leaving a city, they establish a vehicle-based utility value function and combine optimal vehicle path selection with minimal single-node delay. Tantawy et al. propose an adaptive traffic light algorithm based on modular Q reinforcement learning in a multi-Agent system, which optimizes the phase sequence through cooperative control of adjacent traffic lights and thereby reduces the average delay at intersections. These methods mainly apply reinforcement learning to the optimization of traffic lights. Scholars have also explored reinforcement learning for path planning: Basha N et al. use SARSA reinforcement learning to solve the dynamic traffic routing problem on a network simulated by a cell transmission model, so that the road network reaches an equilibrium state; to reduce traffic delay and avoid congestion, Arokhlo et al. propose calculating a minimum-cost path from origin to destination with a multi-Agent reinforcement learning method. However, in these methods either the Agents learn and decide in isolation from one another, so that overall coordination and cooperation cannot be achieved, or coordination among Agents is achieved at a high cost in time and space, which slows the convergence of the algorithm.
In summary, most algorithms existing at present can only find the optimal path for a single vehicle based on the current road network state; or the application scene of the algorithm needs a plurality of assumptions as a premise; or perform poorly in the case of a greater density of vehicles.
Disclosure of Invention
Aiming at the three defects, the invention provides a method for evaluating the congestion index for road sections of a road network through a self-adaptive learning period, fully utilizes the continuity of the macroscopic behaviors of vehicles, and simultaneously plans the paths for all vehicles, thereby having good performance under the condition of higher vehicle density.
The technical scheme adopted by the invention is as follows: a non-global information oriented urban road network path planning method is characterized by comprising the following steps:
Step 1: Preprocess the road network information with the virtual-edge method and introduce a multi-Agent system: a module system consisting of several Agents is formed around each road section in the road network, and each module system makes its decisions independently;
Step 2: Intelligently learn and evaluate the road congestion index Index by observing state changes of the road sections in the area;
Step 3: Construct a PST model from the road network congestion index;
Step 4: In the PST model, use the A*R algorithm to select paths so that traffic flow is distributed evenly over the whole road network;
Step 5: After the preset time period has elapsed, return to Step 2.
Compared with the prior art, the invention has the following beneficial effects. The urban traffic system is dynamic, non-deterministic, and complex. Although the prior art has achieved certain results, most methods can only find the optimal path for a single vehicle based on the current road network state, or require many assumptions as a premise for their application scenario, or perform poorly when vehicle density is high. Reinforcement learning places low requirements on prior knowledge of the environment, so it can learn well in a complex system. The method balances traffic network flow more effectively even in scenarios with relatively many vehicles, relieves traffic congestion, and performs well on key indicators such as reducing the average travel time and average travel distance of vehicles. Moreover, even when a large number of vehicles in the road network do not follow the planned routes, the method still noticeably reduces traffic congestion.
Drawings
FIG. 1 is a schematic block diagram of an embodiment of the present invention;
FIG. 2 is a schematic diagram of the virtual-edge method applied to a road network according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the Beijing Yizhuang Economic-Technological Development Area according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the influence of $F_{SF}$ on the average travel time and travel distance at different vehicle scales according to an embodiment of the present invention, wherein a is 2000 vehicles, b is 4000 vehicles, and c is 6000 vehicles;
FIG. 5 is a comparison of average travel times for different vehicle sizes in accordance with an embodiment of the present invention;
FIG. 6 is a comparison of average distance traveled on different vehicle scales in accordance with an embodiment of the present invention;
FIG. 7 is a graph comparing the number of road jams over time for an embodiment of the present invention;
FIG. 8 is a graph comparing average travel times at different compliance rates for embodiments of the present invention.
Detailed Description
To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawings and embodiments. The embodiments described here only illustrate and explain the present invention and do not limit it.
Referring to fig. 1, the non-global information oriented urban road network path planning method provided by the present invention includes the following steps:
Step 1: Preprocess the road network information with the virtual-edge method and introduce a multi-Agent system: a module system consisting of several Agents is formed around each road section in the road network, and each module system makes its decisions independently;
For the urban road network $G = (L, E)$, let the number of intersections be $m$ and let $l_x$ denote the $x$-th intersection, so that $L = \{l_1, l_2, \ldots, l_m\}$, $x = 1, 2, \ldots, m$. For any two adjacent intersections $l_x$ and $l_y$, if $l_y$ can be reached from $l_x$, there is a road section $(l_x, l_y) \in E$; if $l_x$ can also be reached from $l_y$, there is also a road section $(l_y, l_x) \in E$. To reflect the weight of the intersection nodes themselves, the road network information is preprocessed with the virtual-edge method, which converts node weights into edge weights. Each intersection node is expanded one-to-many according to the actual structure of the intersection, and virtual edges are added between the expanded nodes according to the driving directions inside the intersection; the weight of a virtual edge is the travel time of the corresponding turn inside the intersection. After this preprocessing there are two types of edges: real edges, which map to the road sections in the set $E$, and virtual edges, which map to the driving directions inside intersections.
FIG. 2 shows an intersection represented with the virtual-edge method: intersection $l_x$ is expanded into 8 nodes $\{x_1, x_2, \ldots, x_8\}$. The 8 nodes fall into two types according to their attributes and roles at the intersection: intersection out-points and intersection in-points. An out-point is a node through which vehicles can leave the intersection; in FIG. 2, $x_5, x_6, x_7, x_8$ are out-points. An in-point is a node through which vehicles can enter the intersection; in FIG. 2, $x_1, x_2, x_3, x_4$ are in-points. Let $B_x = \{x_5, x_6, x_7, x_8\}$ denote the set of out-points of intersection $l_x$, and $Y_x = \{x_1, x_2, x_3, x_4\}$ the set of in-points of intersection $l_x$.
For a road section $(l_x, l_y)$, suppose that under the virtual-edge method it is represented by a real edge $[x_u, y_v]$, where $x_u \in B_x$ and $y_v \in Y_y$. Let $To(y_v)$ denote the set of nodes from which in-point $y_v$ can be reached, and $From(x_u)$ the set of nodes reachable from out-point $x_u$. They are calculated as shown in formulas (1) and (2):
$$To(y_v) = \{x_u\} \tag{1}$$
$$From(x_u) = \{y_v\} \tag{2}$$
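The expansion described above can be sketched in Python. This is only an illustration under stated assumptions, not the patent's implementation: the function names `expand_intersection` and `add_real_edge`, the dictionary layout, and the example turn times are all hypothetical, and a four-way intersection with in-points 1 to 4 and out-points 5 to 8 is assumed to mirror FIG. 2.

```python
def expand_intersection(name, turn_times):
    """Expand one four-way intersection into 4 in-points and 4 out-points.

    turn_times maps (in_index, out_index) -> travel time of that turn;
    only permitted turns are listed. Indices 1..4 are in-points (Y_x) and
    5..8 are out-points (B_x), mirroring x1..x8 in the text.
    """
    in_points = [f"{name}{k}" for k in range(1, 5)]    # Y_x
    out_points = [f"{name}{k}" for k in range(5, 9)]   # B_x
    # Each permitted turn becomes a virtual edge weighted by its travel time.
    virtual_edges = {
        (f"{name}{i}", f"{name}{o}"): t for (i, o), t in turn_times.items()
    }
    return in_points, out_points, virtual_edges


def add_real_edge(to_sets, from_sets, x_out, y_in):
    """Record a real edge [x_out, y_in] in the To(.) and From(.) sets."""
    to_sets.setdefault(y_in, set()).add(x_out)      # To(y_v) gains x_u
    from_sets.setdefault(x_out, set()).add(y_in)    # From(x_u) gains y_v


if __name__ == "__main__":
    ins, outs, virtual = expand_intersection("x", {(1, 6): 3.0, (1, 7): 5.0})
    to_sets, from_sets = {}, {}
    add_real_edge(to_sets, from_sets, "x5", "y1")
    print(ins, outs, virtual, to_sets, from_sets)
```

Keeping the turn times in a dictionary keyed by (in-point, out-point) makes it easy to leave out forbidden turns, which is one way the node weight of an intersection can be turned into edge weights.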
Step 2: Intelligently learn and evaluate the road congestion index Index by observing state changes of the road sections in the area;
Reinforcement learning is applied: using the MQ (modular Q-learning) algorithm, the congestion index Index of each road section is calculated from the traffic flow information collected in real time for each road section of the road network. Index reflects the expected traffic flow of the road section: the larger the Index, the smaller the expected traffic flow;
For any road section $i$, let $NB_i$ denote the set of neighbor road sections of $i$, and let $NB_i[k]$ denote its $k$-th neighbor road section. A modular Agent system is built around each road section together with its neighbor set. Taking road section $i$ as an example, $Agent_i$ denotes the Agent on that road section. $Agent_i$, as the center, forms a module with its neighbor Agents $Agent_{NB_i[k]}$; inside the module, $Agent_i$ is paired with each neighbor Agent $Agent_{NB_i[k]}$ for pairwise Agent Q reinforcement learning;
The MQ algorithm is implemented as follows:
At time $t$, the road sensors provide the current states of $Agent_i$ and of each neighbor Agent $Agent_{NB_i[k]}$. The best actions that $Agent_i$ and $Agent_{NB_i[k]}$ selected at time $t-1$, and the return value obtained, are also available.
1. The optimal response to the joint state is calculated as shown in equation (3):
[equation (3): formula image not recoverable]
where the state $s_i$ of $Agent_i$ is expressed as the total number of vehicles on road section $i$; the action $a_i$ of $Agent_i$ is represented by a number and is used to calculate the congestion index of road section $i$ [formula image not recoverable]; $time(i)$ denotes the free-flow travel time on road section $i$; $A_i$ denotes the set of actions that $Agent_i$ may take over all environment states; the Q term denotes the Q value when $Agent_i$ is in its state and takes action $a_i$ while the neighbor Agent $Agent_{NB_i[k]}$ is in its state and takes its action; and the probability term denotes the probability that the corresponding action is taken under the joint state at time $t-1$;
2. The Q value is updated at time $t$ as shown in equation (4), where $\alpha$ is the learning rate and $\gamma$ is the discount factor:
[equation (4): formula image not recoverable]
3. The optimal action of $Agent_i$ at time $t$ is calculated as shown in equation (5):
[equation (5): formula image not recoverable]
where the state $s_i$ of $Agent_i$ is expressed as the total number of vehicles on road section $i$. The congestion index $Index(i)$ of road section $i$ is calculated from the action $a$ taken by $Agent_i$, where $time_i$ denotes the free-flow travel time on road section $i$ and $a \in A_i$. The flow on any road section $i$ should be kept at its saturation flow $SF_i$: the closer the traffic flow at the next moment is to $SF_i$, the larger the return value, and the larger its difference from $SF_i$, the smaller the return value.
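As a rough illustration of the learning loop, the sketch below uses the textbook single-Agent tabular Q-learning update with learning rate $\alpha$ and discount factor $\gamma$. It is not the patent's modular pairwise rule over joint states, whose equations (3) to (5) are not reproduced in this text. The reward shape (negative absolute gap to the saturation flow), the function names, and the default values of `alpha` and `gamma` are assumptions chosen only to match the qualitative description that the return grows as the next flow approaches $SF_i$.

```python
from collections import defaultdict


def reward(next_flow, saturation_flow):
    """Assumed reward: largest (zero) when next_flow equals saturation_flow,
    and smaller the further the flow is from it."""
    return -abs(next_flow - saturation_flow)


def q_update(q, state, action, r, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q value.

    q is a mapping (state, action) -> value, e.g. defaultdict(float).
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q = defaultdict(float)
    # State = number of vehicles on the road section; action = an index value.
    new_value = q_update(q, state=3, action=1, r=reward(8, 10),
                         next_state=4, actions=[0, 1, 2])
    print(new_value)
```

In a modular setting, one such table would be kept per pair of neighboring Agents and indexed by their joint state, at the cost of a larger state space.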
The saturation flow SF is calculated as shown in equation (6):
$$SF = leng \times lane \times state \times speed \times fmy \times F_{SF} / 100 \tag{6}$$
where $leng$ is the length of the road section, with value range $(0, \infty)$; $lane$ is the number of lanes of the road section, with value range $(0, \infty)$; $state$ is the flatness of the road section, a larger value meaning a flatter road section, with value range $(0, 1]$; $speed$ is the ratio of the maximum speed limit of the road section to the maximum speed limit of the road network, with value range $(0, 1]$; $fmy$ is the familiarity of the road section, a larger value meaning a more familiar road section, with value range $(0, 1]$; and $F_{SF}$ is the saturation flow coefficient of the road network, which adjusts the calculation of SF: if there are few vehicles in the whole road network, $F_{SF}$ is set to a small value, and if there are many vehicles, $F_{SF}$ is set to a larger value.
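Equation (6) translates directly into a small function. The sketch below is a minimal illustration; the function name `saturated_flow`, the range checks, and the example values are assumptions added here, and the unit of `leng` is left unspecified because the text does not state one.

```python
def saturated_flow(leng, lane, state, speed, fmy, f_sf):
    """Saturation flow of a road section following equation (6).

    leng and lane must be positive; state, speed and fmy must lie in (0, 1].
    f_sf is the road network saturation flow coefficient F_SF.
    """
    if leng <= 0 or lane <= 0:
        raise ValueError("leng and lane must be positive")
    for label, value in (("state", state), ("speed", speed), ("fmy", fmy)):
        if not 0 < value <= 1:
            raise ValueError(f"{label} must lie in (0, 1]")
    return leng * lane * state * speed * fmy * f_sf / 100


if __name__ == "__main__":
    # Hypothetical road section: length 500, 2 lanes, all ratios 1, F_SF = 10.
    print(saturated_flow(500, 2, 1.0, 1.0, 1.0, 10))
```

Because every factor enters multiplicatively, doubling $F_{SF}$ doubles the saturation flow of every road section, which is why a single coefficient can scale the whole network to the expected vehicle count.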
Step 3: Construct a PST model from the road network congestion index;
The principle of the PST model is as follows: the whole path planning time range $T$ is divided into time scales with time granularity $\delta$; the nodes obtained from road network preprocessing are mapped to the ordinate and the time scales to the abscissa; and edges are added between nodes according to the conditions below. Because the signal phase scheme of the road network affects travel time, intersection phases and road section capacity constraints must be considered together. To this end, the PST model integrates the signal phase scheme and combines it with the road network congestion index, so that the phase scheme and the road section capacity constraints are handled as a whole.
The specific implementation comprises the following substeps:
Step 3.1: For time $t$ (initial value 0), if the traffic light gives in-point $x_u \in Y_x$ of any intersection $l_x$ the right of way to out-point $x_v \in B_x$, then:
create points $(t, x_u)$ and $(t + time(x_u, x_v), x_u)$, and create edge $(t, x_u) \to (t + time(x_u, x_v), x_u)$;
$time(x_u, y_v)$ denotes the free-flow travel time from node $x_u$ to node $y_v$; $B_x$ denotes the set of out-points of intersection $l_x$, and $Y_x$ denotes the set of in-points of intersection $l_x$. An out-point is a node through which vehicles can leave the intersection, and an in-point is a node through which vehicles can enter the intersection.
Step 3.2: For any node $y_z \in From(x_v)$,
create point $(t + time(x_u, x_v) + Index(x_v, y_z), y_z)$ and create edge $(t + time(x_u, x_v), x_u) \to (t + time(x_u, x_v) + Index(x_v, y_z), y_z)$;
$From(x_v)$ denotes the set of nodes directly reachable from out-point $x_v$;
$Index(x_v, y_z)$ denotes the congestion index from node $x_v$ to node $y_z$.
Step 3.3: Add 1 to $t$, and return to step 3.1 if $t < T$;
Step 3.4: For time $t$ (initial value 0), for any in-point $x_u \in Y_x$ of any intersection $l_x$, if point $(t, x_u)$ exists, find the smallest time $s > t$ such that point $(s, x_u)$ exists, and create edge $(t, x_u) \to (s, x_u)$;
Step 3.5: Add 1 to $t$, and repeat step 3.4 if $t < T$;
Step 3.6: Traverse all edges that currently exist;
for any edge $(t, x_u) \to (s, y_v)$, set the weight of the edge to $s - t$.
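Steps 3.4 to 3.6 can be sketched as follows. This is a minimal illustration that assumes the points and edges from steps 3.1 to 3.3 have already been built; the representation of a point as a `(time, node)` tuple, the set-based storage, and the function names are assumptions made for this example.

```python
from collections import defaultdict


def add_waiting_edges(points, edges, in_points):
    """Steps 3.4-3.5: link each in-point to its next occurrence in time.

    points: set of (t, node) tuples; edges: set of ((t, node), (s, node2));
    in_points: nodes that are intersection in-points.
    """
    times_by_node = defaultdict(list)
    for t, node in points:
        if node in in_points:
            times_by_node[node].append(t)
    for node, times in times_by_node.items():
        times.sort()
        # Consecutive times give, for each t, the smallest s > t.
        for t, s in zip(times, times[1:]):
            edges.add(((t, node), (s, node)))
    return edges


def edge_weights(edges):
    """Step 3.6: the weight of edge (t, u) -> (s, v) is s - t."""
    return {(src, dst): dst[0] - src[0] for src, dst in edges}


if __name__ == "__main__":
    points = {(0, "x1"), (3, "x1"), (7, "x1"), (2, "y5")}
    edges = {((0, "x1"), (2, "y5"))}
    add_waiting_edges(points, edges, in_points={"x1"})
    print(edge_weights(edges))
```

Sorting the times per node replaces the step-by-step scan over $t$ described in the text and yields the same waiting edges, since each point is linked only to its immediate successor in time.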
Step 4: In the PST model, use the A*R algorithm to select paths so that traffic flow is distributed evenly over the whole road network;
For any vehicle $f \in F$, the set $Passed_f$ of road sections that vehicle $f$ has already traveled is known, and for any road section $i \in Passed_f$ the total time $sum_i$ that the vehicle spent on that road section is known. In path planning, the set of road sections on the path of vehicle $f$ from the start node to node $x$ is also known. The set of already-traveled road sections contained in the path of vehicle $f$ from the start node to node $x$ is calculated as shown in formula (7):
[formula (7): formula image not recoverable]
For vehicle $f$, selecting a road section from this set again wastes road section resources. To avoid this waste, such a selection must be penalized, and a penalty function $e(x)$ is defined as shown in formula (8):
[formula (8): formula image not recoverable]
where the coefficient in formula (8) is the coefficient of the A*R algorithm; the setting of its value is explored in the experimental part;
Assuming that the parent node of node $x$ is $y = fast(x)$, the recursion for the function $e(x)$ is shown in formula (9):
[formula (9): formula image not recoverable]
where $sum(y, x)$ denotes the total time that the vehicle has spent on road section $(l_y, l_x)$.
The evaluation function $f(x)$ is shown in equation (10):
$$f(x) = g(x) + h(x) + e(x) \tag{10}$$
where $f(x)$ is the evaluation function, which estimates the shortest time of a path passing through node $x$; $g(x)$ is the shortest time from the start node to node $x$; and $h(x)$ is the heuristic function estimating the shortest time from node $x$ to the destination node, obtained by dividing the Euclidean distance from the current node to the destination node by the maximum allowed speed of the road network.
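The evaluation function of equation (10) can be illustrated with a small A* search that adds a penalty term. Formulas (8) and (9) are not reproduced in this text, so the penalty used here is an assumption: $e(x) = e(y) + coef \times sum(y, x)$, where $sum(y, x)$ is zero for road sections the vehicle has not yet traveled. The function name `a_star_r`, the parameter `coef`, the plain (non-time-expanded) graph, and the example data are all hypothetical; only $h(x)$, the Euclidean distance divided by the maximum allowed speed, follows the text directly.

```python
import heapq
import math


def a_star_r(graph, coords, start, goal, v_max, passed_time, coef):
    """A* search with evaluation f = g + h + e.

    graph: node -> list of (neighbor, travel_time)
    coords: node -> (x, y) position used by the heuristic
    passed_time: (u, v) -> total time already spent on that road section
    coef: assumed penalty coefficient for reusing a traveled road section
    Returns (path, travel_time) or (None, inf) if the goal is unreachable.
    """
    def h(node):
        return math.dist(coords[node], coords[goal]) / v_max

    heap = [(h(start), 0.0, 0.0, start, [start])]  # (f, g, e, node, path)
    best = {}
    while heap:
        f, g, e, node, path = heapq.heappop(heap)
        if node == goal:
            return path, g
        if node in best and best[node] <= g + e:
            continue  # already expanded with a cost at least as good
        best[node] = g + e
        for nxt, cost in graph.get(node, []):
            g2 = g + cost
            e2 = e + coef * passed_time.get((node, nxt), 0.0)
            heapq.heappush(heap, (g2 + h(nxt) + e2, g2, e2, nxt, path + [nxt]))
    return None, math.inf


if __name__ == "__main__":
    graph = {"A": [("B", 2.0), ("C", 3.0)], "B": [("D", 2.0)], "C": [("D", 3.0)]}
    coords = {"A": (0, 0), "B": (1, 1), "C": (1, -1), "D": (2, 0)}
    print(a_star_r(graph, coords, "A", "D", 1.0, {}, 2.0))
    print(a_star_r(graph, coords, "A", "D", 1.0, {("A", "B"): 10.0}, 1.0))
```

In the example, the route through B is fastest when nothing has been traveled, while a large accumulated time on road section (A, B) pushes the search onto the route through C, which is the diverting effect the penalty is meant to have.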
Step 5: After the preset time period has elapsed, return to Step 2.
Existing path selection algorithms can only optimize the paths of a few vehicles and easily cause local congestion. To solve these problems, the invention provides an A* Exclusion algorithm based on the PST model and reinforcement learning. To verify the effectiveness of the A*SR method, the real road network of the Beijing Yizhuang Economic-Technological Development Area (see FIG. 3) was obtained from OpenStreetMap, and vanet, an open-source microscopic traffic simulation software, was used to simulate the movement of vehicles in the road network. By constructing scenarios with different vehicle scales, the A*SR method is compared with 4 different path optimization algorithms:
(1) SP (shortest path algorithm);
(2) DSP (dynamic shortest path algorithm);
(3) RkSP (random k shortest path algorithm);
(4) DTA (dynamic traffic allocation algorithm).
The road network information is configured in Extensible Markup Language (XML) and describes attributes of the road sections such as start coordinates, target coordinates, number of lanes, and maximum speed limit. The vehicle information is also configured in XML and includes vehicle attributes such as start coordinates, target coordinates, departure time, and maximum speed. The road network of the Beijing Yizhuang Economic-Technological Development Area is shown in FIG. 3: all road sections are bidirectional, different road sections have different speed limits, a given road section has the same number of lanes, and every intersection of the real road network has a signal light.
In this embodiment:
1. Explore the parameter settings of the A*SR method: the best settings are obtained by exploring the road section saturation flow coefficient $F_{SF}$ and the coefficient of the A*R algorithm.
2. To verify the ability of the A*SR method to reduce the number of congested road sections, an experiment is designed that compares the A*SR method with the 4 path optimization algorithms at different vehicle scales.
3. To verify the effectiveness of the A*SR method in reducing travel time and travel distance, an experiment is designed that compares its performance with the 4 methods at different vehicle scales.
4. The effectiveness of the algorithm in optimizing traffic is further verified by comparing the A*SR method with the other algorithms under different navigation compliance rates.
I. Parameter selection
Table 1 lists the parameters of the A*SR method: the time period $T_c$, the coefficient of the A*R algorithm, and the road network saturation flow coefficient $F_{SF}$. These three parameters strongly affect the performance of the algorithm, and a large number of experiments were carried out to explore their best settings. OD sets are generated randomly on the real road network, with origins concentrated in the upper-left area of the map and destinations concentrated in the right area, and all vehicles are assumed to fully comply with the routes planned by the A*SR method. For brevity, the tuning of $T_c$ and the A*R coefficient is not described in detail in this embodiment; in the following experiments they are set to 15 s and 2 respectively, because these two values give good results regardless of vehicle scale. From the definition of the saturation flow coefficient $F_{SF}$ above, a smaller $F_{SF}$ means that the road sections are allowed to hold fewer vehicles, and a larger $F_{SF}$ means that they are allowed to hold more. This embodiment constructs scenarios with several vehicle scales, sets $F_{SF}$ to different values in each scenario, and calculates the average travel time and average travel distance of the vehicles. FIG. 4 shows the results for vehicle scales of 2000, 4000, and 6000 vehicles.
Table 1. Relevant parameters
[Table 1: table image not recoverable]
Consider the optimal setting of $F_{SF}$ at each vehicle scale. The simulation results in FIG. 4 show that:
(1) At a small vehicle scale, the average travel time and travel distance of the vehicles decrease as $F_{SF}$ increases and then level off. When $F_{SF}$ is too small, road sections are easily judged to be congested, and the A*SR method guides vehicles onto relatively long routes, which increases travel time and travel distance. Once $F_{SF}$ reaches its optimal value, the vehicle scale is too small for the road sections in the road network to reach the congestion condition defined by the A*SR method, so most vehicles still travel on good routes. When $F_{SF}$ exceeds this optimal value, the number of vehicles on the road sections can no longer reach $F_{SF}$, so the average travel time and travel distance level off. When the number of vehicles is 2000, the optimal value is [value image not recoverable].
(2) At a medium vehicle scale, the average travel time drops quickly to its minimum as $F_{SF}$ increases to its optimal value and then rises slowly as $F_{SF}$ increases further. Once $F_{SF}$ passes a certain value, the threshold for judging a road to be congested is raised; at a medium vehicle scale the traffic flow on some roads is relatively large, so congestion actually forms in the road network. When the number of vehicles is 4000, the optimal value is [value image not recoverable].
(3) At a large vehicle scale, the overall traffic flow of the road network is large, so the optimal value of $F_{SF}$ must be large. Before $F_{SF}$ increases to its optimal value, the average travel time stays at a high level, because setting $F_{SF}$ too small makes roads more likely to be judged congested, which causes the road network state to oscillate. When $F_{SF}$ exceeds its optimal value, actual congestion is likely to occur, so the average travel time increases rapidly. When the number of vehicles is 6000, the optimal value is [value image not recoverable].
Second, performance comparison at different vehicle scales;
To verify the effectiveness of the A*SR method in reducing travel time, this section compares the A*SR method with the DSP, RkSP and DTA path optimization algorithms at different vehicle scales. FIG. 5 shows the average vehicle travel time of these algorithms in different vehicle scale scenarios. The simulation results show that the A*SR method reduces the average vehicle travel time to varying degrees in all vehicle scale scenarios. The DSP algorithm dynamically updates the travel path based on real-time traffic conditions, but in some cases it assigns the same route to many vehicles with the same location and destination, which creates new congestion; its average travel time is therefore low when the number of vehicles is small but increases by a large margin as the traffic density increases. RkSP avoids this disadvantage of the DSP algorithm by randomly balancing traffic flow over different routes, but a random method alone is evidently not sufficient. As expected, DTA works better at reducing the average travel time because it brings the road network as close as possible to equilibrium. For all algorithms, the average travel time tends to increase as the vehicle scale increases, and the strength of this increasing trend ranks as: DSP > RkSP > DTA > A*SR. The average travel time of the A*SR method grows more slowly because, as the number of vehicles increases, the A*SR method can obtain a better strategy through reinforcement learning. As a result of this trend, when the vehicle scale reaches 6000 vehicles the average travel times achieved by the algorithms satisfy: DSP > RkSP > DTA > A*SR.
The average travel time and the average travel distance of the vehicles are key indexes for evaluating a path planning strategy. Experiments show that, for most path planning algorithms, the average travel time and the average travel distance are strongly positively correlated in scenarios with few vehicles, but show no strong correlation in scenarios with high traffic density. FIG. 6 compares the average vehicle travel distance of the A*SR algorithm with that of the SP, DSP, RkSP and DTA algorithms in different vehicle scale scenarios. The SP algorithm plans a single shortest path for all vehicles according to spatial distance in the road network, so its average travel distance is the smallest regardless of the traffic flow density. As the traffic flow increases, the average travel distance of the DSP algorithm grows quickly, mainly because it considers only the current road network state during path planning, which causes the road network state to oscillate. The A*SR method takes into account the repulsion of a vehicle from its already-traveled trajectory and avoids re-selecting road sections that have been traveled once, thereby greatly reducing the average travel distance. In scenarios with low traffic density, the average travel distances of the algorithms are essentially equal. As the traffic flow density increases, the effectiveness of the algorithms in reducing the average travel distance finally satisfies: SP > A*SR > DTA > RkSP > DSP.
The experiments lead to the conclusion that, as the vehicle scale increases, the A*SR method is more effective than the other algorithms in reducing the average travel time and the average travel distance of the vehicles.
Third, the ability of A*SR to alleviate traffic congestion;
The principle of the A*SR method is to balance traffic flow across all road sections of the road network, improve the utilization of road resources, and reduce the number of congested road sections. In this embodiment, the number of congested road sections in the road network is counted every minute. FIG. 7 compares the A*SR algorithm with the SP, DSP, RkSP and DTA algorithms in relieving road network congestion in a scenario with a vehicle scale of 5000. As can be seen from FIG. 7, because the SP algorithm cannot dynamically adjust route selection, its number of congested road sections stays at a high level and declines very slowly. The DSP and DTA algorithms can re-plan paths and therefore perform better than the SP algorithm. Although the road section congestion rate of the A*SR method is higher than that of DTA in the early stage, its adaptive learning capability makes the congestion rate fall faster than that of DTA in the middle and late stages. Ranking all algorithms by their maximum number of congested road sections gives: SP > DSP > A*SR ≥ RkSP > DTA, but the ability of A*SR to reduce the number of congested road sections grows stronger over time.
Fourth, performance comparison under different navigation compliance probabilities;
In real life it is unrealistic to expect every vehicle to fully comply with the instructions of the navigation system. The probability that a vehicle complies with navigation can greatly affect the actual effect of an algorithm and is therefore an important factor that must be considered when designing a path planning strategy. The vehicle compliance rate in this embodiment means that, in each time period, each vehicle chooses with a certain probability to accept the route planned by the navigation system; otherwise, the vehicle continues along its existing route. In this embodiment, the average travel times of the algorithms at different vehicle compliance rates are calculated in a scenario with a vehicle scale of 5000, and the experimental results are shown in FIG. 8. FIG. 8 shows that, regardless of the vehicle compliance rate, the A*SR, DTA, DSP and RkSP algorithms all clearly reduce the average travel time, because vehicles that follow the navigation instructions always receive better routes and thereby free up road section resources for vehicles that do not follow them. In particular, when the compliance rate is low, the average travel time of the A*SR algorithm is significantly lower than that of the other algorithms. The A*SR method achieves this because the reinforcement learning algorithm continuously adjusts its strategy as the environmental state changes, so the strategy finally obtained suits the current compliance rate.
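The compliance mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `next_route` and its parameters are hypothetical and do not come from the patent, and each vehicle is assumed to draw independently once per time period.

```python
import random


def next_route(current_route, suggested_route, compliance_rate, rng=random):
    """Pick the route a vehicle follows in the coming time period.

    With probability `compliance_rate` the vehicle accepts the route
    suggested by the navigation system; otherwise it keeps its
    existing route. All names here are hypothetical.
    """
    if rng.random() < compliance_rate:
        return suggested_route
    return current_route


# Usage: estimate the share of vehicles that accept the suggestion.
rng = random.Random(0)
accepted = sum(
    next_route("old", "new", 0.3, rng) == "new" for _ in range(10000)
)
share = accepted / 10000
```

With a compliance rate of 1.0 every vehicle takes the suggested route, and with 0.0 no vehicle does, which matches the two extremes of the experiment in FIG. 8.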
It should be understood that parts of the specification not described in detail belong to the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (3)

1. A non-global information oriented urban road network path planning method is characterized by comprising the following steps:
step 1: preprocessing the road network information by adding virtual edges, introducing a multi-Agent system, and forming module systems each consisting of a plurality of Agents and centered on a road section of the road network, wherein each module system makes decisions independently;
the specific implementation process of preprocessing the road network information by adding virtual edges is as follows: for the urban road network G = (L, E), let the number of intersections in the urban road network be m and let l_x denote the x-th intersection, so that L = {l_1, l_2, …, l_m}, x = 1, 2, …, m; for any two adjacent intersections l_x and l_y, if intersection l_y can be reached from intersection l_x, then there is a road section (l_x, l_y) ∈ E, and if intersection l_x can also be reached from intersection l_y, then there is also a road section (l_y, l_x) ∈ E; to reflect the weight of the intersection nodes themselves, the road network information is preprocessed by adding virtual edges, which converts the node weights into edge weights; according to the actual structure of each intersection, the intersection node is expanded one-to-many, virtual edges are added between the expanded nodes according to the driving directions inside the intersection, and the weight of each added virtual edge represents the travel time of the corresponding turn inside the intersection; the road network model obtained by this preprocessing has two types of edges, real edges and virtual edges, where a real edge maps to a road section in the road section set E and a virtual edge maps to a driving direction inside an intersection;
step 2: intelligently learning and evaluating the road congestion Index by observing the state change of the road section in the area;
the congestion Index of each road section of the road network is calculated by adopting a modulation Q-learning algorithm and applying reinforcement learning according to the real-time acquired traffic flow information of each road section of the road network, the Index reflects the expectation of the traffic flow of the road section, and the larger the Index is, the smaller the expectation of the traffic flow is;
for any road section i, let NB_i denote the set of neighbor road sections of road section i, with the k-th neighbor road section of road section i denoted NB_i[k]; a modular Agent system is established by taking any road section as the center together with its set of neighbor road sections; for road section i, Agent_i denotes the Agent on that road section, and Agent_i, as the center, together with its neighbors
Figure FDA0002614771270000011
forms a module; inside the module, Agent_i forms a pairwise Agent Q reinforcement learning with each
Figure FDA0002614771270000012
respectively; the basic idea of reinforcement learning is that the Agent being trained continuously takes actions with the goal of maximizing the expected Q value and obtains a return value, uses the return value to evaluate the previous action and update its knowledge, and then moves to the next state, thereby learning the optimal strategies under different environmental states;
at time t, road sensors provide, for Agent_i and
Figure FDA0002614771270000013
the states
Figure FDA0002614771270000014
and at the same time the following are obtained for time t-1: for Agent_i and
Figure FDA0002614771270000021
the optimal actions selected
Figure FDA0002614771270000022
and the return value obtained
Figure FDA0002614771270000023
The specific implementation of the Modular Q-learning algorithm comprises the following sub-steps:
step 2.1: computing, for the joint state
Figure FDA0002614771270000024
the optimal response
Figure FDA0002614771270000025
Figure FDA0002614771270000026
wherein the state s_i of Agent_i is expressed as the total number of vehicles on road section i; the action a_i of Agent_i is represented by a number, and the function of a_i is to calculate the congestion index of road section i: Index(i) = a_i × time(i), where time(i) represents the travel time on road section i in the free-flow state; A_i represents the set of actions that Agent_i may take under all environmental states;
Figure FDA0002614771270000027
represents the Q value when Agent_i is in state
Figure FDA0002614771270000028
with action a_i, and
Figure FDA0002614771270000029
is in state
Figure FDA00026147712700000210
with action
Figure FDA00026147712700000211
;
Figure FDA00026147712700000212
represents the probability that, at time t-1 and in the joint state
Figure FDA00026147712700000213
,
Figure FDA00026147712700000214
takes the action
Figure FDA00026147712700000215
;
step 2.2: updating, at time t,
Figure FDA00026147712700000216
Figure FDA00026147712700000217
where α is the learning rate and γ is the discount factor;
step 2.3: calculating the optimal action of Agent_i at time t
Figure FDA00026147712700000218
Figure FDA00026147712700000219
step 2.4: according to the action taken by Agent_i
Figure FDA00026147712700000220
calculating the congestion index of road section i
Figure FDA00026147712700000221
Figure FDA00026147712700000222
the traffic flow of any road section i should be kept at the saturated flow SF_i; the closer the traffic flow at the next time is to SF_i, the larger the return value, and the larger the difference from SF_i, the smaller the return value;
the saturated flow SF is calculated in the following way:
SF = leng × lane × state × speed × fmy × F_SF / 100;
wherein leng represents the length of the road section, with value range (0, ∞); lane represents the number of lanes of the road section, with value range (0, ∞); state represents the flatness of the road section, where a larger value means a flatter road section, with value range (0, 1]; speed represents the ratio of the maximum speed limit of the road section to the maximum speed limit of the road network, with value range (0, 1]; fmy represents the familiarity with the road section, where a larger value means a more familiar road section, with value range (0, 1]; F_SF represents the saturated flow coefficient of the road network, which adjusts the calculation of SF and is set manually: if the number of vehicles in the whole road network is small, F_SF is set to a smaller value, and if the number of vehicles is large, F_SF is set to a larger value;
step 3: constructing a PST model according to the road network congestion index;
step 4: carrying out path selection in the PST model by using the A*R algorithm so that the traffic flow is uniformly distributed over the whole road network;
step 5: returning to execute step 2 after a preset time period has elapsed.
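As an illustration of step 2, the following Python sketch computes the saturated flow SF from the formula in claim 1 and pairs it with a return value and a Q-value update. Only `saturated_flow` follows a formula stated in the text. The shape of `reward` is an assumption that merely satisfies the stated property (largest when the flow equals SF), `q_update` is the textbook single-agent tabular Q-learning rule rather than the Modular Q-learning update of the claim, whose formulas are given only as figures, and `congestion_index` assumes the reading Index(i) = a_i × time(i). All function and parameter names are hypothetical.

```python
def saturated_flow(leng, lane, state, speed, fmy, f_sf):
    """SF = leng x lane x state x speed x fmy x F_SF / 100 (claim 1)."""
    return leng * lane * state * speed * fmy * f_sf / 100


def reward(flow, sf):
    """Hypothetical return value: 0 when flow equals SF, lower otherwise."""
    return -abs(flow - sf)


def q_update(q, state, action, r, next_state, actions, alpha=0.1, gamma=0.9):
    """Textbook tabular Q-learning update (not the patent's modular rule).

    q maps (state, action) pairs to values; missing entries count as 0.
    alpha is the learning rate and gamma the discount factor.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return q[(state, action)]


def congestion_index(action, free_flow_time):
    """Assumed reading of the claim: Index(i) = a_i x time(i)."""
    return action * free_flow_time


# Usage: one learning step for a single road section.
sf = saturated_flow(leng=500, lane=2, state=1.0, speed=0.8, fmy=0.5, f_sf=10)
q_table = {}
new_q = q_update(q_table, state=30, action=1.0, r=reward(38, sf),
                 next_state=38, actions=[0.5, 1.0, 1.5])
```

Here the state is the vehicle count on the road section and the action is the numeric multiplier a_i, as in the claim; a real implementation would also need the neighbor Agents and the joint-state terms that the figures define.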
2. The non-global information oriented urban road network path planning method according to claim 1, wherein in step 3 the PST model is constructed according to the road network congestion index as follows: the whole path planning time range T is divided into a plurality of time scales according to the time granularity Δ, the nodes obtained by the road network information preprocessing are mapped to the ordinate, the time scales are mapped to the abscissa, and edges are created between the nodes according to the conditions below;
the specific implementation comprises the following substeps:
step 3.1: for time t, with initial value 0, if at any intersection l_x the traffic light grants the right of way from intersection in-point x_u ∈ Y_x to intersection out-point x_v ∈ B_x, then:
create points (t, x_u) and (t + time(x_u, x_v), x_u), and create edge (t, x_u) → (t + time(x_u, x_v), x_u);
time(x_u, y_v) represents the travel time from node x_u to node y_v in the free-flow state; B_x represents the set of intersection out-points of intersection l_x, and Y_x represents the set of intersection in-points of intersection l_x; an intersection out-point is a node from which the intersection can be left, and an intersection in-point is a node through which the intersection can be entered;
step 3.2: for any node y_z ∈ From(x_v),
create point (t + time(x_u, x_v) + Index(x_v, y_z), y_z), and create edge (t + time(x_u, x_v), x_u) → (t + time(x_u, x_v) + Index(x_v, y_z), y_z);
From(x_v) represents the set of nodes that can be reached directly from intersection out-point x_v; Index(x_v, y_z) represents the congestion index from node x_v to node y_z;
step 3.3: add 1 to t, and return to step 3.1 if t is less than T;
step 3.4: for time t, with initial value 0, for any intersection in-point x_u ∈ Y_x of any intersection l_x, if a point (t, x_u) exists, find the smallest time s greater than t such that a point (s, x_u) exists, and create edge (t, x_u) → (s, x_u);
step 3.5: add 1 to t, and repeat step 3.4 if t is less than T;
step 3.6: traverse all edges that currently exist;
for any edge (t, x_u) → (s, y_v), set the edge weight of (t, x_u) → (s, y_v) to s - t.
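Steps 3.1 to 3.6 can be sketched as a small Python routine that builds the time-expanded edge set. This is a simplified reading, not the patent's implementation: the function name `build_pst` and the input structures `green`, `turn_time`, `successors` and `index` are hypothetical, time advances in unit steps, and waiting edges are added between all consecutive time stamps of a node without the t < T bound of step 3.5.

```python
from collections import defaultdict


def build_pst(T, green, turn_time, successors, index):
    """Build PST edges as {((t, node), (s, node2)): weight}.

    green(t)               -> iterable of (x_u, x_v) turns allowed at time t
    turn_time[(x_u, x_v)]  -> free-flow time of that turn
    successors[x_v]        -> nodes y_z directly reachable from out-point x_v
    index[(x_v, y_z)]      -> congestion index of that road section
    """
    edges = {}
    stamps = defaultdict(set)  # node -> times at which a point exists

    def add_edge(p, q):
        stamps[p[1]].add(p[0])
        stamps[q[1]].add(q[0])
        edges[(p, q)] = q[0] - p[0]  # step 3.6: weight is s - t

    for t in range(T):  # steps 3.1 to 3.3
        for x_u, x_v in green(t):
            mid = (t + turn_time[(x_u, x_v)], x_u)
            add_edge((t, x_u), mid)  # step 3.1: turn inside intersection
            for y_z in successors.get(x_v, ()):
                # step 3.2: drive the road section x_v -> y_z
                add_edge(mid, (mid[0] + index[(x_v, y_z)], y_z))

    for node, times in stamps.items():  # steps 3.4 and 3.5: waiting edges
        ordered = sorted(times)
        for t, s in zip(ordered, ordered[1:]):
            edges.setdefault(((t, node), (s, node)), s - t)
    return edges


# Usage: one turn that is green at t = 0 and t = 1.
pst = build_pst(
    T=2,
    green=lambda t: [("a_in", "a_out")],
    turn_time={("a_in", "a_out"): 1},
    successors={"a_out": ["b_in"]},
    index={("a_out", "b_in"): 3},
)
```

In this toy network the result contains two turn edges, two road section edges of weight 3, and one waiting edge at `b_in` between times 4 and 5.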
3. The non-global information oriented urban road network path planning method according to claim 1, wherein in step 4 the path selection is performed in the PST model by using the A*R algorithm, and the specific implementation process is as follows: for any vehicle f ∈ F, the set Passed_f of road sections that vehicle f has traveled is known, and for any road section i ∈ Passed_f the total time sum_i that the vehicle has spent on that road section is known; the set of road sections of vehicle f from the starting node to node x in the path planning is
Figure FDA0002614771270000041
let
Figure FDA0002614771270000042
represent the set of already-traveled road sections contained in the path of vehicle f from the starting node to node x; then
Figure FDA0002614771270000043
is:
Figure FDA0002614771270000044
for vehicle f, re-selecting a road section in
Figure FDA0002614771270000045
is a waste of road section resources, so to avoid this waste such a selection must be penalized, and the penalty function e(x) is defined as follows:
Figure FDA0002614771270000046
wherein F_A*R is the A*R algorithm coefficient;
assuming that the parent node of node x is y = fast(x), the recursion of the e(x) function is:
Figure FDA0002614771270000047
wherein sum(y, x) represents the total time that the vehicle has spent on road section (l_y, l_x);
the evaluation function f(x) is:
f(x) = g(x) + h(x) + e(x);
wherein f(x) is the evaluation function, which estimates the shortest time of the shortest path passing through node x; g(x) is the shortest time from the starting node to node x; and h(x) is a heuristic function estimating the shortest time for node x to reach the destination node, obtained by dividing the Euclidean distance from the current node to the destination node by the maximum allowable speed of the road network.
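The evaluation function of claim 3 can be illustrated with a short Python sketch. The heuristic h(x) follows the text directly (Euclidean distance divided by the maximum allowed speed). The penalty, however, is only an assumed form: the patent defines e(x) and its recursion in figures that are not reproduced here, so `penalty` below simply adds the coefficient times the time already spent on a re-selected road section. All names are hypothetical.

```python
import math


def heuristic(node_xy, goal_xy, v_max):
    """h(x): Euclidean distance to the destination divided by v_max."""
    return math.dist(node_xy, goal_xy) / v_max


def penalty(parent_penalty, section, time_spent, coeff):
    """Assumed recursive e(x); the patent's exact formula is in a figure.

    Adds coeff * sum(y, x) when road section (y, x) was already
    traveled, and leaves the parent's penalty unchanged otherwise.
    """
    return parent_penalty + coeff * time_spent.get(section, 0.0)


def evaluate(g, h, e):
    """f(x) = g(x) + h(x) + e(x)."""
    return g + h + e


# Usage: score a node reached over a road section traveled once before.
h = heuristic((0.0, 0.0), (3.0, 4.0), v_max=10.0)
e = penalty(1.0, ("y", "x"), {("y", "x"): 20.0}, coeff=0.1)
f = evaluate(g=12.0, h=h, e=e)
```

Because e(x) only ever adds to f(x), a node reached through an already-traveled road section is expanded later by the search than an otherwise equal node reached through a new road section, which is the repulsion effect the description attributes to the method.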
CN201810677156.1A 2018-06-27 2018-06-27 Non-global information oriented urban road network path planning method Active CN108847037B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810677156.1A CN108847037B (en) 2018-06-27 2018-06-27 Non-global information oriented urban road network path planning method

Publications (2)

Publication Number Publication Date
CN108847037A CN108847037A (en) 2018-11-20
CN108847037B true CN108847037B (en) 2020-11-17

Family

ID=64202695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810677156.1A Active CN108847037B (en) 2018-06-27 2018-06-27 Non-global information oriented urban road network path planning method

Country Status (1)

Country Link
CN (1) CN108847037B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101789182B (en) * 2010-02-05 2012-10-10 北京工业大学 Traffic signal control system and method based on parallel simulation technique
CN102905307B (en) * 2012-09-12 2014-12-31 北京邮电大学 System for realizing joint optimization of neighbor cell list and load balance
CN103337189B (en) * 2013-06-08 2015-07-29 北京航空航天大学 A kind of vehicle route guidance method dynamically divided based on section
CN106096756A (en) * 2016-05-31 2016-11-09 武汉大学 A kind of urban road network dynamic realtime Multiple Intersections routing resource

Also Published As

Publication number Publication date
CN108847037A (en) 2018-11-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant