CN101982843B - Method for selecting state vector in nonparametric regression short-time traffic flow prediction - Google Patents

Info

Publication number
CN101982843B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2010105141116A
Other languages
Chinese (zh)
Other versions
CN101982843A (en)
Inventor
郑亮
马寿峰
贾宁
朱宁
王鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for selecting the state vector in nonparametric regression short-term traffic flow prediction and relates to the technical field of short-term traffic flow prediction. Applied under four conditions (peak periods, flat periods, low-peak periods and the whole day), the method improves prediction accuracy, stability, speed and portability and shortens the running time, which verifies the effectiveness and necessity of the method.

Description

Method for selecting state vector in nonparametric regression short-time traffic flow prediction
Technical Field
The invention relates to the technical field of short-term traffic flow prediction, in particular to a method for selecting a state vector in nonparametric regression short-term traffic flow prediction.
Background
At present, many researchers in China and abroad apply nonparametric regression to short-term traffic flow prediction and have improved the method as practical problems require. In 1991, Davis and Nihan applied nonparametric regression to traffic prediction; although this avoids problems such as model selection and parameter setting, the method needs a large, representative historical database and takes a long time to run. In 1995, Smith applied nonparametric regression to single-point short-term traffic flow prediction; the experimental results were better than those of the historical average and neural network methods, but the search was still too slow. To address the slow search, Oswald et al. proposed a fuzzy nearest neighbor method built on a KD tree, improving the historical data structure and the neighbor search of nonparametric regression and thereby the efficiency of the method. Zhangxiaoli proposed a K-nearest-neighbor nonparametric short-term traffic flow prediction method based on a balanced binary tree, in which a case database is built with a clustering method and a balanced binary tree structure, improving the prediction accuracy while meeting real-time requirements. These improvements mainly concern the storage scheme of the historical database and the neighbor search method.
However, the state vector that describes the causal relationship between the flow of the upstream road sections and that of the road section to be detected is mainly selected by principal component analysis, the correlation coefficient method, the autocorrelation coefficient and similar methods. These all work from a statistical point of view and take the factors most correlated with the flow of the road section to be detected as the components of the state vector; whether the selected state vector actually improves the prediction has not been studied. Note that even if the running time is shortened by improving the storage scheme of the historical database and the neighbor search method, the final prediction will still be unsatisfactory if the selected state vector cannot describe the causal flow relationship between the upstream road sections and the road section to be detected.
Disclosure of Invention
In order to solve the problems, improve the prediction precision, shorten the running time and meet the requirements in practical application, the invention provides a method for selecting a state vector in non-parametric regression short-time traffic flow prediction, which comprises the following steps:
(1) judging whether an upstream road section related to the road section to be detected is in the upstream road section set according to a first preset criterion, and if so, executing the step (2); if not, the upstream road segment is not in the upstream road segment set;
(2) acquiring, from preset data, the average speed of the traffic flow within a range L around the road section to be detected;
(3) obtaining historical retroactive maximum cycle number m according to the average speed and the prediction cycle;
(4) acquiring an initial state vector according to the upstream road section set and the historical retroactive maximum cycle number m;
(5) determining the encoding length of the particle according to the dimension M of the initial state vector;
(6) setting the number of particles as Z, and randomly generating Z particles;
(7) defining a fitness function, and acquiring the fitness of Z particles according to the fitness function;
(8) acquiring an individual extreme value and a global extreme value of the particles according to the fitness of the Z particles;
(9) respectively performing cross operation on the codes of the Z particles, the codes of the individual extremum and the global extremum, and performing mutation operation according to a preset probability to obtain global optimal particles;
(10) judging whether the preset times is reached, if so, outputting the global optimal particles; if not, re-executing the step (7);
(11) performing an element-wise (dot) product of the global optimal particle and the initial state vector to obtain the state vector.
The first preset criterion in the step (1) is specifically:
$$\sum_{i,j} \mathrm{dis}\left(p_i^{upstream},\ p_j^{intersection}\right) \le L$$
where $p_i^{upstream}$ denotes the coordinate position of the midpoint of the i-th road section among the upstream road sections, $p_j^{intersection}$ denotes the coordinate position of the center of the j-th intersection of the upstream road section, and $\mathrm{dis}(p_i^{upstream}, p_j^{intersection})$ denotes the distance between these two positions.
The history tracing maximum cycle number m in the step (3) is specifically as follows:
[Formula image for m not reproduced in the source text.]
Here C denotes the prediction period; the other quantity in the formula is the average speed.
The dimension M of the initial state vector in step (5) is specifically:
M = (s + 1)(m + 1), where s denotes the number of elements in the upstream road section number set.
The fitness function in the step (7) is specifically as follows:
$$F(EV, ARE, PER, EC) = \lambda_1 EV + \lambda_2 ARE + \lambda_3 / PER + \lambda_4 / EC$$
where EV denotes the variance of the prediction error, ARE denotes the average relative error, PER denotes the percentage of predictions whose relative error lies in the interval $[0, \alpha]$, EC denotes the equality coefficient, $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ denote the weights of EV, ARE, PER and EC, respectively, and $\alpha$ denotes the bound on the prediction relative error.
The step (7) of obtaining the fitness of the Z particles according to the fitness function specifically includes:
defining a current prediction period flow state mode;
performing an element-wise (dot) product of the particle code and the current prediction period flow state mode to obtain a current flow state mode;
performing an element-wise (dot) product of the particle code and the historical database flow state mode to obtain a current historical database flow state mode;
predicting the flow of the cycle following the current prediction cycle by K-nearest-neighbor matching and equal-weight prediction from the current flow state mode and the current historical database flow state mode, obtaining a first prediction error, and taking the first prediction error as the fitness of the current particle.
The technical scheme provided by the invention has the beneficial effects that:
The embodiment of the invention provides a method for selecting the state vector in nonparametric regression short-term traffic flow prediction. Applied under four conditions (peak periods, flat periods, low-peak periods and the whole day), the method improves prediction accuracy, stability, speed and portability and shortens the running time, which verifies the effectiveness and necessity of the method.
Drawings
FIG. 1 is a flow chart of non-parametric regression provided by the present invention;
FIG. 2 is a schematic diagram of a distance method according to the present invention;
FIG. 3 is a flow chart of a method for selecting a state vector in non-parametric regression short-term traffic flow prediction according to the present invention;
FIG. 4 is a diagram illustrating the fitness of Z particles obtained according to the fitness function F according to the present invention;
FIG. 5 is a comparison graph of the peak-period prediction results provided by the present invention;
FIG. 6 is a comparison graph of the flat-period prediction results provided by the present invention;
FIG. 7 is a comparison graph of the low-peak-period prediction results provided by the present invention;
FIG. 8 is a comparison graph of the all-day prediction results provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
In order to solve the problems, improve the prediction precision, shorten the running time and meet the requirements in practical application, the embodiment of the invention provides a method for selecting a state vector in non-parametric regression short-time traffic flow prediction.
Referring to fig. 1, nonparametric regression is a data-driven heuristic prediction mechanism that predicts future values by searching a historical database for data similar to the current observations. It generally has five components: selection of historical data, generation of the sample database, definition of data similarity, K-nearest-neighbor matching, and the prediction method. When nonparametric regression is used for short-term traffic flow prediction, a historical database is constructed first. If traffic flow data are stored indiscriminately, the database grows unmanageably large and matching suffers from the curse of dimensionality. The selected traffic flow data should therefore be the complete and typical flows, or combinations of flows, most closely related to the flow of the road section to be detected. The sample database is the core of nonparametric regression: its structure (both logical and physical) and the time and space efficiency of the search algorithm are decisive for the performance of nonparametric regression. The embodiment of the invention therefore focuses on how to choose the organization of the sample database (i.e., the state vector) so that it describes the causal flow relationship between the upstream road sections and the road section to be detected while saving storage space and speeding up the search. Once the sample database has been generated, data similarity can be defined and K-nearest-neighbor matching and prediction can be carried out.
Once the elements of the nonparametric regression model have been set, the K nearest neighbors of the current real-time observation can be found in the historical database, and the traffic flow at the next moment can then be predicted. Comparing the current observation with the historical database requires a standard of comparison, and the state vector is the description of that standard. Factors such as road occupancy, driving speed and weather conditions affect the flow of the road section at the next moment. Using the traffic flow data of adjacent road sections raises the question of how many time intervals and how many adjacent road sections to include, even for the nearest time intervals. Whether the state vector is reasonable directly affects the prediction accuracy. At present there is no unified standard for selecting the state vector. Including as many factors as possible does not necessarily improve the prediction accuracy and only lengthens the running time; conversely, if the selected state vector cannot describe the main causal relationship between upstream and downstream flows, a good prediction cannot be achieved. The number of neighbors K in K-nearest-neighbor matching is also important: a K that is too large or too small degrades the prediction accuracy. If K equals the number of patterns in the historical database, the nonparametric regression is inaccurate, because an equal-weight prediction then reduces to the average over the entire history regardless of the current state. If K is too small, the influence of chance factors increases, which also harms the prediction accuracy.
In addition, during abnormal periods the flow of some road sections drops markedly while the flow of others rises sharply. K should then take a smaller value; a larger value would dilute this information and increase the prediction error. Thus, for road sections with abnormally high or low flow, K may be set small, or a prediction interval may be used. However, there is no uniform rule for determining K in short-term traffic flow prediction, so the optimal K has to be chosen by analyzing the curve of prediction error against K for the different series of sample data used in offline testing. Because the causal flow relationship between the upstream road sections and the road section to be detected differs between periods of the day, online rolling prediction proceeds in two stages: first, the method provided by the embodiment of the invention is applied offline to historical data of the corresponding period to obtain a state vector describing the causal flow relationship; then this state vector is used for real-time online prediction for that period on the road section to be detected. In both the flat period and the all-day condition, low flows, even zero flows, make the average relative error of the prediction high. To improve the practicability and operability of the prediction, two different K values can be used to construct a prediction interval; a prediction is regarded as acceptable as long as the value falls within the corresponding prediction interval, which improves the average relative error of the nonparametric regression prediction to some extent. Suppose that the real-time traffic flow data mode is:
$$\{V_i(t-m),\ V_i(t-m+1),\ \ldots,\ V_i(t)\},\quad i \in U + \{f\}$$
where U is the set of numbers of the relevant upstream road sections, with s elements; f is the number of the road section to be detected and {f} is the corresponding set; m is the maximum number of cycles of historical retrospection; t is the prediction time; and i is the road section index. The traffic flow data mode of the historical database is $\{V_{ih}(t-m),\ V_{ih}(t-m+1),\ \ldots,\ V_{ih}(t)\}$, $i \in U + \{f\}$. Distance measures how well the real-time data match the sample data; different distance criteria, however, find different K nearest neighbors and thus affect the accuracy of the predicted value. The distance measurement criterion adopted by the embodiment of the invention is:
$$D = \max_{i \in U + \{f\},\ l \in \{0, 1, \ldots, m\}} \left| V_i(t-m+l) - V_{ih}(t-m+l) \right|$$
where the criterion D operates on the (s+1)(m+1) components of the real-time data and the sample data and takes the largest component-wise distance as the matching measure. The criterion thus takes the distance information of all (s+1)(m+1) dimensions into account and corresponds to an (s+1)(m+1)-dimensional hypercube whose side length is determined by the dimension with the largest distance; the smaller the value of D, the higher the similarity of the match.
The K nearest historical state vectors are selected from the historical database by the distance measurement criterion. Suppose the K neighbors found have distances $D_k$ ($k = 1, 2, \ldots, K$) from the real-time data, and that the flow of the road section to be detected in the next prediction period corresponding to these neighbors is $V_{kh}(t+1)$.
The prediction method is mainly either weighted prediction or equal-weight prediction. A traffic system is both deterministic and random. The determinism shows in proximity: when state vectors are close, the predicted value and the true value are also close with some certainty. Because of the randomness, however, there is no rule that the closer the state vectors are, the closer the predicted value is to the true value. As the saying goes, no two leaves in the world are identical: when 95% of the characteristics of two leaves are almost the same, the remaining 5% may well differ markedly. The embodiment of the invention therefore uses equal-weight prediction, expressed as:
$$V(t+1) = \frac{1}{K} \sum_{k=1}^{K} V_{kh}(t+1)$$
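The distance criterion D and the equal-weight prediction can be sketched together as follows. This is a minimal illustration only: the function names, the flat-list layout of a pattern and the sample flows are assumptions made for the example, not part of the patent.

```python
from typing import List, Sequence, Tuple

# A historical record pairs a flow pattern (the state vector components)
# with the flow observed in the following period, V_kh(t+1).
Record = Tuple[Sequence[float], float]


def chebyshev_distance(current: Sequence[float], historical: Sequence[float]) -> float:
    """D = largest absolute component-wise difference between two patterns."""
    return max(abs(c - h) for c, h in zip(current, historical))


def knn_equal_weight_predict(current: Sequence[float], history: List[Record], k: int) -> float:
    """Average the next-period flows of the k historical patterns closest to `current`."""
    ranked = sorted(history, key=lambda rec: chebyshev_distance(current, rec[0]))
    neighbours = ranked[:k]
    return sum(next_flow for _, next_flow in neighbours) / len(neighbours)


# Hypothetical data: three-component patterns and their next-period flows.
history = [
    ([100.0, 110.0, 120.0], 130.0),
    ([98.0, 112.0, 119.0], 126.0),
    ([60.0, 65.0, 70.0], 72.0),
    ([140.0, 150.0, 155.0], 160.0),
]
current = [101.0, 111.0, 118.0]
prediction = knn_equal_weight_predict(current, history, k=2)
print(prediction)  # 128.0
```

The two nearest patterns have distances 2 and 3, so the prediction is the plain average of 130 and 126.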
Referring to fig. 2, the number s of relevant upstream road sections and the maximum number m of cycles of historical retrospection can be roughly determined by the distance method. In the figure, the black diamond marks the road section to be detected, black dots mark intersection centers, and black squares mark upstream road sections of the road section to be detected. If empirical analysis shows that the upstream road sections whose urban distance to the road section to be detected is within L significantly influence its flow, these road sections form the upstream road section set of the state vector. The number s of upstream road sections and the range L are set according to the specific application, and the embodiment of the invention does not limit them. A relatively rough state vector is obtained from the road network characteristics and the distance method; the first prediction error is then optimized with a PSO (Particle Swarm Optimization)-GA (Genetic Algorithm) hybrid intelligent algorithm and test data, yielding a state vector that describes the causal flow relationship between the upstream road sections and the road section to be detected. Finally, the state vector found can be used for real-time online flow prediction. Referring to fig. 3, the embodiment of the invention is implemented as follows:
101: judging whether an upstream road section related to the road section to be detected is in the upstream road section set according to a first preset criterion, and if so, executing the step 102; if not, the upstream road section is not in the upstream road section set;
wherein the first preset criterion is
$$\sum_{i,j} \mathrm{dis}\left(p_i^{upstream},\ p_j^{intersection}\right) \le L$$
where $p_i^{upstream}$ denotes the coordinate position of the midpoint of the i-th road section among the upstream road sections, $p_j^{intersection}$ denotes the coordinate position of the center of the j-th intersection of the upstream road section, and $\mathrm{dis}(p_i^{upstream}, p_j^{intersection})$ denotes the distance between these two positions. When the first preset criterion is met, the upstream road section belongs to the upstream road section set, i.e., $i \in U$; when it is not met, the upstream road section is not in the upstream road section set, i.e., $i \notin U$.
102: Acquiring, from preset data, the average speed of the traffic flow within a range L around the road section to be detected;
The preset data are the traffic flows of a road section in periods such as the morning peak, noon peak and evening peak; the average speed of the traffic flow within the range L around the road section to be detected is obtained by statistical analysis of these data.
103: obtaining historical retroactive maximum cycle number m according to the average speed and the prediction cycle;
[Formula image for m not reproduced in the source text.]
Here C denotes the prediction period; the other quantity in the formula is the average speed.
The prediction period may be set to 5 min, 10 min, 15 min, etc., according to the specific application; the embodiment of the invention does not limit it and is described taking 5 min as an example.
104: acquiring an initial state vector according to the upstream road section set U and the maximum cycle number m of historical retrospection;
$$\{V_i(t-m),\ V_i(t-m+1),\ \ldots,\ V_i(t)\},\quad i \in U + \{f\}$$
105: determining the encoding length of the particle according to the dimension M of the initial state vector;
The dimension of the initial state vector is M = (s + 1)(m + 1), where s denotes the number of elements in the upstream road section number set.
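As an illustration of how the initial state vector and its dimension M = (s + 1)(m + 1) fit together, the sketch below concatenates the last m + 1 flows of each upstream road section and of the road section to be detected. The dictionary layout, the section identifiers and the flow values are hypothetical.

```python
from typing import Dict, List, Sequence


def build_initial_state_vector(
    flows: Dict[str, Sequence[float]],
    upstream_ids: Sequence[str],
    target_id: str,
    t: int,
    m: int,
) -> List[float]:
    """Concatenate V_i(t-m), ..., V_i(t) for every i in U + {f}."""
    if t - m < 0:
        raise ValueError("not enough history to look back m cycles")
    vector: List[float] = []
    # Upstream sections first, then the section to be detected, as in CM.
    for section in list(upstream_ids) + [target_id]:
        vector.extend(flows[section][t - m : t + 1])
    return vector


# Hypothetical flows per prediction cycle for two upstream sections and f.
flows = {
    "u1": [10, 12, 14, 16, 18],
    "u2": [20, 22, 24, 26, 28],
    "f": [30, 33, 36, 39, 42],
}
s, m = 2, 2
vec = build_initial_state_vector(flows, ["u1", "u2"], "f", t=4, m=m)
print(len(vec), (s + 1) * (m + 1))  # 9 9
print(vec)  # [14, 16, 18, 24, 26, 28, 36, 39, 42]
```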
106: setting the number of particles as Z, and randomly generating Z particles;
where the dimension of each particle is the dimension M of the initial state vector.
107: defining a fitness function F, and acquiring the fitness of Z particles according to the fitness function F;
$$F(EV, ARE, PER, EC) = \lambda_1 EV + \lambda_2 ARE + \lambda_3 / PER + \lambda_4 / EC$$
where EV denotes the variance of the prediction error and reflects the robustness of the prediction algorithm; ARE denotes the average relative error and reflects the overall performance of the prediction algorithm; PER denotes the percentage of predictions whose relative error lies in the interval $[0, \alpha]$ and reflects the individual performance of the prediction; EC denotes the equality coefficient and reflects the quality of the overall prediction; $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ are the weights of EV, ARE, PER and EC, respectively; and $\alpha$ is the bound on the prediction relative error. The values of $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ are set according to the application; adjusting them changes the proportions of EV, ARE, PER and EC in the fitness function F. The value of $\alpha$ is usually 20% or 30%.
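A possible way to compute F from a series of actual and predicted flows is sketched below. The text does not give formulas for the four indicators, so the ones used here are assumptions: EV is taken as the population variance of the errors, PER as a fraction in (0, 1], and EC as the equality coefficient commonly written as 1 minus the error norm divided by the sum of the norms of the actual and predicted series. Actual flows are assumed to be strictly positive.

```python
import math
from typing import Sequence, Tuple


def fitness(
    actual: Sequence[float],
    predicted: Sequence[float],
    weights: Tuple[float, float, float, float] = (1.0, 1.0, 1.0, 1.0),
    alpha: float = 0.2,
) -> float:
    """F = l1*EV + l2*ARE + l3/PER + l4/EC; smaller is better."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mean_error = sum(errors) / n
    ev = sum((e - mean_error) ** 2 for e in errors) / n       # variance of the error
    rel = [abs(e) / a for e, a in zip(errors, actual)]        # assumes actual > 0
    are = sum(rel) / n                                        # average relative error
    per = sum(1 for r in rel if r <= alpha) / n               # share within [0, alpha]
    norm_sum = math.sqrt(sum(a * a for a in actual)) + math.sqrt(sum(p * p for p in predicted))
    ec = 1.0 - math.sqrt(sum(e * e for e in errors)) / norm_sum  # assumed EC formula
    if per == 0.0 or ec <= 0.0:
        return math.inf  # guard against division by zero for very poor predictions
    l1, l2, l3, l4 = weights
    return l1 * ev + l2 * are + l3 / per + l4 / ec


actual = [100.0, 200.0, 300.0, 400.0]
good = [105.0, 190.0, 310.0, 395.0]
bad = [150.0, 120.0, 380.0, 300.0]
print(fitness(actual, good) < fitness(actual, bad))  # True
```

With unit weights a perfect prediction gives F = 2, since EV and ARE vanish while PER and EC both equal 1.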
The fitness function F represents the fitness of a particle: the smaller its value, the fitter the particle and the more likely its good genes are passed to the next generation. In each iteration, every particle adjusts itself by tracking two extreme values so that it adapts better and better to its environment: the individual extremum, which is the best solution found so far by the particle itself, and the global extremum, which is the best solution found so far by the whole particle swarm. Referring to fig. 4, this step comprises the following sub-steps:
1. defining a current prediction period flow state mode CM;
$$CM = [V_s(t-m)\ \ V_s(t-m+1)\ \cdots\ V_s(t)\ \cdots\ V_f(t-m)\ \ V_f(t-m+1)\ \cdots\ V_f(t)]_{1 \times M}$$
2. performing an element-wise (dot) product of the particle code and the current prediction period flow state mode to obtain the current flow state mode $CM^*$;
The embodiment of the invention is described taking binary coding as an example. If the binary code of a particle is $[1\ 0\ \cdots\ 1\ \cdots\ 1\ 0\ \cdots\ 0]_{1 \times M}$, multiplying the particle element by element with CM gives the current flow state mode: $CM^* = [V_s(t-m)\ \ 0\ \cdots\ V_s(t)\ \cdots\ V_f(t-m)\ \ 0\ \cdots\ 0]_{1 \times M}$
3. performing an element-wise (dot) product of the particle code and the historical database flow state mode HM to obtain the current historical database flow state mode $HM^*$;
$$HM = \{V_{ih}(t-m),\ V_{ih}(t-m+1),\ \ldots,\ V_{ih}(t)\}_{H \times M},\quad i \in U + \{f\}$$
$$HM^* = \begin{bmatrix} V_{1s}(t-m) & 0 & \cdots & V_{1s}(t) & \cdots & V_{1f}(t-m) & 0 & \cdots & 0 \\ V_{2s}(t-m) & 0 & \cdots & V_{2s}(t) & \cdots & V_{2f}(t-m) & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots & & \vdots & \vdots & & \vdots \\ V_{Hs}(t-m) & 0 & \cdots & V_{Hs}(t) & \cdots & V_{Hf}(t-m) & 0 & \cdots & 0 \end{bmatrix}_{H \times M}$$
where H is the number of flow state modes in the historical database and $h \in [1, H]$. Multiplying the particle element by element with each state mode in HM gives the current historical database flow state mode $HM^*$. The historical database must be large enough and representative, i.e., it must contain the various trends and typical patterns of traffic state change, so that a similar historical data mode can be found for the real-time data mode currently acquired.
4. predicting the flow of the cycle following the current prediction cycle by K-nearest-neighbor matching and equal-weight prediction from the current flow state mode $CM^*$ and the current historical database flow state mode $HM^*$, obtaining a first prediction error, and taking the first prediction error as the fitness of the current particle.
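Sub-steps 1 to 4 can be combined into one evaluation routine, sketched below under stated assumptions: the first prediction error is taken to be the mean relative error over a small test set (the text does not fix this choice), the maximum-component distance D is used for matching, and all names and flow values are hypothetical.

```python
from typing import List, Sequence, Tuple

Record = Tuple[Sequence[float], float]  # (flow pattern, next-period flow)


def apply_mask(mask: Sequence[int], pattern: Sequence[float]) -> List[float]:
    """Element-wise product of a binary particle code with a flow pattern."""
    return [bit * value for bit, value in zip(mask, pattern)]


def particle_fitness(mask: Sequence[int], history: List[Record],
                     test_cases: List[Record], k: int) -> float:
    """Mean relative error of K-neighbor equal-weight prediction under `mask`."""
    masked_history = [(apply_mask(mask, pat), nxt) for pat, nxt in history]  # HM*
    total = 0.0
    for pattern, actual_next in test_cases:
        cm_star = apply_mask(mask, pattern)                                  # CM*
        ranked = sorted(
            masked_history,
            key=lambda rec: max(abs(c - h) for c, h in zip(cm_star, rec[0])),
        )
        predicted = sum(nxt for _, nxt in ranked[:k]) / k
        total += abs(predicted - actual_next) / actual_next
    return total / len(test_cases)


# Hypothetical data: component 0 drives the next flow, component 1 is noise.
history = [
    ([100.0, 10.0], 110.0),
    ([100.0, 300.0], 112.0),
    ([200.0, 20.0], 220.0),
    ([200.0, 310.0], 218.0),
]
test_cases = [([101.0, 305.0], 111.0), ([199.0, 15.0], 219.0)]

print(particle_fitness([1, 0], history, test_cases, k=2))  # 0.0
print(particle_fitness([1, 1], history, test_cases, k=2) > 0.0)  # True
```

In this toy case, masking out the noisy component lets both test patterns match the right neighbors, which is the effect the state vector selection aims for.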
108: acquiring individual extreme values and global extreme values of the particles according to the fitness of the Z particles;
The smallest fitness value that each particle has attained is taken as that particle's individual extremum, and the smallest of the Z individual extrema is taken as the global extremum.
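The bookkeeping for the two extrema might look like the sketch below, where smaller fitness is better. The function name and the list-based storage of codes and fitness values are assumptions for illustration.

```python
from typing import List, Tuple


def update_extremes(
    particles: List[List[int]],
    fitnesses: List[float],
    pbest: List[List[int]],
    pbest_fit: List[float],
) -> Tuple[List[int], float]:
    """Update each particle's individual extremum in place and return the global one."""
    for z, (particle, fit) in enumerate(zip(particles, fitnesses)):
        if fit < pbest_fit[z]:            # strictly better than the particle's own record
            pbest[z] = list(particle)
            pbest_fit[z] = fit
    g = min(range(len(pbest)), key=lambda z: pbest_fit[z])
    return list(pbest[g]), pbest_fit[g]


# Hypothetical swarm of Z = 3 particles with M = 3 bits each.
particles = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
fitnesses = [0.30, 0.12, 0.25]
pbest = [[1, 1, 1], [0, 0, 1], [1, 1, 0]]
pbest_fit = [0.20, 0.40, 0.25]

gbest, gbest_fit = update_extremes(particles, fitnesses, pbest, pbest_fit)
print(gbest, gbest_fit)  # [0, 1, 1] 0.12
```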
109: respectively performing cross operation on the codes of the Z particles and the codes corresponding to the individual extreme values and the global extreme values, and performing mutation operation according to a preset probability to obtain global optimal particles;
Specifically, a single-point crossover operator is defined, and the code of each of the Z particles is crossed with the code corresponding to its individual extremum and with the code corresponding to the global extremum. Crossing with the code of the individual extremum lets each particle inherit part of its own best genes; crossing with the code of the global extremum lets each particle inherit part of the best genes of the particle swarm. A mutation operator is also defined: the children produced by recombining the codes of two parents may mutate, and they do so with a preset probability. In a specific implementation, a child is first selected at random from the group of children, and the value of one bit of its code is changed at random with the preset probability. As in nature, mutation in a GA is rare, and the preset probability is usually between 0.001 and 0.01; mutation thus provides an opportunity for new children to arise.
The particle coding mode is various, binary coding, real number coding and the like can be adopted, the binary coding is preferably adopted in the embodiment of the invention, and 0 represents the state vector component and has no obvious influence on the prediction result; and 1 represents that the state vector component has obvious influence on the prediction result, and the optimal binary coding individual is obtained after the iteration of a PSO-GA mixed intelligent algorithm. Wherein the number of digits of the parent individual is defined according to the dimension of the initial state vector, a cross point p of the parent individual is randomly generated, the range of the cross point p is [1, M-1], and the high p bits of the parent individual 1 and the high p bits of the parent individual 2 are exchanged at the cross point p. For example: dimension M of the initial state vector is equal to 9 and parent individual 1 can be defined as 101011101; the parent individual 2 is 010101010; the range of the intersection p is [1, 8], and when the intersection position p is 5, the 5-high bit of the parent individual 1 and the 5-high bit of the parent individual 2 are exchanged at the intersection 5, and two children are generated after intersection, which are: subjects 1: 010101101, respectively; subjects 2: 101011010. for binary coded children, mutation means that the value at a certain bit flips. For each sub-individual, the value change encoded on a particular bit is random, for example: the children before mutation were: 010101101, when the fourth bit is mutated, the mutated offspring are: 010001101. in order to prevent convergence to the local optimal solution, the preset probability needs to be linearly increased, a specific value of the preset probability is set according to a specific application condition in practical application, and the embodiment of the present invention is not limited in specific implementation.
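The worked example above (M = 9, crossover point p = 5, mutation of the fourth bit) can be reproduced with a short sketch. The function names `single_point_crossover`, `flip_bit` and `mutate` are hypothetical, and bit positions are assumed to be counted from the left, which is the reading that matches the example.

```python
import random
from typing import Tuple


def single_point_crossover(parent1: str, parent2: str, p: int) -> Tuple[str, str]:
    """Exchange the high (leftmost) p bits of two equal-length binary codes."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    if not 1 <= p <= len(parent1) - 1:
        raise ValueError("crossover point must lie in [1, M-1]")
    child1 = parent2[:p] + parent1[p:]
    child2 = parent1[:p] + parent2[p:]
    return child1, child2


def flip_bit(code: str, position: int) -> str:
    """Flip the bit at a 1-based position counted from the left."""
    i = position - 1
    flipped = "1" if code[i] == "0" else "0"
    return code[:i] + flipped + code[i + 1:]


def mutate(code: str, probability: float, rng: random.Random) -> str:
    """With the given probability, flip one randomly chosen bit."""
    if rng.random() < probability:
        return flip_bit(code, rng.randint(1, len(code)))
    return code


child1, child2 = single_point_crossover("101011101", "010101010", 5)
print(child1, child2)       # 010101101 101011010
print(flip_bit(child1, 4))  # 010001101
```

`mutate` wraps `flip_bit` with the preset probability; a probability of 0 leaves the code unchanged and a probability of 1 always flips exactly one bit.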
110: judging whether the preset number of iterations has been reached; if so, outputting the globally optimal particle; if not, re-executing step 107.
The preset number of iterations is set according to the situation in practical application and is not limited by the embodiment of the invention; it is generally about 2000.
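The iteration formed by steps 107 to 110 can be sketched as a loop. The sketch below rests on several assumptions that the text does not fix: after each crossover only the child carrying the high bits of the better code is kept, the mutation probability rises linearly from `pm_start` to `pm_end`, and a toy fitness (Hamming distance to a hypothetical target mask) stands in for the prediction-error fitness. All names are illustrative.

```python
import random


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover; keeps the child with b's high bits (an assumption)."""
    p = rng.randint(1, len(a) - 1)
    return b[:p] + a[p:]


def mutate(code: str, prob: float, rng: random.Random) -> str:
    """Flip one randomly chosen bit with probability prob."""
    if rng.random() >= prob:
        return code
    i = rng.randrange(len(code))
    return code[:i] + ("1" if code[i] == "0" else "0") + code[i + 1:]


def pso_ga(fitness, m_bits, z=20, iterations=2000,
           pm_start=0.001, pm_end=0.01, seed=0):
    """Return (best code, best fitness) after a fixed number of iterations."""
    rng = random.Random(seed)
    particles = ["".join(rng.choice("01") for _ in range(m_bits)) for _ in range(z)]
    pbest = [(code, fitness(code)) for code in particles]
    gbest = min(pbest, key=lambda t: t[1])
    for it in range(iterations):
        # Linearly increase the mutation probability over the run.
        pm = pm_start + (pm_end - pm_start) * it / max(iterations - 1, 1)
        for k in range(z):
            code = crossover(particles[k], pbest[k][0], rng)  # inherit own best
            code = crossover(code, gbest[0], rng)             # inherit swarm best
            code = mutate(code, pm, rng)
            particles[k] = code
            fit = fitness(code)
            if fit < pbest[k][1]:
                pbest[k] = (code, fit)
        gbest = min(pbest, key=lambda t: t[1])
    return gbest


TARGET = "101100110"  # hypothetical mask, used only to exercise the loop


def toy_fitness(code: str) -> int:
    """Hamming distance to TARGET; smaller is better."""
    return sum(a != b for a, b in zip(code, TARGET))


best_code, best_fit = pso_ga(toy_fitness, m_bits=9, z=20, iterations=200)
print(best_code, best_fit)
```

The example call uses 200 iterations to keep it quick; the default of 2000 follows the typical value mentioned above.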
111: performing dot multiplication of the globally optimal particle with the initial state vector to obtain the state vector.
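Step 111 can be illustrated with a small sketch, reading the dot multiplication as an element-wise product of the binary code with the initial state vector (an assumption, chosen because the result is itself a vector). The function name `select_state_vector` and the flow values are hypothetical. The sketch returns both the masked vector and the list of retained components, since the text does not say whether zeroed components are dropped.

```python
from typing import List, Sequence, Tuple


def select_state_vector(
    best_code: str, initial_state_vector: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Apply the optimal binary code to the initial state vector."""
    if len(best_code) != len(initial_state_vector):
        raise ValueError("code length must equal the dimension M")
    # Element-wise product: components coded 0 are zeroed out.
    masked = [int(bit) * x for bit, x in zip(best_code, initial_state_vector)]
    # Compact form: only the components coded 1.
    kept = [x for bit, x in zip(best_code, initial_state_vector) if bit == "1"]
    return masked, kept


# Illustrative flows for an initial state vector of dimension M = 9.
initial = [120, 95, 110, 130, 88, 102, 97, 140, 115]
masked, kept = select_state_vector("010001101", initial)
print(masked)  # [0, 95, 0, 0, 0, 102, 97, 0, 115]
print(kept)    # [95, 102, 97, 115]
```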
In summary, the embodiment of the invention provides a method for selecting a state vector in nonparametric regression short-time traffic flow prediction. Applying the method under four conditions, namely peak hours, off-peak hours, low-peak hours and the whole day, improves prediction accuracy, stability, speed and portability, which verifies the effectiveness and necessity of the method provided by the embodiment of the invention.
The feasibility of the method for selecting the state vector in nonparametric regression short-time traffic flow prediction provided by the embodiment of the invention is verified by the following test:
The traffic data used herein come from the University of Minnesota Duluth (http://www.d.umn.edu/tdrl/traffic/). To verify the effectiveness and necessity of searching for the optimal state vector through the off-line PSO-GA analysis proposed herein, the prediction performance obtained with the off-line PSO-GA analysis is compared, under different traffic conditions, with the prediction performance obtained without it. The comparison of prediction performance is shown in Table 1, and the comparison of prediction results is shown in Figs. 5, 6, 7 and 8.
TABLE 1
The test data in Table 1, namely the values of EV, ARE, PER and EC, verify the feasibility of the embodiment of the invention. The comparison in Figs. 5, 6, 7 and 8 among the results of the method provided by the embodiment of the invention, the direct prediction results of the prior-art method, and the actual values of the road section likewise shows that the method for selecting the state vector in nonparametric regression short-time traffic flow prediction is feasible, improves prediction accuracy, achieves a better prediction effect, and meets the requirements of practical application.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of a preferred embodiment, and that the above embodiments of the invention are provided for description only and do not represent the relative merits of the embodiments.
The above description covers only the preferred embodiments of the invention and is not intended to limit it; any modification, equivalent substitution or improvement made within the spirit and principle of the invention shall fall within its scope of protection.

Claims (5)

1. A method for selecting a state vector in nonparametric regression short-time traffic flow prediction, characterized by comprising the following steps:
(1) judging, according to a first preset criterion, whether an upstream road section related to the road section to be measured belongs to the upstream road section set; if so, executing step (2); if not, the upstream road section is not included in the upstream road section set;
(2) acquiring, from preset data, the average speed of the traffic flow within a range L around the road section to be measured;
(3) obtaining the historical retroactive maximum cycle number m from the average speed and the prediction period;
(4) acquiring an initial state vector from the upstream road section set and the historical retroactive maximum cycle number m;
(5) determining the code length of the particles from the dimension M of the initial state vector;
(6) setting the number of particles to Z and randomly generating Z particles;
(7) defining a fitness function and acquiring the fitness of the Z particles according to the fitness function;
(8) acquiring the individual extreme values and the global extreme value of the particles according to the fitness of the Z particles;
(9) performing crossover operations between the codes of the Z particles and the codes of the individual extreme values and of the global extreme value, respectively, and performing a mutation operation with a preset probability, to obtain the globally optimal particle;
(10) judging whether the preset number of iterations has been reached; if so, outputting the globally optimal particle; if not, re-executing step (7);
(11) performing a dot product operation on the globally optimal particle and the initial state vector to obtain the state vector;
wherein acquiring the fitness of the Z particles according to the fitness function in step (7) specifically comprises:
defining a flow state pattern of the current prediction period;
performing a dot product operation on the particle code and the flow state pattern of the current prediction period to obtain a current flow state pattern;
performing a dot product operation on the particle code and the flow state patterns of the historical database to obtain current historical-database flow state patterns;
predicting the flow of the period following the current prediction period by K-nearest-neighbor matching and equal-weight prediction, according to the current flow state pattern and the current historical-database flow state patterns, to obtain a first prediction error, and taking the first prediction error as the fitness of the current particle.
2. The method for selecting the state vector in nonparametric regression short-time traffic flow prediction according to claim 1, wherein the first preset criterion in step (1) is specifically:
$$\sum_{i,j} \mathrm{dis}\left(p_i^{\mathrm{upstream}},\, p_j^{\mathrm{intersection}}\right) \le L$$
wherein $p_i^{\mathrm{upstream}}$ denotes the coordinate position of the midpoint of the i-th road section among the upstream road sections, $p_j^{\mathrm{intersection}}$ denotes the coordinate position of the center of the j-th intersection of the upstream road sections, $\mathrm{dis}\left(p_i^{\mathrm{upstream}}, p_j^{\mathrm{intersection}}\right)$ denotes the distance between these two positions, and L denotes the range around the road section to be measured.
3. The method for selecting the state vector in nonparametric regression short-time traffic flow prediction according to claim 1, wherein the historical retroactive maximum cycle number m in step (3) is specifically:
[formula image FDA0000098917400000024, not reproduced in the text]
wherein c denotes the prediction period and the symbol shown in image FDA0000098917400000025 denotes the average speed.
4. The method for selecting the state vector in nonparametric regression short-time traffic flow prediction according to claim 1, wherein the dimension M of the initial state vector in step (5) is specifically:
$M = (s+1)(m+1)$, wherein s denotes the number of elements in the upstream road section set.
5. The method for selecting the state vector in nonparametric regression short-time traffic flow prediction according to claim 1, wherein the fitness function in step (7) is specifically:
$$F(EV, ARE, PER, EC) = \lambda_1 EV + \lambda_2 ARE + \frac{\lambda_3}{PER} + \lambda_4 EC$$
wherein EV denotes the variance of the prediction error, ARE denotes the average relative error, PER denotes the percentage of predictions whose relative error lies in the interval $[0, \alpha]$, EC denotes the equality coefficient, $\lambda_1$ denotes the weight of EV, $\lambda_2$ denotes the weight of ARE, $\lambda_3$ denotes the weight of PER, $\lambda_4$ denotes the weight of EC, and $\alpha$ denotes the prediction relative error.
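The fitness evaluation recited in claims 1 and 5 can be sketched as follows. Several points are assumptions, since this section does not define them: the dot product is read as an element-wise mask, nearest neighbors are found by Euclidean distance, PER is expressed as a fraction in (0, 1], EC uses a commonly cited equality-coefficient form, a PER of zero is mapped to infinite fitness, and the unit weights and the value of alpha are placeholders. All function names and data are hypothetical.

```python
import math
from typing import Sequence


def knn_equal_weight_forecast(code: str, current: Sequence[float],
                              history: Sequence[Sequence[float]],
                              next_flows: Sequence[float], k: int = 3) -> float:
    """Average the next-period flows of the k nearest masked historical patterns."""
    mask = [int(bit) for bit in code]
    cur = [m * x for m, x in zip(mask, current)]
    scored = []
    for pattern, nxt in zip(history, next_flows):
        masked = [m * x for m, x in zip(mask, pattern)]
        scored.append((math.dist(cur, masked), nxt))
    scored.sort(key=lambda t: t[0])
    nearest = scored[:k]
    return sum(nxt for _, nxt in nearest) / len(nearest)  # equal weights


def fitness(actual: Sequence[float], predicted: Sequence[float],
            weights=(1.0, 1.0, 1.0, 1.0), alpha: float = 0.1) -> float:
    """F = l1*EV + l2*ARE + l3/PER + l4*EC, for strictly positive actual flows."""
    errors = [p - a for a, p in zip(actual, predicted)]
    rel = [abs(e) / a for e, a in zip(errors, actual)]
    mean_err = sum(errors) / len(errors)
    ev = sum((e - mean_err) ** 2 for e in errors) / len(errors)  # error variance
    are = sum(rel) / len(rel)                                    # average relative error
    per = sum(r <= alpha for r in rel) / len(rel)                # share within [0, alpha]
    if per == 0:
        return math.inf  # assumption: no prediction within tolerance
    ec = 1 - math.sqrt(sum(e * e for e in errors)) / (
        math.sqrt(sum(a * a for a in actual)) + math.sqrt(sum(p * p for p in predicted))
    )
    l1, l2, l3, l4 = weights
    return l1 * ev + l2 * are + l3 / per + l4 * ec


# Illustrative two-component patterns and the flows that followed them.
history = [[10, 20], [11, 21], [50, 60], [12, 19]]
next_flows = [30, 32, 90, 31]
forecast = knn_equal_weight_forecast("11", [11, 20], history, next_flows, k=3)
print(forecast)
print(fitness([100, 200], [100, 200]))
```

In a full run, the forecast for each evaluation period would be compared with the observed flow, and the resulting fitness would be assigned to the particle whose code produced the mask.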

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010105141116A CN101982843B (en) 2010-10-21 2010-10-21 Method for selecting state vector in nonparametric regression short-time traffic flow prediction

Publications (2)

Publication Number Publication Date
CN101982843A CN101982843A (en) 2011-03-02
CN101982843B true CN101982843B (en) 2012-05-09

Family

ID=43619740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010105141116A Expired - Fee Related CN101982843B (en) 2010-10-21 2010-10-21 Method for selecting state vector in nonparametric regression short-time traffic flow prediction

Country Status (1)

Country Link
CN (1) CN101982843B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629418B (en) * 2012-04-09 2014-10-29 浙江工业大学 Fuzzy kalman filtering-based traffic flow parameter prediction method
CN102736725B (en) 2012-05-18 2016-03-30 华为技术有限公司 A kind of energy saving of hard disks control method, device and central processing unit
CN102880771B (en) * 2012-10-31 2014-12-03 贵州大学 Method for predicting surface roughness of workpiece during high-speed cutting machining
CN104021665B (en) * 2014-06-06 2016-09-07 河南理工大学 Genetic search Short-time Traffic Flow Forecasting Methods
CN104091444B (en) * 2014-07-03 2016-03-30 四川省交通科学研究所 A kind of short-term traffic flow forecast method based on periodic component extractive technique
CN104464304A (en) * 2014-12-25 2015-03-25 北京航空航天大学 Urban road vehicle running speed forecasting method based on road network characteristics
CN105336163B (en) * 2015-10-26 2017-09-26 山东易构软件技术股份有限公司 A kind of Short-time Traffic Flow Forecasting Methods based on three layers of k nearest neighbor
CN108418774A (en) * 2018-02-09 2018-08-17 电子科技大学 For reducing the PSO-GA combined optimization algorithms of peak-to-average power ratio
CN110111606A (en) * 2019-03-18 2019-08-09 上海海事大学 A kind of vessel traffic flow prediction technique based on EEMD-IAGA-BP neural network
CN112614346B (en) * 2020-12-17 2022-02-15 东南大学 Short-term traffic flow prediction method based on singular spectrum analysis and echo state network
CN113470356B (en) * 2021-06-28 2022-11-04 青岛海信网络科技股份有限公司 Electronic equipment and regional road condition prediction method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6226389B1 (en) * 1993-08-11 2001-05-01 Jerome H. Lemelson Motor vehicle warning and control system and method
CN101685577A (en) * 2008-09-25 2010-03-31 福特全球技术公司 Method of assessing vehicle paths in a road environment and a vehicle path assessment system
CN101789176A (en) * 2010-01-22 2010-07-28 天津市市政工程设计研究院 Forecasting method for port area short-time traffic flow under model of reservation cargo concentration in port

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
张晓利 等."基于K-邻域非参数回归短时交通流预测方法".《***工程学报》.2009,第24卷(第2期),第178-182页.
范鲁明 等."改进的K近邻非参数回归在短时交通流量预测中的应用".《长沙交通学院学报》.2007,第23卷(第4期),第39-43页.
范鲁明 等."改进非参数回归在交通流量预测中的应用".《重庆交通大学学报(自然科学版)》.2008,第27卷(第1期),第96-99页.
范鲁明 等."改进非参数回归在交通流量预测中的应用".《重庆交通大学学报(自然科学版)》.2008,第27卷(第1期),第96-99页. *

Also Published As

Publication number Publication date
CN101982843A (en) 2011-03-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120509

Termination date: 20121021