
CN115438869A - Asphalt pavement deflection basin performance prediction method based on chaotic particle swarm and XGBoost

Info

Publication number: CN115438869A
Application number: CN202211138163.7A
Authority: CN
Inventors: 时欣利; 李卓轩; 曹进德; 陶萌; 万颖
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2022-09-19
Filing date: 2022-09-19
Publication date: 2022-12-06

Abstract

The invention relates to a method for predicting asphalt pavement deflection basin performance based on chaotic particle swarm and XGBoost, comprising the following steps: step 1, collecting asphalt pavement service performance detection data and historical data on the factors that influence asphalt pavement service performance; step 2, according to the variation pattern of the deflection basin area, taking the positions where the load level changes abruptly as segmentation points and determining the segment intervals for piecewise regression; step 3, building and training an XGBoost regressor for each segment interval from step 2; step 4, optimizing the parameters with chaotic particle swarm optimization, which uses the randomness and ergodicity of chaos to avoid local optima; and step 5, producing the deflection basin data prediction for the pavement structure. In this technical scheme, the chaotic particle swarm is used to optimize the parameters of XGBoost and of the corrected piecewise regression method, so that prediction efficiency is remarkably improved.

Description

Asphalt pavement deflection basin performance prediction method based on chaotic particle swarm and XGBoost

Technical Field

The invention relates to a prediction method, in particular to a method for predicting the deflection basin performance of asphalt pavement based on chaotic particle swarm optimization (CPSO) and XGBoost, and belongs to the technical field of asphalt pavement service performance prediction.

Background

Damage to the pavement structure has always limited the service life of roads. As traffic volumes and vehicle axle loads increase, developing long-life asphalt pavements has become a serious challenge for researchers. Because traditional asphalt pavements suffer structural damage such as fatigue cracking and permanent deformation, long-life asphalt pavement has become an important development trend. A long-life asphalt pavement has a design service life of 40 to 50 years; during that life, damage occurs only in the surface layer, no structural damage occurs, and performance can be ensured through regular maintenance. Under relatively busy traffic conditions, long-life asphalt pavement reduces the frequency of pavement reconstruction and thereby lowers pavement maintenance costs.

In 2017, loading tests were carried out on RIOHTrack to collect data on and study how the performance of pavement structures and materials evolves nonlinearly over the full life cycle, and to verify and improve pavement structure design methods and materials. The deflection basins of 19 structures with different stiffness levels were tested periodically under different load levels, data were collected, and the changes in the deflection basins were analyzed. As an important index for evaluating the bearing capacity of an asphalt pavement structure, the deflection basin has always been a focus of highway builders and researchers. However, regular road testing and post-processing of the collected data are cumbersome and demand considerable expertise, time, money and other resources.

Disclosure of Invention

To address the problems in the prior art, the invention provides a chaotic particle swarm algorithm combined with a piecewise regression strategy to optimize the XGBoost model.

To achieve this purpose, the technical scheme of the invention is a method for predicting asphalt pavement deflection basin performance based on chaotic particle swarm and XGBoost, comprising the following steps:

step 1, collecting asphalt pavement service performance detection data and historical data on the factors that influence asphalt pavement service performance, and performing feature engineering on the collected data;

step 2, according to the variation pattern of the deflection basin area, taking the positions where the load level changes abruptly as segmentation points and determining the segment intervals for piecewise regression;

step 3, building and training an XGBoost regressor for each segment interval from step 2;

step 4, optimizing the parameters with chaotic particle swarm optimization, which uses the randomness and ergodicity of chaos to avoid local optima;

step 5, predicting the deflection basin data of the pavement structure with the XGBoost piecewise regression model optimized by the chaotic particle swarm.
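The segmentation in step 2 can be illustrated with a minimal sketch. It assumes that an "abrupt change" means a relative jump in load level between consecutive measurements that exceeds a threshold; the function names and the `rel_threshold` parameter are hypothetical and do not come from the patent text.

```python
def find_segmentation_points(load_levels, rel_threshold=0.1):
    """Return indices where the load level jumps relative to the previous reading.

    Assumption: a jump is "abrupt" when |cur - prev| > rel_threshold * |prev|.
    """
    points = []
    for i in range(1, len(load_levels)):
        prev, cur = load_levels[i - 1], load_levels[i]
        if abs(cur - prev) > rel_threshold * abs(prev):
            points.append(i)
    return points


def split_into_segments(n_samples, points):
    """Turn segmentation points into half-open index intervals [start, end)."""
    bounds = [0] + list(points) + [n_samples]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


# Hypothetical load-level sequence with one abrupt change at index 3.
loads = [100.0, 100.0, 100.0, 130.0, 130.0]
points = find_segmentation_points(loads)
segments = split_into_segments(len(loads), points)
```

Each resulting interval would then receive its own regressor in step 3.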

As an improvement of the invention, step 1 is specifically as follows: asphalt pavement service performance detection data and historical data on the factors that influence asphalt pavement service performance are collected, and the main feature engineering comprises new feature extraction and data processing. Before the model is built, it is generally unclear exactly how many features are needed, so new features can be generated from the monitoring data. Some features may have missing values; numerical features with few missing values are filled with the mean. Because the deflection basin area data are collected periodically, they contain a certain amount of noise, which degrades the prediction of deflection basin performance; to reduce the influence of the noise, first-order exponential smoothing with a smoothing coefficient is applied to the deflection basin area. Finally, because some features are large in magnitude, the data were compressed by a logarithmic transformation in the experiments.
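The three data-processing operations described above (mean filling, first-order exponential smoothing, logarithmic compression) can be sketched as follows. This is a minimal illustration, not the inventors' code: the function names are hypothetical, missing values are assumed to be encoded as `None`, and `log(1 + x)` is assumed as the logarithmic transform so that zero values remain defined.

```python
import math


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def exponential_smooth(values, alpha):
    """First-order exponential smoothing: s[t] = alpha*x[t] + (1-alpha)*s[t-1]."""
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def log_compress(values):
    """Compress large-magnitude features with log(1 + x) (assumed variant)."""
    return [math.log1p(v) for v in values]


filled = fill_missing_with_mean([1.0, None, 3.0])
smoothed = exponential_smooth([1.0, 3.0], alpha=0.5)
compressed = log_compress([0.0])
```

A smaller `alpha` smooths more aggressively; the patent text does not state the value of the smoothing coefficient that was used.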

As an improvement of the invention, step 2 is specifically as follows: according to the variation pattern of the deflection basin area, the positions where the load level changes abruptly are taken as segmentation points and the segment intervals for piecewise regression are determined. Two models, XGBoost₁ and XGBoost₂, are provided to predict and analyze the deflection basin area in the time periods T₁ and T₂ respectively, which can be formalized as follows:

where y is the predicted value, α and β are the correction coefficients of the two models, respectively, and index is the position of the cycle.
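One plausible reading of the corrected piecewise prediction is sketched below. The sketch assumes that the correction coefficients α and β multiply the outputs of the two regressors, and that T₁ and T₂ are separated by a single segmentation index; both are assumptions, and `piecewise_predict` and `split_index` are hypothetical names. The two models are passed in as plain callables so that the example does not depend on any particular library.

```python
def piecewise_predict(x, index, split_index, model_1, model_2,
                      alpha=1.0, beta=1.0):
    """Route a sample to the regressor of its segment and apply that
    segment's correction coefficient (assumed to be multiplicative)."""
    if index < split_index:       # cycle position falls in T1
        return alpha * model_1(x)
    return beta * model_2(x)      # cycle position falls in T2


# Stand-in regressors for illustration only.
model_1 = lambda x: 2.0 * x
model_2 = lambda x: x + 1.0

y_first = piecewise_predict(3.0, 5, 10, model_1, model_2, alpha=0.5, beta=2.0)
y_second = piecewise_predict(3.0, 12, 10, model_1, model_2, alpha=0.5, beta=2.0)
```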

As an improvement of the invention, in step 3 a second-order Taylor expansion is applied to the cost function and a regularization term is introduced to avoid overfitting. The ensemble tree model is

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F$$

where K is the total number of submodels, $F = \{ f(x) = w_{q(x)} \}$ is the set of all regression trees, $w$ is the weight vector consisting of the weights of all leaf nodes of a regression tree, $\hat{y}_i$ is the sample prediction value, $x_i$ is the sample input feature, and $f_k$ is the kth regression tree; each regression tree has independent leaf weights w and tree structure q. The objective function introduced is

$$L = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$

where $l$ is a loss function and Ω is a regularization term. Iterating with the additive strategy and combining the loss function and the regularization term, it can be deduced that

$$L^{(t)} \approx \sum_{i} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t)$$

where

$$g_i = \frac{\partial\, l(y_i, \hat{y}_i^{(t-1)})}{\partial\, \hat{y}_i^{(t-1)}}, \quad h_i = \frac{\partial^2 l(y_i, \hat{y}_i^{(t-1)})}{\partial \left( \hat{y}_i^{(t-1)} \right)^2}$$

and N is the number of regression trees. The specific steps are as follows:

(1) Build a new decision tree;

(2) Calculate $g_i$ and $h_i$ for each training sample according to the objective function shown in the equation, and start iterating;

(3) Find the optimal split point using an approximate greedy algorithm to obtain the decision tree structure $f_t(x)$, where t denotes the t-th iteration;

(4) Add $f_t(x)$ to the ensemble tree model;

(5) Iterate steps (1) to (4) multiple times to obtain the final model.
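The quantities used in steps (2) and (3) can be made concrete with a small sketch. It assumes a squared-error loss and the usual XGBoost regularizer Ω(f) = γT + ½λ‖w‖² (T being the number of leaves), neither of which is spelled out in the text above; under those assumptions the optimal leaf weight is −G/(H + λ) and a candidate split is scored by the standard gain formula. All function names are illustrative.

```python
def squared_loss_grad_hess(y_true, y_pred):
    """g_i and h_i for l = (y - y_hat)^2 / 2: g_i = y_hat - y, h_i = 1."""
    g = [p - y for y, p in zip(y_true, y_pred)]
    h = [1.0 for _ in y_true]
    return g, h


def leaf_weight(g, h, lam=1.0):
    """Optimal leaf weight w* = -G / (H + lambda) for the samples in a leaf."""
    return -sum(g) / (sum(h) + lam)


def leaf_score(g, h, lam=1.0):
    """Structure score G^2 / (H + lambda) of a leaf."""
    return sum(g) ** 2 / (sum(h) + lam)


def split_gain(g, h, go_left, lam=1.0, gamma=0.0):
    """Gain of splitting a node into left/right children (higher is better)."""
    g_l = [gi for gi, left in zip(g, go_left) if left]
    h_l = [hi for hi, left in zip(h, go_left) if left]
    g_r = [gi for gi, left in zip(g, go_left) if not left]
    h_r = [hi for hi, left in zip(h, go_left) if not left]
    children = leaf_score(g_l, h_l, lam) + leaf_score(g_r, h_r, lam)
    return 0.5 * (children - leaf_score(g, h, lam)) - gamma


# Toy data: the fourth target is far from the others, so isolating it pays off.
g, h = squared_loss_grad_hess([1.0, 2.0, 3.0, 10.0], [0.0, 0.0, 0.0, 0.0])
w_root = leaf_weight(g, h, lam=0.0)
gain = split_gain(g, h, [True, True, True, False], lam=0.0, gamma=0.0)
```

A greedy split search evaluates `split_gain` for each candidate threshold and keeps the best one; the approximate variant mentioned in step (3) restricts the candidates to a set of quantile-based proposals.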

As an improvement of the invention, step 4 is specifically as follows. First, the data are split into a training data set and a test data set, denoted $data_{train}$ and $data_{test}$. The velocities and positions of the particles in the population are randomly initialized, and the iteration count t, the counter SG and the local extremum judgment threshold $SG_{max}$ are initialized. The parameters of the XGBoost piecewise regression model are then defined, with the parameters given by each particle's initial position taken as the initial parameters; the model is trained to obtain the initial fitness value e of each particle, and the initial optimal positions of the particle and of the population, $P_j^{0}$ and $G^{0}$, are calculated. The inertia weight ω of the particle is updated according to the inertia weight formula

$$\omega = \omega_{max} - (\omega_{max} - \omega_{min}) \frac{k}{maxgen}$$

where $\omega_{max}$ and $\omega_{min}$ are the upper and lower limits of the inertia weight, k is the current iteration number and maxgen is the maximum number of iterations. The particle velocity is then updated according to

$$v_{jd}^{k+1} = \omega v_{jd}^{k} + c_1 r_1 \left( P_{jd}^{k} - x_{jd}^{k} \right) + c_2 r_2 \left( G_{d}^{k} - x_{jd}^{k} \right)$$

and the particle position according to

$$x_{jd}^{k+1} = x_{jd}^{k} + v_{jd}^{k+1}$$

where $v_{jd}^{k}$ and $x_{jd}^{k}$ are the velocity and position of the jth particle in the d-dimensional search space at the kth iteration, $c_1$ and $c_2$ are acceleration factors, $r_1$ and $r_2$ are random numbers in [0, 1], and $P_{jd}^{k}$ and $G_{d}^{k}$ are the historical optimal positions of the jth particle and of the population at the kth iteration. Finally, the fitness value e of each particle is calculated, and the optimal positions of the particles and of the population are updated to $P_j^{k+1}$ and $G^{k+1}$.

Let SG be the number of consecutive iterations in which the optimal position of the population has not been updated: if the optimal position G of the population is not updated, set SG = SG + 1; otherwise set SG = 0. If SG ≤ $SG_{max}$, the formula

$$cx_i^{(t+1)} = 4\, cx_i^{(t)} \left( 1 - cx_i^{(t)} \right)$$

is used to generate the chaotic optimization and SG is set to 0, where $cx_i^{(t)}$ is the value of the chaotic variable $cx_i$ after t iterations. Set t = t + 1; if t < T, where T is the maximum number of iterations, return to the calculation of the particle velocities, positions and related parameters; otherwise output G and E.
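The loop described in step 4 can be sketched on a toy objective. The sketch below is an illustration under several stated assumptions, not the patented procedure: it uses a linearly decreasing inertia weight, the standard PSO velocity and position updates, and the logistic map as the chaotic sequence; it maps chaotic values linearly onto the search range; and it fires the chaotic search once the stagnation counter reaches `sg_max`, which is the usual stagnation test and differs from the comparison as worded in the text. In the invention the fitness would come from training the XGBoost piecewise model; here a simple function stands in for it, and all names are hypothetical.

```python
import random


def logistic_map(cx):
    """One step of the logistic map with control parameter 4."""
    return 4.0 * cx * (1.0 - cx)


def cpso_minimize(fitness, dim, n_particles=20, max_iter=100, x_max=5.0,
                  v_max=0.5, w_max=0.9, w_min=0.4, c1=2.0, c2=2.0,
                  sg_max=7, chaos_steps=10, seed=0):
    """Minimise `fitness` over [-x_max, x_max]^dim with a chaotic PSO sketch."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-x_max, x_max) for _ in range(dim)]
          for _ in range(n_particles)]
    vs = [[rng.uniform(-v_max, v_max) for _ in range(dim)]
          for _ in range(n_particles)]
    p_best = [x[:] for x in xs]
    p_fit = [fitness(x) for x in xs]
    best_j = min(range(n_particles), key=lambda j: p_fit[j])
    g_best, g_fit = p_best[best_j][:], p_fit[best_j]
    sg = 0  # consecutive iterations without a new population best
    for k in range(max_iter):
        # Linearly decreasing inertia weight (assumed form).
        w = w_max - (w_max - w_min) * k / max_iter
        improved = False
        for j in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vs[j][d]
                     + c1 * r1 * (p_best[j][d] - xs[j][d])
                     + c2 * r2 * (g_best[d] - xs[j][d]))
                vs[j][d] = max(-v_max, min(v_max, v))
                xs[j][d] = max(-x_max, min(x_max, xs[j][d] + vs[j][d]))
            e = fitness(xs[j])
            if e < p_fit[j]:
                p_best[j], p_fit[j] = xs[j][:], e
            if e < g_fit:
                g_best, g_fit = xs[j][:], e
                improved = True
        sg = 0 if improved else sg + 1
        if sg >= sg_max:  # assumption: perturb only after sg_max stalled iterations
            cx = [rng.uniform(0.01, 0.99) for _ in range(dim)]
            for _ in range(chaos_steps):
                cx = [logistic_map(c) for c in cx]
                candidate = [-x_max + 2.0 * x_max * c for c in cx]
                e = fitness(candidate)
                if e < g_fit:
                    g_best, g_fit = candidate, e
            sg = 0
    return g_best, g_fit


def sphere(x):
    """Toy objective standing in for the model's validation error."""
    return sum(v * v for v in x)


best, best_fit = cpso_minimize(sphere, dim=2)
```

Because the population best is only ever replaced by a strictly better point, the returned fitness can never be worse than that of the best initial particle.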

As an improvement of the invention, step 5 is specifically as follows:

(1) Splitting the data set

The data set obtained in step 1 is randomly divided into a training set and a test set for the experiments. The training set is used to train the piecewise regression model based on chaotic particle swarm and XGBoost, repeatedly escaping local optima and adjusting the parameters so as to obtain the optimal piecewise regression model; the test set is used to test the regressor's predictions and observe its prediction performance.

(2) Setting initial parameters

Based on the characteristics of the improved chaotic particle swarm algorithm, and taking into account the parameter ranges of XGBoost and the characteristics of the asphalt pavement structure data, the initial parameters are set: the maximum number of iterations T, the population size M, the initial maximum position $x_{max}$, the initial maximum velocity $v_{max}$ and the local extremum judgment threshold $SG_{max}$. Because the chaotic particle swarm algorithm generates the initial population randomly, near-optimal or even optimal parameters can be found with high probability over multiple experiments.

(3) Parameter optimization

When the particles fall into a local optimum, chaotic perturbation is used to escape it. The mean square error (MSE) and the mean absolute percentage error (MAPE) are used as evaluation indexes, and the prediction accuracy is improved by adjusting the correction coefficients.
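The two evaluation indexes have standard definitions, sketched here for reference. MAPE is expressed in percent and assumes that no true value is zero; the function names are illustrative.

```python
def mse(y_true, y_pred):
    """Mean square error: average of squared residuals."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (true values must be nonzero)."""
    total = sum(abs((y - p) / y) for y, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


example_mse = mse([1.0, 2.0], [1.0, 4.0])
example_mape = mape([100.0, 200.0], [110.0, 180.0])
```

MSE penalizes large errors more heavily, while MAPE is scale-free, which makes it convenient for comparing pavement structures whose deflection basin areas differ in magnitude.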

Compared with the prior art, the invention has the following advantages: 1) the XGBoost algorithm is applied to the deflection basin prediction problem for the first time. XGBoost outperforms mainstream machine learning algorithms such as random forest (RF), k-nearest neighbors (KNN) and support vector regression (SVR) in terms of mean square error (MSE) and mean absolute error (MAE), performs well at nonlinear fitting, and is better than the compared algorithms in both performance and computation speed; 2) in view of the characteristics of deflection basin measurement data, the invention proposes a piecewise regression strategy that uses the load change rate to determine the segmentation points, which effectively improves prediction accuracy; 3) the XGBoost algorithm is optimized by combining the chaotic particle swarm algorithm with the piecewise regression strategy, giving a marked improvement over both the unoptimized algorithm and the piecewise regression strategy combined with the PSO algorithm; 4) combining an artificial intelligence algorithm with the deflection prediction problem can reduce the number of future deflection experiments while predicting the deflection basin area with high accuracy.

Drawings

FIG. 1 is a schematic diagram of the RIOHTrack test sections;

FIG. 2 is a framework diagram of the deflection basin area prediction model based on chaotic particle swarm and XGBoost piecewise regression;

FIG. 3 is a framework diagram of the training module and the prediction module of the piecewise regression model based on chaotic particle swarm and XGBoost.

Detailed Description

To aid understanding of the invention, the embodiments are described in detail below with reference to the accompanying drawings.

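Example 1: a method for predicting asphalt pavement deflection basin performance based on chaotic particle swarm and XGBoost comprises the following steps:

step 1, collecting asphalt pavement service performance detection data and historical data on the factors that influence asphalt pavement service performance, and performing feature engineering on the collected data;

step 2, according to the variation pattern of the deflection basin area, taking the positions where the load level changes abruptly as segmentation points and determining the segment intervals for piecewise regression;

step 3, building and training an XGBoost regressor for each segment interval from step 2;

step 4, optimizing the parameters with chaotic particle swarm optimization, which uses the randomness and ergodicity of chaos to avoid local optima;

step 5, predicting the deflection basin data of the pavement structure with the XGBoost piecewise regression model optimized by the chaotic particle swarm.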

Step 1 is specifically as follows: the measurement data of 19 full-scale asphalt pavement structures are used as the data source; asphalt pavement service performance detection data and historical data on the factors that influence asphalt pavement service performance are collected, and the main feature engineering comprises new feature extraction and data processing. Before the model is built, it is generally unclear exactly how many features are needed, so new features are generated from the monitoring data: related axle load repetitions, rate of change of axle load repetitions, rate of change of load level, rate of change of temperature, and temperature difference. Some features may have missing values; numerical features with few missing values are filled with the mean. Because the deflection basin area data are collected periodically, they contain a certain amount of noise, which appears in the data as irregular spikes and degrades the prediction of the deflection basin area; to reduce the influence of the noise, first-order exponential smoothing with a smoothing coefficient is applied to the deflection basin data. Finally, because the cumulative axle load repetition feature is large in magnitude, the data are compressed by taking the logarithm in the experiments.
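The derived features listed above are all changes between consecutive measurements. A minimal sketch of how such features might be computed is given below; the exact definitions used by the inventors are not stated, so the relative-change and first-difference formulas, the zero padding of the first entry and the function names are assumptions.

```python
def change_rate(values):
    """Relative change between consecutive readings; the first entry is 0."""
    rates = [0.0]
    for prev, cur in zip(values, values[1:]):
        rates.append((cur - prev) / prev if prev != 0 else 0.0)
    return rates


def difference(values):
    """Absolute change between consecutive readings; the first entry is 0."""
    return [0.0] + [cur - prev for prev, cur in zip(values, values[1:])]


# Hypothetical load-level and pavement-temperature readings.
load_level_rate = change_rate([100.0, 110.0, 99.0])
temperature_diff = difference([20.0, 22.5, 21.0])
```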

Step 2 is specifically as follows: according to the variation pattern of the deflection basin area, the positions where the load level changes abruptly are taken as segmentation points and the segment intervals for piecewise regression are determined. Two models, XGBoost₁ and XGBoost₂, are provided to predict and analyze the deflection basin area in the time periods T₁ and T₂ respectively, which can be formalized as follows:

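In step 3, a second-order Taylor expansion is applied to the cost function and a regularization term is introduced to avoid overfitting. The ensemble tree model is

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F$$

where K is the total number of submodels, $F = \{ f(x) = w_{q(x)} \}$ is the set of all regression trees, $w$ is the weight vector consisting of the weights of all leaf nodes of a regression tree, $\hat{y}_i$ is the sample prediction value, $x_i$ is the sample input feature, and $f_k$ is the kth regression tree; each regression tree has independent leaf weights w and tree structure q. The objective function introduced is

$$L = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$

where $l$ is a loss function and Ω is a regularization term. Iterating with the additive strategy and combining the loss function and the regularization term, it can be deduced that

$$L^{(t)} \approx \sum_{i} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t)$$

where

$$g_i = \frac{\partial\, l(y_i, \hat{y}_i^{(t-1)})}{\partial\, \hat{y}_i^{(t-1)}}, \quad h_i = \frac{\partial^2 l(y_i, \hat{y}_i^{(t-1)})}{\partial \left( \hat{y}_i^{(t-1)} \right)^2}$$

The specific steps are as follows:

(1) Build a new decision tree;

(2) Calculate $g_i$ and $h_i$ for each training sample according to the objective function shown in the equation, and start iterating;

(3) Find the optimal split point using an approximate greedy algorithm to obtain the decision tree structure $f_t(x)$, where t denotes the t-th iteration;

(4) Add $f_t(x)$ to the ensemble tree model;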

(5) Iterate steps (1) to (4) multiple times to obtain the final model.

Step 4 is specifically as follows. First, the data are split into a training data set and a test data set, denoted $data_{train}$ and $data_{test}$. The velocities and positions of the particles in the population are randomly initialized, and the iteration count t, the counter SG and the local extremum judgment threshold $SG_{max}$ are initialized. The parameters of the XGBoost piecewise regression model are then defined, with the parameters given by each particle's initial position taken as the initial parameters; the model is trained to obtain the initial fitness value e of each particle and the initial optimal positions of the particle and of the population, $P_j^{0}$ and $G^{0}$. The inertia weight ω of the particle is updated according to the inertia weight formula

$$\omega = \omega_{max} - (\omega_{max} - \omega_{min}) \frac{k}{maxgen}$$

where $\omega_{max}$ and $\omega_{min}$ are the upper and lower limits of the inertia weight, k is the current iteration number and maxgen is the maximum number of iterations. The particle velocity is then updated according to

$$v_{jd}^{k+1} = \omega v_{jd}^{k} + c_1 r_1 \left( P_{jd}^{k} - x_{jd}^{k} \right) + c_2 r_2 \left( G_{d}^{k} - x_{jd}^{k} \right)$$

and the particle position according to

$$x_{jd}^{k+1} = x_{jd}^{k} + v_{jd}^{k+1}$$

where $v_{jd}^{k}$ and $x_{jd}^{k}$ are the velocity and position of the jth particle in the d-dimensional search space at the kth iteration, $c_1$ and $c_2$ are acceleration factors, $r_1$ and $r_2$ are random numbers in [0, 1], and $P_{jd}^{k}$ and $G_{d}^{k}$ are the historical optimal positions of the jth particle and of the population at the kth iteration. Finally, the fitness value e of each particle is calculated, and the optimal positions of the particles and of the population are updated to $P_j^{k+1}$ and $G^{k+1}$.

Let SG be the number of consecutive iterations in which the optimal position of the population has not been updated: if the optimal position G of the population is not updated, set SG = SG + 1; otherwise set SG = 0. If SG ≤ $SG_{max}$, the formula

$$cx_i^{(t+1)} = 4\, cx_i^{(t)} \left( 1 - cx_i^{(t)} \right)$$

is used to generate the chaotic optimization and SG is set to 0, where $cx_i^{(t)}$ is the value of the chaotic variable $cx_i$ after t iterations. Set t = t + 1; if t < T, where T is the maximum number of iterations, return to the calculation of the particle velocities, positions and related parameters; otherwise output G and E.

Step 5 is specifically as follows:

(1) Splitting the data set: the data set obtained in step 1 is randomly divided by period into a training set and a test set. The training set is used to train the piecewise regression model, repeatedly escaping local optima and adjusting the parameters so as to obtain the optimal piecewise regression model; the test set is used to test the regressor's predictions and observe its prediction performance.

(2) Setting initial parameters: based on the characteristics of the improved chaotic particle swarm algorithm, and taking into account the parameter ranges of XGBoost and the characteristics of the asphalt pavement structure data, the initial parameters are set: the maximum number of iterations T, the population size M, the initial maximum position $x_{max}$, the initial maximum velocity $v_{max}$ and the local extremum judgment threshold $SG_{max}$. Because the chaotic particle swarm algorithm generates the initial population randomly, near-optimal or even optimal parameters can be found with high probability over multiple experiments.

(3) Parameter optimization: when the particles fall into a local optimum, chaotic perturbation is used to escape it; the mean square error (MSE) and the mean absolute percentage error (MAPE) are used as evaluation indexes, and the prediction accuracy is improved by adjusting the correction coefficients.

Example 2: the measured deflection basin data come from the full-scale pavement test loop project of the Ministry of Transport, which has a full-scale field accelerated pavement test track 2.038 kilometers long, known as RIOHTrack. The full loop comprises 25 asphalt pavement structures; the layout of the pavement structures is shown in FIG. 1. Nineteen main test pavement structures are arranged on the test loop to study and compare the long-term performance and evolution of asphalt pavement structures with different combinations of structural stiffness and materials. In terms of base type, they include four typical structures: rigid base, semi-rigid base, flexible base and full-depth asphalt pavement. The invention takes the measurement data of the 19 full-scale asphalt pavement structures as its data source and aims to establish a reliable prediction method, namely a piecewise regression model based on chaotic particle swarm and XGBoost, to predict the deflection basin area with high accuracy.

(1) Splitting the data set: the 19 pavements yield 5016 data samples in total, of which 3800 are used for training and the rest for testing. The initial data mainly comprise cumulative axle load repetitions, load level, pavement temperature and deflection basin area. Some features may have missing values; numerical features with few missing values are filled with the mean. The new features generated are related axle load repetitions, rate of change of axle load repetitions, rate of change of load level and temperature difference. In feature engineering, the features are compressed by logarithmic transformation.

(2) Setting initial parameters: the settings follow the characteristics of the improved chaotic particle swarm algorithm, combined with the parameter ranges of XGBoost and the characteristics of the asphalt pavement structure data. For chaotic particle swarm optimization, the initial parameters are set as follows: maximum number of iterations T = 100, population size M = 40, $x_{max}$ = 5, $v_{max}$ = 0.05 and $SG_{max}$ = 7. Piecewise regression models based on chaotic particle swarm and XGBoost are established separately for each of the 19 asphalt pavements.

(3) Parameter optimization: the chaotic particle swarm algorithm converges quickly, in about 20 iterations; when it falls into a local optimum, the chaotic mapping lets it escape the locally optimal point, which improves search accuracy. With the mean square error (MSE) and the mean absolute percentage error (MAPE) as evaluation indexes, the overall optimal model is obtained by optimizing the parameters with the chaotic particle swarm algorithm.
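The settings reported in Example 2 can be collected in a small configuration sketch. The numeric values are those stated above; the class name, field names, the random seed and the purely random split over all samples are assumptions made for illustration (the text does not say whether the split was stratified by pavement structure).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CpsoSettings:
    """Initial CPSO parameters reported in Example 2."""
    max_iterations: int = 100   # T
    population_size: int = 40   # M
    x_max: float = 5.0          # initial maximum position
    v_max: float = 0.05         # initial maximum velocity
    sg_max: int = 7             # local extremum judgment threshold


def split_indices(n_samples, n_train, seed=0):
    """Randomly split sample indices into training and test index lists."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


settings = CpsoSettings()
train_idx, test_idx = split_indices(5016, 3800)
```

With 5016 samples and 3800 used for training, 1216 samples remain for testing; spread evenly over the 19 structures this corresponds to 264 samples per structure, 200 of them for training.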

It should be noted that the above-mentioned embodiments are not intended to limit the scope of the present invention, and all equivalent modifications and substitutions based on the above-mentioned technical solutions are within the scope of the present invention as defined in the claims.

Claims

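1. A method for predicting asphalt pavement deflection basin performance based on chaotic particle swarm and XGBoost, characterized by comprising the following steps:

step 1, collecting asphalt pavement service performance detection data and historical data on the factors that influence asphalt pavement service performance, and performing feature engineering on the collected data;

step 2, according to the variation pattern of the deflection basin area, taking the positions where the load level changes abruptly as segmentation points and determining the segment intervals for piecewise regression;

step 3, building and training an XGBoost regressor for each segment interval from step 2;

step 4, optimizing the parameters with chaotic particle swarm optimization, which uses the randomness and ergodicity of chaos to avoid local optima;

step 5, predicting the deflection basin data of the pavement structure with the XGBoost piecewise regression model optimized by the chaotic particle swarm.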

2. The asphalt pavement deflection basin performance prediction method based on chaotic particle swarm and XGBoost according to claim 1, characterized in that step 1 is specifically as follows: asphalt pavement service performance detection data and historical data on the factors that influence asphalt pavement service performance are collected; the main feature engineering comprises new feature extraction and data processing; because it is generally unclear before the model is built exactly how many features are needed, new features are generated from the monitoring data; numerical features with few missing values are filled with the mean; first-order exponential smoothing with a smoothing coefficient is applied to the deflection basin area; and the data are compressed by logarithmic transformation.

3. The asphalt pavement deflection basin performance prediction method based on chaotic particle swarm and XGBoost according to claim 1, characterized in that step 2 is specifically as follows: according to the variation pattern of the deflection basin area, the positions where the load level changes abruptly are taken as segmentation points and the segment intervals for piecewise regression are determined; two models, XGBoost₁ and XGBoost₂, are provided to predict and analyze the deflection basin area in the time periods T₁ and T₂ respectively, in the following form:

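4. The asphalt pavement deflection basin area prediction method based on chaotic particle swarm and XGBoost piecewise regression according to claim 1, characterized in that in step 3 a second-order Taylor expansion is applied to the cost function and a regularization term is introduced, and the ensemble tree model is

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F$$

where $\hat{y}_i$ is the sample prediction value, $x_i$ is the sample input feature, and $f_k$ is the kth regression tree; each regression tree has independent leaf weights w and tree structure q; the objective function introduced is

$$L = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$

where $l$ is a loss function and Ω is a regularization term; iterating with the additive strategy and combining the loss function and the regularization term, it can be deduced that

$$L^{(t)} \approx \sum_{i} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t)$$

where

$$g_i = \frac{\partial\, l(y_i, \hat{y}_i^{(t-1)})}{\partial\, \hat{y}_i^{(t-1)}}, \quad h_i = \frac{\partial^2 l(y_i, \hat{y}_i^{(t-1)})}{\partial \left( \hat{y}_i^{(t-1)} \right)^2}$$

and N is the number of regression trees; step 3 is specifically as follows:

(1) Build a new decision tree;

(2) Calculate $g_i$ and $h_i$ for each training sample according to the objective function shown in the equation, and start iterating;

(3) Find the optimal split point using an approximate greedy algorithm to obtain the decision tree structure $f_t(x)$, where t denotes the t-th iteration;

(4) Add $f_t(x)$ to the ensemble tree model;

(5) Iterate steps (1) to (4) multiple times to obtain the final model.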

5. The asphalt pavement deflection basin area prediction method based on chaotic particle swarm and XGBoost piecewise regression according to claim 1, characterized in that step 4 is specifically as follows: first, the data are split into a training data set and a test data set, denoted $data_{train}$ and $data_{test}$; the velocities and positions of the particles in the population are randomly initialized, and the iteration count t, the counter SG and the local extremum judgment threshold $SG_{max}$ are initialized; the parameters of the XGBoost piecewise regression model are then defined, with the parameters given by each particle's initial position taken as the initial parameters; the model is trained to obtain the initial fitness value e of each particle, and the initial optimal positions of the particle and of the population, $P_j^{0}$ and $G^{0}$, are calculated; the inertia weight ω of the particle is updated according to the inertia weight formula

$$\omega = \omega_{max} - (\omega_{max} - \omega_{min}) \frac{k}{maxgen}$$

where $\omega_{max}$ and $\omega_{min}$ are the upper and lower limits of the inertia weight, k is the current iteration number and maxgen is the maximum number of iterations; the particle velocity is then updated according to

$$v_{jd}^{k+1} = \omega v_{jd}^{k} + c_1 r_1 \left( P_{jd}^{k} - x_{jd}^{k} \right) + c_2 r_2 \left( G_{d}^{k} - x_{jd}^{k} \right)$$

and the particle position according to

$$x_{jd}^{k+1} = x_{jd}^{k} + v_{jd}^{k+1}$$

where $v_{jd}^{k}$ and $x_{jd}^{k}$ are the velocity and position of the jth particle in the d-dimensional search space at the kth iteration, $c_1$ and $c_2$ are acceleration factors, $r_1$ and $r_2$ are random numbers in [0, 1], and $P_{jd}^{k}$ and $G_{d}^{k}$ are the historical optimal positions of the jth particle and of the population at the kth iteration; finally, the fitness value e of each particle is calculated, and the optimal positions of the particles and of the population are updated to $P_j^{k+1}$ and $G^{k+1}$;

let SG be the number of consecutive iterations in which the optimal position of the population has not been updated: if the optimal position G of the population is not updated, set SG = SG + 1; otherwise set SG = 0; if SG ≤ $SG_{max}$, the formula

$$cx_i^{(t+1)} = 4\, cx_i^{(t)} \left( 1 - cx_i^{(t)} \right)$$

is used to generate the chaotic optimization and SG is set to 0, where $cx_i^{(t)}$ is the value of the chaotic variable $cx_i$ after t iterations; set t = t + 1; if t < T, where T is the maximum number of iterations, return to the calculation of the particle velocities, positions and related parameters; otherwise output G and E.

6. The asphalt pavement deflection basin area prediction method based on chaotic particle swarm and XGBoost piecewise regression according to claim 1, characterized in that step 5 is specifically as follows:

(1) Splitting the data set: the data set obtained in step 1 is randomly divided into a training set and a test set for the experiments; the training set is used to train the piecewise regression model based on chaotic particle swarm and XGBoost, repeatedly escaping local optima and adjusting the parameters so as to obtain the optimal piecewise regression model; the test set is used to test the regressor's predictions and observe its prediction performance;

(2) Setting initial parameters: based on the characteristics of the improved chaotic particle swarm algorithm, and taking into account the parameter ranges of XGBoost and the characteristics of the asphalt pavement structure data, the initial parameters are set: the maximum number of iterations T, the population size M, the initial maximum position $x_{max}$, the initial maximum velocity $v_{max}$ and the local extremum judgment threshold $SG_{max}$; because the chaotic particle swarm algorithm generates the initial population randomly, near-optimal or even optimal parameters can be found with high probability over multiple experiments;

(3) Parameter optimization: when the particles fall into a local optimum, chaotic perturbation is used to escape it; the mean square error (MSE) and the mean absolute percentage error (MAPE) are used as evaluation indexes, and the prediction accuracy is improved by adjusting the correction coefficients.