CN116739376A - Highway pavement preventive maintenance decision method based on data mining - Google Patents

Info

Publication number
CN116739376A
Authority
CN
China
Prior art keywords
road
pavement
road surface
performance
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310691169.5A
Other languages
Chinese (zh)
Inventor
李一锋
肖琨
王静
陈戈
刘进友
何宏国
姜弘
屈国强
路伟
杨家权
崔镜宇
倪强
邓云川
冉惟可
李剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Railway Eryuan Engineering Group Co Ltd CREEC
Original Assignee
China Railway Eryuan Engineering Group Co Ltd CREEC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Railway Eryuan Engineering Group Co Ltd CREEC
Priority to CN202310691169.5A
Publication of CN116739376A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0637 Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/27 Regression, e.g. linear or logistic regression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/20 Administration of product repair or maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • General Health & Medical Sciences (AREA)
  • Fuzzy Systems (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data mining-based preventive maintenance decision method for highway pavement, belonging to the technical field of highway maintenance data analysis. The method comprises the following steps: preprocessing the pavement technical condition records of highway maintenance with a data mining anomaly detection algorithm and marking the anomaly detection results; performing combined prediction on the anomaly detection results using a regression analysis prediction algorithm and a gray system prediction algorithm, and performing association analysis on the anomaly detection results using pavement performance decay factor analysis rules and heavy-duty vehicle proportion influence analysis rules; and establishing a gray matter-element analysis algorithm, determining the road sections to be maintained and their maintenance priority in combination with the combined prediction results, assisting in ranking the maintenance order of those sections, and integrating the association analysis results to assist the maintenance authority in making maintenance decisions. The method improves pavement maintenance efficiency, prolongs pavement service life, reduces maintenance cost, and ultimately improves the level of maintenance management.

Description

Highway pavement preventive maintenance decision method based on data mining
Technical Field
The invention relates to the technical field of road maintenance related data analysis, in particular to a road pavement preventive maintenance decision method based on data mining.
Background
With the development of intelligent transportation, construction and maintenance are the two major themes of highway development. As highway mileage in China continues to grow and the road network expands, the volume of maintenance work grows with it. Highway departments must complete construction tasks to the required quality and quantity in order to extend the coverage of high-grade highways, and must also strengthen the maintenance of existing highways so that pavement serviceability remains good, maintenance costs are reduced, and traffic flows smoothly. The purpose of highway maintenance is to keep the highway in good condition and to prolong its service life. If maintenance cannot keep pace with pavement deterioration, the technical condition of the pavement declines rapidly, the service level of the road inevitably suffers, and the original purpose of building the road is difficult to achieve. As a result, highway development in China is gradually shifting from a construction-dominated stage to a stage in which construction and maintenance advance together. The heavy workload of building, maintaining, reconstructing, and upgrading highways places higher demands on highway maintenance departments to optimize their maintenance decisions.
However, most pavement maintenance decisions for highways in China still follow a corrective, after-the-fact approach: maintenance measures are decided only after the pavement's service capacity has fallen to a certain level. This is a passive maintenance mode with high cost, heavy workload, and low efficiency; maintenance cannot be carried out in time, and secondary damage may occur while the pavement waits for repair.
Informatization in recent years has accumulated a large amount of data, but traditional data analysis methods cannot fully mine its potential value. Maintenance data processing still centers on transmission and storage, application analysis is at an early stage, and the bulk of maintenance data plays no effective guiding role in actual highway maintenance decisions. An important research topic in highway maintenance is therefore how to make better pavement maintenance decisions so as to improve the technical condition of highway pavement, extend the service life of highways, improve pavement serviceability, keep the highway network in good working order and safe and comfortable to use, and obtain the greatest benefit from limited maintenance funds, so that the highway network can better provide high-quality transport services for China's economic development. Data mining discovers knowledge hidden in data. Applying data mining to highway maintenance decisions, and using its data processing capability to fully mine the useful information in the historical data accumulated by highway maintenance, can improve pavement maintenance efficiency, prolong pavement service life, and reduce maintenance cost.
Disclosure of Invention
The invention aims to overcome a defect of the prior art, namely that most pavement maintenance decisions for highways in China still follow a corrective, after-the-fact approach in which maintenance measures are decided only after the pavement's service capacity has fallen to a certain level, which leads to high cost, heavy workload, low maintenance efficiency, untimely maintenance, and possible secondary damage while the pavement waits for repair. To this end, the invention provides a highway pavement preventive maintenance decision method based on data mining.
In order to achieve the above object, the present invention provides the following technical solutions:
A highway pavement preventive maintenance decision method based on data mining comprises the following steps:
S1: preprocessing the real-time, massive pavement technical condition records of highway maintenance using a data mining anomaly detection algorithm, and analyzing in depth and marking the anomaly detection results;
S2: performing combined prediction on the anomaly detection results using a regression analysis prediction algorithm and a gray system prediction algorithm to obtain a combined prediction result, and performing association analysis on the anomaly detection results using pavement performance decay factor analysis rules and heavy-duty vehicle proportion influence analysis rules to obtain an association analysis result;
S3: establishing a gray matter-element analysis algorithm, determining the road sections to be maintained and their maintenance priority in combination with the combined prediction result, assisting in ranking the maintenance order and maintenance type of those sections, and integrating the association analysis result to assist the maintenance authority in making maintenance decisions.
With this technical scheme, data mining is applied to highway maintenance decisions. Targeted preventive maintenance can be carried out according to the main factors driving the decay of the current pavement indexes, enabling forward-looking and preventive maintenance of major highway electromechanical equipment. The distribution and trend of equipment faults can be obtained, the sections requiring maintenance can be analyzed accurately, maintenance efficiency is improved, and the risk of pavement distress spreading is reduced.
As a preferred embodiment of the present invention, the step S1 includes:
S11: scanning the pavement performance data set from the database, randomly extracting a plurality of sub-samples, constructing a plurality of isolation trees (iTrees), and thereby building and training the pavement performance record detection model;
S12: evaluating the pavement performance records for anomalies using the pavement performance record detection model.
As a preferred embodiment of the present invention, the step S11 includes:
S111: randomly acquiring a plurality of sub-samples;
S112: randomly selecting one of the four attributes of the pavement technical condition data (pavement technical grade, comprehensive pavement performance index, pavement damage index, and pavement riding quality index) as the split attribute, randomly selecting a value between the minimum and maximum of that attribute's value range as the split value, and then dividing the sub-samples into left and right subtrees according to the split value;
S113: repeating step S112 on the sub-samples of the left and right subtrees until each sub-sample set contains only one record;
S114: repeating steps S111 to S113 until a sufficient number of iTrees have been built to form an isolation forest.
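Steps S111 to S114 can be sketched in Python as follows. This is a minimal illustration and not the patent's implementation: the field names `grade`, `pqi`, `pci`, and `rqi`, the numeric encoding of the technical grade, the depth cap, and the default tree count and sample size are all assumptions introduced here. The sketch also restricts the random attribute choice to attributes that still vary within the current sub-sample, a small deviation from S112 that avoids splits which cannot separate any records.

```python
import random

# Hypothetical field names for the four attributes listed in step S112.
ATTRIBUTES = ("grade", "pqi", "pci", "rqi")


def build_itree(records, rng, depth=0, max_depth=64):
    """Recursively split records until each leaf holds at most one record."""
    if len(records) <= 1 or depth >= max_depth:
        return {"size": len(records)}
    # Only attributes that still vary can separate the remaining records.
    splittable = [a for a in ATTRIBUTES
                  if min(r[a] for r in records) < max(r[a] for r in records)]
    if not splittable:
        return {"size": len(records)}  # identical records cannot be isolated
    attr = rng.choice(splittable)
    lo = min(r[attr] for r in records)
    hi = max(r[attr] for r in records)
    split = rng.uniform(lo, hi)  # random split value inside [min, max]
    left = [r for r in records if r[attr] < split]
    right = [r for r in records if r[attr] >= split]
    return {"attr": attr, "split": split,
            "left": build_itree(left, rng, depth + 1, max_depth),
            "right": build_itree(right, rng, depth + 1, max_depth)}


def build_iforest(dataset, n_trees=100, sample_size=256, seed=0):
    """Draw a random sub-sample per tree (S111) and build the forest (S114)."""
    rng = random.Random(seed)
    size = min(sample_size, len(dataset))
    return [build_itree(rng.sample(dataset, size), rng) for _ in range(n_trees)]


def count_records(node):
    """Total number of records stored in the leaves below a node."""
    if "size" in node:
        return node["size"]
    return count_records(node["left"]) + count_records(node["right"])
```

Each tree is a nested dictionary, so a record's path length can later be read off by walking from the root to the leaf that contains it.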
As a preferred embodiment of the present invention, the step S12 includes: obtaining the pavement performance records that require anomaly detection, computing the path length of each record in every iTree according to the pavement performance record detection model, and calculating the anomaly index of each record; after the anomaly indexes of all pavement performance records have been computed, dividing them into two clusters, wherein the cluster with the larger center is the abnormal class and its records are marked as abnormal, and the cluster with the smaller center is the normal class and its records are marked as normal.
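The two-class split of the anomaly indexes can be illustrated with a one-dimensional two-means clustering. The text does not name the clustering algorithm, so the use of two-means here is an assumption, and the function name `label_by_two_means` is hypothetical.

```python
def label_by_two_means(scores, max_iter=100):
    """Split anomaly indexes into two clusters and label each score.

    Scores nearer the larger cluster center are labelled "abnormal",
    the rest "normal". Two-means is an assumed choice of algorithm.
    """
    low_center, high_center = min(scores), max(scores)
    for _ in range(max_iter):
        high = [s for s in scores
                if abs(s - high_center) < abs(s - low_center)]
        low = [s for s in scores
               if abs(s - high_center) >= abs(s - low_center)]
        new_low = sum(low) / len(low)
        new_high = sum(high) / len(high) if high else high_center
        if (new_low, new_high) == (low_center, high_center):
            break  # centers have stopped moving
        low_center, high_center = new_low, new_high
    return ["abnormal" if abs(s - high_center) < abs(s - low_center)
            else "normal" for s in scores]
```

If every score is identical, the sketch labels all records as normal, since no record is closer to a distinct larger center.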
As a preferred embodiment of the present invention, the path length is calculated according to the following formula:
$$h(x) = L + C(n)$$
where $h(x)$ is the path length, $x$ is the pavement performance record, $L$ is the height in the binary tree of the leaf node in which the record finally lands, $n$ is the number of records contained in that leaf node, and $C(n)$ is the average height of a binary tree with $n$ nodes:
$$C(n) = 2H(n-1) - \frac{2(n-1)}{n}$$
where $H(n) = \ln(n) + \gamma$ for $n > 1$ is the estimate of the harmonic number, and $\gamma$ is Euler's constant, also called the Euler-Mascheroni constant, set to 0.5772156649;
the calculation formula of the anomaly index is as follows:
$$s(x, n) = 2^{-\frac{E(h(x))}{C(n)}}$$
where $n$ is the total number of nodes $\phi$ of the iTree, $C(n)$ is the average height of a binary tree with $n$ nodes, $E(h(x))$ is the average path length of the record $x$ over all trees, and $h(x)$ is the length of the path between the root node and the leaf node holding $x$. After the anomaly index has been calculated, it is used to judge whether the test record $x$ is an anomaly.
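The path length and anomaly index can be sketched as below. The sketch assumes the standard isolation forest normalization, C(n) = 2H(n-1) - 2(n-1)/n with H(i) approximated by ln(i) + γ, and the common convention C(2) = 1 and C(n) = 0 for n ≤ 1; these details and the function names are assumptions, since the formula images are not present in the extracted text.

```python
import math

EULER_GAMMA = 0.5772156649  # value of Euler's constant quoted in the text


def harmonic(i):
    """Harmonic number estimate H(i) = ln(i) + gamma."""
    return math.log(i) + EULER_GAMMA


def average_path_length(n):
    """C(n): average height of a binary tree built from n records."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0  # assumed convention for the two-record case
    return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n


def path_length(depth, leaf_size):
    """h(x) = L + C(n): leaf depth plus a correction for unsplit records."""
    return depth + average_path_length(leaf_size)


def anomaly_index(path_lengths, sample_size):
    """s(x, n) = 2 ** (-E(h(x)) / C(n)); needs sample_size >= 2."""
    mean_length = sum(path_lengths) / len(path_lengths)
    return 2.0 ** (-mean_length / average_path_length(sample_size))
```

Under this normalization, records that are isolated after only a few splits receive an index close to 1, while records whose average path length equals C(n) receive 0.5.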
As a preferred embodiment of the present invention, the calculation formula of the regression analysis prediction algorithm in step S2 is as follows:
$$PPI = PPI_0\left\{1 - \exp\left[-\left(\frac{\alpha}{t}\right)^{\beta}\right]\right\}$$
where $PPI$ is the pavement performance index, $PPI_0$ is its initial value, $\alpha$ is the pavement life factor, namely the pavement age at which pavement performance has decayed to 63.2% of the initial value, $\beta$ is the shape factor of the pavement performance decay curve, and $t$ is the pavement age;
variable substitution is applied to the formula of the regression analysis prediction algorithm: let $\ln t = x$, $\beta = a$, $\beta \ln \alpha^{-1} = b$, and $-\ln\left[-\ln\left(1 - PPI/PPI_0\right)\right] = y$; the resulting transformed formula is:
$$y = ax + b$$
the values of the parameters $a$ and $b$ are estimated by the least squares method, and the calculation formulas are respectively as follows:
$$a = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}, \qquad b = \frac{1}{n}\left(\sum_{i=1}^{n} y_i - a\sum_{i=1}^{n} x_i\right)$$
where $(x_i, y_i)$ are obtained from the corresponding historical pavement performance index data $PPI$ and pavement age $t$ through the substitution above. Once the estimates of $a$ and $b$ have been obtained, the values of $\alpha$ and $\beta$ follow from the relations $\beta = a$ and $\beta \ln \alpha^{-1} = b$; $n$ is the number of observation point groups, namely the number of historical pavement performance index data sets.
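The linearized least squares fit can be sketched as follows. The sketch assumes the decay model PPI = PPI0·{1 − exp[−(α/t)^β]}, which matches the stated 63.2% property at t = α but is not shown in the extracted text; the function names are hypothetical. Every observed PPI must lie strictly between 0 and PPI0 for the logarithms to be defined.

```python
import math


def predict_ppi(t, ppi0, alpha, beta):
    """Assumed decay model: PPI = PPI0 * (1 - exp(-(alpha / t) ** beta))."""
    return ppi0 * (1.0 - math.exp(-((alpha / t) ** beta)))


def fit_decay_model(ages, ppis, ppi0):
    """Estimate (alpha, beta) by least squares on the linearized model.

    x = ln t, y = -ln(-ln(1 - PPI / PPI0)), y = a * x + b,
    with beta = a and beta * ln(1 / alpha) = b.
    """
    xs = [math.log(t) for t in ages]
    ys = [-math.log(-math.log(1.0 - p / ppi0)) for p in ppis]
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    b = (sum_y - a * sum_x) / n
    beta = a
    alpha = math.exp(-b / a)  # from b = -beta * ln(alpha)
    return alpha, beta
```

A fitted pair (α, β) can then be passed back to `predict_ppi` with a future pavement age to obtain the regression forecast.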
As a preferred embodiment of the present invention, the calculation formula of the gray system prediction algorithm in step S2 is as follows:
$$x_0(k) + m\,z_1(k) = q$$
where $m$ and $q$ are the parameters to be estimated, $x_0(k)$ is the actual state of the pavement performance index data at time $k$, and $z_1(k)$ is the predicted state of the pavement performance index data at time $k$;
the whitening formula of the grey system prediction algorithm is as follows:
where d is the length of the prediction error sequence, X 1 Is a prediction error sequence;
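the estimation formulas for the parameters $m$ and $q$ obtained by the least squares method are respectively as follows:
$$m = \frac{\sum_{k} x_0(k)\sum_{k} z_1(k) - n\sum_{k} x_0(k)\,z_1(k)}{n\sum_{k} z_1(k)^2 - \left(\sum_{k} z_1(k)\right)^2}, \qquad q = \frac{1}{n}\left[\sum_{k} x_0(k) + m\sum_{k} z_1(k)\right]$$
where $n$ is the number of observation point groups, namely the number of historical pavement performance index data sets, and the sums run over those $n$ groups.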
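A minimal sketch of the parameter estimation follows, assuming the conventional GM(1,1) construction named in the figure descriptions: the series is accumulated, z1(k) is taken as the mean of consecutive accumulated values, and forecasts are restored from the exponential time-response function. These construction details and the function names are assumptions not spelled out in the text, and the sketch requires a fitted m different from zero.

```python
import math


def gm11_fit(x0):
    """Least squares estimate of m and q in x0(k) + m * z1(k) = q."""
    accumulated, total = [], 0.0
    for value in x0:
        total += value
        accumulated.append(total)
    # Assumed background values: mean of consecutive accumulated values.
    z1 = [0.5 * (accumulated[k] + accumulated[k - 1])
          for k in range(1, len(x0))]
    obs = x0[1:]
    n = len(z1)
    sum_z, sum_x = sum(z1), sum(obs)
    sum_zx = sum(z * x for z, x in zip(z1, obs))
    sum_zz = sum(z * z for z in z1)
    m = (sum_x * sum_z - n * sum_zx) / (n * sum_zz - sum_z ** 2)
    q = (sum_x + m * sum_z) / n
    return m, q


def gm11_predict(first_value, m, q, k):
    """Restored forecast for step k >= 1 from the time-response function."""
    def accumulated_hat(i):
        return (first_value - q / m) * math.exp(-m * i) + q / m
    return accumulated_hat(k) - accumulated_hat(k - 1)
```

For a series that decays roughly geometrically, such as a slowly declining pavement index, the restored forecasts track the observations closely.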
As a preferred embodiment of the present invention, the calculation formula of the combined prediction in step S2 is as follows:
$$Y(t) = \sum_{j=1}^{2} w_j Y_j(t) = w_1 Y_1(t) + w_2 Y_2(t)$$
where $Y(t)$ is the combined predicted value, $Y_1(t)$ is the predicted value of the regression analysis prediction algorithm, $Y_2(t)$ is the predicted value of the gray system prediction algorithm, and $w_j$ are the weight parameters: $w_1$ is the weight of the regression prediction result and $w_2$ is the weight of the gray system prediction result.
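The weighted combination can be sketched as below. The text does not state how the weights are chosen; weighting each model by the other model's sum of squared historical errors (equivalent to inverse-error weighting) is one common choice and is purely an assumption here, as are the function names.

```python
def inverse_error_weights(actual, pred_regression, pred_gray):
    """Assumed weighting: the model with smaller squared error gets more weight."""
    sse_reg = sum((a - p) ** 2 for a, p in zip(actual, pred_regression))
    sse_gray = sum((a - p) ** 2 for a, p in zip(actual, pred_gray))
    total = sse_reg + sse_gray
    if total == 0:
        return 0.5, 0.5  # both models fit perfectly
    return sse_gray / total, sse_reg / total


def combine(pred_regression, pred_gray, w1, w2):
    """Y(t) = w1 * Y1(t) + w2 * Y2(t)."""
    return w1 * pred_regression + w2 * pred_gray
```

The two weights returned by `inverse_error_weights` always sum to 1, so the combined forecast stays between the two individual forecasts.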
As a preferred embodiment of the present invention, the rule of association analysis described in the step S2 includes:
Rule 1: when the proportion of heavy-duty vehicles in the highway traffic flow reaches 30%, the probability of the comprehensive pavement performance index grade is 0.99;
Rule 2: when the proportion of heavy-duty vehicles reaches 40%, the probability that the comprehensive pavement performance index (PQI) grade drops to the poor grade is 0.99;
Rule 3: when the proportion of heavy-duty vehicles reaches 50%, the probability that the PQI grade drops to the secondary grade is 0.97;
Rule 4: when the proportion of heavy-duty vehicles reaches 40%, the probability that the pavement damage index is of the poor grade is 0.96;
Rule 5: when the proportion of heavy-duty vehicles reaches 50%, the probability that the pavement damage index drops to the secondary grade is 0.95;
Rule 6: when the proportion of heavy-duty vehicles reaches 50%, the probability that the riding quality index is poor is 0.90;
where Rules 1, 2, and 3 indicate that the comprehensive pavement performance index is negatively correlated with the proportion of heavy-duty vehicles;
Rules 4 and 5 indicate that an excessive proportion of heavy-duty vehicles causes more serious pavement damage;
Rule 6 indicates that an excessive proportion of heavy-duty vehicles is also an important factor in the decline of the pavement riding quality index.
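The probability attached to each rule corresponds to the confidence of an association rule of the form "heavy-duty vehicle share at or above a threshold implies a given index grade". A minimal sketch of how support and confidence are computed from records is shown below; the record layout, the function name, and the sample values are hypothetical and do not come from the patent's data.

```python
def rule_metrics(records, threshold, grade):
    """Support and confidence of: heavy share >= threshold => index grade.

    Each record is a (heavy_vehicle_share, grade_label) pair.
    """
    matching_antecedent = [r for r in records if r[0] >= threshold]
    matching_both = [r for r in matching_antecedent if r[1] == grade]
    support = len(matching_both) / len(records)
    confidence = (len(matching_both) / len(matching_antecedent)
                  if matching_antecedent else 0.0)
    return support, confidence
```

A confidence of 0.99 for Rule 2, for example, would mean that 99% of the sections whose heavy-duty share reached 40% had a PQI grade of poor.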
As a preferred embodiment of the present invention, establishing the gray matter-element analysis algorithm in step S3 and determining the road sections to be maintained and the maintenance priority in combination with the combined prediction result includes the following steps:
S31: establishing the road section object gray element:
where $C_1$ is the highway maintenance quality index (MQI), $C_2$ is the comprehensive pavement performance index (PQI), $C_3$ is the pavement damage index (PCI), $C_4$ is the pavement riding quality index (RQI), $D_n$ is the number of columns of the matrix, i.e. the number of index data, and $a_{n1}$ to $a_{n4}$ are the matrix column data corresponding to the specified pavement indexes;
S32: for each attribute, selecting its maximum value among the $n$ road sections as the value of that attribute in the gray element of the best road section, thereby forming the gray element of the best object:
S33: according to the existing basic data for evaluating the technical condition of road sections and the related road section evaluation system, building the 4-dimensional composite gray element of correlation coefficients between each road section and the best road section:
where $L_{n1}$ to $L_{n4}$ are the matrix column data corresponding to the specified pavement indexes;
S34: the calculation formula of the association value is as follows:
$$L_{ij} = \frac{\Delta_{\min} + p\,\Delta_{\max}}{\Delta_{ij} + p\,\Delta_{\max}}$$
where $L_{ij}$ is the association value between the $j$-th road section and the best road section for the $i$-th characteristic parameter, $\Delta_{ij}$ is the difference between the gray element of the best road section and the $j$-th road section in the $i$-th characteristic, $\Delta_{\min}$ is the minimum of the differences in every characteristic index between the best road section and all other road sections, $\Delta_{\max}$ is the maximum of those differences, and the parameter $p$ is the resolution coefficient, usually set to 0.5;
S35: integrating the discrete association coefficients to obtain the association-degree composite gray element:
$$L_{0j} = \sum_{i=1}^{4} w_i L_{ij}$$
where $L_{0j}$ is the association degree between each road section to be evaluated and the best road section, and $w_i$ is the weight coefficient of each characteristic attribute;
S36: in the association-degree composite gray element thus obtained, each value represents the degree of similarity between the corresponding road section and the gray element of the best road section. The larger the value, the better the pavement performance of the corresponding section; the smaller the value, the worse. Since the most basic principle of road section maintenance is that sections with worse pavement performance are maintained first, the maintenance priority of each section is inversely related to its association degree with the gray element of the best road section: the smaller the association degree, the higher the maintenance priority, and the larger the association degree, the lower the maintenance priority.
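Steps S31 to S36 can be sketched as a gray relational ranking. The sketch is illustrative only: the function name, the equal default weights, the use of absolute differences, and the sample index values are assumptions, and no normalization of the indexes is applied because the text does not describe one.

```python
def gray_relational_ranking(sections, weights=None, p=0.5):
    """Rank road sections for maintenance by gray relational degree.

    sections maps a section name to its index values, e.g. [MQI, PQI, PCI, RQI].
    Returns (degrees, order) with order listing highest priority first.
    """
    names = list(sections)
    dims = len(sections[names[0]])
    if weights is None:
        weights = [1.0 / dims] * dims  # assumed equal weights
    # S32: the best section takes the maximum of each index.
    best = [max(sections[s][i] for s in names) for i in range(dims)]
    deltas = {s: [abs(best[i] - sections[s][i]) for i in range(dims)]
              for s in names}
    all_deltas = [d for row in deltas.values() for d in row]
    d_min, d_max = min(all_deltas), max(all_deltas)
    degrees = {}
    for s in names:
        if d_max == 0:
            coeffs = [1.0] * dims  # every section equals the best section
        else:
            # S34: association value for each index.
            coeffs = [(d_min + p * d_max) / (d + p * d_max) for d in deltas[s]]
        # S35: weighted association degree.
        degrees[s] = sum(w * c for w, c in zip(weights, coeffs))
    # S36: smaller association degree means higher maintenance priority.
    order = sorted(names, key=lambda s: degrees[s])
    return degrees, order
```

With three hypothetical sections whose index values are A = [90, 92, 91, 93], B = [80, 78, 75, 82], and C = [85, 88, 86, 90], section B has the smallest association degree and is ranked first for maintenance, followed by C and then A.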
Compared with the prior art, the invention has the following beneficial effects: it helps highway maintenance departments plan preventive pavement maintenance in advance and make reasonable maintenance decisions, and it uses the data processing capability of data mining to fully mine the useful information in the historical data accumulated by highway maintenance, thereby improving pavement maintenance efficiency, prolonging pavement service life, reducing maintenance cost, and ultimately improving the level of highway maintenance management.
Drawings
FIG. 1 is a flow chart of a data mining-based highway pavement preventive maintenance decision method according to the present invention;
FIG. 2 is a flowchart of abnormal pavement performance data detection based on iForest of the pavement preventive maintenance decision method based on data mining according to embodiment 2 of the present invention;
FIG. 3 is an attribute table of an iForest model of an asphalt pavement based on a data mining-based preventive maintenance decision method for highway pavement according to the embodiment 2 of the present invention;
FIG. 4 is a table of the properties of an iForest model of a pavement of a highway based on the data mining preventive maintenance decision method of embodiment 2 of the present invention;
FIG. 5 is a data diagram of the technical status of a cement pavement based on the preventive maintenance decision method for highway pavement based on data mining according to the embodiment 2 of the present invention;
FIG. 6 is an iTree sub-sample set diagram of a data mining-based highway pavement preventive maintenance decision method according to embodiment 2 of the present invention;
FIG. 7 is a construction flow chart of pavement technical condition data iTree of a pavement preventive maintenance decision method based on data mining according to the embodiment 2 of the present invention;
FIG. 8 is a path length flowchart of a road surface technical condition record to be detected in a road surface preventive maintenance decision method based on data mining according to embodiment 2 of the present invention;
FIG. 9 is a flowchart of the overall pavement performance prediction according to the data mining-based pavement preventive maintenance decision method of embodiment 2 of the present invention;
FIG. 10 is a chart showing the predicted value and the error with the true value of the PCI regression model of the method for determining preventive maintenance of highway pavement based on data mining according to the embodiment 2 of the present invention;
FIG. 11 is a graph of comparison of actual values of PCI predictions and trend of variation under a regression model of a method for decision-making preventive maintenance of highway pavement based on data mining according to example 2 of the present invention;
FIG. 12 is a graph of PCI predicted values and errors from actual values under a GM (1, 1) model of a method for preventive maintenance decision of highway pavement based on data mining according to the embodiment 2 of the present invention;
FIG. 13 is a graph showing the comparison of the actual values of PCI predictions and the trend of change under the GM (1, 1) model of a method for determining preventive maintenance of highway pavement based on data mining according to example 2 of the present invention;
FIG. 14 is a graph of a road surface use performance combination prediction result of a data mining-based road surface preventive maintenance decision method according to embodiment 2 of the present invention;
FIG. 15 is a comparison chart of the combined prediction results of a data mining-based highway pavement preventive maintenance decision method according to the embodiment 2 of the present invention;
FIG. 16 is a flowchart of the Apriori algorithm of the method for preventive maintenance decision of highway pavement based on data mining according to the embodiment 2 of the present invention;
FIG. 17 is a table of internal influence association rules of pavement performance indexes of a pavement preventive maintenance decision method based on data mining according to embodiment 2 of the present invention;
FIG. 18 is a table of association rules of traffic flow on road surface performance impact according to the data mining-based road surface preventive maintenance decision method of embodiment 2 of the present invention;
FIG. 19 is a maintenance decision optimization flow chart of a highway pavement preventive maintenance decision method based on data mining according to the embodiment 2 of the present invention;
FIG. 20 is a table of asphalt pavement performance data of a preventive maintenance decision method for highway pavement based on data mining according to the embodiment 2 of the present invention;
FIG. 21 is a road section basic data table for the practical application of the gray matter-element analysis of the data mining-based road surface preventive maintenance decision method according to embodiment 2 of the present invention;
FIG. 22 is a typical maintenance unit price table of a data mining-based highway pavement preventive maintenance decision method according to the embodiment 2 of the present invention;
FIG. 23 is a comparative table of maintenance fund requirements under different decision methods of a data mining-based highway pavement preventive maintenance decision method according to embodiment 2 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to test examples and specific embodiments. It should not be construed that the scope of the above subject matter of the present invention is limited to the following embodiments, and all techniques realized based on the present invention are within the scope of the present invention.
Example 1
A highway pavement preventive maintenance decision method based on data mining, as shown in FIG. 1, comprises the following steps:
S1: preprocessing the real-time, massive pavement technical condition records of highway maintenance using a data mining anomaly detection algorithm, and analyzing in depth and marking the anomaly detection results;
S2: performing combined prediction on the anomaly detection results using a regression analysis prediction algorithm and a gray system prediction algorithm to obtain a combined prediction result, and performing association analysis on the anomaly detection results using pavement performance decay factor analysis rules and heavy-duty vehicle proportion influence analysis rules to obtain an association analysis result;
S3: establishing a gray matter-element analysis algorithm, determining the road sections to be maintained and their maintenance priority in combination with the combined prediction result, assisting in ranking the maintenance order and maintenance type of those sections, and integrating the association analysis result to assist the maintenance authority in making maintenance decisions.
The core of the iForest algorithm is to construct a random forest composed of isolation trees (IsolationTree, abbreviated iTree). The iForest algorithm operates on nodes; one node corresponds to one record in the database and is composed of the values of several attributes. Let d denote a data node, A an attribute of a data node, $\psi_i$ a randomly drawn set of data nodes, called a sample set, and D the set of all data nodes, i.e. the database. With these definitions, $d_j = \{A_1, A_2, A_3, \ldots, A_n\}$ denotes one node (record) together with the attributes it contains, A denotes the attribute set, and $D = \{d_1, d_2, d_3, \ldots, d_n\}$ is the set of all data records.
(1)IsolationTree
IsolationTree (iTree) is a random binary tree in which each node is either a leaf node or an internal node with two children. An iTree is constructed as follows:
(1) Randomly extract ψ data nodes from the data set D as the sample set $\psi_i$ used to construct the iTree.
(2) Randomly select an attribute $A_i$ from the attribute set A of the sample as the split attribute, and randomly select a value p between the minimum and maximum values of $A_i$ as the split value.
(3) Divide each data node $d_j$ of the sample set $\psi_i$ according to its value $d_j(A_i)$ of attribute $A_i$. If $d_j(A_i) < p$, data node $d_j$ is placed in the left subtree set $\psi_L$; if $d_j(A_i) \ge p$, it is placed in the right subtree set $\psi_R$.
(4) Process the left subtree set $\psi_L$ and the right subtree set $\psi_R$ iteratively in the manner of step (3) until one of the following conditions is satisfied: a) the subtree contains only one record, or all records in it have the same attribute values; b) the height of the binary tree reaches the specified height $\log_2(\psi)$.
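The construction steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `build_itree` and `count_records` are hypothetical, records are assumed to be dictionaries of numeric attributes, the height limit is rounded up to an integer, and the split attribute is drawn only from attributes that still vary in the current subset (a convenience choice that avoids empty splits). Apart from the two records quoted in the description, the sample values are invented for illustration.

```python
import math
import random

def build_itree(records, attributes, height=0, max_height=None, rng=random):
    """Build one isolation tree from a list of dict records (steps (1)-(4))."""
    if max_height is None:
        # Height limit log2(psi), with psi the size of the sample set (rounded up).
        max_height = math.ceil(math.log2(max(len(records), 2)))
    # Stop condition a): at most one record; condition b): height limit reached.
    if len(records) <= 1 or height >= max_height:
        return {"size": len(records)}
    # Attributes that still take more than one value in this subset.
    splittable = [a for a in attributes
                  if min(r[a] for r in records) < max(r[a] for r in records)]
    if not splittable:
        # Stop condition a): all records share the same attribute values.
        return {"size": len(records)}
    attr = rng.choice(splittable)                      # random split attribute
    values = [r[attr] for r in records]
    p = rng.uniform(min(values), max(values))          # random split value
    left = [r for r in records if r[attr] < p]         # left subtree set
    right = [r for r in records if r[attr] >= p]       # right subtree set
    return {"attr": attr, "split": p,
            "left": build_itree(left, attributes, height + 1, max_height, rng),
            "right": build_itree(right, attributes, height + 1, max_height, rng)}

def count_records(node):
    """Total number of records stored in the leaves below a node."""
    if "size" in node:
        return node["size"]
    return count_records(node["left"]) + count_records(node["right"])

# Two records quoted in the description plus hypothetical ones for illustration.
sample = [
    {"TL": 1, "PQI": 97.13, "PCI": 100.00, "RQI": 94.86},
    {"TL": 1, "PQI": 89.65, "PCI": 88.64, "RQI": 89.07},
    {"TL": 2, "PQI": 93.10, "PCI": 95.20, "RQI": 91.40},  # hypothetical
    {"TL": 0, "PQI": 96.40, "PCI": 97.80, "RQI": 95.10},  # hypothetical
    {"TL": 2, "PQI": 71.30, "PCI": 65.00, "RQI": 78.20},  # hypothetical
    {"TL": 1, "PQI": 94.70, "PCI": 96.10, "RQI": 93.30},  # hypothetical
]
tree = build_itree(sample, ["TL", "PQI", "PCI", "RQI"], rng=random.Random(0))
```

Every record ends up in exactly one leaf, so `count_records(tree)` equals the sample size regardless of the random choices made.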
(2) Path length h (x)
h(x) is the height of a data record x in an iTree binary tree. To compute it, record x starts from the root node of the binary tree and traverses the tree according to its structure until the leaf node containing x is found; h(x) is then calculated by the path length formula (3-1).
$h(x) = L + C(n)$ (3-1)
where L is the height in the binary tree of the leaf node in which x finally lands, n is the number of records contained in that leaf node, and C(n) is the average height of a binary tree containing n nodes, as given by formula (3-2).
In formula (3-2), $H(n) = \ln(n) + \gamma$ (n > 1), where γ is the Euler constant, also called the Euler-Mascheroni constant, with value 0.5772156649; h is the height of a record in the binary tree structure, and n is the number of records contained in the leaf node.
(3) Abnormality index
The anomaly index is a value representing the degree of abnormality of a record; it is calculated by formula (3-3):
where n is the total number of nodes in the iTree. After the anomaly index has been calculated by formula (3-3), whether the test record x is abnormal can be judged from it.
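For orientation, the quantities in formulas (3-1) to (3-3) can be sketched as below. The sketch assumes the forms used in the standard isolation forest formulation, namely C(n) = 2H(n-1) - 2(n-1)/n and s(x, n) = 2^(-E[h(x)]/C(n)), which match the symbols defined above but are not reproduced in the text itself; the function names and the special cases for n ≤ 2 are likewise assumptions.

```python
import math

EULER_GAMMA = 0.5772156649  # value quoted in the description

def harmonic(n):
    """H(n) = ln(n) + gamma, the approximation given in the description."""
    return math.log(n) + EULER_GAMMA

def average_height(n):
    """C(n): average height of a binary tree with n nodes (assumed standard form)."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0  # common convention for two nodes, an assumption here
    return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n

def path_length(leaf_height, leaf_size):
    """Formula (3-1): h(x) = L + C(n) for a leaf at height L holding n records."""
    return leaf_height + average_height(leaf_size)

def anomaly_index(path_lengths, n):
    """Assumed form of (3-3): 2 ** (-mean path length / C(n))."""
    mean_h = sum(path_lengths) / len(path_lengths)
    return 2.0 ** (-mean_h / average_height(n))
```

Under these assumptions, a record whose mean path length equals C(n) scores exactly 0.5, while shorter paths push the index toward 1 and longer paths toward 0, which is the behaviour the classification step relies on.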
The invention classifies the final anomaly scores using the k-means clustering method to judge whether a target record is abnormal. The basic criterion is that the class whose anomaly index is close to 1 is the abnormal class, and the class whose anomaly index is close to 0 is the normal class.
The iForest-based anomaly detection flow for pavement performance data is divided into two stages: a model construction stage and an anomaly evaluation stage. The main task of the model construction stage is to generate a sufficient number of iTrees from the original pavement performance data set, which is an implicit process of extracting data features. The anomaly evaluation stage uses the generated iTrees to calculate the anomaly score of each pavement performance record to be detected, and then judges from the anomaly score whether the record is abnormal. The specific flow is shown in FIG. 2.
The specific description is as follows:
(1) Scan the pavement performance data set from the database, randomly extract a certain number of subsamples, and construct an iTree in the manner described above. If the number of iTrees has not reached the required number, continue extracting subsamples and constructing iTrees; once it has, the model construction stage ends and the anomaly evaluation stage for pavement performance records begins.
(2) Obtain a pavement performance record x that requires anomaly detection, find the path length of x in each iTree in the manner described above, and then calculate the anomaly index of record x by formula (3-3). After the anomaly indexes of all pavement performance records have been obtained, they are divided into two classes by the k-means clustering method. The class with the larger cluster center is the abnormal class, and the corresponding pavement performance records are marked as abnormal; the class with the smaller cluster center is the normal class, and the corresponding records are marked as normal.
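The two-class split of the anomaly indexes described in step (2) can be sketched as a one-dimensional k-means with k = 2. This is an illustrative sketch only: the function name `two_means`, the initialisation at the minimum and maximum score, and the example scores are assumptions, not values from the description.

```python
def two_means(scores, iterations=100):
    """Split 1-D anomaly indexes into two clusters; return flags and centers."""
    lo, hi = min(scores), max(scores)          # initial cluster centers
    for _ in range(iterations):
        low = [s for s in scores if abs(s - lo) <= abs(s - hi)]
        high = [s for s in scores if abs(s - lo) > abs(s - hi)]
        new_lo = sum(low) / len(low)           # low cluster is never empty
        new_hi = sum(high) / len(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):       # centers stopped moving
            break
        lo, hi = new_lo, new_hi
    # A record is flagged abnormal when it is closer to the larger center.
    flags = [abs(s - lo) > abs(s - hi) for s in scores]
    return flags, (lo, hi)

# Hypothetical anomaly indexes for six records.
example_scores = [0.42, 0.45, 0.44, 0.81, 0.43, 0.78]
example_flags, example_centers = two_means(example_scores)
```

With these example values the fourth and sixth records fall in the cluster with the larger center and would be marked abnormal.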
When mining abnormal data from highway pavement performance records, suitable attribute indexes should first be selected so that they match the characteristics that current highway maintenance decisions focus on. Next, the subsample size T of each tree to be constructed in the model should be determined, so that the distribution characteristics of the pavement performance data are captured as fully as possible without increasing the modeling difficulty.
The maintenance technical condition data provided by a typical highway comprise many attributes, such as the route code, start and end stake numbers, detection direction, pavement type, pavement condition index (PCI), riding quality index (RQI), rutting depth index (RDI), subgrade technical condition index (SCI) and roadside facility technical condition index (TCI). In line with the object of the invention, the data attributes related to pavement performance are mainly studied. Pavement performance has several evaluation attributes: there is a comprehensive index, the pavement performance quality index PQI, and there are individual indexes that make up pavement performance, such as the riding quality index RQI.
The choice of attribute indexes for data mining analysis is constrained by the available system data and directly affects the subsequent work. Considering how road condition detection data are actually collected on ordinary highways, different attributes are selected for asphalt pavement and cement pavement to build the models. a. Asphalt pavement model attributes: road technical grade, pavement performance quality index (PQI), pavement condition index (PCI), riding quality index (RQI) and rutting depth index (RDI), as shown in FIG. 3; b. Cement pavement model attributes: road technical grade, pavement performance quality index (PQI), pavement condition index (PCI) and riding quality index (RQI), as shown in FIG. 4.
The skid resistance index is collected only by selective detection and is incomplete for most road sections, so it is outside the scope of the invention.
The invention handles two kinds of pavement data, asphalt pavement and cement pavement, and the process of training the two models is basically the same, so the iForest model construction for cement pavement is used to describe the processing in the model training stage. The most important task in the iForest model training stage is to construct a sufficient number of iTrees. Assume that a part of the cement pavement technical condition records is randomly extracted from the database as shown in FIG. 5. The road technical grade attribute (TL) is mapped to numerical values ("primary" = 1, "secondary" = 2, "high-speed" = 0), and the start and end stake numbers, which are not model attributes, are ignored. This yields a standard subsample set for constructing one iTree binary tree, as shown in FIG. 6. The subsample size T generally chosen in practical applications is 256; here the model training process is described in detail using 20 records as the subsample.
The specific processing procedure of the model training stage for the technical condition data of the road surface is described as follows:
A1: The 20 pavement technical condition records of the subsample are collected randomly by sampling without replacement; after these 20 records have been used to build an iTree, they are no longer selected during the construction of subsequent iTrees.
A2: One attribute is randomly selected from the four attributes TL, PQI, PCI and RQI of the pavement technical condition data as the split attribute of the current node, a value between the minimum and maximum of that attribute's value range is selected as the split value p, and the current records are then divided into left and right subtrees according to their values of the split attribute. In the first step the selected split attribute is PCI and the split value is 92.00; dividing the 20 records accordingly, the subsample set in the left subtree is {13, 14, 16, 19, 20} and the subsample set in the right subtree is {1, 2, 3, ..., 12, 15, 17, 18}.
A3: Step A2 is repeated for the subsamples of the left and right subtrees until a subsample set contains only one record, all records in it are identical, or the maximum height limit $\log_2(20)$ is reached, at which point construction of the current iTree ends.
A4: Steps A1 to A3 are repeated until a sufficient number of iTrees have been built to form an isolation forest.
Model evaluation of the pavement performance data begins by obtaining the path length:
The core step in the anomaly evaluation of an object x to be detected is to traverse each iTree and obtain the path length of the node in which x lies in each tree. The process of finding the path length h(x) of a pavement technical condition record is described in detail using the iTree constructed above, shown in FIG. 7. Suppose that the 1st record {1, 97.13, 100, 94.86} and the 13th record {1, 89.65, 88.64, 89.07} in FIG. 6 need to be evaluated; the path length calculation in the iTree is shown in FIG. 8.
The concrete explanation is as follows:
(1) The iTree is traversed with record 1 to obtain its height h. The first split uses attribute PCI with split value 92.00; since the PCI of record 1 is 100, greater than 92.00, it enters the right subtree and the height is h = 1. The second split uses attribute PQI with split value 93.87; since the PQI of record 1 is 97.13, greater than the split value, it again enters the right subtree and h = 2. The third split uses attribute RQI with split value 92.67; since the RQI of record 1 is 94.86, greater than the split value, it enters the right subtree and h = 3. The fourth split uses attribute TL with split value 1; since the TL of record 1 is 1, it enters the right subtree and h = 4. A leaf node of the iTree has now been reached and the traversal ends.
(2) The final path length h(x) is obtained. If the number of samples in the leaf node finally reached is 1, h(x) equals the height h obtained in step (1); if the number of samples in that leaf node is greater than 1, h(x) is calculated by formula (3-1).
(3) Accordingly, h(x) of the 1st record in the iTree shown in FIG. 7 equals 7.17, and h(x) of the 13th record in the iTree shown in FIG. 7 equals 2.
Next, anomaly judgment is carried out. After the path lengths of the record x to be detected in all iTrees have been obtained by the above steps, the anomaly index s(x, n) of the record is calculated by formula (3-3). Once the anomaly indexes of all records to be detected have been calculated, they are divided into two classes by k-means clustering and the corresponding records are classified accordingly.
Specifically, the effect of the anomaly detection model is verified using the pavement technical condition data in the 2018 highway maintenance statistical report of a certain province. Because asphalt pavement and cement pavement use different model attributes, the corresponding original data also differ in attributes, so the pavement performance data before and after anomaly detection are compared and analyzed separately for asphalt pavement and cement pavement.
1. Original data of pavement service performance before detection
As shown in FIG. 9, pavement performance prediction starts by obtaining the normal pavement performance records that have passed the preceding anomaly detection. The regression model parameters and the gray system parameters are then estimated separately from the historical data, and each of the two models is used to predict the future pavement performance indexes of the same road section. Finally, combined prediction integrates the regression prediction result and the gray system prediction result into the final predicted pavement performance index.
Research and field observation show that the attenuation of pavement performance is not linear: performance declines slowly when the pavement first enters service and declines rapidly once it has fallen to a certain value. Based on this characteristic, the nonlinear regression equation (4-1) is used for prediction.
$PPI = PPI_0\left\{1 - \exp\left[-\left(\frac{\alpha}{t}\right)^{\beta}\right]\right\}$ (4-1)
where PPI is the road performance index, $PPI_0$ is the initial value of the road performance index (typically 100), α is the life factor of the road, i.e. the road age at which performance has decayed to 63.2% of its initial value, β is the shape factor of the performance decay curve, and t is the road age (performance is best at age 0).
Equivalent transformation of formula (4-1) gives formula (4-2a):
$\ln\left\{\left[\ln\frac{PPI_0}{PPI_0 - PPI}\right]^{-1}\right\} = \beta\ln t + \beta\ln\alpha^{-1}$ (4-2a)
Applying the variable substitution $x = \ln t$, $a = \beta$, $b = \beta\ln\alpha^{-1}$ and $y = \ln\left\{\left[\ln\frac{PPI_0}{PPI_0 - PPI}\right]^{-1}\right\}$ to formula (4-2a) turns it into the linear form
$y = ax + b$ (4-2b)
The parameters a and b of formula (4-2b) are estimated by the least squares method, as given by formulas (4-3) and (4-4):
$a = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$ (4-3)
$b = \frac{1}{n}\left(\sum y_i - a\sum x_i\right)$ (4-4)
where each pair $(x_i, y_i)$ is obtained from the corresponding historical road performance index data PPI and road age t through the variable substitution above, and n is the number of observation points, i.e. the number of historical pavement performance index data groups. After the estimates of a and b have been obtained, α and β follow from the relations $\beta = a$ and $\beta\ln\alpha^{-1} = b$. The pavement performance at a given future time is then predicted by formula (4-1).
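The regression procedure can be sketched as follows. The sketch assumes that formula (4-1) has the form PPI = PPI_0{1 - exp[-(α/t)^β]}, which is consistent with the 63.2% definition of the life factor, and that road ages 1 to 5 correspond to the years 2011 to 2015; the function names are hypothetical. The input is the PCI series for 2011 to 2015 from Table 1. Intermediate parameter values produced by this sketch are not guaranteed to coincide with every figure quoted in the worked example.

```python
import math

def fit_decay(ages, ppi_values, ppi0=100.0):
    """Least-squares fit of y = a*x + b after the substitution of (4-2a)."""
    xs = [math.log(t) for t in ages]
    # y = ln{ [ln(PPI0 / (PPI0 - PPI))]^-1 } = -ln(ln(PPI0 / (PPI0 - PPI)))
    ys = [-math.log(math.log(ppi0 / (ppi0 - v))) for v in ppi_values]
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    beta = a                       # relation beta = a
    alpha = math.exp(-b / beta)    # relation beta * ln(alpha^-1) = b
    return alpha, beta

def predict_ppi(t, alpha, beta, ppi0=100.0):
    """Assumed form of formula (4-1)."""
    return ppi0 * (1.0 - math.exp(-((alpha / t) ** beta)))

pci_2011_2015 = [94.30, 91.22, 88.97, 86.63, 84.12]   # Table 1
alpha, beta = fit_decay([1, 2, 3, 4, 5], pci_2011_2015)
pci_2016_regression = predict_ppi(6, alpha, beta)
```

By construction, `predict_ppi(alpha, alpha, beta)` returns about 63.2, matching the interpretation of α as the road age at which the index has decayed to 63.2% of its initial value.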
Because pavement performance is affected by many factors and is uncertain, gray theory from time series prediction can be used for predictive analysis. The GM(1,1) model is a widely used time series prediction model; its modeling and parameter solving process is as follows.
Assume an initial discrete sequence $X^{(0)} = \{x^{(0)}(1), x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(n)\}$. Accumulating it once generates the new sequence $X^{(1)} = \{x^{(1)}(1), x^{(1)}(2), x^{(1)}(3), \ldots, x^{(1)}(n)\}$, where $x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$. The adjacent-mean generated sequence of $X^{(1)}$, $Z^{(1)} = \{z^{(1)}(1), z^{(1)}(2), \ldots, z^{(1)}(n)\}$, can then be found, where $z^{(1)}(k) = 0.5x^{(1)}(k) + 0.5x^{(1)}(k-1)$. The model equation of GM(1,1) is given by formula (4-5):
$x^{(0)}(k) + az^{(1)}(k) = b$ (4-5)
Formula (4-6) is called the whitening equation of formula (4-5):
$\frac{dx^{(1)}}{dt} + ax^{(1)} = b$ (4-6)
Wherein a and b are parameters to be estimated, and according to the least square method principle, estimation formulas (4-7) and (4-8) of a and b can be obtained as follows:
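Solving the differential equation (4-6) with $\hat{x}^{(1)}(1) = x^{(0)}(1)$ and taking t as the discrete value k+1 gives the prediction model of $X^{(1)}$, formula (4-9):
$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak} + \frac{b}{a}$ (4-9)
According to the relation between $X^{(1)}$ and $X^{(0)}$, $\hat{x}^{(1)}(k+1)$ is restored to the original data $\hat{x}^{(0)}(k+1)$, as in formula (4-10):
$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k)$ (4-10)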
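A minimal sketch of GM(1,1) fitting and prediction is given below. It follows the common textbook formulation, in which a and b are estimated by least squares from formula (4-5) for k = 2, ..., n; this estimation detail and the function names are assumptions, so the parameter values it returns need not coincide with those quoted in the worked example. The input is the PCI series for 2011 to 2015 from Table 1.

```python
import math

def gm11_fit(x0):
    """Estimate a and b of x0(k) + a*z1(k) = b by least squares, k = 2..n."""
    x1 = []
    total = 0.0
    for v in x0:                       # one-time accumulation
        total += v
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)   # slope of y against z
    a = -slope
    b = (sy - slope * sz) / m
    return a, b

def gm11_predict(x0, a, b, k):
    """Predicted original-series value at 1-based position k (k >= 2)."""
    def x1_hat(i):
        # Formula (4-9) written for position i = k + 1.
        return (x0[0] - b / a) * math.exp(-a * (i - 1)) + b / a
    # Formula (4-10): difference of consecutive accumulated predictions.
    return x1_hat(k) - x1_hat(k - 1)

pci_series = [94.30, 91.22, 88.97, 86.63, 84.12]   # 2011 to 2015, Table 1
a_hat, b_hat = gm11_fit(pci_series)
pci_2016_gray = gm11_predict(pci_series, a_hat, b_hat, 6)
```

A positive `a_hat` corresponds to a decaying series, which is the expected situation for a pavement index that falls year by year.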
The predicted pavement performance directly influences the choice of maintenance decisions. To make the prediction as accurate as possible and avoid the uncertainty of relying on a single method, the invention designs a weighted arithmetic average combined prediction model, based on the idea of weighted averaging, to combine the regression prediction result and the gray system prediction result. For convenience, let the results of the regression prediction model and the gray system prediction model be $Y_1(t)$ and $Y_2(t)$, and let $w_1$ be the weight of the regression prediction result and $w_2$ the weight of the gray system prediction result.
The weighted arithmetic average combined prediction model is given by formula (4-11).
$\hat{Y}(t) = w_1Y_1(t) + w_2Y_2(t)$ (4-11)
Here $Y_1(t)$ and $Y_2(t)$ are the predicted values of the regression model and the gray system model and are therefore known quantities. Since the predicted pavement performance is used to decide which road sections to maintain, the weights $w_1$ and $w_2$ are chosen so that the error between the predicted and actual values is as small as possible. This is formulated as a quadratic programming model, given by formulas (4-12) and (4-13).
$\min J'(t) = \sum_{t}\left[Y(t) - w_1Y_1(t) - w_2Y_2(t)\right]^2$ (4-12)
$w_1 + w_2 = 1$ (4-13)
where Y(t) is the actual value and J'(t) is the sum of squared errors between the actual and predicted values; the closer the predicted values are to the actual values, the smaller J'(t). The solution process is as follows.
First, the partial derivatives of formula (4-12) with respect to $w_1$ and $w_2$ are taken and set equal to 0; the results are shown in formulas (4-14) and (4-15).
Combining formulas (4-13) and (4-14) (or formulas (4-13) and (4-15)) gives two symmetric groups of $w_1$ and $w_2$; the group that makes J'(t) smaller is taken as the weights, calculated as shown below.
$w_2 = 1 - w_1$ (4-17)
After the optimal weights $w_1$ and $w_2$ have been estimated, combined prediction can be performed according to formula (4-11).
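One way to obtain such weights is sketched below. It substitutes w_2 = 1 - w_1 into the sum of squared errors and solves the resulting one-variable least-squares problem in closed form; this is an assumed solution route that may differ in presentation from formulas (4-14) to (4-16), and the function names and numerical series are hypothetical illustrations rather than data from the description.

```python
def combination_weights(actual, pred1, pred2):
    """Weights minimising sum((y - w1*p1 - w2*p2)^2) subject to w1 + w2 = 1."""
    # With w2 = 1 - w1 the residual is (y - p2) - w1*(p1 - p2).
    num = sum((y - p2) * (p1 - p2) for y, p1, p2 in zip(actual, pred1, pred2))
    den = sum((p1 - p2) ** 2 for p1, p2 in zip(pred1, pred2))
    if den == 0:
        return 0.5, 0.5            # identical predictions: any split works
    w1 = num / den
    return w1, 1.0 - w1            # formula (4-17)

def combine(pred1, pred2, w1, w2):
    """Formula (4-11): weighted arithmetic average of the two predictions."""
    return [w1 * p1 + w2 * p2 for p1, p2 in zip(pred1, pred2)]

# Hypothetical series: model 1 overshoots by 1, model 2 undershoots by 1.
actual_demo = [10.0, 20.0, 30.0]
model1_demo = [11.0, 21.0, 31.0]
model2_demo = [9.0, 19.0, 29.0]
w1_demo, w2_demo = combination_weights(actual_demo, model1_demo, model2_demo)
combined_demo = combine(model1_demo, model2_demo, w1_demo, w2_demo)
```

Because only the sum of the weights is constrained, a weight can fall outside the interval from 0 to 1 when both models err in the same direction, which is how a negative weight such as the one reported in the worked example can arise.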
Taking an ordinary trunk highway section of a certain province as an example, a regression model and a GM(1,1) model are each used to predict pavement performance. The pavement performance index data of the road section from 2011 to 2016 are shown in Table 1:
Index | 2011 | 2012 | 2013 | 2014 | 2015 | 2016
PQI | 95.11 | 94.93 | 93.62 | 89.73 | 85.54 | 82.27
PCI | 94.30 | 91.22 | 88.97 | 86.63 | 84.12 | 80.93
RQI | 96.38 | 95.29 | 93.55 | 90.61 | 86.86 | 84.17
RDI | 95.03 | 94.11 | 93.01 | 90.45 | 85.86 | 82.98
Table 1
Based on the historical data of each performance index of the road section, a regression model is applied; the pavement condition index (PCI) is taken as the example for discussion.
The initial value $PPI_0$ of the PCI index is set to 100, and the PCI value for the final year, 2016, is predicted from the data of the previous 5 years and compared with the actual value. The corresponding substituted variable pairs $(x_i, y_i)$ are obtained according to the variable substitution of formula (4-2a): {(0.0, -1.052), (0.693, -0.889), (1.098, -0.791), (1.386, -0.699), (1.609, -0.609)}. The estimates of a and b are then calculated by formulas (4-3) and (4-4) as a ≈ 0.4668 and b ≈ 1.5227. From the values of a and b, the life factor and shape factor are estimated as α = 26.0985 and β = 0.2685. The predicted value of the PCI index can then be calculated by formula (4-1); the prediction result is shown in FIG. 10.
As can be seen from FIG. 10 and FIG. 11, when the regression model is built on the historical pavement condition index data from 2011 to 2015, it fits the 5 years of data used for modeling well. For the 2016 prediction, however, pavement damage is also affected by weather, the geographical environment of the road section and other factors, while the model is built from historical detection data only, so there is some error between the actual and predicted values; the prediction error in this example is 2.46. The prediction is nevertheless accurate in its general trend. When no further data are available, the regression model parameters are difficult to adjust, so the regression model needs to be combined with the gray system to balance the error and make the prediction more accurate.
To ensure the reliability of the prediction result, the gray system GM(1,1) model is additionally used to predict the pavement performance index and balance the regression prediction error. Still using the PCI index values of the example in FIG. 9, the GM(1,1) model is applied to predict the PCI value for the final year, 2016, from the data of the previous 5 years.
From the data in FIG. 14, the original discrete sequence is $X^{(0)}$ = {94.30, 91.22, 88.97, 86.63, 84.12, 80.93}. Accumulating $X^{(0)}$ once gives the accumulated generated sequence $X^{(1)}$ = {94.30, 185.52, 274.49, 361.12, 445.24, 526.17}. The adjacent-mean generated sequence is then obtained from $X^{(1)}$ as $Z^{(1)}$ = {47.15, 139.91, 230.005, 317.805, 403.18, 485.705}. The model parameters are next calculated by formulas (4-7) and (4-8) as a ≈ 0.029 and b ≈ 95.633. Finally, the predicted value of the PCI index is obtained from formulas (4-9) and (4-10); the result is shown in FIG. 12.
As can be seen from FIG. 12 and FIG. 13, because the gray system model is suited to prediction from small samples, the GM(1,1) model built on the pavement condition index from 2011 to 2015 fits the trend of the index better than the regression model above. The error between the predicted and measured 2016 pavement condition index PCI is 1.05, so the predicted value is closer to the actual value, but the error still exceeds 1.0 point. The regression prediction and the gray system prediction are therefore combined below to reduce the error further.
In order to avoid larger errors of a single prediction result, a combined prediction method can be adopted to synthesize a regression prediction result and a gray system prediction result so as to improve the accuracy of pavement service performance prediction. The invention realizes the combined prediction based on the regression prediction and the gray system prediction result.
1. Weighted arithmetic mean combined prediction weight calculation
The weights of the weighted arithmetic mean obtained from the calculation and comparison of formula (4-16) are as follows: in the above case, the regression prediction weight of the pavement condition index PCI is $w_1 = -0.156087$, and the gray system prediction weight of the pavement condition index PCI is $w_2 = 1.156087$;
2. combining prediction results
And carrying out combined prediction according to the pavement using performance regression prediction result, the pavement using performance gray system prediction result and the formulas (4-11), wherein the prediction results are shown as follows.
The analysis in FIG. 14 shows that when regression prediction and gray system GM(1,1) prediction are used separately, the absolute errors of the predicted 2016 pavement condition index PCI are 2.46 and 1.05 respectively. After combined prediction, the two predicted values are integrated and their errors balance each other, so that the final error is reduced to 0.53. The trend comparison in FIG. 15 shows that the combined prediction is closest to the actual trend of pavement performance. Combined prediction therefore avoids the error of a single prediction method and gives a more accurate prediction of pavement performance; applying this more accurate prediction to the subsequent optimization of pavement maintenance decisions makes the decision result more scientific and reasonable.
The invention therefore applies association rule analysis from data mining to four pavement performance indexes, namely the pavement performance quality index, the pavement condition index, the riding quality index and the rutting depth index. Association analysis is carried out to mine the rules by which the performance indexes influence one another, and to find the factors that first cause pavement performance to decline and the indexes likely to decay in the future, so that the maintenance department can carry out preventive maintenance. In addition, association analysis is used to mine the influence of traffic flow on each pavement performance index, to assist the relevant departments in coordinating traffic flow and thereby keep highway pavement performance at an excellent level.
Based on the 2016 ordinary highway technical condition data and traffic flow data issued by the highway bureau of a certain province, a data model of the internal association among the pavement performance indexes and a data model of the influence of traffic flow on the pavement performance indexes are established, and both are mined with an association rule algorithm. Because each performance index is a continuous value and association rules cannot analyze continuous values effectively, the performance indexes are discretized according to the criteria in Table 2:
Performance index | ≥90 | 80~90 | 70~80 | 60~70 | <60
Grade | Excellent | Good | Medium | Poor | Secondary
Table 2
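The discretization in Table 2 can be sketched as a simple lookup. The function name `grade_of` is hypothetical, and treating each lower bound as inclusive is an assumption, since the table does not state how boundary values such as exactly 80 are assigned.

```python
def grade_of(index_value):
    """Map a continuous performance index to its Table 2 grade."""
    if index_value >= 90:
        return "excellent"
    if index_value >= 80:       # 80~90, lower bound assumed inclusive
        return "good"
    if index_value >= 70:       # 70~80
        return "medium"
    if index_value >= 60:       # 60~70
        return "poor"
    return "secondary"          # below 60

# PCI values for 2011 and 2016 taken from Table 1.
pci_2011_grade = grade_of(94.30)
pci_2016_grade = grade_of(80.93)
```

Each record's indexes would be converted this way into items such as "PCI=good" before association rule mining.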
The invention uses association rule analysis based on the Apriori algorithm to analyze the attenuation factors of the pavement performance indexes. The mining process consists of two main steps [44]: 1. find the frequent itemsets of pavement performance indexes whose support is not less than the minimum support; 2. generate strong association rules from them according to the minimum confidence. The algorithm flow is shown in FIG. 16.
The algorithm implementation is specifically described as follows:
(1) Store the data to be analyzed in the pavement technical condition association analysis table D and set the minimum support to support. Traverse D to form the candidate 1-itemsets, calculate the support of every element, and find the frequent 1-itemsets whose support is not less than the threshold. Starting from the 2-itemsets, execute the canditaegen (support) function in a loop to obtain the candidate k-itemsets, and stop the loop when the frequent (k-1)-itemsets are empty. For any candidate c, calculate its support; the frequent itemsets are all candidates satisfying c.support ≥ support.
(2) Select any two different frequent (k-1)-itemsets and merge them to generate a candidate c, then use the infequentgen (c,) function to remove, according to the minimum support, candidate combinations that cannot produce frequent itemsets.
(3) To judge whether a candidate c can be a frequent itemset, take each (k-1)-subset y of c; if every y belongs to the frequent (k-1)-itemsets, c is kept as a candidate, otherwise c is discarded.
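The mining steps above can be sketched compactly as follows. This is an illustrative Apriori implementation, not the code used with the mining tool in the description: the function names `apriori` and `strong_rules` are hypothetical, and the four transactions are invented example records whose items follow the "index=grade" pattern.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frequent itemset: support} for a list of frozenset transactions."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support}   # frequent 1-itemsets
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s)
        k += 1
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = {c for c in candidates if support(c) >= min_support}
    return frequent

def strong_rules(frequent, min_confidence):
    """Generate (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, sup, confidence))
    return rules

# Hypothetical discretized records, for illustration only.
demo_transactions = [
    frozenset({"PCI=excellent", "RQI=excellent", "PQI=excellent"}),
    frozenset({"PCI=excellent", "RQI=excellent", "PQI=excellent"}),
    frozenset({"PCI=secondary", "RQI=secondary", "PQI=secondary"}),
    frozenset({"PCI=good", "RQI=excellent", "PQI=good"}),
]
demo_frequent = apriori(demo_transactions, min_support=0.5)
demo_rules = strong_rules(demo_frequent, min_confidence=0.6)
```

The rule lookup in `strong_rules` relies on the downward-closure property: every subset of a frequent itemset is itself frequent, so its support is always present in the dictionary.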
Ordinary highways are generally divided into cement pavement and asphalt pavement, and the two pavement types differ in their performance indexes: asphalt pavement has 5 performance indexes and cement pavement has 3. So that the collected data can be mined and analyzed in a unified way, the performance indexes common to the two pavement types need to be selected for analysis. According to the data provided by the highway bureau of a certain province, the following attributes are selected for the data model of the internal influence among pavement performance indexes: road technical grade, pavement condition index (PCI), riding quality index (RQI), rutting depth index (RDI), pavement performance quality index (PQI), subgrade technical condition index (SCI) and roadside facility technical condition index (TCI).
Using Weka as the association rule mining tool, with the minimum support set to 10% and the minimum confidence set to 60%, association rules are mined from the 2016 pavement technical condition data of a certain province using the data model of the internal influence among pavement performance indexes; the result is shown in FIG. 17.
Rule 1 indicates that, in the current data, the probability of the highway technical grade being secondary is about 100%, which shows that the rut depth index is closely related to the highway technical grade: the higher the highway grade, the smaller the rut depth. Rule 2 indicates that when the rut depth index and the subgrade technical condition index are both secondary, the probability that the technical condition index of the facilities along the line is secondary is 100%; deep rutting therefore worsens vehicle running conditions and affects the technical condition of the facilities along the line, and the subgrade technical condition also strongly affects those facilities. Rules 3 and 4 indicate that when PCI and RQI are both secondary, the probability that PQI is secondary is 99%, and when PCI and RQI are both excellent, the probability that PQI is excellent is 97%; PCI and RQI are therefore the two main indexes affecting PQI. Rule 5 indicates that when RDI is secondary, RQI is also secondary with a probability of 96%, so rut depth is an important factor in the decay of the running quality index. Rule 6 indicates that when the running quality index is poor, the probability that the pavement damage index is poor is 83%; since the running quality index mainly measures pavement flatness, rule 6 shows that flatness is an important factor in the decay of the pavement damage index.
Using Weka as the association rule mining tool as above, association rules are mined from the 2016 traffic survey data provided by the highway bureau of a certain province together with the pavement technical condition data, with the minimum support set to 10% and the minimum confidence set to 60%; the result is shown in fig. 18.
Rules 1, 2 and 3 indicate the following: when the proportion of heavy-duty vehicle types in the highway traffic flow reaches 30%, the probability of the pavement performance comprehensive index (PQI) grade is 0.99; when the proportion of heavy-duty vehicle types reaches 40%, the probability of the PQI grade falling to the poor grade is 0.99; when the proportion reaches 50%, the probability of the PQI grade falling to the secondary grade is 0.97. These 3 rules show that the pavement performance comprehensive index is inversely related to the proportion of heavy-duty vehicle types. Rules 4 and 5 show that when the proportion of heavy-duty vehicle types reaches 40%, the probability that the pavement damage index is of the poor grade is 0.96, and when the proportion reaches 50%, the probability of the pavement damage index falling to the secondary grade is 0.95; the two rules show that too large a proportion of heavy-duty vehicles causes more serious pavement damage. Rule 6 shows that when the proportion of heavy-duty vehicle types reaches 50%, the probability of the running quality index being poor is 0.90, so an excessive proportion of heavy-duty vehicle types is also an important factor in the reduction of the running quality index, that is, of pavement flatness.
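The probabilities quoted for each rule are rule confidences, and a rule is reported only if it clears both mining thresholds. The following sketch shows how support and confidence could be computed for one rule and checked against the 10% and 60% thresholds stated above. The records, item labels and function names are hypothetical illustrations and are not taken from the provincial data.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    n_antecedent = sum(antecedent <= r for r in records)
    n_both = sum(both <= r for r in records)
    support = n_both / len(records)  # share of all records matching the whole rule
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


def keep_rule(support, confidence, min_support=0.10, min_confidence=0.60):
    """Apply the minimum support and minimum confidence used in the text."""
    return support >= min_support and confidence >= min_confidence


# Hypothetical sample: 10 road sections, 4 of them with at least 50% heavy-duty vehicles.
records = (
    [frozenset({"heavy>=50%", "RQI=poor"})] * 3
    + [frozenset({"heavy>=50%", "RQI=good"})]
    + [frozenset({"heavy<50%", "RQI=good"})] * 6
)
support, confidence = rule_metrics(records, {"heavy>=50%"}, {"RQI=poor"})
accepted = keep_rule(support, confidence)
```

In this made-up sample the rule covers 3 of 10 records and holds for 3 of the 4 records that satisfy its condition, so it passes both thresholds.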
As shown in fig. 19, maintenance decision optimization is based on the predicted pavement performance indexes described above. The predicted indexes are first used to determine which maintenance strategy category each road section belongs to; the road sections to be maintained within each category are then ranked by priority with the gray matter element method, so that the sections most urgently needing maintenance, major repair or medium repair are treated in time and the maintenance funds yield the greatest benefit.
Based on the technical conditions of the road sections targeted by each maintenance strategy, together with the technical specifications for highway maintenance, the maintenance strategy selection criteria shown in table 3 are adopted:
Table 3
When the gray matter element analysis method is used to evaluate the overall damage condition of each road section of the ordinary highways in a certain province and to make maintenance decisions, the road sections to be maintained are ranked by priority from the following 4 aspects, based on the data provided by the highway bureau of that province: the highway overall maintenance technical condition index (MQI), the pavement performance comprehensive index (PQI), the pavement damage condition index (PCI) and the pavement running quality index (RQI). These 4 evaluation indexes form the road section maintenance priority decision features C1, C2, C3 and C4, and the feature attribute values $a_i$ (i = 1, 2, 3, 4) corresponding to the features are given according to actual conditions. The gray matter element R of a road section is formed by these 4 feature attributes and takes the form of a single-column matrix of 4 rows and 1 column, as shown in formula (5-1). When there are n different road sections, the 4-dimensional gray matter elements of all sections can be combined into a 4×n matrix, as shown in formula (5-2).
By the definition of the pavement performance indexes, the larger the index, the better the pavement performance, so the larger-the-better principle is applied to every feature attribute when establishing the optimal road section gray element. For each feature attribute, the maximum value of that attribute over the n road sections is selected as the value of the attribute in the optimal road section gray element; these values form a new gray element, called the best-object gray element $R_0$, as shown in formula (5-3).
According to the existing basic data for road section technical condition evaluation and the related road section evaluation system, the 4-dimensional association coefficient composite gray element $R_L$ between each road section and the optimal road section is established, as shown in formula (5-4).
Wherein $L_{ij}$ is the association value between the jth road section and the optimal road section for the ith characteristic parameter, calculated by the following formula:
wherein $\Delta_{ij}$ is the difference between the optimal road section gray element and the jth road section in the ith characteristic dimension; $\Delta_{\min}$ is the minimum of the differences between the optimal road section and all other road sections over every characteristic dimension; $\Delta_{\max}$ is the maximum of those differences; and the parameter p is the resolution coefficient, usually taken as 0.5.
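The association value can be illustrated with a short sketch. The formula image did not survive in this text, so the code assumes the standard gray relational coefficient, $L_{ij} = (\Delta_{\min} + p\,\Delta_{\max}) / (\Delta_{ij} + p\,\Delta_{\max})$, which matches the symbols defined above. The function name and the sample index values are hypothetical.

```python
def association_coefficients(sections, p=0.5):
    """Gray relational coefficients of each road section against the best object.

    sections: one row per road section, each row holding the feature values
    (for example MQI, PQI, PCI, RQI). Assumes the standard coefficient form.
    """
    n_features = len(sections[0])
    # Best-object gray element: column-wise maximum (larger-the-better principle).
    best = [max(row[i] for row in sections) for i in range(n_features)]
    # Absolute difference between the best object and every section, per feature.
    delta = [[abs(best[i] - row[i]) for i in range(n_features)] for row in sections]
    d_min = min(min(row) for row in delta)
    d_max = max(max(row) for row in delta)
    if d_max == 0:
        # All sections are identical to the best object.
        return [[1.0] * n_features for _ in sections]
    return [[(d_min + p * d_max) / (d + p * d_max) for d in row] for row in delta]


# Hypothetical index values for three road sections.
sample_sections = [
    [90.0, 80.0, 85.0, 88.0],
    [70.0, 80.0, 65.0, 88.0],
    [90.0, 60.0, 85.0, 68.0],
]
coefficients = association_coefficients(sample_sections)
```

A coefficient of 1 means the section equals the best object on that feature; the coefficient falls toward $p/(1+p)$ when $\Delta_{\min} = 0$ and the difference approaches $\Delta_{\max}$.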
3) Weight coefficient of road section comprehensive evaluation characteristic index
The weight coefficient of a road section comprehensive evaluation characteristic index is the share of that index in the final comprehensive evaluation, that is, its relative importance. The invention determines the weight coefficients of the comprehensive evaluation from the weight coefficient of each index in the evaluation standard, specifically:
$W = (W_1, W_2, W_3, W_4)$ (5-6)
4) Association degree composite gray element
The association coefficients in formula (5-4) compare each characteristic index of a road section gray element with the corresponding index of the optimal road section gray element, so every road section has several association coefficients with the optimal road section gray element. This information is too scattered to show directly which road section has the largest association degree with the optimal road section gray element and which has the smallest. The discrete association coefficients therefore need to be integrated to obtain the association degree composite gray element $RD_L$.
Wherein $L_{oj}$ is the association degree between each road section to be evaluated and the optimal road section, and $W_i$ is the weight coefficient of each characteristic attribute.
5) Maintenance road segment decision priority ordering
Each value in the association degree composite gray element $RD_L$ obtained above represents the degree of similarity between the corresponding road section and the optimal road section gray element: the larger the value, the better the pavement performance of that road section, and the smaller the value, the worse its pavement performance. The most basic principle of road section maintenance is that pavements with poorer performance are maintained first. The maintenance priority of a road section is therefore inversely related to its association degree with the optimal road section gray element: the smaller the association degree, the higher the maintenance priority, and the larger the association degree, the lower the maintenance priority.
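The weighting and ranking steps can be sketched as follows, assuming the association degree is the weighted sum of a section's association coefficients, which is what the symbol definitions above suggest. The coefficient rows and the function names are hypothetical, and the weight values and their order are only an illustration.

```python
def association_degrees(coefficients, weights):
    """Weighted sum of each section's association coefficients (assumed form)."""
    return [sum(w * c for w, c in zip(weights, row)) for row in coefficients]


def maintenance_priority(degrees):
    """Return 1-based road section numbers, most urgent first.

    The section with the smallest association degree is furthest from the
    best object, so it is placed first in the maintenance order.
    """
    order = sorted(range(len(degrees)), key=lambda j: degrees[j])
    return [j + 1 for j in order]


# Hypothetical association coefficients for three road sections (one row each).
sample_coefficients = [
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
    [0.8, 0.6, 1.0, 0.4],
]
sample_weights = (0.35, 0.40, 0.15, 0.10)  # illustrative weights that sum to 1
degrees = association_degrees(sample_coefficients, sample_weights)
priority = maintenance_priority(degrees)
```

Sorting in ascending order of association degree expresses the rule that the poorest-performing section is maintained first.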
Four characteristic evaluation indexes are adopted: PCI, RQI, PQI and MQI. The weight coefficient of each index differs with the pavement type and the highway technical grade. For the practical application of decision optimization with the gray matter element method, the invention selects the weight coefficients of a first-class highway with asphalt pavement from the evaluation standard, giving the weight vector W = (0.35, 0.40, 0.15, 0.10). Fig. 20 presents the original pavement performance data of part of the asphalt pavements.
Several road sections are randomly selected from these data for a worked example of the pavement maintenance decision based on the gray matter element method; the data of the selected road sections are shown below.
The road section composite gray matter element $R_{6\times 4}$ is built from the data shown in fig. 21 according to formula (5-2), as follows.
Since a larger pavement performance index means better pavement performance, the larger-the-better principle is adopted to establish the optimal road section gray element $R_0$, as follows.
The difference between the composite gray matter element and the optimal road section gray element is then calculated as follows:
From formula (5-10), $\Delta_{\min}$ and $\Delta_{\max} = 35.83$ are obtained. Taking p = 0.5, the association coefficient composite gray element $R_L$ of all road sections is then calculated according to formula (5-5):
Then, according to formula (5-7) and formula (5-11) above, the composite association degree $RD_L$ between all road sections and the optimal road section gray element is calculated as follows.
Finally, the pavement performance of each road section can be judged from $L_{oi}$. From formula (5-12), road section 1 has the best performance and road section 5 the worst, and the maintenance priority ranking of all road sections is L5 > L2 > L4 > L3 > L6 > L1 (Li denotes the ith road section). When deciding which roads need maintenance, highway maintenance personnel can optimize the maintenance decision according to the road section priority order calculated by the gray matter element method and carry out maintenance in that order.
The cost measurement and calculation in the auxiliary decision-making process of the preventive maintenance of the pavement is based on the maintenance unit price of various typical maintenance schemes, and the unit price of the maintenance scheme is determined according to the actual investigation result of maintenance in a certain province, as shown in fig. 22.
Based on the 2016 evaluation results of the pavement technical condition of the trunk highways of a certain province, the pavement maintenance plan required for that province in 2017 is estimated by combining the maintenance decision optimization strategy summarized by the invention with the gray matter element method and the maintenance priority ranking analysis. The estimate is compared with the traditional large-scale qualitative maintenance decision; the differences between the two approaches are shown in fig. 23.
Fig. 23 shows that the traditional qualitative maintenance decision mode requires more maintenance funds even though it covers fewer highway kilometers. With the auxiliary decision mode based on data mining, the pavement is divided into different maintenance grades according to actual conditions; although the total maintained mileage increases, the planned maintenance fund demand decreases, so the limited maintenance funds are used more effectively and yield the greatest effect.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (10)

1. The highway pavement preventive maintenance decision-making method based on data mining is characterized by comprising the following steps of:
S1: preprocessing the real-time recorded data and the historical data of the highway maintenance pavement technical condition by using a data mining anomaly detection algorithm, and performing in-depth analysis and marking of the anomaly detection results;
S2: comprehensively analyzing and combining the anomaly detection results by adopting a regression analysis prediction algorithm and a gray system prediction algorithm to obtain a combined prediction result, and performing association analysis on the anomaly detection results by adopting a pavement performance attenuation factor analysis rule and a heavy-duty vehicle type proportion influence analysis rule to obtain an association analysis result;
S3: establishing a gray matter element analysis algorithm, determining the road sections to be maintained and the maintenance priority in combination with the combined prediction result, performing auxiliary ranking of the maintenance order and maintenance type of the road sections to be maintained, and integrating the association analysis result to provide a road section preventive maintenance evaluation result and assist the maintenance authority in making maintenance decisions.
2. The method for preventive maintenance decision of a highway pavement based on data mining according to claim 1, wherein said step S1 comprises:
S11: scanning a real-time and historical pavement performance data set from a database, randomly extracting a plurality of sub-samples, constructing a plurality of iTree isolation trees, and completing construction and training of a pavement performance record detection model;
S12: performing anomaly evaluation on the pavement performance records by adopting the pavement performance record detection model.
3. The method for preventive maintenance decision of a highway pavement based on data mining according to claim 2, wherein said step S11 comprises:
S111: randomly acquiring a plurality of sub-samples;
S112: randomly selecting one of the four attributes of the pavement technical condition data, namely highway technical grade, pavement performance comprehensive index, pavement damage index and pavement running quality index, as the split attribute, selecting a value between the minimum and the maximum of the value range of that attribute as the split value, and then splitting the sub-samples into left and right subtrees according to the split value;
S113: repeating step S112 for the sub-samples of the left and right subtrees until there is only one record in the set of sub-samples;
S114: repeating steps S111 to S113 until a sufficient number of iTree isolation trees are built to form an isolation forest.
4. The method for preventive maintenance decision of a highway pavement based on data mining according to claim 2, wherein said step S12 comprises: obtaining the pavement performance records requiring anomaly detection, finding the path length of each record in every iTree according to the pavement performance record detection model, and calculating the anomaly index of each record; after the anomaly indexes of all pavement performance records are obtained, clustering all anomaly indexes into two classes, wherein the class with the larger cluster center is the abnormal class and its pavement performance records are marked as abnormal, and the class with the smaller cluster center is the normal class and its pavement performance records are marked as normal.
5. The data mining-based highway pavement preventive maintenance decision method according to claim 4, wherein the path length calculation formula is:
$h(x) = L + C(n)$
wherein $h(x)$ is the path length, x is the pavement performance record, L is the height, in the binary tree, of the leaf node in which the pavement performance record finally falls, and $C(n)$ is the average height of a binary tree with n nodes;
wherein $H(n) = \ln(n) + \gamma$ (n > 1), γ is the Euler constant, also called the Euler-Mascheroni constant, set to 0.5772156649, H is the height of a data record in the binary tree structure, and n is the number of records contained in the leaf node;
the calculation formula of the abnormality index is as follows:
wherein n is the total number of nodes φ of the decision tree iTree, $C(n)$ is the average height of a binary tree with n nodes, $E(h(x))$ is the average height of the record x over all trees, and $h(x)$ is the path length from the leaf node to the root node, which is used to judge whether a record x is an abnormal point.
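The path length and anomaly index of claim 5 can be illustrated with the usual isolation forest definitions. The formula images for $C(n)$ and for the anomaly index did not survive in this text, so the sketch assumes the standard forms $C(n) = 2H(n-1) - 2(n-1)/n$ and $s(x, n) = 2^{-E(h(x))/C(n)}$, which agree with the symbol descriptions above. All function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649  # value stated in the claim


def harmonic(n):
    """H(n) = ln(n) + gamma, the harmonic-number approximation for n > 1."""
    return math.log(n) + EULER_GAMMA


def average_height(n):
    """C(n): average height of a binary tree with n nodes (assumed standard form)."""
    if n > 2:
        return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n
    return 1.0 if n == 2 else 0.0


def path_length(leaf_height, leaf_size):
    """h(x) = L + C(n) for a record ending in a leaf holding leaf_size records."""
    return leaf_height + average_height(leaf_size)


def anomaly_index(path_lengths, subsample_size):
    """s = 2 ** (-E(h(x)) / C(n)); values near 1 suggest an anomaly (assumed form)."""
    mean_height = sum(path_lengths) / len(path_lengths)
    return 2.0 ** (-mean_height / average_height(subsample_size))


# Hypothetical path lengths of two records across three trees, sub-sample size 256.
short_paths = [path_length(2, 1), path_length(3, 1), path_length(2, 2)]
long_paths = [path_length(10, 1), path_length(12, 1), path_length(11, 2)]
suspect_score = anomaly_index(short_paths, 256)
typical_score = anomaly_index(long_paths, 256)
```

Records that are isolated after only a few splits have short average path lengths and therefore higher anomaly indexes, which is what the two-class clustering in claim 4 separates.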
6. The method for preventive maintenance decision of a highway pavement based on data mining according to claim 1, wherein the calculation formula of the regression analysis prediction algorithm in step S2 is as follows:
wherein PPI is the pavement performance index, $PPI_0$ is its initial value, α is the life factor of the pavement, namely the age of the pavement when the pavement performance decays to 63.2% of the initial value, β is the shape factor of the pavement performance decay curve, and t is the road age;
Variable substitution is applied to the calculation formula of the regression analysis prediction algorithm: let $\ln t = x$, $\beta = a$, $\beta \ln \alpha^{-1} = b$, and let the correspondingly transformed performance term equal $y$; the resulting transformation formula is:
y=ax+b;
the values of the parameters a and b are estimated by adopting a least square method, and the calculation formulas are respectively as follows:
wherein $(x_i, y_i)$ are obtained from the corresponding historical pavement performance index data PPI and t, $x_i$ being the actual performance index of the current pavement and $y_i$ the future pavement performance index predicted after the road age t increases; after the estimates of parameters a and b are obtained, the values of α and β are obtained from the relations $\beta = a$ and $\beta \ln \alpha^{-1} = b$; n is the number of groups of observation points, i.e. the number of groups of historical pavement performance index data.
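The linearization and least squares step of claim 6 can be sketched as follows. The decay formula itself is missing from this text, so the code assumes the commonly used form $PPI = PPI_0\{1 - \exp[-(\alpha/t)^{\beta}]\}$, which is consistent with the definition of α as the age at which performance has decayed to 63.2% of its initial value. Under that assumption $y = -\ln[-\ln(1 - PPI/PPI_0)]$ and $x = \ln t$ give $y = ax + b$ with $a = \beta$ and $b = \beta \ln \alpha^{-1}$. The function names and the default $PPI_0 = 100$ are hypothetical.

```python
import math


def predict_ppi(t, alpha, beta, ppi0=100.0):
    """Assumed decay model: PPI = PPI0 * (1 - exp(-(alpha / t) ** beta))."""
    return ppi0 * (1.0 - math.exp(-((alpha / t) ** beta)))


def fit_decay_curve(ages, ppis, ppi0=100.0):
    """Estimate (alpha, beta) by ordinary least squares on the linearized model.

    Requires 0 < PPI < PPI0 for every observation so that the logarithms exist.
    """
    xs = [math.log(t) for t in ages]
    ys = [-math.log(-math.log(1.0 - ppi / ppi0)) for ppi in ppis]
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)  # slope
    b = (sum_y - a * sum_x) / n                                   # intercept
    beta = a
    alpha = math.exp(-b / a)  # from b = beta * ln(1 / alpha)
    return alpha, beta


# Synthetic observations generated from known parameters, for illustration only.
true_alpha, true_beta = 12.0, 0.8
sample_ages = [1, 2, 3, 4, 5, 6, 7, 8]
sample_ppis = [predict_ppi(t, true_alpha, true_beta) for t in sample_ages]
fitted_alpha, fitted_beta = fit_decay_curve(sample_ages, sample_ppis)
```

Because the synthetic data follow the assumed model exactly, the fit returns the generating parameters; real survey data would scatter around the fitted line.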
7. The method for preventive maintenance decision of a highway pavement based on data mining according to claim 6, wherein the calculation formula of the gray system prediction algorithm in step S2 is as follows:
$x^{(0)}(k) + m\,z^{(1)}(k) = q$
wherein m and q are the parameters to be estimated, $x^{(0)}(k)$ is the actual state of the pavement performance index data at time k, and $z^{(1)}(k)$ is the predicted state of the pavement performance index data at time k;
the whitening formula of the gray system prediction algorithm is as follows:
wherein d is the length of the prediction error sequence and $X^{(1)}$ is the prediction error sequence;
the estimation formulas for obtaining parameters m and q according to the least square method are respectively as follows:
And n is the number of the observation point groups, namely the number of the historical pavement performance index data groups.
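The gray system prediction of claim 7 can be sketched with a standard GM(1,1) model. The whitening and estimation formulas are missing from this text, so the code makes the usual GM(1,1) assumptions: $x^{(1)}$ is the accumulated series, $z^{(1)}(k)$ is the mean of two consecutive accumulated values, and m and q are fitted by least squares to $x^{(0)}(k) + m\,z^{(1)}(k) = q$. The function names and the sample series are hypothetical.

```python
import math


def gm11_fit(x0):
    """Estimate (m, q) of x0(k) + m * z1(k) = q by least squares (standard GM(1,1))."""
    x1, total = [], 0.0
    for value in x0:  # accumulated generating series x1
        total += value
        x1.append(total)
    z1 = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    n = len(z1)
    sum_z, sum_y = sum(z1), sum(y)
    sum_zz = sum(z * z for z in z1)
    sum_zy = sum(z * v for z, v in zip(z1, y))
    slope = (n * sum_zy - sum_z * sum_y) / (n * sum_zz - sum_z ** 2)
    m = -slope  # the model reads x0(k) = -m * z1(k) + q
    q = (sum_y - slope * sum_z) / n
    return m, q


def gm11_forecast(x0, m, q, steps):
    """Forecast the next `steps` values of x0 from the whitened solution (m != 0)."""
    def x1_hat(k):  # k = 0 is the first observation
        return (x0[0] - q / m) * math.exp(-m * k) + q / m
    n = len(x0)
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Hypothetical yearly index series decaying by 5% per year.
history = [100.0 * 0.95 ** k for k in range(6)]
m_hat, q_hat = gm11_fit(history)
next_value = gm11_forecast(history, m_hat, q_hat, steps=1)[0]
```

GM(1,1) needs only a handful of observations, which is why it is a common companion to regression when the historical record of a road section is short.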
8. The method for preventive maintenance decision of a highway pavement based on data mining according to claim 7, wherein the calculation formula of the combined prediction in step S2 is:
wherein Y(t) is the predicted value of the regression analysis prediction algorithm, Y(t) is the predicted value of the gray system prediction algorithm, $w_j$ is the weight parameter, $w_1$ is the weight of the regression prediction result, and $w_2$ is the weight of the gray system prediction result.
9. The method for preventive maintenance decision of a highway pavement based on data mining according to claim 1, wherein the rules of association analysis in step S2 include:
Rule 1: when the proportion of heavy-duty vehicle types in the highway traffic flow reaches 30%, the probability of the pavement performance comprehensive index (PQI) grade is 0.99;
Rule 2: when the proportion of heavy-duty vehicle types reaches 40%, the probability of the PQI grade falling to the poor grade is 0.99;
Rule 3: when the proportion of heavy-duty vehicle types reaches 50%, the probability of the PQI grade falling to the secondary grade is 0.97;
Rule 4: when the proportion of heavy-duty vehicle types reaches 40%, the probability that the pavement damage index is of the poor grade is 0.96;
Rule 5: when the proportion of heavy-duty vehicle types reaches 50%, the probability of the pavement damage index falling to the secondary grade is 0.95;
Rule 6: when the proportion of heavy-duty vehicle types reaches 50%, the probability of the running quality index being poor is 0.90;
wherein rule 1, rule 2 and rule 3 indicate that the pavement performance comprehensive index is inversely related to the proportion of heavy-duty vehicle types;
rule 4 and rule 5 indicate that too large a proportion of heavy-duty vehicle types causes more serious pavement damage;
rule 6 indicates that an excessive proportion of heavy-duty vehicle types is also an important factor in the reduction of the pavement running quality index.
10. The method for preventive maintenance decision of a highway pavement based on data mining according to claim 1, wherein the step S3 of establishing a gray matter element analysis algorithm, and determining the road section to be maintained and the maintenance priority by combining the combined prediction result comprises the following steps:
S31: establishing an optimal road section object gray element:
wherein $C_1$ is the highway overall maintenance technical condition index, $C_2$ is the pavement performance comprehensive index, $C_3$ is the road section damage condition index, $C_4$ is the pavement running quality index, $D_n$ is the number of columns of the matrix, i.e. the number of index data, and $a_{n1}$ to $a_{n4}$ are the matrix column data corresponding to the specified pavement indexes;
S32: selecting the maximum value of each attribute over the n road sections as the value of that attribute in the optimal road section gray element, to form the best-object gray element:
S33: according to the existing basic data for road section technical condition evaluation and the related road section evaluation system, establishing the 4-dimensional association coefficient composite gray element between each road section and the optimal road section:
wherein $L_{n1}$ to $L_{n4}$ are the matrix column data corresponding to the specified pavement indexes;
S34: the calculation formula of the association value is as follows:
wherein $L_{ij}$ is the association value between the jth road section and the optimal road section for the ith characteristic parameter, $\Delta_{ij}$ is the difference between the optimal road section gray element and the jth road section in the ith characteristic dimension, $\Delta_{\min}$ is the minimum of the differences between the optimal road section and all other road sections over every characteristic dimension, $\Delta_{\max}$ is the maximum of those differences, and the parameter p is the resolution coefficient, usually taken as 0.5;
S35: integrating the discrete association coefficients to obtain the association degree composite gray element:
wherein $L_{0j}$ is the association degree between each road section to be evaluated and the optimal road section, and $w_i$ is the weight coefficient of each characteristic attribute;
S36: ranking the road sections by the association degree composite gray element: each value represents the degree of similarity between the corresponding road section and the optimal road section gray element, so the larger the value, the better the pavement performance of the corresponding road section, and the smaller the value, the worse its pavement performance; since the most basic principle of road section maintenance is that pavements with poorer performance are maintained first, the maintenance priority of each road section is inversely related to its association degree with the optimal road section gray element, the road section with the smaller association degree having the higher maintenance priority and the road section with the larger association degree having the lower maintenance priority.
CN202310691169.5A 2023-06-12 2023-06-12 Highway pavement preventive maintenance decision method based on data mining Pending CN116739376A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310691169.5A CN116739376A (en) 2023-06-12 2023-06-12 Highway pavement preventive maintenance decision method based on data mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310691169.5A CN116739376A (en) 2023-06-12 2023-06-12 Highway pavement preventive maintenance decision method based on data mining

Publications (1)

Publication Number Publication Date
CN116739376A true CN116739376A (en) 2023-09-12

Family

ID=87905661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310691169.5A Pending CN116739376A (en) 2023-06-12 2023-06-12 Highway pavement preventive maintenance decision method based on data mining

Country Status (1)

Country Link
CN (1) CN116739376A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117557122A (en) * 2024-01-11 2024-02-13 山东路科公路信息咨询有限公司 Highway maintenance decision analysis method and system based on data relation graph technology
CN117557122B (en) * 2024-01-11 2024-03-22 山东路科公路信息咨询有限公司 Highway maintenance decision analysis method and system based on data relation graph technology
CN117726324A (en) * 2024-02-07 2024-03-19 中国水利水电第九工程局有限公司 Road traffic construction inspection method and system based on data identification
CN117726324B (en) * 2024-02-07 2024-04-30 中国水利水电第九工程局有限公司 Road traffic construction inspection method and system based on data identification

Similar Documents

Publication Publication Date Title
CN107610469B (en) Day-dimension area traffic index prediction method considering multi-factor influence
CN116739376A (en) Highway pavement preventive maintenance decision method based on data mining
CN111737916B (en) Road and bridge disease analysis and maintenance decision method based on big data
CN113096388B (en) Short-term traffic flow prediction method based on gradient lifting decision tree
CN108417033A (en) Expressway traffic accident analysis prediction technique based on multi-dimensional factors
CN111652520B (en) Pavement maintenance intelligent decision system and method based on big data
CN107273605B (en) Actually measured axle load spectrum determination method based on multiple classifier system
CN114299742B (en) Speed limit information dynamic identification and update recommendation method for expressway
CN113918538B (en) New road maintenance data migration system based on artificial neural network
CN110836675A (en) Decision tree-based automatic driving search decision method
CN109544926B (en) Traffic flow restoration method based on intersection correlation
CN116933946A (en) Rail transit OD passenger flow prediction method and system based on passenger flow destination structure
CN112508336B (en) Space and environmental efficiency correlation measurement method based on structural equation model
CN111311905A (en) Particle swarm optimization wavelet neural network-based expressway travel time prediction method
CN114548494A (en) Visual cost data prediction intelligent analysis system
CN113726558A (en) Network equipment flow prediction system based on random forest algorithm
CN116663964B (en) Engineering unit price rapid calculation method and system based on characteristic values of list items
CN112241808A (en) Road surface technical condition prediction method, device, electronic equipment and storage medium
CN115438453B (en) Method for constructing road network facility topological structure by using observation data
CN114626655A (en) Multi-standard comprehensive evaluation method for regional comprehensive energy system
CN116756825A (en) Group structural performance prediction system for middle-small span bridge
CN115906669A (en) Dense residual error network landslide susceptibility evaluation method considering negative sample selection strategy
CN115691140A (en) Analysis and prediction method for space-time distribution of automobile charging demand
CN114880954A (en) Landslide sensitivity evaluation method based on machine learning
CN113919729A (en) Regional three-generation space influence and cooperation level evaluation method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination