CN114818493A - Method for quantitatively evaluating integrity degree of tunnel rock mass - Google Patents

Method for quantitatively evaluating integrity degree of tunnel rock mass

Info

Publication number
CN114818493A
Authority
CN
China
Prior art keywords
index
integrity
rock mass
model
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210435391.4A
Other languages
Chinese (zh)
Inventor
彭浩
梁铭
宋冠先
朱孟龙
韩玉
解威威
吕中玉
刘唐
梁健明
周邦鸿
谢灿荣
赵婷婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi Highway Inspection Co ltd
Guangxi Road and Bridge Engineering Group Co Ltd
Original Assignee
Guangxi Highway Inspection Co ltd
Guangxi Road and Bridge Engineering Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi Highway Inspection Co ltd, Guangxi Road and Bridge Engineering Group Co Ltd filed Critical Guangxi Highway Inspection Co ltd
Priority to CN202210435391.4A priority Critical patent/CN114818493A/en
Publication of CN114818493A publication Critical patent/CN114818493A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2113/00Details relating to the application field
    • G06F2113/14Pipes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/14Force analysis or force optimisation, e.g. static or dynamic forces

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Pure & Applied Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Analysis (AREA)
  • Geometry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Hardware Design (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Excavating Of Shafts Or Tunnels (AREA)

Abstract

The invention relates to the technical field of tunnel engineering, and in particular to a method for quantitatively evaluating the integrity degree of a tunnel rock mass, comprising the following steps: S1, acquiring engineering data; S2, refining the engineering data and/or performing secondary calculation on it to obtain sample data; S3, training a machine learning model for predicting the integrity degree of the surrounding rock by using the sample data and the XGBoost model; S4, calculating the SHAP value of each index in the sample data, performing interpretability analysis on the machine learning model, and determining the weight of each index; S5, obtaining a quantitative evaluation model of tunnel rock mass integrity by a multivariate instability index analysis method, using the weights obtained in step S4; and S6, evaluating the integrity degree of the rock mass on actual engineering data with the quantitative evaluation model. By improving the utilization rate and rationality of the drilling parameters, the method ensures the accuracy of the final quantitative index and its usability in tunnel engineering.

Description

Method for quantitatively evaluating integrity degree of tunnel rock mass
Technical Field
The invention relates to the technical field of tunnel engineering, in particular to a quantitative evaluation method for the integrity degree of a tunnel rock mass.
Background
During tunnel excavation, evaluating the integrity of the unexcavated rock mass ahead of the tunnel face is an important step in determining the subsequent excavation method and support measures and in guaranteeing construction safety. The mainstream quantitative methods for evaluating tunnel rock mass quality, such as the Q-value method, the RMR method, the BQ and corrected [BQ] method, the RQD method and the GSI method, all take rock mass integrity as an important reference index, but all of them obtain it from in-situ and laboratory tests, which greatly increases the difficulty of engineering application. For this reason, more and more research now draws on the results of on-site advanced geological prediction in search of a reasonable method for evaluating the integrity degree of the tunnel rock mass quickly and accurately. Compared with conventional geophysical prospecting methods (geological radar, TSP, infrared water detection and the like), digital drilling drills directly into the surrounding rock ahead of the tunnel face and returns a large number of drilling parameters, so it reflects the real geological information ahead of the face (including integrity information) most directly, and it has become an important research object in this area.
Two technical schemes are currently used to evaluate the integrity degree of a rock mass quantitatively from drilling data. (1) Mathematical statistics (data fitting) is the common way of establishing quantitative relationships between indexes. In this respect, Professor Leining used a drilling method to study experimentally how drilling energy responds to rock mass discontinuities, and proposed an empirical method for determining the rock quality designation (RQD) from the discontinuity frequency and the change in drilling energy; Professor Yuanzhongqi developed a digital Drilling Process Monitoring (DPM) system for engineering rock mass quality evaluation, improving existing evaluation of engineering rock mass quality, including the integrity degree; and Cao Rui Lang built a new digital drilling monitoring system for geological drilling rigs and proposed new quantitative indexes, such as DPI, for expressing rock mass integrity. (2) Machine learning: because digital drilling parameters are numerous and nonlinearly correlated, conventional mathematical statistics can hardly achieve an ideal fit, so researchers have used machine learning models, which have performed well in data analysis in recent years, to make predictions from drilling data. For example, the good House identified complete rock mass, incomplete rock mass and weak interlayers in a limestone region with a neural network model, and the Liu poet used a convolutional neural network model for inversion prediction of unexcavated, collapse-prone fracture zones in a tunnel.
The technical problems of the above studies are as follows:
(1) Mathematical statistics (data fitting) is used to obtain a quantitative relationship between drilling parameters and a mainstream rock mass integrity index, or to propose a new quantitative evaluation index. To guarantee the fit, such methods usually make idealized assumptions and ignore the complexity of the field drilling environment and process; the selection of drilling indexes and the determination of weights are strongly influenced by subjective factors; and the evaluation is not fine-grained (it mostly distinguishes only intact from non-intact rock mass);
(2) Machine learning methods that predict rock mass integrity from drilling data have two defects: first, they depend on a large amount of integrity data matched with the drilling data; second, a machine learning model is a black box that cannot directly give a quantitative index or formula, so the results are poorly interpretable and of limited practical use in engineering;
(3) Both approaches share a further defect: most of them evaluate a single tunnel section, whereas tunnel construction deals with a continuous rock mass. Single-section evaluation greatly increases the element of chance and offers limited guidance for tunnel construction.
Disclosure of Invention
The invention aims to solve the problem that the evaluation method based on machine learning in the prior art cannot directly give quantitative indexes or formulas, so that the interpretability of the result and the engineering practicability are poor, and provides a method for quantitatively evaluating the integrity degree of a tunnel rock mass.
In order to achieve the above purpose, the invention provides the following technical scheme:
A method for quantitatively evaluating the integrity degree of a tunnel rock mass comprises the following steps:
S1, acquiring engineering data;
S2, refining the engineering data and/or performing secondary calculation on it to obtain sample data;
S3, training a machine learning model for predicting the integrity degree of the surrounding rock by using the sample data and the XGBoost model;
S4, calculating the SHAP value of each index in the sample data, performing interpretability analysis on the machine learning model for predicting the integrity degree of the surrounding rock, and determining the weight of each index;
S5, obtaining a quantitative evaluation model of tunnel rock mass integrity by a multivariate instability index analysis method, using the weight of each index obtained in step S4;
and S6, evaluating the integrity degree of the rock mass on actual engineering data with the quantitative evaluation model of tunnel rock mass integrity.
Preferably, the engineering data comprises four primary indexes, namely a propelling speed, a propelling force, a torque and a rotating speed.
Further, the refining and/or secondary calculation of the engineering data in step S2 includes one or more of the following steps:
S21, filtering abnormal values;
S22, segmenting the engineering data at equal section distances, and calculating the mean and variance of the four primary indexes in each segment to obtain eight secondary indexes;
S23, oversampling the secondary indexes;
S24, performing feature correlation analysis on the primary or secondary indexes and eliminating features whose correlation is higher than a first threshold; and/or analyzing the feature importance of the primary or secondary indexes and eliminating indexes whose feature importance is smaller than a second threshold.
In a preferred embodiment, the invention provides data noise reduction and equidistant segmentation steps suited to the characteristics of the original drilling sampling data: the engineering data are segmented at equal section distances and the mean and variance of the four primary indexes are calculated in each segment. This avoids the element of chance in evaluating a single tunnel section, mines the data patterns that latently reflect rock mass integrity, and provides a high-quality data set.
Preferably, the first threshold is 0.8.
Preferably, the second threshold is 0.1.
Further, in step S5, the mathematical model $D_t$ of the multivariate instability index analysis method is as follows:
Figure BDA0003612694750000041
wherein $n$ is the number of influencing factors, $d_1$ to $d_n$ are the instability index factors, and $w_1$ to $w_n$ are the instability index weights.
Further, the weight of each index is calculated according to the following formula:
Figure BDA0003612694750000042
wherein $\mathrm{SHAP}_i$ denotes the SHAP value of the i-th index.
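As a rough illustration of how per-index weights could be derived from SHAP values, the sketch below normalizes each index's mean absolute SHAP value by the sum over all indexes. This normalization is an assumption made for illustration and may differ from the weighting formula of the invention; the function name `shap_weights` and the input layout (one row of SHAP values per sample) are likewise hypothetical.

```python
def shap_weights(shap_matrix, names):
    """Turn per-sample SHAP values into one normalized weight per index.

    shap_matrix: list of rows, one per sample, each holding one SHAP value
                 per index (same order as `names`).
    ASSUMPTION: weight_i = mean(|SHAP_i|) / sum_j mean(|SHAP_j|).
    """
    n_samples = len(shap_matrix)
    mean_abs = [
        sum(abs(row[j]) for row in shap_matrix) / n_samples
        for j in range(len(names))
    ]
    total = sum(mean_abs)
    return {name: value / total for name, value in zip(names, mean_abs)}


weights = shap_weights([[0.2, -0.1], [-0.4, 0.1]], ["DRM", "TPM"])
```

Under this assumption the weights are non-negative and sum to 1, so they can be compared directly across indexes.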
Preferably, in step S2, the sample data includes five indexes: the propulsion speed mean DRM, the propulsion force mean TPM, the torque mean TM, the rotation speed mean RM, and the propulsion speed variance DRV.
In a preferred embodiment, the indexes (that is, features) are screened: strongly correlated or low-importance indexes among the eight secondary indexes, such as the propulsion force variance TPV, the torque variance TV and the rotation speed variance RV, are removed, which improves the performance evaluation indexes of the machine learning model for predicting the integrity degree of the surrounding rock by 3% to 5%.
Further, in step S5, the quantitative evaluation model of the integrity degree of the tunnel rock mass calculates the index IRFI as follows:
Figure BDA0003612694750000051
wherein $W_1$ is the weight of the index DRM, $W_2$ is the weight of the index TPM, $W_3$ is the weight of the index TM, $W_4$ is the weight of the index RM, and $W_5$ is the weight of the index DRV.
Further, the evaluation result of the quantitative evaluation model of the integrity degree of the tunnel rock mass in the step S6 includes:
when DRM is more than or equal to 120, the evaluation result is argillaceous filling;
when the DRM is less than 120 and the IRFI value range is (0,31), the evaluation result is relatively complete;
when DRM is less than 120 and the IRFI value range is [31,55), the evaluation result is relatively broken;
when DRM is <120 and IRFI is > 55, the evaluation result is broken.
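The decision rules above can be sketched as a small lookup function. This is an illustration only: the function name `classify_integrity` is hypothetical, the boundary value IRFI = 55 is assigned to "broken" as an assumption (the text lists [31,55) and > 55), and rejecting non-positive IRFI values is also an assumption based on the stated range (0,31).

```python
def classify_integrity(drm: float, irfi: float) -> str:
    """Map a segment's DRM and IRFI values to the evaluation results of step S6."""
    # DRM at or above 120 is reported as argillaceous filling, whatever the IRFI.
    if drm >= 120:
        return "argillaceous filling"
    # ASSUMPTION: IRFI is expected to be positive, following the range (0, 31).
    if irfi <= 0:
        raise ValueError("IRFI is expected to be positive")
    if irfi < 31:
        return "relatively complete"
    if irfi < 55:
        return "relatively broken"
    # ASSUMPTION: IRFI == 55 is treated as "broken".
    return "broken"


result = classify_integrity(drm=95.0, irfi=42.0)  # falls in [31, 55)
```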
Compared with the prior art, the invention has the beneficial effects that:
1. Sample data are obtained by refining the engineering data and/or performing secondary calculation on it; a machine learning model for predicting the integrity degree of the surrounding rock is trained with the sample data and the XGBoost model; interpretability analysis of that model determines the drilling indexes closely related to integrity and the weight of each index; and a quantitative evaluation model of tunnel rock mass integrity is then established by the multivariate instability index analysis method, giving a quantitative interpretation of the integrity degree of the tunnel rock mass from digital drilling data.
Drawings
FIG. 1 is a flow chart of a quantitative evaluation method for the integrity of a tunnel rock mass.
Fig. 2 is a schematic diagram of surrounding rocks of different degrees of integrity.
Figure 3 is the advanced drilling raw data.
FIG. 4 is a schematic diagram of the proportion of each label in the secondary index data set.
FIG. 5 shows the correlation analysis result of the features of the secondary index data set.
FIG. 6 is a result of the analysis of the primary feature importance of the secondary index dataset.
FIG. 7 is a histogram comparing the performance of the XGBoost classification models before and after feature screening.
FIG. 8 is a schematic diagram of model interpretability analysis based on SHAP values.
Fig. 9 shows sample distributions of DRM feature values and the SHAP values.
FIG. 10 is a scatter plot based on IRFI.
Fig. 11 is a raw drilling data visualization image in example 2.
Detailed Description
The present invention will be described in further detail with reference to test examples and specific embodiments. It should be understood that the scope of the above-described subject matter is not limited to the following examples, and any techniques implemented based on the disclosure of the present invention are within the scope of the present invention.
Example 1
The invention combines mathematical statistics with machine learning, together with a series of effective drilling data preprocessing measures, to achieve high-quality quantitative evaluation of the integrity degree of the tunnel rock mass:
(1) the multivariate instability index analysis method is used as the mathematical model to construct a comprehensive index for quantitatively interpreting tunnel rock mass integrity, which facilitates engineering application;
(2) data noise reduction and equidistant segmentation, suited to the characteristics of the original drilling sampling data, mine the data patterns that latently reflect rock mass integrity and provide a high-quality data set;
(3) machine learning and its interpretability are used as tools to determine quantitatively the instability indexes and their exponential weights in the multivariate instability index analysis method, which improves the utilization rate and rationality of the drilling parameters and ensures the accuracy of the final quantitative index.
The embodiment provides a tunnel rock mass integrity quantitative interpretation comprehensive index calculation method based on digital drilling, which realizes quantitative evaluation of rock mass integrity, and as shown in figure 1, the method comprises the following steps:
S1, acquiring engineering data;
S2, refining the engineering data and/or performing secondary calculation on it to obtain sample data;
S3, training a machine learning model for predicting the integrity degree of the surrounding rock by using the sample data and the XGBoost model;
S4, calculating the SHAP value of each index in the sample data, performing interpretability analysis on the machine learning model for predicting the integrity degree of the surrounding rock, and determining the weight of each index;
S5, obtaining a quantitative evaluation model of tunnel rock mass integrity by a multivariate instability index analysis method, using the weight of each index obtained in step S4;
and S6, evaluating the integrity degree of the rock mass on actual engineering data with the quantitative evaluation model of tunnel rock mass integrity.
Specifically, in step S1, the engineering data are original drilling data collected from a tunnel under construction in Guangxi. The engineering geological conditions of the tunnel site are complex: the tunnel mainly passes through relatively broken to broken weathered sandstone strata in which mud-filled karst caves and fissures are developed, so geological hazards such as mud outburst at the tunnel face and collapse occur easily. The different degrees of integrity of the surrounding rock on site are shown in figure 2.
To ensure the safety of tunnel construction, a C6-2 multifunctional crawler drilling rig manufactured by Casagrande is used on site for advanced drilling, and geological forecasts are made from the drilling data and site conditions. The collected original drilling data are shown in figure 3; the four primary indexes are propulsion speed, propulsion force, torque and rotation speed. In total, ten thousand original drilling records were collected from the left and right lines of the tunnel, covering a cumulative tunnel length of about 150 meters.
In step S2, the engineering data are refined and/or subjected to secondary calculation to obtain sample data. This is the data preprocessing step, and it includes one or more of the following:
(1) outlier filtering
During drilling, the operating environment, mechanical operation and similar causes inevitably introduce abnormal data, and data from certain special drilling states, into the large volume of collected primary index data. Such data show up as one or more primary indexes with a recorded value of 0. Outlier filtering is therefore required before the drilling data are used to train the machine learning model. The filtering rule is given by formula (1):
$$T = x_{\text{propulsion speed}} \cdot x_{\text{propulsive force}} \cdot x_{\text{torque}} \cdot x_{\text{rotational speed}} \tag{1}$$
In formula (1), $x_{\text{propulsion speed}}$, $x_{\text{propulsive force}}$, $x_{\text{torque}}$ and $x_{\text{rotational speed}}$ denote the values of the four primary indexes at a given sampling point. If $T = 0$, the sampling point is judged abnormal and filtered out; otherwise ($T \neq 0$) it is judged normal and retained.
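The filtering rule of formula (1) can be sketched in a few lines. The record layout (a tuple of propulsion speed, propulsive force, torque and rotational speed) and the function name `filter_outliers` are assumptions made for this illustration.

```python
from typing import Iterable, List, Tuple

# ASSUMED record layout: (propulsion speed, propulsive force, torque, rotational speed)
Sample = Tuple[float, float, float, float]


def filter_outliers(samples: Iterable[Sample]) -> List[Sample]:
    """Keep only sampling points whose product T of the four indexes is non-zero."""
    kept = []
    for speed, thrust, torque, rotation in samples:
        t = speed * thrust * torque * rotation  # formula (1)
        if t != 0:  # T == 0 means at least one index was recorded as 0
            kept.append((speed, thrust, torque, rotation))
    return kept


raw = [(1.0, 2.0, 3.0, 4.0), (0.0, 2.0, 3.0, 4.0), (1.0, 2.0, 0.0, 4.0)]
clean = filter_outliers(raw)  # only the first record survives
```

Using the product rather than four separate comparisons mirrors the text: a single zero in any index makes T zero.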
(2) Data equidistant segmentation and feature engineering
To address the problem noted above, that advanced drilling data are usually interpreted with a single section as the evaluation unit, once outlier filtering is complete the original drilling data are divided into equal segments of length d = 0.5 m. The mean and variance of the four primary indexes in each segment are calculated according to formulas (2) and (3), and this feature engineering yields the new secondary indexes:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} \tag{2}$$
$$s^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 \tag{3}$$
where $n$ is the number of samples in the segment, $x_n$ is the value of the n-th sample, $x_i$ is the value of the i-th sample, $\bar{x}$ is the mean of the n samples, and $s^2$ is the variance of the n samples.
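A minimal sketch of the equidistant segmentation and feature engineering step follows. It assumes each record carries a drilling depth in meters followed by the four primary indexes, and it uses the population variance (division by n); both the record layout and the variance convention are assumptions, as is the function name `segment_features`.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def segment_features(records, d=0.5):
    """Group records into segments of length d and compute 8 secondary indexes.

    records: iterable of (depth_m, speed, thrust, torque, rotation)  # ASSUMED layout
    Returns {segment_index: [DRM, TPM, TM, RM, DRV, TPV, TV, RV]}.
    """
    groups = defaultdict(list)
    for depth, *values in records:
        groups[int(depth // d)].append(values)

    features = {}
    for seg, rows in sorted(groups.items()):
        columns = list(zip(*rows))  # one tuple of values per primary index
        means = [fmean(c) for c in columns]
        variances = [pvariance(c) for c in columns]  # ASSUMPTION: divide by n
        features[seg] = means + variances
    return features


records = [
    (0.1, 10.0, 100.0, 5.0, 50.0),
    (0.2, 20.0, 100.0, 5.0, 50.0),
    (0.6, 30.0, 200.0, 6.0, 60.0),
]
features = segment_features(records)
```

Each segment, rather than each raw sampling point, then becomes one sample of the secondary index data set.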
(3) Secondary index dataset analysis and processing
After data segmentation and feature engineering, the secondary indexes are the propulsion speed mean (DRM), propulsion force mean (TPM), torque mean (TM), rotation speed mean (RM), propulsion speed variance (DRV), propulsion force variance (TPV), torque variance (TV) and rotation speed variance (RV). The resulting secondary index data set contains 295 samples in total, and the proportion of each integrity degree is shown in fig. 4.
As fig. 4 shows, the data set is imbalanced to some extent, so it must be oversampled to ensure the accuracy and effectiveness of the subsequent machine learning model. SMOTE (Synthetic Minority Over-sampling Technique) can be regarded as an improvement on random oversampling. Traditional random oversampling increases the number of minority-class samples by simply copying them, which easily causes overfitting and reduces the generalization ability of the classifier. To overcome this, SMOTE analyzes the minority-class samples and synthesizes new samples from them to add to the data set.
The SMOTE oversampling process is shown in formula (4):
$$x_{new} = x_i + \mathrm{rand}(0,1)\cdot\left(x_j - x_i\right) \tag{4}$$
In formula (4), $x_i$ is a minority-class sample; the Euclidean distances between $x_i$ and all other minority-class samples are computed to obtain its k nearest neighbors, with k usually taken as 5. $x_j$ is a sample selected at random from the k nearest neighbors, and rand(0,1) is a random number between 0 and 1. The new sample $x_{new}$ is thus a random point on the line segment joining the minority-class sample $x_i$ and its neighbor $x_j$.
In a preferred embodiment, the invention therefore uses SMOTE to analyze the minority-class samples and add synthesized samples to the data set, which resolves the sample imbalance while avoiding the overfitting and reduced generalization ability caused by simply copying minority-class samples.
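The interpolation idea behind SMOTE can be sketched with the standard library alone. This is a simplified illustration, not the implementation used in the invention: the function name `smote`, the `seed` parameter and the choice to draw the base sample at random on every iteration are assumptions, and the minority set must contain at least two samples.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Synthesize n_new samples by interpolating between minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x_i = rng.choice(minority)
        # k nearest minority-class neighbors of x_i by Euclidean distance
        others = [x for x in minority if x is not x_i]
        neighbours = sorted(others, key=lambda x: math.dist(x_i, x))[:k]
        x_j = rng.choice(neighbours)
        gap = rng.random()  # rand(0, 1)
        # New point on the segment joining x_i and x_j
        synthetic.append([a + gap * (b - a) for a, b in zip(x_i, x_j)])
    return synthetic


minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=10, k=2)
```

Because every synthetic point lies between two real minority samples, the new samples stay inside the region the minority class already occupies instead of duplicating existing points.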
Pearson feature correlation analysis is then carried out on the processed data set according to formula (5), to explore whether the data dimensionality can be reduced before the machine learning model is trained.
$$\rho = \frac{\sum_{i=1}^{n}\left(a_i - \bar{a}\right)\left(b_i - \bar{b}\right)}{\sqrt{\sum_{i=1}^{n}\left(a_i - \bar{a}\right)^2}\,\sqrt{\sum_{i=1}^{n}\left(b_i - \bar{b}\right)^2}} \tag{5}$$
In formula (5), $a_i$ and $b_i$ are the data values, and $\bar{a}$ and $\bar{b}$ are their mean values.
The Pearson coefficient $\rho$ lies between -1 and +1. Values near +1 indicate strong positive correlation, values near -1 indicate strong negative correlation, and values near 0 indicate no linear correlation. A correlation heat map plotted from the results is shown in fig. 5.
As fig. 5 shows, the strongest positive correlation is 0.46, between TPV and DRV, and the strongest negative correlation is -0.68, between DRV and TPM. No pair of features is highly correlated (absolute value > 0.8), so no features are removed on the basis of the correlation analysis. Correlation analysis screens features purely numerically; feature importance should also be considered from the model's point of view. The feature importance is therefore given a preliminary visualization using the feature_importances_ attribute of the XGBoost model itself, and the result is shown in fig. 6.
As fig. 6 shows, among the eight secondary indexes the importance of the first five (DRM, TPM, TM, RM and DRV) exceeds 0.1, so they are important features, while the importance of the last three (TPV, TV and RV) is below 0.1, so they are not. TPV, TV and RV are therefore eliminated, and the data set formed by the first five features is retained for training the machine learning model.
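The two screening criteria (correlation above 0.8, importance below 0.1) can be sketched as follows. The text does not say which member of a highly correlated pair would be dropped, so keeping the more important one is an assumption; the function names and the toy data are hypothetical, and the importance scores are taken as given rather than computed by a model.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)  # undefined for constant columns


def screen_features(columns, importance, corr_max=0.8, imp_min=0.1):
    """Drop low-importance features, then one of each highly correlated pair."""
    # Visit features from most to least important so the stronger one is kept.
    names = sorted(columns, key=lambda n: importance[n], reverse=True)
    kept = []
    for name in names:
        if importance[name] < imp_min:
            continue
        if any(abs(pearson(columns[name], columns[k])) > corr_max for k in kept):
            continue
        kept.append(name)
    return kept


columns = {
    "DRM": [1.0, 2.0, 3.0, 4.0],
    "DUP": [2.0, 4.0, 6.0, 8.1],   # hypothetical near-duplicate of DRM
    "TM": [4.0, 1.0, 3.0, 2.0],
    "RV": [1.0, 3.0, 2.0, 5.0],
}
importance = {"DRM": 0.4, "DUP": 0.3, "TM": 0.25, "RV": 0.05}
selected = screen_features(columns, importance)
```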
Once the engineering data have been preprocessed in step S2, step S3 is executed: a machine learning model for predicting the integrity degree of the surrounding rock is trained with the sample data and the XGBoost model.
XGBoost stands for Extreme Gradient Boosting. It is an efficient machine learning algorithm that evolved from the traditional classification and regression tree (CART) algorithm. As a representative Boosting method among ensemble algorithms, XGBoost builds weak evaluators (CART trees) on the data one at a time over many iterations and accumulates their results, obtaining better regression or classification performance than any single model. This additive strategy, with a single decision tree as the weak evaluator, can be written as formula (6):
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i) \tag{6}$$
In formula (6), $\hat{y}_i$ is the prediction of the whole model for sample $i$, $K$ is the total number of weak evaluators, $f_k$ is the k-th decision tree, and $x_i$ is the feature vector of sample $i$.
XGBoost introduces model complexity as a measure of the algorithm's operating efficiency, so its objective function consists of a traditional loss function plus the model complexity, as shown in formula (7):
$$Obj = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega(f_k) \tag{7}$$
$$\Omega(f_k) = \gamma T + \frac{\lambda \lVert \omega \rVert^2}{2} \tag{8}$$
In formula (7), $Obj$ is the objective function of the model and $n$ is the total number of samples fed to the k-th tree. The first term is the traditional loss function, which measures the difference between the true label $y_i$ and the predicted value $\hat{y}_i$. The second term is the model complexity, expressed through a transformation $\Omega$ of the tree model that measures complexity from the structure of the tree; it is expanded in formula (8), where $\gamma$ and $\lambda$ are the complexity coefficients, $T$ is the number of leaf nodes of the decision tree, and $\omega$ is the vector of leaf weights.
To solve the objective function, a second-order Taylor expansion is applied to formula (7), giving formula (9):
$$Obj \approx \sum_{j=1}^{T}\left[\left(\sum_{i \in I_j} g_i\right)\omega_j + \frac{1}{2}\left(\sum_{i \in I_j} h_i + \lambda\right)\omega_j^2\right] + \gamma T \tag{9}$$
In formula (9), $\gamma$ and $\lambda$ are the complexity coefficients, $T$ is the number of leaf nodes of the decision tree, $g_i$ and $h_i$ are the first-order and second-order derivatives of the loss function for sample $x_i$, $j$ is the index of a leaf node, $\omega_j$ is the weight of the j-th leaf node, and $I_j$ is the subset of samples that fall on the j-th leaf node.
With the structure of the tree built into the objective in this way, taking the derivative with respect to $\omega_j$ and setting it to zero gives the optimal leaf weight and the minimum value $Obj_{min}$ of the objective function. The smaller $Obj_{min}$ is, the better the model is considered to perform. The calculation is shown in formulas (10) and (11):
$$\omega_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda} \tag{10}$$
$$Obj_{min} = -\frac{1}{2}\sum_{j=1}^{T}\frac{\left(\sum_{i \in I_j} g_i\right)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma T \tag{11}$$
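As a numerical illustration of this step, the sketch below uses the standard XGBoost closed form: for one leaf with gradient sum G and Hessian sum H, the optimal weight is -G / (H + λ) and the leaf contributes -G² / (2(H + λ)) to the minimized objective, to which γ per leaf is added. The function names, the names `G` and `H`, and the default values of `lam` and `gamma` are assumptions for this example.

```python
def optimal_leaf(grads, hessians, lam=1.0):
    """Return (optimal weight, objective contribution) for a single leaf."""
    G = sum(grads)      # sum of first-order derivatives on the leaf
    H = sum(hessians)   # sum of second-order derivatives on the leaf
    weight = -G / (H + lam)
    score = -0.5 * G * G / (H + lam)
    return weight, score


def objective_min(leaves, lam=1.0, gamma=0.0):
    """Minimized objective of a tree given (grads, hessians) for each leaf."""
    total = sum(optimal_leaf(g, h, lam)[1] for g, h in leaves)
    return total + gamma * len(leaves)  # gamma * T penalizes the number of leaves


weight, score = optimal_leaf([1.0, 3.0], [1.0, 2.0], lam=1.0)
obj = objective_min([([1.0, 3.0], [1.0, 2.0])], lam=1.0, gamma=0.5)
```

Here G = 4 and H = 3, so the leaf weight is -1.0, the leaf score is -2.0, and the one-leaf objective with γ = 0.5 is -1.5.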
the invention uses machine learning as a tool to obtain the reasonability or not of the index of the multivariate unstable exponent analysis method and the corresponding exponential power, and one of the most important measurement indexes is whether the performance of the machine learning model is excellent enough or not. The interpretable analysis of the high-performance model is carried out, and therefore sufficient rationality is achieved only when important drilling indexes and weights are obtained. For this, the model needs to be evaluated for performance first.
In order to comprehensively evaluate the performance of the model, the selected model evaluation indexes comprise the Accuracy (ACC), the Precision (PRE), the recall Rate (REC) and the harmonic mean score (F) 1 ). The calculation expressions (12) to (15) of all the evaluation indexes are shown.
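$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \quad (12)$$
$$\mathrm{PRE} = \frac{TP}{TP + FP} \quad (13)$$
$$\mathrm{REC} = \frac{TP}{TP + FN} \quad (14)$$
$$F_1 = \frac{2 \times \mathrm{PRE} \times \mathrm{REC}}{\mathrm{PRE} + \mathrm{REC}} \quad (15)$$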
In the formulas: TP is the number of true positives, i.e. samples that are actually positive and are correctly predicted by the classifier as positive; FN is the number of false negatives, i.e. samples that are actually positive but are wrongly predicted as negative; FP is the number of false positives, i.e. samples that are actually negative but are wrongly predicted as positive; TN is the number of true negatives, i.e. samples that are actually negative and are correctly predicted as negative.
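The four evaluation indexes can be computed directly from the confusion counts. The sketch below uses the usual definitions of accuracy, precision, recall and $F_1$ for one class treated as the positive class; the function name and the counts are hypothetical, and how per-class values are combined across the four integrity classes is not assumed here.

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion counts of one positive class."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    pre = tp / (tp + fp)   # share of predicted positives that are truly positive
    rec = tp / (tp + fn)   # share of actual positives that are recovered
    f1 = 2 * pre * rec / (pre + rec)  # harmonic mean of precision and recall
    return {"ACC": acc, "PRE": pre, "REC": rec, "F1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fn=10, fp=10, tn=40)
print(metrics)
```

The sketch does not guard against empty denominators (for example, a class that is never predicted); a production version would need to handle those cases explicitly.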
Finally, the performance of the XGBoost classification model constructed by the invention is shown in Fig. 7. In the histogram of Fig. 7, the left bar of each group is the model performance evaluation result without feature screening (8 features), and the right bar is the result after feature screening (5 features). All four model performance evaluation indexes improve by 3% to 5%, which verifies the effectiveness of the feature engineering that the invention carries out on the basis of feature importance.
Therefore, according to the characteristics of the original drilling sampling data, on the one hand, data noise reduction and equidistant segmentation are used to mine the data patterns that potentially reflect the integrity degree of the rock mass, providing a high-quality data set; on the other hand, index (i.e. feature) screening removes strongly correlated or low-importance indexes, so that all performance evaluation indexes of the machine learning model for predicting the integrity degree of the surrounding rock improve by 3% to 5%.
After the machine learning model for predicting the integrity of the surrounding rock has been trained, step S4 is carried out to perform interpretability analysis on the model.
At present, many machine learning models are "black box" algorithms. Although they can make very good predictions, they cannot explain well how those predictions are made, and many data scientists find it difficult to understand why an algorithm produces a given prediction result, which is a serious drawback. Machine learning interpretability, also known as XAI (Explainable Artificial Intelligence), has therefore long been a focus and a difficulty in the field.
Feature importance analysis (feature importances) is the most commonly used interpretability method for tree models and for ensemble algorithms built on tree-model theory. From the feature importance output, the user can see intuitively which features are most important to the model, which are key features, and which feature values have very little influence on the model prediction result. Feature engineering can then be performed accordingly, or the output can serve as a reference when predicting subsequent new sample data (ensuring the authenticity of important feature data as far as possible). In XGBoost, importance_type is commonly set to gain, i.e. the importance of each feature is measured by calculating the split gain of that feature. Based on equation (11), the gain of a split point is defined as shown in equation (16):
Figure BDA0003612694750000151
In the formula,
Figure BDA0003612694750000152
L and R represent the left split and the right split of the tree model, respectively.
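As an illustration of how a split gain can be evaluated, the sketch below assumes that equation (16) takes the standard XGBoost form, $Gain = \frac{1}{2}\left[\frac{G_L^2}{H_L+\lambda} + \frac{G_R^2}{H_R+\lambda} - \frac{(G_L+G_R)^2}{H_L+H_R+\lambda}\right] - \gamma$, where $G$ and $H$ are the sums of the first- and second-order gradients on each side of the split. The function name and the gradient values are hypothetical.

```python
from typing import Sequence


def split_gain(g_left: Sequence[float], h_left: Sequence[float],
               g_right: Sequence[float], h_right: Sequence[float],
               lam: float, gamma: float) -> float:
    """Gain of one candidate split under the assumed standard XGBoost formula."""
    GL, HL = sum(g_left), sum(h_left)
    GR, HR = sum(g_right), sum(h_right)

    def score(G: float, H: float) -> float:
        # Structure score of a single leaf: G^2 / (H + lambda)
        return G * G / (H + lam)

    # Improvement of two child leaves over the unsplit parent, minus the leaf penalty gamma.
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma


# Hypothetical gradients: the split separates samples with opposite gradient signs.
gain = split_gain([1.0, 1.0], [1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], lam=1.0, gamma=0.1)
print(gain)
```

A feature's gain-based importance is then obtained by aggregating such gains over all splits that use the feature; a split that does not separate the gradients yields a gain of at most $-\gamma$ and would not be made.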
However, this method yields only the feature importance over the whole data set, i.e. a rough global interpretability, and the importance is usually computed on the training set. On the one hand, when the model is over-fitted, the feature importance is misleading; on the other hand, it is easily affected by high-cardinality features (features with a large number of distinct values), so numerical variables are often ranked higher. For a classification problem, other interpretability theories and methods are still needed to understand how important each index is for each label and to explain the predictions for individual samples.
SHAP is a model interpretability method based on the Shapley value, a concept that comes from game theory. It is currently a versatile and relatively new interpretability method that can be used for both global interpretation and local interpretation. SHAP is a post-hoc explanation method whose core idea is to calculate the marginal contribution of each feature to the model output and then explain the "black box" model at the global and local levels. SHAP builds an additive explanation model in which every feature is regarded as a "contributor". For each prediction sample, the model produces a predicted value, and the SHAP value is the value attributed to each feature of that sample.
Assume that the model baseline score (typically the mean of the target variable over all samples) is $y_{base}$, the $i$-th sample is $x_i$, the $k$-th feature of the $i$-th sample is $x_{i,k}$, and the SHAP value of this feature is $f(x_{i,k})$. Then the predicted value of the model for sample $x_i$ is:
$$y_i = y_{base} + f(x_{i,1}) + f(x_{i,2}) + \cdots + f(x_{i,k}) \quad (17)$$
When $f(x_{i,k}) > 0$, the feature has a positive effect on the predicted target value; conversely, when $f(x_{i,k}) < 0$, the feature has a negative effect on the predicted target value. The SHAP value therefore gives not only the magnitude of a feature's influence but also whether that influence is positive or negative in each sample. Based on the SHAP values calculated for each sample, local and global interpretability analysis of the model can be performed.
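The additive property of equation (17) can be checked on a small example. The sketch below computes exact Shapley values by enumerating feature subsets, under the simplifying assumption that an "absent" feature is replaced by a single baseline value; this is an illustration of the principle, not the tree-specific algorithm of the SHAP library. The names `shapley_values` and `toy_model` and all numbers are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(model: Callable[[List[float]], float],
                   x: Sequence[float], baseline: Sequence[float]) -> List[float]:
    """Exact Shapley value of each feature of x relative to a baseline sample."""
    n = len(x)

    def value(subset: set) -> float:
        # Features in the subset take the sample's values, the rest the baseline values.
        z = [x[k] if k in subset else baseline[k] for k in range(n)]
        return model(z)

    phi = []
    for k in range(n):
        others = [j for j in range(n) if j != k]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for subsets S of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {k}) - value(s))
        phi.append(total)
    return phi


def toy_model(z: List[float]) -> float:
    # Hypothetical model with one additive term and one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [3.0, 2.0, 4.0]
baseline = [1.0, 1.0, 1.0]
phi = shapley_values(toy_model, x, baseline)
y_base = toy_model(baseline)
y_i = toy_model(x)
print(phi, y_base + sum(phi), y_i)
```

The baseline prediction plus the sum of the feature contributions reproduces the model prediction for the sample, which is the relationship stated in equation (17). The enumeration grows exponentially with the number of features, so it is practical only for a handful of features.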
In step S4 of this embodiment, the importance of each feature to the four surrounding-rock integrity classes is analyzed by calculating the SHAP values, and the result is shown in Fig. 8. The abscissa of Fig. 8 is the calculated SHAP value, and the features on the ordinate are sorted by SHAP value from top to bottom. It can be seen that, except for argillaceous filling, the five important features all contribute, to varying extents, to the correct prediction of each rock mass integrity class.
To further analyze the interpretability of the DRM-based prediction of argillaceous filling, the sample distribution and the SHAP values of the DRM feature are plotted for the label argillaceous filling, and the result is shown in Fig. 9. As can be seen from Fig. 9, the DRM feature value has a critical point at about 120 for the prediction of argillaceous filling: when DRM < 120, the feature contributes negatively to the prediction of argillaceous filling (i.e. a prediction error); when DRM > 120, the feature contributes positively to the prediction of argillaceous filling (i.e. a correct prediction).
In step S4, the SHAP value of each index in the sample data is calculated and interpretability analysis is performed on the machine learning model for predicting the integrity of the surrounding rock. The method then needs to obtain the tunnel rock mass integrity quantitative evaluation model by multivariate instability index analysis, so the weight of each index must be determined. The specific process is as follows.
In the prior art, multivariate instability index analysis uses measurement statistics to obtain, under a relative relationship, an instability index ($D_t$), which is a comprehensive quantitative index describing the intensity of an event. A larger $D_t$ value means that the event occurs more intensely.
The mathematical model of the standard multivariate instability index analysis is shown in the following formula (18):
$$D_t = d_1^{W_1} \times d_2^{W_2} \times \cdots \times d_n^{W_n} \quad (18)$$
In formula (18): $n$ is the number of influencing factors, $d_n$ is an instability index factor, and $W_n$ is an instability index weight.
When the method is applied to the quantitative evaluation of tunnel rock mass integrity based on digital drilling data, modeling complexity and evaluation accuracy must be considered together, which means that the drilling indexes closely related to the integrity degree, and the weight corresponding to each index, need to be determined.
According to equation (18) and Fig. 8, the invention proposes an Interval Rock Fracture Index (IRFI), whose instability index factors are DRM, RM, TPM, TM and DRV. The exponent of each factor is calculated according to the following equation (19):
Figure BDA0003612694750000172
In the formula, $SHAP_i$ represents the SHAP value of the $i$-th feature (i.e. index) of the machine learning model.
In step S5, the tunnel rock mass integrity quantitative evaluation model is obtained by multivariate instability index analysis, using the weight $W_i$ (i.e. the exponent) of each index obtained above. The IRFI quantitative formula determined in this embodiment is shown in equation (20):
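One plausible reading of this step is that each weight is the index's share of the total SHAP importance. The sketch below assumes $W_i = SHAP_i / \sum_j SHAP_j$, with $SHAP_i$ taken as a non-negative global importance such as the mean absolute SHAP value of the index; both the normalisation and the importance values are assumptions made for illustration and are not taken from equation (19).

```python
from typing import Dict


def shap_weights(global_importance: Dict[str, float]) -> Dict[str, float]:
    """Normalise non-negative global SHAP importances so that the weights sum to 1."""
    total = sum(global_importance.values())
    if total <= 0:
        raise ValueError("the total SHAP importance must be positive")
    return {name: value / total for name, value in global_importance.items()}


# Hypothetical mean absolute SHAP values for the five drilling indexes.
importance = {"DRM": 3.0, "TPM": 1.0, "TM": 1.0, "RM": 1.0, "DRV": 4.0}
w = shap_weights(importance)
print(w)
```

Under this assumption the weights always sum to 1, so the resulting product of powers behaves like a weighted geometric mean of the index values.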
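$$\mathrm{IRFI} = \mathrm{DRM}^{0.27} \times \mathrm{TPM}^{0.08} \times \mathrm{TM}^{0.08} \times \mathrm{RM}^{0.09} \times \mathrm{DRV}^{0.49} \quad (20)$$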
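The product-of-powers form of equation (20) is straightforward to evaluate. In the sketch below the exponents are the ones in equation (20) and the sample values are those of the 21.5 m row of Table 2; the function and variable names are hypothetical.

```python
import math
from typing import Dict

# Exponents (weights) of equation (20).
IRFI_EXPONENTS: Dict[str, float] = {
    "DRM": 0.27, "TPM": 0.08, "TM": 0.08, "RM": 0.09, "DRV": 0.49,
}


def instability_index(factors: Dict[str, float], weights: Dict[str, float]) -> float:
    """Product of each factor raised to its weight (multivariate instability index form)."""
    return math.prod(factors[name] ** weight for name, weight in weights.items())


# Index values of the 21.5 m interval in Table 2.
sample = {"DRM": 18.2, "TPM": 101.2, "TM": 48.43, "RM": 135.05, "DRV": 45.35}
irfi = instability_index(sample, IRFI_EXPONENTS)
print(round(irfi, 2))
```

Because the exponents in equation (20) are rounded to two decimals, a value computed this way can differ slightly from the IRFI value tabulated for the same interval.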
To verify the effectiveness of formula (20) in quantitatively evaluating the integrity degree of the tunnel rock mass, an IRFI scatter plot of the data set is drawn, and the result is shown in Fig. 10. In Fig. 10, the abscissa is the serial number of the sample data: samples marked in light blue are relatively complete rock mass, samples marked in dark blue are relatively broken rock mass, samples marked in purple are broken rock mass, and samples marked in brown are argillaceous filling. The ordinate is the IRFI value calculated for each sample.
As can be seen from Fig. 10, the IRFI value distinguishes relatively complete, relatively broken and broken rock masses well, with a comprehensive data coverage rate of about 90%. However, it cannot distinguish argillaceous filling, owing to the particularity of this type of data, as can be seen from Fig. 8: unlike the other three types, argillaceous filling data are essentially related only to DRM (the interval drilling rate mean), whereas IRFI is a comprehensive index. Precisely because of this particularity, argillaceous filling can be distinguished by adding a discrimination condition. The final quantitative interpretation scheme is shown in Table 1 below.
TABLE 1 quantitative interpretation table and accuracy based on model interpretability and IRFI index
Figure BDA0003612694750000181
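The discrimination scheme can be written as a small decision function. The sketch below uses the thresholds stated in claim 10 (DRM of 120, IRFI of 31 and 55); the function name is hypothetical, and the treatment of an IRFI of exactly 55 as broken is an assumption, since the stated intervals leave that boundary open.

```python
def interpret_integrity(drm: float, irfi: float) -> str:
    """Quantitative interpretation of one drilling interval from DRM and IRFI."""
    # The DRM condition is checked first: it separates argillaceous filling
    # from the three classes that IRFI can distinguish.
    if drm >= 120:
        return "argillaceous filling"
    if irfi < 31:
        return "relatively complete"
    if irfi < 55:
        return "relatively broken"
    # Assumption: the boundary value IRFI == 55 is treated as broken.
    return "broken"


# (DRM, IRFI) pairs taken from rows of Table 2.
examples = [(18.2, 41.91), (217.77, 386.76), (21.87, 11.28), (23.54, 144.9)]
labels = [interpret_integrity(drm, irfi) for drm, irfi in examples]
print(labels)
```

Checking DRM before IRFI matters because argillaceous filling intervals also have very large IRFI values and would otherwise be labelled as broken.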
Example 2
In order to verify the evaluation effect of the quantitative evaluation index of rock mass integrity based on advanced drilling data, as presented in Table 2, advanced drilling data at chainage ZK109+240 on the tunnel face of the left tunnel are selected for verification. Specifically, the 20 m section from 21 m to 41 m of borehole No. 2 is selected (this section is representative because it contains many surrounding-rock integrity types), and the image of the original drilling data is shown in Fig. 11.
In the advanced geological forecast, the evaluation by professional technicians is that the rock mass at 21-28 m is relatively complete to relatively broken, with a suspected weak interlayer with argillaceous filling at 25 m, and that the rock mass at 29-41 m is relatively broken to broken, with a suspected weak interlayer with argillaceous filling developed at 38-40 m. The results of discrimination by the IRFI value are shown in Table 2 below:
Table 2 Quantitative determination results based on IRFI values

| Depth (m) | DRM | TPM | TM | RM | DRV | IRFI | Quantitative interpretation |
|---|---|---|---|---|---|---|---|
| 21.5 | 18.2 | 101.2 | 48.43 | 135.05 | 45.35 | 41.91 | Relatively broken |
| 22 | 15.55 | 101.35 | 48.95 | 130.76 | 43.92 | 39.48 | Relatively broken |
| 22.5 | 16.27 | 100.98 | 54.62 | 127.63 | 35.47 | 36.3 | Relatively broken |
| 23 | 23.54 | 101.54 | 63.15 | 148.6 | 488.42 | 144.9 | Broken |
| 23.5 | 20.79 | 101.57 | 56.94 | 138.43 | 201.8 | 90.34 | Broken |
| 24 | 26.21 | 101.63 | 59.6 | 136.14 | 53.02 | 50.74 | Relatively broken |
| 24.5 | 26.21 | 100.94 | 60.53 | 145.41 | 627.73 | 167.27 | Broken |
| 25 | 115.35 | 101.13 | 64.1 | 138.6 | 5998.92 | 737.77 | Argillaceous filling |
| 25.5 | 23.85 | 101.31 | 64.88 | 137.07 | 21.85 | 30.56 | Relatively complete |
| 26 | 21.87 | 101.51 | 63.19 | 148.62 | 2.49 | 11.28 | Relatively complete |
| 26.5 | 22.17 | 101.21 | 59.82 | 140.73 | 12.97 | 24.75 | Relatively complete |
| 27 | 20.95 | 101.41 | 59.14 | 146.86 | 22.16 | 30.61 | Relatively complete |
| 27.5 | 23.84 | 101.59 | 58.16 | 145.51 | 28.47 | 36.84 | Relatively broken |
| 28 | 24.19 | 101.66 | 56.05 | 135.27 | 62.43 | 53.41 | Broken |
| 28.5 | 16.99 | 101.56 | 57.91 | 142.6 | 51.43 | 44.57 | Relatively broken |
| 29 | 20.48 | 101.58 | 51.94 | 134.86 | 53.4 | 47.07 | Relatively broken |
| 29.5 | 17.02 | 101.71 | 55.13 | 143.97 | 59.8 | 47.79 | Relatively broken |
| 30 | 15.29 | 101.81 | 53.65 | 135.82 | 50.06 | 42.32 | Relatively broken |
| 30.5 | 19.05 | 101.63 | 54.28 | 143.1 | 88.73 | 59.43 | Broken |
| 31 | 22.29 | 101.7 | 51.03 | 135.04 | 47.14 | 45.31 | Relatively broken |
| 31.5 | 19.79 | 101.96 | 52.17 | 139.12 | 49.24 | 45.01 | Relatively broken |
| 32 | 23.59 | 102 | 57.61 | 147.46 | 36.75 | 41.56 | Relatively broken |
| 32.5 | 18.1 | 101.73 | 55.14 | 141.09 | 35.82 | 37.93 | Relatively broken |
| 33 | 19.92 | 102 | 55.36 | 147.97 | 72.42 | 54.83 | Relatively broken |
| 33.5 | 26.03 | 102.24 | 54.72 | 136.01 | 75.04 | 59.45 | Broken |
| 34 | 21.29 | 102.4 | 53.61 | 138.04 | 43.38 | 43.27 | Relatively broken |
| 34.5 | 16.11 | 101.97 | 52.72 | 140.58 | 44.7 | 40.72 | Relatively broken |
| 35 | 17.24 | 101.84 | 54.67 | 148.6 | 45.95 | 42.36 | Relatively broken |
| 35.5 | 18.95 | 101.91 | 55.38 | 145.48 | 69.33 | 52.89 | Relatively broken |
| 36 | 33.54 | 102.1 | 58.6 | 141.02 | 544.33 | 166.23 | Broken |
| 36.5 | 32.35 | 101.88 | 66.28 | 148.88 | 573.5 | 171.27 | Broken |
| 37 | 15.51 | 101.72 | 57.45 | 144.16 | 63.08 | 47.98 | Relatively broken |
| 37.5 | 11.64 | 101.78 | 59.8 | 151.37 | 34.61 | 33.54 | Relatively broken |
| 38 | 10.64 | 102.08 | 58.21 | 144.2 | 35.06 | 32.73 | Relatively broken |
| 38.5 | 11.98 | 101.97 | 60.15 | 143.25 | 197.39 | 77.63 | Broken |
| 39 | 217.77 | 101.84 | 63.72 | 142.3 | 1087.18 | 386.76 | Argillaceous filling |
| 39.5 | 187.77 | 101.79 | 69.83 | 147.03 | 2893.91 | 600.63 | Argillaceous filling |
| 40 | 138.19 | 101.59 | 67.81 | 145.84 | 10707.95 | 1032.73 | Argillaceous filling |
| 40.5 | 16.46 | 101.51 | 61.78 | 135.48 | 750.77 | 160.09 | Broken |
| 41 | 15.78 | 101.86 | 65.17 | 136.3 | 583.14 | 140.9 | Broken |
As can be seen from Table 2, the interval rock mass integrity quantitative evaluation index IRFI provided by the invention gives a detailed and accurate evaluation of the integrity degree of the rock mass based on advanced drilling data, and the forecasting precision meets actual engineering requirements.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A method for quantitatively evaluating the integrity degree of a tunnel rock mass is characterized by comprising the following steps:
S1, acquiring engineering data;
S2, selecting and/or carrying out secondary calculation on the engineering data to obtain sample data;
S3, training a machine learning model for predicting the integrity of the surrounding rock by using the sample data and the XGBoost model;
S4, calculating the SHAP value of each index in the sample data, performing interpretability analysis on the machine learning model for predicting the integrity degree of the surrounding rock, and determining the weight of each index;
S5, obtaining a tunnel rock mass integrity quantitative evaluation model by using a multivariate instability index analysis method according to the weight of each index obtained in step S4;
and S6, evaluating the integrity degree of the rock mass on the actual engineering data by using the tunnel rock mass integrity quantitative evaluation model.
2. The method for quantitatively evaluating the integrity of the tunnel rock mass according to claim 1, wherein the engineering data comprises four primary indexes, namely a propelling speed, a propelling force, a torque and a rotating speed.
3. The method for quantitatively evaluating the integrity of the tunnel rock mass according to claim 2, wherein the selecting and/or secondary calculation of the engineering data in step S2 comprises one or more of the following steps:
S21, filtering abnormal values;
S22, segmenting the engineering data at equal interval distances, and calculating the mean value and variance of the four primary indexes in each segment to obtain eight secondary indexes;
S23, performing oversampling processing on the secondary indexes;
S24, performing feature correlation analysis on the primary indexes or the secondary indexes, and eliminating features with a correlation degree higher than a first threshold value; and/or analyzing the feature importance of the primary indexes or the secondary indexes, and rejecting indexes with a feature importance smaller than a second threshold value.
4. The method for quantitatively evaluating the integrity of a tunnel rock mass according to claim 3, wherein the first threshold value is 0.8.
5. The method for quantitatively evaluating the integrity of a tunnel rock mass according to claim 3, wherein the second threshold value is 0.1.
6. The method for quantitatively evaluating the integrity of the tunnel rock mass according to claim 1, wherein the mathematical model $D_t$ of the multivariate instability index analysis in step S5 is as follows:
Figure FDA0003612694740000021
wherein $n$ is the number of influencing factors, $d_1$ to $d_n$ are instability index factors, and $w_1$ to $w_n$ are instability index weights.
7. The method for quantitatively evaluating the integrity of the tunnel rock mass according to claim 6, wherein the weight of each index is calculated according to the following formula:
Figure FDA0003612694740000022
wherein $SHAP_i$ indicates the SHAP value of the $i$-th index.
8. The method according to claim 7, wherein in step S2, the sample data includes five indexes, namely a propulsion speed mean DRM, a propulsion force mean TPM, a torque mean TM, a rotation speed mean RM and a propulsion speed variance DRV.
9. The method for quantitatively evaluating the integrity of the tunnel rock mass according to claim 8, wherein the calculation formula IRFI of the model for quantitatively evaluating the integrity of the tunnel rock mass in the step S5 is as follows:
Figure FDA0003612694740000023
wherein $W_1$ is the weight of the index DRM, $W_2$ is the weight of the index TPM, $W_3$ is the weight of the index TM, $W_4$ is the weight of the index RM, and $W_5$ is the weight of the index DRV.
10. The method for quantitatively evaluating the integrity of the tunnel rock mass according to claim 9, wherein the evaluation result of the quantitative evaluation model of the integrity of the tunnel rock mass in the step S6 comprises:
when DRM ≥ 120, the evaluation result is argillaceous filling;
when DRM < 120 and the IRFI value is in the range (0, 31), the evaluation result is relatively complete;
when DRM < 120 and the IRFI value is in the range [31, 55), the evaluation result is relatively broken;
when DRM < 120 and IRFI > 55, the evaluation result is broken.
CN202210435391.4A 2022-04-24 2022-04-24 Method for quantitatively evaluating integrity degree of tunnel rock mass Pending CN114818493A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210435391.4A CN114818493A (en) 2022-04-24 2022-04-24 Method for quantitatively evaluating integrity degree of tunnel rock mass

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210435391.4A CN114818493A (en) 2022-04-24 2022-04-24 Method for quantitatively evaluating integrity degree of tunnel rock mass

Publications (1)

Publication Number Publication Date
CN114818493A true CN114818493A (en) 2022-07-29

Family

ID=82506561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210435391.4A Pending CN114818493A (en) 2022-04-24 2022-04-24 Method for quantitatively evaluating integrity degree of tunnel rock mass

Country Status (1)

Country Link
CN (1) CN114818493A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115840921A (en) * 2023-02-24 2023-03-24 中南大学 Rock mass quality grading method based on machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination