CN110674868A - Stratum lithology identification system and method based on high-dimensional drilling parameter information - Google Patents


Info

Publication number
CN110674868A
Authority
CN
China
Prior art keywords
principal component
prediction
training samples
component data
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910898862.3A
Other languages
Chinese (zh)
Inventor
张宁
张幼振
姚克
邵俊杰
孙道明
李旺年
钟自成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Research Institute Co Ltd of CCTEG
Original Assignee
Xian Research Institute Co Ltd of CCTEG
Application filed by Xian Research Institute Co Ltd of CCTEG
Priority to CN201910898862.3A
Publication of CN110674868A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a stratum lithology recognition system and a stratum lithology recognition method based on high-dimensional drilling parameter information. The stratum lithology recognition system comprises a drilling test bed, a data dimension reduction system, a data clustering system and a prediction identification system. The drilling test bed is used for constructing data holes and obtaining high-dimensional drilling parameters, and each group of high-dimensional drilling parameters respectively forms a corresponding training sample and a corresponding prediction sample; the data dimension reduction system is used for calculating the high-dimensional drilling parameters in the training samples, determining the number of principal components, obtaining the classification number of the preliminary training samples and the principal component data set of the training samples, and further obtaining the principal component data set of the prediction samples; the data clustering system is used for carrying out fuzzy kernel clustering on the principal component data set of the training samples to obtain the optimal number K of clusters, and obtaining the principal component data sets of the training samples with determined classification; and the prediction identification system is used for establishing a discrimination criterion from the principal component data sets of the training samples with determined classification, and classifying the prediction samples to obtain the lithology categories to which the prediction samples belong.

Description

Stratum lithology identification system and method based on high-dimensional drilling parameter information
Technical Field
The invention belongs to the field of geotechnical engineering survey, relates to a stratum lithology identification system, and particularly relates to a stratum lithology identification system and a stratum lithology identification method based on high-dimensional drilling parameter information.
Background
China covers a vast territory with extremely complex geological conditions. The lithology and the combinations of rock strata differ markedly from place to place, which strongly affects the rationality of engineering design and the safety of mining. At present, strata are generally identified by geological drilling and geophysical prospecting. These lithology identification methods suffer from low precision and coarse procedures, so formation characteristics cannot be identified promptly and accurately, and field engineering practice cannot be guided effectively. Meanwhile, the real-time drilling parameter sets obtained by drilling equipment are large in volume and high in dimension. Formation lithology is generally related to several drilling parameters, the information carried by the individual parameters overlaps, and some parameters are even redundant, so lithology cannot be identified by a multi-parameter linear method, and a suitable data processing and analysis method is lacking. Scholars in China and abroad have studied drilling parameter data sets with various numerical simulation and test methods. However, when one or more index parameters and their thresholds are used for judgment, the amount of information is small and the thresholds differ from mine to mine; when several indexes approach their thresholds to different degrees, there is no good way to reach a comprehensive judgment. In addition, the volume of measured data to be computed is huge and the parameters interfere with one another, so calculation and analysis cannot be carried out in real time, and a real-time, credible prediction of the stratum cannot be made.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a stratum lithology identification system and a stratum lithology identification method based on high-dimensional drilling parameter information, and solve the technical problem of insufficient identification precision in the prior art.
In order to solve the technical problems, the invention adopts the following technical scheme:
a stratum lithology recognition system based on high-dimensional drilling parameter information comprises a drilling test bed, a data dimension reduction system, a data clustering system and a prediction recognition system which are sequentially connected;
the drilling test bed is used for constructing data holes and obtaining high-dimensional drilling parameters, and each group of high-dimensional drilling parameters respectively form a corresponding training sample and a corresponding prediction sample;
the data dimension reduction system is used for calculating high-dimensional drilling parameters in the training sample and determining the number of principal component components; obtaining the classification number of the preliminary training sample and the principal component data set of the training sample; further obtaining a principal component data set of the prediction sample;
the data clustering system is used for carrying out fuzzy kernel clustering on the principal component data sets of the training samples to obtain the optimal number K of the principal component data sets of the training samples to be clustered, and obtaining the principal component data sets of the training samples to be classified;
the prediction identification system is used for establishing a discrimination criterion for the principal component data set of the training samples determined to be classified, classifying the prediction samples and obtaining the lithology categories to which the prediction samples belong.
The invention also has the following technical characteristics:
the drilling test bed comprises a hydraulic pump station, an operation platform, a flushing liquid circulating unit and a data acquisition unit which are respectively connected with a host.
The data dimension reduction system comprises a first input end, a first data processor and a first output end which are sequentially connected.
The data clustering system comprises a second input end, a second data processor and a second output end which are connected in sequence.
The prediction identification system comprises a third input end, a third data processor and a third output end which are sequentially connected.
The invention also provides a stratum lithology identification method based on the high-dimensional drilling parameter information, which adopts the stratum lithology identification system based on the high-dimensional drilling parameter information; the high-dimensional drilling parameters comprise the mechanical drilling speed, the rotary torque, the drilling pressure, the rotating speed, the rotary pressure and the pressure of a slurry pump.
The method specifically comprises the following steps:
Step one, determining the rock stratum characteristics and the prediction range of a typical stratum according to the production information of the target stratum region, constructing data holes with the host of the drilling test bed, and obtaining high-dimensional drilling parameters with the data acquisition unit of the drilling test bed, wherein each group of high-dimensional drilling parameters respectively forms a corresponding training sample and a corresponding prediction sample;
Step two, calculating the high-dimensional drilling parameters in the training samples with the data dimension reduction system to obtain the correlation coefficients among the high-dimensional drilling parameters, then obtaining the contribution rate of each preset principal component, sorting the contribution rates from large to small, and obtaining the cumulative contribution rate of each preset principal component; when the cumulative contribution rate first exceeds 90% at a certain preset principal component, that component and all preset principal components before it are taken as the principal components, which determines the number of principal components and the corresponding feature vectors; performing a weighted calculation with the contribution rate of each principal component as the weight to obtain the weighted score of each training sample, sorting the training samples from high to low by weighted score, preliminarily classifying the training samples according to their weighted scores to obtain the classification number of the preliminary training samples, and obtaining the principal component data set of the training samples and the principal component data set of the prediction samples according to the feature vectors of the principal components;
Step three, carrying out fuzzy kernel clustering on the principal component data set of the training samples obtained in step two with the data clustering system: taking the classification number of the preliminary training samples obtained in step two as the initial number of clusters, setting the fuzziness, constructing a kernel function, establishing a membership matrix, and completing the clustering of the principal component data set of the training samples by iteratively optimizing the parameters, so as to obtain the optimal number K of clusters, wherein the optimal number K is the number of lithology classes; at the same time calculating the cluster center of each lithology class and the data set corresponding to each lithology class, to obtain the principal component data sets of the training samples with determined classification;
Step four, establishing a discrimination criterion from the principal component data sets of the training samples with determined classification obtained in step three with the prediction identification system, calculating, by the Mahalanobis distance discrimination method, the Mahalanobis distance between each principal component data set of the prediction samples obtained in step two and each principal component data set of the training samples with determined classification, selecting the smallest Mahalanobis distance, and classifying the prediction samples to obtain the lithology categories to which the prediction samples belong.
Compared with the prior art, the invention has the following technical effects:
the method has the advantages of high identification precision, short clustering time and high data processing speed, effectively utilizes the high-dimensional drilling parameter data set, filters redundant and wrong parameter information, can identify the lithology and structural information of the stratum in real time, has important practical significance for predicting the lithology of the stratum by applying the proposed method, and has important significance for realizing dynamic intelligent detection of hidden disaster-causing factors of the rock stratum. The method can provide accurate information for stratum intelligent identification, provide reference and guidance for prediction of rock characteristics in rock engineering construction such as other well engineering, slope engineering and the like, and provide ideas and methods for processing high-dimensional data in engineering practice.
Drawings
Fig. 1 is a schematic structural diagram of a formation lithology identification system based on high-dimensional drilling parameter information according to the present invention.
Fig. 2 is a schematic diagram of the structure of the drilling test bed of the invention.
FIG. 3 is a schematic structural diagram of a data dimension reduction system according to the present invention.
Fig. 4 is a schematic structural diagram of the data clustering system of the present invention.
FIG. 5 is a schematic diagram of a predictive identification system according to the present invention.
Fig. 6 is a schematic diagram of a clustering effect of the data clustering system according to an actual application example of the present invention.
The reference numerals in the figures denote: drilling test bed 1, data dimension reduction system 2, data clustering system 3, prediction identification system 4, hydraulic pump station 5, operation platform 6, flushing liquid circulating unit 7, host 8, data acquisition unit 9, first input end 10, first data processor 11, first output end 12, second input end 13, second data processor 14, second output end 15, third input end 16, third data processor 17 and third output end 18.
The present invention will be explained in further detail with reference to examples.
Detailed Description
The invention aims to provide a stratum lithology recognition system and a stratum lithology recognition method based on high-dimensional drilling parameter information, which can process the high-dimensional drilling parameter information, reduce the mutual influence of drilling parameters, obtain accurate stratum lithology characteristics in real time and provide an effective means for scientific design and optimization of underground coal mine drilling construction parameters. The method can not only provide theoretical support for stratum lithology identification, but also provide reference and guidance for processing high-dimensional parameter data sets for other geotechnical engineering construction and the like, and simultaneously provide ideas and methods for fault diagnosis of drilling equipment.
The invention provides a stratum lithology recognition system based on high-dimensional drilling parameter information, which comprises a drilling test bed, a data dimension reduction system, a data clustering system and a prediction recognition system which are sequentially connected.
The drilling test bed is used for constructing data holes and obtaining high-dimensional drilling parameters, and each group of high-dimensional drilling parameters respectively form a corresponding training sample and a corresponding prediction sample;
the data dimension reduction system is used for completing dimension reduction of a high-dimensional drilling parameter information set to form new mutually-irrelevant comprehensive principal component parameters, simultaneously keeping the information of the original data set to the maximum extent, eliminating the influence of various parameter dimensions, calculating the high-dimensional drilling parameters in a training sample and determining the number of principal component components; obtaining the classification number of the preliminary training sample and the principal component data set of the training sample; further obtaining a principal component data set of the prediction sample;
the data clustering system is used for carrying out fuzzy kernel clustering on the principal component data sets of the training samples to obtain the optimal number K of the principal component data sets of the training samples to be clustered, and obtaining the principal component data sets of the training samples to be classified;
the prediction identification system uses the Mahalanobis distance discrimination method to calculate the Mahalanobis distance between each prediction sample and each principal component data set of the training samples with determined classification, compares these distances, and identifies the lithology of the prediction sample.
The drilling test bed comprises a hydraulic pump station, an operation platform, a flushing liquid circulating unit and a data acquisition unit which are respectively connected with a host;
the host is used for driving the drilling tool to construct data holes;
the hydraulic pump station is used for providing power to the host;
the operation platform is used for operating and controlling the host;
the data acquisition unit is used for collecting and transmitting high-dimensional drilling parameters;
the flushing liquid circulating unit is used for cooling the drill bit and discharging slag;
the data dimension reduction system comprises a first input end, a first data processor and a first output end which are sequentially connected;
the first input end is used for receiving training samples and prediction samples;
the first data processor is used for standardizing the high-dimensional drilling parameter data sets in the training samples and prediction samples, calculating the correlation coefficients, contribution rates and cumulative contribution rates, and finally determining the number of principal components and the corresponding feature vectors; obtaining the classification number of the preliminary training samples and the principal component data set of the training samples; and further obtaining the principal component data set of the prediction samples;
and the first output end is mainly used for outputting the classification number of the preliminary training sample, the principal component data set of the training sample and the principal component data set of the prediction sample.
The data clustering system comprises a second input end, a second data processor and a second output end which are connected in sequence,
the second input end is used for receiving the classification number of the preliminary training samples and the principal component data set of the training samples output by the first output end;
the second data processor is used for taking the classification number of the preliminary training sample as the original clustering number, setting the ambiguity, constructing a kernel function, establishing a membership matrix, and calculating a clustering center through continuous iteration optimization parameters to obtain the optimal number K of the principal component data set clustering of the training sample and the principal component data set of the training sample for determining the classification;
and the second output end is mainly used for outputting the principal component data set of the training sample for determining classification.
The prediction identification system comprises a third input end, a third data processor and a third output end which are sequentially connected;
the third input end is used for receiving the principal component data set of the prediction samples output by the first output end and the principal component data sets of the training samples with determined classification output by the second output end;
the third data processor establishes a discrimination function, establishes a discrimination criterion, calculates the Mahalanobis distance between each principal component data set of the prediction sample and the principal component data set of the training sample determined to be classified by a Mahalanobis distance discrimination method, selects the minimum Mahalanobis distance, classifies the prediction sample, and obtains the lithology type of the prediction sample;
and the third output end can be used for outputting the lithology identification result.
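The Mahalanobis distance discrimination performed by the third data processor can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `mahalanobis_classify`, the synthetic class data, and the choice to estimate one covariance matrix per class from that class's own samples are all assumptions, since the text does not say how the covariance is obtained.

```python
import numpy as np

def mahalanobis_classify(sample, class_sets):
    """Assign `sample` to the class whose data set is nearest in Mahalanobis distance.

    class_sets: list of (n_k, p) arrays, one per lithology class (p >= 2).
    Assumption: each class uses the covariance of its own samples.
    """
    sample = np.asarray(sample, dtype=float)
    distances = []
    for data in class_sets:
        data = np.asarray(data, dtype=float)
        center = data.mean(axis=0)              # class center
        cov = np.cov(data, rowvar=False)        # (p, p) sample covariance
        diff = sample - center
        d2 = diff @ np.linalg.solve(cov, diff)  # diff^T cov^{-1} diff
        distances.append(float(np.sqrt(d2)))
    return int(np.argmin(distances)), distances

# Hypothetical principal component data for two classified training sets
rng = np.random.default_rng(1)
class_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(40, 2))
class_b = rng.normal(loc=[4.0, 4.0], scale=0.5, size=(40, 2))
label, dists = mahalanobis_classify([3.8, 4.1], [class_a, class_b])
```

Here `label` is the index of the class with the smallest distance, which corresponds to "selecting the minimum Mahalanobis distance" in the description.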
In the invention, the method for determining the accuracy of the system and correcting the system comprises the following steps:
Step 1, measuring the physical and mechanical parameters (hardness and strength) of each prediction sample through field tests to obtain the material properties of the field prediction samples, determining the actual lithology type of each prediction sample, and comparing it with the prediction result to obtain the prediction accuracy.
Step 2, when the prediction accuracy is less than 99%, constructing additional data holes with the drilling test bed to enlarge the training sample data set, repeating steps one to three to obtain new principal component data sets of the training samples with determined classification, and identifying the prediction samples again with step four to obtain a new prediction accuracy; steps one to four are repeated, increasing the number of training samples each time, until the prediction accuracy is greater than or equal to 99%.
Step 3, using the final optimized principal component data sets of the training samples with determined classification during real-time drilling construction: a high-dimensional drilling parameter set is obtained in real time as a prediction sample, its principal component data set is obtained with the data dimension reduction system, and the above steps are applied to identify in real time the lithology of the stratum to which the data belong and to obtain the operating condition of the drilling equipment.
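The accuracy check that drives this correction loop can be illustrated with a short sketch. The helper names and the lithology labels below are hypothetical; only the 99% threshold comes from the text.

```python
def prediction_accuracy(predicted, actual):
    """Fraction of prediction samples whose predicted lithology matches the field result."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

def needs_more_training(accuracy, target=0.99):
    """True when the training set should be enlarged and steps one to four repeated."""
    return accuracy < target

# Hypothetical labels for four prediction samples
predicted = ["sandstone", "mudstone", "coal", "sandstone"]
actual = ["sandstone", "mudstone", "coal", "mudstone"]
acc = prediction_accuracy(predicted, actual)
retrain = needs_more_training(acc)
```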
The following embodiments of the present invention are provided, and it should be noted that the present invention is not limited to the following embodiments, and all equivalent changes based on the technical solutions of the present invention are within the protection scope of the present invention.
Example 1:
according to the technical scheme, as shown in fig. 1 to 5, the formation lithology recognition system based on the high-dimensional drilling parameter information provided by the embodiment comprises a drilling test bed, a data dimension reduction system, a data clustering system and a prediction recognition system which are sequentially connected;
the drilling test bed is used for constructing data holes and obtaining high-dimensional drilling parameters, and each group of high-dimensional drilling parameters respectively form a corresponding training sample and a corresponding prediction sample;
the data dimension reduction system is used for calculating high-dimensional drilling parameters in the training samples and determining the number of principal component components; obtaining the classification number of the preliminary training sample and the principal component data set of the training sample; further obtaining a principal component data set of the prediction sample;
the data clustering system is used for carrying out fuzzy kernel clustering on the principal component data sets of the training samples to obtain the optimal number K of the principal component data sets of the training samples to be clustered, and obtaining the principal component data sets of the training samples to be classified;
and the prediction identification system is used for establishing a discrimination criterion for the principal component data set of the training samples determined to be classified, and classifying the prediction samples to obtain the lithology categories to which the prediction samples belong.
Specifically, the drilling test bed comprises a hydraulic pump station, an operation platform, a flushing fluid circulating unit and a data acquisition unit which are respectively connected with a host.
Specifically, the data dimension reduction system comprises a first input end, a first data processor and a first output end which are sequentially connected.
Specifically, the data clustering system comprises a second input end, a second data processor and a second output end which are connected in sequence.
Specifically, the prediction identification system comprises a third input end, a third data processor and a third output end which are sequentially connected.
Example 2:
according to the technical scheme, the embodiment provides a formation lithology identification method based on high-dimensional drilling parameter information, and the method adopts a formation lithology identification system based on high-dimensional drilling parameter information as in embodiment 1; the high-dimensional drilling parameters comprise the mechanical drilling speed, the rotary torque, the drilling pressure, the rotating speed, the rotary pressure and the pressure of a slurry pump. The method specifically comprises the following steps:
Step one, determining the rock stratum characteristics and the prediction range of a typical stratum according to the production information of the target stratum region, constructing data holes with the host of the drilling test bed, and obtaining high-dimensional drilling parameters with the data acquisition unit of the drilling test bed, wherein each group of high-dimensional drilling parameters respectively forms a corresponding training sample and a corresponding prediction sample;
Step two, calculating the high-dimensional drilling parameters in the training samples with the data dimension reduction system to obtain the correlation coefficients among the high-dimensional drilling parameters, then obtaining the contribution rate of each preset principal component, sorting the contribution rates from large to small, and obtaining the cumulative contribution rate of each preset principal component; when the cumulative contribution rate first exceeds 90% at a certain preset principal component, that component and all preset principal components before it are taken as the principal components, which determines the number of principal components and the corresponding feature vectors; and performing a weighted calculation with the contribution rate of each principal component as the weight to obtain the weighted score of each training sample, sorting the training samples from high to low by weighted score, preliminarily classifying the training samples according to their weighted scores to obtain the classification number of the preliminary training samples, and obtaining the principal component data set of the training samples and the principal component data set of the prediction samples according to the feature vectors of the principal components.
In the second step, the method specifically comprises the following steps:
Step 2.1, data standardization of the high-dimensional drilling parameters:
To keep the result from being affected by the dimensions (units) of the parameters, the high-dimensional drilling parameters are first standardized so that each index has a mean of 0 and a standard deviation of 1. The standardization is:
$$\tilde{\alpha}_{ij} = \frac{\alpha_{ij} - \mu_j}{s_j}$$
wherein
$$\mu_j = \frac{1}{n}\sum_{i=1}^{n}\alpha_{ij}, \qquad s_j = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\alpha_{ij} - \mu_j\right)^2}$$
wherein $\alpha_{ij}$ is the index value of the j-th high-dimensional drilling parameter of the i-th evaluation object in the data set, $\tilde{\alpha}_{ij}$ is its standardized value, i is the serial number of the evaluation object, j is the serial number of the high-dimensional drilling parameter, n is the number of evaluation objects, $\mu_j$ is the mean of $\alpha_{ij}$ over i, and $s_j$ is the standard deviation of $\alpha_{ij}$ over i.
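As a minimal sketch of this standardization step, the following NumPy snippet scales each parameter column to mean 0 and standard deviation 1. The function name `standardize`, the sample values, and the use of the sample standard deviation (`ddof=1`) are assumptions made for illustration.

```python
import numpy as np

def standardize(samples):
    """Column-wise z-score: each drilling parameter gets mean 0 and standard deviation 1."""
    samples = np.asarray(samples, dtype=float)
    mu = samples.mean(axis=0)        # mean of each parameter (column)
    s = samples.std(axis=0, ddof=1)  # sample standard deviation of each parameter
    return (samples - mu) / s, mu, s

# Hypothetical readings: rows are evaluation objects, columns are two drilling parameters
raw = np.array([[1.0, 200.0], [2.0, 220.0], [3.0, 260.0], [4.0, 240.0]])
z, mu, s = standardize(raw)
```

Returning `mu` and `s` alongside the standardized data makes it possible to apply the same scaling to prediction samples later.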
Step 2.2, calculating the correlation coefficient of the high-dimensional drilling parameters:
Calculate the correlation coefficients among the high-dimensional drilling parameters and construct the correlation coefficient matrix. Let the correlation coefficient matrix be $R = (r_{wq})_{N \times N}$, wherein
$$r_{wq} = \frac{\sum_{i=1}^{n} \tilde{\alpha}_{iw}\,\tilde{\alpha}_{iq}}{n-1}$$
wherein $r_{wq}$ is the correlation coefficient between the w-th and the q-th high-dimensional drilling parameters, w and q are serial numbers of the high-dimensional drilling parameters in the data set (w, q = 1, 2, …, N), $\tilde{\alpha}_{iw}$ is the standardized value from step 2.1 of the w-th parameter of the i-th evaluation object, and n is the number of evaluation objects.
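The correlation coefficient matrix can be computed directly from the standardized data, as the sketch below shows. The function name and the random test data are illustrative assumptions; the result is checked against NumPy's own `np.corrcoef`.

```python
import numpy as np

def correlation_matrix(z):
    """R = Z^T Z / (n - 1) for standardized data Z (n evaluation objects x N parameters)."""
    z = np.asarray(z, dtype=float)
    n = z.shape[0]
    return z.T @ z / (n - 1)

# Hypothetical data set: 50 evaluation objects, 6 drilling parameters
rng = np.random.default_rng(0)
raw = rng.normal(size=(50, 6))
z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)  # standardize as in step 2.1
R = correlation_matrix(z)
```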
Step 2.3, constructing preset principal component components:
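Calculate the eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_N \ge 0$ of the correlation coefficient matrix $R = (r_{wq})$ and the corresponding eigenvectors $\mu_1, \mu_2, \dots, \mu_N$, where $\mu_j = (\mu_{1j}, \mu_{2j}, \dots, \mu_{Nj})^T$ (the eigenvector $\mu_j$ is distinct from the mean $\mu_j$ used in step 2.1), and form the linear combinations:
$$\begin{cases} y_1 = \mu_{11}\tilde{\alpha}_1 + \mu_{21}\tilde{\alpha}_2 + \dots + \mu_{N1}\tilde{\alpha}_N \\ y_2 = \mu_{12}\tilde{\alpha}_1 + \mu_{22}\tilde{\alpha}_2 + \dots + \mu_{N2}\tilde{\alpha}_N \\ \quad\vdots \\ y_N = \mu_{1N}\tilde{\alpha}_1 + \mu_{2N}\tilde{\alpha}_2 + \dots + \mu_{NN}\tilde{\alpha}_N \end{cases}$$
wherein $y_N$ represents the index value of the N-th preset principal component and $\tilde{\alpha}_j$ is the standardized j-th high-dimensional drilling parameter.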
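A short sketch of this eigen-decomposition, assuming the data have already been standardized, is given below. The function name `preset_components` and the synthetic data are hypothetical. `np.linalg.eigh` is used because the correlation matrix is symmetric; it returns eigenvalues in ascending order, so they are re-sorted from large to small.

```python
import numpy as np

def preset_components(z):
    """Eigen-decompose the correlation matrix of standardized data z (n x N).

    Returns eigenvalues (descending), eigenvectors as columns, and the
    scores y = z @ eigenvectors (column j holds the j-th preset component).
    """
    z = np.asarray(z, dtype=float)
    R = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)  # ascending order for symmetric input
    order = np.argsort(eigvals)[::-1]     # re-sort from large to small
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals, eigvecs, z @ eigvecs

# Hypothetical correlated data: 100 evaluation objects, 6 drilling parameters
rng = np.random.default_rng(0)
raw = rng.normal(size=(100, 6)) @ rng.normal(size=(6, 6))
z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
eigvals, eigvecs, y = preset_components(z)
```

The resulting components are mutually uncorrelated, and the variance of each one equals its eigenvalue.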
Step 2.4, selecting principal component:
and calculating the contribution rate of the preset principal component, wherein the size of the contribution rate reflects the influence of the preset principal component. And calculating the cumulative contribution rates of the preset principal component components as follows:
Figure BDA0002211153400000112
Figure BDA0002211153400000113
where $b_j$ is the contribution rate of the j-th preset principal component and $\alpha_j$ is the cumulative contribution rate of the preset principal components.
When the cumulative contribution rate $\alpha_j$ approaches 1, extract the preset principal components whose eigenvalues give a cumulative contribution rate greater than 90%, and select p (p ≤ N) index variables $y_1, y_2, \dots, y_p$ as the p principal components, where p is the serial number of the preset principal component at which the cumulative contribution rate exceeds 90%, that is, the number of principal components.
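The selection rule of step 2.4 may be sketched as follows, for illustration only. The sketch assumes the conventional definitions, namely that the contribution rate is each eigenvalue divided by the sum of all eigenvalues and that the cumulative rate is the running sum, and it reads the 90% rule as taking the smallest p whose cumulative rate exceeds the threshold. The eigenvalues shown are made up.

```python
import numpy as np

def select_components(eigvals, threshold=0.90):
    """Return contribution rates b, cumulative rates alpha, and the count p."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    b = lam / lam.sum()                       # contribution rate of each component
    alpha = np.cumsum(b)                      # cumulative contribution rate
    p = int(np.argmax(alpha > threshold)) + 1 # first index exceeding the threshold
    return b, alpha, p

# Made-up eigenvalues of a 6 x 6 correlation matrix (they sum to 6).
eigvals = [3.0, 1.5, 1.0, 0.3, 0.15, 0.05]
b, alpha, p = select_components(eigvals)
```

With these values the cumulative rates are 0.50, 0.75 and about 0.917 for the first three components, so three principal components are retained.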
Step 2.5 obtaining the weighted score of the training sample:
Let the weighted score be Z; then
Figure BDA0002211153400000114
The training samples are sorted from high to low by weighted score and preliminarily classified according to their weighted scores, which gives the preliminary classification number q of the training samples. The principal component data set of the training samples and the principal component data set of the prediction samples are then obtained from the feature vectors of the principal components.
Step three: use the data clustering system to perform fuzzy kernel clustering on the principal component data set of the training samples obtained in step two. Take the preliminary classification number obtained in step two as the original number of clusters, set the fuzziness degree, construct the kernel function, and establish the membership matrix. Clustering of the principal component data set of the training samples is completed by iteratively optimizing the parameters, which yields the optimal number of clusters K; this optimal number K is the number of lithology classifications. At the same time, calculate the cluster center and the corresponding data set of each lithology classification to obtain the principal component data sets of the training samples with determined classifications.
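For illustration, the weighted score of step 2.5 may be sketched as below. The sketch assumes that Z is the sum of the principal component values weighted by their contribution rates, which matches the wording of claim 7; the function name and all numbers are hypothetical.

```python
import numpy as np

def weighted_scores(y: np.ndarray, b: np.ndarray) -> np.ndarray:
    """y: (n_samples, p) principal component values; b: (p,) contribution rates."""
    return y @ b   # Z_i = b_1 * y_i1 + ... + b_p * y_ip

# Hypothetical principal component values for 3 training samples, p = 3.
y = np.array([[ 1.0,  0.5, -0.2],
              [-0.5,  1.2,  0.3],
              [ 2.0, -1.0,  0.1]])
b = np.array([0.5, 0.3, 0.2])   # hypothetical contribution rates

Z = weighted_scores(y, b)
ranking = np.argsort(-Z)        # sample indices from highest to lowest score
```

Here the scores are 0.61, 0.17 and 0.72, so the samples rank in the order 2, 0, 1. How the ranked scores are cut into preliminary classes is not specified in the text and is left out of the sketch.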
The third step specifically comprises the following steps:
step 3.1, setting the original number of classes to q, setting the fuzziness degree m and the objective function precision $\varepsilon$, choosing suitable kernel function parameters, and constructing the kernel function;
step 3.2, establishing a membership matrix, and initializing the membership matrix;
step 3.3, calculating a clustering center;
Figure BDA0002211153400000121
In the formula: $x_i$ is a sample in the original feature space, $i = 1, 2, \dots, n$; $\mu_{iu}$ is the degree of membership of the i-th sample $x_i$ in class u, $u = 1, \dots, q$, with $\mu_{iu} \in [0, 1]$; m is the fuzziness degree; $v_a$ is the cluster center of class a in the high-dimensional feature space, $a = 1, 2, \dots, q$;
Figure BDA0002211153400000122
is the distance from the i-th sample $x_i$ to the cluster center in the high-dimensional feature space.
The objective function is minimized by setting its partial derivative with respect to the membership matrix to zero, which gives
Figure BDA0002211153400000123
r=1,2,…,n;s=1,2,…,q
In the formula: $\mu_{rs}$ is the degree of membership of the r-th sample $x_r$ in the s-th class, and the distance term is the distance between the r-th sample $x_r$ and the cluster center of the j-th class in the high-dimensional feature space.
Step 3.4, comparing the membership matrices of successive iterations using a matrix norm; if the iteration has converged, stop; otherwise return to step 3.3.
This yields the optimal number K of clusters of the principal component data set of the training samples, where K is the number of lithology classifications. At the same time, the cluster center and the corresponding data set of each lithology classification are calculated to obtain the principal component data sets of the training samples with determined classifications.
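Steps 3.1 to 3.4 may be sketched, purely as an illustration, with one common kernel fuzzy c-means variant. The sketch assumes a Gaussian kernel with width `sigma` and keeps the cluster centers in the input space; the kernel actually used, its parameters, and the way the optimal number K is chosen are not given in the text, so all names, defaults and data below are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, v, sigma):
    """K(x_i, v_a) = exp(-||x_i - v_a||^2 / sigma^2) for every sample/center pair."""
    d2 = ((x[:, None, :] - v[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma**2)

def kernel_fcm(x, q, m=2.0, sigma=5.0, eps=1e-5, max_iter=300, seed=0):
    rng = np.random.default_rng(seed)
    # Step 3.2: initialize the membership matrix; each row sums to 1.
    u = rng.random((x.shape[0], q))
    u /= u.sum(axis=1, keepdims=True)
    # Starting centers: ordinary membership-weighted means.
    um = u ** m
    v = um.T @ x / um.sum(axis=0)[:, None]
    for _ in range(max_iter):
        # Step 3.3: update the centers with kernel-weighted memberships.
        w = (u ** m) * gaussian_kernel(x, v, sigma)
        v = w.T @ x / w.sum(axis=0)[:, None]
        # Membership update: for a Gaussian kernel, 1 - K is proportional
        # to the squared distance in the feature space.
        dist = np.maximum(1.0 - gaussian_kernel(x, v, sigma), 1e-12)
        inv = dist ** (-1.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        # Step 3.4: compare successive membership matrices by matrix norm.
        converged = np.linalg.norm(u_new - u) < eps
        u = u_new
        if converged:
            break
    return u, v

# Synthetic stand-in for a principal component data set (3 groups, p = 3).
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(c, 0.3, size=(40, 3)) for c in (0.0, 3.0, 6.0)])
u, v = kernel_fcm(x, q=3)
labels = u.argmax(axis=1)   # hard assignment: class with the largest membership
```

Each row of `u` holds the memberships of one sample and sums to 1, and each row of `v` is one cluster center. Results depend on the initialization and on `sigma`, so several runs may be needed in practice.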
Step four: use the prediction recognition system to establish a discrimination criterion from the principal component data sets of the training samples with determined classifications obtained in step three. Using the Mahalanobis distance discrimination method, calculate the Mahalanobis distance between the principal component data set of each prediction sample obtained in step two and each principal component data set of the training samples with determined classifications, select the smallest Mahalanobis distance, and classify the prediction sample accordingly to obtain the lithology category to which it belongs.
The Mahalanobis distance discrimination method in step four comprises the following steps:
The principal component data set of a prediction sample is compared with the principal component data sets of the training samples with determined classifications two classes at a time. The following description takes two such classes, A and B, as an example.
Step 4.1, calculating the mean vectors and covariance matrices of the two classes A and B;
$$m_a = \operatorname{mean}(A),\quad m_b = \operatorname{mean}(B),\quad S_1 = \operatorname{cov}(A),\quad S_2 = \operatorname{cov}(B)$$
step 4.2, calculating the total covariance matrix;
Figure BDA0002211153400000131
where $n_1$ and $n_2$ are the numbers of samples in A and B, respectively, $m_a$ is the mean vector of A, $m_b$ is the mean vector of B, $S_1$ is the covariance matrix of A, $S_2$ is the covariance matrix of B, and S is the total covariance matrix.
Step 4.3, calculating the difference d between the squared Mahalanobis distances from the principal component data set x of the prediction sample to A and to B;
$$d = (x - m_a)\,S^{-1}\,(x - m_a)^T - (x - m_b)\,S^{-1}\,(x - m_b)^T$$
Step 4.4, if d < 0, x belongs to class A; if d > 0, x belongs to class B.
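Steps 4.1 to 4.4 may be illustrated with the following sketch. The formula for the total covariance matrix is not reproduced in the text, so the sketch assumes the usual pooled estimate weighted by sample counts; the function name and the synthetic classes are hypothetical.

```python
import numpy as np

def two_class_discriminant(x, A, B):
    """Return the predicted class label and the distance difference d."""
    # Step 4.1: mean vectors and covariance matrices of the two classes.
    ma, mb = A.mean(axis=0), B.mean(axis=0)
    S1 = np.cov(A, rowvar=False)
    S2 = np.cov(B, rowvar=False)
    # Step 4.2: total covariance (assumed here to be the pooled estimate).
    n1, n2 = len(A), len(B)
    S = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)
    S_inv = np.linalg.inv(S)
    # Step 4.3: difference of the squared Mahalanobis distances to A and B.
    d = (x - ma) @ S_inv @ (x - ma) - (x - mb) @ S_inv @ (x - mb)
    # Step 4.4: d < 0 means x is closer to A. The tie d == 0 is not
    # addressed in the text; this sketch assigns it to B.
    return ("A" if d < 0 else "B"), float(d)

# Synthetic stand-ins for two classified principal component data sets.
rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0, size=(50, 3))
B = rng.normal(4.0, 1.0, size=(50, 3))
label_a, d_a = two_class_discriminant(np.array([0.2, -0.1, 0.0]), A, B)
label_b, d_b = two_class_discriminant(np.array([3.9, 4.2, 4.0]), A, B)
```

A sample near the mean of A gives a negative d and is assigned to A, while a sample near the mean of B gives a positive d and is assigned to B.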
Application example:
A field test in a coal mine roadway is taken as an example. A total of 2900 sets of high-dimensional drilling parameters were measured with the drilling rig, consisting of 2900 training samples and 50 test samples.
First, the high-dimensional drilling parameters of the training samples are standardized in order to eliminate the influence of the different dimensions (units) of the parameters. The standardized variables of mechanical drilling speed (rate of penetration), rotary torque, drilling pressure, rotating speed, rotary pressure and mud pump pressure are denoted $x_1, x_2, x_3, x_4, x_5, x_6$, respectively, and the preset principal components are denoted $y_1, y_2, y_3, y_4, y_5, y_6$. The correlation coefficient matrix is then calculated; the results are shown in Table 1 below.
TABLE 1 training sample correlation coefficient Table
Figure BDA0002211153400000141
And then, calculating the contribution rate and the accumulated contribution rate of each preset principal component through the data dimension reduction system, as shown in table 2.
TABLE 2 Preset principal component results Table
Figure BDA0002211153400000142
Following the empirical rule that the cumulative contribution rate should be greater than 90%, the number of principal components is determined to be 3. The feature vectors of the principal components of the training samples are shown in Table 3 below.
TABLE 3 feature vectors of principal component of training samples
Figure BDA0002211153400000151
The first principal component mainly reflects $x_2$ (rotary torque); the second principal component mainly reflects $x_1$ (rate of penetration); and the third principal component mainly reflects $x_5$ (rotary pressure). Using Table 3, the training samples are reduced in dimension to obtain the principal component data set of the training samples. The weighted scores give a preliminary classification number of 3, and the principal component data set of the prediction samples is then obtained.
By selecting a suitable kernel function, the data clustering system clusters the principal component data set of the training samples into 3 principal component data sets of training samples with determined classifications. The cluster centers are $v_1 = (0.1262, -1.0885, -0.7335)^T$, $v_2 = (1.6849, -0.7538, -1.2834)^T$ and $v_3 = (0.4876, -0.8157, 0.7405)^T$, corresponding to the sandstone, coal and mudstone layers, respectively.
The clustering result is shown in Fig. 6. Using the Mahalanobis distance discrimination method, the Mahalanobis distances between the principal component data set of each prediction sample and the 3 principal component data sets of the training samples with determined classifications are calculated, and each prediction sample is classified to obtain the lithology category to which it belongs. The predicted categories are compared with the actual lithology categories of the prediction samples, and the prediction is optimized repeatedly until the prediction accuracy is greater than or equal to 99%. The prediction results are shown in Table 4 below.
TABLE 4 prediction Effect table
Figure BDA0002211153400000152
Figure BDA0002211153400000161
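The classification and accuracy check of the application example may be sketched as follows, for illustration only. The three cluster centers are the values reported above; the training and prediction samples are synthetic draws around those centers with an arbitrary spread of 0.1, not the field data. The sketch uses a separate covariance matrix for each class, which is a simplifying assumption that differs from the pairwise total covariance of step 4.2.

```python
import numpy as np

def nearest_class(x, training_sets):
    """Return the label with the smallest squared Mahalanobis distance to x."""
    distances = {}
    for label, data in training_sets.items():
        diff = x - data.mean(axis=0)
        cov_inv = np.linalg.inv(np.cov(data, rowvar=False))
        distances[label] = float(diff @ cov_inv @ diff)
    return min(distances, key=distances.get)

# Cluster centers reported in the application example.
centers = {
    "sandstone": [0.1262, -1.0885, -0.7335],
    "coal":      [1.6849, -0.7538, -1.2834],
    "mudstone":  [0.4876, -0.8157,  0.7405],
}

rng = np.random.default_rng(0)
# Synthetic training sets around each center (spread of 0.1 is made up).
training_sets = {k: rng.normal(c, 0.1, size=(100, 3)) for k, c in centers.items()}

# Synthetic prediction samples with known labels.
test_x, test_y = [], []
for label, c in centers.items():
    for sample in rng.normal(c, 0.1, size=(20, 3)):
        test_x.append(sample)
        test_y.append(label)

predicted = [nearest_class(s, training_sets) for s in test_x]
accuracy = float(np.mean([p == t for p, t in zip(predicted, test_y)]))
```

The resulting `accuracy` is the fraction of prediction samples whose predicted lithology matches the known label, which is the quantity compared against the 99% target in the text.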
In this embodiment, while, for purposes of simplicity of explanation, the methodologies are shown and described as a series of acts, it is to be understood and appreciated that the methodologies are not limited by the order of acts, as some acts may, in accordance with one or more embodiments, occur in different orders and/or concurrently with other acts from that shown and described herein or not shown and described herein, as may be understood by those of ordinary skill in the art.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is noted that references in the specification to "one embodiment," "an example embodiment," "some embodiments," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (7)

1. A stratum lithology recognition system based on high-dimensional drilling parameter information is characterized by comprising a drilling test bed, a data dimension reduction system, a data clustering system and a prediction recognition system which are sequentially connected;
the drilling test bed is used for constructing data holes and obtaining high-dimensional drilling parameters, and each group of high-dimensional drilling parameters respectively form a corresponding training sample and a corresponding prediction sample;
the data dimension reduction system is used for calculating high-dimensional drilling parameters in the training sample and determining the number of principal component components; obtaining the classification number of the preliminary training sample and the principal component data set of the training sample; further obtaining a principal component data set of the prediction sample;
the data clustering system is used for carrying out fuzzy kernel clustering on the principal component data sets of the training samples to obtain the optimal number K of the principal component data sets of the training samples to be clustered, and obtaining the principal component data sets of the training samples to be classified;
the prediction identification system is used for establishing a discrimination criterion for the principal component data set of the training samples determined to be classified, classifying the prediction samples and obtaining the lithology categories to which the prediction samples belong.
2. The formation lithology recognition system based on high-dimensional drilling parameter information as claimed in claim 1, wherein the drilling test bench comprises a hydraulic pump station, an operation bench, a flushing liquid circulation unit and a data acquisition unit which are respectively connected with a host machine.
3. The formation lithology recognition system based on high-dimensional drilling parameter information as claimed in claim 1, wherein the data dimension reduction system comprises a first input end, a first data processor and a first output end which are connected in sequence.
4. The system for identifying the lithology of the stratum based on the high-dimensional drilling parameter information as claimed in claim 1, wherein the data clustering system comprises a second input end, a second data processor and a second output end which are connected in sequence.
5. The formation lithology recognition system of claim 1, wherein the predictive recognition system comprises a third input terminal, a third data processor and a third output terminal connected in series.
6. A stratum lithology recognition method based on high-dimensional drilling parameter information is characterized in that the method adopts the stratum lithology recognition system based on the high-dimensional drilling parameter information; the high-dimensional drilling parameters comprise the mechanical drilling speed, the rotary torque, the drilling pressure, the rotating speed, the rotary pressure and the pressure of a slurry pump.
7. The method for identifying the lithology of the stratum based on the high-dimensional drilling parameter information as claimed in claim 6, wherein the method comprises the following steps:
step one, determining the rock stratum characteristics and the prediction range of a typical stratum according to the production information of a target stratum region, constructing a data hole by adopting a host of a drilling test bed, and obtaining high-dimensional drilling parameters by adopting a data acquisition unit of the drilling test bed, wherein each group of high-dimensional drilling parameters respectively form a corresponding training sample and a prediction sample;
step two, calculating high-dimensional drilling parameters in the training sample by using a data dimension reduction system to obtain correlation coefficients among the high-dimensional drilling parameters, then obtaining the contribution rate of each preset principal component, sorting from large to small, and obtaining the accumulated contribution rate of each preset principal component; when the accumulated contribution rate of a certain preset principal component is greater than 90%, all preset principal components before the preset principal component are principal components, and finally the number of the principal components and the corresponding feature vectors are determined; performing weighted calculation by taking the contribution rate of each principal component as weight to obtain the weighted score of each training sample, sorting the training samples from high to low according to the weighted score of each training sample, preliminarily classifying the training samples according to the weighted score condition of each training sample to obtain the classification number of the preliminary training samples, and obtaining a principal component data set of the training samples and a principal component data set of the prediction samples according to the feature vectors of the principal components;
step three, fuzzy kernel clustering is carried out on the principal component data sets of the training samples obtained in step two by adopting a data clustering system, the classification number of the preliminary training samples obtained in step two is used as the original clustering number, the fuzziness degree is set, a kernel function is constructed, a membership matrix is established, clustering of the principal component data sets of the training samples is finally completed through continuous iterative optimization of the parameters, the optimal number K of clusters of the principal component data sets of the training samples is obtained, the optimal number K is the lithologic classification number, and meanwhile, the clustering center of each lithologic classification and the corresponding data set of each lithologic classification are calculated to obtain the principal component data sets of the training samples with determined classifications;
and step four, establishing a judgment criterion for the principal component data sets of the training samples of the determined classification obtained in the step three by adopting a prediction recognition system, calculating the Mahalanobis distance between each principal component data set of the prediction samples obtained in the step two and the principal component data sets of the training samples of the determined classification by using a Mahalanobis distance judgment method, selecting the smallest Mahalanobis distance, and classifying the prediction samples to obtain the lithology categories to which the prediction samples belong.
CN201910898862.3A 2019-09-23 2019-09-23 Stratum lithology identification system and method based on high-dimensional drilling parameter information Pending CN110674868A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910898862.3A CN110674868A (en) 2019-09-23 2019-09-23 Stratum lithology identification system and method based on high-dimensional drilling parameter information

Publications (1)

Publication Number Publication Date
CN110674868A true CN110674868A (en) 2020-01-10

Family

ID=69077228

Country Status (1)

Country Link
CN (1) CN110674868A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378998A (en) * 2021-07-12 2021-09-10 西南石油大学 Stratum lithology while-drilling identification method based on machine learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106371427A (en) * 2016-10-28 2017-02-01 浙江大学 Industrial process fault classification method based on analytic hierarchy process and fuzzy fusion
CN109034179A (en) * 2018-05-30 2018-12-18 河南理工大学 A kind of rock stratum classification method based on mahalanobis distance IDTW
CN109388816A (en) * 2017-08-07 2019-02-26 中国石油化工股份有限公司 A kind of hierarchical identification method of complex lithology
CN109635461A (en) * 2018-12-18 2019-04-16 中国铁建重工集团有限公司 A kind of application carrys out the method and system of automatic identification Grades of Surrounding Rock with brill parameter

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200110