CN116933151A - Method for distinguishing deposit types based on sphalerite trace elements - Google Patents
- Publication number
- CN116933151A (application CN202310533691.0A)
- Authority
- CN
- China
- Prior art keywords
- sphalerite
- deposit
- data
- type
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/02—Agriculture; Fishing; Mining
Abstract
The invention relates to a method for distinguishing deposit types based on sphalerite trace elements and belongs to the technical field of mineral deposit prospecting. The invention is the first to combine the construction of a sphalerite trace element database, the construction of a sphalerite trace element training set and test set, and a machine learning method. It uses the trace elements of sphalerite, a mineral widely present in many kinds of deposits, as markers for judging the deposit type, improves the efficiency and accuracy of prospecting for sphalerite-related deposits, and addresses the difficulty and high cost of prospecting at depth and on the margins of deposits.
Description
Technical Field
The invention relates to a method for distinguishing deposit types based on sphalerite trace elements and belongs to the technical field of mineral deposit prospecting.
Background
Determining the genesis of a deposit remains one of the most critical yet challenging problems in ore deposit research. Correctly determining deposit genesis helps to better understand regional, large-scale ore-forming processes, and applying a deposit model early can significantly improve exploration efficiency. Different deposit types are characterized by different sources of ore-forming material, physicochemical conditions and ore-forming processes, all of which significantly affect the trace element composition of minerals. Mineral trace element chemistry is therefore widely used to determine deposit genesis, the most commonly used minerals being garnet, quartz, pyrite and magnetite. However, the same mineral from different genetic types of deposit may have similar trace element geochemistry, which makes the deposit type difficult to determine.
Sphalerite is the most important zinc-bearing ore mineral and occurs in many deposit types, including volcanogenic massive sulfide (VMS), Mississippi Valley-type (MVT), porphyry (Porphyry), epithermal (EPI), sedimentary exhalative (SEDEX) and skarn (Skarn) deposits. Sphalerite can incorporate a variety of trace elements by substitution, and their contents can help distinguish deposit types. Over the past several decades, studies have used sphalerite trace elements to classify deposit types. The traditional approach discriminates deposit types with binary diagrams of Mn-Fe, Co/Ni-Cd/Fe, Cd/Fe-Mn and Ge-In and a ternary diagram of Cd-Mn-1000Ge. However, because sphalerite from different deposit types can have similar trace element compositions, the existing discrimination diagrams cannot accurately distinguish the deposit types.
A literature search found no report to date of accurately and efficiently discriminating deposit types by constructing a sphalerite trace element database, a sphalerite trace element training set and a sphalerite trace element test set and combining them with a machine learning method.
Disclosure of Invention
The invention is the first to combine the construction of a sphalerite trace element database, the construction of a sphalerite trace element training set and test set, and a machine learning method. It uses the contents of sphalerite trace elements, which are widely present in many kinds of deposits, as markers for judging the deposit type, improves the efficiency and accuracy of prospecting for sphalerite-related deposits, and addresses the difficulty and high cost of prospecting at depth and on the margins of deposits.
The invention discloses a method for accurately distinguishing deposit types based on sphalerite trace elements, which comprises the following steps:
Step one: establishing a sphalerite trace element database
Sphalerite trace element data are collected from globally published literature;
the collected data are organized into a database of at least 3000 sets of trace element data from the six deposit types;
the database comprises deposit names, deposit locations, deposit types and sphalerite trace element contents;
in the database, the deposit name, the deposit type and the sphalerite trace element contents correspond to one another; that is, each deposit is associated with its deposit type and with the sphalerite trace element contents measured in that deposit.
The six deposit types include at least: volcanogenic massive sulfide (VMS), Mississippi Valley-type (MVT), porphyry (Porphyry), epithermal (EPI), sedimentary exhalative (SEDEX) and skarn (Skarn) deposits;
the sphalerite trace elements comprise at least 10 of Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu;
the database is what the subsequent machine learning model learns from and is key to discriminating the deposit type; building it requires a large amount of sphalerite trace element content data, each accurately assigned to its deposit type;
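The database fields listed in step one can be pictured as one record per analysis. The following is a minimal sketch only: the class name, field names and example values are hypothetical and do not come from the patent; only the six type codes and the idea of a name/location/type/contents record are taken from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Short codes for the six deposit types as given in the description.
DEPOSIT_TYPES = ("VMS", "MVT", "Porphyry", "EPI", "SEDEX", "Skarn")


@dataclass
class SphaleriteRecord:
    """One sphalerite analysis in the database (field names are hypothetical)."""
    deposit_name: str
    deposit_location: str
    deposit_type: str
    # Element symbol -> content; None marks a missing value
    # (below detection limit or not analyzed).
    trace_elements: Dict[str, Optional[float]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every record must be assigned to one of the six deposit types.
        if self.deposit_type not in DEPOSIT_TYPES:
            raise ValueError(f"unknown deposit type: {self.deposit_type}")


# Made-up values, for illustration only.
record = SphaleriteRecord(
    deposit_name="Example deposit",
    deposit_location="Example region",
    deposit_type="MVT",
    trace_elements={"Cd": 2500.0, "Fe": 12000.0, "Ge": None},
)
```

Keeping missing values as `None` rather than zero keeps them distinguishable from measured contents for the later imputation step.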
Step two: data preprocessing
Nearest-neighbor imputation and centered log-ratio (CLR) transformation are applied to the data for each sphalerite trace element in the database established in step one, so that the covariance of the data is unchanged and the data conform to a normal distribution. The trace element data comprise the contents of Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu;
during preprocessing, the data are processed separately for each deposit type: with the deposit type fixed, the data for the selected trace elements are processed so that the trace element data within that deposit type become normally distributed; the data include content data;
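The two preprocessing operations named in step two can be sketched in a few lines. This is an illustrative stand-in, not the XLSTAT or ioGAS procedures used in the patent: the function names, the choice of `k`, and the toy values are assumptions. Missing values are filled with the mean of the k nearest rows (Euclidean distance over the columns both rows share), and each row is then mapped to its centered log-ratio.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries with the mean of that column over the k nearest rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                # Distance uses only the columns observed in both rows.
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def clr(row):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in row]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


# Made-up contents for three elements; None marks a missing value.
data = [
    [10.0, 200.0, 5.0],
    [12.0, 210.0, None],
    [11.0, 190.0, 7.0],
    [300.0, 20.0, 90.0],
]
imputed = knn_impute(data, k=2)
transformed = [clr(r) for r in imputed]
```

By construction the CLR values of each row sum to zero, which is a quick sanity check after the transform. CLR requires strictly positive inputs, which is one reason imputation comes first.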
Step three: building the training set and test set
To train and test the random forest (Random Forest) and gradient boosting decision tree (Gradient Boosting) classifiers, a training set and a test set are established. At least 200, preferably 200-400, sets of data are extracted from each deposit type in the sphalerite trace element database to train the random forest and gradient boosting classifiers; to avoid bias toward classes with more data, the same amount of data is randomly extracted from each deposit type using a random function; the remaining data are used for testing, yielding classification matrices for training and testing. The classification matrices are used to evaluate the accuracy of model training: each column represents a predicted deposit type, and the column total is the number of data predicted as that deposit type; each row represents the true deposit type of the data, and the row total is the number of data instances of that deposit type.
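The balanced draw described in step three (the same number of training samples from every deposit type, the remainder kept for testing) might look like the sketch below. The function name, the fixed seed and the toy samples are assumptions made for illustration.

```python
import random
from collections import defaultdict


def balanced_split(samples, n_train, seed=0):
    """samples: list of (features, deposit_type) pairs.

    Draws n_train samples at random from every deposit type for training;
    everything left over goes to the test set.
    """
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for sample in samples:
        by_type[sample[1]].append(sample)

    train, test = [], []
    for deposit_type, group in by_type.items():
        if len(group) < n_train:
            raise ValueError(f"not enough data for {deposit_type}")
        shuffled = group[:]
        rng.shuffle(shuffled)
        train.extend(shuffled[:n_train])
        test.extend(shuffled[n_train:])
    return train, test


# Toy data: 10 MVT samples and 6 skarn samples (made-up feature values).
samples = [([float(i)], "MVT") for i in range(10)]
samples += [([float(i)], "Skarn") for i in range(6)]
train, test = balanced_split(samples, n_train=4)
```

Because the test set is simply what remains, its class sizes stay unequal; only the training set is balanced.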
Step four: establishing a machine learning model
A machine learning model is established using the random forest and gradient boosting algorithms. Random forest and gradient boosting use bootstrap sampling to randomly draw training samples from the sample set, generating decision trees and training subsets. When a decision tree is constructed, each node is split optimally; the quality of the node splits is therefore very important for building the decision trees. Model hyperparameters are tuned using cross-validation. Splitting stops when the number of decision trees is at least 4500, preferably 5000, the depth is at least 3, and the generated child nodes appear N times, where N is less than or equal to 6; these are the optimal parameters of the model;
the model randomly samples the original data set to form n different sample data sets, builds n different decision tree models from these data sets, and obtains the final result from the votes of the decision tree models.
In the invention, the hyperparameters of the algorithms need to be adjusted for different data types to achieve the best results; the key parameters of the machine learning model are: n_estimators=5000, max_depth=3, min_samples_split=6;
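The bootstrap-and-vote mechanism described in step four can be illustrated with a deliberately small stand-in: bagged depth-1 trees (decision stumps) combined by majority vote. This is not the patent's model. A real random forest also subsamples features at each split, gradient boosting fits trees sequentially rather than by voting, and the patent's settings (5000 trees, depth 3) are far larger. All names and data here are hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Depth-1 tree: choose the (feature, threshold) split with fewest errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = (Counter(right).most_common(1)[0][0]
                           if right else left_label)
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def fit_bagged(X, y, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees


def predict(trees, row):
    """Final result is the majority vote over all trees."""
    votes = Counter(tree(row) for tree in trees)
    return votes.most_common(1)[0][0]


# Made-up, already-transformed values for two features.
X = [[1.0, 5.0], [2.0, 6.0], [3.0, 5.5], [10.0, 1.0], [11.0, 0.5], [12.0, 1.5]]
y = ["MVT", "MVT", "MVT", "Skarn", "Skarn", "Skarn"]
trees = fit_bagged(X, y)
```

The hyperparameter names quoted in the text (`n_estimators`, `max_depth`, `min_samples_split`) match those of scikit-learn's tree ensemble classifiers, so in practice a library implementation would replace this sketch.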
Step five: evaluating the reliability of the model
The receiver operating characteristic (ROC) curves of the random forest and gradient boosting models are obtained using Orange software. A ROC curve describes the true positive rate and the false positive rate and is obtained by plotting the true positive rate on the y-axis against the false positive rate on the x-axis; the area under the ROC curve (AUC) is commonly used as a measure of classifier performance. AUC values range from 0 to 1; a model with an AUC greater than 0.5 is considered reliable, and the closer the AUC is to 1, the more reliable the model.
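The AUC mentioned in step five can be computed without drawing the curve, because it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below uses that identity for a single deposit type treated as "positive" against all others (a one-vs-rest reading, which is an assumption); the scores and labels are made up.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for the deposit type of interest, 0 for all other types.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up classifier scores for six samples.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

Here eight of the nine positive/negative pairs are ranked correctly, so the AUC is 8/9; a classifier that ranks at random would sit near 0.5, which is why 0.5 serves as the reliability threshold.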
Step six: distinguishing the deposit type
The content of each trace element in the sphalerite whose deposit type is to be determined is obtained, and the deposit type is predicted using the reliable machine learning model obtained in step five; that is, the model established with the random forest and gradient boosting algorithms is applied to obtain a classification matrix for the trace elements of the sphalerite in question, and the deposit type is judged from the classification matrix.
In step two, missing values arise where a content is below the detection limit of the analytical instrument or where some studies did not analyze particular elements. Missing values may distort estimates of the mean and variance in the analysis, so the invention imputes them using the k-nearest neighbor method and applies a centered log-ratio transformation so that the data conform to a normal distribution.
The database also comprises mineral paragenesis parameters, which parameterize geological information on mineral paragenesis: each paragenetic mineral present in a deposit is marked as 1 and each absent one as 0; the paragenetic minerals comprise at least one of chalcopyrite, pyrite, galena, arsenopyrite and magnetite.
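The 0/1 marking of co-occurring minerals described above amounts to a fixed-order binary encoding. A minimal sketch, assuming the five minerals named in the text are the full set of flags and that the column order shown is acceptable (both are assumptions):

```python
# The five minerals named in the text, in an arbitrary but fixed order.
PARAGENETIC_MINERALS = ("chalcopyrite", "pyrite", "galena",
                        "arsenopyrite", "magnetite")


def encode_paragenesis(observed):
    """Return 1 for each listed mineral present in the deposit, else 0."""
    observed_set = {m.lower() for m in observed}
    return [1 if m in observed_set else 0 for m in PARAGENETIC_MINERALS]


# A hypothetical deposit in which pyrite and galena accompany sphalerite.
flags = encode_paragenesis(["Pyrite", "galena"])
```

The resulting flags can be appended to the trace element features of every analysis from that deposit, which is one way to add geological constraints to the classifier input.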
Preferably, in the method for accurately distinguishing deposit types based on sphalerite trace elements, during data preprocessing,
those of the elements Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu for which fewer than 40% of the sphalerite trace element content values are missing are selected;
a missing value is a content below the detection limit of the analytical instrument, or an element that some studies did not analyze;
missing values of the sphalerite trace elements (Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu) in the database are imputed in the data processing software XLSTAT using a nearest-neighbor method based on Euclidean distance, without changing the covariance of the data set;
so that the sphalerite trace elements conform to a normal distribution, a centered log-ratio transformation is applied to them in ioGAS.
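The 40% missing-value rule for choosing elements can be expressed as a simple column filter. The sketch below is illustrative only: the function name, the table layout and the values are assumptions, not taken from the patent.

```python
def select_elements(table, max_missing=0.40):
    """Keep elements whose fraction of missing values is below max_missing.

    table: dict mapping element symbol -> list of contents (None = missing).
    """
    kept = {}
    for element, values in table.items():
        missing_fraction = sum(v is None for v in values) / len(values)
        if missing_fraction < max_missing:
            kept[element] = values
    return kept


# Made-up contents for one deposit type: Cd has 20% missing, Ge has 60%.
table = {
    "Cd": [2500.0, 3100.0, 2800.0, None, 2950.0],
    "Ge": [None, None, 0.4, None, 0.2],
}
kept = select_elements(table)
```

Filtering before imputation limits how much of any retained column consists of estimated rather than measured values.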
Preferably, in the method for accurately distinguishing deposit types based on sphalerite trace elements, during model training,
a machine learning model is established using the random forest and gradient boosting algorithms and trained with the sphalerite trace element training set;
random forest and gradient boosting use bootstrap sampling to randomly draw training samples from the sample set, generating decision trees and training subsets. When a decision tree is constructed, each node is split optimally; the quality of the node splits is therefore very important for building the decision trees. Splitting stops when the number of decision trees is 4500 or more, preferably 5000 or more, the depth is 3 or more, and the generated child nodes appear N times; these are the optimal parameters of the model. N is less than or equal to 6. In developing the technique, the inventors found that the six deposit types are very difficult to discriminate from sphalerite trace element contents alone, because sphalerite from the six deposit types has similar trace element contents; higher accuracy can be obtained only through repeated parameter tuning and by adding geological conditions as constraints.
In developing the technique, the inventors also tried classifying deposit types with several machine learning methods, including random forests (Random Forests), gradient boosting decision trees (Gradient Boosting), artificial neural networks (Artificial Neural Networks), the lasso algorithm (Least Absolute Shrinkage and Selection Operator), support vector machines (Support Vector Machines) and k-nearest neighbors (k-Nearest Neighbors); however, the machine learning models built with the random forest and gradient boosting decision tree methods were found to be more reliable and accurate.
In particular, VMS deposits have trace element contents similar to, and highly overlapping with, those of other deposit types, which makes VMS deposits difficult to distinguish from the other types; parameterizing the geological information was found to improve the results markedly.
Drawings
Fig. 1 is a schematic flow chart of the method for accurately distinguishing deposit types based on sphalerite trace elements.
Detailed Description
Example 1
S1, data collection:
From recently published literature, 4095 sets of sphalerite trace element data were collected for 86 deposits worldwide, including 11 epithermal deposits, 27 Mississippi Valley-type deposits, 4 porphyry deposits, 5 sedimentary exhalative deposits, 26 skarn deposits and 12 volcanogenic massive sulfide deposits. Element statistics for each deposit type are shown in Table 1. The database draws on a wide range of sources and covers sphalerite geochemical data from deposits around the world, so the applicability of the model is not limited to any one region.
The database comprises deposit names, deposit locations, deposit types, mineral paragenesis parameters and sphalerite trace element contents;
in the database, the deposit name, the deposit type and the sphalerite trace element contents correspond to one another; that is, each deposit is associated with its deposit type and with the sphalerite trace element contents measured in that deposit.
The mineral paragenesis parameters parameterize geological information on mineral paragenesis: each paragenetic mineral present in a deposit is marked as 1 and each absent one as 0; the paragenetic minerals comprise at least one of chalcopyrite, pyrite, galena, arsenopyrite and magnetite;
the sphalerite trace elements comprise at least Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu;
the database is what the subsequent machine learning model learns from and is key to discriminating the deposit type; building it requires a large amount of sphalerite trace element content data, each accurately assigned to its deposit type;
in the database, the content of each trace element in each deposit type is shown in Table 1;
Table 1. Trace element contents of the six deposit types in the database
N = number; MIN = minimum; MAX = maximum; MEAN = average;
S2, data preprocessing:
Nearest-neighbor imputation and centered log-ratio (CLR) transformation are applied to the data for each sphalerite trace element in the database established in step S1, so that the covariance of the data is unchanged and the data conform to a normal distribution. The trace element data comprise the contents of Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu;
during preprocessing, the data are processed separately for each deposit type: with the deposit type fixed, the data for the selected trace elements are processed so that the trace element data within that deposit type become normally distributed; the data include content data;
the specific operations are as follows:
for each deposit type, once the deposit type is fixed, the following operations are performed:
1. those of the elements Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu for which fewer than 40% of the sphalerite trace element values are missing are selected;
2. missing values of the sphalerite trace elements (Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu) in the database are imputed in the data processing software XLSTAT using a nearest-neighbor method based on Euclidean distance, without changing the covariance of the data set;
3. so that the content of each sphalerite trace element (Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu) conforms to a normal distribution, a centered log-ratio transformation is applied to the sphalerite trace elements in ioGAS;
S3, establishing the sphalerite trace element training set and test set
1. Using a random function, 300 sets of sphalerite trace element data are randomly selected from each deposit type in the preprocessed database to establish the training set;
2. the remaining sphalerite trace element data of each deposit type are used to establish the test set;
S4, model training
1. A machine learning model is established using the random forest and gradient boosting algorithms and trained with the training set established in S3;
2. random forest and gradient boosting use bootstrap sampling to randomly draw training samples from the sample set, generating decision trees and training subsets. When a decision tree is constructed, each node is split optimally; the quality of the node splits is therefore very important for building the decision trees. Splitting stops when the number of decision trees is 5000, the depth is 3, and the generated child nodes appear 6 times; these are the optimal parameters of the model (n_estimators=5000, max_depth=3, min_samples_split=6);
3. the classification matrices for deposit type identification by random forest and gradient boosting on the sphalerite trace element test set are obtained (Table 2); the overall classification accuracies are 93.02% and 92.82%, respectively.
Table 2. Classification matrix for the sphalerite trace element test set
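An overall classification accuracy of the kind quoted in S4 is the sum of the diagonal of the classification matrix divided by the total number of samples. The sketch below shows that arithmetic on a made-up 3 x 3 matrix (rows are true types, columns are predicted types); the counts are hypothetical and are not the values of Table 2.

```python
def overall_accuracy(matrix):
    """Correctly classified samples (the diagonal) divided by all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_type_recall(matrix):
    """For each true type (row), the fraction predicted as that same type."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Hypothetical counts for three deposit types, 50 test samples each.
matrix = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 2, 47],
]
accuracy = overall_accuracy(matrix)
recall = per_type_recall(matrix)
```

Per-type recall is worth reading alongside the overall figure, since a high overall accuracy can hide one deposit type that is frequently confused with another.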
S5, evaluating the reliability of the model
1. The reliability of the machine learning models is evaluated using the test set.
The performance of both models was evaluated with AUC values obtained from receiver operating characteristic (ROC) curves. The ROC curves of the random forest and gradient boosting models were obtained using Orange software, giving AUC values of 0.989 and 0.991 for deposit type identification by random forest and gradient boosting, respectively, which shows that both machine learning models are highly reliable.
Example 2
Steps S1 to S4 are identical to those of Example 1; S5 and S6 are set out here.
S5, acquiring a sphalerite trace element data set for the Qingshuitang lead-zinc deposit
1. Sphalerite samples from the Qingshuitang lead-zinc deposit are obtained by field sampling and prepared as sections for laser ablation;
2. the sphalerite trace element contents of the Qingshuitang lead-zinc deposit are measured with a trace element analyzer, a laser ablation inductively coupled plasma mass spectrometer (LA-ICP-MS).
S6, deposit type prediction
The machine learning models established with the random forest and gradient boosting algorithms are used to predict the type of the Qingshuitang lead-zinc deposit: the sphalerite trace element data set obtained for the deposit is input to the models, giving the classification matrix of the Qingshuitang sphalerite trace elements (Table 3).
Table 3. Classification matrix for the Qingshuitang sphalerite trace elements
According to Table 3, the deposit is judged to be a Mississippi Valley-type (MVT) deposit.
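One plausible way to read a deposit type off such a classification matrix is to take the type predicted for the largest share of the analyzed spots. The patent does not spell out the decision rule, so the majority rule below is an assumption, and the predictions are made up rather than taken from Table 3.

```python
from collections import Counter


def judge_deposit_type(predictions):
    """Return the most frequent predicted type, its share, and all counts."""
    counts = Counter(predictions)
    deposit_type, n = counts.most_common(1)[0]
    return deposit_type, n / len(predictions), dict(counts)


# Hypothetical per-analysis predictions for one deposit (20 spots).
predictions = ["MVT"] * 17 + ["SEDEX"] * 2 + ["Skarn"]
deposit_type, share, counts = judge_deposit_type(predictions)
```

Reporting the share alongside the winning type makes it visible when a judgment rests on a narrow majority rather than a clear consensus.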
Claims (6)
1. A method for accurately distinguishing the type of a deposit based on sphalerite trace elements is characterized in that; the method comprises the following steps:
step one, establishing a zinc blende trace element database
Collecting sphalerite trace element data from globally published literature;
sorting the collected data to create at least 3000 sets of trace element databases from the six deposit types;
the database comprises deposit names, deposit positions, deposit types and sphalerite trace element contents;
in the database, the names of ore deposits, the types of the ore deposits and the content of microelements of sphalerite are in corresponding relation, namely, one ore deposit; the mineral deposit types and the contents of sphalerite microelements in the mineral deposit are in corresponding relation;
the six deposit types include at least: volcanic block sulfides, michibijou type, porphyry type, shallow hydrothermal type, jet deposition type, and skarn deposit;
the sphalerite trace elements at least comprise at least 10 of Ag, as, cd, co, ga, ge, sb, pb, fe, mn, in, sn and Cu;
step two data preprocessing
Performing nearest neighbor interpolation and center logarithmic ratio conversion on the data of various sphalerite microelements in the sphalerite microelements database established in the first step, so that the covariance of the data is unchanged and accords with normal distribution; the data of various sphalerite trace elements include Ag, as, cd, co, ga, ge, sb, pb, fe, mn, in, sn and Cu content;
when the data is preprocessed, under the condition that the deposit type is determined, processing the data of the trace elements of the determined type so that the data of the trace elements in the deposit type become normalized distribution; the data includes content data;
step three, building a training set test set
For training and testing random forest and gradient lifting decision tree classifiers, a test set and a training set are established, and at least 200, preferably 200-400, random forest and gradient lifting classifiers are extracted from each deposit type for which a sphalerite trace element database is established; to avoid favoring more data classes, the same amount of data is randomly extracted from each deposit type by a random function; the rest data are used for testing to obtain a training and tested classification matrix;
step four, establishing a machine learning model
Establishing a machine learning model by utilizing a random forest and gradient lifting algorithm; randomly extracting training samples from a sample set by adopting a bootstrap sampling method to generate a decision tree and a training subset; when constructing a decision tree, optimally dividing each node in the decision tree; performing model super-parameter tuning by using cross verification, stopping splitting when the decision tree is 4500 or more, preferably 5000 or more and the depth is 3 or more and the generated child nodes appear N times, wherein the parameter is the optimal parameter of the model; the N is less than or equal to 6;
randomly sampling the model in an original data set to form n different sample data sets, constructing n different decision tree models according to the data sets, and finally obtaining a final result according to voting conditions of the decision tree models;
step five, evaluating the reliability of the model:
acquiring receiver operation characteristic curves of random forest and gradient lifting algorithm models, namely ROC curves, by describing true positive rate and false positive rate, obtaining the ROC curves by drawing the true positive rate on a y axis and drawing the false positive rate on an x axis, and taking the area under the curve AUC as a measurement standard of classifier performance; the AUC value range is 0-1, and the AUC value of the reliable model is more than 0.5, namely, the model with the AUC value more than 0.5 is considered to be a reliable model, and the closer to the 1 model, the more reliable;
step six, distinguishing the deposit type:
the content of each trace element in the sphalerite whose deposit type is to be determined is obtained, and the deposit type is predicted using the reliable machine learning model obtained in step five; that is, the machine learning model established with the random forest and gradient boosting algorithms yields a classification matrix for the trace elements of the sphalerite to be classified, and the deposit type is determined from this classification matrix.
2. The method for accurately distinguishing the deposit type based on sphalerite trace elements according to claim 1, characterized in that: in step two, missing values are interpolated using the k-nearest-neighbor method, and a centered log-ratio transformation is applied so that the data conform to a normal distribution.
3. The method for accurately distinguishing the deposit type based on sphalerite trace elements according to claim 1, characterized in that, during data preprocessing:
the elements Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu, for which fewer than 40% of the sphalerite trace element content values are missing, are selected;
a missing value means that the content is below the detection limit of the testing instrument, or that some studies did not analyze the element in question;
the missing values of the sphalerite trace elements Ag, As, Cd, Co, Ga, Ge, Sb, Pb, Fe, Mn, In, Sn and Cu in the database are interpolated in the data processing software XLSTAT using a nearest neighbor method based on Euclidean distance, which does not change the covariance of the data set;
to make the sphalerite trace element data conform to a normal distribution, a centered log-ratio transformation is applied to them in ioGAS.
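The two preprocessing operations named in claim 3 can be sketched in plain Python. The claim performs them in XLSTAT and ioGAS; the code below is only a stand-in that shows the underlying arithmetic, and the function names, the choice of `k`, and the toy concentrations are assumptions. The centered log-ratio of a composition x is log(x_i) minus the mean of log(x) over all parts, so it requires strictly positive values.

```python
import math

def knn_impute(rows, k=3):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is Euclidean over the columns observed in both rows.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared))

    filled = [row[:] for row in rows]
    for i, row in enumerate(rows):
        for col, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where this column was measured.
            donors = [r for j, r in enumerate(rows)
                      if j != i and r[col] is not None]
            donors.sort(key=lambda r: distance(row, r))
            nearest = donors[:k]
            if nearest:  # leave the gap if no row has this column measured
                filled[i][col] = sum(r[col] for r in nearest) / len(nearest)
    return filled

def clr(row):
    """Centered log-ratio: log of each part minus the mean log of all parts."""
    logs = [math.log(v) for v in row]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Toy concentrations for three elements; None marks a missing value.
rows = [
    [10.0, 200.0, 5.0],
    [12.0, 210.0, None],
    [11.0, 190.0, 6.0],
    [300.0, 20.0, 90.0],
]
filled = knn_impute(rows, k=2)
transformed = [clr(row) for row in filled]
print(filled[1][2])
```

Unscaled Euclidean distance is dominated by the elements with the largest concentrations, so a practical workflow may standardize or log-transform the columns before searching for neighbors.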
4. The method for accurately distinguishing the deposit type based on sphalerite trace elements according to claim 1, characterized in that: in step three, the training and testing classification matrices are used to evaluate the accuracy of model training; each column represents a predicted deposit type, and the column total is the number of data predicted as that deposit type; each row represents the true deposit type of the data, and the row total is the number of data instances of that deposit type.
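The classification matrix of claim 4 (rows for true deposit types, columns for predicted ones) can be built as in the sketch below. The function name and the toy labels are hypothetical and serve only to show how the row and column totals arise.

```python
def classification_matrix(true_types, predicted_types, deposit_types):
    """Rows are true deposit types, columns are predicted deposit types."""
    index = {name: i for i, name in enumerate(deposit_types)}
    matrix = [[0] * len(deposit_types) for _ in deposit_types]
    for truth, predicted in zip(true_types, predicted_types):
        matrix[index[truth]][index[predicted]] += 1
    return matrix

deposit_types = ["MVT", "skarn", "VMS"]  # illustrative labels
true_types = ["MVT", "MVT", "skarn", "skarn", "VMS", "VMS", "VMS"]
predicted = ["MVT", "skarn", "skarn", "skarn", "VMS", "MVT", "VMS"]

matrix = classification_matrix(true_types, predicted, deposit_types)
row_totals = [sum(row) for row in matrix]        # instances per true type
col_totals = [sum(col) for col in zip(*matrix)]  # instances per predicted type
accuracy = sum(matrix[i][i] for i in range(len(matrix))) / len(true_types)
for name, row in zip(deposit_types, matrix):
    print(name, row)
```

The diagonal holds the correctly classified instances, so overall accuracy is the diagonal sum divided by the number of instances.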
5. The method for accurately distinguishing the deposit type based on sphalerite trace elements according to claim 1, characterized in that: during model training, the existing sphalerite trace element training set is used.
6. The method for accurately distinguishing the deposit type based on sphalerite trace elements according to claim 1, characterized in that the key parameters of the machine learning model are: n_estimators=5000, max_depth=3, min_samples_split=6.
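The parameter names in claim 6 match those of scikit-learn's ensemble classifiers. Assuming that library (the claims do not name it), the configuration could be set up as in the sketch below. The synthetic data, the class labels, and the reduced tree count used to keep the demonstration fast are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hyperparameters as stated in claim 6.
PARAMS = {"n_estimators": 5000, "max_depth": 3, "min_samples_split": 6}

# Synthetic stand-in for transformed trace element data:
# 3 hypothetical deposit types, 13 features (one per element in claim 3).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(60, 13))
               for c in (-2.0, 0.0, 2.0)])
y = np.repeat(["type_A", "type_B", "type_C"], 60)

# Fewer trees so the demo runs quickly; use PARAMS unchanged for a real run.
demo_params = {**PARAMS, "n_estimators": 100}
rf = RandomForestClassifier(random_state=0, **demo_params).fit(X, y)
gb = GradientBoostingClassifier(random_state=0, **demo_params).fit(X, y)

# 5-fold cross-validated accuracy, the kind of score used for tuning.
scores = cross_val_score(
    RandomForestClassifier(random_state=0, **demo_params), X, y, cv=5)

sample = X[:1]
print(rf.predict(sample), gb.predict(sample))
print(scores.mean())
```

With `max_depth=3` each tree stays shallow, so the large number of trees, rather than the depth of any single tree, carries the model's capacity.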
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310533691.0A CN116933151A (en) | 2023-05-12 | 2023-05-12 | Method for distinguishing deposit types based on sphalerite trace elements |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116933151A (en) | 2023-10-24 |
Family
ID=88388491
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310533691.0A Pending CN116933151A (en) | 2023-05-12 | 2023-05-12 | Method for distinguishing deposit types based on sphalerite trace elements |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116933151A (en) |
2023-05-12: CN application CN202310533691.0A (CN116933151A), status: Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||