CN111624681A - Hurricane intensity change prediction method based on data mining - Google Patents

Hurricane intensity change prediction method based on data mining

Info

Publication number
CN111624681A
CN111624681A CN202010454683.3A CN202010454683A CN111624681A CN 111624681 A CN111624681 A CN 111624681A CN 202010454683 A CN202010454683 A CN 202010454683A CN 111624681 A CN111624681 A CN 111624681A
Authority
CN
China
Prior art keywords
data
hurricane
hours
classification
experiment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010454683.3A
Other languages
Chinese (zh)
Inventor
杨祺铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN202010454683.3A priority Critical patent/CN111624681A/en
Publication of CN111624681A publication Critical patent/CN111624681A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01WMETEOROLOGY
    • G01W1/00Meteorology
    • G01W1/10Devices for predicting weather conditions
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather

Landscapes

  • Environmental & Geological Engineering (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Atmospheric Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Ecology (AREA)
  • Environmental Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a hurricane intensity change prediction method based on a data mining model, comprising the following steps. Step one: acquire hurricane meteorological data and preprocess it. Step two: find suitable classification algorithms and explore the feasibility of RI-type hurricane classification. Step three: put the data test set into the hurricane intensity prediction model for ensemble training. Step four: select the optimal hurricane intensity prediction model from the ensemble learning results according to an evaluation index system. Step five: carry out hurricane wind-force level prediction experiments for 6, 12 and 18 hours into the future. Using data mining and ensemble learning on the Weka platform, the invention builds a hurricane intensity model that performs well, has low complexity and predicts with reasonable accuracy. The model does not depend on meteorological or dynamical knowledge and is not concerned with the physical model of a hurricane or the causes of its formation. It leaves time for timely early warning and for planning disaster relief, so that people learn of an approaching hurricane in advance, can prepare for it, and economic losses are reduced.

Description

Hurricane intensity change prediction method based on data mining
Technical Field
The invention relates to the technical field of prediction of hurricane intensity change, in particular to a hurricane intensity change prediction method based on data mining.
Background
Tropical cyclones are cyclonic circulations that form over tropical and subtropical seas; those whose central wind speed reaches 33 meters per second or more are called typhoons or hurricanes. Although hurricanes can increase rainfall in arid areas and help balance heat, they are a significant hazard: they destroy houses and trees, cause flooding, threaten people's lives and property, and place an economic burden on the country.
However, current research does not fully explain how hurricanes form, and the increase in hurricane intensity is affected by many factors, some of which are unknown. Existing hurricane forecasting models fall into three main categories: statistical, statistical-dynamical and numerical. When these methods are used to calculate the Maximum Possible Intensity (MPI) of different hurricanes, a suitable scheme has to be chosen based on expert experience, and different calculation schemes give different intensity predictions for the same hurricane event. There is therefore a need for a general hurricane prediction model, developed by scientific means, that does not depend on meteorological or dynamical knowledge and is not concerned with the physical structure of a hurricane or the causes of its formation, so as to reduce the losses that hurricanes cause.
As information technology keeps advancing, every industry generates large amounts of data of different types and structures, and much unknown but useful information is hidden in that data. Data mining is the process of using algorithms to find such hidden information in large volumes of noisy, heterogeneous data. Data mining techniques can be used to explore the ambient airflow data at the time a hurricane forms, some of the hurricane's internal structure, and ocean data for the underlying surface, in order to derive the rules governing hurricane change and finally obtain a model that predicts changes in hurricane intensity with reasonable accuracy. Such a model provides reference information for the government and leaves time for timely early warning and for preparing a disaster relief plan, so that people learn of an approaching hurricane in advance, can take protective measures, and economic losses are reduced.
Disclosure of Invention
The technical problem to be solved by the invention is that existing typhoon prediction methods give unstable and incomplete results and their accuracy needs to be improved. To address this problem, the invention provides a hurricane intensity change prediction method based on a data mining model.
The technical solution provided by the invention is a hurricane intensity change prediction method based on a data mining model, comprising the following steps:
Step 1: acquire and preprocess hurricane meteorological data
The data preprocessing comprises five parts: hurricane intensity classification, data cleaning, data format conversion, data segmentation, and conversion of the classification data into prediction classification data;
Step 2: find suitable classification algorithms and explore the feasibility of RI-type hurricane classification
Using an RI strategy, the 'whether RI type' attribute in the data set is set as the classification attribute when predicting hurricane intensity change, and a classification algorithm is then used for training;
Step 3: put the data test set into the hurricane intensity prediction model for ensemble training
Algorithms are selected as Bagging base classifiers to learn the data set, so as to obtain the optimal hurricane intensity change prediction model;
Step 4: select the optimal hurricane intensity prediction model from the ensemble learning results according to an evaluation index system
The 10 prediction functions obtained by training are sorted by classification accuracy, the 5 with the highest accuracy are added to a decision group, and the classification result is chosen by voting, taking the various indexes into account;
Step 5: carry out hurricane wind-force level prediction experiments for 6, 12 and 18 hours into the future
The experiment of step 4 is carried out with the best combination of algorithms selected in step 3, exploring the ability to predict the hurricane wind-force level 6, 12 and 18 hours into the future.
As an improvement, step 1 is implemented through the following sub-steps:
Step 1.1: divide hurricane intensity into 12 levels according to the central wind speed, and create a new data item VCLASS in the data table;
Step 1.2: delete the attribute columns whose missing-data rate is higher than 1%, and fill in the missing values of the attribute columns whose rate is lower than 1% using a Weka function;
Step 1.3: discretize the hurricane level data on the Weka platform using weka.filters.unsupervised.attribute.NumericToNominal;
Step 1.4: take the data obtained in step 1.3 as the initial data for the data mining experiments, and separate a training set and a test set from it;
Step 1.5: convert the classification data into prediction data. The hurricane data is processed by hurricane name. For the x records of each hurricane, the prediction targets are the data 6, 12 and 18 hours later; that is, for the i-th record, the predicted levels at 6, 12 and 18 hours are the VCLASS values of records i+1, i+2 and i+3. The last 3 of each hurricane's x records are then deleted. This finally yields 3 groups of data sets and training sets, for the predicted levels after 6, 12 and 18 hours respectively.
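The cleaning rule of step 1.2 can be sketched in plain Python. This is only an illustration and not the Weka procedure itself: the column names, the use of `None` to mark a missing value, and the choice of mean imputation are all assumptions, since the text only says that a Weka function fills in the missing values.

```python
from statistics import mean

def clean_columns(rows, max_missing_rate=0.01):
    """Drop columns whose missing rate exceeds the threshold; impute the rest.

    rows: list of dicts mapping column name -> numeric value or None (missing).
    Mean imputation is an assumption; the source does not name the method.
    """
    columns = list(rows[0].keys())
    n = len(rows)
    kept, fill = [], {}
    for col in columns:
        present = [r[col] for r in rows if r[col] is not None]
        missing_rate = 1 - len(present) / n
        if missing_rate > max_missing_rate:
            continue  # step 1.2: delete columns with too many missing values
        kept.append(col)
        fill[col] = mean(present)  # value used to complete the sparse gaps
    return [
        {c: (r[c] if r[c] is not None else fill[c]) for c in kept}
        for r in rows
    ]

# Hypothetical toy table: 200 records, three attributes.
rows = [{"vmax": 30.0 + i % 5, "sst": 28.0, "shear": 5.0} for i in range(200)]
rows[0]["sst"] = None        # 0.5% missing: column kept, gap imputed
for i in range(10):
    rows[i]["shear"] = None  # 5% missing: column dropped
cleaned = clean_columns(rows)
```

In this toy table the `shear` column is removed and the single missing `sst` value is replaced by the column mean.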
As an improvement, the specific implementation process of the step 2 comprises the following sub-steps:
step 2.1: 5 algorithms, REPTree, LMT (Logistic model tree), J48(C4.5), IBk (kNN) and MultilayerPerceptron (BP neural network), were selected for classification, finding suitable classification algorithms and exploring the possibility of RI type hurricane classification.
As an improvement, step 3 is implemented through the following sub-steps:
Step 3.1: the experiments are divided into experiment 1 and experiment 2. Ten-fold cross-validation is used as the data mining test method for all experiment groups: the system randomly divides the input data set into 10 parts and, in turn, uses 9 parts as training data and 1 part as test data. For the Bagging framework, 10 base classifiers are set to take part in training for each experiment group, and all experiment groups use the same Bagging settings. The experiments are implemented through secondary development of the Weka Bagging function;
Step 3.1.1: the five classification algorithms mostly use Weka's default parameter settings. The modifications are as follows: for the IBk algorithm, the k value is set to 5 and crossValidate is set to True, which lets the program use cross-validation at run time to select the best k between 1 and k for classifying unknown points; distance weighting is set to 1/distance; and the GUI option of MultilayerPerceptron is set to True;
Step 3.2: experiment 1 compares the performance of the 5 algorithms REPTree, LMT (logistic model tree), J48 (C4.5), IBk (kNN) and MultilayerPerceptron (BP neural network) as Bagging base classifiers, and identifies the algorithms whose accuracy exceeds 85% for use in experiment 2;
Step 3.2.1: the five algorithms REPTree, LMT, J48, IBk and MultilayerPerceptron are each used as the Bagging base classifier algorithm for ensemble learning on the data set;
Step 3.2.2: whether an algorithm can be adopted is judged from the ten-fold cross-validation indexes, and parameters are tuned to try to improve classification accuracy; once tuning brings an algorithm to suitable accuracy, it is selected as a base classifier for ensemble training;
Step 3.2.3: a comprehensive evaluation system is established, with classification accuracy as the main index and F-Measure, mean absolute error, root mean square error and AUC value as auxiliary reference indexes;
Step 3.3.1: experiment 2 takes the suitable algorithms obtained in experiment 1 and uses combinations of two, three and four of them as the Bagging base classifiers for ensemble training;
Step 3.3.2: the data mining tests are evaluated and selected using ten-fold cross-validation.
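The ten-fold cross-validation used throughout step 3 can be illustrated with a short Python sketch. It is a plain random split as described in step 3.1; the function name `k_fold_indices` and the fixed seed are illustrative assumptions, and Weka's own implementation may differ in detail.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)       # random division of the data set
    folds = [idx[i::k] for i in range(k)]  # k nearly equal parts
    for i in range(k):
        test = folds[i]                    # 1 part as test data
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test                  # the other 9 parts as training data

splits = list(k_fold_indices(95))
```

Each record appears in the test part of exactly one of the ten rounds, so every record is used for testing once and for training nine times.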
As an improvement, step 4 is implemented through the following sub-step:
Step 4.1: select the optimal hurricane intensity change prediction model according to the evaluation index system, for example by comparing the ensemble hurricane intensity prediction models on classification accuracy and the confusion matrix while also considering indexes such as F-Measure, mean absolute error, root mean square error and AUC value.
As an improvement, in step 5 the optimal hurricane intensity change model is found to be the LMT-MultilayerPerceptron model, and the process ends.
Compared with the prior art, the invention has the following advantages: it uses data mining to analyze a large amount of western Pacific hurricane data, first finds suitable classifiers for ensemble training through the RI-type classification experiment and the hurricane wind-force level classification, and then combines several classical single-classifier algorithms through Bagging ensemble learning to obtain a good hurricane prediction model.
Drawings
FIG. 1 is a general flow chart of the process of the present invention.
FIG. 2 is a comparison of the classification problem approach.
FIG. 3 is a sample preference versus prediction graph.
FIG. 4 is a schematic diagram of a hurricane force prediction model based on Bagging method.
Detailed Description
The following examples describe the invention in further detail so that those skilled in the art can understand its principles and spirit more completely and accurately.
Referring to FIG. 1 to FIG. 4, a hurricane intensity change prediction method based on a data mining model comprises the following steps:
Step 1: acquire and preprocess hurricane meteorological data
Because data collection is difficult, some hurricane data was not collected successfully or is missing, and some data items are meaningless, so the hurricane data must be preprocessed. The data preprocessing comprises five parts: hurricane intensity classification, data cleaning, data format conversion, data segmentation, and conversion of the classification data into prediction classification data.
Step 2: find suitable classification algorithms and explore the feasibility of RI-type hurricane classification
We use the RI (rapid intensification) strategy proposed by Kaplan and DeMaria et al.: the "whether RI type" attribute in the data set is set as the classification attribute when predicting hurricane intensity change, and a classification algorithm is then used for training.
Step 3: put the data test set into the hurricane intensity prediction model for ensemble training
Algorithms are selected as Bagging base classifiers to learn the data set, so as to obtain the optimal hurricane intensity change prediction model.
Step 4: select the optimal hurricane intensity prediction model from the ensemble learning results according to an evaluation index system
The 10 prediction functions obtained by training are sorted by classification accuracy, the 5 with the highest accuracy are added to a decision group, and the classification result is chosen by voting, taking the various indexes into account.
Step 5: carry out hurricane wind-force level prediction experiments for 6, 12 and 18 hours into the future
The experiment of step 4 is carried out with the best combination of algorithms selected in step 3, exploring the ability to predict the hurricane wind-force level 6, 12 and 18 hours into the future.
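The selection rule of step 4 (rank the 10 trained prediction functions by accuracy, keep the best 5, then vote) can be sketched as follows. This is a simplified illustration: the accuracies and the stub prediction functions are hypothetical, and a plain majority vote stands in for the vote "taking the various indexes into account", whose exact weighting the text does not specify.

```python
from collections import Counter

def select_decision_group(models, top_n=5):
    """models: list of (accuracy, predict_fn). Keep the top_n by accuracy."""
    ranked = sorted(models, key=lambda m: m[0], reverse=True)
    return ranked[:top_n]

def vote(decision_group, x):
    """Return the class predicted by most members of the decision group."""
    votes = Counter(predict(x) for _, predict in decision_group)
    return votes.most_common(1)[0][0]

# Hypothetical ensemble: 10 prediction functions with accuracies 80..89 (%).
# Functions 7, 8 and 9 predict "up"; the others predict "down".
models = [
    (80 + i, (lambda x, i=i: "up" if i >= 7 else "down"))
    for i in range(10)
]
group = select_decision_group(models)
result = vote(group, x=None)
```

Here the decision group holds the functions with accuracies 89 down to 85, three of which vote "up", so the voted result is "up".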
Step 1 is implemented through the following sub-steps:
Step 1.1: hurricane intensity is divided into 12 levels according to the central wind speed, and a new data item VCLASS is created in the data table.
Step 1.2: attribute columns whose missing-data rate is higher than 1% are deleted, and the missing values of attribute columns whose rate is lower than 1% are filled in using a Weka function.
Step 1.3: the hurricane level data is discretized on the Weka platform using weka.filters.unsupervised.attribute.NumericToNominal.
Step 1.4: the data obtained in step 1.3 is taken as the initial data for the data mining experiments, and a training set and a test set are separated from it.
Step 1.5: the classification data is converted into prediction data. The hurricane data is processed by hurricane name. For the x records of each hurricane, the prediction targets are the data 6, 12 and 18 hours later; that is, for the i-th record, the predicted levels at 6, 12 and 18 hours are the VCLASS values of records i+1, i+2 and i+3. The last 3 of each hurricane's x records are then deleted. This finally yields 3 groups of data sets and training sets, for the predicted levels after 6, 12 and 18 hours respectively.
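The label shifting of step 1.5 can be sketched in Python as follows. The record layout (dicts with `name` and `VCLASS` keys) and the `TARGET` field name are illustrative assumptions; the sketch also assumes that consecutive records of one hurricane are 6 hours apart, which is what the i+1, i+2, i+3 offsets imply.

```python
from collections import defaultdict

# Lead time in hours -> record offset (6-hourly records assumed).
LEADS = {6: 1, 12: 2, 18: 3}

def build_lead_datasets(records):
    """Attach future VCLASS values as prediction targets, storm by storm.

    records: list of dicts with at least 'name' and 'VCLASS', in time order
    within each storm. Returns {lead_hours: [record plus 'TARGET', ...]}.
    """
    by_storm = defaultdict(list)
    for rec in records:
        by_storm[rec["name"]].append(rec)  # process data by hurricane name

    datasets = {hours: [] for hours in LEADS}
    for track in by_storm.values():
        usable = len(track) - 3            # drop the last 3 records per storm
        for i in range(max(usable, 0)):
            for hours, offset in LEADS.items():
                row = dict(track[i])
                row["TARGET"] = track[i + offset]["VCLASS"]
                datasets[hours].append(row)
    return datasets

# Hypothetical storm "A" with 6 records whose levels are 1..6.
track = [{"name": "A", "VCLASS": v} for v in [1, 2, 3, 4, 5, 6]]
out = build_lead_datasets(track)
```

For this storm the three remaining records receive targets 2, 3, 4 in the 6-hour set, 3, 4, 5 in the 12-hour set and 4, 5, 6 in the 18-hour set.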
The specific implementation process of the step 2 comprises the following substeps:
step 2.1: 5 algorithms, REPTree, LMT (Logistic model tree), J48(C4.5), IBk (kNN) and MultilayerPerceptron (BP neural network), were selected for classification, finding suitable classification algorithms and exploring the possibility of RI type hurricane classification.
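Before these classifiers can be trained on the RI task of step 2, each record needs its "whether RI type" attribute. The sketch below shows one way to derive it. The threshold is an assumption not stated in this document: it follows the commonly cited Kaplan and DeMaria definition of rapid intensification as an increase of at least 30 knots in maximum sustained wind within 24 hours, and the 6-hourly record spacing is likewise assumed.

```python
def label_ri(vmax_kt, threshold_kt=30.0, steps_24h=4):
    """Return 'yes'/'no'/None RI labels for one storm track.

    vmax_kt: maximum sustained wind speeds in knots at (assumed) 6-hour steps.
    A record is labelled 'yes' if the wind rises by at least threshold_kt
    over the next steps_24h records. Records near the end of the track,
    where no 24-hour look-ahead exists, get None.
    """
    labels = []
    for i, v in enumerate(vmax_kt):
        j = i + steps_24h
        if j >= len(vmax_kt):
            labels.append(None)
        else:
            labels.append("yes" if vmax_kt[j] - v >= threshold_kt else "no")
    return labels

# Hypothetical track of 7 six-hourly wind speeds.
labels = label_ri([35, 40, 50, 60, 70, 65, 80])
```

Only the first three records have a full 24-hour look-ahead here; they are labelled "yes", "no" and "yes", and the rest are left unlabelled.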
Step 3 is implemented through the following sub-steps:
Step 3.1: the experiments are divided into experiment 1 and experiment 2. Ten-fold cross-validation is used as the data mining test method for all experiment groups. The system randomly divides the input data set into 10 parts and, in turn, uses 9 parts as training data and 1 part as test data. For the Bagging framework, 10 base classifiers are set to take part in training for each experiment group, and all experiment groups use the same Bagging settings. The experiments are implemented through secondary development of the Weka Bagging function.
Step 3.1.1: the five classification algorithms mostly use Weka's default parameter settings. The modifications are as follows: for the IBk algorithm, the k value is set to 5 and crossValidate is set to True, which lets the program use cross-validation at run time to select the best k between 1 and k for classifying unknown points; distance weighting is set to 1/distance; and the GUI option of MultilayerPerceptron is set to True.
Step 3.2: experiment 1 compares the performance of the 5 algorithms REPTree, LMT (logistic model tree), J48 (C4.5), IBk (kNN) and MultilayerPerceptron (BP neural network) as Bagging base classifiers, and identifies the algorithms whose accuracy exceeds 85% for use in experiment 2.
Step 3.2.1: the five algorithms REPTree, LMT, J48, IBk and MultilayerPerceptron are each used as the Bagging base classifier algorithm for ensemble learning on the data set.
Step 3.2.2: whether an algorithm can be adopted is judged from the ten-fold cross-validation indexes, and parameters are tuned to try to improve classification accuracy. Once tuning brings an algorithm to suitable accuracy, it is selected as a base classifier for ensemble training.
Step 3.2.3: a comprehensive evaluation system is established, with classification accuracy as the main index and F-Measure, mean absolute error, root mean square error and AUC value as auxiliary reference indexes.
Step 3.3.1: experiment 2 builds on the suitable algorithms obtained in experiment 1. With the logistic model tree (LMT) algorithm as the main algorithm, the other three algorithms are used to extend it into Bagging base-classifier sequences with equal numbers, forming 6 combinations: the LMT-MultilayerPerceptron model, the LMT-J48 model, the LMT-REPTree model, the LMT-MultilayerPerceptron-REPTree model, the LMT-MultilayerPerceptron-J48 model and the LMT-MultilayerPerceptron-REPTree-J48 model. The test set is input into Bagging to train these models, and the training result is as many prediction models as there are base classifiers in the sequence.
Step 3.3.2: the data mining tests are evaluated and selected using ten-fold cross-validation.
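The Bagging scheme of step 3 (10 base classifiers, each trained on a resample of the data, combined by voting) can be sketched in Python. This is a generic illustration of Bagging and not the Weka-based implementation the text refers to: the one-feature nearest-neighbour learner `train_1nn` is a stand-in for the Weka classifiers, and cycling through a list of learners is only one assumed way to mix several algorithms in one ensemble.

```python
import random
from collections import Counter

def train_1nn(sample):
    """Stand-in base learner: 1-nearest neighbour on one numeric feature."""
    def predict(x):
        return min(sample, key=lambda pair: abs(pair[0] - x))[1]
    return predict

def bagging_fit(data, learners, n_estimators=10, seed=0):
    """Train n_estimators models, each on a bootstrap resample of data.

    learners is cycled, so several algorithms can share one ensemble.
    """
    rng = random.Random(seed)
    models = []
    for i in range(n_estimators):
        boot = [rng.choice(data) for _ in data]  # sample with replacement
        models.append(learners[i % len(learners)](boot))
    return models

def bagging_predict(models, x):
    """Majority vote over the base models."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical data: (wind speed in m/s, class label).
data = [(10, "low"), (12, "low"), (15, "low"),
        (40, "high"), (45, "high"), (50, "high")]
models = bagging_fit(data, [train_1nn])
pred = bagging_predict(models, 47)
```

Passing two or more learner functions to `bagging_fit` alternates them across the 10 base classifiers, which loosely mirrors a combined model such as LMT-MultilayerPerceptron.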
Step 4 is implemented through the following sub-steps:
Step 4.1: the optimal hurricane intensity change prediction model is selected according to the evaluation index system, for example by comparing the ensemble hurricane intensity prediction models on classification accuracy and the confusion matrix while also considering indexes such as F-Measure, mean absolute error, root mean square error and AUC value.
Step 4.1.1: classification accuracy is the percentage of all samples for which the model's prediction is correct, and it is used to evaluate the classification model. The formulas are given below, where TP (true positive) is a positive sample correctly predicted as positive, TN (true negative) is a negative sample correctly predicted as negative, FP (false positive) is a negative sample incorrectly predicted as positive, and FN (false negative) is a positive sample incorrectly predicted as negative. Precision P and recall R are defined from the same counts.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$P = \frac{TP}{TP + FP}$$
$$R = \frac{TP}{TP + FN}$$
Step 4.1.2: classification accuracy alone is not enough to measure a hurricane prediction model. Precision P and recall R influence each other, and it is difficult to make both high at the same time, so the F-Measure is introduced. The F-Measure is a weighted harmonic mean of precision and recall, given by the formula below. When alpha is 1, F-Measure is 2PR/(P + R). The higher the F-Measure, the better the model performs.
$$F = \frac{(1 + \alpha^2)\,P\,R}{\alpha^2 P + R}$$
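As a worked example, the confusion-matrix indexes can be computed as follows. The counts are made up for illustration, and the weighted F-Measure is written in its common form (1 + alpha^2)PR / (alpha^2 P + R), which is an assumption that agrees with the alpha = 1 case 2PR/(P + R) stated in the text.

```python
def confusion_metrics(tp, tn, fp, fn, alpha=1.0):
    """Return (accuracy, precision, recall, f_measure) from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Weighted harmonic mean of precision and recall (assumed common form).
    f_measure = ((1 + alpha ** 2) * precision * recall
                 / (alpha ** 2 * precision + recall))
    return accuracy, precision, recall, f_measure

# Hypothetical counts for a two-class problem with 100 samples.
acc, p, r, f = confusion_metrics(tp=40, tn=45, fp=5, fn=10)
```

With these counts the accuracy is 0.85, while the recall is only 0.80, which illustrates why accuracy alone can hide weak performance on the positive class.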
Step 4.1.3: the ROC curve is a two-dimensional curve drawn with FPR as the abscissa and TPR as the ordinate, where TPR (true positive rate) is the recall and FPR (false positive rate) is FP/(FP + TN). The AUC (Area Under Curve) value is defined as the area enclosed by the ROC curve and the coordinate axes. Because the ROC curve generally lies above the line y = x, the AUC generally ranges between 0.5 and 1. The higher the AUC value, that is, the closer the ROC curve is to the upper left corner, the better the classifier performs.
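The construction of the ROC curve and its area can be illustrated with a small Python sketch. The scores and labels are hypothetical, and the area is computed with the trapezoidal rule, which is one common way to evaluate it and not necessarily how Weka does.

```python
def roc_points(scores, labels):
    """Return ROC points (FPR, TPR), one per distinct score threshold.

    labels: 1 for the positive class, 0 for the negative class.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical classifier scores for 3 positive and 3 negative samples.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
area = auc(points)
```

For this example the area is 8/9: of the 9 positive-negative pairs, 8 are ranked correctly, which matches the usual reading of AUC as the chance that a positive sample scores higher than a negative one.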
Step 4.1.4: the Mean Absolute Error (MAE) is the average of the absolute values of the deviations between the individual predicted values and the true values. Because it prevents positive and negative errors from cancelling each other out, it better reflects the actual size of the prediction error. It is expressed as follows, where \hat{y}_i is the i-th predicted value, y_i is the corresponding true value and n is the number of predictions:
$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
step 4.1.5: root Mean Square Error (RMSE) is the square root of the ratio of the sum of the squares of the predicted values to the deviations from truth to the number of predictions n. The root mean square error is very sensitive to extra or extra small errors occurring in the prediction and is therefore suitable for measuring the accuracy of the model. Is formulated as:
Figure BDA0002508780140000072
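The two error indexes can be compared on a small made-up example. The sketch treats the hurricane level as a numeric value and measures the deviation between predicted and true levels, which is an assumption made for illustration; the values below are hypothetical.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between true and predicted values."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    return math.sqrt(
        sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )

# Hypothetical true and predicted levels; one prediction is off by 4 levels.
y_true = [3, 5, 7, 9]
y_pred = [3, 6, 7, 5]
mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
```

The single large miss dominates the RMSE (about 2.06) far more than the MAE (1.25), which shows the sensitivity to large errors described above.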
Step 5: the optimal hurricane intensity change model is found to be the LMT-MultilayerPerceptron model, and the process ends.
The above examples merely illustrate the invention clearly and do not limit its embodiments. All modifications and variations made within the spirit of the invention fall within its scope as defined by the appended claims.

Claims (6)

1. A hurricane intensity change prediction method based on a data mining model, characterized by comprising the following steps:
Step 1: acquire and preprocess hurricane meteorological data
The data preprocessing comprises five parts: hurricane intensity classification, data cleaning, data format conversion, data segmentation, and conversion of the classification data into prediction classification data;
Step 2: find suitable classification algorithms and explore the feasibility of RI-type hurricane classification
Using an RI strategy, the 'whether RI type' attribute in the data set is set as the classification attribute when predicting hurricane intensity change, and a classification algorithm is then used for training;
Step 3: put the data test set into the hurricane intensity prediction model for ensemble training
Algorithms are selected as Bagging base classifiers to learn the data set, so as to obtain the optimal hurricane intensity change prediction model;
Step 4: select the optimal hurricane intensity prediction model from the ensemble learning results according to an evaluation index system
The 10 prediction functions obtained by training are sorted by classification accuracy, the 5 with the highest accuracy are added to a decision group, and the classification result is chosen by voting, taking the various indexes into account;
Step 5: carry out hurricane wind-force level prediction experiments for 6, 12 and 18 hours into the future
The experiment of step 4 is carried out with the best combination of algorithms selected in step 3, exploring the ability to predict the hurricane wind-force level 6, 12 and 18 hours into the future.
2. The hurricane intensity change prediction method based on a data mining model of claim 1, wherein step 1 is implemented through the following sub-steps:
Step 1.1: divide hurricane intensity into 12 levels according to the central wind speed, and create a new data item VCLASS in the data table;
Step 1.2: delete the attribute columns whose missing-data rate is higher than 1%, and fill in the missing values of the attribute columns whose rate is lower than 1% using a Weka function;
Step 1.3: discretize the hurricane level data on the Weka platform using weka.filters.unsupervised.attribute.NumericToNominal;
Step 1.4: take the data obtained in step 1.3 as the initial data for the data mining experiments, and separate a training set and a test set from it;
Step 1.5: convert the classification data into prediction data. The hurricane data is processed by hurricane name. For the x records of each hurricane, the prediction targets are the data 6, 12 and 18 hours later; that is, for the i-th record, the predicted levels at 6, 12 and 18 hours are the VCLASS values of records i+1, i+2 and i+3. The last 3 of each hurricane's x records are then deleted. This finally yields 3 groups of data sets and training sets, for the predicted levels after 6, 12 and 18 hours respectively.
3. The hurricane intensity change prediction method based on a data mining model of claim 1, wherein step 2 is implemented through the following sub-step:
Step 2.1: the 5 algorithms REPTree, LMT (logistic model tree), J48 (C4.5), IBk (kNN) and MultilayerPerceptron (BP neural network) are selected for classification, in order to find suitable classification algorithms and explore the feasibility of RI-type hurricane classification.
4. The hurricane intensity change prediction method based on a data mining model of claim 1, wherein step 3 is implemented through the following sub-steps:
Step 3.1: the experiments are divided into experiment 1 and experiment 2. Ten-fold cross-validation is used as the data mining test method for all experiment groups: the system randomly divides the input data set into 10 parts and, in turn, uses 9 parts as training data and 1 part as test data. For the Bagging framework, 10 base classifiers are set to take part in training for each experiment group, and all experiment groups use the same Bagging settings. The experiments are implemented through secondary development of the Weka Bagging function;
Step 3.1.1: the five classification algorithms mostly use Weka's default parameter settings. The modifications are as follows: for the IBk algorithm, the k value is set to 5 and crossValidate is set to True, which lets the program use cross-validation at run time to select the best k between 1 and k for classifying unknown points; distance weighting is set to 1/distance; and the GUI option of MultilayerPerceptron is set to True;
Step 3.2: experiment 1 compares the performance of the 5 algorithms REPTree, LMT (logistic model tree), J48 (C4.5), IBk (kNN) and MultilayerPerceptron (BP neural network) as Bagging base classifiers, and identifies the algorithms whose accuracy exceeds 85% for use in experiment 2;
Step 3.2.1: the five algorithms REPTree, LMT, J48, IBk and MultilayerPerceptron are each used as the Bagging base classifier algorithm for ensemble learning on the data set;
Step 3.2.2: whether an algorithm can be adopted is judged from the ten-fold cross-validation indexes, and parameters are tuned to try to improve classification accuracy; once tuning brings an algorithm to suitable accuracy, it is selected as a base classifier for ensemble training;
Step 3.2.3: a comprehensive evaluation system is established, with classification accuracy as the main index and F-Measure, mean absolute error, root mean square error and AUC value as auxiliary reference indexes;
Step 3.3.1: experiment 2 takes the suitable algorithms obtained in experiment 1 and uses combinations of two, three and four of them as the Bagging base classifiers for ensemble training;
Step 3.3.2: the data mining tests are evaluated and selected using ten-fold cross-validation.
5. The data mining model-based hurricane force change prediction method of claim 1, wherein the detailed implementation procedure of step 4 comprises the following sub-steps:
step 4.1: the optimal hurricane intensity change prediction model is selected according to the evaluation index system, by comparing the ensemble hurricane intensity prediction models on classification accuracy and the confusion matrix while also considering indexes such as F-Measure, mean absolute error, root mean square error and AUC value.
6. The data mining model-based hurricane intensity change prediction method of claim 1, wherein in step 5 the optimal hurricane intensity change model obtained is the LMT-MultilayerPerceptron model, and the method ends.
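The evaluation index system named in the claims (accuracy as the main index, with F-Measure, mean absolute error and root mean square error as auxiliary indexes) can be illustrated with the following Python sketch. It is an assumption-laden example rather than the patent's implementation: the function names and class labels are hypothetical, the errors are computed between predicted class probabilities and 0/1 class indicators (one common convention for nominal classes), F-Measure is averaged with class-frequency weights, and AUC is omitted for brevity.

```python
from collections import Counter
from math import sqrt


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Fraction of instances on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def weighted_f_measure(matrix):
    """Per-class F-Measure averaged with weights equal to class frequency."""
    total = sum(sum(row) for row in matrix)
    score = 0.0
    for i, row in enumerate(matrix):
        tp = row[i]
        actual_count = sum(row)
        predicted_count = sum(r[i] for r in matrix)
        precision = tp / predicted_count if predicted_count else 0.0
        recall = tp / actual_count if actual_count else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * actual_count / total
    return score


def probability_errors(actual, probabilities, labels):
    """MAE and RMSE between predicted class probabilities and 0/1 indicators."""
    diffs = [
        p - (1.0 if label == a else 0.0)
        for a, probs in zip(actual, probabilities)
        for label, p in zip(labels, probs)
    ]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


def rank_models(scores):
    """Order model names by accuracy first, using F-Measure to break ties."""
    return sorted(
        scores,
        key=lambda name: (scores[name]["accuracy"], scores[name]["f_measure"]),
        reverse=True,
    )


# Toy labels for illustration only; not data from the patent.
labels = ["up", "down"]
actual = ["up", "up", "down", "down"]
predicted = ["up", "down", "down", "down"]
matrix = confusion_matrix(actual, predicted, labels)
```

Ranking by accuracy with an auxiliary tie-break mirrors the claim's split between a main index and reference indexes; a real comparison would feed in the cross-validated scores of each Bagging ensemble.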
CN202010454683.3A 2020-05-26 2020-05-26 Hurricane intensity change prediction method based on data mining Pending CN111624681A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010454683.3A CN111624681A (en) 2020-05-26 2020-05-26 Hurricane intensity change prediction method based on data mining

Publications (1)

Publication Number Publication Date
CN111624681A (en) 2020-09-04

Family

ID=72258192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010454683.3A Pending CN111624681A (en) 2020-05-26 2020-05-26 Hurricane intensity change prediction method based on data mining

Country Status (1)

Country Link
CN (1) CN111624681A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255772A (en) * 2021-05-27 2021-08-13 北京玻色量子科技有限公司 Data analysis method and device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004346653A (en) * 2003-05-23 2004-12-09 Kyushu Univ System, method and program for forecasting earth-flow disaster
WO2012046959A1 (en) * 2010-10-07 2012-04-12 서울대학교산학협력단 Prediction model for summer typhoon number and track for each group
CN104932035A (en) * 2015-05-26 2015-09-23 中国科学院深圳先进技术研究院 Typhoon intensity prediction method and system
WO2016057859A1 (en) * 2014-10-10 2016-04-14 The Penn State Research Foundation Identifying visual storm signatures form satellite images
CN107179566A (en) * 2017-05-12 2017-09-19 周调彪 The self study modification method and system of a kind of district weather forecasting
WO2017193153A1 (en) * 2016-05-11 2017-11-16 Commonwealth Scientific And Industrial Research Organisation Solar power forecasting
CN109063939A (en) * 2018-11-01 2018-12-21 华中科技大学 A kind of wind speed forecasting method and system based on neighborhood-gated long short-term memory network
CN109902885A (en) * 2019-04-09 2019-06-18 中国人民解放军国防科技大学 Typhoon prediction method based on deep learning mixed CNN-LSTM model
CN110824586A (en) * 2019-10-23 2020-02-21 上海理工大学 Rainfall prediction method based on improved decision tree algorithm
CN110837137A (en) * 2019-11-07 2020-02-25 刘健华 Typhoon prediction alarm method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
RUIXIN YANG: ""A Systematic Classification Investigation of Rapid Intensification of Atlantic Tropical Cyclones with the SHIPS Database"", 《AMERICAN METEOROLOGICAL SOCIETY》 *
SHUHAN YANG,QINGXIANG MENG: ""Hurricane intensity prediction based on time series data mining"", 《2019 10TH INTERNATIONAL WORKSHOP ON THE ANALYSIS OF MULTITEMPORAL REMOTE SENSING IMAGES(MULTITEMP)》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200904