CN114187977A - Equipment material spheroidization degree prediction method and system - Google Patents

Equipment material spheroidization degree prediction method and system

Info

Publication number
CN114187977A
Authority
CN
China
Prior art keywords
samples
individual
sample set
data sample
population
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111496811.1A
Other languages
Chinese (zh)
Other versions
CN114187977B (en)
Inventor
李如
曹逻炜
李光海
陈良超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Special Equipment Inspection and Research Institute
Original Assignee
China Special Equipment Inspection and Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Special Equipment Inspection and Research Institute
Priority to CN202111496811.1A
Publication of CN114187977A
Application granted
Publication of CN114187977B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Genetics & Genomics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Physiology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a method and a system for predicting the spheroidization degree of equipment materials. The method first expands the minority-class samples using the Borderline-SMOTE algorithm and then determines the hyperparameters of the SVM model with a differential evolution algorithm, thereby improving the prediction accuracy of the model.

Description

Equipment material spheroidization degree prediction method and system
Technical Field
The invention relates to the technical field of equipment material performance monitoring, in particular to a method and a system for predicting equipment material spheroidization degree.
Background
Spheroidization is the process by which the cementite in pearlite gradually changes from a lamellar structure into a spherical form when steel materials such as carbon steel and alloy steel operate at high temperature (440 °C to 760 °C) for long periods. It is a form of material degradation damage commonly found in high-temperature equipment such as power station boilers. Spheroidization is generally influenced by temperature and stress and accelerates creep damage to some extent, which reduces the mechanical properties of the equipment and its high-temperature components and can lead to serious accidents such as equipment deformation, tube bursting and cracking. The spheroidization grade is an important index for evaluating equipment spheroidization and is of great significance for determining equipment safety. In addition, combining the predicted spheroidization grade with the creep damage condition allows the equipment risk to be evaluated so that corresponding protective measures can be established.
At present, engineering practice generally grades spheroidization by manual comparison against the standard spheroidization metallographic atlases of common steels. Researchers in China and abroad have carried out a series of studies on spheroidization grade determination based on methods such as fractal dimension, gray-level co-occurrence matrices and nonlinear ultrasonic testing. Analysis methods such as metallographic examination and microscopic image recognition are complicated to operate and ill-suited to large-scale application. By comparison, data prediction models represented by the Artificial Neural Network (ANN) and the Support Vector Machine (SVM) have been applied successfully in fields such as medical health assessment, fault diagnosis and corrosion damage prediction, and they offer a new approach to spheroidization grade prediction. However, because existing equipment material data contain minority-class samples, a high-precision model cannot be obtained from them, and the spheroidization degree of equipment materials cannot be predicted accurately.
Disclosure of Invention
In view of this, the present invention provides a method and a system for predicting the spheroidization degree of equipment materials, which overcome the technical defect that a high-precision model cannot be obtained because minority-class samples exist in the equipment material data, and thereby achieve accurate prediction of the spheroidization degree of equipment materials.
In order to achieve the purpose, the invention provides the following scheme:
a method for predicting the spheroidization degree of equipment materials, comprising the following steps:
acquiring equipment material parameters with known spheroidization degrees, and constructing a data sample set;
oversampling the minority-class samples in the data sample set using the Borderline-SMOTE algorithm, so that the difference between the proportions of minority-class and majority-class samples in the data sample set is smaller than a preset threshold, and obtaining an expanded data sample set;
constructing an SVM model for predicting the spheroidization degree of the equipment material;
determining the hyperparameters of the SVM model by adopting a differential evolution algorithm based on the expanded data sample set, and obtaining the SVM model with the hyperparameters determined;
training the SVM model with the determined hyper-parameters based on the expanded data sample set to obtain a trained SVM model;
and predicting the spheroidization degree of the equipment material based on the trained SVM model.
Optionally, the oversampling is performed on the minority samples in the data sample set by using the Borderline-SMOTE algorithm, so that a difference value between the ratios of the minority samples and the majority samples in the data sample set is smaller than a preset threshold, and the expanded data sample set is obtained, which specifically includes:
determining a plurality of neighbor samples of each minority sample according to Euclidean distances between each minority sample and all samples except the minority sample in the data sample set;
determining the number of samples belonging to a majority class in a plurality of adjacent samples of each minority class sample, and respectively using the determined number as a boundary judgment index of each minority class sample;
setting minority-class samples whose boundary judgment index lies within a preset range as boundary samples;
generating, for each boundary sample, new minority-class samples using the formula $x_{new} = x + \lambda \times (x_i - x)$ (1);
where $x$ is a boundary sample, $x_i$ is the i-th neighbor of the boundary sample, and $x_{new}$ is a new minority-class sample;
and adding all the generated new minority-class samples to the data sample set to obtain the expanded data sample set.
Optionally, the oversampling is performed on the minority samples in the data sample set by using the Borderline-SMOTE algorithm, so that a difference value between the ratios of the minority samples and the majority samples in the data sample set is smaller than a preset threshold, and the expanded data sample set is obtained, and then the method further includes:
and carrying out normalization processing on each sample in the expanded data sample set.
Optionally, the determining, based on the expanded data sample set, a hyper-parameter of the SVM model by using a differential evolution algorithm to obtain the SVM model after determining the hyper-parameter specifically includes:
taking parameters in the SVM model as genes, initializing a population, and setting the initialized population as a current population;
determining the optimal individual of the current population based on the expanded data sample set by taking a 10-fold cross validation result of the SVM model as a fitness index;
judging whether a termination condition is met or not, and obtaining a judgment result;
if the judgment result is no, mutating the individuals in the current population using the formula
$v_i^G = x_{r_1}^G + F \cdot (x_{r_2}^G - x_{r_3}^G)$
to obtain variant individuals; wherein $x_{r_1}^G$, $x_{r_2}^G$ and $x_{r_3}^G$ are three individuals in the current population, $v_i^G$ represents the i-th variant individual, and F represents the differential scaling factor;
performing crossover for each variant individual using the formula
$u_{ij}^G = \begin{cases} v_{ij}^G, & \text{if } \mathrm{rand}(0,1) \le CR \\ x_{ij}^G, & \text{otherwise} \end{cases}$
to obtain crossover individuals; wherein $u_{ij}^G$ represents the j-th gene of the crossover individual of the i-th individual in the current population, $v_{ij}^G$ represents the j-th gene of the variant individual of the i-th individual in the current population, $x_{ij}^G$ represents the j-th gene of the i-th individual in the current population, and CR represents the crossover probability;
selecting the individuals of the next generation population from each individual and its crossover individual using the formula
$x_i^{G+1} = \begin{cases} u_i^G, & \text{if } f(u_i^G) \text{ is better than } f(x_i^G) \\ x_i^G, & \text{otherwise} \end{cases}$
wherein $x_i^{G+1}$ represents the i-th individual in the next generation population, $u_i^G$ represents the crossover individual of the i-th individual in the current population, $x_i^G$ represents the i-th individual in the current population, and $f(x_i^G)$ and $f(u_i^G)$ respectively represent the fitness indexes of the i-th individual in the current population and of its crossover individual;
setting the next generation population as the current population, and returning to the step of determining the optimal individual of the current population by taking a 10-fold cross validation result of the SVM model as a fitness index and based on the expanded data sample set;
and if the judgment result is yes, outputting the optimal individual of the current population.
A system for predicting a degree of equipment material spheroidization, the prediction system comprising:
the data sample set construction module is used for acquiring equipment material parameters with known spheroidization degrees and constructing a data sample set;
the sample expansion module is used for oversampling the minority-class samples in the data sample set using the Borderline-SMOTE algorithm, so that the difference between the proportions of minority-class and majority-class samples in the data sample set is smaller than a preset threshold, and obtaining an expanded data sample set;
the SVM model construction module is used for constructing an SVM model for predicting the spheroidization degree of the equipment material;
the hyperparameter determining module of the SVM model is used for determining the hyperparameters of the SVM model using a differential evolution algorithm based on the expanded data sample set, to obtain the SVM model with the determined hyperparameters;
the SVM model training module is used for training the SVM model with the determined hyper-parameters based on the expanded data sample set to obtain a trained SVM model;
and the spheroidization degree prediction module is used for predicting the spheroidization degree of the equipment material based on the trained SVM model.
Optionally, the sample expansion module specifically includes:
a neighbor sample selection submodule, configured to determine multiple neighbor samples of each minority sample according to euclidean distances between each minority sample and all samples in the data sample set except the minority sample;
the boundary judgment index determining submodule is used for determining the number of majority-class samples among the plurality of neighbor samples of each minority-class sample, and using that number as the boundary judgment index of the minority-class sample;
the boundary sample determining submodule is used for setting minority-class samples whose boundary judgment index lies within a preset range as boundary samples;
a minority-class sample generation submodule, used for generating new minority-class samples from each boundary sample using the formula $x_{new} = x + \lambda \times (x_i - x)$; where $x$ is a boundary sample, $x_i$ is the i-th neighbor of the boundary sample, and $x_{new}$ is a new minority-class sample;
and the sample set expansion submodule is used for adding all the generated new minority-class samples to the data sample set to obtain the expanded data sample set.
Optionally, the prediction system further includes:
and the normalization module is used for normalizing each sample in the expanded data sample set.
Optionally, the hyper-parameter determining module of the SVM model specifically includes:
the initialization submodule is used for taking parameters in the SVM model as genes, initializing a population and setting the initialized population as a current population;
the optimal individual selection submodule is used for determining the optimal individual of the current population based on the expanded data sample set by taking the 10-fold cross validation result of the SVM model as a fitness index;
the judgment submodule is used for judging whether the termination condition is met or not and obtaining a judgment result;
a variation submodule, used for mutating the individuals in the current population, if the judgment result is no, using the formula
$v_i^G = x_{r_1}^G + F \cdot (x_{r_2}^G - x_{r_3}^G)$
to obtain variant individuals; wherein $x_{r_1}^G$, $x_{r_2}^G$ and $x_{r_3}^G$ are three individuals in the current population, $v_i^G$ represents the i-th variant individual, and F represents the differential scaling factor;
a crossover submodule, used for performing crossover for each variant individual using the formula
$u_{ij}^G = \begin{cases} v_{ij}^G, & \text{if } \mathrm{rand}(0,1) \le CR \\ x_{ij}^G, & \text{otherwise} \end{cases}$
to obtain crossover individuals; wherein $u_{ij}^G$ represents the j-th gene of the crossover individual of the i-th individual in the current population, $v_{ij}^G$ represents the j-th gene of the variant individual of the i-th individual in the current population, $x_{ij}^G$ represents the j-th gene of the i-th individual in the current population, and CR represents the crossover probability;
an individual selection submodule, used for selecting the individuals of the next generation population from each individual and its crossover individual using the formula
$x_i^{G+1} = \begin{cases} u_i^G, & \text{if } f(u_i^G) \text{ is better than } f(x_i^G) \\ x_i^G, & \text{otherwise} \end{cases}$
wherein $x_i^{G+1}$ represents the i-th individual in the next generation population, $u_i^G$ represents the crossover individual of the i-th individual in the current population, $x_i^G$ represents the i-th individual in the current population, and $f(x_i^G)$ and $f(u_i^G)$ respectively represent the fitness indexes of the i-th individual in the current population and of its crossover individual;
the iteration calling submodule is used for setting the next generation population as the current population and calling the optimal individual selection submodule;
and the output submodule is used for outputting the optimal individual of the current population if the judgment result is yes.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
The invention discloses a method for predicting the spheroidization degree of equipment materials, which comprises the following steps: acquiring equipment material parameters with known spheroidization degrees and constructing a data sample set; oversampling the minority-class samples in the data sample set using the Borderline-SMOTE algorithm, so that the difference between the proportions of minority-class and majority-class samples in the data sample set is smaller than a preset threshold, and obtaining an expanded data sample set; constructing an SVM model for predicting the spheroidization degree of the equipment material; determining the hyperparameters of the SVM model using a differential evolution algorithm based on the expanded data sample set to obtain the SVM model with the determined hyperparameters; training that SVM model based on the expanded data sample set to obtain a trained SVM model; and predicting the spheroidization degree of the equipment material based on the trained SVM model. The method first expands the minority-class samples using the Borderline-SMOTE algorithm and then determines the hyperparameters of the SVM model with the differential evolution algorithm, which improves the prediction accuracy of the model.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of the method for predicting the spheroidization degree of equipment materials provided in embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of the method for predicting the spheroidization degree of equipment materials provided in embodiment 1 of the present invention;
FIG. 3 is a sample distribution diagram before and after the imbalance adjustment provided in embodiment 3 of the present invention;
FIG. 4 is a schematic diagram of the confusion matrices of the spheroidization grade prediction before and after data balancing provided in embodiment 3 of the present invention, wherein FIG. 4(a) is the confusion matrix of the spheroidization grade prediction before data balancing and FIG. 4(b) is the confusion matrix of the spheroidization grade prediction after data balancing;
FIG. 5 is a trend comparison graph of the evaluation values of each classification model provided in embodiment 3 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a method and a system for predicting the spheroidization degree of equipment materials, which overcome the technical defect that a high-precision model cannot be obtained because minority-class samples exist in the equipment material data, and thereby achieve accurate prediction of the spheroidization degree of equipment materials.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
The SVM is good at handling nonlinear problems on small-sample, high-dimensional data sets and has strong generalization capability, which makes it well suited to equipment spheroidization detection data sets with a small data volume and complex parameter relations. The choice of hyperparameters has an important influence on the performance and the prediction results of the SVM model.
For the classification prediction problem of the equipment spheroidization grade, the invention establishes a multi-class prediction model based on the SVM, applies the Borderline-SMOTE algorithm for oversampling adjustment to solve the sample imbalance in the spheroidization detection data set, and uses a Differential Evolution (DE) algorithm to determine the optimal parameters required by the SVM model, finally obtaining a DE-SVM spheroidization grade prediction model with higher classification accuracy and better spheroidization identification capability.
Example 1
As shown in fig. 1 and 2, the present invention provides a method for predicting a spheroidization degree of an equipment material, the method comprising the steps of:
Step 101: acquiring the equipment material parameters with known spheroidization degrees, and constructing a data sample set.
Step 102: oversampling the minority-class samples in the data sample set using the Borderline-SMOTE algorithm, so that the difference between the proportions of minority-class and majority-class samples in the data sample set is smaller than a preset threshold, and obtaining the expanded data sample set.
(1) Imbalance adjustment: the distribution of data samples over the grades is counted, and the Borderline-SMOTE algorithm is applied to oversample the minority-class samples in the original data set (a class with too few samples is generally called a "minority class", and a class with too many samples a "majority class"), so that the proportions of minority-class and majority-class samples tend to be consistent. The specific flow is as follows:
determining the number of neighbors: calculate the Euclidean distance between each minority-class sample x and all other samples, and determine the k nearest neighbors of sample x accordingly, where k = 5 by default;
identifying boundary samples: assume that there are t majority-class samples among the k neighbors (0 ≤ t ≤ k):
when $t = k$, sample x is judged to be a noise sample;
when $k/2 \le t < k$, sample x is judged to be a boundary sample and is placed in the danger set;
when $0 \le t < k/2$, sample x is judged to be a safe sample.
synthesizing minority-class samples: for each sample x in the danger set, find its k nearest minority-class neighbors, randomly select one neighbor $x_i$, and synthesize a new minority-class sample $x_{new}$ according to formula (1), where λ is a random number from 0 to 1.
$x_{new} = x + \lambda \times (x_i - x)$ (1)
Synthesizing minority-class samples in this way adjusts the proportion of each class, balances the data set, and avoids the problem that the subsequent model cannot predict minority-class samples because of sample imbalance, so that every class can be identified after the adjustment. In addition, once the sample proportions have been adjusted to be consistent, majority-class and minority-class samples are no longer distinguished, and the samples are divided only by data set partitioning.
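The imbalance adjustment described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`k_nearest`, `classify_minority`, `borderline_smote`), the list-of-lists data layout and the `n_new` argument (how many synthetic samples to draw) are hypothetical, and the danger-set thresholds follow the standard Borderline-SMOTE rule.

```python
import math
import random

def k_nearest(idx, points, k):
    """Indices of the k points closest (Euclidean) to points[idx], excluding itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]

def classify_minority(X, y, minority_label, k=5):
    """Tag every minority sample as 'noise', 'danger' or 'safe'."""
    roles = {}
    for i, label in enumerate(y):
        if label != minority_label:
            continue
        # t = number of majority-class samples among the k neighbors
        t = sum(1 for j in k_nearest(i, X, k) if y[j] != minority_label)
        if t == k:
            roles[i] = "noise"
        elif t >= k / 2:
            roles[i] = "danger"   # boundary sample
        else:
            roles[i] = "safe"
    return roles

def borderline_smote(X, y, minority_label, n_new, k=5, rng=None):
    """Return n_new synthetic minority samples built from the danger set."""
    rng = rng or random.Random(0)
    roles = classify_minority(X, y, minority_label, k)
    danger = [i for i, role in roles.items() if role == "danger"]
    minority = [i for i, label in enumerate(y) if label == minority_label]
    synthetic = []
    if not danger or len(minority) < 2:
        return synthetic          # nothing to interpolate from
    for _ in range(n_new):
        i = rng.choice(danger)
        # k nearest minority-class neighbors of the boundary sample
        neighbors = sorted((j for j in minority if j != i),
                           key=lambda j: math.dist(X[i], X[j]))[:k]
        j = rng.choice(neighbors)
        lam = rng.random()        # lambda in [0, 1)
        # formula (1): x_new = x + lambda * (x_i - x)
        synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic
```

In practice `n_new` would be chosen so that the class proportions differ by less than the preset threshold; the returned samples are appended to the data set with the minority label.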
(2) Normalization: all sample data obtained after the imbalance adjustment are normalized according to formula (2), so that the values are mapped into the interval [0, 1].
$x' = \dfrac{x - x_{min}}{x_{max} - x_{min}}$ (2)
where $x_{min}$ and $x_{max}$ are the minimum and maximum values of the corresponding parameter.
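As a small illustration of mapping values into [0, 1], the following sketch applies column-wise min-max scaling. The function name `min_max_normalize` is hypothetical, and returning 0.0 for a constant column is an assumption made here to avoid division by zero; the source does not say how that case is handled.

```python
def min_max_normalize(rows):
    """Scale every column of `rows` into [0, 1] with (x - min) / (max - min)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # constant column -> 0.0 (assumption, not from the source)
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled
```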
(3) Data set partitioning: at a ratio of 8:2, 20% of the data in the balanced data set is randomly selected as the test set for final verification of the model, and the remaining 80% is used as the training set for subsequent model optimization and training.
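A random 8:2 partition can be sketched as follows. The helper name `split_80_20`, the `seed` argument and the return order are illustrative assumptions.

```python
import random

def split_80_20(samples, labels, test_ratio=0.2, seed=0):
    """Randomly hold out `test_ratio` of the data as the test set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(ids, seq):
        return [seq[i] for i in ids]

    # order: training samples, training labels, test samples, test labels
    return (pick(train_idx, samples), pick(train_idx, labels),
            pick(test_idx, samples), pick(test_idx, labels))
```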
Step 103: constructing an SVM model for predicting the spheroidization degree of the equipment material.
Step 104: determining the hyperparameters of the SVM model using a differential evolution algorithm based on the expanded data sample set, and obtaining the SVM model with the determined hyperparameters.
The DE algorithm is applied to the training set obtained from the partition to optimize the two important parameters C and gamma of the SVM model. The DE algorithm seeks an optimal solution by simulating the genetic evolution of individuals in a population, including mutation, crossover and selection. Each individual is represented as a vector, $x_i = \{x_{i1}, x_{i2}, \ldots, x_{iD}\}$, where $i = 1, 2, \ldots, N$, N is the population size and D is the dimension of the individual attributes; the values in the vector represent, in order, the parameters to be solved. The evolution flow is as follows.
(1) Population initialization: set the control parameters of the DE algorithm. The population size is N = 50, and two variables need to be optimized, so the individual attribute dimension is D = 2 and an individual vector takes the form $x_i = \{x_{i1}, x_{i2}\}$, where $x_{i1}$ and $x_{i2}$ correspond to the values of C and gamma respectively during the solving process. The differential scaling factor is F = 0.5, the crossover probability is CR = 0.3, and the value range $[U_{min}, U_{max}]$ of each individual, that is, the value boundary of both C and gamma, is [0.01, 100].
First, 50 initial population individuals $x_i^0$ are randomly generated within the search space. Each individual is a D-dimensional vector, and the value of each dimension is:
$x_{ij} = U_{min} + \lambda (U_{max} - U_{min})$ (3)
where $i = 1, 2, \ldots, N$, $j = 1, 2, \ldots, D$, and λ is a random number from 0 to 1.
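Formula (3) amounts to drawing each gene uniformly from the search interval. A minimal sketch using the parameter values quoted above (the function name `init_population` and the `rng` argument are hypothetical):

```python
import random

def init_population(n=50, dim=2, u_min=0.01, u_max=100.0, rng=None):
    """Formula (3): x_ij = U_min + lambda * (U_max - U_min), lambda in [0, 1)."""
    rng = rng or random.Random(0)
    return [[u_min + rng.random() * (u_max - u_min) for _ in range(dim)]
            for _ in range(n)]
```

Each row is one individual; with `dim=2` the two genes stand for C and gamma.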
(2) Initial population evaluation: calculate the fitness value of each individual in the initial population. The fitness value is an index of how good an individual in the population is, and it is calculated from the fitness function of the problem. The method uses the 10-fold cross-validation result of the SVM model as the fitness index: the training set obtained in the previous step is randomly divided into 10 equal parts, 9 parts are used in turn to train the model and the remaining 1 part to validate it, and the average accuracy of the 10 tests is used for evaluation. The higher the average accuracy, the better the individual. The optimal individual of the initial population obtained from this evaluation is used in the subsequent condition judgment and individual selection.
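The 10-fold fitness evaluation can be sketched independently of any particular SVM library. In the sketch below, `train_and_score` is a hypothetical callable supplied by the caller: in the setting described here it would fit an SVM with a candidate (C, gamma) on the training folds and return the accuracy on the held-out fold. The names `k_fold_indices` and `cross_val_fitness` are likewise illustrative.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_val_fitness(X, y, train_and_score, k=10, seed=0):
    """Average the validation score over k train/validate rounds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        scores.append(train_and_score(
            [X[i] for i in train], [y[i] for i in train],        # 9 parts: training
            [X[i] for i in held_out], [y[i] for i in held_out],  # 1 part: validation
        ))
    return sum(scores) / k
```

A DE individual is then scored by binding its genes into `train_and_score`, for example with a closure over the candidate C and gamma.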
(3) Termination check: judge whether the termination condition is met. The termination condition set here is 100 population iterations. (Other termination conditions can be set according to the actual situation, for example the calculated individual fitness value reaching a set threshold, but they are not considered in this study.) If the maximum number of iterations has been reached, evolution terminates and the optimal individual obtained is output as the optimal solution, which yields the optimal parameters C and gamma; otherwise, the next iteration of evolution continues.
(4) Individual mutation: after the initial evaluation, at the current iteration number (initially G = 0), the algorithm mutates each individual $x_i^G$ according to the differential strategy of formula (4) to obtain the corresponding variant individual $v_i^G$:
$v_i^G = x_{r_1}^G + F \cdot (x_{r_2}^G - x_{r_3}^G)$ (4)
where $i \neq r_1 \neq r_2 \neq r_3$; $x_{r_1}^G$, $x_{r_2}^G$ and $x_{r_3}^G$ are three random individuals at the current iteration number; and F is the differential scaling factor, which controls the influence of the difference vector of two individuals on the variant individual.
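The mutation step picks three distinct individuals other than the i-th one and adds the scaled difference of two of them to the third. A minimal sketch follows; the function name `mutate` is hypothetical, and the source does not describe how genes that leave the range [0.01, 100] are handled, so no bound handling is shown.

```python
import random

def mutate(population, i, F=0.5, rng=None):
    """Variant individual v_i = x_r1 + F * (x_r2 - x_r3), with i, r1, r2, r3 distinct."""
    rng = rng or random.Random(0)
    candidates = [j for j in range(len(population)) if j != i]
    r1, r2, r3 = rng.sample(candidates, 3)   # three distinct indices, none equal to i
    x1, x2, x3 = population[r1], population[r2], population[r3]
    return [a + F * (b - c) for a, b, c in zip(x1, x2, x3)]
```

The population needs at least four individuals for the three random picks to exist.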
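(5) Individual crossover: the current population individual $x_i^G$ and the variant individual $v_i^G$ then exchange information according to formula (5) to generate the crossover individual $u_i^G$:
$u_{ij}^G = \begin{cases} v_{ij}^G, & \text{if } \mathrm{rand}(0,1) \le CR \\ x_{ij}^G, & \text{otherwise} \end{cases}$ (5)
where $u_{ij}^G$, $v_{ij}^G$ and $x_{ij}^G$ are the j-th genes of the crossover individual, the variant individual and the current individual respectively, and CR is the crossover probability.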
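Gene-wise crossover with probability CR can be sketched as below. The function name `crossover` is hypothetical. Many DE variants additionally force at least one gene to come from the variant individual; the text here does not mention that rule, so the sketch omits it.

```python
import random

def crossover(target, variant, CR=0.3, rng=None):
    """Take each gene from the variant with probability CR, else from the target."""
    rng = rng or random.Random(0)
    return [v if rng.random() <= CR else t for t, v in zip(target, variant)]
```

With CR = 0.3, roughly 30% of the genes are expected to come from the variant individual.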
(6) Individual selection: based on a greedy strategy, the fitness values of the population individual $x_i^G$ at the current iteration number and the newly generated crossover individual $u_i^G$ are compared, and the individual with the better fitness is selected as the next-generation population individual $x_i^{G+1}$ according to formula (6), where f represents the fitness function:
$x_i^{G+1} = \begin{cases} u_i^G, & \text{if } f(u_i^G) \text{ is better than } f(x_i^G) \\ x_i^G, & \text{otherwise} \end{cases}$ (6)
Each individual of the original population evolves through the three stages of steps (4) to (6) to form a new generation of the population, and the iteration count is updated as G = G + 1. The process then returns to step (3) for a new round of judgment and iterative evolution until the preset termination condition is met, and the hyperparameter values corresponding to the optimal individual obtained in the final iteration are output.
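Steps (1) to (6) can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the patent: the function name `differential_evolution`, the default values of `F` and `CR`, the clipping of variant individuals to the search bounds, and the toy fitness function are all assumptions introduced here. In the setting described above, `fitness` would return the mean 10-fold cross-validation accuracy of an SVM for a candidate (C, gamma).

```python
import random


def differential_evolution(fitness, bounds, pop_size=50, max_iter=100,
                           F=0.5, CR=0.9, seed=0):
    """Maximise `fitness` over the box `bounds`; returns (best, best_fitness)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step (1): random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    # Step (2): evaluate every individual of the initial population.
    fit = [fitness(ind) for ind in pop]
    # Step (3): the termination condition is a fixed number of generations.
    for _ in range(max_iter):
        new_pop, new_fit = [], []
        for i in range(pop_size):
            # Step (4): mutation with three distinct random individuals != i.
            r1, r2, r3 = rng.sample([k for k in range(pop_size) if k != i], 3)
            mutant = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
                      for j in range(dim)]
            # Assumption: keep the variant individual inside the search box.
            mutant = [min(max(m, lo), hi)
                      for m, (lo, hi) in zip(mutant, bounds)]
            # Step (5): crossover, each gene comes from the variant
            # individual with probability CR, otherwise from the parent.
            trial = [mutant[j] if rng.random() <= CR else pop[i][j]
                     for j in range(dim)]
            # Step (6): greedy selection, higher fitness is better.
            f_trial = fitness(trial)
            if f_trial >= fit[i]:
                new_pop.append(trial)
                new_fit.append(f_trial)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit
    best = max(range(pop_size), key=lambda k: fit[k])
    return pop[best], fit[best]


# Toy stand-in for the cross-validation accuracy: peaks at C=3, gamma=1.
def toy_fitness(ind):
    c, gamma = ind
    return -((c - 3.0) ** 2 + (gamma - 1.0) ** 2)


best_params, best_score = differential_evolution(
    toy_fitness, bounds=[(0.0, 10.0), (0.0, 10.0)], pop_size=30)
```

Because selection is greedy, the best fitness in the population can never decrease from one generation to the next, which is why the individual returned after the last iteration can be reported as the optimum found.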
Step 105: train the SVM model with the determined hyperparameters on the expanded data sample set to obtain the trained SVM model.
The optimized parameters are substituted into the original SVM model, which is then verified on the test set, finally giving the trained SVM model.
Step 106: predict the spheroidization degree of the equipment material based on the trained SVM model.
Example 2
The present invention also provides an equipment material spheroidization degree prediction system, which comprises:
A data sample set construction module, used for acquiring equipment material parameters with known spheroidization degrees and constructing a data sample set.
A sample expansion module, used for oversampling the minority-class samples in the data sample set using the Borderline-SMOTE algorithm, so that the difference between the proportions of minority-class samples and majority-class samples in the data sample set is smaller than a preset threshold, to obtain the expanded data sample set.
The sample expansion module specifically comprises: a neighbor sample selection submodule, used for determining a plurality of neighbor samples of each minority-class sample according to the Euclidean distance between that minority-class sample and all other samples in the data sample set; a boundary judgment index determining submodule, used for determining the number of majority-class samples among the plurality of neighbor samples of each minority-class sample as the boundary judgment index of that sample; a boundary sample determining submodule, used for setting minority-class samples whose boundary judgment index is within a preset range as boundary samples; a minority-class sample generation submodule, used for generating new minority-class samples from each boundary sample using the formula x_new = x + λ × (x_i − x), where x is a boundary sample, x_i is the i-th neighbor of the boundary sample, x_new is the new minority-class sample, and λ is the sample generation coefficient; and a sample set expansion submodule, used for adding all generated new minority-class samples to the data sample set to obtain the expanded data sample set.
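The four submodules of the sample expansion module map onto a short routine. The sketch below is a hedged illustration, not the patented implementation: the function name `borderline_smote` and the parameters `m`, `k` and `n_new` are hypothetical, the "preset range" is assumed to be the common Borderline-SMOTE convention m/2 ≤ count < m, the interpolation partner x_i is assumed to be one of the k nearest minority-class neighbors, and λ is assumed to be drawn uniformly from [0, 1).

```python
import math
import random


def borderline_smote(samples, labels, minority, m=5, k=3, n_new=10, seed=0):
    """Return n_new synthetic minority-class samples built from boundary samples."""
    rng = random.Random(seed)
    minority_pts = [x for x, y in zip(samples, labels) if y == minority]
    boundary = []
    for idx, (x, y) in enumerate(zip(samples, labels)):
        if y != minority:
            continue
        # Neighbor selection: Euclidean distance to every other sample.
        others = [(math.dist(x, s), l)
                  for j, (s, l) in enumerate(zip(samples, labels)) if j != idx]
        others.sort(key=lambda t: t[0])
        # Boundary judgment index: majority-class count among m neighbors.
        n_major = sum(1 for _, l in others[:m] if l != minority)
        # Assumed preset range: at least half, but not all, are majority.
        if m / 2 <= n_major < m:
            boundary.append(x)
    synthetic = []
    if not boundary:
        return synthetic
    while len(synthetic) < n_new:
        x = rng.choice(boundary)
        neighbors = sorted((p for p in minority_pts if p is not x),
                           key=lambda p: math.dist(x, p))[:k]
        if not neighbors:
            break
        x_i = rng.choice(neighbors)
        lam = rng.random()  # sample generation coefficient λ
        # x_new = x + λ × (x_i − x)
        synthetic.append([a + lam * (b - a) for a, b in zip(x, x_i)])
    return synthetic
```

Every synthetic point lies on the segment between a boundary sample and one of its minority-class neighbors, so new samples are concentrated near the class boundary, where misclassification is most likely.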
The prediction system further comprises: a normalization module, used for normalizing each sample in the expanded data sample set.
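The text does not state which normalization is used. As one plausible choice, the sketch below applies per-feature min-max scaling to [0, 1]; the function name `min_max_normalize` and the handling of constant features are assumptions made for illustration.

```python
def min_max_normalize(samples):
    """Scale every feature (column) of `samples` to the range [0, 1]."""
    columns = list(zip(*samples))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # A constant feature carries no information; map it to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in samples
    ]


# Hypothetical rows of (working temperature in °C, hardness in HB).
scaled = min_max_normalize([[510, 94], [550, 168], [530, 131]])
```

Scaling matters for an SVM with a distance-based kernel, because features with large numeric ranges, such as running time in hours, would otherwise dominate features with small ranges, such as pressure in MPa.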
An SVM model construction module, used for constructing an SVM model for predicting the spheroidization degree of the equipment material.
A hyperparameter determination module of the SVM model, used for determining the hyperparameters of the SVM model using a differential evolution algorithm based on the expanded data sample set, to obtain the SVM model with determined hyperparameters.
The hyperparameter determination module of the SVM model specifically comprises: an initialization submodule, used for taking the parameters in the SVM model as genes, initializing a population, and setting the initialized population as the current population; an optimal individual selection submodule, used for determining the optimal individual of the current population based on the expanded data sample set, taking the 10-fold cross-validation result of the SVM model as the fitness index; a judgment submodule, used for judging whether the termination condition is met and obtaining a judgment result; a mutation submodule, used for, if the judgment result is no, mutating the individuals in the current population using the formula $v_i^{G} = x_{r_1}^{G} + F \cdot (x_{r_2}^{G} - x_{r_3}^{G})$ to obtain variant individuals, where $x_{r_1}^{G}$, $x_{r_2}^{G}$ and $x_{r_3}^{G}$ are three individuals in the current population, $v_i^{G}$ represents the i-th variant individual, and F represents the differential scaling factor; a crossover submodule, used for performing crossover on each variant individual to obtain crossover individuals, using the formula $u_{i,j}^{G} = v_{i,j}^{G}$ with probability CR and $u_{i,j}^{G} = x_{i,j}^{G}$ otherwise, where $u_{i,j}^{G}$ represents the j-th gene of the crossover individual of the i-th individual in the current population, $v_{i,j}^{G}$ represents the j-th gene of the variant individual of the i-th individual in the current population, $x_{i,j}^{G}$ represents the j-th gene of the i-th individual in the current population, and CR represents the crossover probability; an individual selection submodule, used for selecting the individuals of the next-generation population from the crossover individual of each individual, using the formula $x_i^{G+1} = u_i^{G}$ if $f(u_i^{G}) \ge f(x_i^{G})$ and $x_i^{G+1} = x_i^{G}$ otherwise, where $x_i^{G+1}$ represents the i-th individual in the next-generation population, $u_i^{G}$ represents the crossover individual of the i-th individual in the current population, $x_i^{G}$ represents the i-th individual in the current population, and $f(x_i^{G})$ and $f(u_i^{G})$ respectively represent the fitness indexes of the i-th individual in the current population and of its crossover individual; an iteration calling submodule, used for setting the next-generation population as the current population and calling the optimal individual selection submodule; and an output submodule, used for outputting the optimal individual of the current population if the judgment result is yes.
An SVM model training module, used for training the SVM model with the determined hyperparameters on the expanded data sample set to obtain the trained SVM model.
A spheroidization degree prediction module, used for predicting the spheroidization degree of the equipment material based on the trained SVM model.
Example 3
To illustrate the technical effects of the present invention, this embodiment provides the following specific example.
Taking pearlite spheroidization of 12Cr1MoV steel as an example, historical inspection data of 12Cr1MoV steel components from various power plants were collected and, according to the severity of pearlite spheroidization and its influence on the equipment, a total of 115 groups of sample data with spheroidization grades of 4 to 5 were selected for verification. The data parameters comprise the accumulated running time of the equipment (8.9×10^4 to 31.2×10^4 h), the working pressure (8.83 to 9.82 MPa), the working temperature (510 to 550 °C), the material hardness (94 to 168 HB) and the spheroidization grade (grades 4, 4.5 and 5).
Based on relevant standards such as DL/T 773-2016, the spheroidization rating standard for 12Cr1MoV steel used in thermal power plants, and GB/T 30580-2014, the technical guideline for life assessment of main pressure-bearing parts of power station boilers, the risk evaluation criteria shown in Table 1 are obtained.
TABLE 1 Material spheroidization risk evaluation and classification
Figure BDA0003401014780000131
Three indexes, the confusion matrix, the accuracy (ACC) and the Kappa coefficient, are mainly used to evaluate the prediction results of the model. The confusion matrix clearly shows how the model classifies the data of each label and also reveals any imbalance in the data set. The accuracy directly reflects the proportion of data samples that are correctly classified. The Kappa coefficient characterizes the agreement between the model predictions and the actual values; it takes values in [-1, 1], and in general a higher value indicates better classification performance.
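The three indexes can be computed directly from the true and predicted grades. The sketch below uses the standard definition of Cohen's Kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement (the accuracy) and p_e is the agreement expected by chance from the row and column totals of the confusion matrix. The function names and the sample labels are illustrative assumptions, not data from the patent.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def cohen_kappa(matrix):
    """Agreement between predictions and actual values, corrected for chance."""
    n = sum(sum(row) for row in matrix)
    p_o = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical grades, only to show the calls.
grades = ["4", "4.5", "5"]
actual = ["4", "4", "4.5", "4.5", "5", "5"]
predicted = ["4", "4", "4.5", "5", "5", "5"]
cm = confusion_matrix(actual, predicted, grades)
acc, kappa = accuracy(cm), cohen_kappa(cm)
```

Unlike the accuracy, the Kappa coefficient penalizes a model that scores well merely by favoring the largest class, which is why it is informative on an unbalanced data set.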
The imbalance-adjustment preprocessing was carried out according to the statistical distribution of the spheroidization grades in the original data samples; the result is shown in fig. 3. The ratio of spheroidization grades in the original data set (grade 4.0 : grade 4.5 : grade 5.0) is about 1:2.5:1, and after adjustment the classes are evenly distributed.
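How far a data set is from balance, and how many synthetic samples each class needs, can be checked with a small helper. The sketch below is illustrative only: the function names are hypothetical, and the label counts are made-up numbers in a 1:2.5:1 ratio rather than the 115 samples of the study.

```python
from collections import Counter


def proportion_gap(labels):
    """Difference between the largest and smallest class proportions."""
    counts = Counter(labels)
    shares = [c / len(labels) for c in counts.values()]
    return max(shares) - min(shares)


def samples_needed(labels):
    """Synthetic samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - c for cls, c in counts.items()}


# Made-up counts in a 1:2.5:1 ratio.
labels = ["4"] * 2 + ["4.5"] * 5 + ["5"] * 2
gap = proportion_gap(labels)
needed = samples_needed(labels)
```

Oversampling can stop once the proportion gap falls below the preset threshold; a gap of zero corresponds to a fully balanced distribution.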
To verify the influence of the unbalanced data set on the model prediction results, the predictions of the original SVM model before and after oversampling with the Borderline-SMOTE algorithm are compared; the results are shown in fig. 4 and table 2.
TABLE 2 SVM model evaluation results before and after Borderline-SMOTE balancing (without the DE algorithm)
Figure BDA0003401014780000141
FIG. 4 shows that after the data samples are adjusted, samples of every class can be effectively identified. Table 2 shows that after data balancing, both the accuracy and the Kappa coefficient improve significantly, with the Kappa coefficient roughly doubling, which indicates the effectiveness of Borderline-SMOTE oversampling for identifying the data samples. The accuracy of the model nevertheless still needs to be improved.
To further improve the classification prediction accuracy of the model, the DE algorithm is applied to optimize the two important hyperparameters of the SVM model, C and gamma, and is compared with common intelligent optimization algorithms such as the Genetic Algorithm (GA), Particle Swarm Optimization (PSO) and the Artificial Fish Swarm Algorithm (AFSA). All optimization algorithms use the mean accuracy of the model under ten-fold cross-validation as the optimization evaluation index, with a population size of 50 and 100 iterations. The resulting prediction results are shown in fig. 5 and table 3.
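The shared evaluation index, the mean accuracy over ten folds, can be sketched independently of any particular SVM library. In the sketch below, `train_and_score` is a hypothetical callable supplied by the caller: it is assumed to fit a model with the candidate C and gamma on the training part and return its accuracy on the held-out part. The function names and the shuffling seed are likewise assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(train_and_score, X, y, k=10, seed=0):
    """Mean held-out accuracy over k folds (assumes len(X) >= k)."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        scores.append(train_and_score(
            [X[i] for i in train], [y[i] for i in train],
            [X[i] for i in held_out], [y[i] for i in held_out]))
    return sum(scores) / k


# Stand-in scorer: always predicts the most frequent training label.
def majority_scorer(X_train, y_train, X_val, y_val):
    guess = max(set(y_train), key=y_train.count)
    return sum(1 for label in y_val if label == guess) / len(y_val)


mean_acc = cross_val_accuracy(majority_scorer,
                              [[v] for v in range(23)], ["4.5"] * 23)
```

Using the same index for every optimizer keeps the comparison fair: each candidate (C, gamma) is judged on data that the corresponding model did not see during training.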
TABLE 3 SVM model evaluation results after parameter adjustment for each optimization algorithm
Figure BDA0003401014780000142
Figure BDA0003401014780000151
Compared with the results in Table 2, the prediction accuracy of every model improves markedly after parameter tuning by the various optimization algorithms (Table 3), with accuracy reaching more than 70%. The DE-SVM model performs best: the accuracy on the training set and the test set improves by about 26% and 29% respectively, and the Kappa coefficient reaches 0.756, showing a good level of classification prediction.
In summary, this technique is based on the SVM algorithm: the Borderline-SMOTE algorithm is applied to solve the sample imbalance of the spheroidization grade data set, and the DE intelligent optimization algorithm is used to tune the parameters. The resulting DE-SVM model can effectively predict the spheroidization grade of the material with an accuracy of up to 83.8%, showing good spheroidization grade recognition capability, and can provide technical support for the rapid diagnosis and prediction of spheroidization damage in equipment.
The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be referred to one another. The system disclosed in the embodiment corresponds to the method disclosed in the embodiment, so its description is brief, and the relevant points can be found in the description of the method.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (8)

1. A method for predicting the spheroidization degree of equipment materials is characterized by comprising the following steps:
acquiring equipment material parameters with known spheroidization degrees, and constructing a data sample set;
oversampling the minority-class samples in the data sample set using the Borderline-SMOTE algorithm, so that the difference between the proportions of minority-class samples and majority-class samples in the data sample set is smaller than a preset threshold, to obtain an expanded data sample set;
constructing an SVM model for predicting the spheroidization degree of the equipment material;
determining the hyperparameters of the SVM model by adopting a differential evolution algorithm based on the expanded data sample set, and obtaining the SVM model with the hyperparameters determined;
training the SVM model with the determined hyper-parameters based on the expanded data sample set to obtain a trained SVM model;
and predicting the spheroidization degree of the equipment material based on the trained SVM model.
2. The method for predicting the spheroidization degree of equipment materials according to claim 1, wherein oversampling the minority-class samples in the data sample set using the Borderline-SMOTE algorithm, so that the difference between the proportions of minority-class samples and majority-class samples in the data sample set is smaller than a preset threshold, to obtain an expanded data sample set, specifically comprises:
determining a plurality of neighbor samples of each minority-class sample according to the Euclidean distance between that minority-class sample and all other samples in the data sample set;
determining the number of majority-class samples among the plurality of neighbor samples of each minority-class sample, as the boundary judgment index of that minority-class sample;
setting minority-class samples whose boundary judgment index is within a preset range as boundary samples;
generating new minority-class samples from each boundary sample using the formula x_new = x + λ × (x_i − x);
where x is a boundary sample, x_i is the i-th neighbor of the boundary sample, x_new is the new minority-class sample, and λ represents the sample generation coefficient;
and adding all generated new minority-class samples to the data sample set to obtain the expanded data sample set.
3. The method for predicting the spheroidization degree of equipment materials according to claim 1 or 2, wherein after oversampling the minority-class samples in the data sample set using the Borderline-SMOTE algorithm, so that the difference between the proportions of minority-class samples and majority-class samples in the data sample set is smaller than a preset threshold, to obtain an expanded data sample set, the method further comprises:
normalizing each sample in the expanded data sample set.
4. The method for predicting the spheroidization degree of the equipment material according to claim 1, wherein the step of determining the hyperparameter of the SVM model by adopting a differential evolution algorithm based on the expanded data sample set to obtain the SVM model with the determined hyperparameter specifically comprises the steps of:
taking parameters in the SVM model as genes, initializing a population, and setting the initialized population as a current population;
determining the optimal individual of the current population based on the expanded data sample set by taking a 10-fold cross validation result of the SVM model as a fitness index;
judging whether a termination condition is met or not, and obtaining a judgment result;
if the judgment result is no, mutating the individuals in the current population using the formula
$$v_i^{G} = x_{r_1}^{G} + F \cdot \left(x_{r_2}^{G} - x_{r_3}^{G}\right)$$
to obtain variant individuals; where $x_{r_1}^{G}$, $x_{r_2}^{G}$ and $x_{r_3}^{G}$ are three individuals in the current population, $v_i^{G}$ represents the i-th variant individual, and F represents the differential scaling factor;
performing crossover on each variant individual using the formula
$$u_{i,j}^{G} = \begin{cases} v_{i,j}^{G}, & \text{with probability } CR \\ x_{i,j}^{G}, & \text{otherwise} \end{cases}$$
to obtain crossover individuals; where $u_{i,j}^{G}$ represents the j-th gene of the crossover individual of the i-th individual in the current population, $v_{i,j}^{G}$ represents the j-th gene of the variant individual of the i-th individual in the current population, $x_{i,j}^{G}$ represents the j-th gene of the i-th individual in the current population, and CR represents the crossover probability;
selecting the individuals of the next-generation population from the crossover individual of each individual using the formula
$$x_i^{G+1} = \begin{cases} u_i^{G}, & \text{if } f\left(u_i^{G}\right) \ge f\left(x_i^{G}\right) \\ x_i^{G}, & \text{otherwise} \end{cases}$$
where $x_i^{G+1}$ represents the i-th individual in the next-generation population, $u_i^{G}$ represents the crossover individual of the i-th individual in the current population, $x_i^{G}$ represents the i-th individual in the current population, and $f(x_i^{G})$ and $f(u_i^{G})$ respectively represent the fitness indexes of the i-th individual in the current population and of its crossover individual;
setting the next-generation population as the current population, and returning to the step of determining the optimal individual of the current population based on the expanded data sample set, taking the 10-fold cross-validation result of the SVM model as the fitness index;
and if the judgment result is yes, outputting the optimal individual of the current population.
5. A system for predicting a degree of spheroidization of an apparatus material, the system comprising:
a data sample set construction module, used for acquiring equipment material parameters with known spheroidization degrees and constructing a data sample set;
a sample expansion module, used for oversampling the minority-class samples in the data sample set using the Borderline-SMOTE algorithm, so that the difference between the proportions of minority-class samples and majority-class samples in the data sample set is smaller than a preset threshold, to obtain an expanded data sample set;
an SVM model construction module, used for constructing an SVM model for predicting the spheroidization degree of the equipment material;
a hyperparameter determination module of the SVM model, used for determining the hyperparameters of the SVM model using a differential evolution algorithm based on the expanded data sample set, to obtain the SVM model with determined hyperparameters;
the SVM model training module is used for training the SVM model with the determined hyper-parameters based on the expanded data sample set to obtain a trained SVM model;
and the spheroidization degree prediction module is used for predicting the spheroidization degree of the equipment material based on the trained SVM model.
6. The system for predicting spheroidization degree of equipment materials according to claim 5, wherein the sample expansion module specifically comprises:
a neighbor sample selection submodule, used for determining a plurality of neighbor samples of each minority-class sample according to the Euclidean distance between that minority-class sample and all other samples in the data sample set;
a boundary judgment index determining submodule, used for determining the number of majority-class samples among the plurality of neighbor samples of each minority-class sample, as the boundary judgment index of that minority-class sample;
a boundary sample determining submodule, used for setting minority-class samples whose boundary judgment index is within a preset range as boundary samples;
a minority-class sample generation submodule, used for generating new minority-class samples from each boundary sample using the formula x_new = x + λ × (x_i − x); where x is a boundary sample, x_i is the i-th neighbor of the boundary sample, x_new is the new minority-class sample, and λ represents the sample generation coefficient;
and a sample set expansion submodule, used for adding all generated new minority-class samples to the data sample set to obtain the expanded data sample set.
7. The system for predicting the spheroidization degree of equipment materials according to claim 5 or 6, wherein the prediction system further comprises:
a normalization module, used for normalizing each sample in the expanded data sample set.
8. The system for predicting the spheroidization degree of the equipment material according to claim 5, wherein the hyper-parameter determining module of the SVM model specifically comprises:
the initialization submodule is used for taking parameters in the SVM model as genes, initializing a population and setting the initialized population as a current population;
the optimal individual selection submodule is used for determining the optimal individual of the current population based on the expanded data sample set by taking the 10-fold cross validation result of the SVM model as a fitness index;
the judgment submodule is used for judging whether the termination condition is met or not and obtaining a judgment result;
a mutation submodule, used for, if the judgment result is no, mutating the individuals in the current population using the formula
$$v_i^{G} = x_{r_1}^{G} + F \cdot \left(x_{r_2}^{G} - x_{r_3}^{G}\right)$$
to obtain variant individuals; where $x_{r_1}^{G}$, $x_{r_2}^{G}$ and $x_{r_3}^{G}$ are three individuals in the current population, $v_i^{G}$ represents the i-th variant individual, and F represents the differential scaling factor;
a crossover submodule, used for performing crossover on each variant individual using the formula
$$u_{i,j}^{G} = \begin{cases} v_{i,j}^{G}, & \text{with probability } CR \\ x_{i,j}^{G}, & \text{otherwise} \end{cases}$$
to obtain crossover individuals; where $u_{i,j}^{G}$ represents the j-th gene of the crossover individual of the i-th individual in the current population, $v_{i,j}^{G}$ represents the j-th gene of the variant individual of the i-th individual in the current population, $x_{i,j}^{G}$ represents the j-th gene of the i-th individual in the current population, and CR represents the crossover probability;
an individual selection submodule, used for selecting the individuals of the next-generation population from the crossover individual of each individual using the formula
$$x_i^{G+1} = \begin{cases} u_i^{G}, & \text{if } f\left(u_i^{G}\right) \ge f\left(x_i^{G}\right) \\ x_i^{G}, & \text{otherwise} \end{cases}$$
where $x_i^{G+1}$ represents the i-th individual in the next-generation population, $u_i^{G}$ represents the crossover individual of the i-th individual in the current population, $x_i^{G}$ represents the i-th individual in the current population, and $f(x_i^{G})$ and $f(u_i^{G})$ respectively represent the fitness indexes of the i-th individual in the current population and of its crossover individual;
an iteration calling submodule, used for setting the next-generation population as the current population and calling the optimal individual selection submodule;
and an output submodule, used for outputting the optimal individual of the current population if the judgment result is yes.
CN202111496811.1A 2021-12-09 2021-12-09 Equipment material spheroidization degree prediction method and system Active CN114187977B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111496811.1A CN114187977B (en) 2021-12-09 2021-12-09 Equipment material spheroidization degree prediction method and system


Publications (2)

Publication Number Publication Date
CN114187977A true CN114187977A (en) 2022-03-15
CN114187977B CN114187977B (en) 2024-07-12

Family

ID=80542880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111496811.1A Active CN114187977B (en) 2021-12-09 2021-12-09 Equipment material spheroidization degree prediction method and system

Country Status (1)

Country Link
CN (1) CN114187977B (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant