CN114970709B - Improved GA-based data-driven AHU multi-fault diagnosis feature selection method - Google Patents
- Publication number
- CN114970709B (application CN202210555420.0A)
- Authority
- CN
- China
- Prior art keywords
- sample data
- population
- feature
- fault diagnosis
- ahu
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
- G06F18/2111—Selection of the most significant subset of features by using evolutionary computational techniques, e.g. genetic algorithms
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F24—HEATING; RANGES; VENTILATING
- F24F—AIR-CONDITIONING; AIR-HUMIDIFICATION; VENTILATION; USE OF AIR CURRENTS FOR SCREENING
- F24F11/00—Control or safety arrangements
- F24F11/30—Control or safety arrangements for purposes related to the operation of the system, e.g. for safety or monitoring
- F24F11/32—Responding to malfunctions or emergencies
- F24F11/38—Failure diagnosis
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F24—HEATING; RANGES; VENTILATING
- F24F—AIR-CONDITIONING; AIR-HUMIDIFICATION; VENTILATION; USE OF AIR CURRENTS FOR SCREENING
- F24F11/00—Control or safety arrangements
- F24F11/89—Arrangement or mounting of control or safety devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
Abstract
The invention relates to the technical field of feature selection for AHU fault diagnosis sample data and discloses a data-driven AHU multi-fault diagnosis feature selection method based on an improved GA. The method comprises the following steps: determining faults, acquiring sample data, preprocessing the data, removing redundant features, initializing the improved GA parameters, heuristically initializing the improved GA population, performing feature selection on the sample data, establishing and training a fault diagnosis model, and obtaining an optimal feature subset. The method can accurately select the feature elements that are highly correlated with a given fault, effectively reduce the dimensionality of the sample data, reduce the amount of calculation, and improve the accuracy of AHU fault diagnosis.
Description
Technical Field
The invention relates to the technical field of feature selection of AHU fault diagnosis sample data, in particular to a data-driven AHU multi-fault diagnosis feature selection method based on improved GA.
Background
An air handling unit (AHU) is a core component of a heating, ventilation and air conditioning (HVAC) system and plays a crucial role in keeping the HVAC system working normally. Because it operates for long periods under heavy load, it is also the subsystem most prone to failure in HVAC, so its faults need to be diagnosed in order to improve the safety and reliability of the HVAC system and to facilitate service and maintenance. In recent years, data-driven AHU fault diagnosis methods, which establish a fault diagnosis model by learning valuable rules from large amounts of historical data, have gradually become a research hotspot because they achieve higher accuracy and depend little on expert knowledge or on a mathematical model of the AHU.
Because an AHU system is large and complex and has many parameters, selecting effective system parameters for constructing the fault diagnosis model is both the key problem and the main difficulty of data-driven AHU fault diagnosis. If useful sample data cannot be acquired, an accurate AHU fault diagnosis model is difficult to establish; and if the feature elements selected from the sample data are too few, too many, or weakly correlated with the fault, the diagnosis accuracy of the resulting model is low. In addition, too many feature elements lead to the curse of dimensionality, so a method is needed that can effectively select the feature elements suited to diagnosing a given fault.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a data-driven AHU multi-fault diagnosis feature selection method based on an improved GA, which can accurately select the feature elements that are highly correlated with a given fault, effectively reduce the dimensionality of the sample data, reduce the amount of calculation, and improve the accuracy of AHU fault diagnosis.
In order to achieve the above object, the data-driven AHU multi-fault diagnosis feature selection method based on an improved GA according to the present invention includes the steps of:
A) Determining faults: determine the faults in the AHU for which a fault diagnosis model needs to be established;
B) Sample data acquisition: first acquire sample data under normal operation, then artificially apply faults to the AHU and acquire sample data under each fault, to obtain the sample data;
C) Data preprocessing: label the sample data and normalize it;
D) Removing redundant features: remove redundant features from the sample data using the Pearson correlation coefficient method;
E) Improved GA parameter initialization: initialize the population size, the maximum number of iterations, the mutation probability, the crossover probability, the expected feature number, and the variance of the normal distribution obeyed by the feature number;
F) Heuristic improved GA population initialization: using binary coding, generate a population in which the number of features in each feature subset follows a normal distribution whose mean is the expected feature number and whose variance is the variance set in step E);
G) Feature selection on the sample data: decode the individuals in the population and perform feature selection on the sample data using the resulting feature subsets, to obtain the sample data after feature selection;
H) Establishing and training a fault diagnosis model: randomly assign part of the sample data to a training set and input it into a classifier for training to obtain a fault diagnosis model; use the remaining sample data as a test set to test the fault diagnosis model and output a test result comprising the accuracy rate, the false alarm rate and the missed alarm rate; return the test result to the fitness function of the improved GA and calculate the fitness value of each individual; perform selection, crossover and mutation according to the individual fitness values, with the crossover probability and mutation probability adaptively adjusted according to the individual fitness values and the convergence speed;
L) Obtaining the optimal feature subset: after updating the population, return to step G) until the iteration termination condition is met; the feature subset obtained by decoding the best individual in the last-generation population is the optimal feature subset.
Preferably, in step A), the system parameters that become abnormal after an AHU fault occurs are selected and defined as the potential feature elements $\{x_1, x_2, \ldots, x_n\}$ for fault diagnosis; sensors of the corresponding types are arranged at the positions corresponding to these system parameters, and their readings are transmitted to a computer through a data acquisition module.
Preferably, in step B), each sample of each fault type is denoted $X_i$, where $i$ is the serial number of the sample, $X_i = (x_1, x_2, \ldots, x_n)^T$, and $x_1, x_2, \ldots, x_n$ are the feature elements used for fault diagnosis. In step C), the actual label $l_c$ of the fault class is added to each sample, giving $X_i = (x_1, x_2, \ldots, x_n, l_c)^T$ s.t. $l_c \in \{1, 2, \ldots, k\}$, where $k$ is the total number of fault classes, and all sample data are then normalized to the (0, 1) interval. In step D), redundant features are removed using the Pearson correlation coefficient method: the Pearson correlation coefficient $\rho$ between any two feature elements is calculated, and if its absolute value is greater than or equal to a set threshold $T_\rho$, either one of the two feature elements is removed. The mathematical expression of the Pearson correlation coefficient $\rho$ is as follows:
$$\rho = \frac{E(x_a x_b) - E(x_a)\,E(x_b)}{\sqrt{E(x_a^2) - E^2(x_a)}\,\sqrt{E(x_b^2) - E^2(x_b)}}$$
where $x_a$ and $x_b$ represent any two feature elements and $E(\cdot)$ denotes the expectation. After the redundant features are removed, the sample data become $Y_i = (y_1, y_2, \ldots, y_p, l_c)^T$, and the number of removed features is $(n - p)$.
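The Pearson-based redundancy removal of step D) can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the patented implementation: the function names `pearson` and `drop_redundant` are hypothetical, the data are stored column-wise, and because the text only says that "any one" of two correlated features is removed, the sketch arbitrarily keeps the earlier feature and drops the later one.

```python
import math

def pearson(xs, ys):
    """Pearson correlation in expectation form; returns 0.0 for a constant column."""
    n = len(xs)
    ex, ey = sum(xs) / n, sum(ys) / n
    exy = sum(x * y for x, y in zip(xs, ys)) / n
    var_x = sum(x * x for x in xs) / n - ex * ex
    var_y = sum(y * y for y in ys) / n - ey * ey
    if var_x <= 1e-12 or var_y <= 1e-12:
        return 0.0  # assumption: a constant feature is treated as uncorrelated
    return (exy - ex * ey) / math.sqrt(var_x * var_y)

def drop_redundant(columns, threshold):
    """Return the indices of the features that are kept.

    columns: one list of values per feature (column-wise data).
    A feature is dropped when |rho| >= threshold against any kept feature.
    Assumption: of two correlated features, the later one is removed.
    """
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[i])) < threshold for i in kept):
            kept.append(j)
    return kept

# Hypothetical normalized data: feature 1 is a scaled copy of feature 0.
features = [
    [0.1, 0.2, 0.3, 0.4],
    [0.2, 0.4, 0.6, 0.8],
    [0.4, 0.1, 0.3, 0.2],
]
kept_indices = drop_redundant(features, threshold=0.9)
```

Here `len(features) - len(kept_indices)` corresponds to the number of removed features $(n - p)$.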
Preferably, in step E), the population size $P = 50$, the maximum number of iterations $T_i = 100$, the mutation probability $P_m^0 = 0.1$, the crossover probability $P_c^0 = 0.8$, the expected feature number $N_g = n/2$, where $n$ is the total number of feature elements in the sample data, and the variance $\sigma^2 = 1$.
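The heuristic population initialization of step F) with these parameters might look like the sketch below. The function name `init_population`, the `rng` argument, and the clamping of the sampled feature count to the range 1 to the number of features are assumptions made for illustration; the text does not say how out-of-range draws are handled.

```python
import random

def init_population(pop_size, n_features, expected_count, variance, rng):
    """Binary-coded population whose per-individual feature count is drawn
    from a normal distribution N(expected_count, variance)."""
    sigma = variance ** 0.5
    population = []
    for _ in range(pop_size):
        # Draw the number of selected features, then round to an integer.
        count = round(rng.gauss(expected_count, sigma))
        # Assumption: clamp so every individual selects between 1 and n_features.
        count = max(1, min(n_features, count))
        # Choose which gene positions are set to 1.
        ones = set(rng.sample(range(n_features), count))
        population.append([1 if j in ones else 0 for j in range(n_features)])
    return population

# Hypothetical example with 20 candidate features and the preferred parameters.
n = 20
population = init_population(pop_size=50, n_features=n,
                             expected_count=n / 2, variance=1.0,
                             rng=random.Random(0))
```

Compared with uniformly random bit strings, this concentrates the initial individuals around the expected subset size, which is the stated purpose of the heuristic initialization.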
Preferably, in step G), the individuals in the population are decoded and the corresponding feature subset is selected, where 1 represents selection and 0 represents rejection; the sample data after feature selection are $Z_i = (z_1, z_2, \ldots, z_q, l_c)^T$, where $q$ is the number of 1s in the chromosome coding of the individual.
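Decoding a chromosome and applying it to a sample can be sketched as follows. The helper names `decode` and `select_features` are hypothetical, and the sketch assumes that each sample is stored as a list of feature values followed by its label as the last element.

```python
def decode(chromosome):
    """Return the indices of the selected features (genes equal to 1)."""
    return [j for j, gene in enumerate(chromosome) if gene == 1]

def select_features(sample, chromosome):
    """Keep only the selected feature values and re-attach the label.

    Assumption: sample = [feature_1, ..., feature_p, label].
    """
    *features, label = sample
    return [features[j] for j in decode(chromosome)] + [label]

# Hypothetical sample with four features and fault label 2.
reduced = select_features([0.1, 0.2, 0.3, 0.4, 2], [1, 0, 0, 1])
```

The length of `decode(chromosome)` is the quantity $q$ in the text.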
Preferably, in step H), the training set is 60% to 80% of the sample data and the rest is the test set, and the classifier is one of an SVM, an ANN, a decision tree or a random forest.
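The random split into training and test sets could be done as in the sketch below. The function name `split_samples` and the `rng` argument are hypothetical, and the classifier itself is deliberately left out, since any of the listed classifier types could be trained on the resulting training set.

```python
import random

def split_samples(samples, train_fraction, rng):
    """Randomly split samples into (training set, test set)."""
    # The preferred range stated in the text is 60% to 80% for training.
    if not 0.6 <= train_fraction <= 0.8:
        raise ValueError("train_fraction should lie between 0.6 and 0.8")
    shuffled = list(samples)       # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical example: ten samples, 70% used for training.
train_set, test_set = split_samples(list(range(10)), 0.7, random.Random(1))
```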
Preferably, in step H), the accuracy rate $ac$, the false alarm rate $wb$ and the missed alarm rate $lb$ are calculated from the following quantities: $N_{test}$, the total number of samples in the test set; the number of correctly predicted samples in the test set, i.e. samples whose predicted label matches the actual label $l_c$; the number of samples for which both the predicted label and the actual label $l_c$ are faulty; the number of samples for which both the predicted label and the actual label $l_c$ are normal; the number of samples for which the predicted label is faulty but the actual label $l_c$ is normal; and the number of samples for which the predicted label is normal but the actual label $l_c$ is faulty.
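The formulas for the three rates are not reproduced in this text, so the sketch below assumes the conventional definitions: accuracy is the share of matching labels, the false alarm rate is the share of actually normal samples predicted as faulty, and the missed alarm rate is the share of actually faulty samples predicted as normal. The function name `diagnosis_metrics` and the `normal_label` argument are hypothetical.

```python
def diagnosis_metrics(actual, predicted, normal_label):
    """Return (ac, wb, lb) under conventional definitions (an assumption)."""
    pairs = list(zip(actual, predicted))
    n_test = len(pairs)
    n_correct = sum(a == p for a, p in pairs)
    # Counts named after the four cases described in the text.
    both_faulty = sum(a != normal_label and p != normal_label for a, p in pairs)
    both_normal = sum(a == normal_label and p == normal_label for a, p in pairs)
    false_alarm = sum(a == normal_label and p != normal_label for a, p in pairs)
    missed = sum(a != normal_label and p == normal_label for a, p in pairs)

    ac = n_correct / n_test
    # Assumed: false alarms relative to all actually normal samples.
    n_normal = false_alarm + both_normal
    wb = false_alarm / n_normal if n_normal else 0.0
    # Assumed: missed faults relative to all actually faulty samples.
    n_faulty = missed + both_faulty
    lb = missed / n_faulty if n_faulty else 0.0
    return ac, wb, lb

# Hypothetical labels: 0 is normal, 1 and 2 are fault classes.
ac, wb, lb = diagnosis_metrics([0, 0, 1, 2, 1, 0], [0, 1, 1, 1, 0, 0], normal_label=0)
```

Note that a sample whose fault class is confused with another fault class counts as "both faulty" here but still lowers the accuracy.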
Preferably, in the step H), the fitness function of the improved GA is:
$$f(v_h) = \lambda \cdot ac + \mu \cdot wb + \varepsilon \cdot lb \quad \text{s.t.}\ \lambda + \mu + \varepsilon = 1,\ h \in \{1, 2, \ldots, P\}$$
where $v_h$ is an individual in the population, $P$ is the population size, $\lambda$ is the weight coefficient of the accuracy rate $ac$, $\mu$ is the weight coefficient of the false alarm rate $wb$, and $\varepsilon$ is the weight coefficient of the missed alarm rate $lb$.
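The fitness function is a weighted sum and can be written down directly. The sketch below implements the formula exactly as stated, including the constraint that the weights sum to 1; the function name `fitness` and the example weights are hypothetical. How the weights are chosen so that lower error rates are rewarded is not stated in this text.

```python
def fitness(ac, wb, lb, lam, mu, eps):
    """f(v_h) = lam*ac + mu*wb + eps*lb, with lam + mu + eps = 1."""
    if abs(lam + mu + eps - 1.0) > 1e-9:
        raise ValueError("weights must satisfy lam + mu + eps = 1")
    return lam * ac + mu * wb + eps * lb

# Hypothetical test result and weights.
value = fitness(ac=0.9, wb=0.1, lb=0.2, lam=0.6, mu=0.2, eps=0.2)
```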
Preferably, in the step H), the crossover probability and the mutation probability are adaptively adjusted according to the following strategies:
where $T$ is the current iteration number, i.e. the generation of the population; $p_c$ and $p_m$ are respectively the crossover probability and mutation probability of individual $v_h$ in the $T$-th generation population; $\xi$ is the adjustment coefficient of the crossover probability; the crossover probability and the mutation probability each have an upper limit value; $f_{max}$, $f_{min}$ and $f_{avg}$ are respectively the maximum, minimum and average fitness values in the $T$-th generation population; the fitness value of individual $v_h$ in the $T$-th generation population also enters the strategy; and $f_g$ is the desired optimal fitness value. The strategy further uses the second-order backward difference of the maximum fitness over the iterative process, which is calculated as follows:
where $f_{max}(0)$ is the maximum fitness value in the initial population.
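The adjustment formulas of the invention are not reproduced in this text, so they cannot be coded here. To illustrate the general idea of fitness-dependent crossover and mutation probabilities, the sketch below uses a different, well-known scheme instead: the adaptive genetic algorithm of Srinivas and Patnaik (1994). It does not include the adjustment coefficient $\xi$, the desired fitness $f_g$ or the second-order backward difference, and the function name and default constants `k1` to `k4` are assumptions.

```python
def adaptive_probabilities(f_parent, f_individual, f_max, f_avg,
                           k1=1.0, k2=0.5, k3=1.0, k4=0.5):
    """Srinivas-Patnaik style adaptive p_c and p_m (a swapped-in technique).

    f_parent: larger fitness of the two individuals to be crossed.
    f_individual: fitness of the individual to be mutated.
    """
    spread = f_max - f_avg
    if spread <= 0:
        # Population has converged to a single fitness value: use the
        # maximum rates to reintroduce diversity.
        return k3, k4
    # Above-average individuals get lower rates, so good solutions are protected.
    p_c = k1 * (f_max - f_parent) / spread if f_parent >= f_avg else k3
    p_m = k2 * (f_max - f_individual) / spread if f_individual >= f_avg else k4
    return p_c, p_m

# Hypothetical generation statistics.
p_c, p_m = adaptive_probabilities(f_parent=0.8, f_individual=0.8,
                                  f_max=0.9, f_avg=0.7)
```

In this scheme the best individual receives probabilities of zero, while below-average individuals are crossed and mutated at the highest rates.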
Preferably, in step L), the iteration termination condition is: the iteration number $T$ of the population equals the maximum number of iterations $T_i$, or the second-order backward difference of the maximum fitness of the population is less than or equal to a set threshold, or the maximum fitness value $f_{max}$ of the population is greater than or equal to the desired optimal fitness value $f_g$.
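The three-part termination test can be sketched as below. Because the formula for the second-order backward difference is not reproduced in this text, the sketch assumes the standard definition $f_{max}(T) - 2 f_{max}(T-1) + f_{max}(T-2)$, compares its absolute value with the threshold, and skips the check until three values are available. The function names and the `history` list are hypothetical.

```python
def second_order_backward_difference(history):
    """history[t] is the maximum fitness of generation t (history[0] is the
    initial population). Returns None until three values are available."""
    if len(history) < 3:
        return None
    # Assumed standard definition: f(T) - 2*f(T-1) + f(T-2).
    return history[-1] - 2 * history[-2] + history[-3]

def should_stop(history, max_iterations, diff_threshold, target_fitness):
    """True if any of the three termination conditions holds."""
    generation = len(history) - 1
    if generation >= max_iterations:
        return True                      # iteration budget exhausted
    if history[-1] >= target_fitness:
        return True                      # desired optimal fitness reached
    diff = second_order_backward_difference(history)
    # Assumption: the magnitude of the difference is compared with the threshold.
    return diff is not None and abs(diff) <= diff_threshold

# Hypothetical run whose best fitness has stalled at 0.9.
stop = should_stop([0.5, 0.9, 0.9, 0.9], max_iterations=100,
                   diff_threshold=1e-4, target_fitness=0.99)
```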
Compared with the prior art, the invention has the following advantages:
1. The method can accurately select the feature elements that are highly correlated with a given fault, effectively reduce the dimensionality of the sample data, reduce the amount of calculation, and improve the accuracy of AHU fault diagnosis;
2. Using the improved GA for fault diagnosis feature selection effectively improves the selection result and shortens the selection time.
Drawings
FIG. 1 is a schematic flow diagram of a diagnostic model in a data driven AHU multi-fault diagnostic feature selection method based on an improved GA of the present invention;
FIG. 2 is a schematic flow chart of the improved GA of the present invention.
Detailed Description
The invention will now be described in further detail with reference to the drawings and to specific examples.
As shown in fig. 1 and 2, a data-driven AHU multi-fault diagnosis feature selection method based on an improved GA includes the steps of:
A) Determining faults: determine the faults in the AHU, such as coil fouling and scaling, fan faults, controller faults, fresh air valve faults, return air valve faults, cooling coil faults and the like; select the system parameters that become abnormal after an AHU fault occurs, i.e. the system parameters that are highly correlated with the faults, and define them as the potential feature elements $\{x_1, x_2, \ldots, x_n\}$ for fault diagnosis; arrange sensors of the corresponding types at the positions corresponding to these system parameters and transmit their readings to a computer through a data acquisition module. For example, the supply air temperature and humidity are normally acquired by temperature and humidity sensors arranged in the supply air duct or at its outlet;
B) Sample data acquisition: first acquire sample data under normal operation, then artificially apply faults to the AHU and acquire sample data under each fault, to obtain the sample data; each sample of each fault type is denoted $X_i$, where $i$ is the serial number of the sample, $X_i = (x_1, x_2, \ldots, x_n)^T$, and $x_1, x_2, \ldots, x_n$ are the feature elements used for fault diagnosis;
C) Data preprocessing: add the actual label $l_c$ of the fault class to each sample, giving $X_i = (x_1, x_2, \ldots, x_n, l_c)^T$ s.t. $l_c \in \{1, 2, \ldots, k\}$, where $k$ is the total number of fault classes, and then normalize all sample data to the (0, 1) interval;
D) Removing redundant features: remove redundant features from the sample data using the Pearson correlation coefficient method; calculate the Pearson correlation coefficient $\rho$ between any two feature elements, and if its absolute value is greater than or equal to a set threshold $T_\rho$, remove either one of the two feature elements. The mathematical expression of the Pearson correlation coefficient $\rho$ is as follows:
$$\rho = \frac{E(x_a x_b) - E(x_a)\,E(x_b)}{\sqrt{E(x_a^2) - E^2(x_a)}\,\sqrt{E(x_b^2) - E^2(x_b)}}$$
where $x_a$ and $x_b$ represent any two feature elements and $E(\cdot)$ denotes the expectation; after the redundant features are removed, the sample data become $Y_i = (y_1, y_2, \ldots, y_p, l_c)^T$, and the number of removed features is $(n - p)$;
E) Improved GA parameter initialization: initialize the population size, the maximum number of iterations, the mutation probability, the crossover probability, the expected feature number, and the variance of the normal distribution obeyed by the feature number. In this embodiment, the population size $P = 50$, the maximum number of iterations $T_i = 100$, the mutation probability $P_m^0 = 0.1$, the crossover probability $P_c^0 = 0.8$, the expected feature number $N_g = n/2$, where $n$ is the total number of feature elements in the sample data, and the variance $\sigma^2 = 1$;
F) Heuristic improved GA population initialization: using binary coding, generate a population in which the number of features in each feature subset follows a normal distribution with mean $N_g = n/2$ and variance $\sigma^2 = 1$;
G) Feature selection on the sample data: decode the individuals in the population and select the corresponding feature subset, where 1 represents selection and 0 represents rejection; the sample data after feature selection are $Z_i = (z_1, z_2, \ldots, z_q, l_c)^T$, where $q$ is the number of 1s in the chromosome coding of the individual;
H) Establishing and training a fault diagnosis model: randomly assign part of the sample data to a training set and input it into a classifier for training to obtain a fault diagnosis model, and use the remaining sample data as a test set; the training set is 60% to 80% of the sample data, and the classifier is one of an SVM, an ANN, a decision tree or a random forest. Test the fault diagnosis model and output a test result comprising the accuracy rate, the false alarm rate and the missed alarm rate; return the test result to the fitness function of the improved GA and calculate the fitness value of each individual; perform selection, crossover and mutation according to the individual fitness values, with the crossover probability and mutation probability adaptively adjusted according to the individual fitness values and the convergence speed. The accuracy rate $ac$, the false alarm rate $wb$ and the missed alarm rate $lb$ are calculated from the following quantities: $N_{test}$, the total number of samples in the test set; the number of correctly predicted samples in the test set, i.e. samples whose predicted label matches the actual label $l_c$; the number of samples for which both the predicted label and the actual label $l_c$ are faulty; the number of samples for which both are normal; the number of samples for which the predicted label is faulty but the actual label $l_c$ is normal; and the number of samples for which the predicted label is normal but the actual label $l_c$ is faulty. The fitness function of the improved GA is:
$$f(v_h) = \lambda \cdot ac + \mu \cdot wb + \varepsilon \cdot lb \quad \text{s.t.}\ \lambda + \mu + \varepsilon = 1,\ h \in \{1, 2, \ldots, P\}$$
where $v_h$ is an individual in the population, $P$ is the population size, $\lambda$ is the weight coefficient of the accuracy rate $ac$, $\mu$ is the weight coefficient of the false alarm rate $wb$, and $\varepsilon$ is the weight coefficient of the missed alarm rate $lb$; the crossover probability and mutation probability are adaptively adjusted according to the following strategy:
where $T$ is the current iteration number, i.e. the generation of the population; $p_c$ and $p_m$ are respectively the crossover probability and mutation probability of individual $v_h$ in the $T$-th generation population; $\xi$ is the adjustment coefficient of the crossover probability; the crossover probability and the mutation probability each have an upper limit value; $f_{max}$, $f_{min}$ and $f_{avg}$ are respectively the maximum, minimum and average fitness values in the $T$-th generation population; the fitness value of individual $v_h$ in the $T$-th generation population also enters the strategy; and $f_g$ is the desired optimal fitness value. The strategy further uses the second-order backward difference of the maximum fitness over the iterative process, which is calculated as follows:
where $f_{max}(0)$ is the maximum fitness value in the initial population;
L) Obtaining the optimal feature subset: after updating the population, return to step G) until the iteration termination condition is met; the feature subset obtained by decoding the best individual in the last-generation population is the optimal feature subset. The iteration termination condition is: the iteration number $T$ of the population equals the maximum number of iterations $T_i$, or the second-order backward difference of the maximum fitness of the population is less than or equal to a set threshold, or the maximum fitness value $f_{max}$ of the population is greater than or equal to the desired optimal fitness value $f_g$.
In the data-driven AHU multi-fault diagnosis feature selection method based on an improved GA according to the invention, a feature subset is selected by binary coding, where 1 represents selection and 0 represents rejection. The fitness function of the GA is constructed from the diagnosis accuracy rate and/or the false alarm rate of the fault diagnosis model, the population is updated through crossover and mutation, and iterative optimization yields the final optimal feature subset, so that the feature elements most suitable for diagnosing a given fault are selected.
Claims (8)
1. A data-driven AHU multi-fault diagnosis feature selection method based on an improved GA, characterized in that the method comprises the following steps:
A) Determining faults: determine the faults in the AHU for which a fault diagnosis model needs to be established;
B) Sample data acquisition: first acquire sample data under normal operation, then artificially apply faults to the AHU and acquire sample data under each fault, to obtain the sample data;
C) Data preprocessing: label the sample data and normalize it;
D) Removing redundant features: remove redundant features from the sample data using the Pearson correlation coefficient method;
E) Improved GA parameter initialization: initialize the population size, the maximum number of iterations, the mutation probability, the crossover probability, the expected feature number, and the variance of the normal distribution obeyed by the feature number;
F) Heuristic improved GA population initialization: using binary coding, generate a population in which the number of features in each feature subset follows a normal distribution whose mean is the expected feature number and whose variance is the variance set in step E);
G) Feature selection on the sample data: decode the individuals in the population and perform feature selection on the sample data using the resulting feature subsets, to obtain the sample data after feature selection;
H) Establishing and training a fault diagnosis model: randomly assign part of the sample data to a training set and input it into a classifier for training to obtain a fault diagnosis model; use the remaining sample data as a test set to test the fault diagnosis model and output a test result comprising the accuracy rate, the false alarm rate and the missed alarm rate; return the test result to the fitness function of the improved GA and calculate the fitness value of each individual; perform selection, crossover and mutation according to the individual fitness values, with the crossover probability and mutation probability adaptively adjusted according to the individual fitness values and the convergence speed, wherein the fitness function of the improved GA is as follows:
$$f(v_h) = \lambda \cdot ac + \mu \cdot wb + \varepsilon \cdot lb \quad \text{s.t.}\ \lambda + \mu + \varepsilon = 1,\ h \in \{1, 2, \ldots, P\}$$
where $v_h$ is an individual in the population, $P$ is the population size, $\lambda$ is the weight coefficient of the accuracy rate $ac$, $\mu$ is the weight coefficient of the false alarm rate $wb$, and $\varepsilon$ is the weight coefficient of the missed alarm rate $lb$; the crossover probability and mutation probability are adaptively adjusted according to the following strategy:
where $T$ is the current iteration number, i.e. the generation of the population; $p_c$ and $p_m$ are respectively the crossover probability and mutation probability of individual $v_h$ in the $T$-th generation population; $\xi$ is the adjustment coefficient of the crossover probability; the crossover probability and the mutation probability each have an upper limit value; $f_{max}$, $f_{min}$ and $f_{avg}$ are respectively the maximum, minimum and average fitness values in the $T$-th generation population; the fitness value of individual $v_h$ in the $T$-th generation population also enters the strategy; and $f_g$ is the desired optimal fitness value. The strategy further uses the second-order backward difference of the maximum fitness over the iterative process, which is calculated as follows:
where $f_{max}(0)$ is the maximum fitness value in the initial population;
L) Obtaining the optimal feature subset: after updating the population, return to step G) until the iteration termination condition is met; the feature subset obtained by decoding the best individual in the last-generation population is the optimal feature subset.
2. The improved GA-based data-driven AHU multi-fault diagnosis feature selection method of claim 1, wherein: in step A), the system parameters that become abnormal after an AHU fault occurs are selected and defined as the potential feature elements $\{x_1, x_2, \ldots, x_n\}$ for fault diagnosis; sensors of the corresponding types are arranged at the positions corresponding to these system parameters, and their readings are transmitted to a computer through a data acquisition module.
3. The improved GA-based data-driven AHU multi-fault diagnosis feature selection method of claim 2, wherein: in step B), each sample of each fault type is denoted $X_i$, where $i$ is the serial number of the sample, $X_i = (x_1, x_2, \ldots, x_n)^T$, and $x_1, x_2, \ldots, x_n$ are the feature elements used for fault diagnosis; in step C), the actual label $l_c$ of the fault class is added to each sample, giving $X_i = (x_1, x_2, \ldots, x_n, l_c)^T$ s.t. $l_c \in \{1, 2, \ldots, k\}$, where $k$ is the total number of fault classes, and all sample data are then normalized to the (0, 1) interval; in step D), redundant features are removed using the Pearson correlation coefficient method: the Pearson correlation coefficient $\rho$ between any two feature elements is calculated, and if its absolute value is greater than or equal to a set threshold $T_\rho$, either one of the two feature elements is removed, the mathematical expression of the Pearson correlation coefficient $\rho$ being as follows:
$$\rho = \frac{E(x_a x_b) - E(x_a)\,E(x_b)}{\sqrt{E(x_a^2) - E^2(x_a)}\,\sqrt{E(x_b^2) - E^2(x_b)}}$$
where $x_a$ and $x_b$ represent any two feature elements and $E(\cdot)$ denotes the expectation; after the redundant features are removed, the sample data become $Y_i = (y_1, y_2, \ldots, y_p, l_c)^T$, and the number of removed features is $(n - p)$.
4. The improved GA-based data-driven AHU multi-fault diagnosis feature selection method of claim 3, wherein: in step E), the population size $P = 50$, the maximum number of iterations $T_i = 100$, the mutation probability $P_m^0 = 0.1$, the crossover probability $P_c^0 = 0.8$, the expected feature number $N_g = n/2$, where $n$ is the total number of feature elements in the sample data, and the variance $\sigma^2 = 1$.
5. The improved GA-based data-driven AHU multi-fault diagnosis feature selection method of claim 4, wherein: in step G), the individuals in the population are decoded and the corresponding feature subset is selected, where 1 represents selection and 0 represents rejection; the sample data after feature selection are $Z_i = (z_1, z_2, \ldots, z_q, l_c)^T$, where $q$ is the number of 1s in the chromosome coding of the individual.
6. The improved GA-based data-driven AHU multi-fault diagnosis feature selection method of claim 5, wherein: in step H), the training set is 60% to 80% of the sample data and the rest is the test set, and the classifier is one of an SVM, an ANN, a decision tree or a random forest.
7. The improved GA-based data-driven AHU multi-fault diagnosis feature selection method of claim 6, wherein: in step H), the accuracy rate $ac$, the false alarm rate $wb$ and the missed alarm rate $lb$ are calculated from the following quantities: $N_{test}$, the total number of samples in the test set; the number of correctly predicted samples in the test set, i.e. samples whose predicted label matches the actual label $l_c$; the number of samples for which both the predicted label and the actual label $l_c$ are faulty; the number of samples for which both are normal; the number of samples for which the predicted label is faulty but the actual label $l_c$ is normal; and the number of samples for which the predicted label is normal but the actual label $l_c$ is faulty.
8. The improved GA-based data-driven AHU multi-fault diagnosis feature selection method of claim 7, wherein: in step L), the iteration terminates when any one of the following conditions holds: the iteration number $T$ of the population equals the maximum number of iterations $T_i$; or the second-order backward difference of the maximum fitness of the population is less than or equal to a set threshold; or the maximum fitness value $f_{max}$ of the population is greater than or equal to the expected optimal fitness value $f_g$.
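The decoding, evaluation, and termination steps described in claims 5, 7, and 8 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The function names, the `normal_label` convention (0 means "no fault"), and the default threshold are hypothetical. The source's own formulas for the false alarm rate and missed-report rate are not legible, so the sketch assumes the conventional definitions: false alarms divided by all actually normal samples, and missed faults divided by all actually faulty samples.

```python
from typing import List, Sequence, Tuple


def decode_features(chromosome: Sequence[int], sample: Sequence[float]) -> List[float]:
    """Keep the features whose gene is 1 (1 = select, 0 = eliminate)."""
    if len(chromosome) != len(sample):
        raise ValueError("chromosome and sample must have the same length")
    return [value for gene, value in zip(chromosome, sample) if gene == 1]


def diagnosis_metrics(predicted: Sequence[int], actual: Sequence[int],
                      normal_label: int = 0) -> Tuple[float, float, float]:
    """Return (accuracy, false alarm rate, missed-report rate).

    Assumption: any label other than `normal_label` counts as a fault.
    """
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    pairs = list(zip(predicted, actual))
    n_test = len(pairs)
    n_correct = sum(p == a for p, a in pairs)
    both_fault = sum(p != normal_label and a != normal_label for p, a in pairs)
    both_normal = sum(p == normal_label and a == normal_label for p, a in pairs)
    false_alarm = sum(p != normal_label and a == normal_label for p, a in pairs)
    missed = sum(p == normal_label and a != normal_label for p, a in pairs)

    accuracy = n_correct / n_test
    # Assumed conventional definitions; the source formulas are not legible.
    n_normal = false_alarm + both_normal
    n_fault = missed + both_fault
    far = false_alarm / n_normal if n_normal else 0.0
    mar = missed / n_fault if n_fault else 0.0
    return accuracy, far, mar


def should_stop(iteration: int, max_fitness_history: Sequence[float],
                max_iterations: int = 100, threshold: float = 1e-4,
                expected_fitness: float = 1.0) -> bool:
    """Check the three termination conditions of claim 8."""
    f = max_fitness_history
    if iteration >= max_iterations:
        return True
    if f and f[-1] >= expected_fitness:
        return True
    if len(f) >= 3:
        # Second-order backward difference of the best fitness per generation.
        # The claim compares the difference itself with the threshold; using
        # its absolute value instead would be a variant, not the stated rule.
        second_diff = f[-1] - 2 * f[-2] + f[-3]
        if second_diff <= threshold:
            return True
    return False
```

For example, `decode_features([1, 0, 1, 0], [10.0, 20.0, 30.0, 40.0])` keeps the first and third features. Separating the fault and normal counts this way also lets a multi-fault classifier report accuracy on the exact fault label while the two error rates only distinguish fault from normal.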
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210555420.0A CN114970709B (en) | 2022-05-20 | 2022-05-20 | Improved GA-based data-driven AHU multi-fault diagnosis feature selection method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210555420.0A CN114970709B (en) | 2022-05-20 | 2022-05-20 | Improved GA-based data-driven AHU multi-fault diagnosis feature selection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114970709A (en) | 2022-08-30 |
CN114970709B (en) | 2024-06-14 |
Family
ID=82986230
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210555420.0A Active CN114970709B (en) | 2022-05-20 | 2022-05-20 | Improved GA-based data-driven AHU multi-fault diagnosis feature selection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114970709B (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6240804B1 (en) * | 2017-04-13 | 2017-11-29 | Dalian University | Filtered feature selection algorithm based on improved information measurement and GA |
CN107169514A (en) * | 2017-05-05 | 2017-09-15 | Tsinghua University | Method for establishing a power transformer fault diagnosis model |
CN112183598A (en) * | 2020-09-21 | 2021-01-05 | Xi'an University of Technology | Feature selection method based on genetic algorithm |
Non-Patent Citations (2)
Title |
---|
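Automated Fault Detection and Diagnosis for an Air Handling Unit Based on a GA-Trained RBF Network; Yonghong Huang; 2006 International Conference on Communications, Circuits and Systems; 2006-06-30; full text * |
Research on D-S data fusion fault diagnosis of large coal mine electromechanical equipment based on GA-BP; Ma Xianmin; Liang Lan; Zhang Yongqiang; Shi Leping; Coal Technology; 2016-01-10 (No. 01); full text * |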
Also Published As
Publication number | Publication date |
---|---|
CN114970709A (en) | 2022-08-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116757534B (en) | Intelligent refrigerator reliability analysis method based on neural training network | |
CN111582542B (en) | Power load prediction method and system based on anomaly repair | |
CN111275136B (en) | Fault prediction system based on small sample and early warning method thereof | |
CN115827411B (en) | On-line monitoring and operation and maintenance assessment system and method for automation equipment | |
CN114997745B (en) | Photovoltaic fault diagnosis tracing method based on depth feature extraction | |
CN112365009A (en) | Secondary equipment abnormity diagnosis method based on deep learning network | |
CN115859777A (en) | Method for predicting service life of product system in multiple fault modes | |
CN115730228A (en) | Central air conditioner energy consumption analysis method based on BP neural network | |
CN114553671A (en) | Diagnosis method for power communication network fault alarm | |
CN114529067A (en) | Method for performing predictive maintenance on electric vehicle battery based on big data machine learning | |
CN114970709B (en) | Improved GA-based data-driven AHU multi-fault diagnosis feature selection method | |
CN113837096B (en) | Rolling bearing fault diagnosis method based on GA random forest | |
CN116796617A (en) | Rolling bearing equipment residual life prediction method and system based on data identification | |
CN115130746A (en) | Method and device for constructing fault early warning model of frequency converter and electronic equipment | |
CN111967593A (en) | Method and system for processing abnormal data based on modeling | |
CN113642784A (en) | Wind power ultra-short term prediction method considering fan state | |
CN112380041B (en) | Xgboost-based failure prediction method for command communication equipment | |
CN118035923B (en) | Power grid wave recording abnormal signal identification method | |
CN114187977B (en) | Equipment material spheroidization degree prediction method and system | |
CN117592789B (en) | Power grid environment fire risk assessment method and equipment based on time sequence analysis | |
Martínez Viol et al. | HVAC early fault detection using a fuzzy logic based approach | |
Xie et al. | Using feature selection techniques to determine best feature subset in prediction of window behaviour | |
CN117318008A (en) | Power distribution network reliability assessment method considering source-load probability characteristics | |
Lin et al. | PSO Hammerstein model based PM2. 5 concentration forecasting | |
PRIETO | CHAPTER THIRTEEN HVAC EARLY FAULT DETECTION USING A FUZZY LOGIC-BASED APPROACH VICTOR MARTINEZ-VIOL1, EVA M. URBANO1 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||