CN115329663A - Key feature selection method and device for processing power load monitoring sparse data

Info

Publication number: CN115329663A
Application number: CN202210885026.3A
Authority: CN
Inventors: 汤向华; 王栋; 吴迪; 施雄杰; 张丽娟; 汪家钰; 俞天鹤; 罗飞; 陈飞龙
Original assignee: Nantong Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Current assignee: Nantong Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Priority date: 2022-07-26
Filing date: 2022-07-26
Publication date: 2022-11-11

Abstract

The invention discloses a key feature selection method and device for processing power load monitoring sparse data. Performing feature selection on the power load monitoring sparse data reduces the computation required for subsequent data set analysis and improves working efficiency. Selecting the key features and removing irrelevant and redundant features improves the accuracy of machine learning training and prediction. Despite the large redundancy of high-dimensional heterogeneous data, the relationship between each typical power system scene and its associated features can be found during feature analysis, so that practical analysis is targeted and processing time is markedly shortened. Together, the method and the device address the problems that power data samples are difficult to obtain and real-time in nature, and that system failures or external interference during acquisition, transmission, storage and other stages cause data loss. Correct operating decisions can thus be made through accurate analysis of incomplete data.

Description

Key feature selection method and device for processing power load monitoring sparse data

Technical Field

The invention relates to feature selection in power load data mining, and in particular to a key feature selection method and device for processing power load monitoring sparse data.

Background

In recent years, with the wide application of digital power grid technology, the large-scale connection of a high proportion of power electronic equipment to the grid has generated massive multi-source heterogeneous data. High-dimensional heterogeneous data contains a large amount of redundancy, so the relationship between each typical power system scene and its associated features cannot be found in feature analysis; practical analysis therefore lacks focus, and processing time increases markedly. In addition, power data samples are difficult to obtain and real-time in nature, and system faults or external interference may occur during acquisition, transmission, storage and other stages, causing data loss. Analysis of the resulting incomplete data is inaccurate, which makes it difficult to reach correct operating decisions.

Publication No. CN110084408A describes a method for processing power quality data, comprising: a blocking step: receiving power quality data acquired from a computer public network or a power quality monitoring platform, wherein the power quality data comprises spatial information, time information and event information, grouping the power quality data by spatial information so that data with the same spatial information falls into the same group, and dividing each group of power quality data by time interval; a cleaning step: taking the power quality data blocked in the blocking step and cleaning it with a block fusion method; and an analysis step: analyzing the power quality data obtained in the cleaning step with a statistical model. That publication provides a method for processing high-speed power quality data, links the power quality data causally with environmental information at different locations, and visualizes the processing and analysis results of the power quality data.

Publication No. WO2022074400A1 describes a method and system for processing measurement data relating to an electric power network or other electrical devices using machine learning techniques, and for detecting abnormal events from the electrical measurement data. According to a first aspect, a method of processing high-resolution electrical measurement data may comprise obtaining high-resolution electrical measurement data as time series data of an electrical or other parameter measured from an electrical grid system or other electrical device, wherein the time series data comprises a first set of data points. The time series data may be converted to feature vector format data, in which the time series data is grouped into a plurality of data sets, each representing a subset of the first set of data points. A statistical data clustering scheme may be performed to generate different clustering patterns from the feature vector format data as cluster data, the cluster data including a first cluster associated with a first electrical trend and a second cluster associated with a second electrical trend different from the first electrical trend, wherein the cluster data includes an anomalous data pattern that is part of the first or second cluster and is remote from its respective cluster center. The abnormal event detection may be based at least in part on the anomalous data.

In summary, the technical problems to be solved by the present invention are:

1) High-dimensional heterogeneous data contains a large amount of redundancy, the relationship between each typical power system scene and its associated features cannot be found accurately in feature analysis, practical analysis lacks focus, and processing time increases markedly;

2) Power data samples are difficult to obtain and real-time in nature, and system failures or external interference during acquisition, transmission, storage and other stages cause data loss;

3) Analysis of incomplete data is inaccurate, which makes it difficult to reach correct operating decisions.

Disclosure of Invention

To address the defects of the prior art, the invention provides a key feature selection method and device for processing power load monitoring sparse data that solve the above problems.

In order to achieve the above object, the present invention is achieved by the following technical solutions.

The key feature selection method for processing power load monitoring sparse data comprises the following steps: acquiring power load monitoring sparse data F, forming the monitoring data into an input stream of sparse features, and constructing a buffer matrix B; putting the buffer matrix B into a pre-trained hidden feature filling model, calculating the missing values and filling them in to obtain a complete matrix; and performing stream feature selection on the complete matrix and storing the result in an optimal feature subset BSF.

Preferably, the acquired power load monitoring sparse data F has M rows and N columns and includes steady-state feature data before a fault and transient feature data after the fault, and the constructed buffer matrix B has M rows and Bs columns and is used to cache newly arrived sparse stream features, where Bs ≪ N.

Preferably, after the buffer matrix is full, the buffer matrix B is put into the pre-trained hidden feature filling model: matrices P (M × k) and Q (N × k) are randomly generated, R = B is set, and the objective function is optimized with the Cauchy loss. A prediction matrix is then computed, and the predicted values are filled into the missing positions to obtain the complete matrix.

Here λP and λQ are the regularization parameters corresponding to P and Q, γ is a constant, and Ω (M × N) is an indicator matrix whose entry is 1 where the corresponding position in R has a monitored value and 0 otherwise.
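For orientation, a Cauchy-loss objective consistent with the symbols defined above could take the following form. This is a sketch that assumes a standard latent factor model; the exact expression and the notation r_ij (an entry of R), p_i (row i of P) and q_j (row j of Q) are assumptions rather than the formula of the invention.

```latex
\min_{P,\,Q}\;
\sum_{i=1}^{M}\sum_{j=1}^{N} \Omega_{ij}\,
\ln\!\left(1 + \frac{\left(r_{ij} - p_i q_j^{\top}\right)^{2}}{\gamma^{2}}\right)
+ \lambda_P \lVert P \rVert_F^{2}
+ \lambda_Q \lVert Q \rVert_F^{2}
```

Under the same assumption, the prediction matrix would be the product P Qᵀ, and only positions where Ω is 0 would be filled from it.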

Preferably, stream feature selection is performed on the complete matrix, and the features in the complete matrix are analyzed one by one to simulate real-time arrival. Correlation analysis is first performed on each newly arrived feature: the correlation between the feature and the label is calculated with the Fisher-z test, which returns a p value, and the significance level α is adjusted through a fuzzy membership function so that it varies between 0.01 and 0.1;

if p < α holds, redundancy analysis is performed on the feature;

otherwise, if α < p < 0.1 holds, fuzzy correlation analysis is performed on the new feature;

otherwise the new feature is discarded.
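The three-way decision described above can be sketched in Python as follows. The function name `route_feature` and the returned branch labels are hypothetical; the thresholds follow the text, with the upper bound 0.1 exposed as a parameter.

```python
def route_feature(p_value, alpha, upper=0.1):
    """Decide which analysis a newly arrived feature goes to.

    p_value: p value returned by the correlation test against the label.
    alpha:   current significance level (between 0.01 and 0.1 in the text).
    upper:   upper bound of the fuzzy-correlation band (0.1 in the text).
    """
    if p_value < alpha:
        # Clearly correlated with the label: check redundancy next.
        return "redundancy_analysis"
    if alpha < p_value < upper:
        # Neither clearly correlated nor clearly independent.
        return "fuzzy_correlation_analysis"
    # Treated as independent of the label.
    return "discard"
```

For example, with α = 0.05 a feature with p = 0.005 goes to redundancy analysis, one with p = 0.07 goes to fuzzy correlation analysis, and one with p = 0.2 is discarded.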

Preferably, redundancy analysis is performed on new features whose p value is smaller than α, in two steps: first, the redundancy between the new feature and the features already in the optimal feature subset BSF (initialized as empty) is evaluated with the Fisher-z test; if the new feature is redundant it is discarded, otherwise it is added to the BSF; second, it is checked whether any existing feature in the BSF has become redundant because of the arrival of the new feature, and any such feature is discarded.

Preferably, fuzzy correlation analysis is performed on features that are neither independent nor correlated: the dependency of each such feature is calculated with a neighborhood rough set, and the feature is added to the fuzzy correlation feature subset FSF, which is kept sorted. The above steps are repeated until no new features flow in; the top |BSF|/2 features in the FSF are then added to the BSF, and finally the BSF is output.
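The final merge of the FSF into the BSF can be illustrated with a short Python sketch. It assumes that features are identified by name, that the FSF is held as a mapping from feature name to dependency, that a higher dependency ranks first, and that |BSF|/2 is rounded down; none of these details are specified in the text, and the function name is hypothetical.

```python
def merge_fuzzy_features(bsf, fsf):
    """Return the BSF extended with the top |BSF|/2 features of the FSF.

    bsf: list of feature names already selected.
    fsf: dict mapping feature name -> dependency value.
    """
    k = len(bsf) // 2  # assumption: integer division for |BSF|/2
    # Rank fuzzy-correlated features by dependency, highest first.
    ranked = sorted(fsf, key=fsf.get, reverse=True)
    extra = [name for name in ranked if name not in bsf][:k]
    return bsf + extra
```

With four features in the BSF, the two highest-ranked FSF features are appended; with an empty BSF nothing is added.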

The invention also provides a key feature selection device for processing power load monitoring sparse data, comprising a data buffer module, a data completion module and a stream feature selection module connected in sequence:

the data buffer module is used to acquire power load monitoring sparse data and cache the real-time data in a buffer matrix;

the data completion module is used to put the sparse buffer matrix into a pre-trained hidden feature model, calculate the missing values and fill them in to obtain a complete matrix;

the stream feature selection module is used to perform feature selection on the complete matrix and store the result in an optimal feature subset, and comprises a correlation analysis unit, a redundancy analysis unit, a fuzzy correlation analysis unit and a storage unit.

Preferably, in the data buffer module, the acquired real-time sparse data includes steady-state feature data before a fault and transient feature data after the fault, and the buffer matrix used to cache the real-time sparse data has far fewer columns than the entire power load data set.

Preferably, the data completion module generates predicted values for the missing positions by inputting the buffer matrix into the trained hidden feature model, and fills the predicted values into the missing positions.

Preferably, the correlation analysis unit performs correlation analysis on the features newly flowing in from the buffer matrix and calculates the returned p value with the Fisher-z test. Let α denote the significance level of the fuzzy correlation. If p < α, the feature enters redundancy analysis; if α < p < 0.1, fuzzy correlation analysis is performed; otherwise the feature is discarded.

Preferably, the redundancy analysis unit determines whether a new feature entering redundancy analysis is redundant with respect to the features already in the BSF; if so, the new feature is discarded, otherwise it is added to the BSF. The unit also determines whether the new feature makes any original feature in the BSF redundant, and if so, discards that original feature.

Preferably, the fuzzy correlation analysis unit performs fuzzy correlation analysis: it calculates the dependency between the label and each new feature that is neither correlated with nor independent of the label, and sorts these features. After no new features flow in, the top-ranked features, numbering half the size of the BSF, are stored in the BSF. The storage unit is used to store the BSF and FSF generated after each unit executes.
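One way to compute such a dependency is with a neighborhood rough set, sketched below in Python for a single numeric feature. The neighborhood radius `delta`, the function name, and the use of absolute difference as the distance are assumptions. The dependency is taken as the fraction of samples whose neighborhood contains only samples with the same label, that is, the size of the lower approximation divided by the number of samples.

```python
def dependency(values, labels, delta=0.1):
    """Neighborhood rough set dependency of the label on one feature.

    values: feature value of each sample (numeric).
    labels: class label of each sample.
    delta:  neighborhood radius (hypothetical default).
    """
    n = len(values)
    consistent = 0
    for i in range(n):
        # Samples within delta of sample i on this feature (includes i itself).
        neighbours = [j for j in range(n) if abs(values[i] - values[j]) <= delta]
        # Sample i is in the lower approximation if its neighborhood is label-pure.
        if all(labels[j] == labels[i] for j in neighbours):
            consistent += 1
    return consistent / n
```

A feature that separates the classes cleanly yields a dependency of 1.0, while a feature whose neighborhoods mix labels yields a value closer to 0.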

Compared with the prior art, the key feature selection method and device for processing power load monitoring sparse data disclosed by the invention reduce the computation required for subsequent data set analysis and improve working efficiency by performing feature selection on the power load monitoring sparse data; the key features are selected and irrelevant and redundant features are removed, which improves the accuracy of machine learning training and prediction;

despite the large redundancy of high-dimensional heterogeneous data, the relationship between each typical power system scene and its associated features can be found during feature analysis, so that practical analysis is targeted and processing time is markedly shortened;

the combination of the key feature selection method and the device addresses the problems that power data samples are difficult to obtain and real-time in nature, and that system failures or external interference during acquisition, transmission, storage and other stages cause data loss;

and correct operating decisions can be made through accurate analysis of incomplete data.

Drawings

FIG. 1 is a flow chart of the steps of a key feature selection method of the present invention for processing power load monitoring sparse data;

FIG. 2 is a block diagram of a key feature selection device for processing sparse data for power load monitoring according to the present invention;

FIG. 3 is a comparison of the accuracy of monitoring sparse data feature selection performed by the present invention with four comparison algorithms.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.

The key feature selection method for processing power load monitoring sparse data comprises: acquiring power load monitoring sparse data F, forming the monitoring data into an input stream of sparse features, and constructing a buffer matrix B; putting the buffer matrix B into a pre-trained hidden feature filling model, calculating the missing values and filling them in to obtain a complete matrix; and performing stream feature selection on the complete matrix and storing the result in an optimal feature subset BSF.

Power load monitoring sparse data F with M rows and N columns is acquired, comprising steady-state feature data before a fault and transient feature data after the fault, and a buffer matrix B with M rows and Bs columns is constructed to cache newly arrived sparse stream features, where Bs ≪ N.
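The buffering step can be illustrated with a minimal Python sketch. It assumes that features arrive one column at a time and that missing values are marked with `None`; the class name `FeatureBuffer` and its `push` method are hypothetical and do not come from the source.

```python
class FeatureBuffer:
    """Collects arriving feature columns until Bs columns are buffered."""

    def __init__(self, bs):
        self.bs = bs          # number of columns the buffer holds (Bs)
        self.columns = []     # buffered feature columns, each of length M

    def push(self, column):
        """Add one feature column.

        Returns the full M x Bs matrix (as a list of rows) once the buffer
        is full and resets the buffer; returns None while still filling.
        """
        self.columns.append(list(column))
        if len(self.columns) < self.bs:
            return None
        block, self.columns = self.columns, []
        # Transpose the list of columns into a row-major matrix.
        return [list(row) for row in zip(*block)]
```

Each full matrix returned by `push` would then be handed to the completion step, while the emptied buffer keeps accepting new columns.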

The buffer matrix B is put into the pre-trained hidden feature model, and the missing values are calculated and filled in to obtain a complete matrix. The missing values are calculated as follows:

Step one: randomly generate P (M × k) and Q (N × k), let R = B, and optimize the objective function with the Cauchy loss, where λP and λQ are the regularization parameters corresponding to P and Q, γ is a constant, and Ω (M × N) is an indicator matrix whose entry is 1 where the corresponding position in R has a monitored value and 0 otherwise;

Step two: compute the prediction matrix and fill the predicted values into the missing positions to obtain the complete matrix.
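The two steps can be sketched in Python as follows. This is a minimal illustration that assumes the Cauchy loss takes the form ln(1 + e²/γ²) on each observed entry, that the prediction is the inner product of a row of P and a row of Q, and that plain stochastic gradient descent is used; the function name, the hyperparameter defaults and the use of `None` for missing values are all assumptions. Q is sized to the number of columns of the matrix passed in.

```python
import random

def fill_missing(R, k=2, gamma=1.0, lam_p=0.01, lam_q=0.01,
                 lr=0.05, epochs=500, seed=0):
    """Fill None entries of R with predictions of a Cauchy-loss factor model."""
    rng = random.Random(seed)
    m, n = len(R), len(R[0])
    # Step one: random factors, trained on the observed entries only.
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(m)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n)]
    observed = [(i, j) for i in range(m) for j in range(n) if R[i][j] is not None]
    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j in observed:
            pred = sum(P[i][d] * Q[j][d] for d in range(k))
            err = R[i][j] - pred
            # Derivative of ln(1 + err^2 / gamma^2) w.r.t. err; bounded, so
            # large outliers pull the factors less than under squared loss.
            w = 2.0 * err / (gamma ** 2 + err ** 2)
            for d in range(k):
                p_old, q_old = P[i][d], Q[j][d]
                P[i][d] += lr * (w * q_old - 2.0 * lam_p * p_old)
                Q[j][d] += lr * (w * p_old - 2.0 * lam_q * q_old)
    # Step two: keep observed values, fill missing positions with predictions.
    return [[R[i][j] if R[i][j] is not None
             else sum(P[i][d] * Q[j][d] for d in range(k))
             for j in range(n)] for i in range(m)]
```

Observed values are returned unchanged, so only the missing positions carry model predictions.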

Stream feature selection is performed on the complete matrix and comprises the following steps:

Step one: correlation analysis: when a new feature flows in, the feature correlation is calculated with the Fisher-z test, which returns a p value; let α denote the significance level of the fuzzy correlation; if p < α, the feature enters redundancy analysis; if α < p < 0.1, fuzzy correlation analysis is performed; otherwise the feature is discarded;

Step two: redundancy analysis: determine whether the new feature entering redundancy analysis is redundant with respect to the features already in the BSF; if so, discard the new feature, otherwise add it to the BSF; also determine whether the new feature makes any original feature in the BSF redundant, and if so, discard that original feature;

Step three: fuzzy correlation analysis: calculate the dependency between the label and each new feature that enters step three, add the feature to the fuzzy correlation feature subset FSF and sort the subset; after no new features flow in, take the top-ranked features in the FSF, numbering half the size of the BSF, and store them in the BSF.

The above steps are repeated until no new features flow in, and finally the BSF is output.
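The redundancy analysis of step two can be sketched in Python as follows. The redundancy test itself is passed in as a hypothetical callable `is_redundant(feature, others)`, standing in for the Fisher-z independence check; the feature identifiers and the function name are illustrative assumptions.

```python
def redundancy_analysis(new_feature, bsf, is_redundant):
    """Update the BSF after a correlated feature arrives.

    new_feature:  identifier of the newly arrived feature.
    bsf:          list of identifiers currently in the optimal feature subset.
    is_redundant: callable (feature, others) -> bool, a placeholder for the
                  statistical redundancy test.
    """
    # First check: is the new feature redundant given the current BSF?
    if is_redundant(new_feature, bsf):
        return list(bsf)  # discard the new feature, BSF unchanged
    updated = list(bsf) + [new_feature]
    # Second check: did the new feature make any existing feature redundant?
    for old in list(bsf):
        others = [f for f in updated if f != old]
        if is_redundant(old, others):
            updated.remove(old)
    return updated
```

The function returns a new list, so the caller decides when to replace the stored BSF.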

The key feature selection device for processing power load monitoring sparse data comprises a data buffer module, a data completion module and a stream feature selection module connected in sequence:

the data buffer module is used to acquire power load monitoring sparse data and cache the real-time data in a buffer matrix;

the data completion module is used to put the sparse buffer matrix into a pre-trained hidden feature model, calculate the missing values and fill them in to obtain a complete matrix;

the stream feature selection module is used to perform feature selection on the complete matrix and store the result in the optimal feature subset;

the stream feature selection module comprises a correlation analysis unit, a redundancy analysis unit, a fuzzy correlation analysis unit and a storage unit.

In the data buffer module, the acquired monitoring sparse data comprises steady-state feature data before a fault and transient feature data after the fault, and the buffer matrix used to cache the real-time monitoring sparse data has far fewer columns than the entire power load data set.

The data completion module generates predicted values for the missing positions by putting the buffer matrix into the trained hidden feature model, and fills the predicted values into the missing positions.

The correlation analysis unit performs correlation analysis on the features newly flowing in from the buffer matrix and calculates the returned p value with the Fisher-z test. Let α denote the significance level of the fuzzy correlation; α is adjusted within the range 0.01 to 0.1 by a fuzzy membership function. If p < α, the feature enters redundancy analysis; if α < p < 0.1, fuzzy correlation analysis is performed; otherwise the feature is discarded.

The redundancy analysis unit determines whether a new feature entering redundancy analysis is redundant with respect to the features already in the BSF; if so, the new feature is discarded, otherwise it is added to the BSF. The unit also determines whether the new feature makes any original feature in the BSF redundant, and if so, discards that original feature.

The fuzzy correlation analysis unit performs fuzzy correlation analysis: it calculates the dependency between the label and each new feature that enters step three, stores the feature in the fuzzy correlation feature subset FSF and sorts the subset; after no new features flow in, the top-ranked features, numbering half the size of the BSF, are stored in the BSF. The storage unit is used to store the BSF and FSF generated after each unit executes.
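The correlation analysis unit described above adjusts α within 0.01 to 0.1 through a fuzzy membership function, but the text does not state which quantity drives that function. The Python sketch below therefore assumes a hypothetical input `x` in [0, 1] and hypothetical breakpoints `a < b <= c < d`; it only shows how a trapezoidal membership value could be mapped onto the stated range.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear between."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    if x <= c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling edge

def significance_level(x, a=0.0, b=0.3, c=0.7, d=1.0, lo=0.01, hi=0.1):
    """Map the membership of x onto a significance level between lo and hi."""
    return lo + (hi - lo) * trapezoid(x, a, b, c, d)
```

With these assumed breakpoints, α sits at 0.01 at the extremes of the input and rises to 0.1 on the plateau.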

The key feature selection method for processing the power load monitoring sparse data comprises the following steps:

S101: Input the power load monitoring sparse data F. Power load operating data is acquired automatically in real time by equipment and serves as the original input data. The data may be steady-state feature data before a fault or transient feature data after a fault. Because failures can occur during acquisition, transmission, storage and other stages, the data is usually sparse, with missing values.

S102: Store the data in the buffer matrix B. For data generated in real time, a buffer matrix B is set up to cache the arriving data. When the buffer matrix is full, the process goes to step S103, while step S102 continues to cache arriving data.

S103: Input the buffer matrix into the pre-trained hidden feature model to obtain a complete matrix. P (M × k) and Q (N × k) are randomly generated, R = B is set, and the objective function is optimized with the Cauchy loss.

Here λP and λQ are the regularization parameters corresponding to P and Q, γ is a constant, and Ω (M × N) is an indicator matrix whose entry is 1 where the corresponding position in R has a monitored value and 0 otherwise. A prediction matrix is then computed, and the predicted values are filled into the missing positions to obtain the complete matrix.

S104: The complete matrix emits new features f one by one.

S105: Evaluate the correlation between the new feature and the label, and return a p value. For each newly arriving feature f, its correlation with the target is calculated with the Fisher-z test, which returns a p value. The significance level α is adjusted through a trapezoidal fuzzy membership function. In the Fisher-z test formula, N is the number of instances, z is the conditioning feature, and xi is the partial correlation coefficient.

S106: If p is smaller than α, f is correlated with the target and the process goes to step S107; otherwise the process goes to step S111.

S107: Evaluate the redundancy between the new feature f and the features already in the optimal feature subset BSF. The independence between f and the existing features in the BSF is calculated with the Fisher-z test; if f is independent of them, it is not redundant.

S108: Determine whether the new feature f is redundant. If it is not, the process goes to step S110; otherwise it goes to step S109.

S109: Discard the new feature f and go to step S114.

S110: Add the new feature to the BSF, then evaluate the redundancy among the existing features and discard any redundant ones, using the same method as in step S107. The process then goes to step S114.

S111: Determine whether α < p < 0.1 holds. If it does not, go to step S112; if it does, go to step S113.

S112: Discard the new feature f and go to step S114.

S113: Calculate the dependency of the new feature f and store it in the fuzzy correlation feature subset FSF. The dependency is computed with a fuzzy rough set as the ratio of the lower approximation of the feature f to the full sample set (the universe).

S114: Check whether new features continue to flow in. If no new feature arrives, the process goes to step S115; otherwise it returns to step S105 and feature selection is performed on the newly arrived feature.

S115: Add the top |BSF|/2 fuzzy correlation features in the FSF to the BSF; the resulting BSF is the finally selected optimal feature subset.

S116: Output the optimal feature subset BSF.
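The Fisher-z test used in step S105 can be sketched in Python as follows. The sketch assumes the commonly used form z = ½ · √(N − |Z| − 3) · ln((1 + ρ)/(1 − ρ)) with a two-sided p value taken from the standard normal distribution, where ρ is the (partial) correlation coefficient and |Z| is the number of conditioning features. The function names, the clipping of ρ, and the restriction to a precomputed ρ are assumptions rather than the implementation of the invention.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z_pvalue(rho, n, cond_size=0):
    """Two-sided p value for the hypothesis that the correlation is zero.

    rho:       (partial) correlation coefficient.
    n:         number of instances; requires n - cond_size - 3 > 0.
    cond_size: number of conditioning features (0 for a marginal test).
    """
    rho = max(min(rho, 0.999999), -0.999999)  # keep the logarithm finite
    z = 0.5 * math.sqrt(n - cond_size - 3) * math.log((1 + rho) / (1 - rho))
    # 2 * (1 - Phi(|z|)) expressed through the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))
```

A small p value indicates that the feature and the label are unlikely to be independent, so the feature proceeds along the p < α branch of step S106.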

Fig. 2 shows a block diagram of a key feature selection apparatus for processing sparse data of power load monitoring in an embodiment of the present invention. The device comprises:

the data buffer module 210 is used to acquire power load monitoring sparse data and cache the real-time monitoring data in a buffer matrix;

in this embodiment of the invention, the acquired monitoring sparse data includes steady-state feature data before a fault and transient feature data after the fault. The buffer matrix used to cache the monitoring sparse data has far fewer columns than the entire power load data set.

The data completion module 220 is used to put the sparse buffer matrix into a pre-trained hidden feature model, calculate the missing values and fill them in to obtain a complete matrix;

the stream feature selection module 230 is used to perform feature selection on the complete matrix and store the result in the optimal feature subset.

The stream feature selection module includes:

correlation analysis unit 231: performs correlation analysis on the features newly flowing in from the buffer matrix and calculates the returned p value with the Fisher-z test. Let α denote the significance level of the fuzzy correlation. If p < α, the feature enters redundancy analysis; if α < p < 0.1, fuzzy correlation analysis is performed; otherwise the feature is discarded.

Redundancy analysis unit 232: determines whether a new feature entering redundancy analysis is redundant with respect to the features already in the BSF; if so, the new feature is discarded, otherwise it is added to the BSF. The unit also determines whether the new feature makes any original feature in the BSF redundant, and if so, discards that original feature.

Fuzzy correlation analysis unit 233: performs fuzzy correlation analysis. It calculates the dependency between the label and each new feature that is neither correlated with nor independent of the label, adds the feature to the fuzzy correlation feature subset FSF and sorts the subset. After no new features flow in, the top-ranked features, numbering half the size of the BSF, are stored in the BSF.

The storage unit 234: stores the BSF generated after each unit executes.

FIG. 3 compares the accuracy of sparse data feature selection between an embodiment of the invention and four comparison algorithms. OS2FSU is the method proposed by the invention; the comparison algorithms include the classical algorithm Fast-OSFS (IEEE T PATTERN ANAL, 2012) and the more recently proposed algorithm SFS_FI (IEEE TNNLS, 2020). Verification was performed on six data sets (COIL, lung, SMK_CAN_191, isolet, USPS, mfeat-fac) from ASU (https://jundongl.github.io/scikit-feature/datasets.html) and UCI (http://archive.ics.uci.edu/ml/index.php). Feature selection was performed with 50% of the data in each data set missing, and classification training used support vector machine, random forest and K-nearest-neighbor classifiers. The BSF finally output by the feature selection algorithm was taken as the classifier input; the training samples were fitted with the three classifiers, and hyperparameters were tuned to obtain the best-performing model. The test samples were then predicted and the accuracy was calculated, with the average taken as the final accuracy; the results are shown in FIG. 3. The prediction accuracy of the proposed method is markedly higher than that of the other four methods, indicating that the key feature selection method for processing power load monitoring sparse data can effectively improve the precision and speed of data mining.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular is intended to include the plural unless the context clearly dictates otherwise, and it should be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of features, steps, operations, devices, components, and/or combinations thereof.

It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A key feature selection method for processing power load monitoring sparse data, characterized by comprising the following steps: acquiring power load monitoring sparse data F, forming the monitoring data into an input stream of sparse features, and constructing a buffer matrix B;

putting the buffer matrix B into a pre-trained hidden feature filling model, calculating the missing values and filling them in to obtain a complete matrix.

2. The method of claim 1, wherein: power load monitoring sparse data F with M rows and N columns is acquired, comprising steady-state feature data before a fault and transient feature data after the fault, and a buffer matrix B with M rows and Bs columns is constructed to cache newly arrived sparse stream features, where Bs ≪ N.

3. The method of claim 1, wherein: the buffer matrix B is put into the pre-trained hidden feature model, and the missing values are calculated and filled in to obtain a complete matrix, the missing values being calculated as follows:

step one: randomly generating P (M × k) and Q (N × k), setting R = B, and optimizing the objective function with the Cauchy loss;

step two: computing the prediction matrix and filling the predicted values into the missing positions to obtain the complete matrix.

4. The method of claim 1, wherein stream feature selection is performed on the complete matrix and comprises the following steps:

step two: redundancy analysis: determining whether the new feature entering redundancy analysis is redundant with respect to the features already in the BSF; if so, discarding the new feature, otherwise adding it to the BSF; and determining whether the new feature makes any original feature in the BSF redundant, and if so, discarding that original feature;

5. A key feature selection device for processing power load monitoring sparse data as claimed in any one of claims 1 to 4, characterized by comprising a data buffer module, a data completion module and a stream feature selection module connected in sequence,

wherein the data completion module is used to put a sparse buffer matrix into a pre-trained hidden feature model, calculate the missing values and fill them in to obtain a complete matrix;

6. The key feature selection device for processing power load monitoring sparse data as claimed in claim 5, wherein: in the data buffer module, the acquired monitoring sparse data comprises steady-state feature data before a fault and transient feature data after the fault, and the buffer matrix used to cache the real-time monitoring sparse data has far fewer columns than the entire power load data set.

7. The key feature selection device for processing power load monitoring sparse data as claimed in claim 5, wherein: the data completion module generates predicted values for the missing positions by putting the buffer matrix into the trained hidden feature model, and fills the predicted values into the missing positions.

8. The key feature selection device for processing power load monitoring sparse data as claimed in claim 5, wherein: the correlation analysis unit performs correlation analysis on the features newly flowing in from the buffer matrix and calculates the returned p value with the Fisher-z test; α denotes the significance level of the fuzzy correlation and is adjusted within the range 0.01 to 0.1 by a fuzzy membership function; if p < α, the feature enters redundancy analysis; if α < p < 0.1, fuzzy correlation analysis is performed; otherwise the feature is discarded.

9. The key feature selection device for processing power load monitoring sparse data as claimed in claim 5, wherein: the redundancy analysis unit determines whether a new feature entering redundancy analysis is redundant with respect to the features already in the BSF; if so, the new feature is discarded, otherwise it is added to the BSF; the unit also determines whether the new feature makes any original feature in the BSF redundant, and if so, discards that original feature.

10. The key feature selection device for processing power load monitoring sparse data as claimed in claim 5, wherein: the fuzzy correlation analysis unit performs fuzzy correlation analysis, calculates the dependency between the label and each new feature that enters step three, stores the feature in the fuzzy correlation feature subset FSF and sorts the subset, and, after no new features flow in, stores the top-ranked features, numbering half the size of the BSF, in the BSF; and the storage unit is used to store the BSF and FSF generated after each unit executes.