CN113408622A - Non-invasive load identification method and system considering characteristic quantity information expression difference - Google Patents

Info

Publication number
CN113408622A
Authority
CN
China
Prior art keywords
feature
sample
membership
clustering center
membership degree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110685729.7A
Other languages
Chinese (zh)
Inventor
鞠文杰
张海静
包圣
吕志星
程思瑾
秦承龙
王一
王沈征
张虓
张利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
TaiAn Power Supply Co of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
TaiAn Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC and TaiAn Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority to CN202110685729.7A
Publication of CN113408622A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides a non-invasive load identification method and system that account for differences in how feature quantities express information. The method includes: non-invasively acquiring device power consumption data and performing feature extraction and feature screening to form an optimal feature subset; solving the weight vector corresponding to the optimal feature subset based on the entropy weight method, while initializing the membership matrix and the clustering centers; applying feature weighting when calculating the Euclidean distance of the clustering algorithm, integrating the weight vector into the distance calculation as a multiplier to form a weighted distance matrix between the sample points and the clustering centers; and calculating the membership degrees and clustering centers of the samples to be identified according to the membership update formula and clustering center update formula obtained through the feature weighting improvement, iterating multiple times to form an optimal membership matrix, and outputting the category and working state of each sample to be identified. The method tightly couples the features with the identification algorithm and can effectively enhance the algorithm's generalization capability and identification accuracy.

Description

Non-invasive load identification method and system considering characteristic quantity information expression difference
Technical Field
The disclosure belongs to the technical field of intelligent power utilization and energy efficiency monitoring, and particularly relates to a non-invasive load identification method and system that account for differences in feature quantity information expression.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
The power internet of things is one of strategic development targets of a power system, and is a key technology for realizing power information acquisition, analysis, management and application by efficiently utilizing various means. Based on the high-quality monitoring information acquired from the user load port, the using state of the user power equipment can be analyzed and identified, namely, the non-invasive load monitoring of the user is realized, so that powerful support is provided for the user to manage the self energy consumption and the electricity expense, the electric power company to strengthen the load side management and the like. Therefore, the research of non-invasive load identification is receiving attention and is urgently needed to be advanced.
Non-invasive load identification is generally based on knowledge of the load types and determination of the load characteristics, and can be classified as a pattern recognition problem. Neural networks and deep learning have self-learning and associative memory capabilities and are therefore widely applied to non-invasive load identification. The document 'Songxahi Safan, Zhou Ming, Tu Jing, Li Heng Yin. A non-invasive load monitoring method based on k-NN combined with kernel Fisher discrimination [J]. Automation of Electric Power Systems, 2018, 42(06): 73-80' combines the simplicity of the k-NN algorithm with the nonlinear discrimination capability of kernel Fisher classification to identify household appliance types with similar characteristics, and obtains a good identification effect. However, when this method is applied to superimposed appliance operating modes, it performs poorly against the feature masking that can occur between appliances whose feature quantities differ greatly, even though this is also an identification problem involving similar characteristics. In fact, most residential household appliances are low-power devices whose power changes relatively little during a state transition, whereas a high-power device shows a pronounced power step when its working state changes. This can partially or wholly mask the features of other devices on the time axis, so an identification method with stronger generalization, i.e. a better ability to adapt to various scenarios, needs to be developed.
As an unsupervised learning method, the fuzzy clustering algorithm can identify loads without learning from a training set, which means it places fewer restrictions on the sample set than supervised learning methods and is therefore more widely applicable. The most widely used variant, the fuzzy C-means (FCM) clustering algorithm, has been applied successfully to non-invasive load identification, but it suffers from problems such as subjective selection of the clustering centers and the fuzzy index and susceptibility to noise interference. To avoid subjective randomness in determining the clustering centers, improvement measures such as introducing a fast hill-climbing function method, using the K-means algorithm together with a granularity principle, or incorporating a clustering validity function can be adopted. These strategies make the selection of the clustering centers more objective. Because the clustering centers are the main basis on which the identification algorithm determines the appliance mode type and strongly influence the final identification result, related research has mainly enhanced generalization by optimizing the parameter settings of the algorithm. The document 'Shi Liangyuan, Zhouyanjun, Zhang Wujun, Yuhu, Li Bin, Wanglon. A load classification method adopting deep learning and multi-dimensional fuzzy C-means clustering [J]. Proceedings of the CSU-EPSA, 2019, 31(07): 43-50' combines the conventional Euclidean distance with an improved dynamic time warping distance that considers trend sequence features into a new similarity distance, highlighting the importance of trend features on the basis of the original algorithm, making the classification results more reasonable, and widening the algorithm's range of application.
The above research improves the fuzzy C-means clustering algorithm to a certain extent. However, the Euclidean distance calculation of the FCM algorithm treats all feature quantities as equally important, which has adverse effects: the influence of differences in feature quantity information expression on identification is not fundamentally considered, and the identification requirements under complex and diverse load characteristics are difficult to meet.
Disclosure of Invention
In order to overcome the defects of the prior art, the present disclosure provides a non-invasive load identification method that accounts for differences in feature quantity information expression. The information expression capability of the feature quantities is fully considered during load identification, so that the generalization capability and identification accuracy of the algorithm can be effectively improved.
In order to achieve the above object, one or more embodiments of the present disclosure provide the following technical solutions:
In a first aspect, a non-invasive load identification method that accounts for differences in feature quantity information expression is disclosed, which includes:
non-invasively acquiring device power consumption data and performing feature extraction and feature screening to form an optimal feature subset;
solving the weight vector corresponding to the optimal feature subset based on the entropy weight method, while initializing the membership matrix and the clustering centers;
applying feature weighting when calculating the Euclidean distance of the clustering algorithm, and integrating the weight vector into the distance calculation as a multiplier to form a weighted distance matrix between the sample points and the clustering centers;
and calculating the membership degrees and clustering centers of the samples to be identified according to the membership update formula and clustering center update formula obtained through the feature weighting improvement, iterating multiple times to form an optimal membership matrix, and outputting the category and working state of each sample to be identified.
In a further technical solution, when the device power consumption data are acquired and feature extraction and feature screening are performed, transient and steady-state features are first extracted from the data to establish an initial feature set; a feature screening method then optimizes the initial feature set to obtain the optimal feature subset used for subsequent load identification.
In a further technical solution, the entropy value of each feature in the optimal feature subset is calculated, the weight coefficient expressing each feature's importance is calculated from its entropy value, and the weight coefficients corresponding to the feature values in each dimension of the non-invasive load sample together form the weight vector.
In a second aspect, a non-invasive load identification system that accounts for differences in feature quantity information expression is disclosed, comprising:
an optimal feature subset acquisition module configured to: non-invasively acquire device power consumption data and perform feature extraction and feature screening to form an optimal feature subset;
a weight vector acquisition module configured to: solve the weight vector corresponding to the optimal feature subset based on the entropy weight method, while initializing the membership matrix and the clustering centers;
a weighted distance matrix module configured to: apply feature weighting when calculating the Euclidean distance of the clustering algorithm, and integrate the weight vector into the distance calculation as a multiplier to form a weighted distance matrix between the sample points and the clustering centers;
an identification module configured to: calculate the membership degrees and clustering centers of the samples to be identified according to the membership update formula and clustering center update formula obtained through the feature weighting improvement, iterate multiple times to form an optimal membership matrix, and output the category and working state of each sample to be identified.
The above one or more technical solutions have the following beneficial effects:
Research on non-invasive load identification for residential users shows that the generalization and accuracy of existing methods still need improvement. To address the differences among electrical feature quantities in expressing the power consumption characteristics of different devices, the entropy weight method is used to assign weight coefficients that more closely match the actual importance of each feature quantity. The feature weights are introduced into the objective function of the identification algorithm, and optimization yields a weighted, corrected distance matrix between the sample points and the clustering centers. This tightly couples the features with the identification algorithm and can effectively enhance the algorithm's generalization capability and identification accuracy.
Feature quantities do not have equal standing in the load identification process, because different feature quantities carry different power consumption information. By configuring the weight coefficients of the feature quantities with the entropy weight method, the invention objectively measures the differing standing of the feature quantities in identification. It further provides a distance-improved fuzzy C-means clustering algorithm for non-invasive load identification that constructs a distance metric matrix with feature weighting correction, so that the corrections to the membership matrix and the clustering centers are tightly coupled with the information expression capability of the feature quantities, giving the algorithm stronger generalization capability and higher identification accuracy.
The entropy weight method is used to evaluate feature importance and solve the weight coefficient of each feature, reflecting how differently the load features express the power consumption information. The feature weights are introduced into the objective function of the identification algorithm, and optimization yields a weighted, corrected distance matrix between the sample points and the clustering centers. The method thus accounts for how differences in the way each feature represents the original power consumption information affect the distance between a sample and a class center, and overcomes the feature masking problem in practical identification.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and are not to limit the disclosure.
FIG. 1 is a flowchart illustrating an embodiment of the present disclosure for non-intrusive load identification of residential users;
fig. 2 is a non-intrusive load identification result of scenario 1 in accordance with an embodiment of the present disclosure;
fig. 3 is a non-intrusive load identification result of scenario 2 of the present disclosure;
fig. 4 is a non-intrusive load identification result of scenario 3 of the present disclosure;
FIG. 5 is a comparison of the load recognition accuracy results of the standard FCM algorithm and the method of the present invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments according to the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
Example one
This embodiment discloses a non-invasive load identification method that accounts for differences in feature quantity information expression, based on fuzzy C-means clustering. It analyzes and reveals the limitation of using a Euclidean distance with equal feature importance as the similarity criterion, configures the feature weight coefficients with the entropy weight method, and constructs a distance metric matrix with feature weighting correction, so that the corrections to the membership matrix and the clustering centers in the identification algorithm are tightly coupled with the information expression capability of the feature quantities. In addition, 3 example application scenarios are set up to verify the method under the various conditions that arise when different appliance modes are mixed. The invention fully considers the information expression capability of the feature quantities during load identification and can effectively improve the generalization capability and identification accuracy of the algorithm.
The implementation example of the disclosure specifically comprises the following steps:
(1) carrying out feature extraction and feature screening on the equipment power utilization data to form an optimal feature subset;
(2) solving a weight vector W corresponding to the optimal feature subset based on an entropy weight method principle, and simultaneously carrying out initialization processing on a membership matrix and a clustering center;
(3) performing characteristic weighting processing when calculating Euclidean distance of a fuzzy C-means clustering algorithm, and integrating a weight vector W into distance calculation in a form of a multiplier to form a weighted distance matrix of a sample point and a clustering center;
(4) calculating the membership degree and the clustering center of the sample to be identified according to a membership degree updating formula and a clustering center updating formula obtained by characteristic weighting improvement, performing multiple iterations to form an optimal membership degree matrix, and outputting the category and the working state of the sample to be identified;
in the embodiment, the optimal membership matrix of the sample to be identified can be obtained through the identification algorithm, and the corresponding type and working state of the electric appliance can be found according to the maximum membership value in the matrix.
(5) constructing 3 example application scenarios to verify the identification effect of the proposed method. The principle of the whole process is as follows:
In step (1), transient and steady-state features are extracted from the device power consumption data: 14 steady-state features and 6 transient features, giving an initial feature set of 20 feature quantities that serve as input to the load identification algorithm. Feature optimization is then performed on the initial feature set using a feature screening method, and 6 of the 20 feature quantities are selected to form the optimal feature subset used for subsequent load identification.
The electricity consumption data comprise current, voltage, power factor, cycle current, cycle voltage, harmonic current and harmonic voltage.
Whether the feature quantities are selected properly determines the quality of non-invasive load identification. Feature selection should improve relevance and eliminate redundancy, so the initial feature set is optimized to form an optimal feature subset, which ultimately improves the accuracy of load identification.
The entropy weight in step (2) is the weight of each evaluation index determined once a specific set of evaluation objects is given. In terms of its value, the entropy weight represents the amount of useful information each index provides for a particular problem. Compared with subjective weighting methods, the weights configured by the entropy weight method are closer to the actual distribution of importance, and the results are more objective, more interpretable and more precise. The entropy weight method also applies to most problems that require weights to be determined, so it is introduced here to evaluate feature importance. Applied to non-invasive load identification: the greater the differences in the original power consumption information of the samples that a feature expresses, the smaller the calculated entropy value, the more useful information the feature provides, and the more important the feature is as an evaluation index of the classification problem, i.e. the larger its weight coefficient.
The entropy weight method measures the weight coefficient of each feature type, which expresses the importance of that feature, as follows. First, the entropy value of each feature type is calculated as shown in equation (1):
$$e_i = -\frac{1}{\ln C}\sum_{j=1}^{C} p_{ij}\ln p_{ij} \tag{1}$$
$$p_{ij} = \frac{r_{ij}}{\sum_{j=1}^{C} r_{ij}} \tag{2}$$
In equations (1) and (2), for samples with $f$-dimensional features and the problem of expressing the power consumption information of $C$ sample classes, $e_i$ denotes the entropy value of the $i$-th feature type, $i = 1, 2, \dots, f$; $r_{ij}$ denotes the measured data, specifically the $i$-th feature quantity of the $j$-th sample class; and $p_{ij}$ denotes the proportion of the total power consumption information accounted for by the useful information that the $i$-th feature type expresses about the $j$-th sample class.
Second, the entropy weight of each feature type is calculated according to equation (3), which gives the weight coefficient expressing the importance of that feature:
$$\omega_i = \frac{1 - e_i}{\sum_{k=1}^{f} (1 - e_k)} \tag{3}$$
Therefore, once the feature values of each dimension of the non-invasive load sample are given, the weight vector $W = [\omega_1, \omega_2, \dots, \omega_f]$ corresponding to the feature quantities can be obtained by applying equations (1) to (3) in sequence.
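The entropy weight calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the commonly cited form of the entropy weight method (class shares, normalized Shannon entropy, weights proportional to one minus the entropy), the function name `entropy_weights` and the input matrix `r` are hypothetical, and the numbers are made up rather than taken from the patent.

```python
import math

def entropy_weights(r):
    """r[i][j]: non-negative value of feature i for sample class j.
    Returns one weight per feature; the weights sum to 1.
    Assumes every row has a positive sum and at least one feature
    is not perfectly uniform across classes."""
    n_classes = len(r[0])
    entropies = []
    for row in r:
        total = sum(row)
        # Share of feature i's total attributed to each class.
        p = [v / total for v in row]
        # Shannon entropy normalised to [0, 1]; 0 * ln(0) is taken as 0.
        e = -sum(pj * math.log(pj) for pj in p if pj > 0) / math.log(n_classes)
        entropies.append(e)
    denom = sum(1.0 - e for e in entropies)
    return [(1.0 - e) / denom for e in entropies]

# Hypothetical data: 2 features observed over 3 sample classes.
r = [
    [1.0, 1.0, 1.0],   # identical across classes: entropy near 1, weight near 0
    [0.1, 1.0, 5.0],   # varies strongly across classes: lower entropy, larger weight
]
w = entropy_weights(r)
```

A feature that takes the same value for every class cannot separate the classes, so it receives almost no weight, which matches the text's statement that a smaller entropy value means a more important feature.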
Step (2) also initializes the membership matrix and the clustering centers of the load identification algorithm by setting them to zero vectors.
Step (3) applies the feature weighting improvement on the basis of the standard fuzzy C-means clustering algorithm, tightly coupling load identification with the information expression capability of the feature quantities. The fuzzy C-means clustering algorithm finds the optimal partition of the objects by optimizing an objective function; it is commonly used in pattern recognition and determines the category of a sample to be identified mainly through the sample's fuzzy membership degree. The objective function is shown in equation (4):
$$J = \sum_{i=1}^{N}\sum_{j=1}^{C} u_{ij}^{m}\, d_{ij}^{2} \tag{4}$$
The feature quantities extracted from the power consumption data are the $x_i$ in the expanded form of $d_{ij}$ in the objective function, and the weight vector is the quantity substituted in when $d_{ij}$ is corrected below.
Here $X = \{x_1, x_2, \dots, x_N\}$ is the sample set, and for each $x_i$ the corresponding weight coefficients $\omega(k)$ can be obtained by the entropy weight method. $N$ is the number of samples; $C$ is the number of classes into which the samples are divided, also called the number of clusters; $m$ is the fuzziness parameter, also called the weighting exponent of the membership degree, generally set to $m = 2$ in calculation. $x_i = [x_{i1}, x_{i2}, \dots, x_{it}]$ is the $i$-th sample in the sample set, where $t$ is the number of feature types; $c_j = [c_{j1}, c_{j2}, \dots, c_{jt}]$ denotes the $j$-th clustering center; and $u_{ij}$ denotes the membership degree of the $i$-th sample in the $j$-th class. Let $h_{ij} = x_i - c_j$; then $d_{ij} = \|h_{ij}\| = \|x_i - c_j\|$ is the Euclidean distance from sample $x_i$ to clustering center $c_j$.
The objective function is subject to the constraint that, for any single sample, the membership degrees over all clustering centers sum to 1, i.e.
$$\sum_{j=1}^{C} u_{ij} = 1, \qquad u_{ij} \in [0,1], \quad 1 \le i \le N, \quad 1 \le j \le C.$$
Combining equation (4) with the membership constraint, a Lagrange multiplier $\lambda$ is introduced to construct a Lagrange function. Setting its partial derivatives to 0 yields the membership update formula (5) and the clustering center update formula (6):
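$$u_{ij} = \frac{1}{\sum_{s=1}^{C} \left( d_{ij} / d_{is} \right)^{2/(m-1)}} \tag{5}$$
$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m}\, x_i}{\sum_{i=1}^{N} u_{ij}^{m}} \tag{6}$$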
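The text does not spell out the derivation. A sketch of the standard Lagrange-multiplier argument, assuming the usual FCM objective $\sum_i \sum_j u_{ij}^m d_{ij}^2$ with one multiplier $\lambda_i$ per sample (the per-sample indexing of $\lambda$ and the summation index $s$ are assumptions made here), runs as follows:

```latex
\begin{aligned}
L &= \sum_{i=1}^{N}\sum_{j=1}^{C} u_{ij}^{m} d_{ij}^{2}
     + \sum_{i=1}^{N} \lambda_i \Bigl( 1 - \sum_{j=1}^{C} u_{ij} \Bigr) \\
\frac{\partial L}{\partial u_{ij}} &= m\, u_{ij}^{m-1} d_{ij}^{2} - \lambda_i = 0
  \;\Rightarrow\; u_{ij} = \Bigl( \frac{\lambda_i}{m\, d_{ij}^{2}} \Bigr)^{1/(m-1)} \\
\sum_{s=1}^{C} u_{is} = 1
  &\;\Rightarrow\; \Bigl( \frac{\lambda_i}{m} \Bigr)^{1/(m-1)}
     = \frac{1}{\sum_{s=1}^{C} d_{is}^{-2/(m-1)}}
  \;\Rightarrow\; u_{ij} = \frac{1}{\sum_{s=1}^{C} \bigl( d_{ij} / d_{is} \bigr)^{2/(m-1)}} \\
\frac{\partial L}{\partial c_j} &= -2 \sum_{i=1}^{N} u_{ij}^{m} (x_i - c_j) = 0
  \;\Rightarrow\; c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} x_i}{\sum_{i=1}^{N} u_{ij}^{m}}
\end{aligned}
```

The last line uses $d_{ij}^2 = \|x_i - c_j\|^2$, so the clustering center is the mean of the samples weighted by $u_{ij}^m$.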
When the fuzzy C-means clustering algorithm is solved, the membership degrees and clustering centers are in fact calculated iteratively according to equations (5) and (6) until the result meets the iteration termination condition or the maximum number of iterations is reached, namely
$$\left\| U^{(l+1)} - U^{(l)} \right\| < \varepsilon$$
where $U^{(l)}$ is the membership matrix at iteration $l$, $l$ is the iteration number, and $\varepsilon$ is the error threshold. After the iteration is completed, the membership degrees calculated according to equation (5) form the optimal membership matrix, and load identification is achieved by finding the optimal membership vector corresponding to each sample in that matrix.
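The iteration just described can be sketched as a small pure-Python routine for the standard (unweighted) FCM. Several choices here are assumptions rather than the patent's procedure: the memberships start from random values instead of zero vectors (an all-zero membership matrix would make the center update divide by zero), the stopping rule uses the largest absolute change in any membership, and the names `fcm`, `eps` and `max_iter` are hypothetical.

```python
import math
import random

def fcm(samples, n_clusters, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Standard fuzzy C-means. Returns (membership matrix, cluster centers)."""
    rng = random.Random(seed)
    n, t = len(samples), len(samples[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(max_iter):
        # Center update: mean of the samples weighted by u_ij ** m.
        centers = []
        for j in range(n_clusters):
            wts = [u[i][j] ** m for i in range(n)]
            total = sum(wts)
            centers.append([sum(wts[i] * samples[i][k] for i in range(n)) / total
                            for k in range(t)])
        # Membership update from Euclidean distances to the new centers.
        new_u = []
        for x in samples:
            d = [max(math.dist(x, c), 1e-12) for c in centers]
            new_u.append([1.0 / sum((d[j] / d[s]) ** (2.0 / (m - 1.0))
                                    for s in range(n_clusters))
                          for j in range(n_clusters)])
        # Stop once the largest membership change falls below eps.
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(n_clusters))
        u = new_u
        if change < eps:
            break
    return u, centers

# Hypothetical two-feature samples forming two well-separated groups.
samples = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
u, centers = fcm(samples, n_clusters=2)
```

Each row of `u` sums to 1, as the membership constraint requires, and the row's largest entry indicates the cluster a sample is assigned to.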
The membership degree in the fuzzy C-means clustering algorithm is a very important concept in fuzzy theory. It embodies the conversion of load classification from an absolute judgment (either 0 or 1) into fuzzy recognition: the membership degree of a sample to be identified expresses the degree to which the sample belongs to each cluster, and the corresponding appliance type and working state can be found from the maximum membership value in the membership matrix. Suppose a sample set $X$ is given and the number of sample classes is $C$. After clustering identification, the sample set is decomposed into the subsets $X_1, X_2, \dots, X_C$ of the individual classes. The relationship between these subsets is shown in equation (7):
$$X_1 \cup X_2 \cup \dots \cup X_C = X, \qquad X_i \cap X_j = \varnothing \ \ (i \ne j) \tag{7}$$
When a sample to be identified $x_j$ is input, where $x_j$ denotes the $j$-th sample in the sample set, the membership degree, which measures the similarity between a sample element and a sample class, is obtained in actual calculation from a membership function:
$$u_{ij} = u_{X_i}(x_j) \tag{8}$$
Equation (8) gives the membership function of the sample to be identified $x_j$ with respect to the sample subset $X_i$ of class $i$. The membership $u_{ij}$ is a scalar with values in $[0,1]$, so the category to which $x_j$ belongs can be determined from how far $u_{ij}$ deviates within $[0,1]$, and the calculation ensures that the membership values of each sample to be identified sum to 1. The membership function yields the membership value of a single sample; when the samples to be identified form a set $X = \{x_1, x_2, \dots, x_j, \dots, x_g\}$, where $g$ is the number of samples to be identified, the membership matrix is obtained by applying the formula to each sample.
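Reading a category off the membership matrix by its maximum entry can be illustrated as follows. This is a minimal sketch: the helper name `assign_labels` and the membership values are hypothetical, and the matrix is laid out with one row per sample and one column per class, which is an assumed layout.

```python
def assign_labels(membership):
    """membership[j][i]: degree to which sample j belongs to class i.
    Returns, for each sample, the index of the class with the largest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in membership]

# Hypothetical memberships of 2 samples over 3 classes; each row sums to 1.
U = [
    [0.85, 0.10, 0.05],
    [0.20, 0.30, 0.50],
]
labels = assign_labels(U)  # [0, 2]
```

Mapping each class index to an appliance type and working state would require a lookup table built from the clustering centers, which is not shown here.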
The feature weighting improvement strategy in step (3) is detailed as follows:
In the standard fuzzy C-means clustering algorithm, the distance between each sample point and each cluster center is calculated and used as the basis for classifying the sample points. During optimization of the objective function, the contribution of each load feature to the clustering loss depends only on the magnitude of the feature quantity; that is, all features carry equal weight and the influence of the feature category is ignored. However, different features differ in importance for load identification. To highlight the influence of different features on the identification result, the invention uses the weight coefficients of the feature quantities to correct the distance metric matrix of the standard FCM algorithm and applies feature weighting when calculating the Euclidean distance, i.e., d_ij is modified to d'_ij, forming a weighted distance matrix between the sample points and the cluster centers.
In essence, the Euclidean distance combined with feature weights modifies h_ij: for the same sample x, the value of each element of the input matrix is changed according to the element (the kind of feature) within the sample. That is, h_ij = x_i − c_j is improved to h'_ij = {ω(k)(x_ik − c_jk)}, where k = 1, 2, …, t and ω(k) is the weight coefficient of each type of feature calculated in equation (3), so that d_ij = ||h_ij|| is modified to d'_ij = ||h'_ij||. The specific transformation is as follows:
h_ij = [x_i1 − c_j1, x_i2 − c_j2, …, x_it − c_jt] (9)
Then the Euclidean distance d_ij from sample x_i to cluster center c_j is calculated as shown in equation (10):
Figure BDA0003124530750000101
The standard distance metric matrix d is shown in equation (11):
d = [d_1, d_2, …, d_i, …, d_N] (11)
In formula (11), d_i denotes the Euclidean distance from the ith sample point to its home cluster center (the closest cluster center), d_i = min[d_i1, d_i2, …, d_ij, …, d_iC], where d_ij is calculated as shown in formula (10) and becomes formula (12) after weighted correction:
Figure BDA0003124530750000111
Substituting formula (12) back into formula (11) turns the standard distance metric matrix d into the weighted distance matrix d', which changes the membership degree update formula in the subsequent derivation. It can be seen from formula (6) that it does not contain d_ij; in addition, the number of sample classes is given during identification and the load types do not change, so the feature weighting does not affect the calculation of the cluster centers.
Step (3) improves the standard FCM algorithm by taking into account how differences in the way each feature represents the original electricity consumption information affect the distance between a sample and the class centers. This addresses the problem that the membership degree struggles to reflect the class of the sample to be identified, and it enhances the interpretability of load identification. The distance metric matrix transformation shows that introducing the feature weight coefficients fine-tunes the membership degree update formula, which can improve clustering quality to a certain extent.
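The weighted distance described above can be sketched in a few lines. This is an illustration derived only from the definitions h'_ij = {ω(k)(x_ik − c_jk)} and d'_ij = ||h'_ij|| given in the text; the function name and the sample values are hypothetical, and the exact form of formula (12) in the original figure may differ.

```python
import numpy as np

def weighted_distance_matrix(X, centers, w):
    """Feature-weighted Euclidean distances d'_ij = ||w * (x_i - c_j)||.

    X: (N, t) samples, centers: (C, t) cluster centers, w: (t,) feature weights.
    Returns an (N, C) matrix.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    w = np.asarray(w, dtype=float)
    h = X[:, None, :] - centers[None, :, :]   # h_ij for every sample/center pair
    return np.linalg.norm(w * h, axis=2)      # norm of the weighted difference h'_ij

X = [[0.0, 0.0], [3.0, 4.0]]
centers = [[0.0, 0.0], [3.0, 0.0]]

d_plain = weighted_distance_matrix(X, centers, [1.0, 1.0])  # unit weights: standard d_ij
d_weighted = weighted_distance_matrix(X, centers, [1.0, 0.0])  # second feature ignored
d_home = d_weighted.min(axis=1)  # d_i: distance to the closest center, as in formula (11)
```

With unit weights the function reduces to the standard Euclidean distance, which makes the effect of the weights easy to check: setting a weight to zero removes that feature from the distance entirely.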
The calculation flow of step (4) is as follows:
The specific flow of non-intrusive load identification for residential users based on distance-improved fuzzy C-means clustering is shown in FIG. 1.
(1) The measured raw electricity consumption data are preprocessed, divided into transient and steady-state intervals using the minimum spanning tree principle, and then passed through feature extraction and feature selection to obtain the optimal feature subset;
(2) The weight vector W corresponding to the feature set is obtained by combining the entropy weight method with the silhouette coefficient evaluation, and the membership matrix and the cluster centers are initialized;
(3) The membership degrees and cluster centers of the samples to be identified are calculated according to the membership degree update formula and the cluster center update formula obtained from the feature weighting improvement;
(4) The iteration condition is checked: if
Figure BDA0003124530750000112
holds, where l is the iteration number and ε is the error threshold, or if the number of iterations exceeds the set value, the flow proceeds to the next step; otherwise it returns to step (3) to continue updating the membership degrees and cluster centers;
(5) The category and state of the samples to be identified are output according to the optimal membership matrix obtained after iteration, and the final identification accuracy is calculated; the corresponding appliance category and working state are found from the maximum membership value in the optimal membership matrix.
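Steps (2) to (5) of the flow can be sketched as a short loop. The sketch makes several assumptions that the text does not confirm: a fuzzifier m = 2, random initialization of the membership matrix, the textbook membership and center updates, and a stopping test on the largest change in membership between iterations (the actual criterion is in a figure not reproduced here). The function name `weighted_fcm` and the toy data are hypothetical.

```python
import numpy as np

def weighted_fcm(X, w, n_clusters, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Fuzzy C-means with feature-weighted distances (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    rng = np.random.default_rng(seed)
    # Step (2): random membership matrix, each row normalized to sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for iteration in range(1, max_iter + 1):
        # Step (3a): center update; it uses memberships only, not the distances.
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Step (3b): membership update from the feature-weighted distances d'_ij.
        d = np.linalg.norm(w * (X[:, None, :] - centers[None, :, :]), axis=2)
        d = np.maximum(d, 1e-12)  # guard against a sample sitting on a center
        ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
        U_new = 1.0 / ratio.sum(axis=2)
        # Step (4): assumed criterion, largest membership change below eps.
        converged = np.max(np.abs(U_new - U)) < eps
        U = U_new
        if converged:
            break
    return U, centers, iteration

# Two well-separated toy groups with two features each (made-up data).
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
     [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
U, centers, n_iter = weighted_fcm(X, w=[0.5, 0.5], n_clusters=2)
labels = U.argmax(axis=1)  # Step (5): class with the maximum membership
```

Cluster indices are arbitrary, so in practice each cluster still has to be mapped to an appliance type and working state before an identification accuracy can be computed.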
The verification data in step (5) come from a competition data set provided by a company. The data sampling frequency is 6400 Hz, and 11 household appliances were tested: an electric fan (YD1), a microwave oven (YD2), a hot water kettle (YD3), a computer (YD4), an incandescent lamp (YD5), an energy-saving lamp (YD6), a printer (YD7), a water dispenser (YD8), a wall-mounted air conditioner (YD9), a blower (YD10) and a television (YD11). Three example application scenarios are constructed as follows:
Scene 1: load identification with a mixture of several finite multi-state devices (a finite multi-state device has multiple power operating modes, and its power level changes in steps)
Scene 2: load identification with a mixture of several on-off two-state devices (an on-off two-state device has only two power operating modes, off and on)
Scene 3: mixed load identification with all device types (on-off two-state devices + finite multi-state devices + continuously variable-state devices) (a continuously variable-state device has a continuously varying power operating mode, and its power level fluctuates sharply)
In step (5), the sampled data are identified following flow steps (1) to (5); the specific results are shown in FIGS. 2-4.
The identification results of the 3 scenes show that the proposed method identifies loads well across application scenarios, can effectively recover the time-of-use power consumption details of each device from the aggregate power consumption data, and generalizes well.
Accuracy comparison and verification are carried out on scene 3; the identification accuracy results are shown in FIG. 5. The standard fuzzy C-means clustering algorithm makes no basic judgment about the relationship between features. Using the Euclidean distance that accounts for feature weights as the basis for assigning sample points to cluster centers compensates for this shortcoming to a certain extent, gives a better sample classification, and markedly improves load identification accuracy.
The method evaluates feature importance using the entropy weight method and solves for the weight coefficient of each feature, reflecting the differences in how load features express electricity consumption information; the feature weights are then introduced into the objective function of the identification algorithm, and the optimization yields a weighted distance correction matrix between sample points and cluster centers that accounts for feature weights.
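For readers unfamiliar with the entropy weight method, the following sketch shows the textbook version: features whose values are spread unevenly across samples have lower entropy and receive larger weights. It is not the patent's formula (3), which is not reproduced in this section and additionally involves the silhouette coefficient evaluation; the function name, the min-max normalization and the sample matrix are assumptions made for illustration.

```python
import numpy as np

def entropy_weights(X):
    """Textbook entropy weight method for an (n_samples, n_features) matrix."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Min-max normalize each feature; a constant feature maps to all ones.
    span = X.max(axis=0) - X.min(axis=0)
    safe_span = np.where(span > 0, span, 1.0)
    Z = np.where(span > 0, (X - X.min(axis=0)) / safe_span, 1.0)
    # Share of each sample within a feature column.
    P = Z / Z.sum(axis=0)
    # Entropy per feature, using the convention 0 * log(0) = 0.
    log_P = np.log(np.where(P > 0, P, 1.0))
    e = -(P * log_P).sum(axis=0) / np.log(n)
    # Degree of divergence; clipped to absorb floating-point round-off.
    d = np.clip(1.0 - e, 0.0, None)
    if d.sum() == 0:
        return np.full(X.shape[1], 1.0 / X.shape[1])  # no feature is informative
    return d / d.sum()

# Made-up data: the first feature is constant, the second varies across samples.
X = [[1.0, 10.0],
     [1.0, 20.0],
     [1.0, 30.0],
     [1.0, 40.0]]
w = entropy_weights(X)
```

In this toy case the constant feature carries no information and receives a weight of essentially zero, so the weight vector concentrates on the second feature.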
Example two
It is an object of this embodiment to provide a computing device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the program.
Example three
An object of the present embodiment is to provide a computer-readable storage medium.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
Example four
The objective of this embodiment is to provide a non-intrusive load identification system that accounts for differences in feature quantity information expression, comprising:
an optimal feature subset acquisition module configured to: perform feature extraction and feature screening on non-invasively acquired equipment power consumption data to form an optimal feature subset;
a weight vector acquisition module configured to: solve the weight vector corresponding to the optimal feature subset based on the entropy weight method, and initialize the membership matrix and the cluster centers;
a weighted distance matrix module configured to: apply feature weighting when calculating the Euclidean distance of the clustering algorithm, integrating the weight vector into the distance calculation as a multiplier to form a weighted distance matrix between the sample points and the cluster centers;
an identification module configured to: calculate the membership degrees and cluster centers of the samples to be identified according to the membership degree update formula and the cluster center update formula obtained from the feature weighting improvement, iterate multiple times to form the optimal membership degree matrix, and output the category and working state of the samples to be identified.
The steps involved in the apparatuses of Examples two, three and four above correspond to the method of Example one, and their detailed description can be found in the relevant description of Example one. The term "computer-readable storage medium" should be taken to include a single medium or multiple media containing one or more sets of instructions; it should also be understood to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by a processor and that causes the processor to perform any of the methods of the present disclosure.
Those skilled in the art will appreciate that the modules or steps of the present disclosure described above can be implemented using general purpose computer means, or alternatively, they can be implemented using program code executable by computing means, whereby the modules or steps may be stored in memory means for execution by the computing means, or separately fabricated into individual integrated circuit modules, or multiple modules or steps thereof may be fabricated into a single integrated circuit module. The present disclosure is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Although the present disclosure has been described with reference to specific embodiments, it should be understood that the scope of the present disclosure is not limited thereto, and those skilled in the art will appreciate that various modifications and changes can be made without departing from the spirit and scope of the present disclosure.

Claims (10)

1. A non-invasive load identification method considering characteristic quantity information expression difference is characterized by comprising the following steps:
performing feature extraction and feature screening on non-invasively acquired equipment power consumption data to form an optimal feature subset;
solving a weight vector corresponding to the optimal feature subset based on an entropy weight method principle, and simultaneously carrying out initialization processing on a membership matrix and a clustering center;
performing characteristic weighting processing when calculating Euclidean distance of a clustering algorithm, and integrating a weight vector into distance calculation in a multiplier mode to form a weighted distance matrix of a sample point and a clustering center;
and calculating the membership degree and the clustering center of the sample to be identified according to a membership degree updating formula and a clustering center updating formula obtained by characteristic weighting improvement, iterating for multiple times to form an optimal membership degree matrix, and outputting the category and the working state of the sample to be identified.
2. The non-invasive load identification method considering the difference in the expression of the feature quantity information as claimed in claim 1, wherein when the power consumption data of the equipment is obtained and feature extraction and feature screening are performed, the power consumption data of the equipment is firstly subjected to transient and steady feature extraction to establish an initial feature set, and the feature screening method performs feature optimization on the initial feature set to obtain an optimal feature subset for subsequent load identification.
3. The method as claimed in claim 1, wherein entropy values of the features of the optimal feature subset are calculated, a weight coefficient for expressing feature importance is calculated based on the entropy values of the features, and a weight coefficient corresponding to the feature value of each dimension of the non-invasive load sample is finally obtained, and the weight vector is formed by the weight coefficient.
4. The method of claim 1, wherein the non-intrusive load identification method based on the difference of the characteristic quantity information expression includes:
establishing an objective function:
Figure FDA0003124530740000021
x_i = [x_i1, x_i2, …, x_it] is the ith sample in the sample set, t represents the number of feature classes, c_j = [c_j1, c_j2, …, c_jt] denotes the jth cluster center, and u_ij represents the degree of membership of the ith sample in the jth class; letting h_ij = x_i − c_j, d_ij = ||h_ij|| = ||x_i − c_j|| represents the Euclidean distance from sample x_i to cluster center c_j.
5. The method of claim 4, wherein the constraint of the objective function is that, for a single sample, the sum of its membership degrees to all cluster centers is 1, that is
Figure FDA0003124530740000022
u_ij ∈ [0,1], 1 ≤ i ≤ N, 1 ≤ j ≤ C.
6. The non-invasive load identification method considering characteristic quantity information expression difference as claimed in claim 4, wherein a Lagrange multiplier λ is introduced into the objective function to construct a Lagrangian function, and its partial derivatives are set to 0 to derive the membership degree update formula and the cluster center update formula.
7. The method as claimed in claim 1, wherein the fuzzy C-means clustering algorithm is used for solving, the process of iteratively calculating the membership degree and the clustering center according to the membership degree updating formula and the clustering center updating formula is continuously performed, when the final result meets the iteration termination condition or reaches the iteration times, the membership degree calculated according to the membership degree updating formula is an optimal membership degree matrix, and the load identification can be realized by finding the corresponding optimal membership degree vector of the sample in the membership degree matrix.
8. A non-invasive load identification system considering characteristic quantity information expression difference, characterized by comprising:
an optimal feature subset acquisition module configured to: perform feature extraction and feature screening on non-invasively acquired equipment power consumption data to form an optimal feature subset;
a weight vector acquisition module configured to: solving a weight vector corresponding to the optimal feature subset based on an entropy weight method principle, and simultaneously carrying out initialization processing on a membership matrix and a clustering center;
a weighted distance matrix module configured to: performing characteristic weighting processing when calculating Euclidean distance of a clustering algorithm, and integrating a weight vector into distance calculation in a multiplier mode to form a weighted distance matrix of a sample point and a clustering center;
an identification module configured to: and calculating the membership degree and the clustering center of the sample to be identified according to a membership degree updating formula and a clustering center updating formula obtained by characteristic weighting improvement, iterating for multiple times to form an optimal membership degree matrix, and outputting the category and the working state of the sample to be identified.
9. A computing device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of the method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the steps of the method according to any one of the preceding claims 1 to 7.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110685729.7A CN113408622A (en) 2021-06-21 2021-06-21 Non-invasive load identification method and system considering characteristic quantity information expression difference

Publications (1)

Publication Number Publication Date
CN113408622A (en) 2021-09-17

Family

ID=77681929

Country Status (1)

Country Link
CN (1) CN113408622A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110907762A (en) * 2019-12-10 2020-03-24 深圳供电局有限公司 Non-invasive load matching identification method
CN111612074A (en) * 2020-05-22 2020-09-01 王彬 Identification method and device of non-invasive load monitoring electric equipment and related equipment
CN112152313A (en) * 2020-07-08 2020-12-29 宁波三星医疗电气股份有限公司 Method for identifying power equipment by acquisition system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
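于依: "Research on a power load pattern recognition algorithm based on characteristic index dimensionality reduction and an improved entropy weight method", China Master's Theses Full-text Database, Engineering Science and Technology II *
何聪: "Online analysis method for comprehensive load characteristics and development of its automatic modeling platform", China Master's Theses Full-text Database, Engineering Science and Technology II *
王国伟 et al.: "Research on a fuzzy C-means clustering algorithm weighted by the entropy weight method", Agriculture Network Information *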

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114280352A (en) * 2021-12-27 2022-04-05 杭州电子科技大学 Current-based large instrument man-hour calculation method
CN114280352B (en) * 2021-12-27 2024-02-13 杭州电子科技大学 Current-based large instrument working hour calculation method
CN117932311A (en) * 2024-03-21 2024-04-26 杭州可当科技有限公司 Intelligent user identification method of intelligent internet terminal based on 5G network
CN117932311B (en) * 2024-03-21 2024-05-31 杭州可当科技有限公司 Intelligent user identification method of intelligent internet terminal based on 5G network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210917