CN114004285A - Non-invasive load identification method based on improved kNN algorithm - Google Patents
Non-invasive load identification method based on improved kNN algorithm
- Publication number
- CN114004285A (application CN202111201436.3A)
- Authority
- CN
- China
- Prior art keywords
- amplitude
- track
- knn algorithm
- sample
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Supply And Distribution Of Alternating Current (AREA)
Abstract
The invention provides a non-invasive load identification method based on an improved kNN algorithm, comprising the following steps. Step S01: select the V-I trajectory as the load feature, extract the trajectory feature, and supplement it with amplitude features. Step S02: improve the kNN algorithm by assigning different weights to the training samples, so that minority-class samples carry a larger voting weight in the classification decision. Step S03: using the two load features extracted in step S01 (the binary V-I trajectory and the amplitude), introduce a comprehensive similarity that combines them through the improved kNN algorithm of step S02, determine the class of the sample under test, and thereby identify the load. On an imbalanced data set, the increased weight improves the identification accuracy for minority-class samples whose V-I trajectory shapes resemble those of majority classes, and the method also improves the identification accuracy for two kinds of electrical equipment that share the same front-end circuit topology but differ in power level.
Description
Technical Field
The invention relates to a non-invasive load identification method based on an improved kNN algorithm.
Background
The kNN algorithm classifies a sample under test by comparing its similarity with a large number of training samples. Its core idea is to select the K training samples most similar to the sample under test; the sample is then assigned to the class whose members among those K samples have the largest total similarity to it.
The disadvantage of the kNN algorithm is that, when the data set is imbalanced, training samples of the majority classes, which are numerous, are easily selected as the K nearest neighbors and interfere with the decision for minority classes.
Disclosure of Invention
1. Technical problem to be solved:
The conventional kNN algorithm has the problem that, when the data set is imbalanced, training samples of the majority classes, which are numerous, are easily selected as the K nearest neighbors, so that the decision for minority classes is disturbed.
2. Technical solution:
To solve the above problem, the invention provides a non-intrusive load identification method based on an improved kNN algorithm, comprising the following steps. Step S01: select the V-I trajectory as the load feature, extract the trajectory feature, and supplement it with amplitude features. Step S02: improve the kNN algorithm by assigning different weights to the training samples, so that minority-class samples carry a larger voting weight in the classification decision. Step S03: using the two load features extracted in step S01 (the binary V-I trajectory and the amplitude), introduce a comprehensive similarity that combines them through the improved kNN algorithm of step S02, determine the class of the sample under test, and thereby identify the load.
The amplitude features comprise the fundamental active power, the fundamental reactive power, the fundamental current amplitude, and the current amplitudes of the 3rd, 5th, and 7th harmonics during steady-state operation of the electrical equipment.
When the amplitude features are added, the amplitude and phase of the fundamental and of each harmonic can be obtained by applying a fast Fourier transform to the voltage and the current. The power is calculated as $P = \tfrac{1}{2} a_1 b_1 \cos\varphi_1$ and $Q = \tfrac{1}{2} a_1 b_1 \sin\varphi_1$, where $a_1$ and $b_1$ are the (peak) amplitudes of the fundamental voltage and current, respectively, and $\varphi_1$ is the phase difference between the two.
In step S02, the improved KNN algorithm specifically includes:
The training-sample weights are assigned as follows: $\mathrm{weight}(T_j) = 1/\mathrm{size}(C_{T_j})$, where $\mathrm{size}(C_{T_j})$ is the number of training samples contained in the class to which $T_j$ belongs.
Step S03 specifically includes:
Step S031: calculate the V-I trajectory similarity and the amplitude similarity between the sample under test and every training sample, denoted Sim1 and Sim2 respectively: $\mathrm{Sim1} = 1/(1 + \mathrm{dist1})$ and $\mathrm{Sim2} = 1/(1 + \mathrm{dist2})$, where dist1 and dist2 are, respectively, the V-I trajectory distance and the amplitude distance between two samples.
Step S032: sort the training samples in descending order of Sim1 and take the K training samples with the largest Sim1 as the K nearest neighbors of the current test sample.
Step S033: calculate the comprehensive similarity between the current test sample and each of its K nearest neighbors: $\mathrm{Sim}(a, T_j) = \mathrm{Sim1}(a, T_j) \times \mathrm{weight}(T_j) + \mathrm{Sim2}(a, T_j)$.
Step S034: for each class, sum the comprehensive similarities between the sample under test and the nearest neighbors belonging to that class, and take the class with the largest total as the prediction.
Both the dist1 and dist2 are Euclidean distances.
3. Advantageous effects:
On an imbalanced data set, the increased weight improves the identification accuracy for minority-class samples whose V-I trajectory shapes resemble those of majority classes, and the method also improves the identification accuracy for two kinds of electrical equipment that share the same front-end circuit topology but differ in power level.
Detailed Description
The present invention will be described in detail below.
The method addresses the problem that different types of electrical equipment whose front-end circuits have similar topologies cannot be distinguished, because the V-I trajectory lacks numerical (amplitude) features.
Step S01: The shape of the V-I trajectory of a piece of electrical equipment is related to the topology of its front-end circuit. This property allows electrical equipment to be grouped by function and reduces the requirement on the completeness of the database. The invention therefore first selects the V-I trajectory as the load feature. The trajectory feature is extracted by converting the original V-I trajectory into a binary V-I trajectory through a mapping [6,12], as follows. First, the waveforms of voltage u and current i, sampled at high frequency over one period of steady-state operation, are collected, and the original V-I trajectory is drawn with u as the abscissa and i as the ordinate. The voltage-current plane is then divided into a 2N × 2N grid, and the length (voltage span) and height (current span) of each cell are calculated as follows:
A two-dimensional matrix B of size 2N × 2N is initialized with every element set to 1, which is displayed as white. For each data point $(u_j, i_j)$ ($j = 1, 2, \dots, J$) of the original V-I trajectory, let $(x_j, y_j)$ be the index of the cell it occupies in B. If $0 < x_j < 2N + 1$ and $0 < y_j < 2N + 1$, the element $B(x_j, y_j)$ is set to 0, indicating that the V-I trajectory of the device passes through this cell, which is marked black.
As the above extraction method shows, the mapping is equivalent to normalizing the voltage and current data. The binary trajectory therefore contains only shape features, which reflect information such as the voltage-current phase difference, the load nonlinearity, and the harmonic characteristics; it contains no feature related to the power level. When the V-I trajectories of two types of electrical equipment are similar, misclassification is likely, so the invention adds the amplitude features as a further dimension to make the equipment easier to distinguish.
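The binary mapping of step S01 can be sketched in Python as follows. The cell-size and index formulas are not reproduced in this text, so the sketch assumes a grid centered on the origin with cell spans max|u|/N and max|i|/N and ceiling-based, 1-based indices. The function name `binary_vi_trajectory` and the default N = 16 are also illustrative assumptions rather than values taken from the patent.

```python
import numpy as np

def binary_vi_trajectory(u, i, n=16):
    """Map one steady-state period of voltage u and current i to a 2N x 2N binary matrix."""
    u = np.asarray(u, dtype=float)
    i = np.asarray(i, dtype=float)
    # ASSUMPTION: the grid is centered on the origin and each cell spans
    # max|u|/N volts by max|i|/N amperes (the source formula is not available).
    du = np.max(np.abs(u)) / n
    di = np.max(np.abs(i)) / n
    if du == 0 or di == 0:
        raise ValueError("voltage and current must not be identically zero")
    # ASSUMPTION: 1-based cell indices obtained with a ceiling and an offset of N.
    x = np.ceil(u / du).astype(int) + n
    y = np.ceil(i / di).astype(int) + n
    b = np.ones((2 * n, 2 * n), dtype=np.uint8)  # 1 = white (cell not visited)
    # Condition from the text: 0 < x_j < 2N + 1 and 0 < y_j < 2N + 1.
    inside = (x > 0) & (x < 2 * n + 1) & (y > 0) & (y < 2 * n + 1)
    b[x[inside] - 1, y[inside] - 1] = 0  # 0 = black (trajectory passes through)
    return b

# Usage: a 50 Hz sinusoidal voltage with a lagging sinusoidal current.
t = np.linspace(0.0, 0.02, 400, endpoint=False)
u = 311.0 * np.sin(2 * np.pi * 50 * t)
i = 5.0 * np.sin(2 * np.pi * 50 * t - 0.5)
b = binary_vi_trajectory(u, i)
```

Under these assumptions, scaling the current waveform leaves the matrix unchanged, which illustrates why the binary trajectory alone carries no power-level information.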
The amplitude features comprise the fundamental active power, the fundamental reactive power, the fundamental current amplitude, and the current amplitudes of the 3rd, 5th, and 7th harmonics during steady-state operation of the electrical equipment. The amplitude and phase of the fundamental and of each harmonic can be obtained by applying a fast Fourier transform to the voltage and the current. The power is calculated as:
$$P = \tfrac{1}{2} a_1 b_1 \cos\varphi_1, \qquad Q = \tfrac{1}{2} a_1 b_1 \sin\varphi_1$$
where $a_1$ and $b_1$ are the (peak) amplitudes of the fundamental voltage and current, respectively, and $\varphi_1$ is the phase difference between the two.
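A minimal sketch of the amplitude-feature extraction, assuming the input covers exactly one fundamental period with uniform sampling, so that FFT bin 1 is the fundamental and bin h is the h-th harmonic. The function name `amplitude_features`, the ordering of the output vector, and the sign convention (positive Q for a lagging current) are assumptions made for illustration.

```python
import numpy as np

def amplitude_features(u, i, harmonics=(3, 5, 7)):
    """Return [P, Q, I1, I3, I5, I7] from one period of voltage u and current i."""
    u = np.asarray(u, dtype=float)
    i = np.asarray(i, dtype=float)
    n = len(u)
    if n != len(i) or n <= 2 * max(harmonics):
        raise ValueError("need equal-length inputs with enough samples per period")
    u_spec = np.fft.rfft(u)
    i_spec = np.fft.rfft(i)
    # Single-sided peak amplitude of bin k (0 < k < n/2) is 2*|X_k|/n.
    a1 = 2.0 * np.abs(u_spec[1]) / n  # fundamental voltage amplitude
    b1 = 2.0 * np.abs(i_spec[1]) / n  # fundamental current amplitude
    # ASSUMPTION: phase difference taken as voltage phase minus current phase.
    phi = np.angle(u_spec[1]) - np.angle(i_spec[1])
    p = 0.5 * a1 * b1 * np.cos(phi)   # fundamental active power
    q = 0.5 * a1 * b1 * np.sin(phi)   # fundamental reactive power
    harmonic_amps = [2.0 * np.abs(i_spec[h]) / n for h in harmonics]
    return np.array([p, q, b1, *harmonic_amps])

# Usage: 5 A fundamental lagging by 60 degrees plus a 1 A third harmonic.
t = np.arange(200) / 200.0
u = 311.0 * np.sin(2 * np.pi * t)
i = 5.0 * np.sin(2 * np.pi * t - np.pi / 3) + 1.0 * np.sin(2 * np.pi * 3 * t)
features = amplitude_features(u, i)
```

If the recording spans several periods or is not synchronized to the fundamental, the bin indices would have to be adjusted; the source does not describe that case.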
Step S02: the improved kNN algorithm
The standard kNN algorithm proceeds as follows:
First, for a sample under test a, calculate its similarity to every training sample, sort the similarities in descending order, and take the first K training samples as the K nearest neighbors of a.
Second, for each class, sum the similarities between a and those of its K nearest neighbors that belong to the class; a is assigned to the class with the largest total similarity. The total similarity between sample a and class $C_i$ is
$$\mathrm{Sim}(a, C_i) = \sum_{j=1}^{K} \mathrm{Sim}(a, T_j)\,\delta(T_j, C_i)$$
where $T_j$ is the j-th nearest neighbor of the sample under test a, and $\delta(T_j, C_i)$ equals 1 if $T_j$ belongs to class $C_i$ and 0 otherwise, so that every neighbor belonging to $C_i$ increases the total similarity between a and $C_i$. The final class of a is
$$C(a) = \arg\max_{C_i} \mathrm{Sim}(a, C_i)$$
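The two steps of the standard similarity-sum kNN decision can be sketched as follows. The function name `knn_vote` and the toy similarities are hypothetical and serve only to illustrate the rule.

```python
from collections import defaultdict

def knn_vote(similarities, train_labels, k):
    """Return the class with the largest summed similarity among the K nearest neighbors."""
    # Step 1: indices of the K training samples most similar to the query.
    nearest = sorted(range(len(similarities)),
                     key=lambda j: similarities[j], reverse=True)[:k]
    # Step 2: accumulate the similarity per class over those K neighbors.
    totals = defaultdict(float)
    for j in nearest:
        totals[train_labels[j]] += similarities[j]
    return max(totals, key=totals.get), dict(totals)

# Usage: the single most similar sample is of class "A", but class "B"
# holds two of the three nearest neighbors and wins on total similarity.
sims = [0.9, 0.8, 0.7, 0.85, 0.1]
labels = ["A", "B", "B", "B", "A"]
predicted, totals = knn_vote(sims, labels, k=3)
```

The usage example also hints at the imbalance problem discussed next: a class with many training samples tends to place more members among the K neighbors and so accumulates a larger total.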
The disadvantage of the kNN algorithm is that, when the data set is imbalanced, training samples of the majority classes, which are numerous, are easily selected as the K nearest neighbors and interfere with the decision for minority classes. There are two kinds of solution. The first uses under-sampling or over-sampling: majority-class samples are deleted or minority-class samples are synthesized until the two kinds of samples are close in number, which removes the imbalance of the data set. The second improves the algorithm itself: different weights are assigned to the training samples so that minority-class samples carry a larger voting weight in the classification decision. To avoid deleting useful data or introducing redundant data, the invention adopts the second kind of solution and modifies the total similarity between sample a and class $C_i$ to:
$$\mathrm{Sim}(a, C_i) = \sum_{j=1}^{K} \mathrm{Sim}(a, T_j)\,\mathrm{weight}(T_j)\,\delta(T_j, C_i)$$
where $\delta(T_j, C_i)$ equals 1 if $T_j$ belongs to class $C_i$ and 0 otherwise, and $\mathrm{weight}(T_j)$ is the weight of training sample $T_j$. When weights are assigned, minority-class samples should receive large weights and majority-class samples small weights. The assignment is:
$$\mathrm{weight}(T_j) = \frac{1}{\mathrm{size}(C_{T_j})}$$
where $\mathrm{size}(C_{T_j})$ is the number of training samples contained in the class to which $T_j$ belongs.
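The inverse-class-size weighting can be illustrated with a short sketch. The helper names `inverse_class_size_weights` and `weighted_knn_vote` and the toy data are hypothetical; only the rule weight = 1/size(class) comes from the text.

```python
from collections import Counter, defaultdict

def inverse_class_size_weights(train_labels):
    """weight(T_j) = 1 / (number of training samples in the class of T_j)."""
    counts = Counter(train_labels)
    return [1.0 / counts[label] for label in train_labels]

def weighted_knn_vote(similarities, train_labels, k):
    """Similarity-sum kNN vote in which each neighbor is scaled by its class weight."""
    weights = inverse_class_size_weights(train_labels)
    nearest = sorted(range(len(similarities)),
                     key=lambda j: similarities[j], reverse=True)[:k]
    totals = defaultdict(float)
    for j in nearest:
        totals[train_labels[j]] += similarities[j] * weights[j]
    return max(totals, key=totals.get), dict(totals)

# Usage: 8 majority samples ("A") and 2 minority samples ("B").
labels = ["A"] * 8 + ["B"] * 2
sims = [0.9, 0.88, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.8, 0.05]
predicted, totals = weighted_knn_vote(sims, labels, k=3)
```

In this toy case an unweighted vote over the three nearest neighbors would favor the majority class "A" (0.9 + 0.88 against 0.8), whereas the weighted totals favor the minority class "B".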
Step S03: class decision based on the comprehensive similarity. Once the weight-based similarity calculation has been determined, the next step is to decide the class of the sample under test with a decision rule. Because the invention extracts two load features, the binary V-I trajectory and the amplitude, it introduces the concept of comprehensive similarity, which combines the two features, and determines the class of the sample under test from it. The process is as follows.
Step S031: calculate the V-I trajectory similarity and the amplitude similarity between the sample under test and every training sample, denoted Sim1 and Sim2 respectively:
$\mathrm{Sim1} = 1/(1 + \mathrm{dist1})$
$\mathrm{Sim2} = 1/(1 + \mathrm{dist2})$
where dist1 and dist2 are, respectively, the V-I trajectory distance and the amplitude distance between two samples; both are Euclidean distances.
Step S032: sort the training samples in descending order of Sim1 and take the K training samples with the largest Sim1 as the K nearest neighbors of the current test sample.
Step S033: calculate the comprehensive similarity between the current test sample and each of its K nearest neighbors:
$\mathrm{Sim}(a, T_j) = \mathrm{Sim1}(a, T_j) \times \mathrm{weight}(T_j) + \mathrm{Sim2}(a, T_j)$
Step S034: for each class, sum the comprehensive similarities between the sample under test and the nearest neighbors belonging to that class, and take the class with the largest total as the prediction.
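Steps S031 to S034 can be put together in a short sketch. The function name `classify_load`, the argument layout, and the toy data are assumptions for illustration; the text does not state whether the amplitude features are normalized before dist2 is computed, so the sketch uses them as given.

```python
import numpy as np
from collections import Counter, defaultdict

def classify_load(query_traj, query_amp, train_traj, train_amp, train_labels, k=5):
    """Predict the class of one sample from binary trajectories and amplitude features."""
    m = len(train_labels)
    train_traj = np.asarray(train_traj, dtype=float).reshape(m, -1)
    train_amp = np.asarray(train_amp, dtype=float).reshape(m, -1)
    # S031: Euclidean distances turned into similarities Sim1 and Sim2.
    dist1 = np.linalg.norm(train_traj - np.ravel(query_traj), axis=1)
    dist2 = np.linalg.norm(train_amp - np.ravel(query_amp), axis=1)
    sim1 = 1.0 / (1.0 + dist1)
    sim2 = 1.0 / (1.0 + dist2)
    # weight(T_j) = 1 / size of the class of T_j.
    counts = Counter(train_labels)
    weight = np.array([1.0 / counts[c] for c in train_labels])
    # S032: the K training samples with the largest Sim1.
    nearest = np.argsort(-sim1, kind="stable")[:k]
    # S033 and S034: comprehensive similarity, summed per class.
    totals = defaultdict(float)
    for j in nearest:
        totals[train_labels[j]] += sim1[j] * weight[j] + sim2[j]
    return max(totals, key=totals.get)

# Usage: identical trajectories, so only the amplitude separates the classes.
train_traj = [[0, 1], [0, 1], [0, 1], [0, 1]]
train_amp = [[10.0], [11.0], [100.0], [101.0]]
train_labels = ["low", "low", "high", "high"]
predicted = classify_load([0, 1], [100.5], train_traj, train_amp, train_labels, k=4)
```

Note that, as in the text, the neighbors are selected by Sim1 alone, and Sim2 enters only the final vote. This is why two loads with the same trajectory shape but different power levels can still be separated.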
Claims (7)
1. A non-intrusive load identification method based on an improved kNN algorithm, comprising the following steps: step S01: selecting a V-I trajectory as the load feature, extracting the trajectory feature, and supplementing the extracted trajectory feature with amplitude features; step S02: improving the kNN algorithm by assigning different weights to the training samples in the kNN algorithm, thereby increasing the voting weight of minority-class samples in the classification decision; step S03: using the two load features extracted in step S01, namely the binary V-I trajectory and the amplitude, introducing a comprehensive similarity that combines the two load features through the improved kNN algorithm of step S02, determining the class of the sample under test, and identifying the load.
2. The method of claim 1, wherein the amplitude features include the fundamental active power, the fundamental reactive power, the fundamental current amplitude, and the current amplitudes of the 3rd, 5th, and 7th harmonics during steady-state operation of the electrical equipment.
3. The method of claim 2, wherein: when the amplitude features are added, the amplitude and phase of the fundamental and of each harmonic are obtained by applying a fast Fourier transform to the voltage and the current, and the power is calculated as $P = \tfrac{1}{2} a_1 b_1 \cos\varphi_1$ and $Q = \tfrac{1}{2} a_1 b_1 \sin\varphi_1$, where $a_1$ and $b_1$ are the (peak) amplitudes of the fundamental voltage and current, respectively, and $\varphi_1$ is the phase difference between the two.
5. The method of claim 3, wherein: the training-sample weights are assigned as follows: $\mathrm{weight}(T_j) = 1/\mathrm{size}(C_{T_j})$, where $\mathrm{size}(C_{T_j})$ is the number of training samples contained in the class to which $T_j$ belongs.
6. The method of any one of claims 1 to 5, wherein step S03 specifically includes: step S031: calculating the V-I trajectory similarity and the amplitude similarity between the sample under test and every training sample, denoted Sim1 and Sim2 respectively: $\mathrm{Sim1} = 1/(1 + \mathrm{dist1})$ and $\mathrm{Sim2} = 1/(1 + \mathrm{dist2})$, where dist1 and dist2 are, respectively, the V-I trajectory distance and the amplitude distance between two samples; step S032: sorting the training samples in descending order of Sim1 and taking the K training samples with the largest Sim1 as the K nearest neighbors of the current test sample; step S033: calculating the comprehensive similarity between the current test sample and each of its K nearest neighbors: $\mathrm{Sim}(a, T_j) = \mathrm{Sim1}(a, T_j) \times \mathrm{weight}(T_j) + \mathrm{Sim2}(a, T_j)$; step S034: for each class, summing the comprehensive similarities between the sample under test and the nearest neighbors belonging to that class, and taking the class with the largest total as the prediction.
7. The method of claim 6, wherein: both the dist1 and dist2 are Euclidean distances.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111201436.3A CN114004285A (en) | 2021-10-15 | 2021-10-15 | Non-invasive load identification method based on improved kNN algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111201436.3A CN114004285A (en) | 2021-10-15 | 2021-10-15 | Non-invasive load identification method based on improved kNN algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114004285A (en) | 2022-02-01 |
Family
ID=79922987
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111201436.3A (Pending; CN114004285A) | 2021-10-15 | 2021-10-15 | Non-invasive load identification method based on improved kNN algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114004285A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117290802A (en) * | 2023-11-27 | 2023-12-26 | 惠州市鑫晖源科技有限公司 | Host power supply operation monitoring method based on data processing |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117290802A (en) * | 2023-11-27 | 2023-12-26 | 惠州市鑫晖源科技有限公司 | Host power supply operation monitoring method based on data processing |
CN117290802B (en) * | 2023-11-27 | 2024-03-26 | 惠州市鑫晖源科技有限公司 | Host power supply operation monitoring method based on data processing |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110752410B (en) | Method for rapidly sorting and recombining retired lithium battery | |
CN110443281B (en) | Text classification self-adaptive oversampling method based on HDBSCAN (high-density binary-coded decimal) clustering | |
CN110398663B (en) | Flexible direct current power grid fault identification method based on convolutional neural network | |
CN109829497B (en) | Supervised learning-based station area user identification and discrimination method | |
CN112732748B (en) | Non-invasive household appliance load identification method based on self-adaptive feature selection | |
CN110147760B (en) | Novel efficient electric energy quality disturbance image feature extraction and identification method | |
CN103136587A (en) | Power distribution network operating state classification recognition method based on support vector machine | |
CN111046913B (en) | Load abnormal value identification method | |
CN110068776B (en) | Three-level inverter open-circuit fault diagnosis method based on optimized support vector machine | |
CN110738232A (en) | grid voltage out-of-limit cause diagnosis method based on data mining technology | |
CN114359674A (en) | Non-invasive load identification method based on metric learning | |
CN116449218B (en) | Lithium battery health state estimation method | |
CN112287980B (en) | Power battery screening method based on typical feature vector | |
CN109359665A (en) | A kind of family's electric load recognition methods and device based on support vector machines | |
CN112036450B (en) | High-voltage cable partial discharge mode identification method and system based on transfer learning | |
CN114004285A (en) | Non-invasive load identification method based on improved kNN algorithm | |
CN115879048A (en) | Series arc fault identification method and system based on WRFMDA model | |
CN108898182A (en) | A kind of MMC method for diagnosing faults based on core pivot element analysis and support vector machines | |
CN111651932A (en) | Online dynamic security assessment method for power system based on integrated classification model | |
CN116796271A (en) | Resident energy abnormality identification method | |
CN116699446A (en) | Method, device, equipment and storage medium for rapidly sorting retired batteries | |
CN113191419B (en) | Sag homologous event detection and type identification method based on track key point matching and region division | |
CN117907835A (en) | New energy battery fault diagnosis method | |
CN105823964B (en) | Power transmission line comprehensive Fault Locating Method towards intelligent substation | |
CN117171586A (en) | Household transformer relation identification method and system based on current sequence similarity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||