CN114004285A - Non-invasive load identification method based on improved kNN algorithm - Google Patents

Non-invasive load identification method based on improved kNN algorithm

Info

Publication number
CN114004285A
Authority
CN
China
Prior art keywords
amplitude
track
knn algorithm
sample
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111201436.3A
Other languages
Chinese (zh)
Inventor
王新迪
卞海红
潘柯言
王新策
房可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Institute of Technology
Original Assignee
Nanjing Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Institute of Technology
Priority to CN202111201436.3A
Publication of CN114004285A
Current legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147: Distances to closest patterns, e.g. nearest neighbour classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention provides a non-invasive load identification method based on an improved kNN algorithm, comprising the following steps. Step S01: select the V-I trajectory as the load feature, extract the trajectory feature, and add amplitude features to the extracted trajectory feature. Step S02: improve the kNN algorithm by assigning different weights to the training samples, so that minority-class samples carry a larger voting weight in the classification decision. Step S03: using the two load features extracted in step S01 (the binary V-I trajectory and the amplitude), introduce the concept of comprehensive similarity, combine the two features with the improved kNN algorithm of step S02, determine the class of the sample under test, and identify the load. On an imbalanced data set, the increased weight improves the identification accuracy for minority-class samples whose V-I trajectory shapes resemble those of a majority class, and the method also improves the identification accuracy for two types of electrical equipment that have the same front-end circuit topology but different power levels.

Description

Non-invasive load identification method based on improved kNN algorithm
Technical Field
The invention relates to a non-invasive load identification method based on an improved kNN algorithm.
Background
The kNN algorithm classifies a sample under test by comparing its similarity with a large number of training samples. Its core idea is to select the K training samples most similar to the sample under test; the sample is then assigned to whichever class among those K training samples has the largest total similarity to it.
The disadvantage of the kNN algorithm is that, when the data set is imbalanced, training samples from majority classes, which have many samples, are more likely to be selected as the K nearest neighbors and therefore interfere with the decision for minority classes.
Disclosure of Invention
1. The technical problem to be solved:
The conventional kNN algorithm has the problem that, when the data set is imbalanced, training samples from majority classes, which have many samples, are more likely to be selected as the K nearest neighbors, so that the decision for minority classes is disturbed.
2. The technical scheme is as follows:
In order to solve the above problem, the present invention provides a non-intrusive load identification method based on an improved kNN algorithm, comprising the following steps. Step S01: select the V-I trajectory as the load feature, extract the trajectory feature, and add amplitude features to the extracted trajectory feature. Step S02: improve the kNN algorithm by assigning different weights to the training samples, so that minority-class samples carry a larger voting weight in the classification decision. Step S03: using the two load features extracted in step S01 (the binary V-I trajectory and the amplitude), introduce the concept of comprehensive similarity, combine the two features with the improved kNN algorithm of step S02, determine the class of the sample under test, and identify the load.
The amplitude features comprise the fundamental active power, the fundamental reactive power, the fundamental current amplitude, and the current amplitudes of the 3rd, 5th, and 7th harmonics during steady-state operation of the electrical equipment.
With the amplitude features added, the amplitude and phase of the fundamental and of each harmonic can be obtained by applying the fast Fourier transform to the voltage and current. The power is calculated as:
$$P_1 = \tfrac{1}{2} a_1 b_1 \cos\varphi_1, \qquad Q_1 = \tfrac{1}{2} a_1 b_1 \sin\varphi_1$$
where $a_1$ and $b_1$ are the amplitudes of the fundamental voltage and current, respectively, and $\varphi_1$ is the phase difference between the two.
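In step S02, the improved kNN algorithm computes the total similarity between the sample under test $a$ and class $C_i$ as:
$$\mathrm{Sim}(a, C_i) = \sum_{T_j \in \mathrm{kNN}(a)} \mathrm{Sim}(a, T_j)\,\mathrm{weight}(T_j)\,\delta(T_j, C_i)$$
where $\mathrm{weight}(T_j)$ is the weight of training sample $T_j$, and $\delta(T_j, C_i)$ is 1 if $T_j$ belongs to class $C_i$ and 0 otherwise.
The sample weights are assigned as follows:
$$\mathrm{weight}(T_j) = \frac{1}{\mathrm{size}(C_{T_j})}$$
where $\mathrm{size}(C_{T_j})$ is the number of training samples in the class to which $T_j$ belongs.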
Step S03 specifically comprises the following sub-steps.
Step S031: calculate the V-I trajectory similarity and the amplitude similarity between the sample under test and every training sample, denoted Sim1 and Sim2 respectively:
$$\mathrm{Sim1} = \frac{1}{1 + \mathrm{dist1}}, \qquad \mathrm{Sim2} = \frac{1}{1 + \mathrm{dist2}}$$
where dist1 and dist2 are the V-I trajectory distance and the amplitude distance between two samples, respectively.
Step S032: sort the training samples in descending order of Sim1 and take the K training samples with the largest Sim1 as the K nearest neighbors of the current test sample.
Step S033: calculate the comprehensive similarity between the current test sample and each of the K nearest neighbors:
$$\mathrm{Sim}(a, T_j) = \mathrm{Sim1}(a, T_j) \times \mathrm{weight}(T_j) + \mathrm{Sim2}(a, T_j)$$
Step S034: for each class among the K nearest neighbors, sum the comprehensive similarity between the sample under test and the neighbors of that class, and take the class with the largest total comprehensive similarity as the prediction.
Both dist1 and dist2 are Euclidean distances.
3. Beneficial effects:
On an imbalanced data set, the increased weight improves the identification accuracy for minority-class samples whose V-I trajectory shapes resemble those of a majority class, and the method also improves the identification accuracy for two types of electrical equipment that have the same front-end circuit topology but different power levels.
Detailed Description
The present invention will be described in detail below.
The method addresses the problem that different types of electrical equipment with similar front-end circuit topologies cannot be distinguished, because the V-I trajectory lacks numerical (amplitude) features.
Step S01: The shape of the V-I trajectory of electrical equipment is related to the topology of its front-end circuit. This property allows equipment to be grouped by function and reduces the requirement on the completeness of the database, so the invention first selects the V-I trajectory as the load feature. The trajectory feature is extracted by mapping the original V-I trajectory to a binary V-I trajectory [6,12], as follows. First, high-frequency waveform data of the voltage u and current i over one period of steady-state operation of the electrical equipment are collected, and the original V-I trajectory is drawn with u as the abscissa and i as the ordinate. The voltage-current plane is divided into 2N × 2N cells, and the length (voltage span) and height (current span) of each cell are calculated as:
Figure BDA0003304950240000023
A 2-dimensional matrix B of dimension 2N × 2N is initialized with every element set to 1, displayed as white. For each data point $(u_j, i_j)$ ($j = 1, 2, \ldots, J$) of the original V-I trajectory, let $(x_j, y_j)$ be the index of the cell it occupies in matrix B. If $0 < x_j < 2N + 1$ and $0 < y_j < 2N + 1$, the element $B(x_j, y_j)$ is set to 0, indicating that the V-I trajectory of the equipment passes through this cell, which is marked black:
Figure BDA0003304950240000031
As the binary trajectory extraction method above shows, the mapping is equivalent to normalizing the voltage and current data. The trajectory therefore contains only shape features, which reflect information such as the voltage-current phase difference, load nonlinearity, and harmonic characteristics, and contains no feature related to the power level. When the V-I trajectories of two types of electrical equipment are similar, misjudgment is likely, so the invention adds amplitude features to improve the distinguishability of the equipment.
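The binary mapping can be sketched as follows. This is a minimal illustration, not the patent's exact procedure: the two formulas for the cell size and the cell index are shown only as figures in the source, so the expressions used here (cell size equal to the peak absolute value divided by N, index equal to the ceiling of the scaled sample plus N) are assumptions, and the function name `binary_vi_trajectory` is hypothetical.

```python
import math

def binary_vi_trajectory(u, i, n):
    """Map one period of voltage/current samples to a 2N x 2N binary matrix.

    Every cell starts at 1 (white); a cell is set to 0 (black) when the
    trajectory passes through it. The cell-size and index formulas are
    assumptions, since the source gives them only as figures.
    Both signals are assumed to be non-zero.
    """
    size = 2 * n
    grid = [[1] * size for _ in range(size)]
    du = max(abs(v) for v in u) / n  # assumed cell length (voltage span)
    di = max(abs(c) for c in i) / n  # assumed cell height (current span)
    for uj, ij in zip(u, i):
        x = math.ceil(uj / du) + n   # assumed 1-based index along the voltage axis
        y = math.ceil(ij / di) + n   # assumed 1-based index along the current axis
        # Bounds check as stated in the text: 0 < x < 2N + 1 and 0 < y < 2N + 1.
        if 0 < x < size + 1 and 0 < y < size + 1:
            grid[x - 1][y - 1] = 0   # B(x_j, y_j) = 0
    return grid
```

Because both axes are scaled by their own peak values, two loads whose current waveforms differ only by a constant factor map to the same matrix under these assumptions, which is the loss of power-level information described above.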
The amplitude features comprise the fundamental active power, the fundamental reactive power, the fundamental current amplitude, and the current amplitudes of the 3rd, 5th, and 7th harmonics during steady-state operation of the electrical equipment. The amplitude and phase of the fundamental and of each harmonic can be obtained by applying the fast Fourier transform to the voltage and current. The power is calculated as:
$$P_1 = \tfrac{1}{2} a_1 b_1 \cos\varphi_1, \qquad Q_1 = \tfrac{1}{2} a_1 b_1 \sin\varphi_1$$
where $a_1$ and $b_1$ are the amplitudes of the fundamental voltage and current, respectively, and $\varphi_1$ is the phase difference between the two.
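A possible way to compute the six amplitude features from one period of samples is sketched below. It makes several assumptions: the amplitudes are peak values (hence the factor 1/2 in the power), the input covers exactly one fundamental period, and a direct single-bin DFT is used in place of an FFT library call, which yields the same coefficients for the few harmonics needed. The function names `harmonic` and `amplitude_features` are hypothetical.

```python
import cmath
import math

def harmonic(signal, h):
    """Return (peak amplitude, phase) of the h-th harmonic.

    Assumes `signal` holds exactly one fundamental period of samples.
    """
    n = len(signal)
    coeff = sum(s * cmath.exp(-2j * math.pi * h * k / n)
                for k, s in enumerate(signal))
    return 2 * abs(coeff) / n, cmath.phase(coeff)

def amplitude_features(u, i):
    """Return [P1, Q1, I1, I3, I5, I7] for voltage samples u and current samples i."""
    a1, phase_u = harmonic(u, 1)          # fundamental voltage amplitude
    b1, phase_i = harmonic(i, 1)          # fundamental current amplitude
    phi = phase_u - phase_i               # phase difference between the two
    p = 0.5 * a1 * b1 * math.cos(phi)     # fundamental active power
    q = 0.5 * a1 * b1 * math.sin(phi)     # fundamental reactive power
    return [p, q, b1] + [harmonic(i, h)[0] for h in (3, 5, 7)]
```

For example, a 311 V peak sinusoidal voltage with a 2 A peak current lagging by 60 degrees gives an active power of about 155.5 W under these assumptions.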
Step S02: Improved kNN algorithm
The conventional kNN algorithm proceeds as follows:
First, for a sample under test $a$, calculate its similarity to every training sample, sort the similarities in descending order, and take the first K training samples as the K nearest neighbors of $a$.
Second, for each class, sum the similarities between $a$ and those of the K nearest neighbors that belong to the class; $a$ is assigned to the class with the largest total similarity. For example, the total similarity between sample $a$ and class $C_i$ is
$$\mathrm{Sim}(a, C_i) = \sum_{T_j \in \mathrm{kNN}(a)} \mathrm{Sim}(a, T_j)\,\delta(T_j, C_i)$$
$$\delta(T_j, C_i) = \begin{cases} 1, & T_j \in C_i \\ 0, & T_j \notin C_i \end{cases}$$
where $T_j$ is the j-th nearest neighbor of the sample under test $a$. If $T_j$ belongs to class $C_i$, the total similarity between $a$ and $C_i$ increases. The final class of $a$ is:
$$C(a) = \arg\max_{C_i} \mathrm{Sim}(a, C_i)$$
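The similarity-sum vote of the conventional kNN algorithm can be illustrated with a short sketch. The function name `knn_predict` and the input layout (a list of precomputed similarities plus a parallel list of class labels) are choices made for this example only.

```python
from collections import defaultdict

def knn_predict(similarities, labels, k):
    """Conventional kNN vote by total similarity.

    similarities[j] is Sim(a, T_j); labels[j] is the class of T_j.
    Returns the predicted class and the per-class totals.
    """
    # Sort training samples by similarity, largest first, and keep the first K.
    order = sorted(range(len(labels)), key=lambda j: similarities[j], reverse=True)
    totals = defaultdict(float)
    for j in order[:k]:
        totals[labels[j]] += similarities[j]   # delta(T_j, C_i) selects the class
    return max(totals, key=totals.get), dict(totals)
```

With similarities `[0.9, 0.5, 0.45, 0.4, 0.1]`, labels `["fan", "lamp", "lamp", "lamp", "lamp"]`, and K = 4, the single most similar sample is a fan, yet the three lamp neighbors together outvote it.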
The disadvantage of the kNN algorithm is that, when the data set is imbalanced, training samples from majority classes, which have many samples, are more likely to be selected as the K nearest neighbors and therefore interfere with the decision for minority classes. There are two types of solution. The first uses under-sampling or over-sampling: majority-class samples are deleted or minority-class samples are synthesized so that the class sizes become close, which removes the imbalance. The second improves the algorithm: different weights are assigned to the training samples so that minority-class samples carry a larger voting weight in the classification decision. In order not to delete useful data or introduce redundant data, the invention adopts the second type of solution and modifies the total similarity between sample $a$ and class $C_i$ to:
$$\mathrm{Sim}(a, C_i) = \sum_{T_j \in \mathrm{kNN}(a)} \mathrm{Sim}(a, T_j)\,\mathrm{weight}(T_j)\,\delta(T_j, C_i)$$
where $\mathrm{weight}(T_j)$ is the weight of training sample $T_j$, and $\delta(T_j, C_i)$ is 1 if $T_j$ belongs to class $C_i$ and 0 otherwise. The weights should follow the principle that minority-class samples receive large weights and majority-class samples receive small weights. They are assigned as follows:
$$\mathrm{weight}(T_j) = \frac{1}{\mathrm{size}(C_{T_j})}$$
where $\mathrm{size}(C_{T_j})$ is the number of training samples in the class to which $T_j$ belongs.
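The effect of the inverse-class-size weight can be sketched as follows. The function names `class_weights` and `weighted_knn_predict` are hypothetical, and the class sizes are counted over the whole training set, which is how the definition of size above is read here.

```python
from collections import Counter, defaultdict

def class_weights(labels):
    """weight(T_j) = 1 / size(C_Tj), with sizes counted over all training labels."""
    counts = Counter(labels)
    return [1.0 / counts[c] for c in labels]

def weighted_knn_predict(similarities, labels, k):
    """kNN vote in which each neighbor's similarity is scaled by its class weight."""
    weights = class_weights(labels)
    order = sorted(range(len(labels)), key=lambda j: similarities[j], reverse=True)
    totals = defaultdict(float)
    for j in order[:k]:
        totals[labels[j]] += similarities[j] * weights[j]
    return max(totals, key=totals.get), dict(totals)
```

With similarities `[0.9, 0.5, 0.45, 0.4, 0.1]`, labels `["fan", "lamp", "lamp", "lamp", "lamp"]`, and K = 4, the fan total is 0.9 × 1 while the lamp total is 1.35 × 0.25, so the minority class now wins the vote.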
Step S03: Class decision based on comprehensive similarity. Once the weighted similarity calculation is determined, the next step is to decide the class of the sample under test with a decision rule. Because the invention extracts two load features, the binary V-I trajectory and the amplitude, the concept of comprehensive similarity is introduced: the two features are combined, and the class of the sample under test is determined from the comprehensive similarity. The process is as follows.
Step S031: calculate the V-I trajectory similarity and the amplitude similarity between the sample under test and every training sample, denoted Sim1 and Sim2 respectively:
$$\mathrm{Sim1} = \frac{1}{1 + \mathrm{dist1}}, \qquad \mathrm{Sim2} = \frac{1}{1 + \mathrm{dist2}}$$
where dist1 and dist2 are the V-I trajectory distance and the amplitude distance between two samples, respectively; both are Euclidean distances.
Step S032: sort the training samples in descending order of Sim1 and take the K training samples with the largest Sim1 as the K nearest neighbors of the current test sample.
Step S033: calculate the comprehensive similarity between the current test sample and each of the K nearest neighbors:
$$\mathrm{Sim}(a, T_j) = \mathrm{Sim1}(a, T_j) \times \mathrm{weight}(T_j) + \mathrm{Sim2}(a, T_j)$$
Step S034: for each class among the K nearest neighbors, sum the comprehensive similarity between the sample under test and the neighbors of that class, and take the class with the largest total comprehensive similarity as the prediction.
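Steps S031 to S034 can be put together in a short sketch. It assumes that each sample is represented by a flattened binary trajectory vector and an amplitude feature vector, and that the class weight is the inverse class size counted over the training set; the function names and the tuple layout are hypothetical.

```python
import math
from collections import Counter, defaultdict

def euclidean(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def classify(sample, training, k):
    """Class decision by comprehensive similarity.

    sample:   (trajectory_vector, amplitude_vector)
    training: list of (trajectory_vector, amplitude_vector, label)
    """
    counts = Counter(label for _, _, label in training)
    scored = []
    for traj, amp, label in training:
        sim1 = 1.0 / (1.0 + euclidean(sample[0], traj))  # S031: trajectory similarity
        sim2 = 1.0 / (1.0 + euclidean(sample[1], amp))   # S031: amplitude similarity
        scored.append((sim1, sim2, label))
    scored.sort(key=lambda s: s[0], reverse=True)        # S032: rank by Sim1 only
    totals = defaultdict(float)
    for sim1, sim2, label in scored[:k]:
        # S033: Sim = Sim1 * weight + Sim2, with weight = 1 / class size.
        totals[label] += sim1 / counts[label] + sim2
    return max(totals, key=totals.get)                   # S034: largest class total
```

In this sketch, one kettle sample and three lamp samples that share the same trajectory vector are separated by the amplitude term: a test sample with a kettle-like amplitude is assigned to the kettle class even though lamps are the majority.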

Claims (7)

1. A non-intrusive load identification method based on an improved kNN algorithm, comprising the following steps: step S01: selecting a V-I trajectory as a load feature, extracting a trajectory feature, and adding amplitude features to the extracted trajectory feature; step S02: improving the kNN algorithm by assigning different weights to the training samples in the kNN algorithm, so as to increase the voting weight of minority-class samples in the classification decision; step S03: using the two load features extracted in step S01, namely the binary V-I trajectory and the amplitude, introducing a concept of comprehensive similarity, combining the two load features by means of the improved kNN algorithm of step S02, determining the class of the sample under test, and identifying the load.
2. The method of claim 1, wherein the amplitude features comprise the fundamental active power, the fundamental reactive power, the fundamental current amplitude, and the current amplitudes of the 3rd, 5th, and 7th harmonics during steady-state operation of the electrical equipment.
3. The method of claim 2, wherein: with the amplitude features added, the amplitude and phase of the fundamental and of each harmonic can be obtained by applying the fast Fourier transform to the voltage and current, and the power is calculated as:
$$P_1 = \tfrac{1}{2} a_1 b_1 \cos\varphi_1, \qquad Q_1 = \tfrac{1}{2} a_1 b_1 \sin\varphi_1$$
where $a_1$ and $b_1$ are the amplitudes of the fundamental voltage and current, respectively, and $\varphi_1$ is the phase difference between the two.
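4. The method of claim 1, wherein: in step S02, the improved kNN algorithm computes the total similarity between the sample under test $a$ and class $C_i$ as:
$$\mathrm{Sim}(a, C_i) = \sum_{T_j \in \mathrm{kNN}(a)} \mathrm{Sim}(a, T_j)\,\mathrm{weight}(T_j)\,\delta(T_j, C_i)$$
where $\mathrm{weight}(T_j)$ is the weight of training sample $T_j$, and $\delta(T_j, C_i)$ is 1 if $T_j$ belongs to class $C_i$ and 0 otherwise.
5. The method of claim 3, wherein: the sample weights are assigned as follows: $\mathrm{weight}(T_j) = 1/\mathrm{size}(C_{T_j})$, where $\mathrm{size}(C_{T_j})$ is the number of training samples in the class to which $T_j$ belongs.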
6. The method of any one of claims 1 to 5, wherein step S03 specifically comprises: step S031: calculating the V-I trajectory similarity and the amplitude similarity between the sample under test and every training sample, denoted Sim1 and Sim2 respectively:
$$\mathrm{Sim1} = \frac{1}{1 + \mathrm{dist1}}, \qquad \mathrm{Sim2} = \frac{1}{1 + \mathrm{dist2}}$$
where dist1 and dist2 are the V-I trajectory distance and the amplitude distance between two samples, respectively; step S032: sorting the training samples in descending order of Sim1, and taking the K training samples with the largest Sim1 as the K nearest neighbors of the current test sample; step S033: calculating the comprehensive similarity between the current test sample and each of the K nearest neighbors: $\mathrm{Sim}(a, T_j) = \mathrm{Sim1}(a, T_j) \times \mathrm{weight}(T_j) + \mathrm{Sim2}(a, T_j)$; step S034: for each class among the K nearest neighbors, summing the comprehensive similarity between the sample under test and the neighbors of that class, and taking the class with the largest total comprehensive similarity as the prediction.
7. The method of claim 6, wherein both dist1 and dist2 are Euclidean distances.
CN202111201436.3A 2021-10-15 2021-10-15 Non-invasive load identification method based on improved kNN algorithm Pending CN114004285A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111201436.3A CN114004285A (en) 2021-10-15 2021-10-15 Non-invasive load identification method based on improved kNN algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111201436.3A CN114004285A (en) 2021-10-15 2021-10-15 Non-invasive load identification method based on improved kNN algorithm

Publications (1)

Publication Number Publication Date
CN114004285A true CN114004285A (en) 2022-02-01

Family

ID=79922987

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111201436.3A Pending CN114004285A (en) 2021-10-15 2021-10-15 Non-invasive load identification method based on improved kNN algorithm

Country Status (1)

Country Link
CN (1) CN114004285A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117290802A (en) * 2023-11-27 2023-12-26 惠州市鑫晖源科技有限公司 Host power supply operation monitoring method based on data processing

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117290802A (en) * 2023-11-27 2023-12-26 惠州市鑫晖源科技有限公司 Host power supply operation monitoring method based on data processing
CN117290802B (en) * 2023-11-27 2024-03-26 惠州市鑫晖源科技有限公司 Host power supply operation monitoring method based on data processing

Similar Documents

Publication Publication Date Title
CN110752410B (en) Method for rapidly sorting and recombining retired lithium battery
CN110443281B (en) Text classification self-adaptive oversampling method based on HDBSCAN (high-density binary-coded decimal) clustering
CN110398663B (en) Flexible direct current power grid fault identification method based on convolutional neural network
CN109829497B (en) Supervised learning-based station area user identification and discrimination method
CN112732748B (en) Non-invasive household appliance load identification method based on self-adaptive feature selection
CN110147760B (en) Novel efficient electric energy quality disturbance image feature extraction and identification method
CN103136587A (en) Power distribution network operating state classification recognition method based on support vector machine
CN111046913B (en) Load abnormal value identification method
CN110068776B (en) Three-level inverter open-circuit fault diagnosis method based on optimized support vector machine
CN110738232A (en) grid voltage out-of-limit cause diagnosis method based on data mining technology
CN114359674A (en) Non-invasive load identification method based on metric learning
CN116449218B (en) Lithium battery health state estimation method
CN112287980B (en) Power battery screening method based on typical feature vector
CN109359665A (en) A kind of family&#39;s electric load recognition methods and device based on support vector machines
CN112036450B (en) High-voltage cable partial discharge mode identification method and system based on transfer learning
CN114004285A (en) Non-invasive load identification method based on improved kNN algorithm
CN115879048A (en) Series arc fault identification method and system based on WRFMDA model
CN108898182A (en) A kind of MMC method for diagnosing faults based on core pivot element analysis and support vector machines
CN111651932A (en) Online dynamic security assessment method for power system based on integrated classification model
CN116796271A (en) Resident energy abnormality identification method
CN116699446A (en) Method, device, equipment and storage medium for rapidly sorting retired batteries
CN113191419B (en) Sag homologous event detection and type identification method based on track key point matching and region division
CN117907835A (en) New energy battery fault diagnosis method
CN105823964B (en) Power transmission line comprehensive Fault Locating Method towards intelligent substation
CN117171586A (en) Household transformer relation identification method and system based on current sequence similarity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination