CN117493922A - Power distribution network household transformer relation identification method based on data driving - Google Patents
- Publication number
- CN117493922A (application number CN202311233645.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- kernel
- matrix
- dimension reduction
- principal component
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Business, Economics & Management (AREA)
- Economics (AREA)
- Health & Medical Sciences (AREA)
- Public Health (AREA)
- Probability & Statistics with Applications (AREA)
- Water Supply & Treatment (AREA)
- General Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Supply And Distribution Of Alternating Current (AREA)
Abstract
A data-driven method for identifying user-transformer relationships in a power distribution network comprises the following steps. Step one, voltage data normalization: because the voltages of different users differ widely, min-max normalization is adopted. Step two, kernel principal component analysis (kernel PCA) dimension reduction: kernel PCA is principal component analysis extended with a kernel function; it uses the kernel function to map the data into a high-dimensional feature space and then performs dimension reduction in that space. Step three, K-means clustering: based on the principle that voltage data within the same transformer area are similar, the dimension-reduced data are clustered to identify the user-transformer relationships of the transformer area. The raw data are first preprocessed with min-max normalization so that the characteristic differences in the user voltage data become more pronounced. Kernel PCA is then applied to reduce the dimension of the processed data, which improves the accuracy and speed of the subsequent clustering algorithm.
Description
Technical Field
The invention relates to the technical field of power systems, and in particular to a data-driven method for identifying user-transformer relationships in a power distribution network.
Background
In recent years, the smart grid has become one of the most influential changes and innovations in the power industry and an essential component of smart city construction. However, the rapid development of smart grids also places higher demands on the fine-grained management of the distribution network. Problems that are currently widespread in distribution transformer areas, such as abnormal line-loss calculation, hinder advanced applications such as transformer-area operation and planning, limit the ability to manage the whole area intelligently, and directly affect the safe use of electricity by users. A major cause of these problems is the difficulty of exactly matching end users to the distribution transformer that supplies them. An accurate and efficient user-transformer relationship identification method is therefore of strategic importance for achieving the informatization, automation, and interaction goals of the smart distribution transformer area.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a data-driven method for identifying user-transformer relationships in a power distribution network. The method exploits the two-way communication of AMI smart meters and their ability to record detailed load information, and identifies user-transformer relationships from AMI data based on the theory of voltage fluctuation curve similarity. First, the raw data are preprocessed with min-max normalization so that the characteristic differences in the user voltage data become more pronounced. Next, kernel principal component analysis is applied to reduce the dimension of the processed data, which improves the accuracy and speed of the subsequent clustering algorithm. Finally, based on the voltage fluctuation similarity theory, k-means clustering is performed on the dimension-reduced data; users whose voltage variations are highly similar are grouped into one class and regarded as belonging to the same transformer, which yields the user-transformer relationship identification.
To solve the above technical problem, the invention adopts the following technical solution: a data-driven method for identifying user-transformer relationships in a power distribution network, comprising the following steps:
step one, voltage data normalization: because the voltages of different users differ widely, min-max normalization is adopted;
step two, kernel principal component analysis dimension reduction: kernel principal component analysis is principal component analysis extended with a kernel function; it uses the kernel function to map the data into a high-dimensional feature space and then performs dimension reduction in that space;
step three, K-means clustering: based on the principle that voltage data within the same transformer area are similar, the dimension-reduced data are clustered to identify the user-transformer relationships of the transformer area.
Preferably, step one comprises the following:
1) For the data set $X_i = [x_{i,1}\ x_{i,2}\ \dots\ x_{i,t}]$ collected from one user, determine the normalization range, choosing to scale the voltage data to $[0, 1]$, where $x_{i,1}$ is the voltage data acquired from user $i$ the first time, $x_{i,2}$ is the voltage data acquired from user $i$ the second time, and $x_{i,t}$ is the voltage data acquired from user $i$ the $t$-th time;
2) Find the maximum value $x_{\max}$ and the minimum value $x_{\min}$ of the single user's voltage data;
3) Convert each voltage data point $x$ collected from the single user to a normalized value $x'$ using the following normalization formula:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
4) Apply the above normalization formula to each data point in the data set, converting them into values within the specified range, output the data set $X_i'$, and form the total voltage matrix $X' = [X_1'\ X_2'\ \dots\ X_i']^T$.
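The normalization in step one can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the function names `min_max_normalize` and `build_voltage_matrix` are hypothetical, and mapping a constant series to zeros is an assumption, since the text does not say how to handle $x_{\max} = x_{\min}$.

```python
from typing import List


def min_max_normalize(voltages: List[float]) -> List[float]:
    """Scale one user's voltage series to [0, 1] with x' = (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(voltages), max(voltages)
    span = x_max - x_min
    if span == 0:
        # Constant series: the formula is undefined here.
        # Returning zeros is an assumption, not something the source specifies.
        return [0.0 for _ in voltages]
    return [(x - x_min) / span for x in voltages]


def build_voltage_matrix(users: List[List[float]]) -> List[List[float]]:
    """Normalize each user's series independently; one row per user."""
    return [min_max_normalize(series) for series in users]


# Two hypothetical users with three voltage samples each.
users = [[220.0, 225.0, 230.0], [210.0, 212.0, 211.0]]
X_norm = build_voltage_matrix(users)
```

Each user is scaled with that user's own minimum and maximum, as steps 2) and 3) describe, so the rows of the resulting matrix are comparable in shape even when the absolute voltage levels differ.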
Preferably, step two comprises the following:
1) Input the preprocessed voltage data set $X_i'$ of each user in the transformer area and compute the Gaussian kernel function values between the data sets $X_i'$ from their Euclidean distances, giving the kernel matrix $K$, where $\delta$ is the bandwidth parameter of the Gaussian kernel function;
2) Center the kernel matrix $K$. First compute the centering matrix $H$, a matrix of dimension $t$, given by
$$H = I - \frac{1}{t}\mathbf{1}\mathbf{1}^T$$
where $I$ is the identity matrix, $\mathbf{1}$ is the all-ones vector, and $\mathbf{1}\mathbf{1}^T$ is the matrix generated from the all-ones vector;
then compute the centered kernel matrix $K'$ by applying the centering matrix $H$ to the original kernel matrix $K$:
$$K' = HKH$$
The centered kernel matrix $K'$ is used in the subsequent eigenvalue decomposition and principal component selection steps;
3) Perform eigenvalue decomposition on the centered kernel matrix $K'$ to obtain the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_t$ and the corresponding eigenvectors $\phi_1, \phi_2, \dots, \phi_t$;
4) Normalize the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_t$ to obtain eigenvalues in percentage form;
5) Set the number of principal components to retain according to the cumulative explained variance ratio;
6) Select the principal components to retain and project each data point $x$ onto them to obtain the dimension-reduced data point $z$; repeating these steps gives the dimension-reduced total voltage data set $Z$.
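The kernel PCA steps above can be sketched in plain Python as follows. Several choices here are assumptions rather than details confirmed by the text: the Gaussian kernel is taken in the common form $\exp(-\|a-b\|^2 / (2\delta^2))$, the kernel is computed between user rows (so the centering matrix has one row per user), and the eigenvalue decomposition is replaced by power iteration with deflation to keep the example dependency-free. All function names are hypothetical.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]


def gaussian_kernel_matrix(X: Matrix, delta: float) -> Matrix:
    """K[i][j] = exp(-||X_i - X_j||^2 / (2 * delta^2)); the 2*delta^2 scaling is assumed."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-sq / (2.0 * delta ** 2))
    return K


def center_kernel(K: Matrix) -> Matrix:
    """K' = H K H with H = I - (1/n) 1 1^T, written out element by element."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]


def top_eigenpairs(A: Matrix, m: int, iters: int = 500,
                   seed: int = 0) -> List[Tuple[float, List[float]]]:
    """Leading m eigenpairs of a symmetric PSD matrix via power iteration with deflation."""
    n = len(A)
    A = [row[:] for row in A]  # work on a copy so the caller's matrix is untouched
    rng = random.Random(seed)
    pairs = []
    for _component in range(m):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(a * x for a, x in zip(row, v)) for row in A]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # remaining spectrum is numerically zero
                break
            v = [x / norm for x in w]
        norm_v = math.sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
        # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
        lam = sum(x * sum(a * y for a, y in zip(row, v)) for x, row in zip(v, A))
        pairs.append((lam, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                A[i][j] -= lam * v[i] * v[j]
    return pairs


def kernel_pca(X: Matrix, delta: float, n_components: int) -> Matrix:
    """Project the training rows onto the leading kernel principal components."""
    Kc = center_kernel(gaussian_kernel_matrix(X, delta))
    pairs = top_eigenpairs(Kc, n_components)
    # For training points the projection on component c is sqrt(lambda_c) * phi_c[i].
    return [[math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
            for i in range(len(X))]


# Six hypothetical normalized series forming two well-separated groups.
X_norm = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
          [1.0, 0.9], [0.9, 1.0], [0.95, 0.95]]
Kc = center_kernel(gaussian_kernel_matrix(X_norm, delta=0.5))
Z = kernel_pca(X_norm, delta=0.5, n_components=2)
```

In practice a library eigen-solver would replace `top_eigenpairs`; power iteration is used here only so the sketch runs with the standard library alone.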
Preferably, in step three, clustering the dimension-reduced data comprises the following steps:
1) Initialize the cluster centers: randomly select $k$ data points as the initial cluster centers, denoted $c_1, c_2, \dots, c_k$;
2) For each data point $z$, calculate its distance to each cluster center $c_j$ and assign $z$ to the cluster $S_j$ whose center is closest:
$$S_j = \{\, z : \|z - c_j\|^2 \le \|z - c_l\|^2 \ \text{for all } 1 \le l \le k \,\}$$
3) For each cluster $S_j$, calculate the mean of all data points in it to obtain the new cluster center $c_j$:
$$c_j = \frac{1}{|S_j|} \sum_{z \in S_j} z$$
4) Repeat steps 2) and 3) until the cluster centers no longer change significantly;
5) Finally, each data point $z$ is assigned to one cluster $S_j$, giving the clustering result.
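The five clustering steps map directly onto a short Python loop. This is a minimal sketch under stated assumptions: the `seed`, `tol`, and `max_iter` parameters are illustrative (the text only says to stop when the centers show "no significant change"), and keeping the old center when a cluster becomes empty is a choice made here, not one the source describes.

```python
import random
from typing import List, Tuple

Point = List[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance ||a - b||^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(Z: List[Point], k: int, seed: int = 0, tol: float = 1e-9,
            max_iter: int = 100) -> Tuple[List[int], List[Point]]:
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centers.
    centers = [list(p) for p in rng.sample(Z, k)]
    labels = [0] * len(Z)
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(z, centers[j])) for z in Z]
        # Step 3: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [z for z, lab in zip(Z, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # empty cluster: keep old center (assumption)
        # Step 4: stop once no center moves by more than the tolerance.
        shift = max(sq_dist(c, n) for c, n in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    # Step 5: labels[i] is the cluster index of Z[i].
    return labels, centers


# Hypothetical dimension-reduced points from two transformer areas.
Z = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
     [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
labels, centers = k_means(Z, k=2)
```

Users that share a label are the ones the method would attribute to the same transformer.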
The data-driven method for identifying user-transformer relationships in a power distribution network provided by the invention has the following beneficial effects:
1. For the user voltage data acquired by AMI, min-max normalization is adopted mainly because it can effectively resist interference from abnormal values: mapping the data features into a normalized range limits the influence of extreme outliers on model training and thus ensures the stability and robustness of the model. This processing not only helps maintain the accuracy of the model but also helps mitigate unreasonable deviations caused by outliers.
2. Kernel principal component analysis is an advanced dimension reduction technique that combines the strengths of principal component analysis and kernel methods. PCA is a linear dimension reduction technique and may fail to capture the intrinsic structure of nonlinear data. Kernel principal component analysis uses the kernel trick to process nonlinear data and reduce its dimension, making the data better suited to linear models. Compared with linear PCA, kernel principal component analysis better retains the key features of the data, because the kernel function captures the nonlinear structure of the data, so that the dimension-reduced data better separate different classes or clusters. A power system typically consists of a complex network, and the nonlinear analysis capability of kernel principal component analysis allows it to capture the nonlinear topology of the power system more accurately while simplifying high-dimensional data through dimension reduction, so that large volumes of power data can be managed and analyzed efficiently. In addition, in high-dimensional data sets where the number of features far exceeds the number of samples, problems such as overfitting may arise. Kernel principal component analysis helps reduce the dimensionality, which improves the generalization performance of the model and mitigates the curse of dimensionality. Together, these characteristics make kernel principal component analysis a powerful tool for power system topology identification and help further improve the reliability and efficiency of the power system.
3. The K-means clustering method automatically groups the nodes or devices in the power distribution network without a predefined topology, which is particularly useful for complex and changeable low-voltage distribution networks. User-transformer relationship identification needs to process real-time information; k-means is fast enough to update the topology information in real time as the state of the distribution network changes, which supports fault detection, monitoring, and rapid response to problems. In addition, the clustering result of K-means is easy to understand and visualize, helping operators better understand the topology of the low-voltage distribution network, including the distribution of equipment, the connection relationships, and the load distribution, which aids decision making and problem analysis.
Drawings
The invention is further illustrated by the following examples in conjunction with the accompanying drawings:
FIG. 1 is a flow chart of data processing according to the present invention;
FIG. 2 is a clustering flow chart of the invention;
FIG. 3 is a plot of the data after dimension reduction by conventional principal component analysis;
FIG. 4 is a plot of the data after dimension reduction by kernel principal component analysis.
Detailed Description
As shown in FIG. 1, a data-driven method for identifying user-transformer relationships in a power distribution network comprises the following steps:
step one, voltage data normalization: because the voltages of different users differ widely, min-max normalization is adopted;
step two, kernel principal component analysis dimension reduction: kernel principal component analysis is principal component analysis extended with a kernel function; it uses the kernel function to map the data into a high-dimensional feature space and then performs dimension reduction in that space;
step three, K-means clustering: based on the principle that voltage data within the same transformer area are similar, the dimension-reduced data are clustered to identify the user-transformer relationships of the transformer area.
Preferably, step one comprises the following:
1) For the data set $X_i = [x_{i,1}\ x_{i,2}\ \dots\ x_{i,t}]$ collected from one user, determine the normalization range, choosing to scale the voltage data to $[0, 1]$, where $x_{i,1}$ is the voltage data acquired from user $i$ the first time, $x_{i,2}$ is the voltage data acquired from user $i$ the second time, and $x_{i,t}$ is the voltage data acquired from user $i$ the $t$-th time;
2) Find the maximum value $x_{\max}$ and the minimum value $x_{\min}$ of the single user's voltage data;
3) Convert each voltage data point $x$ collected from the single user to a normalized value $x'$ using the following normalization formula:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
4) Apply the above normalization formula to each data point in the data set, converting them into values within the specified range, output the data set $X_i'$, and form the total voltage matrix $X' = [X_1'\ X_2'\ \dots\ X_i']^T$.
Preferably, step two comprises the following:
1) Input the preprocessed voltage data set $X_i'$ of each user in the transformer area and compute the Gaussian kernel function values between the data sets $X_i'$ from their Euclidean distances, giving the kernel matrix $K$, where $\delta$ is the bandwidth parameter of the Gaussian kernel function;
2) Center the kernel matrix $K$. First compute the centering matrix $H$, a matrix of dimension $t$, given by
$$H = I - \frac{1}{t}\mathbf{1}\mathbf{1}^T$$
where $I$ is the identity matrix, $\mathbf{1}$ is the all-ones vector, and $\mathbf{1}\mathbf{1}^T$ is the matrix generated from the all-ones vector;
then compute the centered kernel matrix $K'$ by applying the centering matrix $H$ to the original kernel matrix $K$:
$$K' = HKH$$
The centered kernel matrix $K'$ is used in the subsequent eigenvalue decomposition and principal component selection steps;
3) Perform eigenvalue decomposition on the centered kernel matrix $K'$ to obtain the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_t$ and the corresponding eigenvectors $\phi_1, \phi_2, \dots, \phi_t$;
4) Normalize the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_t$ to obtain eigenvalues in percentage form;
5) Set the number of principal components to retain according to the cumulative explained variance ratio;
6) Select the principal components to retain and project each data point $x$ onto them to obtain the dimension-reduced data point $z$; repeating these steps gives the dimension-reduced total voltage data set $Z$.
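Steps 4) and 5) reduce to a small calculation on the eigenvalues: express each one as a percentage of the total, then keep the smallest number of components whose cumulative percentage reaches a target. The sketch below illustrates this; the function names are hypothetical and the threshold value is an assumption, since the text does not state which cumulative ratio is used.

```python
from typing import List


def eigenvalue_percentages(eigenvalues: List[float]) -> List[float]:
    """Eigenvalues in percentage form, sorted from largest to smallest."""
    # Tiny negative values can appear from round-off; clip them to zero.
    ordered = sorted((max(lam, 0.0) for lam in eigenvalues), reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("all eigenvalues are zero")
    return [100.0 * lam / total for lam in ordered]


def components_for_threshold(eigenvalues: List[float],
                             threshold_pct: float = 95.0) -> int:
    """Smallest number of components whose cumulative percentage reaches the threshold.

    The 95% default is illustrative only.
    """
    cumulative = 0.0
    for count, pct in enumerate(eigenvalue_percentages(eigenvalues), start=1):
        cumulative += pct
        if cumulative >= threshold_pct:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues of a centered kernel matrix.
lams = [3.0, 5.0, 0.5, 1.5]
percentages = eigenvalue_percentages(lams)
kept = components_for_threshold(lams, threshold_pct=90.0)
```

With these example values the first two components explain 80% and the first three 95%, so a 90% target keeps three components.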
Preferably, in step three, clustering the dimension-reduced data comprises the following steps:
1) Initialize the cluster centers: randomly select $k$ data points as the initial cluster centers, denoted $c_1, c_2, \dots, c_k$;
2) For each data point $z$, calculate its distance to each cluster center $c_j$ and assign $z$ to the cluster $S_j$ whose center is closest:
$$S_j = \{\, z : \|z - c_j\|^2 \le \|z - c_l\|^2 \ \text{for all } 1 \le l \le k \,\}$$
3) For each cluster $S_j$, calculate the mean of all data points in it to obtain the new cluster center $c_j$:
$$c_j = \frac{1}{|S_j|} \sum_{z \in S_j} z$$
4) Repeat steps 2) and 3) until the cluster centers no longer change significantly;
5) Finally, each data point $z$ is assigned to one cluster $S_j$, giving the clustering result.
Taking the voltage data of 200 users drawn from five transformer areas in a certain region as an example, a user-transformer relationship identification calculation and analysis is carried out as follows:
First, min-max preprocessing is applied to the voltage data acquired through the AMI system, and kernel principal component dimension reduction is then applied to the voltage data.
The voltage data of two users in each transformer area are randomly selected and reduced in dimension with both conventional principal component analysis and kernel principal component analysis, mapping the original data from 96 dimensions to a 50-dimensional space. Comparing the plots of the dimension-reduced data from the two methods shows clearly that the dimension-reduced data still retain the variation characteristics of the original data, as shown in FIG. 3 and FIG. 4, where the horizontal axis is the data dimension and the vertical axis is the voltage fluctuation value.
This result shows that kernel principal component analysis can efficiently extract the information that matters most to the variation of the data during dimension reduction and encode it into a low-dimensional representation. Although the dimension of the data is reduced, the trend and structure of the original data are maintained, which verifies the effectiveness of kernel principal component analysis in achieving dimension reduction while preserving the key characteristics of the data.
Dimension reduction data accuracy comparison
PCA+K-means | Kernel PCA+K-means |
---|---|
90.96% | 92.05% |
The results in the table show that, compared with conventional principal component analysis, kernel principal component analysis (Kernel PCA) performs better on the nonlinear user voltage data of the power system (92.05% versus 90.96% accuracy). By mapping the data into a high-dimensional feature space with a kernel function, Kernel PCA retains the nonlinear features of the data more effectively, so the dimension-reduced data yield higher accuracy. This finding supports the application of kernel principal component analysis in power systems: it not only preserves the main variation characteristics of the user voltage but also improves the accuracy of user-transformer relationship identification in low-voltage distribution transformer areas.
To further verify the accuracy of the proposed method, a comprehensive performance comparison was carried out across several common clustering algorithms and dimension reduction techniques. The detailed performance evaluation uses the Calinski-Harabasz index and the Rand index (RI).
Rand index
Method | 20D | 30D | 40D | 50D | 60D |
---|---|---|---|---|---|
PCA+Birch | 0.678413 | 0.66382 | 0.652543 | 0.660385 | 0.637875 |
PCA+Kmeans | 0.756582 | 0.755845 | 0.759092 | 0.749383 | 0.728877 |
KPCA+Kmeans | 0.757283 | 0.754523 | 0.759285 | 0.756562 | 0.7592 |
Calinski-Harabasz index
Method | 20D | 30D | 40D | 50D | 60D |
---|---|---|---|---|---|
PCA+Birch | 80.91547 | 62.07512 | 54.71555 | 47.2577 | 42.6965 |
PCA+Kmeans | 123.8984 | 95.56359 | 80.76184 | 71.73475 | 64.62178 |
KPCA+Kmeans | 123.6157 | 95.59159 | 80.65116 | 71.86636 | 65.91663 |
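For readers who want to reproduce this kind of comparison, both evaluation indices can be computed from first principles. The sketch below uses the standard definitions: the Rand index is the fraction of point pairs on which two labelings agree, and the Calinski-Harabasz index is the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom. The function names and the toy data are hypothetical and are not taken from the experiment reported in the tables.

```python
from itertools import combinations
from typing import Dict, List, Sequence

Point = List[float]


def rand_index(true_labels: Sequence[int], pred_labels: Sequence[int]) -> float:
    """Fraction of point pairs that both labelings treat the same way."""
    pairs = list(combinations(range(len(true_labels)), 2))
    agree = sum(
        (true_labels[i] == true_labels[j]) == (pred_labels[i] == pred_labels[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def _mean(points: List[Point]) -> Point:
    return [sum(col) / len(points) for col in zip(*points)]


def _sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def calinski_harabasz(points: List[Point], labels: Sequence[int]) -> float:
    """CH = [B / (k - 1)] / [W / (n - k)] with between (B) and within (W) dispersion."""
    clusters: Dict[int, List[Point]] = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    n, k = len(points), len(clusters)
    overall = _mean(points)
    between = within = 0.0
    for members in clusters.values():
        center = _mean(members)
        between += len(members) * _sq_dist(center, overall)
        within += sum(_sq_dist(p, center) for p in members)
    if k < 2 or within == 0.0:
        raise ValueError("index undefined for a single cluster or zero within-cluster spread")
    return (between / (k - 1)) / (within / (n - k))


# Toy example: four one-dimensional points in two clusters.
pts = [[0.0], [2.0], [10.0], [12.0]]
ri_perfect = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
ri_partial = rand_index([0, 0, 1, 1], [0, 1, 1, 1])
ch = calinski_harabasz(pts, [0, 0, 1, 1])
```

The Rand index needs ground-truth labels (here, the known transformer of each user), whereas the Calinski-Harabasz index is computed from the clustered data alone; higher values are better for both.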
In this study, the combination of kernel principal component analysis (Kernel PCA) and K-means clustering stands out: it clearly outperforms PCA+Birch at every dimension and matches or slightly exceeds PCA+K-means in most cases, with the largest gain at 60 dimensions (Rand index 0.7592 versus 0.728877). As a nonlinear dimension reduction technique, kernel principal component analysis handles the nonlinear structure of power system time-series data well. By introducing a kernel function, it maps the data into a high-dimensional feature space and thus better preserves the complex nonlinear characteristics of the data, which is important for modeling and interpreting the complexity of power system data. In addition, Kernel PCA improves computational efficiency by reducing the data dimension while retaining the key information in the data. The K-means clustering algorithm also performs well on the power system data clustering task. Its lower sensitivity to outliers allows outliers to be separated into independent clusters, which improves the accuracy of the clustering results. The strong performance of kernel principal component analysis and K-means clustering in power system topology identification comes from their respective strengths in nonlinear dimension reduction and efficient clustering. Used together, the two methods provide a powerful tool for power system data analysis and improve both the accuracy and the efficiency of user-transformer relationship identification.
The above embodiments are only preferred embodiments of the present invention and should not be construed as limiting it. The scope of protection of the present invention is defined by the technical solutions recited in the claims, including equivalents of the technical features in those solutions; that is, equivalent replacements and modifications within this scope also fall within the scope of protection of the invention.
Claims (4)
1. A data-driven method for identifying user-transformer relationships in a power distribution network, characterized by comprising the following steps:
step one, voltage data normalization, adopting min-max normalization;
step two, kernel principal component analysis dimension reduction: kernel principal component analysis is principal component analysis extended with a kernel function; it uses the kernel function to map the data into a high-dimensional feature space and then performs dimension reduction in that space;
step three, K-means clustering: the dimension-reduced data are clustered to identify the user-transformer relationships of the transformer area.
2. The data-driven method for identifying user-transformer relationships in a power distribution network according to claim 1, characterized in that step one comprises the following:
1) For the data set $X_i = [x_{i,1}\ x_{i,2}\ \dots\ x_{i,t}]$ collected from one user, determine the normalization range, choosing to scale the voltage data to $[0, 1]$, where $x_{i,1}$ is the voltage data acquired from user $i$ the first time, $x_{i,2}$ is the voltage data acquired from user $i$ the second time, and $x_{i,t}$ is the voltage data acquired from user $i$ the $t$-th time;
2) Find the maximum value $x_{\max}$ and the minimum value $x_{\min}$ of the single user's voltage data;
3) Convert each voltage data point $x$ collected from the single user to a normalized value $x'$ using the following normalization formula:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
4) Apply the above normalization formula to each data point in the data set, converting them into values within the specified range, output the data set $X_i'$, and form the total voltage matrix $X' = [X_1'\ X_2'\ \dots\ X_i']^T$.
3. The method for identifying the household transformer relations of the power distribution network based on data driving according to claim 1, wherein the second step comprises the following steps:
1) Inputting a preprocessed regional individual user voltage data set X i ' calculate X i ' Gaussian kernel function values between data sets, using Euclidean distance formula,
where δ is the bandwidth parameter of the gaussian kernel function;
2) The core matrix K is subjected to centering processing, a centering matrix H is calculated, the centering matrix is a matrix related to a dimension t, and the expression is as follows:
wherein I is an identity matrix, 1 is an all 1 vector, 11 T Is a matrix generated from all 1 vectors;
calculating a centralizing kernel matrix K ', and applying the centralizing matrix H to the original kernel matrix K to obtain the centralizing kernel matrix K ' '
K′=HKH
The centralized nuclear moment K' array is used for the subsequent characteristic value decomposition and principal component selection steps;
3) Performing eigenvalue decomposition on the centered kernel matrix $K'$ to obtain the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_t$ and the corresponding eigenvectors $\varphi_1, \varphi_2, \dots, \varphi_t$;
4) Normalizing the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_t$ to obtain the eigenvalues in percentage form;
5) Setting the number of principal components to be retained according to the cumulative explained variance ratio;
6) Selecting the principal components to be retained and projecting each data point $x$ onto them to obtain the dimension-reduced data point $z$; repeating this step yields the total dimension-reduced voltage data set $Z$.
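The kernel, centering, and component-selection steps above can be sketched in plain Python as follows. Several details are assumptions and not taken from the claim: the kernel is written as exp(-||a - b||^2 / (2 δ^2)), a common convention whose factor of 2 is not confirmed by the text; the 0.95 threshold and all function names are hypothetical; and the eigenvalue decomposition and projection (steps 3 and 6) are omitted because a numerical library would normally perform them.

```python
import math


def gaussian_kernel_matrix(rows, delta):
    """K[a][b] = exp(-||x_a - x_b||^2 / (2 * delta^2)); the factor 2 is an assumed convention."""
    n = len(rows)
    K = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            sq = sum((p - q) ** 2 for p, q in zip(rows[a], rows[b]))
            K[a][b] = math.exp(-sq / (2.0 * delta ** 2))
    return K


def center_kernel(K):
    """Return K' = H K H with H = I - (1/n) 1 1^T, expanded element-wise.

    n is the size of K (the claim denotes this dimension by t).
    """
    n = len(K)
    row_mean = [sum(r) / n for r in K]
    col_mean = [sum(K[a][b] for a in range(n)) / n for b in range(n)]
    grand_mean = sum(row_mean) / n
    return [[K[a][b] - row_mean[a] - col_mean[b] + grand_mean
             for b in range(n)] for a in range(n)]


def components_to_keep(eigenvalues, threshold=0.95):
    """Smallest number of leading eigenvalues whose cumulative share reaches the threshold."""
    lam = sorted((max(v, 0.0) for v in eigenvalues), reverse=True)
    total = sum(lam)
    if total == 0:
        return 0
    cumulative = 0.0
    for m, v in enumerate(lam, start=1):
        cumulative += v / total
        if cumulative >= threshold:
            return m
    return len(lam)


# Hypothetical normalized voltage rows for three users.
rows = [[0.0, 1.0, 0.5], [0.0, 0.5, 1.0], [1.0, 0.0, 0.5]]
K = gaussian_kernel_matrix(rows, delta=1.0)
K_centered = center_kernel(K)
```

Expanding HKH element-wise avoids building H explicitly; every row and column of the centered matrix sums to zero up to rounding error.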
4. The method for identifying the household transformer relation of the power distribution network based on data driving according to claim 1, wherein in the third step, the data after the dimension reduction processing is clustered, and the method comprises the following steps:
1) Initializing the cluster centers: randomly selecting $k$ data points as the initial cluster centers, denoted $c_1, c_2, \dots, c_k$;
2) For each data point $z$, calculating its distance to each cluster center $c_j$ and assigning $z$ to the cluster $S_j$ corresponding to the nearest cluster center,
$$S_j = \{\, z : \|z - c_j\|^2 \le \|z - c_l\|^2 \ \text{for all } 1 \le l \le k \,\};$$
3) For each cluster $S_j$, calculating the average of all data points therein to obtain a new cluster center $c_j$,
$$c_j = \frac{1}{|S_j|}\sum_{z \in S_j} z;$$
4) Repeating steps 2) and 3) until the cluster centers no longer change significantly;
5) Finally, assigning each data point $z$ to one cluster $S_j$ to obtain the clustering result.
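The clustering steps above can be sketched as a plain k-means loop. This is an illustrative sketch under stated assumptions: the tolerance `tol`, the iteration cap `max_iter`, the fixed random seed, and the rule of keeping the old center when a cluster becomes empty are not specified in the claim.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def k_means(points, k, max_iter=100, tol=1e-9, seed=0):
    """Return (labels, centers) after iterating assignment and center update."""
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: squared_distance(z, centers[j]))
                  for z in points]
        # Step 3: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [z for z, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # empty cluster: keep old center (assumption)
        # Step 4: stop once no center moves by more than the tolerance.
        shift = max(squared_distance(c, n) for c, n in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return labels, centers


# Hypothetical dimension-reduced points forming two well-separated groups.
Z = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
labels, centers = k_means(Z, k=2)
```

In the setting of the claims, each resulting cluster would correspond to one candidate group of users, although the mapping from clusters to transformers is not shown here.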
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311233645.5A CN117493922A (en) | 2023-09-22 | 2023-09-22 | Power distribution network household transformer relation identification method based on data driving |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117493922A (en) | 2024-02-02 |
Family
ID=89673302
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311233645.5A Pending CN117493922A (en) | 2023-09-22 | 2023-09-22 | Power distribution network household transformer relation identification method based on data driving |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117493922A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117706278A (en) * | 2024-02-04 | 2024-03-15 | 昆明理工大学 | Fault line selection method and system for power distribution network and readable storage medium |
CN117706278B (en) * | 2024-02-04 | 2024-06-07 | 昆明理工大学 | Fault line selection method and system for power distribution network and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||