CN110995153A - Abnormal data detection method and device for photovoltaic power station and electronic equipment

Info

Publication number
CN110995153A
Authority
CN
China
Prior art keywords
data
photovoltaic
photovoltaic residual
residual data
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911308534.XA
Other languages
Chinese (zh)
Other versions
CN110995153B (en)
Inventor
沈文涛
刘海谊
谢祥颖
郭兴科
那峙雄
范雪凝
马大燕
孟凡腾
王栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Bg Sugon Big Data Co ltd
Guowang Xiongan Finance Technology Group Co ltd
State Grid Digital Technology Holdings Co ltd
Original Assignee
Beijing Bg Sugon Big Data Co ltd
Guowang Xiongan Finance Technology Group Co ltd
State Grid E Commerce Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Bg Sugon Big Data Co ltd, Guowang Xiongan Finance Technology Group Co ltd, State Grid E Commerce Co Ltd
Priority to CN201911308534.XA
Publication of CN110995153A
Application granted
Publication of CN110995153B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02S: GENERATION OF ELECTRIC POWER BY CONVERSION OF INFRARED RADIATION, VISIBLE LIGHT OR ULTRAVIOLET LIGHT, e.g. USING PHOTOVOLTAIC [PV] MODULES
    • H02S50/00: Monitoring or testing of PV systems, e.g. load balancing or fault identification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G06F18/232: Non-hierarchical techniques
    • G06F18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E: REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E10/00: Energy generation through renewable energy sources
    • Y02E10/50: Photovoltaic [PV] energy

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Photovoltaic Devices (AREA)

Abstract

The invention provides a method and a device for detecting abnormal data of a photovoltaic power station, and electronic equipment. The clustering center point is determined from two quantities, the density value and the distance value, so that it is determined more accurately; the clustering result obtained with that center point, and hence the abnormal data identified from it, are more accurate as well.

Description

Abnormal data detection method and device for photovoltaic power station and electronic equipment
Technical Field
The invention relates to the field of data processing of photovoltaic power stations, in particular to a method and a device for detecting abnormal data of a photovoltaic power station and electronic equipment.
Background
Combining distributed photovoltaic power generation with the large power grid offers considerable advantages: it saves investment, reduces energy consumption, and improves the stability and flexibility of the power system.
At present, distributed photovoltaic power stations are geographically scattered, which makes their operating condition difficult to monitor and delays fault handling. If abnormal data from these stations can be detected in time, the capability to monitor them and handle their faults improves, and the stability and safety of the power system are better ensured.
Disclosure of Invention
In view of the above, the present invention provides a method and an apparatus for detecting abnormal data of a photovoltaic power station, and an electronic device, to address the pressing need to detect abnormal data of distributed photovoltaic power stations.
In order to solve the above technical problem, the invention adopts the following technical solution:
An abnormal data detection method for a photovoltaic power station comprises the following steps:
acquiring at least one piece of photovoltaic residual data;
calculating a density value and a distance value corresponding to the photovoltaic residual data;
screening out the photovoltaic residual data whose density value and distance value meet a preset condition, and taking it as a clustering center point of the at least one piece of photovoltaic residual data;
clustering the photovoltaic residual data based on the clustering center point to obtain a clustering result;
and determining photovoltaic residual data in the clustering result that do not belong to any cluster as abnormal data.
Optionally, calculating a density value corresponding to the photovoltaic residual data includes:
acquiring a photovoltaic residual data threshold;
calculating, using the formula
Figure BDA0002323863900000021
the density value $\rho_i$ corresponding to the photovoltaic residual data, where $i$ and $j$ are identifiers of photovoltaic residual data, $d_{i,j}$ is the Euclidean distance between the two pieces of photovoltaic residual data, and $d_c$ is the photovoltaic residual data threshold.
Optionally, the calculating a distance value corresponding to the photovoltaic residual data includes:
for each of the photovoltaic residual data, determining a set of photovoltaic residual data having a corresponding density value greater than a corresponding density value of the photovoltaic residual data;
calculating the distance value $\sigma_i$ corresponding to the photovoltaic residual data according to
$$\sigma_i = \begin{cases} \min_{j \in I_s} d_{i,j}, & I_s \neq \varnothing \\ \max_{j} d_{i,j}, & I_s = \varnothing \end{cases}$$
where $I_s$ is the set of photovoltaic residual data whose density value is greater than that of the photovoltaic residual data in question.
Optionally, the preset condition includes that the density value is greater than an average value of the density values corresponding to all the photovoltaic residual data, and the distance value is greater than an average value of the distance values corresponding to all the photovoltaic residual data.
Optionally, acquiring at least one photovoltaic residual data comprises:
acquiring actual operation data and predicted operation data of at least one power station;
and subtracting the actual operation data and the predicted operation data corresponding to the same power station to obtain photovoltaic residual data corresponding to the power station.
Optionally, obtaining predicted operating data of at least one power station comprises:
acquiring the predicted operation data at the previous data acquisition time, a weight value, and the actual operation data collected at the previous data acquisition time;
calculating the predicted operation data of the power station according to the formula $S_t = a y_{t-1} + (1-a) S_{t-1}$, where $a$ is the weight value, $y_{t-1}$ is the actual operation data collected at the previous data acquisition time, and $S_{t-1}$ is the predicted operation data at the previous data acquisition time.
An abnormal data detection apparatus of a photovoltaic power plant, comprising:
a data acquisition module, configured to acquire at least one piece of photovoltaic residual data;
a numerical calculation module, configured to calculate a density value and a distance value corresponding to the photovoltaic residual data;
a data screening module, configured to screen out the photovoltaic residual data whose density value and distance value meet a preset condition and to take it as a clustering center point of the at least one piece of photovoltaic residual data;
a clustering module, configured to cluster the photovoltaic residual data based on the clustering center point to obtain a clustering result;
and an abnormal data determining module, configured to determine photovoltaic residual data in the clustering result that do not belong to any cluster as abnormal data.
Optionally, when the numerical calculation module is configured to calculate the density value corresponding to the photovoltaic residual data, the numerical calculation module is specifically configured to:
obtain a photovoltaic residual data threshold and calculate, using the formula
Figure BDA0002323863900000031
the density value $\rho_i$ corresponding to the photovoltaic residual data, where $i$ and $j$ are identifiers of photovoltaic residual data, $d_{i,j}$ is the Euclidean distance between the two pieces of photovoltaic residual data, and $d_c$ is the photovoltaic residual data threshold.
Optionally, when the numerical calculation module is configured to calculate a distance value corresponding to the photovoltaic residual data, the numerical calculation module is specifically configured to:
for each piece of photovoltaic residual data, determine the set of photovoltaic residual data whose density value is greater than that of the piece, and calculate the distance value $\sigma_i$ corresponding to the piece according to
$$\sigma_i = \begin{cases} \min_{j \in I_s} d_{i,j}, & I_s \neq \varnothing \\ \max_{j} d_{i,j}, & I_s = \varnothing \end{cases}$$
where $I_s$ is the set of photovoltaic residual data whose density value is greater than that of the photovoltaic residual data in question.
An electronic device, comprising: a memory and a processor;
wherein the memory is used for storing programs;
the processor calls the program and is configured to:
acquire at least one piece of photovoltaic residual data;
calculate a density value and a distance value corresponding to the photovoltaic residual data;
screen out the photovoltaic residual data whose density value and distance value meet a preset condition, and take it as a clustering center point of the at least one piece of photovoltaic residual data;
cluster the photovoltaic residual data based on the clustering center point to obtain a clustering result;
and determine photovoltaic residual data in the clustering result that do not belong to any cluster as abnormal data.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a method and a device for detecting abnormal data of a photovoltaic power station, and electronic equipment. The clustering center point is determined from two quantities, the density value and the distance value, so that it is determined more accurately; the clustering result obtained with that center point, and hence the abnormal data identified from it, are more accurate as well.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below are merely embodiments of the present invention; a person skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for detecting abnormal data of a photovoltaic power station according to an embodiment of the present invention;
Fig. 2 is a distribution diagram of photovoltaic residual data according to an embodiment of the present invention;
Fig. 3 is a flowchart of a method for detecting abnormal data of a photovoltaic power station according to another embodiment of the present invention;
Fig. 4 is a distribution diagram of density and distance corresponding to photovoltaic residual data according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an abnormal data detection apparatus of a photovoltaic power station according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to monitor abnormal data from photovoltaic power stations at different locations, the inventor found that abnormal data can be detected with a data anomaly detection model based on cubic exponential smoothing and DBSCAN. This model consists mainly of a cubic exponential smoothing model and the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm. The cubic exponential smoothing model performs time-series modeling on the input power consumption data sequence and predicts step by step, yielding a predicted power consumption value for each time point. The DBSCAN clustering algorithm then performs cluster analysis on the residuals between the true and predicted values of the power consumption data, thereby detecting abnormal data points. Specifically, a clustering center point is chosen manually, based on experience, by inspecting the residuals between the true and predicted values; clustering is then performed using this center point to obtain a clustering result, and the photovoltaic residual data that do not belong to any cluster in the clustering result are determined to be abnormal data.
However, the inventor found that this DBSCAN-based anomaly detection technique determines the clustering center point manually. If the manually determined center point is inaccurate, then in scenarios where the cluster densities are non-uniform and the spacing between clusters varies widely, it is difficult to choose the minimum number of contained points MinPts and the scanning radius Eps. As a result, the clustering result is inaccurate, that is, the clustering quality is poor, and the abnormal data screened out are inaccurate in turn. Applied to power data anomaly detection, such a method cannot detect and distinguish abnormal data quickly and accurately, and it brings no obvious improvement in data processing accuracy or real-time performance.
To solve the problem that manually determining the clustering center point can lead to a wrongly selected center point, and hence to failed clustering and failed abnormal data detection, the inventor proposes determining the clustering center point with the CFSFDP (Clustering by Fast Search and Find of Density Peaks) algorithm and then screening for abnormal data. When the clustering center point is determined, two factors, density and distance, are considered together. The density factor groups similar data together; the distance factor keeps any two clustering center points sufficiently far apart, so that the clusters differ more from one another and it becomes more accurate to decide which cluster a given piece of data should fall into. The clustering center point can therefore be determined more quickly and accurately, and the clustering computed from it yields more accurate abnormal data.
Specifically, referring to fig. 1, the abnormal data detection method of the photovoltaic power station may include:
S11, acquiring at least one piece of photovoltaic residual data.
Distributed photovoltaic power stations are widely distributed; for example, one photovoltaic power station may be located at each of sites A, B, C, D, E, F, and so on. The abnormal data detection method of this embodiment is provided to determine which of the data monitored by these stations at the same moment are abnormal. This embodiment describes the case in which irradiance and ambient temperature are collected simultaneously.
Suppose there are 23 photovoltaic power stations, each of which collects the ambient temperature and irradiance at the current moment. The ambient temperature and irradiance collected by one station form a vector, so the 23 stations yield 23 vectors, each of which is a piece of actual operation data.
After the actual operation data have been collected, they must be cleaned to remove dirty data. Data cleaning may include:
The first step: missing-value cleaning, which comprises determining the extent of the missing values, removing unneeded fields, filling in missing content, and re-fetching data.
The second step: format and content cleaning, which comprises two stages: fixing inconsistent display formats, and fixing content that does not match its field.
The third step: logical-error cleaning, which comprises three stages: removing duplicates, removing unreasonable values, and correcting contradictory content.
The fourth step: relevance verification, that is, cross-checking multiple data sources so that the data from the different sources do not contradict one another.
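The first three cleaning steps can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the record layout (`station`, `temperature`, `irradiance`), the plausibility ranges, and the choice to drop incomplete records rather than fill or re-fetch them are all hypothetical and are not specified by the description. Relevance verification across data sources is omitted.

```python
from typing import Optional

# Hypothetical plausibility ranges; real limits depend on the site and sensors.
TEMP_RANGE = (-50.0, 70.0)        # ambient temperature, degrees Celsius
IRRADIANCE_RANGE = (0.0, 1500.0)  # irradiance, W/m^2


def to_float(value) -> Optional[float]:
    """Format cleaning: coerce numbers and numeric strings to float."""
    try:
        return float(str(value).strip())
    except (TypeError, ValueError):
        return None


def clean(records):
    """Return one cleaned (temperature, irradiance) pair per station id."""
    cleaned = {}
    for rec in records:
        station = rec.get("station")
        temp = to_float(rec.get("temperature"))
        irr = to_float(rec.get("irradiance"))
        # Missing-value cleaning (assumption: drop instead of fill or re-fetch).
        if station is None or temp is None or irr is None:
            continue
        # Logical-error cleaning: drop physically unreasonable values.
        if not (TEMP_RANGE[0] <= temp <= TEMP_RANGE[1]):
            continue
        if not (IRRADIANCE_RANGE[0] <= irr <= IRRADIANCE_RANGE[1]):
            continue
        # Logical-error cleaning: keep only the first record per station.
        cleaned.setdefault(station, (temp, irr))
    return cleaned


raw = [
    {"station": "A", "temperature": " 25.1 ", "irradiance": "810"},
    {"station": "A", "temperature": "25.1", "irradiance": "810"},   # duplicate
    {"station": "B", "temperature": None, "irradiance": "790"},     # missing value
    {"station": "C", "temperature": "24.0", "irradiance": "-5"},    # unreasonable
    {"station": "D", "temperature": "26.3", "irradiance": "805.5"},
]
result = clean(raw)
```

In this toy input only stations A and D survive cleaning; a production pipeline would more likely fill or re-fetch the missing reading for station B, as the first step describes.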
After the actual operation data are acquired, the predicted operation data at the current moment must also be determined. An exponential smoothing algorithm can be used for this calculation, and the predicted operation data $S_t$ are calculated as:
$S_t = a y_{t-1} + (1-a) S_{t-1}$, where $a$ is the weight value, $y_{t-1}$ is the actual operation data at the previous data acquisition time, and $S_{t-1}$ is the predicted operation data at the previous data acquisition time.
The predicted operation data of each power station at the current moment can be determined through the formula, and the predicted operation data of 23 power stations at the current moment can be obtained.
It should be noted that data are collected at a fixed interval, such as 5 seconds, 10 seconds, or 1 minute, and the previous data acquisition time is the acquisition time immediately preceding the current one; with an interval of 5 seconds, it is 5 seconds earlier. The weight value a is set by a technician according to the specific data detection scenario and is not limited to any particular value.
If the actual operation data have two components, such as ambient temperature and irradiance, the formula is applied to each component separately to obtain its predicted value, and the two predicted values are then combined into a vector that serves as the predicted operation data.
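The component-wise prediction can be illustrated with a minimal Python sketch. The function name `predict_next`, the sample readings, and the restriction of the weight to the interval [0, 1] are assumptions made for illustration; the description only states that the weight is chosen by a technician.

```python
def predict_next(prev_actual, prev_predicted, a):
    """Apply S_t = a * y_{t-1} + (1 - a) * S_{t-1} to each component."""
    if not 0.0 <= a <= 1.0:
        # Assumption: a smoothing weight is normally taken from [0, 1].
        raise ValueError("weight a is assumed to lie in [0, 1]")
    return tuple(a * y + (1.0 - a) * s
                 for y, s in zip(prev_actual, prev_predicted))


# Hypothetical readings: (ambient temperature, irradiance) for one station.
y_prev = (25.0, 800.0)   # actual data at the previous acquisition time
s_prev = (24.0, 780.0)   # predicted data at the previous acquisition time
s_now = predict_next(y_prev, s_prev, a=0.5)
```

With a weight of 0.5 the prediction is the midpoint of the previous actual and previous predicted values for each component, here (24.5, 790.0). A larger weight makes the prediction follow the latest measurement more closely.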
After the actual and predicted operation data of each power station are obtained, the difference between the actual operation data and the predicted operation data of the same power station is taken, and the absolute value of that difference is used as the final photovoltaic residual data of the power station.
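A short sketch of the residual computation for one station follows. The function name and the sample values are hypothetical; the only logic taken from the description is the component-wise absolute difference between actual and predicted data.

```python
def residual(actual, predicted):
    """Component-wise absolute difference between actual and predicted data."""
    return tuple(abs(y - s) for y, s in zip(actual, predicted))


# Hypothetical (ambient temperature, irradiance) values for one station.
r = residual(actual=(25.0, 800.0), predicted=(24.5, 790.0))
```

Here `r` is the two-component residual vector (0.5, 10.0); collecting one such vector per station gives the set of points that is later clustered.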
With 23 power stations, 23 pieces of photovoltaic residual data, that is, 23 photovoltaic residual vectors, are obtained; their distribution is shown in fig. 2, where the abscissa and the ordinate may represent the temperature difference and the illumination difference, respectively. There are 23 circles, each representing one piece of photovoltaic residual data.
As can be seen from fig. 2, point 23 lies far from the other photovoltaic residual data and is very likely abnormal data.
S12, determining the clustering center point of the photovoltaic residual data according to the photovoltaic residual data.
S13, clustering the photovoltaic residual data based on the clustering center point to obtain a clustering result.
The clustering center point is the center point used in data clustering. After the pieces of photovoltaic residual data are obtained, clustering them with the clustering center point yields a number of clusters, which together form the clustering result.
Clustering may use algorithms such as DBSCAN, K-means, or an improved KNN algorithm. Taking the improved KNN algorithm as an example, a corresponding density threshold Eps (also referred to as the scanning radius) is set and the photovoltaic residual data are clustered to obtain the clustering result.
S14, determining photovoltaic residual data in the clustering result that do not belong to any cluster as abnormal data.
In practical applications, photovoltaic residual data that do not belong to any cluster are abnormal data. Once obtained, the abnormal data are pushed to a monitoring system so that photovoltaic operation and maintenance personnel can review them, make decisions, and handle faults in time.
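Steps S13 and S14 can be illustrated with a deliberately simplified Python sketch. It is not DBSCAN, K-means, or the improved KNN algorithm mentioned above: as a stand-in, each point is assigned to its nearest clustering center if it lies within a radius `eps` of that center, and points assigned to no center are flagged as abnormal. The function name, the sample points, and this assignment rule are assumptions made for illustration.

```python
import math


def cluster_and_flag(points, center_ids, eps):
    """Assign points to the nearest center within eps; flag the rest."""
    if not center_ids:
        raise ValueError("at least one clustering center is required")
    labels = []
    for p in points:
        nearest = min(center_ids, key=lambda c: math.dist(p, points[c]))
        within = math.dist(p, points[nearest]) <= eps
        labels.append(nearest if within else None)
    # Points that belong to no cluster are treated as abnormal data.
    anomalies = [i for i, label in enumerate(labels) if label is None]
    return labels, anomalies


# Hypothetical residual vectors; index 1 is taken as the only center.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
labels, anomalies = cluster_and_flag(points, center_ids=[1], eps=1.5)
```

In this toy data the first three points join the cluster around point 1, while the last point lies 9 units from the center and is flagged as abnormal.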
In this embodiment, after the photovoltaic residual data corresponding to the power stations are obtained, the clustering center point is determined from the photovoltaic residual data, the photovoltaic residual data are clustered based on that center point to obtain a clustering result, and the photovoltaic residual data that do not belong to any cluster in the clustering result are determined to be abnormal data. Because the clustering center point is determined from two quantities, the density value and the distance value, it is determined more accurately, so the clustering result obtained with it and the abnormal data identified from it are more accurate as well.
The step of "determining the clustering center point of the photovoltaic residual data according to the photovoltaic residual data" was introduced above; its specific implementation is now described in detail. Referring to fig. 3, step S12 may include:
S21, calculating the density value and the distance value corresponding to the photovoltaic residual data.
In this embodiment, the CFSFDP algorithm is used to determine the clustering center point, which requires the density value and the distance value of each piece of photovoltaic residual data.
In practical applications, the density value of a piece of photovoltaic residual data depends on the Euclidean distances between it and all the other photovoltaic residual data. Specifically, the density value $\rho_i$ is calculated as:
Figure BDA0002323863900000081
where $i$ and $j$ are identifiers of photovoltaic residual data, that is, they indicate which specific piece of photovoltaic residual data is meant; $d_c$ is the photovoltaic residual data threshold; and $d_{i,j}$ is the Euclidean distance between the two pieces of photovoltaic residual data.
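The formula image itself is not reproduced in this text. For orientation only, the density definition commonly used with the CFSFDP algorithm counts the points that lie closer than the threshold; whether the patent's figure has exactly this form is an assumption, although it is consistent with the symbols defined above:

```latex
\rho_i = \sum_{j \neq i} \chi\left(d_{i,j} - d_c\right),
\qquad
\chi(x) =
\begin{cases}
1, & x < 0 \\
0, & x \geq 0
\end{cases}
```

Under this form, $\rho_i$ is simply the number of other photovoltaic residual data whose Euclidean distance to data $i$ is smaller than $d_c$.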
After the density value $\rho_i$ is determined, the distance value $\sigma_i$ of the photovoltaic residual data is determined next. The distance value depends on the density value of the photovoltaic residual data and on the Euclidean distances between it and all the other photovoltaic residual data. Specifically,
$$\sigma_i = \begin{cases} \min_{j \in I_s} d_{i,j}, & I_s \neq \varnothing \\ \max_{j} d_{i,j}, & I_s = \varnothing \end{cases}$$
where, for a given piece of photovoltaic residual data, $I_s$ is the set of photovoltaic residual data whose density value is greater than that of the given piece. If $I_s$ is not empty, the minimum Euclidean distance between the given piece and the members of $I_s$ is taken as $\sigma_i$; if $I_s$ is empty, the maximum Euclidean distance between the given piece and all the other photovoltaic residual data is taken as $\sigma_i$.
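The two quantities can be computed with a short Python sketch. The distance rule follows the description above. The density rule is an assumption: it counts the other points closer than the threshold, which is the cutoff form commonly used with CFSFDP, because the patent's own density formula image is not available here. The function name and sample points are hypothetical.

```python
import math


def density_and_distance(points, d_c):
    """Return (rho, sigma) lists for a list of residual vectors."""
    n = len(points)
    d = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    # Assumed cutoff density: number of other points closer than d_c.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < d_c)
           for i in range(n)]
    sigma = []
    for i in range(n):
        higher = [j for j in range(n) if rho[j] > rho[i]]  # the set I_s
        if higher:
            # Nearest point among those with a higher density value.
            sigma.append(min(d[i][j] for j in higher))
        else:
            # No denser point exists: use the largest distance to any point.
            sigma.append(max(d[i]))
    return rho, sigma


# Hypothetical residual vectors: three close points and one far away.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
rho, sigma = density_and_distance(points, d_c=1.5)
```

For this toy data the middle point of the group has both the highest density and a large distance value, while the isolated point has zero density but a large distance value, which is the signature of an outlier.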
S22, screening out the photovoltaic residual data whose density value and distance value meet the preset condition, and taking it as the clustering center point.
After the density value and the distance value of each piece of photovoltaic residual data are determined, it is judged whether they meet the preset condition. The preset condition may be that the density value is greater than the average of the density values of all the photovoltaic residual data and the distance value is greater than the average of the distance values of all the photovoltaic residual data. In other words, the photovoltaic residual data with both a large density value and a large distance value are screened out, and these are the clustering center points.
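The preset condition translates directly into a filter over the density and distance values. In the following sketch the function name and the input values are illustrative assumptions; only the above-average test on both quantities comes from the description.

```python
from statistics import mean


def select_centers(rho, sigma):
    """Indices whose density and distance values both exceed the averages."""
    rho_mean, sigma_mean = mean(rho), mean(sigma)
    return [i for i, (r, s) in enumerate(zip(rho, sigma))
            if r > rho_mean and s > sigma_mean]


# Hypothetical density and distance values for four residual vectors.
rho = [1, 2, 1, 0]
sigma = [1.0, 9.0, 1.0, 8.0]
centers = select_centers(rho, sigma)
```

Only index 1 passes both tests. Index 3 has a large distance value but a below-average density, so it is not selected as a center, which matches the behaviour expected of an isolated point.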
Referring to fig. 4, after the density value and the distance value of each piece of photovoltaic residual data are obtained, a two-dimensional plot of ρ against σ can be constructed, and the points with large density and distance values, such as points 14 and 19, are selected from it as the clustering center points. Note that point 23 in fig. 4 lies close to the σ axis and far from the ρ axis, and it is determined to be an abnormal point. Point 23 can be screened out by the clustering algorithm; compared with existing anomaly detection algorithms, this algorithm offers higher precision and a higher operation speed.
This embodiment uses the density- and distance-based CFSFDP clustering algorithm. It analyzes two characteristics of an outlier: the low density around it and its distance from the center points. The density factor groups similar data together, and the distance factor keeps any two clustering center points sufficiently far apart, so that the clusters differ more from one another and it becomes more accurate to decide which cluster a given piece of data should fall into. Outliers, that is, abnormal data, can thus be found quickly. Compared with a data anomaly detection algorithm based on density alone or on distance alone, this approach on the one hand preserves the relevance among the original power data as far as possible and on the other hand reduces the dimensionality and complexity of the data. Abnormal data are detected accurately, which helps safeguard the security of the power big data network, and the anomaly detection is more accurate and faster.
Optionally, on the basis of the above embodiment of the abnormal data detection method, another embodiment of the present invention provides an abnormal data detection apparatus for a photovoltaic power plant, and with reference to fig. 5, the abnormal data detection apparatus may include:
a data obtaining module 11, configured to obtain at least one piece of photovoltaic residual data;
a numerical calculation module 12, configured to calculate a density value and a distance value corresponding to the photovoltaic residual data;
a data screening module 13, configured to screen out the photovoltaic residual data whose density value and distance value meet a preset condition, and to take it as a clustering center point of the at least one piece of photovoltaic residual data;
a clustering module 14, configured to cluster the photovoltaic residual data based on the clustering center point to obtain a clustering result;
and an abnormal data determining module 15, configured to determine photovoltaic residual data in the clustering result that do not belong to any cluster as abnormal data.
Further, when the numerical calculation module is configured to calculate the density value corresponding to the photovoltaic residual data, the numerical calculation module is specifically configured to:
obtain a photovoltaic residual data threshold and calculate, using the formula
Figure BDA0002323863900000101
the density value $\rho_i$ corresponding to the photovoltaic residual data, where $i$ and $j$ are identifiers of photovoltaic residual data, $d_{i,j}$ is the Euclidean distance between the two pieces of photovoltaic residual data, and $d_c$ is the photovoltaic residual data threshold.
Further, when the numerical calculation module is configured to calculate a distance value corresponding to the photovoltaic residual data, the numerical calculation module is specifically configured to:
for each piece of photovoltaic residual data, determine the set of photovoltaic residual data whose density value is greater than that of the piece, and calculate the distance value $\sigma_i$ corresponding to the piece according to
$$\sigma_i = \begin{cases} \min_{j \in I_s} d_{i,j}, & I_s \neq \varnothing \\ \max_{j} d_{i,j}, & I_s = \varnothing \end{cases}$$
where $I_s$ is the set of photovoltaic residual data whose density value is greater than that of the photovoltaic residual data in question.
Further, the preset conditions include that the density value is greater than an average value of the density values corresponding to all the photovoltaic residual data, and the distance value is greater than an average value of the distance values corresponding to all the photovoltaic residual data.
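The preset condition above can be sketched directly: a residual is taken as a clustering center point when both its density value and its distance value exceed the respective averages over all residuals. The function name `select_centers` and the sample values are hypothetical.

```python
from statistics import mean


def select_centers(rho, delta):
    """Return indices whose density and distance values both exceed the mean."""
    rho_mean = mean(rho)
    delta_mean = mean(delta)
    return [
        i
        for i, (r, d) in enumerate(zip(rho, delta))
        if r > rho_mean and d > delta_mean
    ]


rho = [3, 2, 1, 0]               # hypothetical density values
delta = [10.0, 1.0, 3.0, 6.0]    # hypothetical distance values
centers = select_centers(rho, delta)
print(centers)
```

Here the last point has a large distance value but a below-average density, so it is not selected as a center; only the first point satisfies both conditions.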
Further, when the data obtaining module is configured to obtain at least one photovoltaic residual data, the data obtaining module is specifically configured to: acquiring actual operation data and predicted operation data of at least one power station, and subtracting the actual operation data and the predicted operation data corresponding to the same power station to obtain photovoltaic residual data corresponding to the power station.
Further, obtaining predicted operational data for at least one power station, comprising:
acquiring the predicted operation data at the previous data acquisition time, the weight value, and the actual operation data collected at the previous data acquisition time, and calculating the predicted operation data of the power station according to the formula $S_t = a\,y_{t-1} + (1-a)\,S_{t-1}$; wherein $a$ is the weight value; $y_{t-1}$ is the actual operation data collected at the previous data acquisition time; and $S_{t-1}$ is the predicted operation data at the previous data acquisition time.
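The prediction and residual steps can be illustrated for a single power station as follows. This is a sketch: the recursion $S_t = a\,y_{t-1} + (1-a)\,S_{t-1}$ is taken from the text, while the seed value `s0` for the first prediction, the function names, and the sample readings are assumptions made for the example.

```python
def predict_series(actual, a, s0):
    """Apply S_t = a * y_{t-1} + (1 - a) * S_{t-1} along a series.

    s0 is the prediction for the first time step; how the first prediction
    is seeded is an assumption of this sketch.
    """
    predicted = [s0]
    for t in range(1, len(actual)):
        predicted.append(a * actual[t - 1] + (1 - a) * predicted[t - 1])
    return predicted


def residual_series(actual, predicted):
    """Residual = actual operation data minus predicted operation data."""
    return [y - s for y, s in zip(actual, predicted)]


actual = [10.0, 12.0, 11.0, 30.0]  # hypothetical readings for one station
predicted = predict_series(actual, a=0.5, s0=actual[0])
res = residual_series(actual, predicted)
print(predicted)
print(res)
```

The last reading departs sharply from the smoothed prediction and therefore produces a large residual, which is the kind of value the later clustering step is meant to isolate.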
In this embodiment, after the photovoltaic residual data corresponding to the power station are obtained, the clustering center points are determined from the photovoltaic residual data, the photovoltaic residual data are clustered based on those center points to obtain a clustering result, and the photovoltaic residual data that do not belong to any cluster in the clustering result are determined to be abnormal data. Furthermore, because the clustering center points are determined from two dimensions, the density value and the distance value, the center points are determined more accurately, so the clustering result obtained from them and the abnormal data identified from that result are also more accurate.
In addition, the density- and distance-based CFSFDP clustering algorithm analyzes two characteristics: low density around a point and a large distance between that point and a center point. The density factor groups similar data together, while the distance factor keeps two cluster center points far enough apart that the two clusters differ more strongly, which improves the accuracy of deciding which cluster a given data item should fall into. The method and the device can therefore quickly find outliers, that is, abnormal data. Compared with an anomaly detection algorithm based on density alone or on distance alone, this approach on the one hand preserves the relevance between the original electric power data as far as possible, and on the other hand reduces the dimensionality and complexity of the data. Abnormal data are thus detected accurately, which helps safeguard the security situation of the electric power big data network, makes the anomaly detection result more accurate, and makes the computation more efficient.
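The final step, treating residuals that belong to no cluster as abnormal, can be sketched as below. The text here does not specify how cluster membership is decided, so the rule used in this example is purely an assumption: each residual is assigned to its nearest center point only if it lies within a hypothetical `radius` of that center, and is otherwise left unassigned. The function name and sample data are likewise hypothetical.

```python
import math


def assign_or_flag(points, centers, radius):
    """Label each point with its nearest center index, or None if too far.

    Assumption: membership means 'within radius of the nearest center'.
    The patent's actual membership rule is not given in this section.
    Requires at least one center.
    """
    labels = []
    for p in points:
        dists = [math.dist(p, points[c]) for c in centers]
        nearest = min(range(len(centers)), key=dists.__getitem__)
        labels.append(centers[nearest] if dists[nearest] <= radius else None)
    return labels


points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (20.0,)]
labels = assign_or_flag(points, centers=[1, 3], radius=1.0)
# Residuals that ended up in no cluster are reported as abnormal data
anomalies = [i for i, label in enumerate(labels) if label is None]
print(labels)
print(anomalies)
```

With these sample values, the residual at 20.0 is far from both centers and is the only one reported as abnormal.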
It should be noted that, for the working process of each module in this embodiment, please refer to the corresponding description in the above embodiments, which is not described herein again.
Optionally, on the basis of the above embodiments of the abnormal data detection method and apparatus, another embodiment of the present invention provides an electronic device, including: a memory and a processor;
wherein the memory is used for storing programs;
the processor calls a program and is used to:
acquiring at least one photovoltaic residual data;
calculating a density value and a distance value corresponding to the photovoltaic residual data;
screening out photovoltaic residual data of which the corresponding density values and distance values meet preset conditions, and taking the photovoltaic residual data as a clustering center point of at least one photovoltaic residual data;
clustering the photovoltaic residual data based on the clustering central point to obtain a clustering result;
and determining photovoltaic residual data which do not belong to any cluster in the clustering result as abnormal data.
In this embodiment, after the photovoltaic residual data corresponding to the power station are obtained, the clustering center points are determined from the photovoltaic residual data, the photovoltaic residual data are clustered based on those center points to obtain a clustering result, and the photovoltaic residual data that do not belong to any cluster in the clustering result are determined to be abnormal data. Furthermore, because the clustering center points are determined from two dimensions, the density value and the distance value, the center points are determined more accurately, so the clustering result obtained from them and the abnormal data identified from that result are also more accurate.
In addition, the density- and distance-based CFSFDP clustering algorithm analyzes two characteristics: low density around a point and a large distance between that point and a center point. The density factor groups similar data together, while the distance factor keeps two cluster center points far enough apart that the two clusters differ more strongly, which improves the accuracy of deciding which cluster a given data item should fall into. The method and the device can therefore quickly find outliers, that is, abnormal data. Compared with an anomaly detection algorithm based on density alone or on distance alone, this approach on the one hand preserves the relevance between the original electric power data as far as possible, and on the other hand reduces the dimensionality and complexity of the data. Abnormal data are thus detected accurately, which helps safeguard the security situation of the electric power big data network, makes the anomaly detection result more accurate, and makes the computation more efficient.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An abnormal data detection method for a photovoltaic power station is characterized by comprising the following steps:
acquiring at least one photovoltaic residual data;
calculating a density value and a distance value corresponding to the photovoltaic residual data;
screening out photovoltaic residual data of which the corresponding density values and distance values meet preset conditions, and taking the photovoltaic residual data as a clustering center point of at least one photovoltaic residual data;
clustering the photovoltaic residual data based on the clustering central point to obtain a clustering result;
and determining photovoltaic residual data which do not belong to any cluster in the clustering result as abnormal data.
2. The abnormal data detection method of claim 1, wherein calculating the density value corresponding to the photovoltaic residual data comprises:
acquiring a photovoltaic residual data threshold;
using the formula
Figure FDA0002323863890000011
to calculate the density value $\rho_i$ corresponding to the photovoltaic residual data; wherein $i$ and $j$ are identifiers of the photovoltaic residual data; $d_{i,j}$ is the Euclidean distance between the two photovoltaic residual data; and $d_c$ is the photovoltaic residual data threshold.
3. The abnormal data detection method according to claim 2, wherein the calculating of the distance value corresponding to the photovoltaic residual data comprises:
for each of the photovoltaic residual data, determining a set of photovoltaic residual data having a corresponding density value greater than a corresponding density value of the photovoltaic residual data;
using the formula
Figure FDA0002323863890000012
to calculate the distance value corresponding to the photovoltaic residual data; wherein $I_s$ is the set of photovoltaic residual data whose corresponding density value is greater than the density value of that photovoltaic residual data.
4. The abnormal data detection method of claim 1, wherein the preset conditions include that the density value is greater than an average of the density values corresponding to all of the photovoltaic residual data, and the distance value is greater than an average of the distance values corresponding to all of the photovoltaic residual data.
5. The abnormal data detection method of claim 1, wherein obtaining at least one photovoltaic residual data comprises:
acquiring actual operation data and predicted operation data of at least one power station;
and subtracting the actual operation data and the predicted operation data corresponding to the same power station to obtain photovoltaic residual data corresponding to the power station.
6. The abnormal data detection method of claim 5, wherein obtaining predicted operational data for at least one power station comprises:
acquiring predicted operation data, a weight value and acquired actual operation data at the previous data acquisition moment;
calculating the predicted operation data of the power station according to the formula $S_t = a\,y_{t-1} + (1-a)\,S_{t-1}$; wherein $a$ is the weight value; $y_{t-1}$ is the actual operation data collected at the previous data acquisition time; and $S_{t-1}$ is the predicted operation data at the previous data acquisition time.
7. An abnormal data detection device of a photovoltaic power station, characterized by comprising:
the data acquisition module is used for acquiring at least one photovoltaic residual error data;
the numerical value calculation module is used for calculating a density value and a distance value corresponding to the photovoltaic residual data;
the data screening module is used for screening out photovoltaic residual data of which the corresponding density values and distance values meet preset conditions and using the photovoltaic residual data as a clustering center point of at least one photovoltaic residual data;
the clustering module is used for clustering the photovoltaic residual error data based on the clustering central point to obtain a clustering result;
and the abnormal data determining module is used for determining the photovoltaic residual data which do not belong to any cluster in the clustering result as abnormal data.
8. The abnormal data detection apparatus according to claim 7, wherein the numerical calculation module, when calculating the density value corresponding to the photovoltaic residual data, is specifically configured to:
obtaining a photovoltaic residual data threshold, and using the formula
Figure FDA0002323863890000021
to calculate the density value $\rho_i$ corresponding to the photovoltaic residual data; wherein $i$ and $j$ are identifiers of the photovoltaic residual data; $d_{i,j}$ is the Euclidean distance between the two photovoltaic residual data; and $d_c$ is the photovoltaic residual data threshold.
9. The abnormal data detection device according to claim 8, wherein the numerical calculation module, when being configured to calculate the distance value corresponding to the photovoltaic residual data, is specifically configured to:
for each photovoltaic residual data, determining the set of photovoltaic residual data whose corresponding density value is greater than the density value of that photovoltaic residual data, and using the formula
Figure FDA0002323863890000031
to calculate the distance value corresponding to the photovoltaic residual data; wherein $I_s$ is the set of photovoltaic residual data whose corresponding density value is greater than the density value of that photovoltaic residual data.
10. An electronic device, comprising: a memory and a processor;
wherein the memory is used for storing programs;
the processor calls a program and is used to:
acquiring at least one photovoltaic residual data;
calculating a density value and a distance value corresponding to the photovoltaic residual data;
screening out photovoltaic residual data of which the corresponding density values and distance values meet preset conditions, and taking the photovoltaic residual data as a clustering center point of at least one photovoltaic residual data;
clustering the photovoltaic residual data based on the clustering central point to obtain a clustering result;
and determining photovoltaic residual data which do not belong to any cluster in the clustering result as abnormal data.
CN201911308534.XA 2019-12-18 2019-12-18 Abnormal data detection method and device for photovoltaic power station and electronic equipment Active CN110995153B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911308534.XA CN110995153B (en) 2019-12-18 2019-12-18 Abnormal data detection method and device for photovoltaic power station and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911308534.XA CN110995153B (en) 2019-12-18 2019-12-18 Abnormal data detection method and device for photovoltaic power station and electronic equipment

Publications (2)

Publication Number Publication Date
CN110995153A (en) 2020-04-10
CN110995153B (en) 2020-11-24

Family

ID=70095586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911308534.XA Active CN110995153B (en) 2019-12-18 2019-12-18 Abnormal data detection method and device for photovoltaic power station and electronic equipment

Country Status (1)

Country Link
CN (1) CN110995153B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016083493A1 (en) * 2014-11-28 2016-06-02 Commissariat A L'energie Atomique Et Aux Energies Alternatives Method of detecting clusters in a set of signals
CN105678402A (en) * 2015-12-29 2016-06-15 北京国能日新***控制技术有限公司 Photovoltaic power prediction method based on seasonal regionalization
CN106446357A (en) * 2016-09-06 2017-02-22 中国农业大学 Multi-state modeling method and apparatus for photovoltaic power generation system
CN106529731A (en) * 2016-11-17 2017-03-22 云南电网有限责任公司电力科学研究院 Regional power grid photovoltaic power station cluster division method
CN106548270A (en) * 2016-09-30 2017-03-29 许昌许继软件技术有限公司 A kind of photovoltaic plant power anomalous data identification method and device
CN106991430A (en) * 2017-02-28 2017-07-28 浙江工业大学 A kind of cluster number based on point of proximity method automatically determines Spectral Clustering
CN109409405A (en) * 2018-09-18 2019-03-01 中国电力科学研究院有限公司 A kind of active power distribution network bad data recognition method and apparatus
CN109842372A (en) * 2017-11-28 2019-06-04 中国电力科学研究院有限公司 A kind of photovoltaic module fault detection method and system

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112579584A (en) * 2020-12-21 2021-03-30 北京华能新锐控制技术有限公司 Photoelectric abnormal data detection method and device
CN115601197A (en) * 2022-11-28 2023-01-13 深圳市峰和数智科技有限公司(Cn) Abnormal state detection method and device for photovoltaic power station
CN115601197B (en) * 2022-11-28 2023-03-10 深圳市峰和数智科技有限公司 Abnormal state detection method and device for photovoltaic power station
CN116992389A (en) * 2023-09-26 2023-11-03 河北登浦信息技术有限公司 False data detection method and system for Internet of things
CN116992389B (en) * 2023-09-26 2023-12-29 河北登浦信息技术有限公司 False data detection method and system for Internet of things
CN117540225A (en) * 2024-01-09 2024-02-09 成都电科星拓科技有限公司 Distributed ups system consistency assessment system and method based on DBSCAN clustering
CN117540225B (en) * 2024-01-09 2024-04-12 成都电科星拓科技有限公司 Distributed ups system consistency assessment system and method based on DBSCAN clustering
CN118035916A (en) * 2024-03-01 2024-05-14 国网黑龙江省电力有限公司佳木斯供电公司 Rural power grid power supply fault abnormality detection method
CN117932523A (en) * 2024-03-25 2024-04-26 山东诚祥建设集团股份有限公司 Construction engineering construction data processing method and system
CN117932523B (en) * 2024-03-25 2024-06-04 山东诚祥建设集团股份有限公司 Construction engineering construction data processing method and system

Also Published As

Publication number Publication date
CN110995153B (en) 2020-11-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 100032 room 8018, 8 / F, building 7, Guangyi street, Xicheng District, Beijing

Patentee after: State Grid Digital Technology Holdings Co.,Ltd.

Patentee after: Guowang Xiongan Finance Technology Group Co.,Ltd.

Patentee after: BEIJING BG-SUGON BIG DATA CO.,LTD.

Address before: 8 / F, building 1, Xianglong business building, 311 guang'anmennei street, Xicheng District, Beijing 100053

Patentee before: STATE GRID ELECTRONIC COMMERCE Co.,Ltd.

Patentee before: Guowang Xiongan Finance Technology Group Co.,Ltd.

Patentee before: BEIJING BG-SUGON BIG DATA CO.,LTD.