CN116595425B - Defect identification method based on power grid dispatching multi-source data fusion - Google Patents

Info

Publication number
CN116595425B
Authority
CN
China
Prior art keywords
sequence
data
operation data
event
defect identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310857687.XA
Other languages
Chinese (zh)
Other versions
CN116595425A (en)
Inventor
黄迪
罗少杰
严性平
顾建炜
朱超越
边巧燕
胡锡幸
陈潘霞
郑伟彦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dayou Industrial Co ltd Hangzhou Science And Technology Development Branch
Original Assignee
Zhejiang Dayou Industrial Co ltd Hangzhou Science And Technology Development Branch
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dayou Industrial Co ltd Hangzhou Science And Technology Development Branch
Priority to CN202310857687.XA
Publication of CN116595425A
Application granted
Publication of CN116595425B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/251 Fusion techniques of input or preprocessed data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The application discloses a defect identification method based on power grid dispatching multi-source data fusion, comprising the following steps. S1: generating signal sequences from the operation data of different types of equipment in the power grid. S2: selecting a corresponding clustering algorithm according to the data types in the signal sequences and performing cluster analysis to obtain different event sequences. S3: analyzing the data defects of the event sequences with a dynamic programming algorithm, comparing the event sequence of the real-time operation data with the event sequence of normal equipment operation, judging whether the real-time operation data are abnormal, and outputting a defect identification result if they are abnormal. Extracting distribution network operation fault feature information on the basis of multi-source data fusion improves the reliability and accuracy of fault feature extraction and diagnosis, and the method is widely applicable to power grid fault diagnosis.

Description

Defect identification method based on power grid dispatching multi-source data fusion
Technical Field
The application relates to the field of data processing, in particular to a defect identification method based on power grid dispatching multi-source data fusion.
Background
Distribution network automation is advancing rapidly, which has made fault data much easier to obtain. When a distribution network fails, a large volume of complex fault information floods into the dispatching center, and dispatchers need to grasp the most essential alarm information quickly and accurately. Identifying the fault quickly and accurately is nevertheless very difficult, mainly because of misjudgments and missed judgments. The personnel processing this information therefore need an effective theory and method of distribution network fault feature extraction to supply fault feature information as an aid to judgment, so that safe operation of the distribution network is ensured.
At present, defect identification commonly acquires feature information and then locates the fault with some intelligent algorithm. How the feature information is processed clearly affects the accuracy and efficiency of the subsequent identification, so improving the accuracy and efficiency of data processing, from early-stage processing through to final identification, remains a problem in the prior art.
Disclosure of Invention
To address the low accuracy and efficiency of data processing in the prior art, the application provides a defect identification method based on power grid dispatching multi-source data fusion. The method uses sequences as the data carrier, refines the signal sequences into event sequences through clustering, and then identifies defects by analyzing and comparing the sequences, thereby improving the accuracy and efficiency of defect identification.
The following is a technical scheme of the application.
A defect identification method based on power grid dispatching multi-source data fusion comprises the following steps:
S1: generating signal sequences from the operation data of different types of equipment in the power grid;
S2: selecting a corresponding clustering algorithm according to the data types in the signal sequences and performing cluster analysis to obtain different event sequences;
S3: analyzing the data defects of the event sequences with a dynamic programming algorithm, comparing the event sequence of the real-time operation data with the event sequence of normal equipment operation, judging whether the real-time operation data are abnormal, and outputting a defect identification result if they are abnormal.
In this method, the data of different types of equipment in the power grid are gathered and represented as signal sequences; cluster analysis matched to the data type is then applied to the signal sequences to turn them into event sequences; finally, the data defects of the event sequences are analyzed with a dynamic programming algorithm, and the identification result is obtained by comparing sequences. Because every stage operates on sequences, the data keep a uniform form throughout processing, in contrast to conventional processing in which the form of the data differs greatly from one stage to the next, and this improves processing efficiency and accuracy.
Preferably, step S1, generating signal sequences from the operation data of different types of equipment in the power grid, comprises:
collecting operation data of different types of equipment in the power grid;
sorting the operation data to form a signal sequence, which is stored as an intermediate table, where the elements of the signal sequence are the different kinds of operation data collected from the different devices.
Preferably, step S2, selecting a corresponding clustering algorithm according to the data types in the signal sequences and performing cluster analysis to obtain different event sequences, comprises:
adopting different clustering algorithms according to the form of the input data: current, voltage and power data use K-means clustering based on the Euclidean distance between sentence vectors, switching-value data use DBSCAN density-based clustering, and non-electric-quantity data use an agglomerative hierarchical algorithm based on edit distance. Because the data types are numerous and differ in their characteristics, a single clustering method cannot accommodate the differences between them, so three clustering algorithms are used, one for each kind of data.
Preferably, the step S3: analyzing the data defects of the event sequence by adopting a dynamic programming algorithm, comparing the event sequence of the real-time operation data with the event sequence of the normal operation of the equipment, judging whether the real-time operation data is abnormal or not, and outputting defect identification results if the real-time operation data is abnormal, wherein the method comprises the following steps:
performing adjacent deduplication on the event sequence, and only reserving signals which appear for the first time for adjacent repeated signals;
forming the complete sequence of events into an overall sequence setWherein->Represents a sequence, +.>Representing a signal in a sequence;
is provided with,/>For two sequences A and B are both +.>Is a sequence of->And->Are all signals in a sequence;
let any common subsequence thereof beThen: if->Then->And->Is->And->Is the longest sub-sequence; if->And is also provided withThen->Is->And->Is a LCS of (C); if->And->Then->Is->And->Is a LCS of (C);
using a two-dimensional arrayRepresenting the corresponding pre +.in A and B>The length of LCS of a character yields the following formula:
(1);
if it isThere is->And according to->I.e. +.>Prefix->Is->The longest common subsequence of (2) is also +.>Is to convert the question into +.>And->A kind of electronic device
(2);
If it isThen->Or->Due to->And->At least one of them must be true; if->There is->Similarly, if->There is->The method comprises the steps of carrying out a first treatment on the surface of the At this time, the problem is changed to +.>And->LCS of->And->LCS of (a); />The length of (2) is:
(3);
by comparing the longest subsequences of the two sequences A and B, whether the operation data is abnormal or not can be judged, wherein one sequence is an implementation operation data sequence, and the other sequence is a normal operation data sequence of the equipment.
Preferably, the method further comprises S4: evaluating and displaying the effectiveness of defect identification, in which the error of defect identification is calculated with three evaluation indexes, namely root mean square error, mean absolute error and perplexity, and the calculation results are displayed.
Preferably, step S3 further includes labeling the event sequence, comprising:
generating different labels, table by table, for the equipment in the distribution network equipment table, the fault indicator table, the main transformer information table, the line information table, the pole distribution information table, the remote sensing data table and the non-electric-quantity data table, and mapping the labels to the ID fields of each table;
mapping the real-time operation data table to the labels generated in the previous step to generate an intermediate table;
associating the real-time operation data table with the intermediate table to generate a labeled signal intermediate table, which completes the labeling.
The application also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the defect identification method based on the power grid dispatching multi-source data fusion when calling the computer program in the memory.
The application also provides a storage medium, wherein the storage medium stores computer executable instructions, and when the computer executable instructions are loaded and executed by a processor, the steps of the defect identification method based on the power grid dispatching multi-source data fusion are realized.
The substantial effects of the application include:
the application focuses on how the electrical-quantity data change before and after a fault occurs, analyzes the data comprehensively, and designs a targeted algorithm for each kind of data. First, the signal sources of the distribution network equipment are divided into defect signals of the different types of distribution network equipment and of distribution network switchgear, which are extracted for clustering; a clustering algorithm suited to each type is then selected; finally, the data defects of the event sequences are analyzed and compared with a dynamic programming algorithm to obtain the result. Extracting distribution network operation fault feature information on the basis of multi-source data fusion improves the reliability and accuracy of fault feature extraction and diagnosis, and the method is widely applicable to power grid fault diagnosis.
Drawings
FIG. 1 is a flow chart of an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solution will be clearly and completely described in the following in conjunction with the embodiments, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not imply an order of execution; the order of execution should be determined by the functions and internal logic of the processes, and the sequence numbers should not limit the implementation of the embodiments in any way.
It should be understood that in the present application, "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements that are expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "plurality" means two or more. "And/or" merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it. "Comprising A, B and C" and "comprising A, B, C" mean that all three of A, B and C are included; "comprising A, B or C" means that one of A, B and C is included; and "comprising A, B and/or C" means that any one, any two, or all three of A, B and C are included.
The technical scheme of the application is described in detail below by specific examples. Embodiments may be combined with each other and the same or similar concepts or processes may not be described in detail in some embodiments.
Embodiment one:
A defect identification method based on power grid dispatching multi-source data fusion, as shown in FIG. 1, comprises the following steps:
S1: generating signal sequences from the operation data of different types of equipment in the power grid; comprising:
collecting operation data of different types of equipment in the power grid;
sorting the operation data to form a signal sequence, which is stored as an intermediate table, where the elements of the signal sequence are the different kinds of operation data collected from the different devices.
The collected data include distribution network dispatching operation data, remote sensing data, non-electric-quantity data and the like. The distribution network dispatching operation data are: the total load of the distribution network; the node voltages of buses within the distribution network, covering the nodes from the 110 kV/35 kV buses of 220 kV stations to the 10 kV buses of 110 kV/35 kV stations; the active power, reactive power and current of the main transformers in the distribution network (220 kV main transformers and 110 kV/35 kV main transformers); the active current and reactive current of the 110 kV/35 kV lines and of the 10 kV outgoing lines of substations in the distribution network; the bus voltages of 10 kV switching stations, distribution stations, ring main units and box-type transformers; the pole distribution power flow, namely the active power, reactive power and current of 10 kV pole distribution equipment; and the line power flow, namely the active power, reactive power and current of 10 kV feeder segments. The remote sensing data comprise switch open/closed states and electric pulse quantities. The non-electric-quantity data are the transformer oil temperature.
A SCADA (supervisory control and data acquisition) system is mainly used for state estimation and early warning of distribution network operation. Adopting such a system in the distribution system allows the data to be collected and gives relatively high data utilization efficiency.
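As a rough illustration of step S1, the sketch below sorts heterogeneous readings into one signal sequence and flattens it into rows of an intermediate table. The `Reading` record, its field names, the sample values, and the choice to order by timestamp are all assumptions made for the example; the text above does not fix a schema or a sort key.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Reading:
    """One collected value (hypothetical schema, not taken from the patent)."""
    timestamp: datetime
    device_id: str
    kind: str       # e.g. "voltage", "switch_state", "oil_temperature"
    value: float


def build_signal_sequence(readings):
    """Order readings by time; ties are broken by device id and data kind."""
    return sorted(readings, key=lambda r: (r.timestamp, r.device_id, r.kind))


readings = [
    Reading(datetime(2023, 7, 1, 8, 5), "T1", "oil_temperature", 61.5),
    Reading(datetime(2023, 7, 1, 8, 0), "SW7", "switch_state", 1.0),
    Reading(datetime(2023, 7, 1, 8, 0), "B10", "voltage", 10.4),
]

signal_sequence = build_signal_sequence(readings)

# The "intermediate table" is modelled here as a plain list of row tuples.
intermediate_table = [
    (r.timestamp.isoformat(), r.device_id, r.kind, r.value)
    for r in signal_sequence
]
```

Keeping the data kind in every row is what would later allow a different clustering algorithm to be chosen per data type.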
S2: selecting a corresponding clustering algorithm according to the data types in the signal sequences and performing cluster analysis to obtain different event sequences; comprising:
adopting different clustering algorithms according to the form of the input data: current, voltage and power data use K-means clustering based on the Euclidean distance between sentence vectors, switching-value data use DBSCAN density-based clustering, and non-electric-quantity data use an agglomerative hierarchical algorithm based on edit distance. Because the data types are numerous and differ in their characteristics, a single clustering method cannot accommodate the differences between them, so three clustering algorithms are used, one for each kind of data.
According to the analysis requirements, a clustering method is used to search for event sequences, and sequences that fall into the same cluster are regarded as the same kind of event sequence. Three inter-sequence distances need to be implemented: edit distance, a distance based on the longest common subsequence, and the Euclidean distance between sentence vectors. The event sequence signals parsed from the operation log are the important sequence signals; to find sequence signals similar to them, clustering is applied, and signals in the same cluster are regarded as similar. According to the form of the input data, agglomerative hierarchical clustering based on edit distance, K-means clustering based on sentence-vector Euclidean distance, and DBSCAN density-based clustering are implemented, and the event classification is optimized according to the cluster analysis results.
Specifically, the three clustering algorithms are as follows:
(1) Clustering algorithm based on string edit distance
Agglomerative (hierarchical) clustering based on edit distance. Agglomerative clustering first treats each sample as a separate cluster and counts the clusters; if the count exceeds the given number of clusters, the nearest clusters are merged to form fewer clusters, and this is repeated until the number of clusters reaches the expected value. The number of clusters K must be set, and because there are no cluster centers, class prediction cannot be performed on samples outside the training set. Agglomerative clustering model training: the distance matrix of the sequence set, generated from edit distance, is used as the distance measure; agglomerative clustering is run with different numbers of clusters, the clustering performance is judged by the silhouette coefficient, and the better number of clusters is selected as the parameter for cluster analysis. The similarities and differences among the sequences in each cluster are then examined, and their correspondence with the event classification is analyzed.
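The procedure just described can be sketched in plain Python as follows. This is a minimal illustration, not the patent's implementation: the average-linkage merge rule, the toy sequences (one character per signal code), and the function names are assumptions, since the text only says that the nearest clusters are merged.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of signal codes."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def agglomerative(sequences, n_clusters):
    """Start with one cluster per sample and repeatedly merge the two
    closest clusters (average linkage, an assumed choice) until only
    n_clusters remain. Returns clusters as lists of sample indices."""
    dist = [[edit_distance(s, t) for t in sequences] for s in sequences]
    clusters = [[i] for i in range(len(sequences))]

    def linkage(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        p, q = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Toy event sequences; each character stands for one encoded signal.
sequences = ["ABCD", "ABCE", "ABD", "XYZ", "XYZW"]
clusters = agglomerative(sequences, n_clusters=2)
```

Because the algorithm only needs pairwise distances, it works directly on variable-length sequences, which is why no cluster center exists for predicting the class of unseen samples.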
(2) Clustering algorithm based on the Euclidean distance between sequence sentence vectors
K-means clustering based on sentence-vector Euclidean distance. K samples are randomly selected from the whole sample space as the initial cluster centers; the Euclidean distance between every sample and every cluster center is calculated, and each sample is assigned to the class of its nearest center, which completes one round of cluster partitioning. The geometric center of the sample set of each resulting cluster is then calculated; if it does not coincide with the cluster center, the geometric center is taken as the new cluster center and the distance from each sample to the new centers is recalculated to obtain a new partition. The geometric centers are recalculated in this way until the geometric center of each cluster coincides with, or is sufficiently close to, the cluster center on which the partition was based. The number of clusters K must be set.
K-means clustering model training: the set of sequence vectors obtained with the Doc2Vec algorithm is used as the samples; K-means clustering is run with different numbers of clusters, the clustering performance is judged by the silhouette coefficient, and the better number of clusters is selected as the parameter for cluster analysis. The similarities and differences among the sequences in each cluster are then examined, and their correspondence with the event classification is analyzed.
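A compact sketch of the K-means loop described above is given below. It assumes the sequence vectors are already available as fixed-length tuples (the Doc2Vec step is not reproduced here); the two-dimensional sample vectors, the fixed random seed, and the convergence tolerance are illustrative assumptions.

```python
import math
import random


def kmeans(points, k, max_iter=100, tol=1e-9, seed=0):
    """Plain K-means on tuples of floats with Euclidean distance.

    Returns (labels, centers). Initial centers are k randomly chosen samples.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each sample joins the class of its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the geometric center of its cluster.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:  # centers coincide with the previous ones: converged
            break
    return labels, centers


# Hypothetical 2-D "sentence vectors" forming two well-separated groups.
vectors = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
           (5.0, 5.0), (5.2, 4.9), (4.9, 5.3)]
labels, centers = kmeans(vectors, k=2)
```

In practice the loop would be repeated for several values of K and the partition with the best clustering score kept, as the training paragraph describes.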
(3) Clustering algorithm based on DBSCAN
DBSCAN clustering draws a circle of a given radius around an arbitrary sample; the samples inside the circle are regarded as belonging to the same cluster as the center sample. Circles are then drawn in the same way around the samples inside the circle, so that more and more samples are covered and added to the cluster, until no new sample is added, which completes the partition of one cluster. One of the remaining samples is then chosen and the process is repeated to partition the other clusters, until no unused samples remain. An optimal partition radius may be selected according to the silhouette coefficient score.
DBSCAN clustering model training: the distance matrix of the sequence set, generated from edit distance, is used as the distance measure; DBSCAN clustering is run with different partition radii, and the better partition radius is selected as the parameter for cluster analysis. The similarities and differences among the sequences in each cluster are then examined, and their correspondence with the event classification is analyzed.
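The region-growing idea above can be sketched on a precomputed edit-distance matrix as follows. The `min_pts` parameter is the usual DBSCAN density threshold and is an assumption here, since the text only mentions the partition radius; the toy sequences and parameter values are likewise illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of signal codes."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (0 if x == y else 1)))
        prev = curr
    return prev[-1]


def dbscan(dist, eps, min_pts):
    """DBSCAN on a precomputed distance matrix.

    Returns one label per sample; -1 marks noise.
    """
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1            # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster   # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [m for m in range(n) if dist[j][m] <= eps]
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep growing
    return labels


sequences = ["ABCD", "ABCE", "ABD", "XYZ", "XYZW", "QQQQQQ"]
dist = [[edit_distance(s, t) for t in sequences] for s in sequences]
labels = dbscan(dist, eps=1, min_pts=2)
```

Unlike the other two algorithms, the number of clusters is not fixed in advance, and isolated sequences are reported as noise rather than forced into a cluster.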
S3: analyzing the data defects of the event sequences with a dynamic programming algorithm, comparing the event sequence of the real-time operation data with the event sequence of normal equipment operation, judging whether the real-time operation data are abnormal, and outputting a defect identification result if they are abnormal; comprising:
performing adjacent deduplication on the event sequence, keeping only the first occurrence of adjacent repeated signals;
forming the complete event sequences into an overall sequence set $S$, where each element of $S$ is a sequence and each element of a sequence is a signal;
letting $A = (a_1, a_2, \ldots, a_m)$ and $B = (b_1, b_2, \ldots, b_n)$ be two sequences, both belonging to $S$, where $a_i$ and $b_j$ are signals in the sequences, and writing $A_i$ and $B_j$ for the prefixes formed by the first $i$ signals of $A$ and the first $j$ signals of $B$;
letting $Z = (z_1, z_2, \ldots, z_k)$ be any longest common subsequence (LCS) of $A$ and $B$, then: if $a_m = b_n$, then $z_k = a_m = b_n$ and $Z_{k-1}$ is an LCS of $A_{m-1}$ and $B_{n-1}$; if $a_m \neq b_n$ and $z_k \neq a_m$, then $Z$ is an LCS of $A_{m-1}$ and $B$; if $a_m \neq b_n$ and $z_k \neq b_n$, then $Z$ is an LCS of $A$ and $B_{n-1}$;
using a two-dimensional array $c[i][j]$ to denote the length of the LCS of the first $i$ signals of $A$ and the first $j$ signals of $B$, which yields the following formula:
$$c[i][j] = \begin{cases} 0, & i = 0 \text{ or } j = 0 \\ c[i-1][j-1] + 1, & i, j > 0 \text{ and } a_i = b_j \\ \max(c[i-1][j],\; c[i][j-1]), & i, j > 0 \text{ and } a_i \neq b_j \end{cases} \tag{1}$$
if $a_i = b_j$, then for an LCS $Z$ of $A_i$ and $B_j$ with last signal $z_k$ there is $z_k = a_i = b_j$, and the prefix $Z_{k-1}$ of $Z$ is an LCS of $A_{i-1}$ and $B_{j-1}$; the problem is thereby converted into finding the LCS of $A_{i-1}$ and $B_{j-1}$, whose length gives:
$$c[i][j] = c[i-1][j-1] + 1 \tag{2}$$
if $a_i \neq b_j$, then $z_k \neq a_i$ or $z_k \neq b_j$, since at least one of the two must hold; if $z_k \neq a_i$, then $Z$ is an LCS of $A_{i-1}$ and $B_j$; similarly, if $z_k \neq b_j$, then $Z$ is an LCS of $A_i$ and $B_{j-1}$. The problem then becomes finding the LCS of $A_{i-1}$ and $B_j$ and the LCS of $A_i$ and $B_{j-1}$, and the length of the LCS of $A_i$ and $B_j$ is:
$$c[i][j] = \max(c[i-1][j],\; c[i][j-1]) \tag{3}$$
by comparing the longest common subsequence of the two sequences A and B, where one is the real-time operation data sequence and the other is the normal operation data sequence of the equipment, it can be judged whether the operation data are abnormal.
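The dynamic-programming comparison in S3 can be sketched as below: a two-dimensional table holds the LCS length for every pair of prefixes, and the final entry is compared with the sequence lengths. The similarity ratio and the `min_similarity` threshold of 0.8 are assumptions introduced for illustration; the text does not state how the LCS result is turned into an abnormal/normal decision.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    m, n = len(a), len(b)
    # c[i][j] is the LCS length of the first i items of a and first j of b.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]


def is_abnormal(realtime, normal, min_similarity=0.8):
    """Flag the real-time sequence when it shares too little with the
    normal-operation sequence (hypothetical decision rule)."""
    longest = max(len(realtime), len(normal))
    if longest == 0:
        return False
    similarity = lcs_length(realtime, normal) / longest
    return similarity < min_similarity


normal_sequence = ["open", "close"]
realtime_sequence = ["open", "trip", "close"]
defect_found = is_abnormal(realtime_sequence, normal_sequence)
```

Here the real-time sequence contains an extra `trip` signal, so only two of its three signals align with the normal sequence and the ratio 2/3 falls below the assumed threshold.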
Embodiment two:
In this embodiment, on the basis of embodiment one, S3 further includes labeling the event sequence, comprising:
generating different labels, table by table, for the equipment in the distribution network equipment table, the fault indicator table, the main transformer information table, the line information table, the pole distribution information table, the remote sensing data table and the non-electric-quantity data table, and mapping the labels to the ID fields of each table;
mapping the real-time operation data table to the labels generated in the previous step to generate an intermediate table;
associating the real-time operation data table with the intermediate table to generate a labeled signal intermediate table, which completes the labeling.
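The labeling steps amount to building a lookup from device ID to label and joining it onto the real-time rows. The sketch below models each table as a list of dictionaries; the table contents, the `id` and `device_id` field names, and the `table:ID` label format are hypothetical, chosen only to make the join concrete.

```python
# Hypothetical extracts of two of the source tables named in the text.
device_tables = {
    "main_transformer": [{"id": "MT-01", "name": "No. 1 main transformer"}],
    "line": [{"id": "LN-07", "name": "Feeder 7"}],
}


def build_label_map(tables):
    """Generate one label per device, table by table, keyed by the ID field."""
    label_map = {}
    for table_name, rows in tables.items():
        for row in rows:
            label_map[row["id"]] = f"{table_name}:{row['id']}"
    return label_map


def label_signals(realtime_rows, label_map):
    """Join real-time rows with the label map to get a labeled signal table."""
    return [
        {**row, "label": label_map.get(row["device_id"], "unknown")}
        for row in realtime_rows
    ]


realtime_rows = [
    {"device_id": "MT-01", "signal": "oil_temperature_high"},
    {"device_id": "LN-07", "signal": "overcurrent"},
    {"device_id": "ZZ-99", "signal": "position_change"},
]

label_map = build_label_map(device_tables)
labeled_signal_table = label_signals(realtime_rows, label_map)
```

Rows whose device is missing from every source table are kept with an `unknown` label so that gaps in the equipment tables stay visible.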
In addition, the method further comprises S4: evaluating and displaying the effectiveness of defect identification, in which the error of defect identification is calculated with three evaluation indexes, namely root mean square error, mean absolute error and perplexity, and the calculation results are displayed.
The mean absolute error is calculated as follows:
(4);
The root mean square error is calculated as follows:
(5);
The smaller the mean absolute error and the root mean square error, the more accurate the result.
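For orientation, the two error measures can be computed with their standard textbook definitions: the mean of the absolute differences, and the square root of the mean squared difference, between actual and predicted values. The argument names and the 0/1 encoding of defect labels in the example are assumptions; the patent's own notation is not reproduced here.

```python
import math


def mean_absolute_error(actual, predicted):
    """MAE: average of |actual - predicted| over all samples."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """RMSE: square root of the average squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical evaluation: 1 = defect present, 0 = no defect.
actual = [1.0, 0.0, 1.0, 1.0]
predicted = [1.0, 1.0, 1.0, 0.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of the identification error.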
The perplexity is calculated as follows:
(6);
where the symbols denote, respectively, the device defects of the test set, the other defects of the device, the expert knowledge rules recommended for a defect, and the total size of the equipment fault set.
In practical application, under normal conditions switch opening and closing is directly related to planned outages and tripping, and a switch generally operates no more than 3 times per day (one opening and one closing counting as one operation). For switch position-change signals, therefore, signals sent more than 6 times per day (opening and closing included) are regarded as false signals, which screens out 99.08% of the signals. For the whole signal set, repeated signals sent consecutively under the same line or station are regarded as frequent signals and only the first one is retained as the valid signal; this rule screens out 3.42% of the signals.
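The two screening rules can be sketched as simple filters. The record layout (`switch`, `time`, `code`) and the decision to count signals per switch and calendar day are assumptions for the example; the limit of 6 signals per day and the keep-the-first rule come from the paragraph above.

```python
from collections import Counter
from datetime import datetime


def drop_false_switch_signals(signals, max_per_day=6):
    """Discard position-change signals from any switch that sent more than
    max_per_day of them on the same day."""
    counts = Counter((s["switch"], s["time"].date()) for s in signals)
    return [s for s in signals
            if counts[(s["switch"], s["time"].date())] <= max_per_day]


def dedupe_adjacent(codes):
    """Keep only the first of each run of consecutive repeated signals."""
    kept = []
    for code in codes:
        if not kept or kept[-1] != code:
            kept.append(code)
    return kept


# SW1 chatters 8 times in one day, SW2 operates normally (open then close).
signals = (
    [{"switch": "SW1", "time": datetime(2023, 7, 1, 8, m), "code": "position_change"}
     for m in range(8)]
    + [{"switch": "SW2", "time": datetime(2023, 7, 1, 9, m), "code": "position_change"}
       for m in range(2)]
)

kept_signals = drop_false_switch_signals(signals)
compact = dedupe_adjacent(["A", "A", "B", "B", "B", "A"])
```

Note that `dedupe_adjacent` removes only consecutive repeats, so a signal that recurs after a different signal is preserved.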
Taking the operation log and the remote-signaling SOE table as examples, the training process for equipment defect prediction is described below:
1. Data source
The data come from two tables: the operation log and the remote-signaling SOE table YX_SOE.
2. Field cleaning
According to the analysis requirements, the operation data table needs to be split into fields such as event, line, device type, device name, signal and state. First, five fields need to be parsed out of the original fields: line, device type, device name, signal type and signal state. All signals that belong to the same line and occur within 15 minutes are treated as the same event; moving one minute further along the time axis, signals sent beyond that point are recorded as a separate event. In the original data, many records lack information such as the line name or device type, and these must be filled in from the known information.
Parsing events from the operation log: the "info" field of the operation log table is processed by regular-expression matching.
The data studied consist of the signals sent by all devices on the same line within 15 minutes of an event parsed from the log.
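One way to read the 15-minute rule is sketched below: signals are processed in time order, and a signal starts a new event for its line when it falls more than 15 minutes after the start of that line's current event. This anchoring on the event start is an interpretation, and the `(line, time, code)` tuple layout is hypothetical.

```python
from datetime import datetime, timedelta


def group_events(signals, window=timedelta(minutes=15)):
    """Group (line, time, code) signals into per-line events.

    Returns a list of (line, [codes]) in order of event start time.
    """
    events = []
    open_event = {}  # line -> (event start time, list of codes)
    for line, time, code in sorted(signals, key=lambda s: s[1]):
        current = open_event.get(line)
        if current is None or time - current[0] > window:
            current = (time, [])
            open_event[line] = current
            events.append((line, current[1]))
        current[1].append(code)
    return events


signals = [
    ("L1", datetime(2023, 7, 1, 8, 0), "a"),
    ("L1", datetime(2023, 7, 1, 8, 10), "b"),
    ("L2", datetime(2023, 7, 1, 8, 5), "c"),
    ("L1", datetime(2023, 7, 1, 8, 30), "d"),
]

events = group_events(signals)
```

Each resulting code list is one event sequence of the kind that the clustering and LCS steps operate on.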
3. Data labeling
Because the parsed fields are all text, the text variables need to be encoded as labels to facilitate later calculation, as shown in Tables 1 and 2:
TABLE 1
TABLE 2
Encoding the corresponding fields according to Table 1 gives the event representation of the sunny line as a group of sequences; see Table 2 for the form.
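As an illustration of this labeling step, the sketch below assigns integer codes to text fields and encodes an event as a sequence of code tuples. The field names in `FIELDS`, the sample values, and the first-appearance coding order are hypothetical; the actual code assignments are those of Table 1.

```python
# Hypothetical field names; the real fields and codes are defined in Table 1.
FIELDS = ("device_type", "signal", "state")

def build_codebooks(rows):
    """Give every distinct text value of each field an integer code."""
    codebooks = {f: {} for f in FIELDS}
    for row in rows:
        for f in FIELDS:
            # Codes are assigned in order of first appearance.
            codebooks[f].setdefault(row[f], len(codebooks[f]))
    return codebooks

def encode_event(rows, codebooks):
    """Turn the rows of one event into a sequence of integer tuples."""
    return [tuple(codebooks[f][row[f]] for f in FIELDS) for row in rows]

rows = [
    {"device_type": "switch", "signal": "position", "state": "open"},
    {"device_type": "protection", "signal": "overcurrent", "state": "act"},
    {"device_type": "switch", "signal": "position", "state": "close"},
]
codebooks = build_codebooks(rows)
encoded = encode_event(rows, codebooks)
print(encoded)  # [(0, 0, 0), (1, 1, 1), (0, 0, 2)]
```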
Assume that N kinds of defect labels exist for a certain distribution network device, and that the device portrait feature texts of defect $d_i$ and defect $d_j$ contain N labels. The similarity $\mathrm{sim}(d_i, d_j)$ of the two device defects $d_i$ and $d_j$ is calculated as follows:
$\mathrm{sim}(d_i, d_j) = \dfrac{n_{ij}}{N_{ij}}$ (7);
where $n_{ij}$ indicates the number of features common to the device portraits of defect $d_i$ and defect $d_j$, and $N_{ij}$ indicates the total feature quantity of the device portraits of defect $d_i$ and defect $d_j$.
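A similarity defined as the ratio of shared features to total features can be computed as in the sketch below. It assumes that the "total feature quantity" is the number of distinct features across both portraits (the size of the set union, i.e. a Jaccard index); the text does not state this explicitly, and the feature names are invented for the example.

```python
def defect_similarity(features_i, features_j):
    """Ratio of shared portrait features to all distinct features of two defects."""
    a, b = set(features_i), set(features_j)
    union = a | b  # assumption: "total" means distinct features of both defects
    if not union:
        return 0.0
    return len(a & b) / len(union)

sim = defect_similarity(
    {"oil leak", "overheating", "abnormal noise"},
    {"overheating", "abnormal noise", "discharge"},
)
print(sim)  # 0.5
```

Two defects share two of four distinct features here, so the similarity is 0.5; identical portraits give 1.0 and disjoint portraits give 0.0.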
In this method, data from different types of equipment in the power grid are summarized as signal sequences; cluster analysis matched to the data type is performed on the signal sequences, turning them into event sequences; finally, the data defects of the event sequences are analyzed with a dynamic programming algorithm, and the identification result is obtained by comparing sequences. The application processes data around different types of sequences throughout. Compared with traditional processing, in which the data differ greatly between earlier and later stages, the data remain uniform during processing, which can improve processing efficiency and accuracy.
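The sequence comparison step can be sketched with the standard longest-common-subsequence (LCS) dynamic program, preceded by the adjacent deduplication named in the claims. The decision rule in `is_abnormal` (LCS length divided by the longer sequence length, compared with a threshold of 0.8) is an assumption for illustration; the text does not specify how the LCS result is turned into an abnormal/normal judgment.

```python
def dedup_adjacent(seq):
    """Keep only the first signal of each run of adjacent repeats."""
    out = []
    for s in seq:
        if not out or out[-1] != s:
            out.append(s)
    return out

def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    m, n = len(a), len(b)
    # c[i][j] is the LCS length of the first i items of a and first j items of b.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]

def is_abnormal(realtime, normal, threshold=0.8):
    """Hypothetical rule: flag the real-time sequence if its LCS ratio is low."""
    rt, nm = dedup_adjacent(realtime), dedup_adjacent(normal)
    longest = max(len(rt), len(nm))
    if longest == 0:
        return False
    return lcs_length(rt, nm) / longest < threshold

normal_seq = ["trip", "open", "reclose", "close"]
same_with_repeats = ["trip", "trip", "open", "reclose", "close"]
deviating = ["trip", "open", "alarm", "alarm", "fault"]
print(is_abnormal(same_with_repeats, normal_seq))  # False
print(is_abnormal(deviating, normal_seq))          # True
```

The table `c` costs O(mn) time and memory for sequences of lengths m and n, which is modest for event sequences of the size described here.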
The embodiment also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the defect identification method based on the power grid dispatching multi-source data fusion when calling the computer program in the memory.
The embodiment also provides a storage medium, in which computer executable instructions are stored, and when the computer executable instructions are loaded and executed by a processor, the steps of the defect identification method based on the power grid dispatching multi-source data fusion are realized.
The essential effects of the present embodiment include:
This embodiment focuses on the change in electrical quantity data before and after a fault occurs, analyzes it comprehensively, and designs targeted algorithms accordingly. First, the distribution network equipment signal sources are divided into defect signals of different types of distribution network equipment and of distribution network switch equipment, which are extracted and clustered; then the type of clustering algorithm is selected; finally, the data defects of the event sequences are analyzed and compared with a dynamic programming algorithm to obtain the result. Extracting distribution network operation fault feature information on the basis of multi-source data fusion can improve the reliability and accuracy of fault feature extraction and diagnosis, and can be widely applied to power grid fault diagnosis.
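The selection of a clustering algorithm by data type can be sketched as a simple dispatch, following the pairing given in claim 1 (K-means for current, voltage and power data; DBSCAN for switching value data; an agglomerative hierarchical algorithm based on edit distance for non-electrical quantity data). The type keys and returned names are hypothetical labels, and `edit_distance` is the standard Levenshtein distance, shown as one plausible metric for comparing two signal sequences.

```python
# Hypothetical data-type keys mapped to the algorithm families named in claim 1.
CLUSTERING_BY_TYPE = {
    "current": "kmeans_euclidean",
    "voltage": "kmeans_euclidean",
    "power": "kmeans_euclidean",
    "switching": "dbscan",
    "non_electrical": "agglomerative_edit_distance",
}

def choose_clustering(data_type):
    """Return the name of the clustering approach for a given data type."""
    return CLUSTERING_BY_TYPE[data_type]

def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or signal lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]

print(choose_clustering("switching"))                             # dbscan
print(edit_distance("kitten", "sitting"))                         # 3
print(edit_distance(["trip", "open"], ["trip", "alarm", "open"]))  # 1
```

A pairwise matrix of such distances could then be passed to any hierarchical clustering routine that accepts precomputed distances.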
From the above description of embodiments, those skilled in the art will appreciate that the disclosed structures and methods may be implemented in other ways. For example, the embodiments described above with respect to structures are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another structure, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via interfaces, structures or units, which may be in electrical, mechanical or other forms.
In addition, each functional unit in the embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a readable storage medium. Based on such understanding, the technical solution of the embodiments of the present application may be essentially or a part contributing to the prior art or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a device (may be a single-chip microcomputer, a chip or the like) or a processor (processor) to perform all or part of the steps of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read Only Memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (5)

1. The defect identification method based on the power grid dispatching multi-source data fusion is characterized by comprising the following steps of:
S1: generating signal sequences of different types of equipment in the power grid to obtain the signal sequences;
S2: selecting a corresponding clustering algorithm to perform clustering analysis according to the data types in the signal sequence to obtain different event sequences;
S3: analyzing the data defects of the event sequence by adopting a dynamic programming algorithm, comparing the event sequence of the real-time operation data with the event sequence of the normal operation of the equipment, judging whether the real-time operation data is abnormal or not, and outputting a defect identification result if the real-time operation data is abnormal;
wherein S2 comprises:
the event sequence signals interpreted from the operation log are used as important sequence signals, and different clustering algorithms are adopted to search sequence signals similar to the important sequence signals so as to obtain different event sequences;
the S1: generating signal sequences for different types of equipment in a power grid to obtain the signal sequences, wherein the signal sequences comprise:
collecting operation data of different types of equipment in a power grid;
sequencing the operation data to form a signal sequence, which is stored as an intermediate table, wherein the elements of the signal sequence are the different kinds of operation data acquired for different devices;
the S2: selecting a corresponding clustering algorithm to perform clustering analysis according to the data types in the signal sequence to obtain different event sequences, wherein the method comprises the following steps:
different clustering algorithms are adopted according to the form of the input data, wherein current, voltage and power data adopt K-means clustering based on the Euclidean distance of sentence vectors, switching value data adopt DBSCAN density-based clustering with noise, and non-electrical quantity data adopt an agglomerative hierarchical algorithm based on edit distance;
the S3: analyzing the data defects of the event sequence by adopting a dynamic programming algorithm, comparing the event sequence of the real-time operation data with the event sequence of the normal operation of the equipment, judging whether the real-time operation data is abnormal or not, and outputting defect identification results if the real-time operation data is abnormal, wherein the method comprises the following steps:
performing adjacent deduplication on the event sequence, and only reserving signals which appear for the first time for adjacent repeated signals;
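forming the complete event sequences into an overall sequence set $S = \{s_1, s_2, \ldots, s_N\}$, wherein $s_k$ represents a sequence and each element of $s_k$ represents a signal in the sequence;
let $A = \{a_1, a_2, \ldots, a_m\}$ and $B = \{b_1, b_2, \ldots, b_n\}$ be two sequences, wherein A and B are both sequences in $S$, and $a_i$ and $b_j$ are signals in the sequences;
let any longest common subsequence (LCS) thereof be $Z = \{z_1, z_2, \ldots, z_k\}$, a subscripted sequence such as $A_{m-1}$ denoting the prefix of that length; then: if $a_m = b_n$, then $z_k = a_m = b_n$ and $Z_{k-1}$ is an LCS of $A_{m-1}$ and $B_{n-1}$; if $a_m \neq b_n$ and $z_k \neq a_m$, then $Z$ is an LCS of $A_{m-1}$ and $B$; if $a_m \neq b_n$ and $z_k \neq b_n$, then $Z$ is an LCS of $A$ and $B_{n-1}$;
using a two-dimensional array $c[i][j]$ to represent the length of the LCS of the first $i$ characters of A and the first $j$ characters of B yields the following formulas:
$c[i][j] = 0$, if $i = 0$ or $j = 0$ (1);
if $a_i = b_j$, then $z_k = a_i = b_j$, and the prefix $Z_{k-1}$ of $Z$ is the longest common subsequence of $A_{i-1}$ and $B_{j-1}$, so the problem is converted into finding the LCS of $A_{i-1}$ and $B_{j-1}$:
$c[i][j] = c[i-1][j-1] + 1$ (2);
if $a_i \neq b_j$, then $z_k \neq a_i$ or $z_k \neq b_j$, at least one of which must hold; if $z_k \neq a_i$, then $Z$ is an LCS of $A_{i-1}$ and $B_j$; similarly, if $z_k \neq b_j$, then $Z$ is an LCS of $A_i$ and $B_{j-1}$; the problem thus becomes finding the LCS of $A_{i-1}$ and $B_j$ and the LCS of $A_i$ and $B_{j-1}$, and the length of the LCS is:
$c[i][j] = \max(c[i-1][j],\ c[i][j-1])$ (3);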
by comparing the longest common subsequence of the two sequences A and B, whether the operation data is abnormal or not can be judged, wherein one sequence is the real-time operation data sequence, and the other sequence is the normal operation data sequence of the equipment.
2. The defect identification method based on power grid dispatching multi-source data fusion according to claim 1, further comprising S4: the effectiveness of defect identification is evaluated and displayed: and calculating the error of defect identification by using three evaluation indexes of root mean square error, average absolute error and Perplexity value, and displaying the calculation result.
3. The defect identification method based on grid dispatching multi-source data fusion according to claim 1, wherein S3 further comprises labeling an event sequence, comprising:
generating different labels by using the equipment in the distribution network equipment table, the fault indicator table, the main transformer information table, the line information table, the pole distribution information table, the remote sensing data table and the non-electric quantity data table as a table unit, and mapping the labels with ID fields of each table;
mapping the real-time operation data table with the label generated in the last step to generate an intermediate table;
and generating a signal intermediate table with labels by associating the real-time operation data table with the intermediate table, and finishing labeling.
4. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the processor, when invoking the computer program in the memory, performs the steps of a method for defect identification based on grid-dispatching multi-source data fusion as claimed in any one of claims 1 to 3.
5. A storage medium having stored therein computer executable instructions which when loaded and executed by a processor implement the steps of a grid-dispatching multisource data fusion-based defect identification method as claimed in any one of claims 1 to 3.
CN202310857687.XA 2023-07-13 2023-07-13 Defect identification method based on power grid dispatching multi-source data fusion Active CN116595425B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310857687.XA CN116595425B (en) 2023-07-13 2023-07-13 Defect identification method based on power grid dispatching multi-source data fusion

Publications (2)

Publication Number Publication Date
CN116595425A CN116595425A (en) 2023-08-15
CN116595425B CN116595425B (en) 2023-11-10

Family

ID=87606581

Country Status (1)

Country Link
CN (1) CN116595425B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant