CN112163682B - Power dispatching automation system fault tracing method based on information difference graph model - Google Patents


Info

Publication number
CN112163682B
Authority
CN
China
Prior art keywords
information
time series
automation system
fault
power dispatching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011118535.0A
Other languages
Chinese (zh)
Other versions
CN112163682A (en)
Inventor
任昺
高欣
贾欣
李康生
刘治宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202011118535.0A
Publication of CN112163682A
Application granted
Publication of CN112163682B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/20 Administration of product repair or maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Theoretical Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiment of the invention provides a power dispatching automation system fault tracing method based on an information difference graph model, which comprises the following steps: selecting historical data from before and after an alarm of the power dispatching automation system, obtaining cluster centers with the k-means algorithm, using the cluster centers as the endpoints of the interval division, and taking the mean value of each interval as the discretization result of the continuous features; calculating the information entropy of each component of the power dispatching automation system and the transfer entropy between components, establishing one information correlation matrix for the alarm period and one for the non-alarm period, measuring the degree of difference before and after the alarm by the change rate between the two matrices, and obtaining the information difference matrix by normalization; extracting the features whose information changes strongly under the alarm, together with the interaction information among those features, constructing an information difference graph model that combines a directed graph with node self-information, and fitting a fault degree index to rank the nodes by fault degree. The technical scheme provided by the embodiment of the invention improves the fault tracing performance of the power dispatching automation system.

Description

Power dispatching automation system fault tracing method based on information difference graph model
[ technical field ]
The invention relates to fault tracing when the topological relations of a system are unknown, within the field of fault localization, and in particular to a power dispatching automation system fault tracing method based on an information difference graph model.
[ background of the invention ]
As intelligent and network technologies mature, the power dispatching automation system has become a complex system that integrates computing, communication and the physical environment. The system generally comprises many components, such as servers, storage, network devices and application software. Once a component fails, its association relationships affect other components, causing the whole system to behave abnormally or fluctuate. Because the topological relations among the components are unknown and complex, fault tracing is extremely difficult, yet early tracing is important for ensuring the safe and stable operation of the power dispatching automation system. Existing fault tracing methods are mainly rule-based or model-based. Rule-based fault tracing relies heavily on expert experience and requires operation and maintenance personnel to have a clear grasp of the logical topology and of past fault cases.
Model-based fault tracing mainly uses undirected graph models, directed graph models, fault trees and invariant networks. Undirected graph models cannot determine the direction of fault propagation. Directed graph models are built on the logical relations of a complex system; whenever components are added or removed, the model must be modified and updated using domain knowledge, which greatly increases maintenance cost. Fault trees require experts from different fields and detailed system knowledge to be constructed correctly, cannot meet the requirements of large-scale systems, and may omit some causal relationships because of the subjectivity of expert experience, all of which limits their deployment. Invariant network models are data-driven, which largely removes the dependence on prior knowledge, but a huge fault candidate set must be trained in order to trace back to the root-cause component.
[ summary of the invention ]
In view of this, the embodiment of the present invention provides a power dispatching automation system fault tracing method based on an information difference graph model. When the logical topological relations are unclear, the method simulates the information transfer of the actual system with an information model, so that the fault source can be located effectively and fault tracing performance is improved.
The embodiment of the invention provides a power dispatching automation system fault tracing method based on an information difference graph model, which comprises the following steps:
selecting historical data from before and after an alarm of the power dispatching automation system, obtaining cluster centers with the k-means algorithm, using the cluster centers as the endpoints of the interval division, and taking the mean value of each interval as the discretization result of the continuous features;
calculating the information entropy of each component of the power dispatching automation system and the transfer entropy between components, establishing one information correlation matrix for the alarm period and one for the non-alarm period, measuring the degree of difference before and after the alarm by the change rate between the two matrices, and obtaining the information difference matrix by normalization;
extracting the features whose information changes strongly under the alarm, together with the interaction information among those features, constructing an information difference graph model that combines a directed graph with node self-information, and fitting a fault degree index to rank the nodes by fault degree.
In the above method, discretizing the continuous features (selecting historical data from before and after an alarm of the power dispatching automation system, obtaining cluster centers with the k-means algorithm, using them as the endpoints of the interval division, and taking the mean value of each interval as the discretization result) comprises the following. Resource occupation data of the CPU, memory, disk, network and processes of every server in the power dispatching automation system are collected; the resource occupation of each server includes IO read-write status, utilization rate, collision rate and waiting time. The time series of these features are the input of the tracing method. Assume that the values of a given feature are distributed in the interval $[a, b]$. All values of the feature are deduplicated to obtain the total count num, and the number of cluster centroids $k$ is set as
Figure BDA0002731198530000021
The sum of squared errors between the centroids and the sample points is denoted SSE. The plot of SSE against $k$ has the shape of an elbow, and the $k$ value at the elbow is selected as the optimal number of clusters:
$$\mathrm{SSE} = \sum_{i=1}^{k} \sum_{poi \in C_i} \left| poi - m_i \right|^2$$
where $C_i$ is the $i$-th cluster, $poi$ is a sample point in $C_i$, and $m_i$ is the centroid of $C_i$. The k-means algorithm yields $k$ centroids $\{m_1, m_2, \ldots, m_k\}$ $(m_1 < m_2 < \ldots < m_k)$, which divide the values of the feature into $k+1$ intervals $[a, m_1], [m_1, m_2], \ldots, [m_{k-1}, m_k], [m_k, b]$. Averaging the feature values within each interval gives the discretization result for that interval.
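The discretization step above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `kmeans_1d` and `discretize`, the deterministic centroid initialization and the sample `cpu` values are all hypothetical, and the number of clusters k is passed in directly instead of being read off the elbow of the SSE curve.

```python
from bisect import bisect_right
from statistics import mean


def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's k-means on 1-D data; returns the sorted centroids."""
    data = sorted(values)
    n = len(data)
    # Assumed initialization: k evenly spaced order statistics.
    centroids = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Empty clusters keep their previous centroid.
        updated = [mean(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def discretize(values, k):
    """Use the k centroids as interval endpoints and replace every sample
    by the mean of the samples that fall into the same interval."""
    centroids = kmeans_1d(values, k)
    # bisect_right gives the index of the interval among the k + 1
    # intervals [a, m1), [m1, m2), ..., [mk, b].
    bins = [bisect_right(centroids, v) for v in values]
    interval_mean = {
        b: mean([v for v, vb in zip(values, bins) if vb == b])
        for b in set(bins)
    }
    return [interval_mean[b] for b in bins]


# Hypothetical CPU utilization samples.
cpu = [1.0, 1.2, 0.9, 5.0, 5.2, 4.9, 9.0, 9.1, 8.8]
print(discretize(cpu, k=3))
```

Because the centroids serve as interval endpoints rather than as cluster labels, k centroids produce up to k + 1 distinct discrete values.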
In the above method, calculating the information entropy of each component of the power dispatching automation system and the transfer entropy between components, establishing the information correlation matrices for the alarm period and the non-alarm period, measuring the degree of difference before and after the alarm by the change rate between the matrices, and obtaining the information difference matrix by normalization comprises the following. The resource occupation data of the CPU, memory, disk, network and processes of every server of the power dispatching automation system are discretized, and the discretized features (IO read-write status, utilization rate, collision rate and waiting time) are collected to obtain $N$ feature time series $\{S_1, S_2, \ldots, S_N\}$. The self-information entropy of each time series, $\{H_1, H_2, \ldots, H_N\}$, is calculated, where $H(S)$ is the self-information entropy of time series $S$:
$$H(S) = -\sum_{x \in \alpha_x} p(x) \log p(x)$$
where $x$ represents the value at each time point of the feature time series $S$, $p(x)$ represents the probability of $x$ occurring in $S$, and $\alpha_x$ represents all possible values of $x$ in $S$. The transfer entropy between every two feature time series, $\{T_{1\to2}, T_{1\to3}, \ldots, T_{1\to N}, T_{2\to1}, T_{2\to3}, \ldots, T_{2\to N}, \ldots, T_{N\to N-1}\}$, is calculated next. The mutual information of any two time series is defined as a measure of the amount of information shared by two variables; the mutual information entropy $I(S_I; S_J)$ measures the amount of information shared by the time series $S_I$ and $S_J$:
$$I(S_I; S_J) = \sum_{x \in \alpha_x} \sum_{y \in \alpha_y} p(x, y) \log \frac{p(x \mid y)}{p(x)}$$
where $S_I$ represents the $I$-th feature time series, $S_J$ represents the $J$-th feature time series, $x$ and $y$ represent the values of $S_I$ and $S_J$, $p(x, y)$ represents the joint probability, $p(x \mid y)$ represents the conditional probability, and $\alpha_x$ and $\alpha_y$ represent all possible values of $x$ and $y$ in $S_I$ and $S_J$. Transfer entropy extends mutual information entropy: it measures how much additional information $S_J$ provides for predicting $S_I(t+1)$ beyond the information already contained in $S_I$ itself. The transfer entropy $T_{J \to I}$ is the measure of the information transferred from the feature time series $S_J$ to $S_I$:
$$T_{J \to I} = \sum p\left(i_{t+1}, i_t^{(k)}, j_t^{(l)}\right) \log \frac{p\left(i_{t+1} \mid i_t^{(k)}, j_t^{(l)}\right)}{p\left(i_{t+1} \mid i_t^{(k)}\right)}$$
where $i_t$ represents the state of the feature time series $S_I$ at time $t$, $i_t^{(k)}$ represents the $k$ most recent states of $S_I$ preceding $i_{t+1}$, $j_t$ represents the state of the feature time series $S_J$ at time $t$, and $j_t^{(l)}$ represents the $l$ most recent states of $S_J$ preceding $j_{t+1}$. Transfer entropy is directional, so $T_{I \to J}$ is obtained by exchanging the variables in the formula. Calculating the self-information and the mutual information of the time series in the non-alarm period establishes the causal relationship matrix A:
Figure BDA0002731198530000042
Similarly, the causal relationship matrix B for the alarm period is obtained, and the information difference matrix is established as
Figure BDA0002731198530000043
The change rates of the diagonal information and of the off-diagonal information are normalized separately to obtain the final information difference matrix.
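The self-information entropy and the transfer entropy used above can be estimated from discretized series by simple frequency counting. The sketch below makes several assumptions that the text does not fix: base-2 logarithms, history lengths k = l = 1, and plug-in (relative frequency) probability estimates. The function names and the test sequences are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(series):
    """Shannon self-information entropy H(S) of a discrete series, in bits."""
    n = len(series)
    return -sum(c / n * log2(c / n) for c in Counter(series).values())


def transfer_entropy(source, target):
    """Plug-in estimate of T_{source -> target} with history lengths k = l = 1."""
    nxt, cur, src = target[1:], target[:-1], source[:-1]
    triples = Counter(zip(nxt, cur, src))   # (i_{t+1}, i_t, j_t)
    pairs_ij = Counter(zip(cur, src))       # (i_t, j_t)
    pairs_ii = Counter(zip(nxt, cur))       # (i_{t+1}, i_t)
    singles = Counter(cur)                  # i_t
    n = len(cur)
    te = 0.0
    for (a, b, c), count in triples.items():
        p_joint = count / n
        p_full = count / pairs_ij[(b, c)]        # p(i_{t+1} | i_t, j_t)
        p_self = pairs_ii[(a, b)] / singles[b]   # p(i_{t+1} | i_t)
        te += p_joint * log2(p_full / p_self)
    return te


# Hypothetical example: the target copies the source with a one-step delay.
pattern = [0, 0, 0, 1, 0, 1, 1, 1]
source = pattern * 4 + [pattern[0]]
target = [pattern[-1]] + pattern * 4      # target[t + 1] == source[t]

print(entropy(source))
print(transfer_entropy(source, target))   # 1.0: the source fully predicts the target
print(transfer_entropy(target, source))   # 0.0: no information flows backwards
```

The asymmetry between the two printed transfer entropies is what lets the method assign a direction to each link, which mutual information alone cannot do.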
In the above method, extracting the features whose information changes strongly under the alarm and the interaction information among them, constructing the information difference graph model that combines a directed graph with node self-information, and fitting the fault degree index to rank the nodes specifically comprises the following. A threshold $\Theta \in (0,1)$ is set, and $c_{m,n}$ $(m<N, n<N, c_{m,n} \in (0,1))$ represents the value in row $m$ and column $n$ of the information difference matrix C. The matrix C is traversed and the rows and columns where $c_{m,n} > \Theta$ are marked; the values of all marked rows and columns are kept and the other elements are set to zero to obtain the information difference matrix C'. The information difference graph model is established from the extracted fault features and the causal links among them. The model contains two kinds of information: the diagonal values of C' represent the self-information difference inside each node, and the off-diagonal values of C' represent the mutual information difference between nodes. A link between two nodes represents the fault causal association degree of $S_I$ influencing $S_J$ and of $S_J$ influencing $S_I$. The fault degree is calculated through the Fault_degree index:
Figure BDA0002731198530000044
where $V_i$ represents a single node in the information difference graph model, BINNs of $V_i$ represents the nodes adjacent to $V_i$, $V_j$ represents a single node among those adjacent to $V_i$, and NUM(BINNs of $V_i$) represents the total number of nodes adjacent to $V_i$. All nodes of the information difference graph model are traversed, the fault degree of each node is calculated with the Fault_degree index, and the nodes are sorted from largest to smallest fault degree to obtain the final result.
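The thresholding that turns C into C' and then into a graph can be sketched as below. The wording of the marking rule admits more than one reading; this sketch assumes that an index is marked when it is the row or the column of an entry above Θ, and that an entry survives only if both its row and its column are marked. It also assumes that entry (m, n) describes the influence of feature m on feature n. The names `prune` and `to_graph` and the sample matrix are hypothetical.

```python
def prune(C, theta):
    """Return (C', marked); entries outside the marked rows/columns become 0."""
    n = len(C)
    marked = set()
    for m in range(n):
        for col in range(n):
            if C[m][col] > theta:
                marked.update((m, col))
    pruned = [[C[m][col] if m in marked and col in marked else 0.0
               for col in range(n)] for m in range(n)]
    return pruned, sorted(marked)


def to_graph(pruned, marked):
    """Nodes carry the diagonal (self-information difference); directed
    edges carry the off-diagonal (mutual information difference)."""
    nodes = {i: pruned[i][i] for i in marked}
    edges = {(m, col): pruned[m][col]
             for m in marked for col in marked
             if m != col and pruned[m][col] > 0}
    return nodes, edges


# Hypothetical 3-feature information difference matrix.
C = [[0.9, 0.8, 0.1],
     [0.2, 0.3, 0.05],
     [0.1, 0.0, 0.2]]
pruned, marked = prune(C, theta=0.7)
nodes, edges = to_graph(pruned, marked)
print(marked)
print(nodes)
print(edges)
```

With Θ = 0.7 only features 0 and 1 survive, and the graph keeps both the strong link from 0 to 1 and the weaker reverse link, since both endpoints are marked.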
The power dispatching automation system fault tracing method improves the performance of power dispatching automation system fault tracing.
According to the technical scheme, the invention has the following beneficial effects:
The technical scheme mines the information difference before and after an alarm, establishes an information difference graph model, and uses the network characteristics of the graph to fit a fault ranking index that fuses the self-information difference and the mutual information difference. Fault tracing and localization are thereby achieved even when the system topology is unknown, and the fault tracing performance of the power dispatching automation system is improved.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive labor.
Fig. 1 is a schematic flowchart of a method for tracing a fault of an electric power dispatching automation system based on an information difference graph model according to an embodiment of the present invention;
fig. 2 is a flowchart of a framework of a power dispatching automation system fault tracing method based on an information difference graph model according to an embodiment of the present invention.
[ detailed description ]
For better understanding of the technical solutions of the present invention, the following detailed descriptions of the embodiments of the present invention are provided with reference to the accompanying drawings.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the present invention provides a power dispatching automation system fault tracing method based on an information difference graph model. Fig. 1 is a schematic flowchart of the method; as shown in fig. 1, the method includes the following steps:
selecting historical data from before and after an alarm of the power dispatching automation system, obtaining cluster centers with the k-means algorithm, using the cluster centers as the endpoints of the interval division, and taking the mean value of each interval as the discretization result of the continuous features;
calculating the information entropy of each component of the power dispatching automation system and the transfer entropy between components, establishing one information correlation matrix for the alarm period and one for the non-alarm period, measuring the degree of difference before and after the alarm by the change rate between the two matrices, and obtaining the information difference matrix by normalization;
extracting the features whose information changes strongly under the alarm, together with the interaction information among those features, constructing an information difference graph model that combines a directed graph with node self-information, and fitting a fault degree index to rank the nodes by fault degree.
Fig. 2 is a framework flowchart of the method according to an embodiment of the present invention. First, the continuous features are discretized, and the single-sequence self-information and the two-sequence mutual information are computed. Second, the information correlation matrices for the alarm period and the non-alarm period are constructed, and the normalized information difference matrix is obtained by comparing the change rate before and after the alarm. Finally, the information difference graph model is established from the information difference matrix and the fault degrees are fitted for ranking.
Step 101, selecting historical data from before and after an alarm of the power dispatching automation system, obtaining cluster centers with the k-means algorithm, using the cluster centers as the endpoints of the interval division, and taking the mean value of each interval as the discretization result of the continuous features;
Specifically, resource occupation data of the CPU, memory, disk, network and processes of every server in the power dispatching automation system are collected; the resource occupation of each server includes IO read-write status, utilization rate, collision rate and waiting time. The time series of these features are the input of the tracing method. Assume that the values of a given feature are distributed in the interval $[a, b]$. All values of the feature are deduplicated to obtain the total count num, and the number of cluster centroids $k$ is set as
Figure BDA0002731198530000071
The sum of squared errors between the centroids and the sample points is denoted SSE. The plot of SSE against $k$ has the shape of an elbow, and the $k$ value at the elbow is selected as the optimal number of clusters:
$$\mathrm{SSE} = \sum_{i=1}^{k} \sum_{poi \in C_i} \left| poi - m_i \right|^2$$
where $C_i$ is the $i$-th cluster, $poi$ is a sample point in $C_i$, and $m_i$ is the centroid of $C_i$. The k-means algorithm yields $k$ centroids $\{m_1, m_2, \ldots, m_k\}$ $(m_1 < m_2 < \ldots < m_k)$, which divide the values of the feature into $k+1$ intervals $[a, m_1], [m_1, m_2], \ldots, [m_{k-1}, m_k], [m_k, b]$. Averaging the feature values within each interval gives the discretization result for that interval.
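Choosing k at the elbow of the SSE curve can be sketched as follows. The text does not say how the elbow is located; this sketch assumes a common heuristic, namely the k at which the decrease in SSE slows down the most (the largest second difference). The function names and the SSE values are hypothetical.

```python
def sse(clusters):
    """Sum of squared errors between every sample and its cluster centroid."""
    total = 0.0
    for cluster in clusters:
        centroid = sum(cluster) / len(cluster)
        total += sum((point - centroid) ** 2 for point in cluster)
    return total


def elbow_k(sse_by_k):
    """Pick the k where the SSE curve bends most sharply.

    Assumed heuristic: maximize the second difference
    (SSE[k-1] - SSE[k]) - (SSE[k] - SSE[k+1]) over consecutive k values.
    """
    ks = sorted(sse_by_k)
    best_k, best_bend = ks[0], float("-inf")
    for prev, cur, nxt in zip(ks, ks[1:], ks[2:]):
        bend = (sse_by_k[prev] - sse_by_k[cur]) - (sse_by_k[cur] - sse_by_k[nxt])
        if bend > best_bend:
            best_k, best_bend = cur, bend
    return best_k


# SSE of a hand-made two-cluster partition.
print(sse([[1.0, 3.0], [10.0]]))

# Hypothetical SSE values obtained by running k-means for k = 1..5.
curve = {1: 100.0, 2: 60.0, 3: 12.0, 4: 10.0, 5: 9.0}
print(elbow_k(curve))
```

In practice the `curve` dictionary would be filled by running k-means once per candidate k on the deduplicated feature values.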
Step 102, calculating the information entropy of each component of the power dispatching automation system and the transfer entropy between components, establishing one information correlation matrix for the alarm period and one for the non-alarm period, measuring the degree of difference before and after the alarm by the change rate between the two matrices, and obtaining the information difference matrix by normalization;
Specifically, the resource occupation data of the CPU, memory, disk, network and processes of every server of the power dispatching automation system are discretized, and the discretized features (IO read-write status, utilization rate, collision rate and waiting time) are collected to obtain $N$ feature time series $\{S_1, S_2, \ldots, S_N\}$. The self-information entropy of each time series, $\{H_1, H_2, \ldots, H_N\}$, is calculated, where $H(S)$ is the self-information entropy of time series $S$:
$$H(S) = -\sum_{x \in \alpha_x} p(x) \log p(x)$$
where $x$ represents the value at each time point of the feature time series $S$, $p(x)$ represents the probability of $x$ occurring in $S$, and $\alpha_x$ represents all possible values of $x$ in $S$. The transfer entropy between every two feature time series, $\{T_{1\to2}, T_{1\to3}, \ldots, T_{1\to N}, T_{2\to1}, T_{2\to3}, \ldots, T_{2\to N}, \ldots, T_{N\to N-1}\}$, is calculated next. The mutual information of any two time series is defined as a measure of the amount of information shared by two variables; the mutual information entropy $I(S_I; S_J)$ measures the amount of information shared by the time series $S_I$ and $S_J$:
$$I(S_I; S_J) = \sum_{x \in \alpha_x} \sum_{y \in \alpha_y} p(x, y) \log \frac{p(x \mid y)}{p(x)}$$
where $S_I$ represents the $I$-th feature time series, $S_J$ represents the $J$-th feature time series, $x$ and $y$ represent the values of $S_I$ and $S_J$, $p(x, y)$ represents the joint probability, $p(x \mid y)$ represents the conditional probability, and $\alpha_x$ and $\alpha_y$ represent all possible values of $x$ and $y$ in $S_I$ and $S_J$. Transfer entropy extends mutual information entropy: it measures how much additional information $S_J$ provides for predicting $S_I(t+1)$ beyond the information already contained in $S_I$ itself. The transfer entropy $T_{J \to I}$ is the measure of the information transferred from the feature time series $S_J$ to $S_I$:
$$T_{J \to I} = \sum p\left(i_{t+1}, i_t^{(k)}, j_t^{(l)}\right) \log \frac{p\left(i_{t+1} \mid i_t^{(k)}, j_t^{(l)}\right)}{p\left(i_{t+1} \mid i_t^{(k)}\right)}$$
where $i_t$ represents the state of the feature time series $S_I$ at time $t$, $i_t^{(k)}$ represents the $k$ most recent states of $S_I$ preceding $i_{t+1}$, $j_t$ represents the state of the feature time series $S_J$ at time $t$, and $j_t^{(l)}$ represents the $l$ most recent states of $S_J$ preceding $j_{t+1}$. Transfer entropy is directional, so $T_{I \to J}$ is obtained by exchanging the variables in the formula. Calculating the self-information and the mutual information of the time series in the non-alarm period establishes the causal relationship matrix A:
Figure BDA0002731198530000082
Similarly, the causal relationship matrix B for the alarm period is obtained, and the information difference matrix is established as
Figure BDA0002731198530000083
The change rates of the diagonal information and of the off-diagonal information are normalized separately to obtain the final information difference matrix.
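Building the information difference matrix from the non-alarm matrix A and the alarm matrix B can be sketched as below. The exact change-rate formula is not recoverable from the text, so this sketch assumes an element-wise relative change |B − A| / |A| and min-max scaling as the normalization, applied separately to the diagonal and the off-diagonal entries as the text describes. All names and the sample matrices are hypothetical.

```python
def change_rate(A, B, eps=1e-12):
    """Assumed change rate: element-wise |B - A| / |A| (eps avoids 0-division)."""
    n = len(A)
    return [[abs(B[i][j] - A[i][j]) / (abs(A[i][j]) + eps)
             for j in range(n)] for i in range(n)]


def min_max(values):
    """Scale a list of numbers to [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def difference_matrix(A, B):
    """Normalize diagonal (self-information) and off-diagonal
    (transfer entropy) change rates separately."""
    n = len(A)
    R = change_rate(A, B)
    diagonal = [(i, i) for i in range(n)]
    off_diagonal = [(i, j) for i in range(n) for j in range(n) if i != j]
    C = [[0.0] * n for _ in range(n)]
    for index_group in (diagonal, off_diagonal):
        scaled = min_max([R[i][j] for i, j in index_group])
        for (i, j), value in zip(index_group, scaled):
            C[i][j] = value
    return C


# Hypothetical 2-feature matrices: diagonal = entropy, off-diagonal = transfer entropy.
A = [[1.0, 0.1],
     [0.2, 2.0]]   # non-alarm period
B = [[3.0, 0.1],
     [0.6, 2.0]]   # alarm period
for row in difference_matrix(A, B):
    print(row)
```

Normalizing the two groups separately keeps entropy changes and transfer entropy changes on comparable scales, so a single threshold Θ can be applied to both.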
Step 103, extracting the features whose information changes strongly under the alarm, together with the interaction information among those features, constructing an information difference graph model that combines a directed graph with node self-information, and fitting a fault degree index to rank the nodes by fault degree;
Specifically, a threshold $\Theta \in (0,1)$ is set, and $c_{m,n}$ $(m<N, n<N, c_{m,n} \in (0,1))$ represents the value in row $m$ and column $n$ of the information difference matrix C. The matrix C is traversed and the rows and columns where $c_{m,n} > \Theta$ are marked; the values of all marked rows and columns are kept and the other elements are set to zero to obtain the information difference matrix C'. The information difference graph model is established from the extracted fault features and the causal links among them. The model contains two kinds of information: the diagonal values of C' represent the self-information difference inside each node, and the off-diagonal values of C' represent the mutual information difference between nodes. A link between two nodes represents the fault causal association degree of $S_I$ influencing $S_J$ and of $S_J$ influencing $S_I$. The fault degree is calculated through the Fault_degree index:
Figure BDA0002731198530000091
where $V_i$ represents a single node in the information difference graph model, BINNs of $V_i$ represents the nodes adjacent to $V_i$, $V_j$ represents a single node among those adjacent to $V_i$, and NUM(BINNs of $V_i$) represents the total number of nodes adjacent to $V_i$. All nodes of the information difference graph model are traversed, the fault degree of each node is calculated with the Fault_degree index, and the nodes are sorted from largest to smallest fault degree to obtain the final result.
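The ranking step can be sketched as below. The Fault_degree formula itself is not recoverable from the text, so the score used here is purely an assumed stand-in built from the quantities the text names: the node's own diagonal value plus the link weights to its adjacent nodes averaged over NUM(BINNs of V_i). The function name and the sample matrix are hypothetical.

```python
def fault_degrees(C_pruned):
    """Rank nodes by an ASSUMED fault degree:
    self difference + mean link weight (both directions) to adjacent nodes.
    This is an illustrative stand-in, not the patent's Fault_degree formula."""
    n = len(C_pruned)
    scores = {}
    for i in range(n):
        # Adjacent nodes: any node linked to i in either direction.
        neighbours = [j for j in range(n)
                      if j != i and (C_pruned[i][j] > 0 or C_pruned[j][i] > 0)]
        if neighbours:
            link = sum(C_pruned[i][j] + C_pruned[j][i]
                       for j in neighbours) / len(neighbours)
        else:
            link = 0.0
        scores[i] = C_pruned[i][i] + link
    # Sort from largest to smallest fault degree.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical pruned information difference matrix C'.
C_pruned = [[0.9, 0.8, 0.0],
            [0.2, 0.3, 0.0],
            [0.0, 0.0, 0.0]]
for node, score in fault_degrees(C_pruned):
    print(node, round(score, 3))
```

Whatever the exact index, the output has the same shape as in the method: a list of nodes ordered from the most to the least suspicious.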
In a specific embodiment, a power dispatching automation system data set is used. The data set comprises 718 time series collected by the power dispatching automation system; each time series covers 30 minutes of data with a sampling period of 1 second. Table 1 shows the components and feature information of the power dispatching automation system; the feature quantities collected from the components of the system together form the time series.
TABLE 1 Components and characteristic information of Power dispatching Automation System
Figure BDA0002731198530000092
The input is the resource occupation data of the CPU, memory, disk, network and processes of every server in the power dispatching automation system, where the resource occupation of each server includes IO read-write status, utilization rate, collision rate and waiting time; these features are input as time series. The last 5 minutes of each sequence, which contain the system alarm information, are used as the alarm period, and the remainder of the sequence is used as the normal period. The threshold $\Theta$ is set to 0.7. The number of cluster centroids $k$ is determined per feature as the feature values change, so no fixed value needs to be set.
To display the fault tracing results visually, Table 2 lists the top ten tracing results of the different methods. For convenience of description, the servers are numbered 1 to 85. The methods compared for calculating the fault degree on the basis of the information difference model are mRank, gRank and RCA. Benchmark denotes the ranking of the time series by change rate from high to low, and serves as the reference for verifying the effect of each algorithm. The first three servers are marked with a five-pointed star, a diamond and a triangle, respectively. As can be seen from Table 2, Fault_degree performs better than the other three methods, and its top-ranked servers agree well with the Benchmark ranking.
Table 2 fault tracing result of power dispatching automation system
Figure BDA0002731198530000101
To verify the effectiveness of the method, accuracy, recall and nDCG are used to evaluate the fault tracing performance of the algorithm. nDCG (Normalized Discounted Cumulative Gain) is a commonly used ranking metric for judging the quality of a ranking result; a larger nDCG value indicates better performance of the method.
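For readers unfamiliar with the metric, a minimal sketch of nDCG under its usual textbook definition is given here. The patent does not state how relevance scores are assigned to ranked servers, so the relevance lists below are purely illustrative assumptions.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances, k=None):
    """nDCG of a ranked list of relevance scores, optionally cut off at k."""
    rels = list(relevances)
    top = rels[:k] if k else rels
    # The ideal ranking places the most relevant items first.
    ideal = sorted(rels, reverse=True)[:len(top)]
    ideal_dcg = dcg(ideal)
    return dcg(top) / ideal_dcg if ideal_dcg > 0 else 0.0

# Illustrative relevance scores only (1 = truly faulty server, 0 = not).
perfect = ndcg([1, 1, 0, 0])
swapped = ndcg([0, 1, 1, 0])
```

A ranking that places all relevant items first scores 1.0, and any ranking that pushes a relevant item further down scores lower, which is why a larger value indicates a better tracing list.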
For fault tracing, the results ranked at the top matter most, so the top twenty entries of the tracing list are selected for performance analysis; the results are shown in Table 3. The method provided by the embodiment of the invention improves the accuracy, recall and nDCG values: the accuracy is improved by 3.6-14%, and the nDCG value is improved by 0.05-0.1. This shows that the fault tracing method for power dispatching automation systems based on the information difference graph model provided by the embodiment of the invention achieves better fault tracing performance.
Table 3 Performance comparison on the top twenty fault tracing results
Metric     mRank     gRank      RCA       Fault_degree
Accuracy   0.75      0.8        0.82      0.857142857
Recall     0.86862   0.797831   0.85597   0.876578
nDCG       0.85      0.8        0.85      0.9
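The patent does not spell out how accuracy and recall are computed over the top twenty entries. One common reading, assumed here, is precision and recall at a cutoff k; the function name and the server numbers in the example are hypothetical.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Precision and recall of the first k entries of a ranked list.

    ranked   -- items in ranked order (e.g. server numbers)
    relevant -- set of items that are truly faulty
    """
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical server numbers, not taken from Table 2.
ranked_servers = [12, 7, 33, 5, 41]
faulty_servers = {7, 5, 60}
precision, recall = precision_recall_at_k(ranked_servers, faulty_servers, k=5)
# Two of the five ranked servers are faulty: precision 2/5, recall 2/3.
```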
In summary, the embodiments of the present invention have the following beneficial effects:
In this technical scheme, historical data before and after an alarm of the power dispatching automation system are selected; cluster centers are obtained by the k-means algorithm and used as the endpoints of the interval division, and the mean value of each interval is used as the discretization result of the continuous features. The information entropy of the components of the power dispatching automation system and the transfer entropy among the components are calculated, information correlation matrices are established for the segments with and without alarms, the degree of difference before and after the alarm is measured through the change rate of the information correlation matrices, and an information difference matrix is obtained by normalization. The features whose information changes strongly under alarm and the interaction information among those features are extracted, an information difference graph model combining a directed graph with node self-information is constructed, and a fault degree index is fitted to rank the fault degrees.
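The entropy and transfer entropy calculations can be sketched with the standard plug-in estimators on discretized series. This is an illustration under stated assumptions, not the patented formulas, which appear only as images: history lengths k = l = 1 and base-2 logarithms are assumed, and the layout of `relation_matrix` (entropy on the diagonal, transfer entropy from row to column off the diagonal) is a hypothetical choice.

```python
import math
import random
from collections import Counter

def entropy(series):
    """Shannon entropy (bits) of a discretized series, from value frequencies."""
    n = len(series)
    return -sum((c / n) * math.log2(c / n) for c in Counter(series).values())

def transfer_entropy(source, target):
    """Transfer entropy source -> target in bits, assuming k = l = 1."""
    # Each triple is (i_{t+1}, i_t, j_t): next target, current target, current source.
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)
    c_hist = Counter((i_t, j_t) for _, i_t, j_t in triples)
    c_pair = Counter((i_n, i_t) for i_n, i_t, _ in triples)
    c_self = Counter(i_t for _, i_t, _ in triples)
    te = 0.0
    for (i_n, i_t, j_t), c in c_full.items():
        p_given_both = c / c_hist[(i_t, j_t)]            # p(i_{t+1} | i_t, j_t)
        p_given_self = c_pair[(i_n, i_t)] / c_self[i_t]  # p(i_{t+1} | i_t)
        te += (c / n) * math.log2(p_given_both / p_given_self)
    return te

def relation_matrix(series_list):
    """Assumed layout: diagonal = entropy, entry [a][b] = transfer entropy a -> b."""
    n = len(series_list)
    return [[entropy(series_list[a]) if a == b
             else transfer_entropy(series_list[a], series_list[b])
             for b in range(n)] for a in range(n)]

# Synthetic check: the target copies the source with a one-step delay,
# so information should flow from source to target but not back.
rng = random.Random(0)
source = [rng.randint(0, 1) for _ in range(2000)]
target = [0] + source[:-1]
te_forward = transfer_entropy(source, target)
te_backward = transfer_entropy(target, source)
```

On this synthetic pair the forward value is close to one bit and the backward value is close to zero, which illustrates the directionality that the method relies on when it builds a directed graph.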
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (2)

1. A power dispatching automation system fault tracing method based on an information difference graph model is characterized by comprising the following steps:
(1) selecting historical data before and after alarming of the power dispatching automation system, obtaining a clustering center through a k-means algorithm, taking the clustering center as an endpoint of interval division, and taking the mean value of each interval as a discretization result of continuous characteristics;
(2) calculating the information entropy of the components of the power dispatching automation system and the transfer entropy among the components, establishing information correlation matrices for the segments with and without alarms, measuring the degree of difference before and after the alarm through the change rate of the information correlation matrices, and obtaining an information difference matrix by normalization, which specifically comprises the following steps: discretizing the resource occupation data of the CPUs, memories, disks, networks and processes of all servers of the power dispatching automation system, and collecting the discretized features of IO read-write conditions, utilization rates, collision rates and waiting times of the CPUs, memories, disks, networks and processes to obtain N feature time series {S_1, S_2, ..., S_N}; calculating the self-information entropy {H_1, H_2, ..., H_N} of each time series, where H(S) is the self-information entropy of the time series S:
Figure FDA0002731198520000011
wherein x represents each time point in the feature time series S, p(x) represents the output probability of x in the time series S, and α_x represents all possible values of x in the feature time series S; calculating the transfer entropy of every pair of feature time series {T_{1→2}, T_{1→3}, ..., T_{1→N}, T_{2→1}, T_{2→3}, ..., T_{2→N}, ..., T_{N→N-1}}; the mutual information of any two time series is defined as a measure of the amount of information shared by two variables, and the mutual information entropy I(S_I; S_J) is used to measure the amount of information shared by the time series S_I and S_J:
Figure FDA0002731198520000012
wherein S_I represents the I-th feature time series, S_J represents the J-th feature time series, x and y represent the time points in the feature time series S_I and S_J respectively, p(x, y) represents the joint probability distribution, p(x | y) represents the conditional probability, and α_x and α_y represent all possible values of x and y in the feature time series S_I and S_J; the mutual information entropy is extended to the transfer entropy: in the calculation of the transfer entropy, in addition to the information of S_I itself, S_J also provides additional information for predicting S_I(t+1); the transfer entropy T_{J→I} is set as the measure of the mutual information from the feature time series S_J to S_I:
Figure FDA0002731198520000021
wherein i_t represents the state of the feature time series S_I at time t, i_t^(k) represents the k most recent states of the feature time series S_I before i_{t+1}, j_t represents the state of the feature time series S_J at time t, and j_t^(l) represents the l most recent states of the feature time series S_J before j_{t+1}; the transfer entropy is directional, so T_{I→J} can be obtained by exchanging the variables in the formula; by calculating the self-information and mutual information of the time series in the alarm-free segment, the causal relationship matrix A is established:
Figure FDA0002731198520000022
similarly, the causal relationship matrix B of the alarm segment is obtained, and the information difference matrix is established:
Figure FDA0002731198520000023
the change rates of the diagonal information and the off-diagonal information are normalized separately to obtain the final information difference matrix;
(3) extracting the features whose information changes strongly under alarm in the power dispatching automation system and the interaction information among those features, constructing an information difference graph model combining a directed graph with node self-information, and fitting a fault degree index to rank the fault degrees, which specifically comprises the following steps: setting a threshold θ ∈ (0, 1), where c_{m,n} (m < N, n < N, c_{m,n} ∈ (0, 1)) represents the value in row m and column n of the information difference matrix C; traversing the matrix C and marking the elements c_{m,n} according to the threshold θ; keeping the values of all marked rows and columns and setting the other elements to zero to obtain the information difference matrix C'; establishing an information difference graph model according to the causal relationships of the extracted fault features and the links among the features, wherein the model comprises two kinds of information: the diagonal values of the matrix C' represent the self-information difference inside each node, and the off-diagonal values of the matrix C' represent the mutual information difference between nodes; the links between nodes represent that S_I influences S_J and that S_J influences S_I; and calculating the fault degree through the Fault_degree index:
Figure FDA0002731198520000031
wherein V_i represents a single node in the information difference graph model, BINNs of V_i represents the nodes adjacent to V_i, V_j represents a single one of the nodes adjacent to V_i, and NUM(BINNs of V_i) represents the total number of nodes adjacent to V_i; traversing all nodes of the information difference graph model, calculating the fault degree of each node with the Fault_degree index, and sorting the fault degrees from largest to smallest to obtain the final result.
2. The method according to claim 1, wherein in the discretization of the continuous features, historical data before and after the alarm of the power dispatching automation system are selected, cluster centers are obtained by the k-means algorithm and used as the endpoints of the interval division, and the mean value of each interval is used as the discretization result of the continuous features, specifically as follows: collecting resource occupation data of the CPUs (central processing units), memories, disks, networks and processes of all servers in the power dispatching automation system, wherein the resource occupation of each server comprises IO (input/output) read-write conditions, utilization rates, collision rates and waiting times; using the time series of these features as the input of the tracing method; assuming that the values of a certain feature are distributed in the interval [a, b], removing duplicates from all values of the feature to obtain the total number num, and setting the number k of cluster centroids as
Figure FDA0002731198520000032
the sum of squared errors between the centroids and the sample points is SSE; the plot of SSE against k has the shape of an elbow, and the k value at the elbow position is selected as the optimal number of clusters,
Figure FDA0002731198520000033
wherein C_i is the i-th cluster, poi is a sample point in C_i, and m_i is the centroid of C_i; k centroids {m_1, m_2, ..., m_k} (m_1 < m_2 < ... < m_k) are obtained by the k-means algorithm, the values of the feature are divided into k+1 intervals [a, m_1], [m_1, m_2], …, [m_{k-1}, m_k], [m_k, b], and the feature values in each interval are averaged to obtain the discretization result of that interval.
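The discretization in claim 2 can be sketched as follows. This is a minimal illustration under assumptions: a plain one-dimensional Lloyd's k-means with quantile initialisation stands in for whichever k-means variant is actually used, k is passed in by the caller rather than derived from the formula in the claim (which is available only as an image), and the names `kmeans_1d`, `sse` and `discretize` are hypothetical.

```python
import numpy as np

def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's algorithm on 1-D data; returns the centroids sorted ascending."""
    values = np.asarray(values, dtype=float)
    # Deterministic start at evenly spaced quantiles (an assumption of this sketch).
    centroids = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        updated = np.array([values[labels == c].mean() if np.any(labels == c)
                            else centroids[c] for c in range(k)])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return np.sort(centroids)

def sse(values, centroids):
    """Sum of squared distances from each sample to its nearest centroid."""
    values = np.asarray(values, dtype=float)
    dist = np.abs(values[:, None] - np.asarray(centroids)[None, :])
    return float((dist.min(axis=1) ** 2).sum())

def discretize(values, k):
    """Use the k centroids as interval endpoints and replace each value
    by the mean of the values that fall in the same interval."""
    values = np.asarray(values, dtype=float)
    centroids = kmeans_1d(values, k)
    bins = np.digitize(values, centroids)   # interval index 0..k, i.e. k+1 intervals
    out = np.empty_like(values)
    for b in np.unique(bins):
        mask = bins == b
        out[mask] = values[mask].mean()
    return out, centroids

# Synthetic feature values, not real system data.
feature = np.array([0.9, 1.0, 1.2, 4.9, 5.0, 5.1, 8.8, 9.0, 9.2])
discrete, centers = discretize(feature, k=3)
sse_by_k = {k: sse(feature, kmeans_1d(feature, k)) for k in (1, 2, 3, 4)}
```

Because the centroids serve as interval endpoints, a feature with k centroids takes at most k+1 distinct values after discretization. Inspecting `sse_by_k` for the point where the decrease flattens is one way to apply the elbow criterion mentioned in the claim.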
CN202011118535.0A 2020-10-19 2020-10-19 Power dispatching automation system fault tracing method based on information difference graph model Active CN112163682B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011118535.0A CN112163682B (en) 2020-10-19 2020-10-19 Power dispatching automation system fault tracing method based on information difference graph model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011118535.0A CN112163682B (en) 2020-10-19 2020-10-19 Power dispatching automation system fault tracing method based on information difference graph model

Publications (2)

Publication Number Publication Date
CN112163682A CN112163682A (en) 2021-01-01
CN112163682B true CN112163682B (en) 2022-05-17

Family

ID=73867448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011118535.0A Active CN112163682B (en) 2020-10-19 2020-10-19 Power dispatching automation system fault tracing method based on information difference graph model

Country Status (1)

Country Link
CN (1) CN112163682B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113162904B (en) * 2021-02-08 2022-11-08 国网重庆市电力公司电力科学研究院 Power monitoring system network security alarm evaluation method based on probability graph model
CN115086154A (en) * 2021-03-11 2022-09-20 中国电信股份有限公司 Fault delimitation method and device, storage medium and electronic equipment
CN113225226B (en) * 2021-04-30 2022-10-21 上海爱数信息技术股份有限公司 Cloud native system observation method and system based on information entropy
CN113128076A (en) * 2021-05-18 2021-07-16 北京邮电大学 Power dispatching automation system fault tracing method based on bidirectional weighted graph model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107423414A (en) * 2017-07-28 2017-12-01 西安交通大学 A kind of process industry complex electromechanical systems fault source tracing method based on information transmission model
CN108319780A (en) * 2018-02-01 2018-07-24 南京航空航天大学 Electric traction system failure detection method based on data-driven
CN110378036A (en) * 2019-07-23 2019-10-25 沈阳天眼智云信息科技有限公司 Fault Diagnosis for Chemical Process method based on transfer entropy
CN111401785A (en) * 2020-04-09 2020-07-10 国网山东省电力公司 Power system equipment fault early warning method based on fuzzy association rule

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10489716B2 (en) * 2016-07-08 2019-11-26 Intellergy, Inc. Method for performing automated analysis of sensor data time series

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on oil monitoring data features and wear fault diagnosis based on information entropy; 霍华 et al.; 《内燃机学报》; 20041125 (No. 06); full text *

Also Published As

Publication number Publication date
CN112163682A (en) 2021-01-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant