CN104616205B - A power system operating state monitoring method based on distributed log analysis - Google Patents

A power system operating state monitoring method based on distributed log analysis

Info

Publication number
CN104616205B
Authority
CN
China
Prior art keywords
log
log information
point
information
electric power
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410681737.4A
Other languages
Chinese (zh)
Other versions
CN104616205A (en)
Inventor
曹宇
王梓
张岩
孟伶智
郄洪涛
舒力
李华
阎博
王桂茹
张�浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Tianjin Electric Power Co Ltd
Beijing Kedong Electric Power Control System Co Ltd
State Grid Jibei Electric Power Co Ltd
Original Assignee
State Grid Tianjin Electric Power Co Ltd
Beijing Kedong Electric Power Control System Co Ltd
State Grid Jibei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Tianjin Electric Power Co Ltd, Beijing Kedong Electric Power Control System Co Ltd, State Grid Jibei Electric Power Co Ltd
Priority to CN201410681737.4A
Publication of CN104616205A
Application granted
Publication of CN104616205B
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Water Supply & Treatment (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Primary Health Care (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Power Engineering (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a power system operating state monitoring method based on distributed log analysis, comprising the following steps: S1, obtain the log information of the power system and merge it into log files; S2, split the log files, process them to obtain log information in a unified format, and serialize the log information in the log files entry by entry into a distributed storage system; S3, extract the log information from the distributed storage system and, in combination with the Map-Reduce mechanism, classify it using a log analysis algorithm based on state noise removal clustering, then monitor the system operating state by analyzing the classified log information. When the system becomes abnormal, the invention can detect the abnormal operating state of the power system in time and handle it promptly, effectively meeting the power system's requirements for timely and efficient operation.

Description

A power system operating state monitoring method based on distributed log analysis
Technical field
The present invention relates to a method for monitoring the operating state of a power system, in particular to a power system operating state monitoring method based on distributed log analysis, and belongs to the technical field of power system dispatching.
Background
As power grids grow in scale and complexity, the ultra-high-voltage interconnected grid places new requirements on integrated grid operation and unified, coordinated control, and national requirements for safe, stable, economical and environmentally friendly grid operation keep rising. Electric power big data has emerged in response. It is the application of big data concepts, technologies and methods in the power industry. It covers generation, transmission, transformation, distribution, consumption and dispatching, and it combines data analysis, data mining and data visualization across organizations, disciplines and lines of business.
In power system dispatching, as smart grid dispatching support systems are put into operation, the range and types of collected grid data keep expanding, and these data play an important role in meeting the comprehensive needs of the interconnected grid. At present, dispatching and control centers at all levels have built a series of dispatching production management and operation systems centered on the smart grid dispatching support system, mainly including SCADA/EMS, WAMS, hydropower and new energy, online monitoring and analysis of secondary equipment, dispatch planning, security checking, and dispatch operation management. These systems have been put into operation, largely meet the needs of dispatching production, and play an important role in dispatching production management.
The safe and stable operation of a power system relies on the protection provided by secondary equipment such as relay protection and automatic devices. However, this secondary equipment alone cannot fully guarantee safe operation, because such devices usually handle faults based on local information and cannot use global information to predict and analyze the operating condition of the system or to handle the various complex problems that arise in it. Log analysis technology for monitoring the system operating state therefore urgently needs to be developed.
At present, system log analysis in domestic power enterprises is still immature. Most system faults are discovered through fault alarms and manual checks, and in many cases the system has already been running abnormally for some time when an alarm or manual check reveals the fault. Operating anomalies cannot be found and handled promptly, which greatly prolongs operation and maintenance time and fails to meet the grid's requirements for timely and efficient operation. In addition, power enterprises face many different data analysis needs every day, and the log data provided are also diverse. How to analyze and process such diverse log data in a unified way is an urgent problem.
Summary of the invention
The technical problem to be solved by the present invention is to provide a power system operating state monitoring method based on distributed log analysis.
To achieve the above object, the present invention adopts the following technical solution:
A power system operating state monitoring method based on distributed log analysis, comprising the following steps:
S1, obtain the log information of the power system and merge it into log files;
S2, split the log files, process them to obtain log information in a unified format, and serialize the log information in the log files entry by entry into a distributed storage system;
S3, extract the log information from the distributed storage system and, in combination with the Map-Reduce mechanism, classify it using a log analysis algorithm based on state noise removal clustering, and monitor the power system operating state by analyzing the classified log information.
Preferably, in step S1, the log information is obtained using a log scanning and grabbing method based on syslog.
Preferably, the log scanning and grabbing method comprises the following steps:
S11, select and merge the log information grabbed by the seed modules located on each node of the power system, obtaining the various types of log information of that node;
S12, grab and merge the various types of log information of the nodes in each zone of the power system, obtaining the consolidated data of each zone, and send it to the zone's data processing node, where the data are processed and stored in log files;
S13, obtain the selected and merged log information of all types, obtain grabbing record data from the nodes that grab log information, derive the merge-and-grab strategy for log information through analysis, and adjust the strategy as needed.
Preferably, monitoring the system operating state in combination with the Map-Reduce mechanism, using the log analysis algorithm based on state noise removal clustering, specifically comprises the following steps:
S31, extract the log information from the distributed storage system, roughly classify it by log category according to the location of the node that grabbed it, build a similarity matrix within each category, and select one point in each category set as the center point;
S32, sparsify the similarity matrix of each category using the k-nearest-neighbor classification algorithm, and use the sparsified similarity matrices to build a shared nearest neighbor graph covering all log categories;
S33, for each point in the shared nearest neighbor graph, use the Map mechanism to collect the distances from that point to the other points;
S34, use the Reduce mechanism to sum the distances collected by the Map mechanism and generate new key-value pairs;
S35, select the point with the largest distance sum as the center point of the similarity matrix, replacing the original center point; points whose distance sum is less than the length threshold are marked as noise and are no longer used as cluster center points;
S36, among the links between points, remove the links whose weight is below the weight threshold, and take the points that remain linked to each other as one cluster, so that each cluster represents one category of log information;
S37, analyze each category of log information further to obtain information reflecting the power system operating state, and monitor the grid operating state by observing changes in this information.
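Step S36 can be illustrated with a minimal Python sketch. It assumes the link weights are already available as a dictionary keyed by point pairs; the function name `clusters_from_links`, the data layout and the sample values are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

def clusters_from_links(links, weight_threshold):
    """Drop weak links, then group the points that remain linked (S36).

    links: {(point_a, point_b): weight}, a hypothetical layout.
    Points left without any link are not assigned to a cluster.
    """
    adjacency = defaultdict(set)
    for (a, b), weight in links.items():
        if weight >= weight_threshold:  # keep only links at or above the threshold
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk over the surviving links
            point = stack.pop()
            if point in component:
                continue
            component.add(point)
            stack.extend(adjacency[point] - component)
        seen |= component
        clusters.append(component)
    return clusters

# Illustrative weights only.
links = {("a", "b"): 9, ("b", "c"): 7, ("c", "d"): 1, ("d", "e"): 8}
clusters = clusters_from_links(links, weight_threshold=5)
```

Here the weak link between `c` and `d` is removed, so the five points fall into two clusters.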
Preferably, in step S31, the log information categories comprise three classes: system logs, access logs and user behavior logs.
Preferably, in step S32, building the shared nearest neighbor graph of all log categories comprises the following steps:
First, determine the neighbor lists of log entries A and B using the k-nearest-neighbor method; a link is established between the two points when A and B are each in the other's neighbor list. Then set to zero the similarity in the similarity matrix of any pair of points that are not linked, which sparsifies the similarity matrix. Finally, draw the linked point pairs and their weighted edges, completing the shared nearest neighbor graph of all log categories;
The weight of the link between two points is their similarity str(i, j), calculated as: str(i, j) = ∑ (k + 1 - m) * (k + 1 - n);
where k is the size of the neighbor lists of A and B, and m and n are the positions of a shared nearest neighbor of A and B in their respective neighbor lists.
Preferably, the power system operating state monitoring method based on distributed log analysis further comprises the following step:
S4, according to the operating condition of the power system, determine the indicators that require special attention and the log information category each belongs to, and monitor the power system operating state by separately monitoring those indicators within the corresponding log information category.
Preferably, step S4 further comprises the following steps:
S41, parse the log information and determine the log information category to which the indicator requiring special attention belongs;
S42, extract the keywords requiring special attention from the parsed log results, concatenate them into a field name, and set the value to 1;
S43, use the Reduce mechanism to count, within that log information category, the number of times the field name occurs, then generate and output new key-value pairs;
S44, extract the information in the key-value pairs and analyze it to monitor the power system operating state.
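Steps S41 to S44 follow the familiar counting pattern of Map-Reduce. The sketch below imitates that pattern in plain Python, with no Hadoop involved, purely to show the data flow. The keyword list, the record layout and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical keywords that an operator wants to watch.
WATCHED_KEYWORDS = ("timeout", "login_failed")

def map_phase(records):
    """S41/S42: emit ((category, keyword), 1) for each watched keyword found."""
    for category, message in records:
        for keyword in WATCHED_KEYWORDS:
            if keyword in message:
                yield (category, keyword), 1

def reduce_phase(pairs):
    """S43: sum the values per key to count occurrences within each category."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Illustrative records: (log category, message).
records = [
    ("access", "GET /status timeout after 30s"),
    ("access", "GET /data timeout after 30s"),
    ("system", "disk check ok"),
    ("user", "operator01 login_failed"),
]
counts = reduce_phase(map_phase(records))
```

The resulting key-value pairs, such as `("access", "timeout")` with a count of 2, are what step S44 would then inspect.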
The power system operating state monitoring method provided by the invention obtains log information from the power system, classifies it in combination with the Map-Reduce mechanism using a log analysis algorithm based on state noise removal clustering, and monitors the power system operating state by analyzing the classified log information. When the system becomes abnormal, the anomaly can thus be found and handled promptly, effectively meeting the power system's requirements for timely and efficient operation.
Brief description of the drawings
Fig. 1 is a flow chart of the power system operating state monitoring method provided by the invention;
Fig. 2 is a structure diagram of the web crawler system that implements log information acquisition in the invention;
Fig. 3 is a flow chart of log information collection in the invention;
Fig. 4 is a flow chart of monitoring the system operating state with the log analysis algorithm based on state noise removal clustering in the invention;
Fig. 5 is a flow chart of the statistical analysis of log information for specific fields.
Detailed description of the embodiments
The technical content of the invention is described in further detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the power system operating state monitoring method based on distributed log analysis provided by the invention comprises the following steps. First, the log information of the power system is obtained through log scanning and grabbing based on syslog (system log) and merged into log files. Next, the log files are split, and message prefixes and suffixes are combined to give the log information a unified log data format; the log information is then serialized entry by entry into the distributed storage system (HDFS/HBase). Finally, in combination with the Map-Reduce mechanism in Hadoop, the log information is classified using the log analysis algorithm based on state noise removal clustering, and the power system operating state is monitored by analyzing the classified log information. This process is described in detail below.
S1, obtains the log information of electric system by log scan grasping means based on syslog mode, and by its It is merged into journal file.
Data acquisition, also known as data acquisition are to acquire data from exterior using a kind of tool and be input in system The process in portion.In today of internet industry fast development, data collecting field has occurred that important variation, is answered extensively For internet and field of distributed type.In power industry, data acquisition is exactly logical to safety equipment of concern, application system etc. Cross the acquisition work of log information needed for certain concrete mode (file, syslog, http etc.) carries out power system monitor, accident analysis Make.
Log collection technology is one of key technology of log analysis.Log collection technology needs to acquire various safety and sets The log informations such as standby, application system provide data source for the event analysis work on upper layer, therefore log collection process is system The basis of detection and decision is carried out, its accuracy, reliability and its efficiency directly influence the performance of whole system.
In one embodiment of the invention, the log information of analysis specifically includes that system log, access log, user User behaviors log three classes obtain the log information of electric system by the log scan grasping means based on syslog mode.System Log (syslog) agreement is developed in the TCP/IP system in University of California Bai Keli software distribution research center (BSD) is implemented , oneself becomes industry-standard protocol at present, and the log of system and equipment can be recorded with it.In the routing of UNIX/Linux system In the network equipments such as device, interchanger, syslog records any event in system, and manager can be by checking that system is remembered Record, grasps system status at any time.The system log of UNIX/Linux records system related events by syslogd process, can also Event is operated with records application program, by appropriately configured, can also realize the communication between the machine of operation syslog agreement. By analyzing these network behavior logs, the situation related with system, equipment and network is tracked and grasped.
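As a small illustration of what a syslog record carries, the Python sketch below decodes the priority prefix of a BSD-style syslog line. In the BSD syslog format the priority value equals facility × 8 + severity. The function name `split_priority` and the sample line are hypothetical, and the patent does not describe this parsing step.

```python
import re

# A BSD-style syslog line starts with "<PRI>", where PRI = facility * 8 + severity.
PRI_PATTERN = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)

def split_priority(line):
    """Return (facility, severity, rest_of_line) for one syslog line."""
    match = PRI_PATTERN.match(line)
    if match is None:
        raise ValueError("no syslog priority prefix")
    facility, severity = divmod(int(match.group(1)), 8)
    return facility, severity, match.group(2)

# Illustrative line; the host and message are made up.
facility, severity, rest = split_priority("<34>Oct 11 22:14:15 host1 su: auth failure")
```

A priority of 34 decodes to facility 4 and severity 2, which a collector could use to route the entry to a log category.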
In one embodiment of the invention, the syslog-based log scanning and grabbing method uses a web crawler system adapted for system log scanning and grabbing to scan and grab system logs in real time, in preparation for subsequent operating state monitoring. A web crawler (spider) is a software program that follows the HTTP protocol and traverses the information space according to the hyperlinks and index relationships between web documents.
The web crawler system comprises seed management modules, grabbing modules and a crawler log data extraction and statistics module. Fig. 2 shows the structure of the web crawler system that implements log information acquisition. The crawler log data extraction and statistics module obtains log information from the grabbing nodes of the seed management modules and grabbing modules and first backs it up on a local server. The data are then compressed in Hadoop Lzop format and uploaded to HDFS over the network. Hive generates Map-Reduce tasks according to the log parsing plan and submits them to the Hadoop cluster as Jobs, and the results are stored in the crawler data system. The cluster Job scheduling system schedules Job tasks to make effective use of resources, cluster operation monitoring records the running state of Job tasks, and network monitoring monitors the operating state of the system.
Acquiring log information through the web crawler system specifically comprises the following steps:
S11, the seed management modules, distributed on the nodes of the power system, select and merge the log data grabbed by the seed modules on each node, obtaining the various types of log information of that node.
In a power system, multiple seed modules are distributed on each node to grab the log information that the node generates while the system runs, such as system information, access information and information from the advanced applications. A seed management module is also deployed on each node to select and merge the log information grabbed by the seed modules, obtaining the various types of log information of that node.
S12, the grabbing modules, deployed in Zone I, Zone II and Zone III of the power system, grab and merge the log information summarized by the seed management modules of the nodes, obtain the consolidated data of each zone, and send it to the zone's data processing node, where the data are processed and stored in log files.
The seed management modules are spread over the nodes contained in Zone I, Zone II and Zone III of the power system, while the grabbing modules are deployed in Zone I, Zone II and Zone III respectively. Each grabbing module grabs and merges the log information summarized by the seed management modules in its zone, obtains the consolidated data of that zone, and sends it to the zone's data processing node. There the data are processed and the processed log information is stored in log files.
S13, the crawler log data extraction and statistics module obtains the selected and merged log information of all types from the seed management modules and grabbing modules, obtains grabbing record data from the nodes that grab log information, derives the merge-and-grab strategy for log information through analysis, and can adjust the strategy promptly as needed.
The crawler log data extraction and statistics module is responsible for adjusting the grabbing strategy. It obtains, on the one hand, the log information of all types selected and merged by the seed management modules and grabbing modules and, on the other hand, the grabbing record data from the nodes that grab log information. Analyzing this information yields the merge-and-grab strategy of the whole crawler system. When a system problem occurs, the strategy can be adjusted promptly according to the log types involved, so that the seed management modules and grabbing modules grab only the log information related to the problem. This reduces the volume and time of log processing and improves operation and maintenance efficiency.
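The two-level merge of steps S11 to S13 can be sketched in Python as follows. This is only one way to read the description: the function names, the `(category, message)` tuples and the `wanted` filter standing in for an adjusted grabbing strategy are all assumptions.

```python
from collections import defaultdict

def merge_node(seed_outputs):
    """S11: merge what the seed modules on one node grabbed, grouped by log category."""
    merged = defaultdict(list)
    for entries in seed_outputs:
        for category, message in entries:
            merged[category].append(message)
    return dict(merged)

def merge_zone(node_logs, wanted=None):
    """S12: merge the node-level logs of one zone.

    wanted: optional set of categories to keep, a hypothetical stand-in for
    the adjusted merge-and-grab strategy of S13.
    """
    zone = defaultdict(list)
    for node_id, by_category in sorted(node_logs.items()):
        for category, messages in by_category.items():
            if wanted is None or category in wanted:
                zone[category].extend((node_id, message) for message in messages)
    return dict(zone)

# Illustrative data for two nodes in one zone.
node_a = merge_node([[("system", "cpu 91%")], [("access", "GET /status")]])
node_b = merge_node([[("system", "link down eth1")]])
zone_1 = merge_zone({"node-a": node_a, "node-b": node_b})
only_system = merge_zone({"node-a": node_a, "node-b": node_b}, wanted={"system"})
```

Restricting `wanted` to the categories involved in a fault mirrors the idea of grabbing only problem-related logs to cut processing volume.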
S2, split the log files, process them to obtain log information in a unified format, and serialize the log information in the log files entry by entry into the distributed storage system (HDFS/HBase).
The log files are split with the Flume tool. The log data format is customized by combining message prefixes and suffixes, so that log information of different categories shares a unified log data format. The log information is then serialized entry by entry and written to the distributed storage system (HDFS/HBase), which facilitates the subsequent log analysis.
According to the actual needs of the power system, the log information analyzed mainly comprises three classes: system logs, access logs and user behavior logs. System logs are used to monitor the system operating state, including system resource utilization and network device usage. Access logs are used to compile statistics on the interactions of system hosts, such as the number of accesses, the accessing nodes and the access times. User behavior logs are used for mining and analyzing dispatching behavior patterns, mainly by modeling and analyzing the operation data of operating staff. The three classes of log files grabbed by the crawler are sent to the distributed storage system by the Flume tool in batches at scheduled times. Flume is a distributed tool for collecting and transporting logs. Its basic unit is the Agent, which comprises a data receiving end, a sending end and a channel. It is highly scalable and flexible, and it can collect not only unstructured text files but also unstructured files such as video and audio. Fig. 3 shows the log information collection flow. The flow first checks whether a new log file has been generated. If so, the log file is split and the log information is converted to the unified format, and the processed log information is then serialized entry by entry into the distributed system for later centralized analysis.
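The splitting and unified-format step can be pictured with the short Python sketch below. It does not use Flume or HDFS. It only shows one plausible shape for a unified record, serialized as one JSON string per log entry. The field names `category`, `node` and `message` are assumptions, since the patent does not specify the format.

```python
import json

def split_and_serialize(file_text, category, node_id):
    """Split a log file into entries and serialize each in one unified format."""
    records = []
    for line in file_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between entries
        # Hypothetical unified layout: shared metadata fields wrapped around the message.
        record = {"category": category, "node": node_id, "message": line}
        records.append(json.dumps(record, sort_keys=True))
    return records

# Illustrative input only.
lines = split_and_serialize("cpu 91%\n\nlink down eth1\n", "system", "node-a")
```

Each string in `lines` could then be appended to the distributed store as one serialized log entry.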
S3, extract the log information from the distributed storage system (HDFS/HBase) and, in combination with the Map-Reduce mechanism in Hadoop, classify it using the log analysis algorithm based on state noise removal clustering, and monitor the power system operating state by analyzing the classified log information.
In the distributed data framework of the power grid, multiple data collection units are deployed across the network environment (in one embodiment of the invention, the seed modules of the web crawler act as data collection units that collect log information in the power system). The dispatching center therefore needs to centrally monitor the operation of the data collection units and hosts, and to monitor the system state through log information.
Clustering is the process of dividing a data set into multiple groups, or clusters, so that objects in the same group are as similar as possible and objects in different groups are as dissimilar as possible. Cluster analysis is an important data analysis technique with very wide application. From a statistical point of view, cluster analysis is one of the main branches of multivariate statistical analysis, a way of simplifying data through data modeling, and mainly comprises distance-based and similarity-based clustering methods. From a machine learning point of view, clustering is an unsupervised learning method whose training examples have no predefined classes or class labels.
In one embodiment of the invention, the log information is extracted from the distributed storage system (HDFS/HBase) and, in combination with the Map-Reduce mechanism in Hadoop, classified using the log analysis algorithm based on state noise removal clustering. After processing, each cluster contains log information of only one category, and within a single category the indicators that represent the power system operating state, such as service information, can be identified. The current operating state of the power system is obtained by analyzing and comparing the operating information of these indicators. For example, when an indicator shows operating data inconsistent with normal operation of the power system, the indicator is abnormal, and the equipment associated with it can be serviced quickly, which greatly reduces the operation and maintenance time of the system.
As shown in Fig. 4, monitoring the system operating state in combination with the Map-Reduce mechanism in Hadoop, using the log analysis algorithm based on state noise removal clustering, specifically comprises the following steps:
S31, extract the log information from the distributed storage system (HDFS/HBase), roughly classify it into system logs, application logs, access logs and so on according to the location of the node that grabbed it, build a similarity matrix within each category, and randomly select one point in each category set as the center point.
In a power system, some nodes belong to the basic platform, and the log information grabbed from them mainly comprises system logs. Some nodes are application nodes, and the log information grabbed from them mainly comprises application logs. Some nodes do not interact with other nodes and therefore have no access logs, and some nodes have logs of all types. In one embodiment of the invention, the summarized log information is roughly classified by node location. Nodes that contain only a single class of log information are first each assigned to one class, and the union is then taken with the nodes that contain all three classes of log information. Rough classification produces log files of different categories. A similarity matrix is built within each category, and one point in each category set is selected at random as the center point. In one embodiment of the invention, each log entry in a log file constitutes one point. The similarity matrix is a square matrix whose elements are the similarities between each point and the other points.
The similarity of two log entries is defined by their shared nearest neighbors, that is, it is determined by the nearest neighbors the two entries have in common. In one embodiment of the invention, the neighbor lists of log A and log X are determined with the k-nearest-neighbor method, and a link is established between the two points if and only if A and X are each in the other's neighbor list. The points X linked to A form a set, the points linked to B form another set, and the intersection of the two sets is the set of shared nearest neighbors. If log A is close to log B and both are close to a set C, then the closeness of A and B has higher confidence, because their similarity is also determined by C; the set C is their set of shared nearest neighbors.
S32: sparsify the similarity matrix of each category using the k-nearest-neighbor classification algorithm, setting to zero the similarity of every pair of points that has no link between them, and use the sparsified similarity matrices to build a shared nearest neighbor graph covering all log categories.
In one embodiment of the invention, the similarity matrix of each category is sparsified using the k-nearest-neighbor classification algorithm; from the sparsified similarity matrix, the points and the weighted edges between them are drawn, thereby building the shared nearest neighbor graph of all log categories. Specifically, this comprises the following steps:
The neighbor lists of A and B are determined with the k-nearest-neighbor method, and a link is established between the two points if and only if A and B each appear in the other's neighbor list. The similarity of every pair of points in the similarity matrix that has no link is set to zero, which sparsifies the similarity matrix; the linked pairs of points and their weighted edges are then drawn, which yields the shared nearest neighbor graph of all log categories. The weight of a link is the similarity of its two points, computed as:
str(i, j) = ∑ (k + 1 - m) * (k + 1 - n)
where k is the size of the neighbor lists of A and B, and m and n are the serial numbers of the neighbors of A and B in their respective neighbor lists.
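A small sketch of the link weight and the sparsification step follows. It rests on two assumptions that the text does not state explicitly: the sum runs over the points that appear in both neighbor lists, and the serial numbers m and n are 1-based positions in those lists. The function names are illustrative.

```python
def snn_weight(list_a, list_b):
    """Sum (k+1-m)*(k+1-n) over points present in both neighbor lists.

    Assumes both lists have the same length k and that m and n are the
    1-based positions of a common point in list_a and list_b.
    """
    k = len(list_a)
    weight = 0
    for m, point in enumerate(list_a, start=1):
        if point in list_b:
            n = list_b.index(point) + 1
            weight += (k + 1 - m) * (k + 1 - n)
    return weight

def sparsify(sim, neighbors):
    """Return a copy of sim with every non-mutual pair set to zero."""
    size = len(sim)
    out = [row[:] for row in sim]
    for i in range(size):
        for j in range(size):
            if i != j and not (j in neighbors[i] and i in neighbors[j]):
                out[i][j] = 0.0
    return out

# Points 5 and 2 are common: (3+1-1)*(3+1-3) + (3+1-2)*(3+1-1) = 3 + 6.
weight = snn_weight([5, 2, 7], [2, 9, 5])

sim = [[1.0, 0.9, 0.4], [0.9, 1.0, 0.5], [0.4, 0.5, 1.0]]
sparse = sparsify(sim, neighbors=[[1], [0], [1]])
```

Under this reading, common neighbors that rank near the top of both lists contribute the most to the weight, so two logs that agree on their closest neighbors receive the strongest link.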
S33: for each point in the shared nearest neighbor graph, use the Map mechanism to collect the distances from that point to the other points.
S34: use the Reduce mechanism to sum the distances collected by the Map mechanism, computing the distance sum of each point and generating new key-value pairs, in which the key is the log information and the value is the distance sum of that point.
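Steps S33 and S34 follow the usual map-then-reduce pattern, which can be emulated in plain Python as below. This is only a sketch: a real deployment would run on Hadoop, and the edge dictionary, the point names, and the notion of "distance" attached to each edge are assumptions made for illustration.

```python
from collections import defaultdict

def map_distances(edges):
    """Map step: emit (point, distance) for both endpoints of every edge."""
    for (a, b), distance in edges.items():
        yield a, distance
        yield b, distance

def reduce_sum(pairs):
    """Reduce step: sum the distances emitted for each key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Hypothetical graph: keys are linked pairs, values are their distances.
edges = {
    ("log-a", "log-b"): 2.0,
    ("log-a", "log-c"): 3.0,
    ("log-b", "log-c"): 1.0,
}
distance_sums = reduce_sum(map_distances(edges))
```

The resulting dictionary plays the role of the new key-value pairs: each key is a log entry and each value is that entry's distance sum.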
S35: select the point with the largest distance sum as the center point of the similarity matrix, replacing the original center point; points whose distance sum is less than the length threshold are marked as noise and are no longer used as cluster center points.
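Given the distance sums from the Reduce step, the selection in S35 reduces to a maximum and a threshold test. The sketch below assumes the sums are held in a dictionary keyed by point; the function name and the sample threshold are hypothetical.

```python
def select_center_and_noise(distance_sums, length_threshold):
    """Return the point with the largest sum and the set of noise points."""
    center = max(distance_sums, key=distance_sums.get)
    noise = {point for point, total in distance_sums.items()
             if total < length_threshold}
    return center, noise

center, noise = select_center_and_noise(
    {"log-a": 5.0, "log-b": 3.0, "log-c": 0.5}, length_threshold=1.0
)
```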
S36: among the links between all points, remove the links whose weight is less than the threshold, and take the points that are linked to each other as one cluster, ensuring that every point in a cluster is either the center point or directly connected to the center point; each cluster represents one category of log information.
As stated in step S32, the weight of a link is the similarity of its two points, computed as:
str(i, j) = ∑ (k + 1 - m) * (k + 1 - n)
where k is the size of the neighbor lists of A and B, and m and n are the serial numbers of the neighbors of A and B in their respective neighbor lists. The greater the distance between two points, the smaller the weight of the link and the lower the similarity. Removing the links whose weight is less than the threshold ensures that the points remaining linked are log information of the same category. The points linked to each other are taken as one cluster, ensuring that every point in the cluster is either the center point or directly connected to the center point; each cluster represents one category of log information.
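The pruning and cluster formation of step S36 might look like the following sketch. It takes one reading of the text, namely that a cluster consists of a center point plus the non-noise points directly linked to it once weak links are removed; the data layout and function names are assumptions.

```python
def prune_links(links, weight_threshold):
    """Drop every link whose weight is below the threshold."""
    return {pair: w for pair, w in links.items() if w >= weight_threshold}

def cluster_around(center, links, noise=frozenset()):
    """Collect the center plus all non-noise points directly linked to it."""
    members = {center}
    for a, b in links:
        if a == center and b not in noise:
            members.add(b)
        elif b == center and a not in noise:
            members.add(a)
    return members

# Hypothetical weighted links between four log entries.
links = {("a", "b"): 9, ("a", "c"): 2, ("b", "d"): 7}
strong = prune_links(links, weight_threshold=5)
cluster_a = cluster_around("a", strong)
cluster_b = cluster_around("b", strong)
```

Here the weak link between "a" and "c" is removed, so "c" does not join the cluster centered on "a".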
S37: further analyze the log information by category to obtain the indices reflecting power system state, the system access volume and accessed nodes, dispatcher operation data, and other such information; monitoring of the grid operating state is realized by observing changes in this information.
The log information is further analyzed by category using Hive. For log files belonging to the system log category, statistics yield the indices reflecting system state, such as CPU usage, memory space, hard disk space, network interface card traffic, processes, and service information. For log files belonging to the access log category, analysis yields the information of interest, such as access volume and accessed nodes. For log files belonging to the user operation log category, dispatcher operation data are counted. When the power system becomes abnormal, this information changes to some degree, and monitoring of the power system operating state is realized by observing these changes.
S4: according to the operating conditions of the power system, determine the indices that need special attention and the log information categories they belong to; monitoring of the power system operating state is realized by separately monitoring those indices within the corresponding log information categories.
If, during system operation, special operating requirements arise, for example certain indices of the power system are prone to abnormality during a certain period and may cause a power system fault, so that the user needs to pay special attention to that period or to the operating state of that index, the indices needing special attention can be monitored separately. By paying special attention to them, abnormalities in the power system operating state can be found in time. As shown in figure 5, this specifically comprises the following steps:
S41: parse the log information and determine the log information category to which the index needing special attention belongs, i.e., whether the problem of concern belongs to the system log, the access log, or the user operation log.
S42: extract the keywords needing special attention from the parsed log results, such as "schedule job" and "ERROR", concatenate them into a field name, and set the value to 1.
S43: using the Reduce mechanism, aggregate the values within that log information category, i.e., compute the number of times the field name occurs in the category, and generate and output new key-value pairs.
S44: extract the information in the key-value pairs and analyze it to realize monitoring of the power system operating state.
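Steps S42 and S43 amount to a keyword count in map-reduce style, emulated here in plain Python. The two keywords come from the text; the `category:keyword` field-name layout, the function names, and the sample log lines are assumptions made for illustration.

```python
from collections import defaultdict

KEYWORDS = ("schedule job", "ERROR")  # the two examples given in step S42

def map_keywords(category, lines, keywords=KEYWORDS):
    """Map step: emit (field_name, 1) for every keyword found in a line."""
    for line in lines:
        for keyword in keywords:
            if keyword in line:
                # Field name built from category and keyword (assumed layout).
                yield f"{category}:{keyword}", 1

def reduce_counts(pairs):
    """Reduce step: total the values emitted for each field name."""
    counts = defaultdict(int)
    for field, value in pairs:
        counts[field] += value
    return dict(counts)

lines = [
    "ERROR disk full",
    "schedule job finished",
    "ERROR in schedule job",
    "ok",
]
counts = reduce_counts(map_keywords("system", lines))
```

A sudden rise in one of these counts within a monitoring window is the kind of change step S44 would then examine.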
In conclusion operation states of electric power system provided by the present invention monitors method, by based on syslog mode Log scan grabs the log information that technology obtains electric system, then combines message front and back and sews content, makes every log information All there is the preceding suffix information of customization, serialize log information one by one and be output to distributed memory system (HDFS/HBase) In, in conjunction with the Map-Reduce mechanism in Hadoop, system is monitored using the log analysis algorithm for removing cluster based on state noise System operating status, so as to find the exception of operation states of electric power system in time, and is being handled at the first time, is effectively met Electric system is timely, efficient service requirement.In addition to this, the web crawlers technology applied to system log scanning crawl can Diversified daily record data is grabbed from electric system, and united analysis processing is carried out to it by Flume tool, is improved more The treatment effeciency of sample daily record data.
The operation of power networks state monitoring method provided by the present invention based on distributed information log analysis has been carried out in detail above Thin explanation.For those of ordinary skill in the art, it is done under the premise of without departing substantially from true spirit Any obvious change, the infringement for all weighing composition to the invention patent, will undertake corresponding legal liabilities.

Claims (7)

1. A power system operating state monitoring method based on distributed log analysis, characterized by comprising the following steps:
S1: obtain the log information of the power system and merge it into log files;
S2: split the log files and process them to obtain log information in a unified format, then serialize the log information in the log files entry by entry and output it to a distributed storage system;
S3: extract log information from the distributed storage system and, in combination with the Map-Reduce mechanism, classify the log information using a log analysis algorithm based on state-noise-removal clustering; monitor the power system operating state by analyzing the classified log information; this comprises the following steps:
S31: extract log information from the distributed storage system and coarsely classify it, according to the position of the node from which it was captured, into system logs, application logs, and access logs; nodes containing only a single class of log information are each assigned to one class, and the nodes that contain all three classes of log information are then merged in by taking the union; coarse classification yields log files of different categories; build a similarity matrix within each category, and select one point in the category set as the center point;
S32: determine the neighbor lists of log A and log B with the k-nearest-neighbor method; a link is established between the two points if and only if log A and log B each appear in the other's neighbor list; set to zero the similarity of every pair of points in the similarity matrix that has no link, which sparsifies the similarity matrix; from the sparsified similarity matrix, draw the points and the weighted edges between them, thereby building a shared nearest neighbor graph covering all log categories;
S33: for each point in the shared nearest neighbor graph, use the Map mechanism to collect the distances from that point to the other points;
S34: use the Reduce mechanism to sum the distances collected by the Map mechanism and generate new key-value pairs;
S35: select the point with the largest distance sum as the center point of the similarity matrix, replacing the original center point; points whose distance sum is less than the length threshold are marked as noise and are no longer used as cluster center points;
S36: among the links between all points, remove the links whose weight is less than the threshold, and take the points linked to each other as one cluster, so that each cluster represents one category of log information;
S37: further analyze the log information by category to obtain information reflecting the power system operating state, and realize monitoring of the grid operating state by observing changes in this information.
2. The power system operating state monitoring method according to claim 1, characterized in that:
in step S1, the log information is obtained using a log scanning and capture method based on the syslog mode.
3. The power system operating state monitoring method according to claim 2, characterized in that the log scanning and capture method comprises the following steps:
S11: select and merge the log information captured by the various submodules on each node of the power system to obtain the various classes of log information of that node;
S12: in each region of the power system, capture and merge the various classes of log information of each node to obtain the integrated data of each region, send the data to the region's data processing node for processing, and store it in log files;
S13: obtain the various classes of log information that have been selected and merged, obtain the capture record data from the nodes from which log information is captured, derive the merge-and-capture strategy for log information through analysis, and adjust the merge-and-capture strategy as needed.
4. The power system operating state monitoring method according to claim 1, characterized in that:
in step S31, the log information categories comprise three classes: system logs, access logs, and user operation logs.
5. The power system operating state monitoring method according to claim 1, characterized in that, in step S32, building the shared nearest neighbor graph of all log categories comprises the following steps:
first determine the neighbor lists of log information A and B with the k-nearest-neighbor method, and establish a link between the two points when A and B each appear in the other's neighbor list; then set to zero the similarity of every pair of points in the similarity matrix that has no link, which sparsifies the similarity matrix; finally draw the linked pairs of points and their weighted edges, completing the shared nearest neighbor graph of all log categories;
the weight of the link between two points is the similarity str(i, j) of the two points, computed as: str(i, j) = ∑ (k + 1 - m) * (k + 1 - n);
where k is the size of the neighbor lists of A and B, and m and n are the serial numbers of the neighbors of A and B in their respective neighbor lists.
6. The power system operating state monitoring method according to claim 1, characterized by further comprising the following step:
S4: according to the operating conditions of the power system, determine the indices that need special attention and the log information categories they belong to, and realize monitoring of the power system operating state by separately monitoring those indices within the corresponding log information categories.
7. The power system operating state monitoring method according to claim 6, characterized in that step S4 further comprises the following steps:
S41: parse the log information and determine the log information category to which the index needing special attention belongs;
S42: extract the keywords needing special attention from the parsed log results, concatenate them into a field name, and set the value to 1;
S43: using the Reduce mechanism, within that log information category, compute the number of times the field name occurs in the category, and generate and output new key-value pairs;
S44: extract the information in the key-value pairs and analyze it to realize monitoring of the power system operating state.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410681737.4A CN104616205B (en) 2014-11-24 2014-11-24 A kind of operation states of electric power system monitoring method based on distributed information log analysis

Publications (2)

Publication Number Publication Date
CN104616205A CN104616205A (en) 2015-05-13
CN104616205B true CN104616205B (en) 2019-10-25

Family

ID=53150638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410681737.4A Expired - Fee Related CN104616205B (en) 2014-11-24 2014-11-24 A kind of operation states of electric power system monitoring method based on distributed information log analysis

Country Status (1)

Country Link
CN (1) CN104616205B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20191025

Termination date: 20211124
