CN109542737A - Platform alert processing method, device, electronic device and storage medium - Google Patents
- Publication number
- CN109542737A (application CN201811151626.7A)
- Authority
- CN
- China
- Prior art keywords
- alarm
- error
- platform
- big data
- log
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/32—Monitoring with visual or acoustical indication of the functioning of the machine
- G06F11/324—Display of status information
- G06F11/327—Alarm or error message display
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/302—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Debugging And Monitoring (AREA)
Abstract
A platform alarm processing method, comprising: receiving at least one alarm signal from a big data platform; obtaining the error log corresponding to the alarm signal and analyzing the alarm category corresponding to the error log; determining a corresponding control instruction according to the analyzed alarm category; and sending the determined control instruction to the big data platform, the control instruction being used to control the big data platform to perform a corresponding operation. The invention also provides a platform alarm processing device, an electronic device, and a computer-readable storage medium. The invention helps improve the efficiency of alarm processing, automates alarm processing, and improves the efficiency of security monitoring.
Description
Technical field
The present invention relates to alarm processing, and in particular to a platform alarm processing method, a platform alarm processing device, an electronic device, and a computer-readable storage medium.
Background technique
With the continuous advance of information technology and the rapid spread of the Internet, demand for processing massive amounts of data keeps growing in every field, and big data processing platforms that can process such data efficiently have emerged. A big data platform can be regarded as a distributed platform assembled from multiple service components, selected according to business needs and actual data processing requirements. When data is processed on a big data platform, each service component works independently, yet the components also cooperate with one another; if a service process in one component becomes abnormal, the entire data processing flow may be affected. The operation of the big data platform therefore needs to be monitored, and an alarm raised promptly when an abnormality occurs, to ensure that data is processed in a timely and accurate manner. However, the alarm systems currently on the market do not handle alarm information in a sufficiently automated way, and their alarm processing efficiency is low.
Summary of the invention
In view of the above, it is necessary to provide a platform alarm processing method, a platform alarm processing device, an electronic device, and a computer-readable storage medium that solve the above problems.
A preferred embodiment of the invention provides a platform alarm processing method, comprising: receiving at least one alarm signal from a big data platform; obtaining the error log corresponding to the alarm signal and analyzing the alarm category corresponding to the error log; determining a corresponding control instruction according to the analyzed alarm category; and sending the determined control instruction to the big data platform, the control instruction being used to control the big data platform to perform a corresponding operation.
In one possible implementation, the alarm signal includes identification information of a service process in the big data platform that has run incorrectly or encountered a resource problem, and obtaining the error log corresponding to the alarm signal includes: parsing the identification information of the service process contained in the alarm signal; generating a log acquisition request according to the parsed identification information; and sending the log acquisition request to the big data platform, the log acquisition request being used to control the big data platform to send the error log generated by the corresponding service process to the electronic device.
In one possible implementation, analyzing the alarm category corresponding to the error log includes: identifying whether the error log contains at least one preset error keyword; and, when the error log contains a preset error keyword, determining the alarm category corresponding to that keyword according to an error information table, wherein the error information table includes multiple preset error keywords and multiple alarm categories, and each alarm category corresponds to at least one preset error keyword.
In one possible implementation, the error log is marked with a flag bit, and the flag bit corresponds to an alarm level; the alarm level indicates that, when there are multiple error logs, the error log with the higher alarm level is processed first. When more than one alarm signal is received, the method further includes, before identifying whether the error log contains at least one preset error keyword: identifying the flag bit recorded in each error log and determining the alarm level of the error log according to the flag bit, wherein the error logs are then checked for preset error keywords one after another in order of alarm level.
In one possible implementation, the alarm categories include at least a first-class alarm, a second-class alarm, and a third-class alarm. The first-class alarm covers problems with the big data platform environment and resources, the second-class alarm covers problems with task scripts, and the third-class alarm covers daily tasks that were not completed. The first alarm category corresponds to a first control instruction, which controls the big data platform to rerun the task directly. The second alarm category corresponds to a second control instruction, which controls the big data platform to stop the task. The third alarm category corresponds to a third control instruction, which controls the big data platform to pause the task.
A preferred embodiment of the invention also provides a platform alarm processing device, comprising: a receiving module for receiving at least one alarm signal from a big data platform and for obtaining the error log corresponding to the alarm signal; an analysis module for analyzing the alarm category corresponding to the error log; a determining module for determining a corresponding control instruction according to the alarm category obtained by the analysis module; and a sending module for sending the control instruction determined by the determining module to the big data platform, the control instruction being used to control the big data platform to perform a corresponding operation.
A preferred embodiment of the invention also provides an electronic device including a processor and a memory. The memory stores a platform alarm processing program, and the processor executes the platform alarm processing program to implement the platform alarm processing method described above.
A preferred embodiment of the invention also provides a computer-readable storage medium. The computer-readable storage medium stores a platform alarm processing program which, when executed by a processor, implements the platform alarm processing method described above.
Embodiments of the invention can analyze alarms raised when a service component of the big data platform runs incorrectly or encounters a resource problem, classify the cause of the alarm, and execute the processing strategy that corresponds to its category, which helps improve the efficiency of alarm processing and automates it. Moreover, because alarms are handled promptly, the multiple service components in the big data platform can continue to cooperate, which prevents an abnormality in one service component from affecting the entire data processing flow.
Brief description of the drawings
Fig. 1 is a flow chart of the platform alarm processing method provided by a preferred embodiment of the invention.
Fig. 2 is a structural diagram of the platform alarm processing device provided by a preferred embodiment of the invention.
Fig. 3 is a structural diagram of the electronic device provided by a preferred embodiment of the invention.
The following detailed description further explains the invention with reference to the above drawings.
Specific embodiment
To make the objects, features, and advantages of the invention easier to understand, the invention is described in detail below with reference to the drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of this application and the features in those embodiments may be combined with one another.
Many specific details are set forth in the following description to provide a full understanding of the invention. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the scope of protection of the invention.
Unless otherwise defined, all technical and scientific terms used herein have the meaning commonly understood by those skilled in the technical field of the invention. The terms used in this specification are intended only to describe specific embodiments and are not intended to limit the invention.
The electronic device includes a memory and a processor. Those skilled in the art will understand that the schematic diagram in Fig. 3 is only an example of the electronic device and does not limit it; the electronic device may include more or fewer components than shown, combine certain components, or use different components. For example, it may also include input/output devices, network access devices, buses, and so on.
The electronic device is a device that can automatically perform numerical computation and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
Specifically, the electronic device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, mouse, remote control, touchpad, voice-control device, or similar means, such as a personal computer, tablet computer, smartphone, personal digital assistant (PDA), or Internet Protocol television (IPTV).
Fig. 1 is a flow chart of the platform alarm processing method provided by a preferred embodiment of the invention. The method is applied in an electronic device 1. Depending on requirements, the order of the steps may be changed, and some steps may be omitted or merged. The platform alarm processing method includes the following steps:
Step S11: receive at least one alarm signal from a big data platform;
The big data platform may be a distributed platform running multiple service components, and each service component may consist of one master node and at least one slave node. For example, for HDFS on the Hadoop distributed platform, the master node is the NameNode and the slave nodes are DataNodes. The master node and each slave node of a service component may each be regarded as a service process, and the operation of a service component depends on its service processes. A service component can therefore be managed by monitoring the running state of its service processes and their physical characteristics (for example, usage of resources such as CPU and memory).
A service process in a service component generates a corresponding run log while it runs, and the run log records the operating information of the service process. The run log is organized by line; each line records information such as the generation time, the log level, the service process, the class and code position of the executing program, and the specific log content. For a service process that runs incorrectly or encounters a resource problem, the corresponding run log is marked as erroneous and records the details of the error (hereinafter: the error log).
When at least one service process generates an error log, the big data platform sends the alarm signal to the electronic device. More specifically, the big data platform is connected to the electronic device over a wired or wireless network and sends the alarm signal to the electronic device over that network.
Step S12: obtain the error log corresponding to the alarm signal and analyze the alarm category corresponding to the error log;
In this embodiment, the alarm signal includes identification information of the service process that has run incorrectly or encountered a resource problem. After receiving the alarm signal, the electronic device parses the identification information of the service process contained in the alarm signal, generates a log acquisition request according to the parsed identification information, and sends the log acquisition request to the big data platform. The log acquisition request is used to control the big data platform to send the error log generated by the corresponding service process to the electronic device.
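The request-building step described above can be sketched as follows. This is only an illustration under stated assumptions: the field name `process_id`, the `type` value, and the JSON encoding are hypothetical, since the source does not define the format of the alarm signal or of the log acquisition request.

```python
import json


def build_log_request(alarm_signal):
    """Build a log acquisition request from the process identifier in an alarm signal."""
    # "process_id" is a hypothetical field name for the identification information.
    process_id = alarm_signal["process_id"]
    # The request shape is illustrative only; the source does not specify one.
    return json.dumps({"type": "log_acquisition", "process_id": process_id})


# Example alarm signal naming the faulty service process (made-up identifier).
alarm = {"process_id": "hdfs-namenode-01"}
request = build_log_request(alarm)
```

The electronic device would then send `request` to the big data platform over the existing network connection and wait for the error log in reply.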
In this embodiment, the electronic device stores an error information table in advance. The error information table includes multiple preset error keywords and multiple alarm categories, and each alarm category corresponds to at least one preset error keyword. Each alarm category gives rise to corresponding error logs, and the error keywords contained in the error logs of a given alarm category can be determined from historical experience. Specifically, the electronic device collects past error logs and collates, for each one, the cause of the error and the keywords contained in the error details recorded in the log, thereby obtaining at least one error keyword for each error cause. The error causes are then classified to determine the alarm category each belongs to, which yields at least one error keyword for each alarm category (that is, the preset error keywords).
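One way to picture how such an error information table could be assembled from past error logs is the sketch below. The history records, keyword strings, and category names are all made-up examples; the source does not list concrete keywords or say how the table is stored.

```python
from collections import defaultdict

# Hypothetical history: (keywords found in a past error log, category assigned to its cause).
HISTORY = [
    (["OutOfMemoryError"], "environment_or_resource"),
    (["DiskQuotaExceeded"], "environment_or_resource"),
    (["SyntaxError"], "task_script"),
    (["DailyTaskNotFinished"], "task_not_completed"),
]


def build_error_table(history):
    """Group error keywords by alarm category to form the error information table."""
    table = defaultdict(set)
    for keywords, category in history:
        table[category].update(keywords)
    return dict(table)


ERROR_TABLE = build_error_table(HISTORY)
```

Each category ends up with at least one keyword, which matches the constraint stated for the table.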
Therefore, in this embodiment, analyzing the alarm category corresponding to the error log in step S12 specifically includes:
Step S121: identify whether the error log contains at least one preset error keyword;
Error logs generated by different service processes have different formats (for example, a plain character string or key-value pairs). A preset rule can therefore be selected according to the format of the error log, and information extracted according to that rule, to identify whether the error log contains a preset error keyword. For example, when the error log is in key-value format, the error log is traversed and information is extracted from it according to the predefined key-value format. In general, an error keyword in a key-value pair is separated from its value by "=", so error keywords can be extracted by locating "=" and then checked against the preset error keywords.
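For the key-value case, the extraction by "=" might look like the following sketch. It assumes that pairs are separated by whitespace and that the key itself is the candidate error keyword; neither detail is fixed by the source, and the sample log line is invented.

```python
def extract_keys(error_log):
    """Return the keys of all key=value tokens found in the error log."""
    keys = []
    for line in error_log.splitlines():
        for token in line.split():
            # partition splits on the first "=" only; sep is "" when no "=" is present.
            key, sep, _value = token.partition("=")
            if sep and key:
                keys.append(key)
    return keys


def find_preset_keywords(error_log, preset_keywords):
    """Keep only the extracted keys that are preset error keywords."""
    return [key for key in extract_keys(error_log) if key in preset_keywords]


# Invented sample log line in key-value format.
sample_log = "ts=2018-09-29 OutOfMemoryError=heap host=node1"
found = find_preset_keywords(sample_log, {"OutOfMemoryError", "SyntaxError"})
```

A log in plain string format would need a different preset rule, for example a substring search for each preset keyword.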
Step S122: when the error log contains a preset error keyword, determine the alarm category corresponding to that keyword according to the error information table.
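Step S122 amounts to a table lookup. A minimal sketch, assuming the error information table is held as a mapping from category to keyword set (the category and keyword names are hypothetical):

```python
# Hypothetical error information table: alarm category -> preset error keywords.
ERROR_INFO_TABLE = {
    "environment_or_resource": {"OutOfMemoryError", "DiskQuotaExceeded"},
    "task_script": {"SyntaxError"},
    "task_not_completed": {"DailyTaskNotFinished"},
}


def classify(keywords_found, table):
    """Return the first alarm category whose preset keywords match, or None."""
    for category, presets in table.items():
        if any(keyword in presets for keyword in keywords_found):
            return category
    return None
```

Returning `None` when nothing matches is a design choice of this sketch; the source does not say what happens to an error log that contains no preset keyword.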
In this embodiment, the error log is also marked with a flag bit, and the flag bit corresponds to an alarm level. The alarm level indicates that, when there are multiple error logs, the error log with the higher alarm level is processed first. For example, the flag bits include at least a first flag bit, a second flag bit, and a third flag bit. The first flag bit indicates alarm level one, the highest level; the second flag bit indicates alarm level two, the middle level; and the third flag bit indicates alarm level three, the lowest level. Therefore, when more than one alarm signal is received, the following step is performed before step S121 identifies whether the error log contains at least one preset error keyword:
Step S120: identify the flag bit recorded in each error log and determine the alarm level of the error log according to the flag bit;
The error logs are then checked for preset error keywords one after another in order of alarm level.
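The ordering by alarm level can be sketched as a sort on the flag bit. The record layout (`id`, `flag`) is an assumption made for the example; the only thing taken from the source is that level one is the highest priority and level three the lowest.

```python
def order_by_level(error_logs):
    """Sort error logs so that flag 1 (highest level) is processed before 2 and 3."""
    # sorted() is stable, so logs with the same level keep their arrival order.
    return sorted(error_logs, key=lambda log: log["flag"])


# Invented error logs arriving in arbitrary order.
pending = [
    {"id": "a", "flag": 3},
    {"id": "b", "flag": 1},
    {"id": "c", "flag": 2},
    {"id": "d", "flag": 1},
]
processing_order = [log["id"] for log in order_by_level(pending)]
```

Step S121 would then run over the logs in `processing_order`.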
Step S13: determine the corresponding control instruction according to the analyzed alarm category.
In this embodiment, the electronic device also stores an instruction information table in advance. The instruction information table includes the multiple alarm categories and multiple control instructions, and each alarm category corresponds to one of the control instructions. The control instruction corresponding to each alarm category can therefore be determined from the instruction information table.
Step S14: send the determined control instruction to the big data platform, the control instruction being used to control the big data platform to perform a corresponding operation.
In this embodiment, the alarm categories include at least a first-class alarm, a second-class alarm, and a third-class alarm. The first-class alarm covers problems with the big data platform environment and resources, the second-class alarm covers problems with task scripts, and the third-class alarm covers daily tasks that were not completed. The first alarm category corresponds to a first control instruction, which controls the big data platform to rerun the task directly. The second alarm category corresponds to a second control instruction, which controls the big data platform to stop the task. The third alarm category corresponds to a third control instruction, which controls the big data platform to pause the task.
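The mapping from the three alarm categories to the three control instructions can be written as a small lookup table. The category names and instruction strings below are hypothetical labels chosen for the sketch; the source only states what each instruction makes the platform do.

```python
# Hypothetical instruction information table: alarm category -> control instruction.
INSTRUCTION_TABLE = {
    "environment_or_resource": "RERUN_TASK",  # first class: rerun the task directly
    "task_script": "STOP_TASK",               # second class: stop the task
    "task_not_completed": "PAUSE_TASK",       # third class: pause the task
}


def determine_instruction(category):
    """Look up the control instruction for an alarm category (None if unknown)."""
    return INSTRUCTION_TABLE.get(category)
```

Keeping the mapping in a table rather than in branching code means further categories can be added without changing the lookup logic.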
For example, when the business type of the big data platform is OLTP, the run log records, for each service component, the usage of resources such as CPU and memory together with concurrency metrics. When the business type is OLAP, the run log records the usage of resources such as disk I/O, network I/O, and memory for each service component. When a service component has a resource usage problem, it generates alarm information. In this case, the first control instruction controls the big data platform to correct the resource problem and controls the service process to rerun the task.
As another example, the big data platform can receive over a network a task script uploaded from a mobile terminal (not shown), determine the corresponding service component, and, when that service component meets the task scheduling conditions, send the task script to it so that it executes the corresponding task and returns the result. An alarm for a task script problem arises when the task script uploaded by the mobile terminal is faulty, so that the service component encounters an error when running the script and generates alarm information. In this case, the second control instruction controls the service component to stop the task.
As a further example, suppose the task executed by a service component of the big data platform is a daily task. If the service component does not complete the daily task on that day, alarm information is generated. In this case, the third control instruction controls the service component to pause the task. The service component then starts the next task cycle on the following day, and so on until the task is completed. Of course, in other embodiments, the third-class alarm may also cover weekly, monthly, or yearly tasks that were not completed.
Fig. 2 is a structural diagram of the platform alarm processing device 200 provided by a preferred embodiment of the invention. In some embodiments, the platform alarm processing device 200 runs in an electronic device. The platform alarm processing device 200 may include multiple functional modules composed of program code segments. The program code of each segment may be stored in the memory of the electronic device and executed by at least one processor to implement the alarm processing function.
In this embodiment, the platform alarm processing device 200 may be divided into multiple functional modules according to the functions it performs. As shown in Fig. 2, the platform alarm processing device 200 includes a receiving module 201, an analysis module 202, a determining module 203, and a sending module 204. A module, as the term is used in the invention, is a series of computer program segments that is stored in memory, can be executed by at least one processor, and performs a fixed function. The function of each module is described in detail in the following embodiments.
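As a rough picture of how four such modules might be wired together, consider the simplified sketch below. Everything in it is an assumption made for illustration: `FakePlatform` is a stand-in for the big data platform, the method names are invented, keyword matching is reduced to a substring test, and the log request is folded into the receiving module rather than routed through the analysis and sending modules.

```python
class FakePlatform:
    """Hypothetical stand-in for the big data platform."""

    def __init__(self, alarms, logs):
        self.alarms = list(alarms)   # pending alarm signals
        self.logs = dict(logs)       # process identifier -> error log text
        self.received = []           # control instructions sent to the platform

    def fetch_error_log(self, process_id):
        return self.logs[process_id]

    def execute(self, instruction):
        self.received.append(instruction)


class ReceivingModule:
    def receive(self, platform):
        # Receive the alarm signals, then obtain the error log for each one.
        return [platform.fetch_error_log(a["process_id"]) for a in platform.alarms]


class AnalysisModule:
    def __init__(self, error_table):
        self.error_table = error_table

    def analyze(self, error_log):
        # Simplified matching: a category applies if any of its keywords appears.
        for category, keywords in self.error_table.items():
            if any(keyword in error_log for keyword in keywords):
                return category
        return None


class DeterminingModule:
    def __init__(self, instruction_table):
        self.instruction_table = instruction_table

    def determine(self, category):
        return self.instruction_table.get(category)


class SendingModule:
    def send(self, platform, instruction):
        platform.execute(instruction)


def handle_alarms(platform, error_table, instruction_table):
    """Run receive -> analyze -> determine -> send for every pending alarm."""
    receiving, analysis = ReceivingModule(), AnalysisModule(error_table)
    determining, sending = DeterminingModule(instruction_table), SendingModule()
    for error_log in receiving.receive(platform):
        instruction = determining.determine(analysis.analyze(error_log))
        if instruction is not None:
            sending.send(platform, instruction)
```

Each class performs one fixed function, which mirrors the module definition given above.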
The receiving module 201 is used to receive at least one alarm signal from a big data platform.
The big data platform may be a distributed platform running multiple service components, and each service component may consist of one master node and at least one slave node. For example, for HDFS on the Hadoop distributed platform, the master node is the NameNode and the slave nodes are DataNodes. The master node and each slave node of a service component may each be regarded as a service process, and the operation of a service component depends on its service processes. A service component can therefore be managed by monitoring the running state of its service processes and their physical characteristics (for example, usage of resources such as CPU and memory).
A service process in a service component generates a corresponding run log while it runs, and the run log records the operating information of the service process. The run log is organized by line; each line records information such as the generation time, the log level, the service process, the class and code position of the executing program, and the specific log content. For a service process that runs incorrectly or encounters a resource problem, the corresponding run log is marked as erroneous and records the details of the error (hereinafter: the error log).
When at least one service process generates an error log, the big data platform sends the alarm signal to the electronic device. More specifically, the big data platform is connected to the electronic device over a wired or wireless network and sends the alarm signal to the electronic device over that network.
The receiving module 201 is also used to obtain the error log corresponding to the alarm signal, and the analysis module 202 is used to analyze the alarm category corresponding to the error log.
In this embodiment, the alarm signal includes identification information of the service process that has run incorrectly or encountered a resource problem. After the receiving module 201 receives the alarm signal, the analysis module 202 parses the identification information of the service process contained in the alarm signal, generates a log acquisition request according to the parsed identification information, and sends the log acquisition request to the big data platform through the sending module 204. The log acquisition request is used to control the big data platform to send the error log generated by the corresponding service process to the electronic device.
In this embodiment, the electronic device stores an error information table in advance. The error information table includes multiple preset error keywords and multiple alarm categories, and each alarm category corresponds to at least one preset error keyword. Each alarm category gives rise to corresponding error logs, and the error keywords contained in the error logs of a given alarm category can be determined from historical experience. Specifically, the analysis module 202 collects past error logs and collates, for each one, the cause of the error and the keywords contained in the error details recorded in the log, thereby obtaining at least one error keyword for each error cause. The analysis module 202 then classifies the error causes to determine the alarm category each belongs to, which yields at least one error keyword for each alarm category (that is, the preset error keywords).
Therefore, in this embodiment, the analysis module 202 identifies whether the error log contains at least one preset error keyword and, when the error log contains a preset error keyword, determines the alarm category corresponding to that keyword according to the error information table.
Error logs generated by different service processes have different formats (for example, a plain character string or key-value pairs). The analysis module 202 can therefore select a preset rule according to the format of the error log and extract information according to that rule, to identify whether the error log contains a preset error keyword. For example, when the error log is in key-value format, the error log is traversed and information is extracted from it according to the predefined key-value format. In general, an error keyword in a key-value pair is separated from its value by "=", so the analysis module 202 can extract error keywords by locating "=" and then check them against the preset error keywords.
In this embodiment, the error log is also marked with a flag bit, and the flag bit corresponds to an alarm level. The alarm level indicates that, when there are multiple error logs, the error log with the higher alarm level is processed first. For example, the flag bits include at least a first flag bit, a second flag bit, and a third flag bit. The first flag bit indicates alarm level one, the highest level; the second flag bit indicates alarm level two, the middle level; and the third flag bit indicates alarm level three, the lowest level. Therefore, when more than one alarm signal is received, the analysis module 202, before identifying whether the error log contains at least one preset error keyword, also identifies the flag bit recorded in each error log and determines the alarm level of the error log according to the flag bit.
The error logs are then checked for preset error keywords one after another in order of alarm level.
The determining module 203 is used to determine the corresponding control instruction according to the alarm category obtained by the analysis module 202.
In this embodiment, the electronic device also stores an instruction information table in advance. The instruction information table includes the multiple alarm categories and multiple control instructions, and each alarm category corresponds to one of the control instructions. The determining module 203 can therefore determine the control instruction corresponding to each alarm category from the instruction information table.
The sending module 204 sends the control instruction determined by the determining module 203 to the big data platform. The control instruction is used to control the big data platform to perform the corresponding operation.
In this embodiment, the alarm categories include at least a first-class alarm, a second-class alarm, and a third-class alarm. The first-class alarm includes alarms for environment and resource problems of the big data platform, the second-class alarm includes alarms for task script problems, and the third-class alarm includes alarms for day tasks that were not completed. The first-class alarm category corresponds to a first control instruction, which controls the big data platform to rerun the task directly. The second-class alarm category corresponds to a second control instruction, which controls the big data platform to stop the task. The third-class alarm category corresponds to a third control instruction, which controls the big data platform to pause the task.
For example, when the service type of the big data platform is OLTP, the running log records the usage of resources such as CPU, concurrency metrics, and memory for each service component. When the service type of the big data platform is OLAP, the running log records the usage of resources such as disk I/O, network I/O, and memory for each service component. When a resource usage problem occurs in a service component, the service component generates alarm information. In that case, the first control instruction controls the big data platform to correct the resource problem and controls the service process to rerun the task.
As another example, the big data platform can receive, over a network, a task script uploaded from a mobile terminal (not shown) and determine the corresponding service component. When the service component satisfies the task scheduling condition, the task script is sent to the service component so that the service component executes the corresponding task and returns the execution result. An alarm for a task script problem arises when the task script uploaded by the mobile terminal is faulty, so that an error occurs when the service component runs the task script and alarm information is generated. In that case, the second control instruction controls the service component to stop the task.
As a further example, the task executed by a service component of the big data platform is a day task. If the service component does not complete the day task on that day, alarm information is generated. In that case, the third control instruction controls the service component to pause the task. The service component then starts the next round of the task cycle on the following day, until the task is completed. Of course, in other embodiments the third-class alarm may also include alarms for weekly, monthly, or yearly tasks that were not completed.
Fig. 3 is a schematic structural diagram of the electronic device 1 that implements the platform alert processing method in a preferred embodiment of the present invention. The electronic device 1 includes a memory 101, a processor 102, and a computer program 103, such as a platform alert processing program, that is stored in the memory 101 and can run on the processor 102.
When executing the computer program 103, the processor 102 implements the steps of the platform alert processing method in the above embodiment:
Step S11: receive at least one alarm signal from the big data platform;
The big data platform may be a distributed platform that runs multiple service components, and each service component may consist of one master node and at least one slave node. For example, for HDFS on the Hadoop distributed platform, the master node is the NameNode and the slave nodes are DataNodes. The master node and each slave node of a service component can each serve as a service process, and the operation of a service component depends on its service processes. The service component can therefore be managed by monitoring the running state of its service processes and their physical characteristics (such as the usage of CPU, memory, and other resources).
A service process in a service component generates a corresponding running log while it runs, and the running log records the running information of the service process. The running log is organized line by line; each line records information such as the generation time, the log level, the service process, the class and code position of the executing program, and the specific log content. For a service process that runs into an error or a resource problem, the corresponding running log is marked as an error and records the details of the error (hereinafter: the error log).
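As a rough illustration of the line-oriented running log described above, the following sketch parses one log line into the fields the text lists. The space-separated layout, the field order, the `ERROR` level name, and the names `LogLine`, `parse_log_line`, and `is_error_log` are assumptions made for this example; the source does not specify a concrete log format.

```python
from dataclasses import dataclass


@dataclass
class LogLine:
    time: str       # generation time
    level: str      # log level
    process: str    # service process that wrote the line
    cls: str        # class of the executing program
    position: str   # code position, e.g. "File.java:42"
    content: str    # specific log content


def parse_log_line(line: str) -> LogLine:
    # Assumed (hypothetical) layout:
    # "<date> <clock> <level> <process> <class> <position> <content...>"
    date, clock, level, process, cls, position, content = line.split(" ", 6)
    return LogLine(f"{date} {clock}", level, process, cls, position, content)


def is_error_log(entry: LogLine) -> bool:
    # Assumption: an error-marked line carries the level "ERROR".
    return entry.level == "ERROR"
```

Splitting with a maximum of six cuts keeps any spaces inside the log content intact, so the free-text error details survive parsing.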
When at least one service process generates an error log, the big data platform sends the alarm signal to the electronic device. More specifically, the big data platform is connected to the electronic device over a wired or wireless network and sends the alarm signal to the electronic device over that network.
Step S12: obtain the error log corresponding to the alarm signal, and analyze the alarm category corresponding to the error log;
In this embodiment, the alarm signal includes identification information of the service process that ran into an error or a resource problem. After receiving the alarm signal, the electronic device parses the identification information of the service process contained in the alarm signal, generates a log acquisition request from that identification information, and sends the log acquisition request to the big data platform. The log acquisition request is used to control the big data platform to send the error log generated by the corresponding service process to the electronic device.
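The exchange described above, from alarm signal to log acquisition request, might look like the following minimal sketch. The JSON encoding, the field names `process_id` and `action`, and the function name `build_log_request` are all hypothetical; the source only states that the signal carries identification information of the service process.

```python
import json


def build_log_request(alarm_signal: str) -> str:
    # Parse the identification information carried by the alarm signal.
    signal = json.loads(alarm_signal)
    process_id = signal["process_id"]  # hypothetical field name
    # Ask the platform for the error log of exactly that service process.
    return json.dumps({"action": "get_error_log", "process_id": process_id})
```

The request carries only the process identifier, so the platform decides which log file to return.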
In this embodiment, the electronic device stores an error information table in advance. The error information table includes multiple preset error keywords and multiple alarm categories, and each alarm category corresponds to at least one preset error keyword. Each alarm category gives rise to corresponding error logs, and the error keywords contained in the error logs of a given alarm category can be obtained from historical experience. Specifically, the electronic device collects past error logs and, for each one, collates the error cause and the keywords contained in the error details recorded in the log, which yields at least one error keyword for each error cause. The error causes are then classified to determine the alarm category each belongs to, which yields at least one error keyword for each alarm category (that is, the preset error keywords).
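One way to picture how the error information table is assembled from past error logs is the sketch below. It assumes the keywords and the alarm category of each historical log have already been worked out by hand; the function name `build_error_table` and the sample keywords are illustrative only.

```python
from collections import defaultdict


def build_error_table(history):
    """Group keywords from past error logs under their alarm category.

    history: iterable of (keywords, category) pairs, one per past error log,
    where the category was already assigned from the error cause.
    """
    table = defaultdict(set)
    for keywords, category in history:
        table[category].update(keywords)
    return dict(table)
```

Using a set per category means a keyword seen in several past logs is stored once.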
Accordingly, in this embodiment, analyzing the alarm category corresponding to the error log in step S12 specifically includes:
Step S121: identify whether the error log contains at least one preset error keyword;
Different service processes generate error logs in different formats (for example, as a character string or in key-value format). A corresponding preset rule can therefore be selected based on the format of the error log, and information is extracted according to that rule to identify whether the error log contains a preset error keyword. For example, when the error log is in key-value format, the error log is traversed and information is extracted from it according to the predefined key-value format. In general, an error keyword in a key-value pair is separated from its value by "=", so the error keyword can be extracted by identifying the "=" and then checked against the preset error keywords.
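The key-value extraction rule described above can be sketched as follows. This is a minimal example under the assumption that pairs are whitespace-separated `key=value` tokens; the function names and the sample keywords are hypothetical.

```python
from typing import List, Set


def extract_keys(error_log: str) -> List[str]:
    # Assumption: pairs are whitespace-separated tokens of the form key=value.
    keys = []
    for token in error_log.split():
        key, sep, _value = token.partition("=")
        if sep and key:  # keep only tokens that really contain "="
            keys.append(key)
    return keys


def find_preset_keywords(error_log: str, preset: Set[str]) -> List[str]:
    # Keep the extracted keys that appear among the preset error keywords.
    return [key for key in extract_keys(error_log) if key in preset]
```

A string-format error log would need a different rule, for instance a substring search for each preset keyword.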
Step S122: when the error log contains a preset error keyword, determine the alarm category corresponding to that preset error keyword according to the error information table.
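The lookup in step S122 can be illustrated with an inverted index from keyword to category. The table layout (`{category: {keywords}}`), the function names, and the choice to return the category of the first matching keyword are assumptions of this sketch, not details given in the source.

```python
def invert_error_table(error_table):
    # Turn {category: {keywords}} into {keyword: category} for direct lookup.
    return {kw: cat for cat, kws in error_table.items() for kw in kws}


def alarm_category(found_keywords, keyword_index):
    # Return the category of the first matching preset keyword, else None.
    for keyword in found_keywords:
        if keyword in keyword_index:
            return keyword_index[keyword]
    return None
```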
In this embodiment, each error log is also marked with a flag bit that corresponds to an alarm level. The alarm level indicates that, when there are multiple error logs, the error log with the higher alarm level is processed first. For example, the flag bits include at least a first flag bit, a second flag bit, and a third flag bit. The first flag bit indicates alarm level one, the highest level; the second flag bit indicates alarm level two, the middle level; the third flag bit indicates alarm level three, the lowest level. Accordingly, when more than one alarm signal is received, the method further includes, before step S121 identifies whether the error log contains at least one preset error keyword:
Step S120: identify the flag bit recorded in each error log, and determine the alarm level of the error log from the flag bit;
The error logs are then checked for preset error keywords one after another in order of alarm level.
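The level-based ordering can be sketched as a stable sort. The example assumes that each error log is a dictionary whose hypothetical `flag` field holds the level number directly (1 for the highest level, 3 for the lowest); the source does not say how the flag bit is encoded.

```python
def order_by_alarm_level(error_logs):
    # Level 1 is the highest priority, so an ascending sort puts it first.
    # sorted() is stable: logs with equal levels keep their arrival order.
    return sorted(error_logs, key=lambda log: log["flag"])
```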
Step S13: determine the corresponding control instruction according to the alarm category obtained by the analysis.
In this embodiment, an instruction information table is also stored in advance in the electronic device. The instruction information table includes the multiple alarm categories and multiple control instructions, and each alarm category corresponds to one of the control instructions. The control instruction corresponding to each alarm category can therefore be determined from the instruction information table.
Step S14: send the determined control instruction to the big data platform, the control instruction being used to control the big data platform to perform the corresponding operation.
In this embodiment, the alarm categories include at least a first-class alarm, a second-class alarm, and a third-class alarm. The first-class alarm includes alarms for environment and resource problems of the big data platform, the second-class alarm includes alarms for task script problems, and the third-class alarm includes alarms for day tasks that were not completed. The first-class alarm category corresponds to a first control instruction, which controls the big data platform to rerun the task directly. The second-class alarm category corresponds to a second control instruction, which controls the big data platform to stop the task. The third-class alarm category corresponds to a third control instruction, which controls the big data platform to pause the task.
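The instruction information table for these three categories could be represented as a simple mapping, as in the sketch below. The enum member values and the instruction strings `rerun_task`, `stop_task`, and `pause_task` are placeholder names chosen for this example; the source only fixes which operation belongs to which category.

```python
from enum import Enum


class AlarmCategory(Enum):
    FIRST = "environment_or_resource"   # platform environment and resource problems
    SECOND = "task_script"              # task script problems
    THIRD = "day_task_incomplete"       # day task not completed


# Instruction information table: one control instruction per alarm category.
INSTRUCTION_TABLE = {
    AlarmCategory.FIRST: "rerun_task",
    AlarmCategory.SECOND: "stop_task",
    AlarmCategory.THIRD: "pause_task",
}


def control_instruction(category: AlarmCategory) -> str:
    return INSTRUCTION_TABLE[category]
```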
For example, when the service type of the big data platform is OLTP, the running log records the usage of resources such as CPU, concurrency metrics, and memory for each service component. When the service type of the big data platform is OLAP, the running log records the usage of resources such as disk I/O, network I/O, and memory for each service component. When a resource usage problem occurs in a service component, the service component generates alarm information. In that case, the first control instruction controls the big data platform to correct the resource problem and controls the service process to rerun the task.
As another example, the big data platform can receive, over a network, a task script uploaded from a mobile terminal (not shown) and determine the corresponding service component. When the service component satisfies the task scheduling condition, the task script is sent to the service component so that the service component executes the corresponding task and returns the execution result. An alarm for a task script problem arises when the task script uploaded by the mobile terminal is faulty, so that an error occurs when the service component runs the task script and alarm information is generated. In that case, the second control instruction controls the service component to stop the task.
As a further example, the task executed by a service component of the big data platform is a day task. If the service component does not complete the day task on that day, alarm information is generated. In that case, the third control instruction controls the service component to pause the task. The service component then starts the next round of the task cycle on the following day, until the task is completed. Of course, in other embodiments the third-class alarm may also include alarms for weekly, monthly, or yearly tasks that were not completed.
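Taken together, steps S11 to S14 can be pictured as one small handler. The sketch below is only an illustration under several assumptions: the alarm signal is a dictionary with a hypothetical `process_id` field, the error log is in key-value format, and `fetch_log` and `send` are caller-supplied stand-ins for the network exchange with the big data platform. The table contents are sample values, not data from the source.

```python
# Sample tables; keywords and instruction names are illustrative only.
ERROR_TABLE = {
    "first": {"OutOfMemory", "DiskFull"},
    "second": {"SyntaxError"},
    "third": {"DayTaskIncomplete"},
}
INSTRUCTIONS = {"first": "rerun_task", "second": "stop_task", "third": "pause_task"}


def handle_alarm(signal, fetch_log, send):
    # S11 has already happened: `signal` is the received alarm signal.
    process_id = signal["process_id"]
    # S12: obtain the error log and extract the keys of its key=value pairs.
    log = fetch_log(process_id)
    keys = {tok.partition("=")[0] for tok in log.split() if "=" in tok}
    for category, keywords in ERROR_TABLE.items():
        if keys & keywords:
            # S13: look up the control instruction for the category.
            instruction = INSTRUCTIONS[category]
            # S14: send the instruction back to the platform.
            send(process_id, instruction)
            return instruction
    return None  # no preset error keyword found; nothing is sent
```

Passing `fetch_log` and `send` as parameters keeps the decision logic testable without a running platform.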
Alternatively, when executing the computer program 103, the processor 102 implements the functions of the modules/units in the above platform alert processing device embodiment, such as the modules 201-204 in Fig. 2.
The embodiments of the present invention can analyze alarms raised when a service component of the big data platform runs into an error or a resource problem, classify the alarm cause, and execute the processing strategy corresponding to its category. This helps improve the efficiency of alarm processing and automates alarm processing. Furthermore, because alarms can be handled promptly, the multiple service components in the big data platform can keep working together, and an abnormality in one service component is prevented from affecting the entire data processing procedure.
Illustratively, the computer program 103 can be divided into one or more modules/units, which are stored in the memory 101 and executed by the processor 102 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe the execution of the computer program 103 in the electronic device 1. For example, the computer program 103 can be divided into the receiving module 201, the analysis module 202, the determining module 203, and the sending module 204 in Fig. 2.
The electronic device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. Those skilled in the art will understand that the schematic diagram is only an example of the electronic device 1 and does not limit it; the electronic device 1 may include more or fewer components than shown, combine certain components, or use different components. For example, the electronic device 1 may also include input/output devices, network access devices, buses, and the like.
The processor 102 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor 102 may be any conventional processor. The processor 102 is the control center of the electronic device 1 and connects the various parts of the entire electronic device 1 through various interfaces and lines.
The memory 101 can be used to store the computer program 103 and/or the modules/units. The processor 102 implements the various functions of the electronic device 1 by running or executing the computer program and/or modules/units stored in the memory 101 and by calling the data stored in the memory 101. The memory 101 may mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area can store data created through use of the electronic device 1 (such as audio data or a phone book). In addition, the memory 101 may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
If the integrated modules/units of the electronic device 1 are implemented as software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium. On this understanding, all or part of the processes in the methods of the above embodiments can also be completed by a computer program instructing the relevant hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor it implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content of the computer-readable medium can be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
In the several embodiments provided by the present invention, it should be understood that the disclosed electronic device and method can be implemented in other ways. For example, the electronic device embodiment described above is only schematic; the division into units is only a division by logical function, and other divisions are possible in an actual implementation.
In addition, the functional units in the embodiments of the present invention may be integrated in the same processing unit, each unit may exist physically on its own, or two or more units may be integrated in the same unit. The integrated unit can be implemented in hardware or in hardware plus software functional modules.
It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be realized in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in every respect as illustrative and not restrictive; the scope of the invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced in the invention. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or electronic devices recited in an electronic device claim may also be implemented by the same unit or electronic device through software or hardware. Words such as "first" and "second" denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the invention can be modified or replaced with equivalents without departing from the spirit and scope of the technical solution.
Claims (9)
1. A platform alert processing method, characterized by comprising:
receiving at least one alarm signal from a big data platform;
obtaining the error log corresponding to the alarm signal, and analyzing the alarm category corresponding to the error log;
determining the corresponding control instruction according to the alarm category obtained by the analysis; and
sending the determined control instruction to the big data platform, the control instruction being used to control the big data platform to perform the corresponding operation.
2. The platform alert processing method of claim 1, characterized in that the alarm signal includes identification information of the service process in the big data platform that ran into an error or a resource problem, and obtaining the error log corresponding to the alarm signal comprises:
parsing the identification information of the service process contained in the alarm signal;
generating a log acquisition request according to the identification information obtained by the parsing; and
sending the log acquisition request to the big data platform, the log acquisition request being used to control the big data platform to send the error log generated by the corresponding service process to the electronic device.
3. The platform alert processing method of claim 1, characterized in that analyzing the alarm category corresponding to the error log comprises:
identifying whether the error log contains at least one preset error keyword; and
when the error log contains the preset error keyword, determining the alarm category corresponding to the preset error keyword according to an error information table, wherein the error information table includes multiple preset error keywords and multiple alarm categories, and each alarm category corresponds to at least one preset error keyword.
4. The platform alert processing method of claim 3, characterized in that the error log is marked with a flag bit, the flag bit corresponds to an alarm level, and the alarm level indicates that, when there are multiple error logs, the error log with the higher alarm level is processed first; when more than one alarm signal is received, the method further comprises, before identifying whether the error log contains at least one preset error keyword:
identifying the flag bit recorded in each error log, and determining the alarm level of the error log according to the flag bit, wherein identifying whether the error log contains a preset error keyword is performed on the error logs one after another in order of alarm level.
5. The platform alert processing method of claim 3, characterized in that identifying whether the error log contains at least one preset error keyword comprises selecting a corresponding preset rule based on the format of the error log, and extracting information according to the preset rule to identify whether the error log contains a preset error keyword.
6. The platform alert processing method of claim 1, characterized in that the alarm categories include at least a first-class alarm, a second-class alarm, and a third-class alarm; the first-class alarm includes alarms for environment and resource problems of the big data platform, the second-class alarm includes alarms for task script problems, and the third-class alarm includes alarms for day tasks that were not completed; the first-class alarm category corresponds to a first control instruction, which is used to control the big data platform to rerun the task directly; the second-class alarm category corresponds to a second control instruction, which is used to control the big data platform to stop the task; and the third-class alarm category corresponds to a third control instruction, which is used to control the big data platform to pause the task.
7. A platform alert processing device, characterized by comprising:
a receiving module, configured to receive at least one alarm signal from a big data platform, and further configured to obtain the error log corresponding to the alarm signal;
an analysis module, configured to analyze the alarm category corresponding to the error log;
a determining module, configured to determine the corresponding control instruction according to the alarm category obtained by the analysis module; and
a sending module, configured to send the control instruction determined by the determining module to the big data platform, the control instruction being used to control the big data platform to perform the corresponding operation.
8. An electronic device, comprising a processor and a memory, characterized in that a platform alert processing program is stored in the memory, and the processor is configured to execute the platform alert processing program to implement the platform alert processing method of any one of claims 1 to 6.
9. A computer-readable storage medium, characterized in that a platform alert processing program is stored on the computer-readable storage medium, and the platform alert processing program, when executed by a processor, implements the platform alert processing method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811151626.7A CN109542737A (en) | 2018-09-29 | 2018-09-29 | Platform alert processing method, device, electronic device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109542737A true CN109542737A (en) | 2019-03-29 |
Family
ID=65843669
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811151626.7A Pending CN109542737A (en) | 2018-09-29 | 2018-09-29 | Platform alert processing method, device, electronic device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109542737A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102981943A (en) * | 2012-10-29 | 2013-03-20 | 新浪技术(中国)有限公司 | Method and system for monitoring application logs |
US20140149525A1 (en) * | 2012-11-28 | 2014-05-29 | Electronics And Telecommunications Research Institute | Method and apparatus for transmitting and receiving instant message |
CN105337765A (en) * | 2015-10-10 | 2016-02-17 | 上海新炬网络信息技术有限公司 | Distributed hadoop cluster fault automatic diagnosis and restoration system |
CN105550103A (en) * | 2015-12-03 | 2016-05-04 | 泰华智慧产业集团股份有限公司 | Custom test script based automated testing method |
CN107123314A (en) * | 2017-04-24 | 2017-09-01 | 努比亚技术有限公司 | A kind of method for realizing alarming processing, system, terminal and equipment |
CN107612740A (en) * | 2017-09-30 | 2018-01-19 | 武汉光谷信息技术股份有限公司 | A kind of daily record monitoring system and method under distributed environment |
CN107729206A (en) * | 2017-09-04 | 2018-02-23 | 上海斐讯数据通信技术有限公司 | Real-time analysis method, system and the computer-processing equipment of alarm log |
-
2018
- 2018-09-29 CN CN201811151626.7A patent/CN109542737A/en active Pending
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110209644A (en) * | 2019-05-21 | 2019-09-06 | 上海易点时空网络有限公司 | The method, apparatus and system of log management |
CN111124859A (en) * | 2019-12-13 | 2020-05-08 | 北京浪潮数据技术有限公司 | Log processing method, device, equipment and storage medium |
CN111198850A (en) * | 2019-12-14 | 2020-05-26 | 深圳猛犸电动科技有限公司 | Log message processing method and device and Internet of things platform |
CN112882920A (en) * | 2021-04-29 | 2021-06-01 | 云账户技术(天津)有限公司 | Alarm policy verification method and device, electronic equipment and readable storage medium |
CN112882920B (en) * | 2021-04-29 | 2021-06-29 | 云账户技术(天津)有限公司 | Alarm policy verification method and device, electronic equipment and readable storage medium |
CN113485886A (en) * | 2021-06-25 | 2021-10-08 | 青岛海尔科技有限公司 | Alarm log processing method and device, storage medium and electronic device |
CN113485886B (en) * | 2021-06-25 | 2023-07-21 | 青岛海尔科技有限公司 | Alarm log processing method and device, storage medium and electronic device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11165806B2 (en) | | Anomaly detection using cognitive computing |
CN109542737A (en) | | Platform alert processing method, device, electronic device and storage medium |
US11544721B2 (en) | | Supporting automation of customer service |
US11132358B2 (en) | | Candidate name generation |
CN110347888B (en) | | Order data processing method and device and storage medium |
CN113626241B (en) | | Abnormality processing method, device, equipment and storage medium for application program |
CN114244611B (en) | | Abnormal attack detection method, device, equipment and storage medium |
CN111582341A (en) | | User abnormal operation prediction method and device |
CN113515434A (en) | | Anomaly classification method, device, equipment and storage medium |
US11568344B2 (en) | | Systems and methods for automated pattern detection in service tickets |
CN114580933A (en) | | Event distribution method and device, storage medium and electronic equipment |
US11783221B2 (en) | | Data exposure for transparency in artificial intelligence |
CN112541447A (en) | | Machine model updating method, device, medium and equipment |
CN112801145A (en) | | Safety monitoring method and device, computer equipment and storage medium |
CN109558222A (en) | | Batch service process monitoring method, device, computer and readable storage medium |
US20210092159A1 (en) | | System for the prioritization and dynamic presentation of digital content |
CN114330720A (en) | | Knowledge graph construction method and device for cloud computing and storage medium |
CN112148461A (en) | | Application scheduling method and device |
CN110806961A (en) | | Intelligent early warning method and system and recommendation system |
CN113537519A (en) | | Method and device for identifying abnormal equipment |
US20190238400A1 (en) | | Network element operational status ranking |
CN113434404B (en) | | Automatic service verification method and device for verifying reliability of disaster recovery system |
CN115858325B (en) | | Project log adjusting method, device, equipment and storage medium |
US11551006B2 (en) | | Removal of personality signatures |
CN117215747A (en) | | Client-oriented abnormal operation processing method, storage medium and related equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190329 |