CN107273269B - Log analysis method and device - Google Patents

Publication number
CN107273269B
Authority
CN
China
Prior art keywords
analysis
log
exploratory
field
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710440027.6A
Other languages
Chinese (zh)
Other versions
CN107273269A (en
Inventor
许飞
闫绍华
李振博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd
Priority to CN201710440027.6A
Publication of CN107273269A
Application granted
Publication of CN107273269B
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/3065: Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072: Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/02: Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227: Filtering policies
    • H04L63/0236: Filtering by address, protocol, port number or service, e.g. IP-address or URL
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/02: Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227: Filtering policies
    • H04L63/0245: Filtering by information in the payload

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a log analysis method and a log analysis device. The method comprises: determining the type of a log according to a preset log classification rule; generating an exploratory analysis model according to the log analysis template corresponding to the type; performing exploratory analysis on the log through the exploratory analysis model, and correcting the exploratory analysis model according to the exploratory analysis result to obtain a corrected log analysis model; and analyzing the log through the corrected log analysis model. This scheme enables automatic analysis of logs and improves log analysis efficiency.

Description

Log analysis method and device
Technical Field
The invention relates to the technical field of communication, in particular to a log analysis method and a log analysis device.
Background
A log is a file that records operations or events, and a variety of information can be obtained by analyzing it. At present, logs are usually analyzed one by one using a manual method: system maintenance personnel or developers must write analysis code according to the analysis requirements in order to analyze the logs and obtain the corresponding results.
However, in the process of implementing the invention, the inventors found that the prior art has at least the following defects: because the original logs in actual services are of many types and the analysis requirements are complex and changeable, this approach requires dedicated code to be written for each field of each log. Automatic log analysis therefore cannot be achieved, and when the number of logs is large, the analysis efficiency is very low.
Disclosure of Invention
In view of the above, the present invention has been made to provide a log parsing method and apparatus that overcomes or at least partially solves the above problems.
According to an aspect of the present invention, there is provided a log parsing method, including: determining the type of the log according to a preset log classification rule; generating an exploratory analysis model according to the log analysis template corresponding to the type; performing exploratory analysis on the log through the exploratory analysis model, and correcting the exploratory analysis model according to the exploratory analysis result to obtain a corrected log analysis model; and analyzing the log through the corrected log analysis model.
According to another aspect of the present invention, there is provided a log parsing apparatus including: a type determining module adapted to determine the type of the log according to a preset log classification rule; a generating module adapted to generate an exploratory analysis model according to the log analysis template corresponding to the type; an exploratory analysis module adapted to perform exploratory analysis on the log through the exploratory analysis model; a correction module adapted to correct the exploratory analysis model according to the exploratory analysis result to obtain a corrected log analysis model; and an analysis module adapted to analyze the log through the corrected log analysis model.
According to still another aspect of the present invention, there is provided a terminal including a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus; the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the log parsing method.
According to still another aspect of the present invention, there is provided a computer storage medium in which at least one executable instruction is stored, and the executable instruction causes a processor to perform the operations corresponding to the log parsing method.
The log parsing method and device provided by the invention determine the type of the log according to a preset log classification rule; generate an exploratory analysis model according to the log analysis template corresponding to the type; perform exploratory analysis on the log through the exploratory analysis model and correct the model according to the exploratory analysis result to obtain a corrected log analysis model; and finally analyze the log through the corrected log analysis model. This scheme enables intelligent analysis of logs and improves log analysis efficiency.
The above is only an overview of the technical solutions of the present invention. So that the technical means of the invention can be understood more clearly, and so that the above and other objects, features, and advantages become more apparent, embodiments of the invention are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flowchart illustrating a log parsing method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a log parsing method according to another embodiment of the present invention;
FIG. 3 is a block diagram illustrating the structure of a log parsing apparatus according to an embodiment of the present invention;
FIG. 4 is a block diagram illustrating the structure of a log parsing apparatus according to another embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Fig. 1 shows a flowchart of a log parsing method according to an embodiment of the present invention. As shown in fig. 1, the method includes:
Step S110, determining the type of the log according to a preset log classification rule.
The logs are of various types, and may be classified according to a preset classification rule, for example, the logs may be classified according to a log format, such as nginx default format logs, JSON format logs, protobuf logs, and the like. The embodiment does not limit the specific log classification rule, and a person skilled in the art can formulate a corresponding log classification rule according to the actual service.
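As a minimal sketch of a format-based classification rule, a single log line could be classified as follows. The function name `classify_log`, the type labels, and the regular expression for the nginx "combined" layout are illustrative assumptions and are not taken from the patent.

```python
import json
import re

# Assumed pattern for the nginx default ("combined") access-log layout:
# remote_addr - remote_user [time_local] "request" status bytes "referer" "user_agent"
NGINX_COMBINED = re.compile(
    r'^\S+ - \S+ \[[^\]]+\] "[^"]*" \d{3} \d+ "[^"]*" "[^"]*"$'
)


def classify_log(line: str) -> str:
    """Return a hypothetical type label for one log line."""
    text = line.strip()
    if text.startswith("{"):
        try:
            json.loads(text)
            return "json"
        except ValueError:
            pass  # looks like JSON but is not; fall through to the other rules
    if NGINX_COMBINED.match(text):
        return "nginx_default"
    return "unknown"
```

Binary formats such as protobuf would need a different rule (for example, one keyed on the source of the log), which this sketch does not attempt.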
Step S120, generating an exploratory analysis model according to the log analysis template corresponding to the type.
Logs of the same type share essentially the same characteristics, such as how parameters are expressed and how the log is formatted. Taking an Apache log as an example, the log format is basically fixed, each field carries the same kind of information, and the value range of each field is basically the same; for example, the c-IP field holds the IP address of the client, and its data type is string. Therefore, for logs of the same type, a general analysis method can be formulated from their shared characteristics, which yields the corresponding log analysis template.
After the type of the log is determined, an initial analysis model, namely the exploratory analysis model, can be constructed from the log analysis template corresponding to that type. In practice, the exploratory analysis model can be obtained by suitably modifying the log analysis template and loading it into the running file.
Step S130, performing exploratory analysis on the log through the exploratory analysis model, and correcting the exploratory analysis model according to the exploratory analysis result to obtain a corrected log analysis model.
The log is analyzed exploratorily using the exploratory analysis model generated in step S120. If the exploratory analysis result meets the preset analysis result requirement, the exploratory analysis model is used directly as the corrected log analysis model and step S140 is executed; if it does not, the exploratory analysis model is corrected. For example, the exploratory analysis model may be corrected if the analysis fails, and/or the analysis time exceeds a preset threshold without producing a result, and/or the analysis result falls outside a preset result range (e.g., the analysis result of the IP field is "1332,312,133,113"). The specific correction method can be set by a person skilled in the art.
Optionally, when the analysis result of one or several fields in the exploratory analysis result does not meet the preset analysis result requirement, only the sub-method of the analysis model that corresponds to those fields may be corrected. For example, if the result for a field does not meet the requirement when the field is analyzed as a string, the field can instead be analyzed as an integer, and the analysis rule for that field in the exploratory analysis model is corrected accordingly, yielding the corrected log analysis model.
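The three conditions that trigger a correction (failure, timeout, out-of-range result) can be sketched for an IP field as below. The function names, the `None`-means-failure convention, and the one-second default threshold are assumptions made for illustration; the patent leaves the concrete checks to the implementer.

```python
import ipaddress


def ip_result_in_range(value: str) -> bool:
    """True if the parsed value is a syntactically valid IPv4 address."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def needs_correction(parsed, elapsed_s: float, timeout_s: float = 1.0) -> bool:
    """Decide whether the exploratory model should be corrected for this field.

    Assumed conventions: `parsed` is None when parsing failed, and any run
    longer than `timeout_s` is treated as a timeout.
    """
    if parsed is None:          # the analysis failed
        return True
    if elapsed_s > timeout_s:   # the analysis exceeded the assumed threshold
        return True
    return not ip_result_in_range(parsed)  # result outside the preset range
```

With these definitions, a result such as "1332,312,133,113" is flagged for correction, while "192.168.1.10" parsed within the threshold is accepted.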
Step S140, analyzing the log through the corrected log analysis model.
The log is analyzed by the log analysis model finally obtained through correction in step S130, yielding the analysis result. Optionally, the log analysis model may continue to be corrected according to the analysis results of the log.
Therefore, in the log analysis method provided by this embodiment, logs are classified, and because logs of the same type are analyzed in basically the same way, a corresponding analysis template is set for each type of log and an exploratory analysis model, which is a prototype of the log analysis model, is generated from that template. The exploratory analysis model can then be corrected according to its analysis results on the logs, yielding a more accurate log analysis model with which subsequent logs are analyzed. This overcomes the defect that all logs must be analyzed manually one by one, achieves automatic log analysis, and improves log analysis efficiency.
Fig. 2 is a flowchart illustrating a log parsing method according to another embodiment of the present invention. As shown in fig. 2, the method includes:
Step S210, receiving analysis configuration parameters input by a user through a preset analysis configuration entry.
The user can input the analysis configuration parameters through a preset analysis configuration entry. The analysis configuration parameters include the names of the fields to be analyzed, the field types, and/or the field value ranges. For example, the analysis configuration parameter may be "viewer-IP; char; 0.0.0.0-255.255.255.255", where "viewer-IP" is the field name corresponding to the IP address, "char" indicates that the field type is string, and "0.0.0.0-255.255.255.255" is the value range of the field (the range of legal IP addresses).
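A configuration string of the form "name; type; low-high" could be turned into a structured record roughly as follows. The `FieldConfig` class and `parse_field_config` function are hypothetical names, and the sketch assumes the lower bound of the IP range is written as 0.0.0.0.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FieldConfig:
    """One user-supplied analysis configuration entry (hypothetical structure)."""
    name: str
    field_type: Optional[str] = None
    value_range: Optional[Tuple[str, str]] = None


def parse_field_config(raw: str) -> FieldConfig:
    """Split 'name; type; low-high' into its parts; type and range are optional."""
    parts = [p.strip() for p in raw.split(";")]
    name = parts[0]
    field_type = parts[1] if len(parts) > 1 and parts[1] else None
    value_range = None
    if len(parts) > 2 and parts[2]:
        # Only the range part is split on '-', so a hyphen in the name is safe.
        low, _, high = parts[2].partition("-")
        value_range = (low.strip(), high.strip())
    return FieldConfig(name, field_type, value_range)
```

For example, `parse_field_config("viewer-IP; char; 0.0.0.0-255.255.255.255")` yields a record whose name is "viewer-IP", whose type is "char", and whose range is the pair ("0.0.0.0", "255.255.255.255").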
The analysis configuration parameters in this embodiment are not limited to the field names to be analyzed, field types, and/or field value ranges; they may also include analysis indexes and/or filter conditions for selecting the logs to be analyzed. For example, the analysis index may be the PV (Page View, the number of page views) of a certain page; or, if the logs for January need to be analyzed, January can be chosen in the corresponding time filter condition, so that only logs recorded in January are read during the subsequent analysis. In addition, the analysis configuration parameters may include the path of the original logs, so that the user can specify where the logs to be analyzed are stored, which improves the flexibility of log analysis.
Step S220, setting log classification rules, and respectively setting log analysis templates corresponding to various types of logs according to the log classification rules.
The log classification rule comprises: classifying according to preset log classification features, and/or classifying the logs through a machine learning algorithm.
Specifically, in one classification manner, the logs may be classified according to preset log classification features; for example, by log format, such as nginx default-format logs or JSON logs with nested structure.
Alternatively, in another classification manner, the logs may be classified by a machine learning algorithm. For example, classification attributes may be set in advance; usually there are several. The classification attributes may include log usage (with attribute values such as security log or operation and maintenance log), log format, and log producer (such as system log, application log, or web log). A log classification model is obtained by training on a set of samples (for example, some common log types) with a specific machine learning algorithm such as naive Bayes and/or a decision tree, and the logs are then classified with that model.
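To make the machine-learning option concrete, the following is a small categorical naive Bayes classifier with add-one smoothing, written in plain Python. It is only a sketch: the class name, the attribute names, and the idea of feeding attribute dictionaries directly to the classifier are assumptions, and a production system would more likely use an established library.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesLogClassifier:
    """Tiny categorical naive Bayes over log classification attributes."""

    def fit(self, samples, labels):
        self.label_counts = Counter(labels)
        # counts[label][attribute][value] -> occurrences in the training data
        self.counts = defaultdict(lambda: defaultdict(Counter))
        self.values = defaultdict(set)  # distinct values seen per attribute
        for attrs, label in zip(samples, labels):
            for attr, value in attrs.items():
                self.counts[label][attr][value] += 1
                self.values[attr].add(value)
        return self

    def predict(self, attrs):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            score = math.log(n / total)  # log prior of the log type
            for attr, value in attrs.items():
                seen = self.counts[label][attr][value]
                vocab = len(self.values[attr]) + 1  # +1 slot for unseen values
                score += math.log((seen + 1) / (n + vocab))  # add-one smoothing
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Trained on a handful of labeled samples, each described by attributes such as format and producer, the classifier returns the most probable log type for a new set of attribute values.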
According to the log classification rule, log analysis templates corresponding to various types of logs can be respectively set. Because logs of the same type have many same characteristics, the parsing method is basically consistent. Therefore, a common log analysis template can be made according to the same characteristics of the logs of the same type. Wherein, the log analysis template comprises one or more template analysis classes.
Step S230, determining the type of the log according to a preset log classification rule.
The type of the log to be analyzed may be determined according to the log classification rule set in step S220. Taking classification through a machine learning algorithm as an example of the preset log classification rule: because a log classification model is obtained after training on samples with the machine learning algorithm, the classification attributes of the log to be analyzed can be extracted, and the type of the log is then determined from them by the log classification model.
Step S240, generating an exploratory analysis model according to the log analysis template corresponding to the log type and the analysis configuration parameters.
The exploratory analysis model is obtained by suitably modifying, according to the analysis configuration parameters, the log analysis template corresponding to the log type determined in step S230, and it can perform exploratory analysis on the log. For example, if the analysis index in the analysis configuration parameters is "PV of a certain page", the PV template analysis class in the log analysis template may be selected. When some names in the template analysis class differ from the names used when the actual code runs (for example, the template analysis class may place no limit on name length while the running code does), the names and similar information may be modified appropriately and the class loaded into the running file, so that the log can be exploratorily analyzed in the subsequent steps.
Specifically, at least one log analysis class included in the exploratory analysis model is set according to the type and number of template analysis classes included in the log analysis template. And searching corresponding template analysis classes in the log analysis template according to the analysis configuration parameters, wherein the number of the searched template analysis classes is more than or equal to 1, and the number of the log analysis classes in the exploratory analysis model is also more than or equal to 1. For example, for the IP address of the visitor in the analyzed log, the number of corresponding template analysis classes may be 1, and the number of log analysis classes included in the exploratory analysis model generated based on the template analysis classes is 1; for calculating the PV value ranking of each page, a combination of a plurality of template analysis classes is required to realize the PV value ranking, so the exploratory analysis model comprises a plurality of log analysis classes.
When the number of log analysis classes contained in the exploratory analysis model is multiple, the execution logic among the log analysis classes contained in the exploratory analysis model is further set according to the process setting rules contained in the log analysis template. Wherein the execution logic comprises: the execution order among the log resolution classes, and/or the execution times of each log resolution class.
Specifically, when the exploratory analysis model contains a plurality of log analysis classes, corresponding execution logic needs to be set so that the exploratory analysis of the log runs smoothly. For example, in actual services, when the amount of log data to be processed is large, a distributed system is often used to process the logs in order to improve efficiency. When a plurality of log analysis classes are involved, one or more of the classes can therefore be executed in parallel one or more times, and the result of each pass is saved in memory or on a hard disk (because data access in memory is fast, this embodiment preferably saves the processing results in memory) for reduce processing or parallel processing by the next class or classes. The specific execution logic can be set by a person skilled in the art.
For example, taking an access log in the WEB as the log type, when analyzing the PV ranking of each page, the execution logic may be: the page PV calculation analysis class is executed once in parallel to calculate the PV value of each page, and the result is stored in memory; the ranking analysis class then reads the result from memory and performs one reduce pass, finally producing the PV ranking of the pages.
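The PV-ranking execution logic (one parallel pass followed by one reduce pass over in-memory results) might look like the sketch below. A local thread pool stands in for the distributed system mentioned in the text, and the function names and the `page` record key are assumptions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_page_views(chunk):
    """'Page PV calculation' step: count views per page in one chunk of records."""
    return Counter(record["page"] for record in chunk)


def rank_pages(partial_counts):
    """'Ranking' step: merge the in-memory partial counts and sort by PV."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    # Sort by descending PV, then by page name so that ties are deterministic.
    return sorted(total.items(), key=lambda item: (-item[1], item[0]))


def pv_ranking(chunks, workers=4):
    """Run the parallel pass once, keep results in memory, then reduce once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_page_views, chunks))
    return rank_pages(partial_counts)
```

Each chunk represents one partition of the access log; the partial counters are held in memory between the two steps, matching the preference for in-memory storage stated in this embodiment.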
Step S250, selecting one analysis rule from a plurality of preset analysis rules for analyzing each field in the log.
After the exploratory analysis model is generated in step S240, the log to be analyzed is analyzed by using the exploratory analysis model. The exploratory analysis model comprises a plurality of analysis rules for the same log field, and during the exploratory analysis of the field in the log, one analysis rule is selected from the plurality of analysis rules to analyze the field. And if the analysis result is successful, determining the currently selected analysis rule as the field analysis rule of the corresponding field. If the analysis result is failure, the analysis rule of the corresponding field is changed according to the preset multiple analysis rules until the analysis result is successful.
For example, the exploratory analysis model has a plurality of analysis rules for the plugin_ver field, such as integer analysis, floating-point analysis, or character analysis. When the plugin_ver field is analyzed, it can first be analyzed as an integer; if the analysis succeeds, integer analysis is used as the analysis rule for this field in subsequent log analysis. If the analysis fails, the analysis rule is replaced (for example, with the character or floating-point analysis rule) and the field is analyzed again, until the analysis succeeds.
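The trial-and-replace selection of an analysis rule can be sketched as a loop over candidate parsers. The rule names, their order (integer, then floating point, then string), and the `select_rule` function are illustrative assumptions.

```python
def parse_int(raw):
    return int(raw)


def parse_float(raw):
    return float(raw)


def parse_str(raw):
    return str(raw)


# Candidate rules are tried in this (assumed) order.
CANDIDATE_RULES = [
    ("integer", parse_int),
    ("float", parse_float),
    ("string", parse_str),
]


def select_rule(samples, rules=CANDIDATE_RULES):
    """Return the name of the first rule that parses every sample value."""
    for name, parser in rules:
        try:
            for raw in samples:
                parser(raw)
        except ValueError:
            continue  # this rule failed; switch to the next candidate
        return name
    return None
```

For a field whose sampled values are version strings such as "1.2.3", both numeric rules fail and the string rule is selected; that rule would then be recorded as the field analysis rule for subsequent logs.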
Step S260, determining the field range of each field in the log according to the analysis result of the exploratory analysis, and generating a filtering rule for filtering error data according to the field range of the field in the log.
Specifically, after a certain number of logs have been exploratorily analyzed, the valid range of each field in the logs can be determined from the exploratory analysis results, and a filtering rule can be generated from that valid range to filter out field values that fall outside it. For example, if exploratory analysis of the plugin_ver field shows that the analysis succeeds only with the string analysis method, the valid range of the field can be determined to be values in character form, and the filtering rule can be set to filter out log entries in which this field is not in character form.
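One possible way to derive a filtering rule from exploratory results is to record the smallest and largest successfully parsed values and reject anything outside them. This goes slightly beyond the character-form example in the text and is an assumption, as is the `build_filter_rule` name.

```python
def build_filter_rule(parser, observed_values):
    """Return a predicate that accepts raw values inside the observed range.

    `parser` is the field analysis rule selected during exploratory analysis;
    `observed_values` are raw values that were parsed successfully.
    """
    parsed = [parser(v) for v in observed_values]
    low, high = min(parsed), max(parsed)

    def accept(raw):
        try:
            return low <= parser(raw) <= high
        except ValueError:
            return False  # the value is not in the expected form at all

    return accept
```

A rule built from integer status values "200", "404", and "500", for instance, accepts "302" but filters out "999" and "abc".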
Step S270, adding a filtering rule into the exploratory analysis model, and correcting the exploratory analysis model according to the field analysis rule of each field in the log.
The filtering rule in step S260 is added to the exploratory parsing model, and the exploratory parsing model is modified according to the field parsing rule of the field in step S250. And analyzing the subsequent log by the corrected analysis model. Optionally, in the subsequent analysis process, the log analysis model may also be corrected in time according to the analysis result.
Therefore, in the log analysis method provided by this embodiment, the logs are first classified, and because logs of the same type are analyzed in basically the same way, a corresponding analysis template is set for each type of log and an exploratory analysis model, which is a prototype of the log analysis model, is generated from that template. The exploratory analysis model can be corrected according to its analysis results on the logs, yielding a more accurate log analysis model with which subsequent logs are analyzed; this overcomes the defect that all logs must be analyzed manually one by one, achieves automatic log analysis, and improves log analysis efficiency. Moreover, adding filtering rules and correcting the analysis rules further improves the accuracy of log analysis. In addition, because the exploratory analysis model is generated from both the user's configuration parameters and the log analysis template, the final analysis result better meets the user's needs, and the communication cost incurred in the prior art, where development or maintenance personnel had to analyze the logs, is avoided.
FIG. 3 shows a functional block diagram of a log parsing apparatus according to an embodiment of the present invention. As shown in FIG. 3, the apparatus includes: a type determination module 31, a generation module 32, an exploratory analysis module 33, a correction module 34, and an analysis module 35.
The type determining module 31 is adapted to determine the type of the log according to a preset log classification rule.
The logs are of various types, and may be classified according to a preset classification rule, for example, the logs may be classified according to a log format, such as nginx default format logs, JSON format logs, protobuf logs, and the like. The embodiment does not limit the specific log classification rule, and a person skilled in the art can formulate a corresponding log classification rule according to the actual service.
The generation module 32 is adapted to generate an exploratory analysis model according to the log analysis template corresponding to the type.
Logs of the same type share essentially the same characteristics, such as how parameters are expressed and how the log is formatted. Taking an Apache log as an example, the log format is basically fixed, each field carries the same kind of information, and the value range of each field is basically the same; for example, the c-IP field holds the IP address of the client, and its data type is string. Therefore, for logs of the same type, a general analysis method can be formulated from their shared characteristics, which yields the corresponding log analysis template.
After the type of the log is determined, an initial analysis model, namely the exploratory analysis model, can be constructed from the log analysis template corresponding to that type. In practice, the exploratory analysis model can be obtained by suitably modifying the log analysis template and loading it into the running file.
The exploratory analysis module 33 is adapted to perform exploratory analysis on the log through the exploratory analysis model.
Specifically, the exploratory analysis model is used to perform exploratory analysis on the log to be analyzed, and an exploratory analysis model whose exploratory analysis result meets the preset analysis result requirement is used directly as the corrected log analysis model.
The correction module 34 is adapted to correct the exploratory analysis model according to the exploratory analysis result to obtain a corrected log analysis model.
Specifically, an exploratory analysis model whose exploratory analysis result does not meet the preset analysis result requirement needs to be corrected. For example, the exploratory analysis model may be corrected if the analysis fails, and/or the analysis time exceeds a preset threshold without producing a result, and/or the analysis result falls outside a preset result range (e.g., the analysis result of the IP field is "1332,312,133,113"). The specific correction method can be set by a person skilled in the art.
Optionally, when the analysis result of one or several fields in the exploratory analysis result does not meet the preset analysis result requirement, only the sub-method of the analysis model that corresponds to those fields may be corrected. For example, if the result for a field does not meet the requirement when the field is analyzed as a string, the field can instead be analyzed as an integer, and the analysis rule for that field in the exploratory analysis model is corrected accordingly, yielding the corrected log analysis model.
The analysis module 35 is adapted to analyze the log through the corrected log analysis model.
The log is analyzed through the corrected log analysis model to obtain the analysis result. Optionally, the log analysis model may continue to be corrected according to the analysis results of the log.
Therefore, in the log analysis device provided by this embodiment, logs are classified, and because logs of the same type are analyzed in basically the same way, a corresponding analysis template is set for each type of log and an exploratory analysis model, which is a prototype of the log analysis model, is generated from that template. The exploratory analysis model can then be corrected according to its analysis results on the logs, yielding a more accurate log analysis model with which subsequent logs are analyzed. This overcomes the defect that all logs must be analyzed manually one by one, achieves automatic log analysis, and improves log analysis efficiency.
Fig. 4 shows a functional block diagram of a log parsing apparatus according to another embodiment of the present invention. As shown in fig. 4, the apparatus further includes, in addition to the apparatus shown in fig. 3: a receiving module 41, a setting module 42, and a filtering module 43.
The receiving module 41 receives the parsing configuration parameters input by the user through a preset parsing configuration entry.
The user can input the parsing configuration parameters through a preset parsing configuration entry. The parsing configuration parameters include the names of the fields to be parsed, the field types, and/or the field value ranges, and the like. For example, the parsing configuration parameter may be "viewer-IP; char; 0.0.0.0-255.255.255.255", where "viewer-IP" is the field name corresponding to the IP address, "char" indicates that the field type is a character string type, and "0.0.0.0-255.255.255.255" is the value range of the field (the value range of legal IP addresses).
The parsing configuration parameters in this embodiment include, but are not limited to, the field names to be parsed, the field types, and/or the field value ranges; they may also include parsing indexes and/or screening (deletion) conditions for the logs to be parsed. For example, the parsing index may be the PV (Page View, i.e., page view volume) of a certain page. As another example, if the logs of January need to be parsed, January can be selected in the corresponding time screening condition, so that only logs whose recording time falls in January are read and parsed in the subsequent parsing process. In addition, the parsing configuration parameters can also include the original log path, so that the user can specify the storage location of the logs to be parsed, which improves the flexibility of log parsing.
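As an illustration of how a "name; type; range" parameter of the kind shown above might be read, the following is a minimal sketch. The `FieldConfig` structure and the `parse_field_config` function are hypothetical names introduced here, and the semicolon-separated layout is taken only from the example string; the embodiment does not specify a concrete format or parser.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FieldConfig:
    """One parsing configuration entry (hypothetical structure)."""
    name: str                               # field name to be parsed
    field_type: str                         # e.g. "char" for a string field
    value_range: Optional[Tuple[str, str]]  # (low, high), or None if absent


def parse_field_config(raw: str) -> FieldConfig:
    """Split a 'name; type; low-high' string into its three parts."""
    parts = [p.strip() for p in raw.split(";")]
    if len(parts) < 2:
        raise ValueError("expected at least 'name; type'")
    name, field_type = parts[0], parts[1]
    value_range = None
    if len(parts) > 2 and parts[2]:
        # partition splits on the first '-' only, so this sketch does not
        # support bounds that themselves contain a hyphen (e.g. negatives).
        low, sep, high = parts[2].partition("-")
        if not sep:
            raise ValueError("range must look like 'low-high'")
        value_range = (low.strip(), high.strip())
    return FieldConfig(name, field_type, value_range)


config = parse_field_config("viewer-IP; char; 0.0.0.0-255.255.255.255")
```

Other parameters mentioned in this embodiment, such as the parsing index or the original log path, could be carried as additional fields of the same structure.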
The setting module 42 is adapted to set log classification rules and to set, according to the log classification rules, log analysis templates corresponding to the various types of logs.
The log classification rule comprises: classifying according to preset log classification features, and/or classifying the logs through a machine learning algorithm.
Specifically, in one classification manner, the logs may be classified according to preset log classification features. For example, the logs may be classified by log format, such as nginx default-format logs, JSON logs with a nested format, and the like.
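Classification by format can be sketched as follows. This is an assumed, simplified illustration: the function name `classify_log_line`, the type labels, and the regular expression (a rough approximation of the nginx default "combined" access-log layout) are not taken from the embodiment.

```python
import json
import re

# Rough approximation of the nginx default ("combined") access-log layout:
# addr - user [time] "request" status bytes "referer" "user agent"
NGINX_PATTERN = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \d+ "[^"]*" "[^"]*"$'
)


def classify_log_line(line: str) -> str:
    """Return a format label for one log line (labels are hypothetical)."""
    text = line.strip()
    if text.startswith("{"):
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, dict):
            # A JSON log counts as nested if any value is an object or array.
            nested = any(isinstance(v, (dict, list)) for v in parsed.values())
            return "json_nested" if nested else "json_flat"
    if NGINX_PATTERN.match(text):
        return "nginx_default"
    return "unknown"
```

Each label returned by such a function could then be used to look up the log analysis template set for that type.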
Alternatively, in another classification manner, the logs may be classified by a machine learning algorithm. For example, classification attributes may be set in advance; usually there are several of them. The classification attributes may include log usage (with attribute values such as security log, operation and maintenance log, and the like), log format, and log producer (such as system log, application log, web log, and the like). By training on certain samples (for example, using some common log types as samples) with a specific machine learning algorithm such as naive Bayes and/or a decision tree, a log classification model is obtained, and the logs are classified with it.
According to the log classification rule, log analysis templates corresponding to various types of logs can be respectively set. Because logs of the same type have many same characteristics, the parsing method is basically consistent. Therefore, a common log analysis template can be made according to the same characteristics of the logs of the same type. Wherein, the log analysis template comprises one or more template analysis classes.
The generation module 32 is further adapted to generate an exploratory analysis model according to the log analysis template corresponding to the log type and the analysis configuration parameters.
The exploratory analysis model is obtained by appropriately modifying, according to the analysis configuration parameters, the log analysis template corresponding to the log type determined by the type determination module 31, and it can perform exploratory analysis on the log. For example, if the parsing index included in the parsing configuration parameters is "PV of a certain page", the PV template parsing class in the log parsing template may be selected. When names or similar information in the template parsing class do not match what is required when the code actually runs (for example, the length of a name may be unlimited in the template parsing class but limited when the code actually runs), the names and similar information may be modified appropriately and loaded into the running file, so that the log can be exploratorily parsed in the subsequent steps.
The generation module 32 is further adapted to set at least one log analysis class contained in the exploratory analysis model according to the type and the number of the template analysis classes contained in the log analysis template.
Corresponding template analysis classes are looked up in the log analysis template according to the analysis configuration parameters. At least one template analysis class is found, so the exploratory analysis model also contains at least one log analysis class. For example, to analyze the visitor's IP address in the log, one template analysis class may suffice, and the exploratory analysis model generated from it contains one log analysis class; to compute the PV ranking of the pages, a combination of several template analysis classes is required, so the exploratory analysis model contains several log analysis classes.
Optionally, when the exploratory analysis model contains multiple log analysis classes, the apparatus further includes a logic setting module 44, adapted to set the execution logic among the multiple log analysis classes contained in the exploratory analysis model according to the process setting rules contained in the log analysis template.
The execution logic comprises: the execution order among the log analysis classes, and/or the number of executions of each log analysis class.
Specifically, when the exploratory analysis model contains multiple log analysis classes, corresponding execution logic needs to be set so that the exploratory analysis of the log executes smoothly. For example, in an actual service, when the amount of log data to be processed is large, a distributed system is often used to process the logs in order to improve processing efficiency. Thus, when there are multiple log analysis classes, one or more of the classes can be executed in parallel one or more times, and the result of each pass is saved in memory or on a hard disk (since data access in memory is fast, the result is preferably saved in memory in this embodiment) for the reduction processing or parallel processing of the next class or classes. The specific execution logic can be set by a person skilled in the art.
For example, taking an access log in the WEB as the log type, when analyzing the PV ranking of each page, the execution logic may be: the page PV calculation analysis class is executed once in parallel to calculate the PV value of each page, and the result is stored in memory; the ranking analysis class then reads the result from memory and performs one reduction pass, finally yielding the PV ranking of the pages.
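The execution logic in this example (one parallel pass followed by one reduction pass) can be sketched on a single machine as follows. This is an assumed illustration only: a thread pool stands in for the distributed system, and the function names `count_page_views`, `rank_pages`, and `pv_ranking` are hypothetical stand-ins for the page PV calculation analysis class and the ranking analysis class.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import List, Tuple


def count_page_views(pages: List[str]) -> Counter:
    """Parallel step: count the PV of each page within one chunk of records."""
    return Counter(pages)


def rank_pages(partials: List[Counter]) -> List[Tuple[str, int]]:
    """Reduction step: merge partial counts and rank pages by PV, descending."""
    total = reduce(lambda a, b: a + b, partials, Counter())
    # Ties are broken by page name so that the ordering is deterministic.
    return sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))


def pv_ranking(chunks: List[List[str]]) -> List[Tuple[str, int]]:
    """Run the parallel pass once, keep the results in memory, then reduce."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(count_page_views, chunks))
    return rank_pages(partials)


chunks = [["/a", "/b", "/a"], ["/b", "/a", "/c"]]
ranking = pv_ranking(chunks)
```

Here the intermediate per-chunk counts are held in memory between the two steps, matching the preference stated in this embodiment.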
The exploratory analysis module 33 is further adapted to select, for each field in the log, one analysis rule from a plurality of preset analysis rules and to analyze the field with it.
After the exploratory analysis model is generated, the log to be analyzed is analyzed with it. The exploratory analysis model contains a plurality of analysis rules for the same log field; during exploratory analysis of a field, one of these rules is selected to analyze the field. If the analysis succeeds, the currently selected rule is determined to be the field analysis rule for that field. If the analysis fails, the analysis rule for the field is replaced with another of the preset rules until the analysis succeeds.
For example, the exploratory analysis model has a plurality of analysis rules for the plugin_ver field, such as integer analysis, floating point analysis, or character analysis. When the plugin_ver field is analyzed, it can first be analyzed as an integer. If the analysis succeeds, integer analysis is used as the analysis rule for this field in subsequent log analysis; if the analysis fails, the analysis rule is replaced (for example, by the character analysis rule or the floating point analysis rule) and the field is analyzed again, until the analysis succeeds.
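The try-and-replace procedure described above can be sketched as follows. This is a minimal illustration under assumptions: the rule list `RULES`, its ordering (integer, then floating point, then character), and the function name `explore_field` are hypothetical, and Python's built-in conversions stand in for the analysis rules.

```python
from typing import Any, Callable, List, Optional, Tuple

Rule = Tuple[str, Callable[[str], Any]]

# Preset analysis rules for one field, tried in this (assumed) order.
RULES: List[Rule] = [("integer", int), ("float", float), ("string", str)]


def explore_field(raw: str, rules: List[Rule] = RULES) -> Optional[Tuple[str, Any]]:
    """Try each rule in turn; return (rule name, value) for the first success."""
    for name, parse in rules:
        try:
            return name, parse(raw)
        except ValueError:
            continue  # this rule failed, so replace it with the next one
    return None  # no preset rule could analyze the field


rule_name, value = explore_field("1.2.3")
```

The rule name returned for a field would then be recorded as that field's analysis rule for subsequent logs.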
The filtering module 43 is adapted to determine the field range of each field in the log according to the analysis result of the exploratory analysis, and to generate a filtering rule for filtering error data according to the field range of each field in the log.
Specifically, after a certain number of logs have been exploratorily analyzed, the effective range of the fields in the logs can be determined according to the exploratory analysis result, and a filtering rule can be generated according to the effective range; the filtering rule can be used to filter out fields that fall outside the effective range. For example, after exploratory analysis of the plugin_ver field, if it is found that the analysis succeeds only with the character string analysis method, the effective range of the field may be determined to be the range of character-form values, and the filtering rule may then be set to filter out log fields in which this field is not in character form.
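Generating a filtering rule from an observed effective range can be sketched as follows. This is an assumed illustration that uses a numeric field rather than the character-form example above: the function name `build_numeric_filter` and the sample values are hypothetical, and taking the minimum and maximum of the sampled values is only one possible way of determining a field range.

```python
from typing import Callable, List


def build_numeric_filter(samples: List[str]) -> Callable[[str], bool]:
    """Derive an effective [low, high] range from sampled values and
    return a predicate that accepts only values inside that range."""
    values = [float(s) for s in samples]
    low, high = min(values), max(values)

    def accept(raw: str) -> bool:
        try:
            return low <= float(raw) <= high
        except ValueError:
            return False  # not numeric at all: treated as error data

    return accept


# Hypothetical sample of a numeric field gathered during exploratory analysis.
accept = build_numeric_filter(["200", "404", "500"])
```

Records whose field value is rejected by such a predicate would be treated as error data and filtered out before analysis.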
The correction module 34 is further adapted to add the filtering rule to the exploratory analysis model and to correct the exploratory analysis model according to the field analysis rule of each field in the log.
Specifically, the filtering rules are added to the exploratory analysis model, and the exploratory analysis model is corrected according to the field analysis rules. Subsequent logs are analyzed with the corrected analysis model. Optionally, the log analysis model may also be corrected in time according to the analysis results obtained during subsequent analysis.
Therefore, the log analysis device provided by this embodiment first classifies the logs and, because logs of the same type are analyzed in basically the same way, sets a corresponding analysis template for each type of log. An exploratory analysis model, which is a prototype of the log analysis model, is generated from the analysis template. The exploratory analysis model is then corrected according to its analysis results on the logs, yielding a more accurate log analysis model with which subsequent logs are analyzed. This overcomes the drawback of having to analyze all logs manually one by one, realizes automatic analysis of logs, and improves log analysis efficiency. Moreover, the accuracy of log analysis is further improved by adding the filtering rules and correcting the analysis rules. In addition, because the exploratory analysis model is generated from both the user configuration parameters and the log analysis template, the final log analysis result better meets the user's requirements, and the communication cost incurred in the prior art, where logs must be analyzed by development or maintenance personnel, is avoided.
According to an embodiment of the present invention, a non-volatile computer storage medium is provided, which stores at least one executable instruction; the executable instruction can execute the log parsing method in any of the above method embodiments.
Fig. 5 is a schematic structural diagram of a terminal according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the terminal.
As shown in fig. 5, the terminal may include: a processor 502, a communication interface 504, a memory 506, and a communication bus 508.
Wherein:
the processor 502, communication interface 504, and memory 506 communicate with one another via a communication bus 508.
The communication interface 504 is used for communicating with network elements of other devices, such as clients or other servers.
The processor 502 is configured to execute the program 510, and may specifically perform relevant steps in the above-described log parsing method embodiment.
In particular, program 510 may include program code that includes computer operating instructions.
The processor 502 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the present invention. The terminal comprises one or more processors, which can be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
The memory 506 is used for storing the program 510. The memory 506 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The program 510 may specifically be used to cause the processor 502 to perform the following operations:
determining the type of the log according to a preset log classification rule;
generating an exploratory analysis model according to a log analysis template corresponding to the type;
performing exploratory analysis on the log through the exploratory analysis model, and correcting the exploratory analysis model according to an exploratory analysis result to obtain a corrected log analysis model;
and analyzing the log through the corrected log analysis model.
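The four operations listed above can be sketched end to end as follows. This is a toy illustration under stated assumptions, not the program 510 itself: the "key=value" log format, the `TEMPLATES` table, and all function names are hypothetical, and the correction step is reduced to keeping, for each field, the first candidate rule that succeeds on every sample line.

```python
from typing import Any, Callable, Dict, List

Parser = Callable[[str], Any]

# Hypothetical templates: for each log type, the candidate rules per field.
TEMPLATES: Dict[str, Dict[str, List[Parser]]] = {
    "kv": {"plugin_ver": [int, float, str], "pv": [int, float, str]},
}


def determine_type(line: str) -> str:
    """Operation 1: classify the log (toy rule: it contains key=value pairs)."""
    return "kv" if "=" in line else "unknown"


def split_fields(line: str) -> Dict[str, str]:
    """Split 'a=1 b=2' into {'a': '1', 'b': '2'}."""
    return dict(pair.split("=", 1) for pair in line.split())


def build_corrected_model(log_type: str, samples: List[str]) -> Dict[str, Parser]:
    """Operations 2 and 3: start from the template for the type, analyze the
    samples exploratorily, and keep per field the first rule that never fails."""
    model: Dict[str, Parser] = {}
    for field, candidates in TEMPLATES[log_type].items():
        for parser in candidates:
            try:
                for line in samples:
                    parser(split_fields(line)[field])
            except (ValueError, KeyError):
                continue  # rule failed on some sample: try the next rule
            model[field] = parser
            break
    return model


def parse_line(model: Dict[str, Parser], line: str) -> Dict[str, Any]:
    """Operation 4: analyze a log line through the corrected model."""
    fields = split_fields(line)
    return {name: parser(fields[name]) for name, parser in model.items()}


samples = ["plugin_ver=1.2.3 pv=10", "plugin_ver=2.0 pv=7"]
model = build_corrected_model(determine_type(samples[0]), samples)
parsed = parse_line(model, "plugin_ver=3.1 pv=42")
```

In this sketch the plugin_ver field ends up analyzed as a character string because the integer and floating point rules fail on "1.2.3", while pv remains an integer field.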
The algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and placed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Moreover, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than others, the combination of features of different embodiments is intended to be within the scope of the invention and form part of different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It will be appreciated by those skilled in the art that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in a log parsing apparatus according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on a computer-readable medium or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering. These words may be interpreted as names.

Claims (14)

1. A log parsing method, comprising:
determining the type of the log according to a preset log classification rule;
generating an exploratory analysis model according to a log analysis template corresponding to the type;
performing exploratory analysis on the log through the exploratory analysis model, and correcting the exploratory analysis model according to an exploratory analysis result to obtain a corrected log analysis model; when the analysis result of a preset field in the exploratory analysis result does not accord with the preset analysis result, correcting the analysis model sub-method corresponding to the field;
analyzing the log through the corrected log analysis model;
before the step of generating the exploratory analysis model according to the log analysis template corresponding to the type, the method further includes: receiving an analysis configuration parameter input by a user through a preset analysis configuration entry, wherein the step of generating the exploratory analysis model according to the log analysis template corresponding to the type specifically comprises: generating the exploratory analysis model according to the log analysis template corresponding to the type and the analysis configuration parameters; wherein the analysis configuration parameters comprise: the field name to be analyzed, the field type, the field value range, the analysis index and/or the deletion condition of the log to be analyzed.
2. The method as claimed in claim 1, wherein the step of determining the type of the log according to the preset log classification rule is preceded by the steps of: setting the log classification rule, and respectively setting log analysis templates corresponding to various types of logs according to the log classification rule;
wherein the log classification rule comprises: classifying according to preset log classification characteristics, and/or classifying through a machine learning algorithm.
3. The method of claim 1, wherein the step of generating an exploratory analysis model according to the log analysis template corresponding to the type comprises:
and setting at least one log analysis class contained in the exploratory analysis model according to the type and the number of the template analysis classes contained in the log analysis template.
4. The method according to claim 3, wherein when there are a plurality of log resolution classes included in the exploratory resolution model, the step of setting at least one log resolution class included in the exploratory resolution model further comprises:
setting execution logic among a plurality of log analysis classes contained in the exploratory analysis model according to a process setting rule contained in the log analysis template; wherein the execution logic comprises: the execution order among the log analysis classes, and/or the number of executions of each log analysis class.
5. The method according to any of claims 1-4, wherein said exploratory parsing of said log by said exploratory parsing model comprises in particular:
selecting one analysis rule from a plurality of preset analysis rules for analyzing each field in the log;
if the analysis result is successful, determining the currently selected analysis rule as the field analysis rule of the corresponding field; if the analysis result is failure, the analysis rule of the corresponding field is changed according to the preset multiple analysis rules until the analysis result is successful.
6. The method of claim 5, wherein the exploratory parsing the log via the exploratory parsing model further comprises:
determining the field range of each field in the log according to the analysis result of the exploratory analysis, and generating a filtering rule for filtering error data according to the field range of each field in the log;
the step of correcting the exploratory analysis model according to the exploratory analysis result to obtain a corrected log analysis model specifically includes:
and adding the filtering rules into the exploratory analysis model, and correcting the exploratory analysis model according to the field analysis rules of all the fields in the log.
7. A log parsing apparatus, comprising:
the type determining module is suitable for determining the type of the log according to a preset log classification rule;
the generating module is suitable for generating an exploratory analysis model according to a log analysis template corresponding to the type;
the exploratory analysis module is suitable for exploratory analysis on the log through the exploratory analysis model;
the correction module is suitable for correcting the exploratory analysis model according to the exploratory analysis result to obtain a corrected log analysis model; when the analysis result of a preset field in the exploratory analysis result does not accord with the preset analysis result, correcting the analysis model sub-method corresponding to the preset field;
the analysis module is suitable for analyzing the log through the corrected log analysis model;
wherein the apparatus further comprises: the receiving module is suitable for receiving the analysis configuration parameters input by a user through a preset analysis configuration entry; the generation module is further adapted to: generating the exploratory analysis model according to the log analysis template corresponding to the type and the analysis configuration parameters; wherein the analysis configuration parameters comprise: the field name to be analyzed, the field type, the field value range, the analysis index and/or the deletion condition of the log to be analyzed.
8. The apparatus of claim 7, wherein the apparatus further comprises:
the setting module is suitable for setting the log classification rule and respectively setting log analysis templates corresponding to various types of logs according to the log classification rule;
wherein the log classification rule comprises: classifying according to preset log classification characteristics, and/or classifying through a machine learning algorithm.
9. The apparatus of claim 7, wherein the generation module is further adapted to:
and setting at least one log analysis class contained in the exploratory analysis model according to the type and the number of the template analysis classes contained in the log analysis template.
10. The apparatus of claim 9, wherein when there are a plurality of log resolution classes included in the exploratory resolution model, the apparatus further comprises:
the logic setting module is suitable for setting execution logic among a plurality of log analysis classes contained in the exploratory analysis model according to the process setting rules contained in the log analysis template; wherein the execution logic comprises: the execution order among the log resolution classes, and/or the execution times of each log resolution class.
11. The apparatus of claim 7, wherein the exploratory parsing module is further adapted to:
selecting one analysis rule from a plurality of preset analysis rules for analyzing each field in the log;
if the analysis result is successful, determining the currently selected analysis rule as the field analysis rule of the corresponding field; if the analysis result is failure, the analysis rule of the corresponding field is changed according to the preset multiple analysis rules until the analysis result is successful.
12. The apparatus of claim 11, wherein the apparatus further comprises:
the filtering module is suitable for determining the field range of each field in the log according to the analysis result of the exploratory analysis and generating a filtering rule for filtering error data according to the field range of each field in the log;
the correction module is further adapted to: and adding the filtering rules into the exploratory analysis model, and correcting the exploratory analysis model according to the field analysis rules of all the fields in the log.
13. A terminal, comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the log analysis method according to any one of claims 1-6.
14. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the log parsing method of any one of claims 1-6.
CN201710440027.6A 2017-06-12 2017-06-12 Log analysis method and device Active CN107273269B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710440027.6A CN107273269B (en) 2017-06-12 2017-06-12 Log analysis method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710440027.6A CN107273269B (en) 2017-06-12 2017-06-12 Log analysis method and device

Publications (2)

Publication Number Publication Date
CN107273269A CN107273269A (en) 2017-10-20
CN107273269B true CN107273269B (en) 2021-04-23

Family

ID=60066087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710440027.6A Active CN107273269B (en) 2017-06-12 2017-06-12 Log analysis method and device

Country Status (1)

Country Link
CN (1) CN107273269B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763461A (en) * 2018-05-28 2018-11-06 上海七牛信息技术有限公司 Data processing method, device, system and storage medium
CN108418842A (en) * 2018-05-31 2018-08-17 郑州信大天瑞信息技术有限公司 A kind of intranet security log collection method and system
CN109947715B (en) * 2018-09-07 2021-08-27 网联清算有限公司 Log alarm method and device
CN109325009B (en) * 2018-09-19 2021-11-30 亚信科技(成都)有限公司 Log analysis method and device
CN109347827B (en) * 2018-10-22 2021-06-22 东软集团股份有限公司 Method, device, equipment and storage medium for predicting network attack behavior
CN109688027A (en) * 2018-12-24 2019-04-26 努比亚技术有限公司 A kind of collecting method, device, equipment, system and storage medium
CN110808965B (en) * 2019-10-22 2022-11-25 许继集团有限公司 Debugging method and device of monitoring system
CN111061696B (en) * 2019-12-17 2023-03-31 中国银行股份有限公司 Method and device for analyzing transaction message log
CN111367964B (en) * 2020-02-29 2023-11-17 上海爱数信息技术股份有限公司 Method for automatically analyzing log
CN115480998A (en) * 2021-06-16 2022-12-16 深圳富桂精密工业有限公司 Log analysis system and log analysis method
CN115065536B (en) * 2022-06-16 2023-08-25 北京天融信网络安全技术有限公司 Network security data parser, parsing method, electronic device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104104734A (en) * 2014-08-04 2014-10-15 浪潮(北京)电子信息产业有限公司 Log analysis method and device
CN106168909A (en) * 2016-06-30 2016-11-30 北京奇虎科技有限公司 A kind for the treatment of method and apparatus of daily record
CN106656607A (en) * 2016-12-27 2017-05-10 上海爱数信息技术股份有限公司 Equipment log parsing method and system, and server side having system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103929321A (en) * 2013-01-15 2014-07-16 腾讯科技(深圳)有限公司 Log processing method and device
US10679135B2 (en) * 2015-11-09 2020-06-09 Nec Corporation Periodicity analysis on heterogeneous logs
CN105447099B (en) * 2015-11-11 2018-12-14 中国建设银行股份有限公司 Log-structuredization information extracting method and device

Similar Documents

Publication Publication Date Title
CN107273269B (en) Log analysis method and device
CN110427331B (en) Method for automatically generating performance test script based on interface test tool
US11144817B2 (en) Device and method for determining convolutional neural network model for database
US20190095318A1 (en) Test-assisted application programming interface (api) learning
CN108459954B (en) Application program vulnerability detection method and device
US11651014B2 (en) Source code retrieval
Chaqfeh et al. JSCleaner: De-cluttering mobile webpages through JavaScript cleanup
CN109918296B (en) Software automation test method and device
CN111428273A (en) Dynamic desensitization method and device based on machine learning
US20190370384A1 (en) Ensemble-based data curation pipeline for efficient label propagation
CN111611581B (en) Internet of things-based network big data information anti-disclosure method and cloud communication server
US20210209011A1 (en) Systems and methods for automated testing using artificial intelligence techniques
CN110750433A (en) Interface test method and device
US20180247226A1 (en) Classifier
CN106156098B (en) Error correction pair mining method and system
CN111931179A (en) Cloud malicious program detection system and method based on deep learning
CN107015986B (en) Method and device for crawling webpage by crawler
CN111338692A (en) Vulnerability classification method and device based on vulnerability codes and electronic equipment
US9519872B2 (en) Systematic discovery of business ontology
CN110909361A (en) Vulnerability detection method and device and computer equipment
CN110334262B (en) Model training method and device and electronic equipment
CN115729817A (en) Method and device for generating and optimizing test case library, electronic equipment and storage medium
CN115982053A (en) Method, device and application for detecting software source code defects
CN111580821B (en) Script binding method and device, electronic equipment and computer readable storage medium
CN111783812A (en) Method and device for identifying forbidden images and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant