CN111737091B - Log processing method and device and readable medium - Google Patents

Info

Publication number
CN111737091B
Authority
CN
China
Prior art keywords
regular expression
log
identification information
determining
attribute information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010874406.8A
Other languages
Chinese (zh)
Other versions
CN111737091A (en
Inventor
周磊
吴长城
姜双林
饶志波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Andi Technology Co Ltd
Original Assignee
Beijing Andi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Andi Technology Co Ltd
Priority to CN202010874406.8A
Publication of CN111737091A
Application granted
Publication of CN111737091B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3093 Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Embodiments of the invention provide a log processing method, a log processing device and a readable medium. A plurality of original logs collected by at least one collection mode are acquired; a plug-in corresponding to each original log is determined according to that log; first identification information corresponding to each original log and second identification information corresponding to each piece of attribute information of the log are determined according to the plug-in; whether a regular expression can be matched with the attribute information is judged and, if so, the matching attribute information is determined as key information; first identification information and second identification information sent by a user are received, and the original log corresponding to the first identification information and the key information corresponding to the second identification information are determined according to them. By combining plug-ins with the first and second identification information to determine the key information of the original logs, the scheme can improve log collection efficiency.

Description

Log processing method and device and readable medium
Technical Field
The present invention relates to the field of network information technologies, and in particular, to a log processing method, device and readable medium.
Background
With the continuous development of internet technology, the variety of internet services keeps increasing and network system architectures become more and more complex, which makes situation analysis of the network security system indispensable. A network system generates a large amount of log information while running. These logs usually record problem information and event information about the hardware, software and applications in the system, and a user can inspect the events occurring in the network system through the logs so as to quickly find errors in its operation.
Currently, logs are usually collected by file monitoring, and the collected logs are stored in a database or a local file. However, as the network system grows more complex, more and more devices generate system logs, and this collection method becomes inefficient.
Therefore, in view of the above disadvantages, it is desirable to provide a log processing method, device and readable medium for solving the problem of low acquisition efficiency in the prior art.
Disclosure of Invention
The embodiment of the invention provides a log processing method, a log processing device and a readable medium, which can improve the efficiency of log collection.
In a first aspect, an embodiment of the present invention provides a log processing method, including:
acquiring a plurality of original logs acquired by at least one acquisition mode; each original log carries a plurality of attribute information;
determining a plug-in corresponding to each original log according to each original log; each plug-in comprises a plurality of regular expressions;
for each original log, the following steps are performed:
S1, according to the plug-in, determining first identification information corresponding to the original log and second identification information corresponding to each attribute information of the original log;
S2, judging whether the regular expression can be matched with the attribute information, and if so, determining the attribute information which can be matched with the regular expression as key information;
receiving first identification information and second identification information sent by a user, and determining an original log corresponding to the first identification information and key information corresponding to the second identification information according to the first identification information and the second identification information.
Optionally, the determining, according to each piece of original log, a plug-in corresponding to the piece of original log includes:
classifying the plurality of original logs, and determining plug-ins corresponding to each type of original logs;
wherein each type of original log comprises at least one original log.
Optionally, the plug-in comprises a plurality of sub-plug-ins;
the determining, according to the plug-in, first identification information corresponding to the original log and second identification information corresponding to each attribute information of the original log includes:
for each type of original log, the following operations are performed:
determining first identification information corresponding to each original log in the type of original log according to the plug-in;
and determining second identification information corresponding to each attribute information of each original log in the type of original log according to the sub-plug-ins and the attribute information.
Optionally, the determining, as key information, attribute information that can be matched with the regular expression includes:
dividing a sub regular expression from the regular expressions used for attribute information matching to serve as a sub regular expression to be processed;
for each sub-regular expression to be processed, the following steps are executed:
S10, matching the sub regular expression to be processed with the attribute information;
S20, if the matching can be carried out, determining the attribute information which can be matched with the sub regular expression to be processed as key information;
S30, if the matching cannot be carried out, continuing to divide a sub regular expression from the remaining part of the regular expression other than the sub regular expression to be processed, and executing S10, until no further sub regular expression can be divided from the regular expression, and then deleting the attribute information that cannot be matched with any sub regular expression.
Optionally, the regular expression includes at least two characters, and a space is provided between two adjacent characters; wherein the characters comprise combined characters and/or single characters;
the dividing of one sub regular expression from the regular expressions for attribute information matching includes:
taking the regular expression used for attribute information matching as a target regular expression;
A1, searching the target regular expression according to a preset searching sequence, determining a first space, and taking the characters before the first space as the sub regular expression to be processed;
A2, taking all characters and all spaces after the first space as the target regular expression, and executing A1 until no space is found.
Optionally, further comprising:
receiving a formatting rule;
for each piece of key information, the following steps are carried out:
dividing the key information into a plurality of data fields according to the formatting rule; wherein the plurality of data fields include data fields with different meanings, and each meaning has at least one data field;
filtering the plurality of data fields to obtain the data fields with different meanings;
classifying the filtered data fields by meaning;
deduplicating the data fields of each meaning so that one data field with that meaning is retained.
Optionally, further comprising:
receiving an encryption rule;
and encrypting the key information according to the encryption rule.
Optionally, further comprising:
receiving a forwarding rule or a storage rule;
and carrying out forwarding processing or storage processing on the key information according to the forwarding rule or the storage rule.
In a second aspect, an embodiment of the present invention further provides a log processing apparatus, including:
the acquisition module is used for acquiring a plurality of original logs acquired by at least one acquisition mode; each original log carries a plurality of attribute information;
the first determining module is used for determining a plug-in corresponding to each original log according to each original log acquired by the acquiring module;
a loop determination module to perform, for each raw log: according to the plug-in determined by the first determining module, determining first identification information corresponding to the original log acquired by the acquiring module and second identification information corresponding to each attribute information of the original log; judging whether the regular expression can be matched with the attribute information or not, and if so, determining the attribute information which can be matched with the regular expression as key information;
and the second determining module is used for determining an original log corresponding to the first identification information and key information corresponding to the second identification information according to the first identification information and the second identification information determined by the circulation determining module after receiving the first identification information and the second identification information sent by the user.
Optionally, the determining, according to each piece of original log, a plug-in corresponding to the piece of original log includes:
the first determining module is used for classifying the plurality of original logs and determining the plug-in corresponding to each type of original logs; wherein each type of original log comprises at least one original log.
Optionally, the plug-in comprises a plurality of sub-plug-ins;
the loop determination module is configured to perform the following steps:
for each type of original log, determining first identification information corresponding to each type of original log according to the plug-in;
and for each type of original log, determining second identification information corresponding to each attribute information of each original log in the type of original log according to the sub-plug-ins and the attribute information.
Optionally, the loop determination module is configured to perform the following steps:
dividing a sub regular expression from the regular expressions used for attribute information matching to serve as a sub regular expression to be processed;
for each sub-regular expression to be processed, the following steps are executed:
S10, matching the sub regular expression to be processed with the attribute information;
S20, if the matching can be carried out, determining the attribute information which can be matched with the sub regular expression to be processed as key information;
S30, if the matching cannot be carried out, continuing to divide a sub regular expression from the remaining part of the regular expression other than the sub regular expression to be processed, and executing S10, until no further sub regular expression can be divided from the regular expression, and then deleting the attribute information that cannot be matched with any sub regular expression.
Optionally, the regular expression includes at least two characters, and a space is provided between two adjacent characters; wherein the characters comprise combined characters and/or single characters;
the loop determination module is configured to perform the following steps:
taking the regular expression used for attribute information matching as a target regular expression;
A1, searching the target regular expression according to a preset searching sequence, determining a first space, and taking the characters before the first space as the sub regular expression to be processed;
A2, taking all characters and all spaces after the first space as the target regular expression, and executing A1 until no space is found.
Optionally, further comprising:
a formatting module for performing the steps of:
receiving a formatting rule;
for each piece of key information, the following steps are carried out:
dividing the key information into a plurality of data fields according to the formatting rule; wherein the plurality of data fields include data fields with different meanings, and each meaning has at least one data field;
filtering the plurality of data fields to obtain the data fields with different meanings;
classifying the filtered data fields by meaning;
deduplicating the data fields of each meaning so that one data field with that meaning is retained.
Optionally, further comprising:
receiving an encryption rule;
and encrypting the key information according to the encryption rule.
Optionally, further comprising:
a processing module for performing the steps of:
receiving a forwarding rule or a storage rule;
and carrying out forwarding processing or storage processing on the key information according to the forwarding rule or the storage rule.
In a third aspect, an embodiment of the present invention provides a log processing apparatus, including: at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
the at least one processor is configured to invoke the machine-readable program to perform the method according to the first aspect or any possible implementation manner of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable medium, on which computer instructions are stored, and when executed by a processor, the computer instructions cause the processor to perform the method provided by the first aspect or any possible implementation manner of the first aspect.
With the log processing method, the log processing device and the readable medium provided by the embodiments of the invention, a plurality of original logs collected by at least one collection mode are acquired, each original log carrying a plurality of pieces of attribute information; a plug-in corresponding to each original log is determined according to that log, each plug-in comprising a plurality of regular expressions; and the following steps are executed for each original log: S1, determining, according to the plug-in, first identification information corresponding to the original log and second identification information corresponding to each piece of attribute information of the original log; S2, judging whether the regular expression can be matched with the attribute information and, if so, determining the matching attribute information as key information. First identification information and second identification information sent by a user are then received, and the original log corresponding to the first identification information and the key information corresponding to the second identification information are determined according to them. By combining plug-ins with the first and second identification information to determine the key information of the log to be searched, the embodiments of the invention can improve the efficiency of log collection.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a log processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a log processing method according to another embodiment of the present invention;
Fig. 3 is a schematic diagram of a device in which a log processing apparatus according to an embodiment of the present invention is located;
Fig. 4 is a schematic diagram of a log processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Fig. 1 is a flowchart of a log processing method according to an embodiment of the present invention, and as shown in fig. 1, the method may include the following steps:
Step 101, acquiring a plurality of original logs acquired by at least one acquisition mode;
Step 102, determining plug-ins corresponding to each original log according to the original log;
Step 103, executing for each original log: determining first identification information corresponding to the original log and second identification information corresponding to each attribute information of the original log according to the plug-in; judging whether the regular expression can be matched with the attribute information or not, and if so, determining the attribute information which can be matched with the regular expression as key information;
Step 104, receiving first identification information and second identification information sent by a user, and determining an original log corresponding to the first identification information and key information corresponding to the second identification information according to the first identification information and the second identification information.
In the log processing method provided by the embodiment of the invention, a plurality of original logs collected by at least one collection mode are acquired, each original log carrying a plurality of pieces of attribute information; a plug-in corresponding to each original log is determined according to that log, each plug-in comprising a plurality of regular expressions; and the following steps are executed for each original log: S1, determining, according to the plug-in, first identification information corresponding to the original log and second identification information corresponding to each piece of attribute information of the original log; S2, judging whether the regular expression can be matched with the attribute information and, if so, determining the matching attribute information as key information. First identification information and second identification information sent by a user are then received, and the original log corresponding to the first identification information and the key information corresponding to the second identification information are determined according to them. By combining plug-ins with the first and second identification information to determine the key information of the log to be searched, the embodiment of the invention can improve the efficiency of log collection.
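As an illustration only, steps 101 to 104 can be sketched in Python as follows. The patent does not specify data structures, so the `Plugin` class, the `process` and `lookup` functions, the format of the identifiers (`"4003-0"`, `"message"`) and the sample logs are all assumptions made for this sketch.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Plugin:
    plugin_id: int  # hypothetical: reused as the basis of the first identifier
    # Hypothetical layout: second identifier (one per attribute) -> regular expression.
    patterns: dict = field(default_factory=dict)


def process(raw_logs, plugin):
    """Steps 101-103: index matching attributes by (first_id, second_id)."""
    index = {}
    for n, attributes in enumerate(raw_logs):
        first_id = f"{plugin.plugin_id}-{n}"  # S1: identifier of this raw log
        for second_id, value in attributes.items():  # S1: identifier per attribute
            pattern = plugin.patterns.get(second_id)
            # S2: only attributes matched by a regular expression become key information.
            if pattern is not None and pattern.search(value):
                index[(first_id, second_id)] = value
    return index


def lookup(index, first_id, second_id):
    """Step 104: resolve identifiers sent by a user; None if nothing matched."""
    return index.get((first_id, second_id))


plugin = Plugin(4003, {
    "message": re.compile(r"Failed password"),
    "src": re.compile(r"^\d{1,3}(\.\d{1,3}){3}$"),
})
logs = [
    {"message": "Failed password for root", "src": "10.0.0.5", "pid": "812"},
    {"message": "Accepted password for alice", "src": "10.0.0.9", "pid": "900"},
]
index = process(logs, plugin)
print(lookup(index, "4003-0", "message"))  # prints "Failed password for root"
```

Keying the index by the pair of identifiers is one possible design; it makes the lookup in step 104 a single dictionary access.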
Based on the log processing method shown in fig. 1, in an embodiment of the present invention, the determining, according to each original log, a plug-in corresponding to the original log includes:
classifying the plurality of original logs, and determining plug-ins corresponding to each type of original logs;
wherein each type of original log comprises at least one original log.
In the embodiment of the invention, the plurality of original logs obtained by multiple collection modes are classified, and a plug-in is created for each type of original log, while ensuring that each type includes at least one log. This arrangement avoids large differences between plug-ins caused by differences between log types: logs of the same type share the same plug-in, and the differing attribute information of each log is filled into that plug-in, so the plug-ins stay standardized in everything except the log data. Different types of logs can thus be conveniently collected through plug-ins, which improves log collection efficiency.
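A minimal sketch of this classification, assuming that the log type can be read from a marker inside each line. The markers, the plug-in ids (1001, 1002, 1999) and the function names are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical markers and plug-in ids, chosen only for this example.
TYPE_MARKERS = {"sshd": "ssh", "kernel": "kernel"}
PLUGIN_FOR_TYPE = {"ssh": 1001, "kernel": 1002, "other": 1999}


def classify(raw_log):
    """Return the log type of one raw log line (toy rule: substring marker)."""
    for marker, log_type in TYPE_MARKERS.items():
        if marker in raw_log:
            return log_type
    return "other"


def group_by_plugin(raw_logs):
    """Map each plug-in id to the raw logs of its type.

    A group is only created when a log is appended to it, so every type
    that appears in the result contains at least one original log.
    """
    groups = defaultdict(list)
    for raw_log in raw_logs:
        groups[PLUGIN_FOR_TYPE[classify(raw_log)]].append(raw_log)
    return dict(groups)


groups = group_by_plugin([
    "Aug 27 10:00:01 host sshd[812]: Failed password for root",
    "Aug 27 10:00:02 host kernel: eth0 link up",
    "Aug 27 10:00:03 host sshd[813]: Accepted password for alice",
])
```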
Based on the log processing method shown in fig. 1, in an embodiment of the present invention, the plug-in includes a plurality of sub-plug-ins;
according to the plug-in, determining first identification information corresponding to the original log and second identification information corresponding to each attribute information of the original log, including:
for each type of original log, the following operations are performed:
determining first identification information corresponding to each original log in the type of original log according to the plug-in;
and determining second identification information corresponding to each attribute information of each original log in the type of original log according to the sub-plug-ins and the attribute information.
In the embodiment of the invention, each log corresponds to one plug-in, and the plurality of pieces of attribute information in each log correspond to respective sub-plug-ins. The plug-in and the sub-plug-ins are each assigned identification information, forming the first identification information corresponding to each log and the second identification information corresponding to each piece of attribute information, so that logs can be distinguished quickly by the first identification information of the plug-in and the second identification information of the sub-plug-ins.
Based on the log processing method shown in fig. 1, in an embodiment of the present invention, the determining, as key information, attribute information that can be matched with the regular expression includes:
dividing a sub regular expression from the regular expressions used for attribute information matching to serve as a sub regular expression to be processed;
for each sub-regular expression to be processed, the following steps are executed:
S10, matching the sub regular expression to be processed with the attribute information;
S20, if the matching can be carried out, determining the attribute information which can be matched with the sub regular expression to be processed as key information;
S30, if the matching cannot be carried out, continuing to divide a sub regular expression from the remaining part of the regular expression other than the sub regular expression to be processed, and executing S10, until no further sub regular expression can be divided from the regular expression, and then deleting the attribute information that cannot be matched with any sub regular expression.
In the embodiment of the invention, the regular expressions are segmented, and the attribute information is sequentially matched by using the simpler sub-regular expressions compared with the original regular expression, so that the matching speed can be improved.
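The loop S10 to S30 can be sketched as follows, assuming the sub regular expressions have already been divided out and are tried in order. The function names and the sample data are hypothetical; `re.search` is used here as one plausible reading of "matching", which the patent does not define further.

```python
import re


def is_key_information(attribute, sub_patterns):
    """Try each sub regular expression in turn against one attribute."""
    for sub in sub_patterns:           # S10: match the next sub-expression
        if re.search(sub, attribute):  # S20: a match marks the attribute as key information
            return True
    return False                       # S30: nothing left to divide, no match found


def extract_key_information(attributes, sub_patterns):
    """Keep matching attributes; drop (delete) those no sub-expression matches."""
    return [a for a in attributes if is_key_information(a, sub_patterns)]


kept = extract_key_information(
    ["Failed password for root", "session opened", "port 22"],
    [r"Failed", r"port\s\d+"],
)
```

Because the loop returns on the first successful sub-expression, later sub-expressions are never evaluated for an attribute that has already matched.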
Based on a log processing method shown in fig. 1, in an embodiment of the present invention, the regular expression includes at least two characters, and a space is provided between two adjacent characters; wherein the characters comprise combined characters and/or single characters;
the dividing of one sub regular expression from the regular expressions for attribute information matching includes:
taking the regular expression used for attribute information matching as a target regular expression;
A1, searching the target regular expression according to a preset searching sequence, determining a first space, and taking the characters before the first space as the sub regular expression to be processed;
A2, taking all characters and all spaces after the first space as the target regular expression, and executing A1 until no space is found.
In the embodiment of the invention, the sub regular expressions segmented in this way can greatly reduce the backtracking generated during the whole matching process, and improve the matching efficiency.
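Steps A1 and A2 amount to repeatedly cutting the target expression at its first space. The sketch below assumes a left-to-right search order and assumes that the remainder left when no space is found is itself the last sub regular expression; the text does not state either point explicitly. The function name is hypothetical.

```python
def split_sub_expressions(regex):
    """Divide a space-separated regular expression into sub-expressions."""
    subs = []
    target = regex
    while True:
        # A1: find the first space; the characters before it form one sub-expression.
        head, sep, rest = target.partition(" ")
        if head:  # skip empty pieces produced by consecutive spaces
            subs.append(head)
        if not sep:  # no space found: stop
            break
        # A2: everything after the first space becomes the new target.
        target = rest
    return subs


print(split_sub_expressions(r"\d+ [a-z]+ Failed"))
```

Note that this simple cut would also split at a space that is part of a pattern, for example inside a character class, so it only suits expressions in which spaces are used purely as separators, as the embodiment requires.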
Based on the log processing method shown in fig. 1, in an embodiment of the present invention, the method further includes:
receiving a formatting rule;
for each piece of key information, the following steps are carried out:
dividing the key information into a plurality of data fields according to the formatting rule; wherein the plurality of data fields include data fields with different meanings, and each meaning has at least one data field;
filtering the plurality of data fields to obtain the data fields with different meanings;
classifying the filtered data fields by meaning;
deduplicating the data fields of each meaning so that one data field with that meaning is retained.
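The patent does not say what a formatting rule looks like. The sketch below assumes, purely for illustration, that key information is a string of `meaning=value` pairs, that the rule consists of a separator and a set of wanted meanings, and that the first field of each meaning is the one retained. All names are hypothetical.

```python
def format_key_information(key_info, wanted, sep=" "):
    """Split, filter, classify and deduplicate the data fields of one record."""
    # Divide the key information into data fields (meaning, value).
    fields = [part.split("=", 1) for part in key_info.split(sep) if "=" in part]
    result = {}
    for meaning, value in fields:
        if meaning not in wanted:  # filter: drop meanings the rule does not ask for
            continue
        # Classify by meaning and deduplicate: keep only the first field per meaning.
        result.setdefault(meaning, value)
    return result


record = format_key_information(
    "user=root src=10.0.0.5 user=root port=22 src=10.0.0.6",
    wanted={"user", "src"},
)
```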
Based on the log processing method shown in fig. 1, in an embodiment of the present invention, the method further includes:
receiving an encryption rule;
and encrypting the key information according to the encryption rule.
In the embodiment of the invention, when a user needs to encrypt the collected log before forwarding or storing it, the key information in the log can be encrypted according to an encryption rule. Given the importance of network security, when network information is exchanged between China and other countries, encrypting with a domestic cryptographic algorithm specified by China can improve the security of the key information in the log.
Based on the log processing method shown in fig. 1, in an embodiment of the present invention, the method further includes:
receiving a forwarding rule or a storage rule;
and carrying out forwarding processing or storage processing on the key information according to the forwarding rule or the storage rule.
In the embodiment of the invention, logs are collected from their sources by means of plug-ins, and after the key information has been extracted from the collected logs, the key information can likewise be forwarded or stored in various ways by means of plug-ins, which improves the efficiency of log forwarding or storage.
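A minimal sketch of applying a forwarding rule or a storage rule, using only the Python standard library. The rule format (a dictionary with an `action` key), the function names and the choice of UDP are assumptions for this sketch; the patent leaves the rule format open.

```python
import socket


def forward_udp(key_info, host, port):
    """Forward one piece of key information as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(key_info.encode("utf-8"), (host, port))


def store_to_file(key_info, path):
    """Append one piece of key information to a local file."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(key_info + "\n")


def apply_rule(key_info, rule):
    """Dispatch on a hypothetical rule dictionary."""
    if rule["action"] == "forward":
        forward_udp(key_info, rule["host"], rule["port"])
    elif rule["action"] == "store":
        store_to_file(key_info, rule["path"])
    else:
        raise ValueError(f"unknown action: {rule['action']!r}")


# Loopback demonstration: a local receiver stands in for the remote log server.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
    receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    receiver.settimeout(2.0)
    port = receiver.getsockname()[1]
    apply_rule(
        "Failed password for root",
        {"action": "forward", "host": "127.0.0.1", "port": port},
    )
    received, _ = receiver.recvfrom(4096)
```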
As shown in fig. 2, another log processing method is provided in the embodiment of the present invention. The method comprises the following steps:
step 201, a plurality of original logs collected by at least one collection mode are obtained.
In this step, the original logs may be collected actively, for example by monitoring a text file and reading it, or by connecting to a database using the relevant database configuration. Specifically, the collection modes for the original logs include: active collection from a local file; passive reception of logs by configuring syslog parameters; active collection by connecting to a database; active collection using SSH public and private keys; active collection by connecting to FTP; active collection by subscription; active collection by authorization; active collection of logs through information exchange with a server; and collection by listening or by active connection.
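One way to support several collection modes side by side is a registry that maps a mode name to a collector function. The sketch below implements only active collection from a local file; the registry, the mode name `"file"` and the function names are hypothetical, and the other modes listed above would be added as further entries.

```python
import tempfile
from pathlib import Path


def collect_from_file(location):
    """Active collection from a local file: one raw log per non-empty line."""
    text = Path(location).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.strip()]


# Registry of collection modes; only "file" is implemented in this sketch.
COLLECTORS = {"file": collect_from_file}


def collect(sources):
    """Gather raw logs from every (mode, target) pair."""
    raw_logs = []
    for mode, target in sources:
        raw_logs.extend(COLLECTORS[mode](target))
    return raw_logs


# Demonstration with a temporary file standing in for a real log file.
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "example.log"
    log_path.write_text(
        "Failed password for root\n\nAccepted password for alice\n",
        encoding="utf-8",
    )
    raw_logs = collect([("file", str(log_path))])
```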
Step 202, according to each original log, determining a plug-in corresponding to the original log.
In this step, the plurality of original logs obtained through the plurality of collection modes are classified, and a plug-in corresponding to each type of original log is created, while ensuring that each type includes at least one log. This arrangement avoids large differences between plug-ins caused by differences between log types: logs of the same type share the same plug-in, and the differing attribute information of each log is filled into that plug-in, so that the plug-in remains standardized in everything except the log data. Logs of different types can thus be collected conveniently through plug-ins, which improves log collection efficiency.
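The classification in step 202 can be sketched as follows. This is only an illustrative sketch: the keyword-based `classify` rule, the log lines, and the identifier 9999 are hypothetical, and only the identifier 4003 appears in the description.

```python
from collections import defaultdict

# Hypothetical mapping from log type to plug-in identifier; only 4003
# appears in the description, the other entry is illustrative.
PLUGIN_IDS = {"ssh": 4003, "other": 9999}


def classify(raw_log: str) -> str:
    """Assign a log type with a simple keyword check (an assumption)."""
    return "ssh" if "sshd" in raw_log else "other"


def group_by_plugin(raw_logs):
    """Group raw logs so that all logs of one type share one plug-in."""
    groups = defaultdict(list)
    for log in raw_logs:
        groups[PLUGIN_IDS[classify(log)]].append(log)
    return dict(groups)


logs = [
    "sshd[811]: Failed password for root from 10.0.0.5 port 22",
    "kernel: eth5 link up",
    "sshd[812]: Accepted password for alice from 10.0.0.6 port 22",
]
grouped = group_by_plugin(logs)
```

Because a plug-in is created only for a type that actually occurs, every group contains at least one log, as the step requires.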
Step 203, determining a plurality of regular expressions in the plug-in according to each original log.
In this step, determining the regular expressions in the plug-in meets the user's need for user-defined log collection and analysis, and a collection and analysis method based on regular expressions is highly general and flexible.
For example, the plug-in includes the following information:
plugin_id=4003
[config]
type=detector
enable=yes
source=log
location=/var/log/actiontec.log
create_file=false
[out-put]
Enable=yes
output='udp'
host='127.0.0.1'
port=514
[encryption]
Enable=no
Key=""
Iv=""
the regular expression in the plug-in is:
[regexp]
Engine="NFA"
[0001 - Failed password]
event_type=event
[Figure: regular expression of the "0001 - Failed password" rule; the image is not reproduced in this text]
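A plug-in listing of this kind could be read, for instance, with Python's standard `configparser` module. The sketch below is an assumption about how such a file might be loaded, not the loader used by the invention; the section name `[plugin]` is hypothetical and is prepended only because the listing begins with a key that belongs to no section.

```python
import configparser

# Part of the plug-in listing from the description, with plain ASCII quotes.
PLUGIN_TEXT = """\
plugin_id=4003
[config]
type=detector
enable=yes
source=log
location=/var/log/actiontec.log
create_file=false
[out-put]
Enable=yes
output='udp'
host='127.0.0.1'
port=514
"""


def load_plugin(text: str) -> configparser.ConfigParser:
    """Parse the plug-in text into sections and options."""
    parser = configparser.ConfigParser()
    # configparser rejects keys that precede the first section header,
    # so a hypothetical [plugin] header is placed in front of plugin_id.
    parser.read_string("[plugin]\n" + text)
    return parser


plugin = load_plugin(PLUGIN_TEXT)
plugin_id = plugin.getint("plugin", "plugin_id")
enabled = plugin.getboolean("config", "enable")
host = plugin.get("out-put", "host").strip("'")  # drop the surrounding quotes
port = plugin.getint("out-put", "port")
```

Option names are lower-cased by `configparser` by default, so `Enable` and `enable` are read as the same key.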
Step 204, according to the plug-in, determining first identification information corresponding to the original log and second identification information corresponding to each attribute information of the original log.
In this step, each log corresponds to one plug-in, and each of the plurality of attribute information in a log corresponds to a sub-plug-in. The plug-in and the sub-plug-ins are assigned identifiers, forming first identification information corresponding to each log and second identification information corresponding to each attribute information, so that logs can be distinguished quickly according to the first identification information of the plug-in and the second identification information of the sub-plug-ins.
Step 205, judging whether the regular expression can be matched with the attribute information, and if so, determining the attribute information which can be matched with the regular expression as key information.
In this step, one sub regular expression is divided from the regular expression used for attribute information matching and taken as the sub regular expression to be processed, and the following steps are executed for each sub regular expression to be processed:
S10, matching the sub regular expression to be processed with the attribute information;
S20, if the matching succeeds, determining the attribute information matched by the sub regular expression to be processed as key information, and ending the matching process;
S30, if the matching fails, dividing another sub regular expression from the remaining part of the regular expression, excluding the sub regular expressions already processed, taking it as the new sub regular expression to be processed, and returning to S10; this continues until no new sub regular expression can be divided from the regular expression, at which point the matching ends.
Attribute information that still cannot be matched after these steps is deleted. Because each sub regular expression is simpler than the original regular expression, matching the attribute information against the sub regular expressions can improve the matching speed. Using the sub regular expressions, the key information of the log is extracted in a user-defined manner, so that the log can be decomposed and its key information stored in a formatted way, which facilitates accurate analysis later.
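The matching loop of S10 to S30 can be illustrated with the following sketch. It is one possible reading of the step, under stated assumptions: sub regular expressions are obtained by splitting at spaces, each attribute is tried against the sub regular expressions in order, and the sample expression and attributes are hypothetical.

```python
import re


def extract_key_info(regex: str, attributes):
    """Return the attributes matched by some sub regular expression.

    Attributes that no sub regular expression matches are dropped,
    which corresponds to the deletion described after S30.
    """
    sub_expressions = [part for part in regex.split(" ") if part]
    key_info = []
    for attribute in attributes:
        for sub in sub_expressions:          # next sub regular expression
            if re.search(sub, attribute):    # S10: try to match
                key_info.append(attribute)   # S20: keep as key information
                break                        # matching for this attribute ends
        # S30: if no sub regular expression is left, the attribute is dropped
    return key_info


# Hypothetical regular expression made of two space-separated parts.
pattern = r"\d+\.\d+\.\d+\.\d+ port=\d+"
attributes = ["src=10.180.38.33", "port=514", "note"]
key_info = extract_key_info(pattern, attributes)
```

Here the first two attributes are retained and `note` is discarded, since neither sub regular expression matches it.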
For example, the key information extracted using regular expressions is:
date={normalize_date($date)}
plugin_sid=1
device={$dst}
src_ip={$src}
dst_ip={$dst}
src_port={$sport}
username={$user}
userdata1={$info}
userdata2={$dst}
Step 206, receiving the first identification information and the second identification information sent by the user, and determining the original log corresponding to the first identification information and the key information corresponding to the second identification information according to the first identification information and the second identification information.
In this step, the first identification information and the second identification information are utilized to quickly distinguish the logs to be collected from the plurality of logs, and the key information extracted from the logs is determined according to the logs.
For example, plugin_id="4003" is defined as the SSH log. The SSH log comprises a plurality of attribute information, each corresponding to one sub-plug-in, and the first identification information corresponding to the log and the second identification information corresponding to the key information extracted from each attribute information are determined: plugin_sid="1" indicates that an SSH login succeeded, and plugin_sid="2" indicates that an SSH login failed. The log can therefore be quickly distinguished, and its key information viewed, according to the first identification information and the second identification information.
Step 207, formatting the key information according to the received formatting rule.
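The lookup in step 206 can be sketched as a two-level index keyed by the first and second identification information. The names `record` and `lookup`, the field name `plugin_sid` (borrowed from the key-information listing above), and the sample values are assumptions made for illustration.

```python
# first identification -> second identification -> list of key information
store = {}


def record(plugin_id: int, plugin_sid: int, key_info: dict) -> None:
    """File extracted key information under both identifiers."""
    store.setdefault(plugin_id, {}).setdefault(plugin_sid, []).append(key_info)


def lookup(plugin_id: int, plugin_sid: int) -> list:
    """Return the key information for the identifiers sent by the user."""
    return store.get(plugin_id, {}).get(plugin_sid, [])


# Hypothetical key information extracted from two SSH log entries.
record(4003, 1, {"username": "alice", "src_ip": "10.0.0.6"})
record(4003, 2, {"username": "root", "src_ip": "10.0.0.5"})
found = lookup(4003, 2)
```

An unknown pair of identifiers simply yields an empty list, so the caller does not need to check for missing keys.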
In this step, the extracted key information is formatted according to the set formatting rule into a fixed format. When a user needs to check multiple pieces of key information in the log, the required key information can be found quickly in the formatted key information; this arrangement avoids the query complexity that arises when key information has no fixed format.
For example, the format of the formatting is set as follows:
[Event_type="detector" date="1588080854388219" sensor="" device="0.0.0.0" interface="eth5" plugin_id="100000" plugin_sid="24" priority="" protocol="" src_ip="10.180.38.33" src_port="" dst_ip="0.0.0.0" dst_port="" username="" password="" filename="" userdata1="4" userdata2="99" userdata3="97" userdata4="1" userdata5="AD " userdata6="{V2.0F-20191230 22:30:00 0027} 14.1 3.3 43.7 " userdata7="" userdata8="" userdata9="" occurrences=""]
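A fixed format of this kind could be produced, for example, as sketched below. The function name `format_event` and the shortened field list are assumptions; the actual formatting rule is supplied by the user and may contain more fields, as in the example above.

```python
# Shortened, hypothetical field order; the example above lists many more fields.
FIELDS = ["date", "plugin_id", "plugin_sid", "src_ip", "dst_ip", "username"]


def format_event(key_info: dict, event_type: str = "detector") -> str:
    """Render key information as name="value" pairs in a fixed order."""
    parts = [f'Event_type="{event_type}"']
    for name in FIELDS:
        # Missing fields are emitted as empty strings so the layout stays fixed.
        parts.append(f'{name}="{key_info.get(name, "")}"')
    return "[" + " ".join(parts) + "]"


event = format_event(
    {"plugin_id": 100000, "plugin_sid": 24, "src_ip": "10.180.38.33"}
)
```

Emitting every field, even when empty, keeps the position of each field stable, which is what makes later queries simple.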
Step 208, encrypting the key information according to the received encryption rule.
In the embodiment of the invention, when a user needs to encrypt the collected log before forwarding or storing it, the key information in the log can be encrypted according to an encryption rule. Given the importance of network security, encrypting with a domestic cryptographic algorithm specified by China when network information is exchanged between China and other countries greatly improves the security of the key information in the log.
For example, the method supports the AES encryption algorithm and the SM (national cryptographic) encryption algorithms. For encryption with a national cryptographic algorithm, the encryption mode and the encryption key are obtained first, and the key information in the collected log is then encrypted.
Step 209, according to the received forwarding rule or storage rule, performing forwarding processing or storage processing on the key information.
In the embodiment of the invention, logs are collected from the log sources by means of the plug-in, and after key information has been extracted from the collected logs, the key information can likewise be forwarded or stored in various ways by means of the plug-in, which improves the efficiency of log forwarding and storage.
For example, forwarding or storing the collected log includes, but is not limited to, the following ways: storing in a local file; sending by standard syslog; forwarding by standard TCP; forwarding by standard UDP; publishing to a redis server provided by the user; sending to a message queue provided by the user; storing in a distributed file system such as HDFS; storing in a mainstream database; and other modes.
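Two of these ways, UDP forwarding and local file storage, can be sketched with the Python standard library. The rule layout (a dictionary with `output`, `host`, `port`, and `path` keys) and the function names are hypothetical; the description does not fix the form of a forwarding or storage rule.

```python
import socket


def forward_udp(message: str, host: str, port: int) -> int:
    """Send the message as one UDP datagram; return the bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(message.encode("utf-8"), (host, port))


def store_file(message: str, path: str) -> None:
    """Append the message as one line to a local file."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(message + "\n")


def dispatch(message: str, rule: dict) -> None:
    """Apply a forwarding or storage rule to one piece of key information."""
    if rule["output"] == "udp":
        forward_udp(message, rule["host"], rule["port"])
    elif rule["output"] == "file":
        store_file(message, rule["path"])
    else:
        raise ValueError(f"unsupported output: {rule['output']}")
```

A rule such as `{"output": "udp", "host": "127.0.0.1", "port": 514}` would mirror the `[out-put]` section of the plug-in example; further outputs could be added as extra branches.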
As shown in fig. 3 and fig. 4, an embodiment of the present invention provides a log processing apparatus and a device in which the log processing apparatus is located. The apparatus embodiments may be implemented by software, by hardware, or by a combination of hardware and software. At the hardware level, fig. 3 is a hardware structure diagram of the device in which the log processing apparatus according to an embodiment of the present invention is located; in addition to the processor, the memory, the network interface, and the nonvolatile memory shown in fig. 3, the device may also include other hardware, such as a forwarding chip responsible for processing packets. Taking a software implementation as an example, as shown in fig. 4, the apparatus, as a logical apparatus, is formed by the CPU of the device in which it is located reading the corresponding computer program instructions from the nonvolatile memory into the memory and running them.
As shown in fig. 4, the log processing apparatus provided in this embodiment includes:
an obtaining module 401, configured to obtain a plurality of original logs collected by at least one collection method; each original log carries a plurality of attribute information;
a first determining module 402, configured to determine, according to each original log obtained by the obtaining module 401, a plug-in corresponding to the original log;
a loop determination module 403, configured to perform, for each original log: according to the plug-in determined by the first determining module 402, determining first identification information corresponding to the original log obtained by the obtaining module 401 and second identification information corresponding to each attribute information of the original log; judging whether the regular expression can be matched with the attribute information, and if so, determining the attribute information that can be matched with the regular expression as key information;
a second determining module 404, configured to, after receiving the first identification information and the second identification information sent by the user, determine the original log corresponding to the first identification information and the key information corresponding to the second identification information according to the first identification information and the second identification information determined by the loop determination module 403.
In an embodiment of the present invention, the obtaining module 401 may be configured to perform step 101 in the above-described method embodiment, the first determining module 402 may be configured to perform step 102 in the above-described method embodiment, the loop determination module 403 may be configured to perform step 103 in the above-described method embodiment, and the second determining module 404 may be configured to perform step 104 in the above-described method embodiment.
In an embodiment of the present invention, the first determining module is configured to classify the plurality of original logs and determine the plug-in corresponding to each type of original log; wherein each type of original log comprises at least one original log.
In one embodiment of the present invention, the plug-in comprises a plurality of sub-plug-ins;
the loop determination module is configured to perform the following steps:
for each type of original log, determining first identification information corresponding to each type of original log according to the plug-in;
and for each type of original log, determining second identification information corresponding to each attribute information of each original log in the type of original log according to the sub-plug-ins and the attribute information.
In an embodiment of the present invention, the loop determination module is configured to perform the following steps:
dividing a sub regular expression from the regular expressions used for attribute information matching to serve as a sub regular expression to be processed;
for each sub-regular expression to be processed, the following steps are executed:
S10, matching the sub regular expression to be processed with the attribute information;
S20, if the matching succeeds, determining the attribute information matched by the sub regular expression to be processed as key information;
S30, if the matching fails, continuing to divide a sub regular expression from the remaining part of the regular expression, excluding the sub regular expression to be processed, and executing S10 until no sub regular expression can be divided from the regular expression, and deleting the attribute information that cannot be matched with any sub regular expression.
In an embodiment of the present invention, the loop determination module is configured to perform the following steps:
taking the regular expression used for attribute information matching as a target regular expression;
A1, searching the target regular expression according to a preset searching sequence, determining the first space, and taking the characters before the first space as the sub regular expression to be processed;
A2, taking all characters and all spaces after the first space as the target regular expression, and executing A1 until no space is found.
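Steps A1 and A2 amount to repeatedly cutting the target regular expression at its first space. The sketch below assumes a left-to-right search order and treats the text remaining after the last space as the final sub regular expression; both points are assumptions, since the description only says that the loop stops when no space is found.

```python
def split_sub_expressions(regex: str):
    """Yield sub regular expressions delimited by spaces (steps A1 and A2)."""
    target = regex
    while True:
        index = target.find(" ")      # A1: search left to right for the first space
        if index == -1:
            break                     # no space left: stop searching
        if index > 0:
            yield target[:index]      # characters before the first space
        target = target[index + 1:]   # A2: the rest becomes the new target
    if target:
        yield target                  # assumption: the tail is the last sub-expression


subs = list(split_sub_expressions(r"\d+ [a-z]+ \S+"))
```

One consequence of this scheme is that a regular expression containing a literal space inside a single pattern would also be cut at that space.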
In one embodiment of the invention, the formatting module is configured to perform the following steps:
receiving a formatting rule;
for each piece of key information, the following steps are carried out:
dividing the key information into a plurality of data fields according to the formatting rule; wherein the plurality of data fields include data fields with different meanings, and each meaning has at least one data field;
filtering the plurality of data fields to select the data fields with different meanings;
classifying the selected data fields by meaning;
deduplicating the data fields of each meaning so that one data field with that meaning is retained.
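The division, filtering, classification, and deduplication performed by the formatting module can be sketched as follows. The sketch assumes that data fields are space-separated `name=value` tokens and that the field name stands for the field's meaning; the function name `normalize_fields` and the sample input are hypothetical.

```python
def normalize_fields(key_info: str, separator: str = " ") -> dict:
    """Split key information into fields and keep one field per meaning."""
    kept = {}
    for token in key_info.split(separator):   # divide into data fields
        if "=" not in token:
            continue                          # filter: skip tokens that are not fields
        name, value = token.split("=", 1)     # classify by field name
        kept.setdefault(name, value)          # deduplicate: keep the first occurrence
    return kept


fields = normalize_fields("src_ip=10.0.0.5 noise src_ip=10.0.0.5 username=root")
```

Keeping the first occurrence is an arbitrary choice made here; a formatting rule could equally prefer the last one.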
In one embodiment of the present invention, the encryption module is configured to perform the following steps:
receiving an encryption rule;
and encrypting the key information according to the encryption rule.
In one embodiment of the present invention, the processing module is configured to perform the following steps:
receiving a forwarding rule or a storage rule;
and carrying out forwarding processing or storage processing on the key information according to the forwarding rule or the storage rule.
It is to be understood that the schematic structure in the embodiment of the present invention does not specifically limit a log processing apparatus. In other embodiments of the invention, a log processing apparatus may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Because the content of information interaction, execution process, and the like among the modules in the device is based on the same concept as the method embodiment of the present invention, specific content can be referred to the description in the method embodiment of the present invention, and is not described herein again.
An embodiment of the present invention further provides a log processing apparatus, including: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor is configured to invoke the machine-readable program to perform a log processing method according to any embodiment of the present invention.
Embodiments of the present invention also provide a computer-readable medium storing instructions for causing a computer to perform a log processing method as described herein. Specifically, a system or an apparatus equipped with a storage medium may be provided, the storage medium storing software program code that realizes the functions of any of the above-described embodiments, and a computer (or a CPU or MPU) of the system or the apparatus is caused to read out and execute the program code stored in the storage medium.
In this case, the program code itself read from the storage medium can realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code constitute a part of the present invention.
Examples of the storage medium for supplying the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD + RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer via a communications network.
Further, it should be clear that the functions of any one of the above-described embodiments can be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like running on the computer to perform part or all of the actual operations based on instructions of the program code.
Further, it is to be understood that the program code read out from the storage medium is written to a memory provided in an expansion board inserted into the computer or to a memory provided in an expansion unit connected to the computer, and then causes a CPU or the like mounted on the expansion board or the expansion unit to perform part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above-described embodiments.
In summary, the log processing method, the log processing device and the readable medium provided by the embodiments of the present invention at least have the following advantages:
1. In the embodiment of the invention, a plurality of original logs collected by at least one collection mode are obtained; a plug-in corresponding to each original log is determined; first identification information corresponding to each original log and second identification information corresponding to each attribute information of each original log are determined according to the plug-in; a plurality of regular expressions are determined in the plug-in according to the original logs; whether the regular expressions can be matched with the attribute information is judged, and if so, the attribute information matched by the regular expressions is determined as key information; and the original log corresponding to the first identification information and the key information corresponding to the second identification information are determined according to the first identification information and the second identification information. By combining the plug-in form with the first identification information and the second identification information to determine the key information of the log to be searched, the embodiment of the invention can improve the efficiency of log collection.
2. In the embodiment of the invention, the plurality of original logs obtained through the plurality of collection modes are classified, and a plug-in corresponding to each type of original log is created, while ensuring that each type includes at least one log. This arrangement avoids large differences between plug-ins caused by differences between log types: logs of the same type share the same plug-in, and the differing attribute information of each log is filled into that plug-in, so that the plug-in remains standardized in everything except the log data. Logs of different types can thus be collected conveniently through plug-ins, which improves log collection efficiency.
3. In the embodiment of the invention, each log corresponds to one plug-in, and each of the plurality of attribute information in a log corresponds to a sub-plug-in. The plug-in and the sub-plug-ins are assigned identifiers, forming first identification information corresponding to each log and second identification information corresponding to each attribute information, so that logs can be distinguished quickly according to the first identification information of the plug-in and the second identification information of the sub-plug-ins.
4. In the embodiment of the invention, the extracted key information is formatted according to the set formatting rule to form a fixed format, when a user needs to check a plurality of pieces of key information in the log, the key information required by the user can be quickly searched according to the formatted key information, and by the arrangement, the complexity of query when the key information has no fixed format can be effectively avoided.
5. In the embodiment of the invention, when a user needs to encrypt the collected log before forwarding or storing it, the key information in the log can be encrypted according to an encryption rule. Given the importance of network security, encrypting with a domestic cryptographic algorithm specified by China when network information is exchanged between China and other countries improves the security of the key information in the log.
6. In the embodiment of the invention, logs are collected from the log sources by means of the plug-in, and after key information has been extracted from the collected logs, the key information can likewise be forwarded or stored in various ways by means of the plug-in, which improves the efficiency of log forwarding and storage.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (9)

1. A log processing method, comprising:
acquiring a plurality of original logs acquired by at least one acquisition mode; each original log carries a plurality of attribute information;
determining a plug-in corresponding to each original log according to each original log; each plug-in comprises a plurality of regular expressions;
for each original log, the following steps are performed:
S1, according to the plug-in, determining first identification information corresponding to the original log and second identification information corresponding to each attribute information of the original log;
S2, judging whether the regular expression can be matched with the attribute information, and if so, determining the attribute information that can be matched with the regular expression as key information;
receiving first identification information and second identification information sent by a user, and determining an original log corresponding to the first identification information and key information corresponding to the second identification information according to the first identification information and the second identification information;
wherein,
the determining, as key information, attribute information that can be matched with the regular expression includes:
dividing a sub regular expression from the regular expressions used for attribute information matching to serve as a sub regular expression to be processed;
for each sub-regular expression to be processed, the following steps are executed:
S10, matching the sub regular expression to be processed with the attribute information;
S20, if the matching succeeds, determining the attribute information matched by the sub regular expression to be processed as key information;
S30, if the matching fails, continuing to divide a sub regular expression from the remaining part of the regular expression, excluding the sub regular expression to be processed, and executing S10 until no sub regular expression can be divided from the regular expression, and deleting the attribute information that cannot be matched with any sub regular expression.
2. The method of claim 1, wherein determining, from each raw log, a plug-in corresponding to the raw log comprises:
classifying the plurality of original logs, and determining plug-ins corresponding to each type of original logs;
wherein each type of original log comprises at least one original log.
3. The method of claim 2, wherein the plug-in comprises a plurality of sub-plug-ins;
the determining, according to the plug-in, first identification information corresponding to the original log and second identification information corresponding to each attribute information of the original log includes:
for each type of original log, the following operations are performed:
determining first identification information corresponding to each original log in the type of original log according to the plug-in;
and determining second identification information corresponding to each attribute information of each original log in the type of original log according to the sub-plug-ins and the attribute information.
4. The method of claim 1, wherein the regular expression comprises at least two characters, with a space between every two adjacent characters; wherein the characters comprise combined characters and/or single characters;
the dividing of one sub regular expression from the regular expressions for attribute information matching includes:
taking the regular expression used for attribute information matching as a target regular expression;
A1, searching the target regular expression according to a preset searching sequence, determining the first space, and taking the characters before the first space as the sub regular expression to be processed;
A2, taking all characters and all spaces after the first space as the target regular expression, and executing A1 until no space is found.
5. The method of claim 1, further comprising:
receiving a formatting rule;
for each piece of key information, the following steps are carried out:
dividing the key information into a plurality of data fields according to the formatting rule; wherein the plurality of data fields include data fields with different meanings, and each meaning has at least one data field;
filtering the plurality of data fields to select the data fields with different meanings;
classifying the selected data fields by meaning;
deduplicating the data fields of each meaning so that one data field with that meaning is retained.
6. The method of claim 1, further comprising:
receiving an encryption rule;
encrypting the key information according to the encryption rule;
or,
receiving a forwarding rule or a storage rule;
and carrying out forwarding processing or storage processing on the key information according to the forwarding rule or the storage rule.
7. A log processing apparatus, comprising:
the acquisition module is used for acquiring a plurality of original logs acquired by at least one acquisition mode; each original log carries a plurality of attribute information;
the first determining module is used for determining a plug-in corresponding to each original log according to each original log acquired by the acquiring module; each plug-in comprises a plurality of regular expressions;
a loop determination module to perform, for each raw log: according to the plug-in determined by the first determining module, determining first identification information corresponding to the original log and second identification information corresponding to each attribute information of the original log; judging whether the regular expression can be matched with the attribute information, if so, determining the information of the regular expression successfully matched with the attribute information as key information; the loop determination module is specifically configured to perform the following steps:
dividing a sub regular expression from the regular expressions used for attribute information matching to serve as a sub regular expression to be processed;
for each sub-regular expression to be processed, the following steps are executed:
S10, matching the sub regular expression to be processed with the attribute information;
S20, if the matching succeeds, determining the attribute information matched by the sub regular expression to be processed as key information;
S30, if the matching fails, continuing to divide a sub regular expression from the remaining part of the regular expression, excluding the sub regular expression to be processed, and executing S10 until no sub regular expression can be divided from the regular expression, and deleting the attribute information that cannot be matched with any sub regular expression;
and the second determining module is used for determining, after receiving the first identification information and the second identification information sent by the user, an original log corresponding to the first identification information and key information corresponding to the second identification information according to the first identification information and the second identification information determined by the loop determination module.
8. A log processing apparatus, comprising: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor, configured to invoke the machine readable program, to perform the method of any of claims 1 to 6.
9. A computer-readable medium, characterized in that it has stored thereon computer instructions which, when executed by a processor, cause the processor to carry out the method of any one of claims 1 to 6.
CN202010874406.8A 2020-08-27 2020-08-27 Log processing method and device and readable medium Active CN111737091B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010874406.8A CN111737091B (en) 2020-08-27 2020-08-27 Log processing method and device and readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010874406.8A CN111737091B (en) 2020-08-27 2020-08-27 Log processing method and device and readable medium

Publications (2)

Publication Number Publication Date
CN111737091A CN111737091A (en) 2020-10-02
CN111737091B true CN111737091B (en) 2020-12-08

Family

ID=72658895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010874406.8A Active CN111737091B (en) 2020-08-27 2020-08-27 Log processing method and device and readable medium

Country Status (1)

Country Link
CN (1) CN111737091B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112306979B (en) * 2020-10-30 2022-11-01 浪潮通用软件有限公司 Message queue-based log information processing method and device and readable medium
CN113179176B (en) * 2021-03-31 2022-05-27 新华三信息安全技术有限公司 Log processing method, device and equipment and machine readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105005549A (en) * 2015-07-31 2015-10-28 山东蚁巡网络科技有限公司 User-defined chained log analysis device and method
CN105138593A (en) * 2015-07-31 2015-12-09 山东蚁巡网络科技有限公司 Method for extracting log key information in user-defined way by using regular expressions
CN106021554A (en) * 2016-05-30 2016-10-12 北京奇艺世纪科技有限公司 Log analysis method and device
CN107451034A (en) * 2017-08-17 2017-12-08 浪潮软件股份有限公司 Big data cluster log management apparatus, method and system
US10521331B1 (en) * 2018-08-31 2019-12-31 The Mitre Corporation Systems and methods for declarative specification, detection, and evaluation of happened-before relationships
CN110990218A (en) * 2019-11-22 2020-04-10 深圳前海环融联易信息科技服务有限公司 Visualization and alarm method and device based on mass logs and computer equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108595310A (en) * 2017-12-28 2018-09-28 北京兰云科技有限公司 Log processing method and device

Also Published As

Publication number Publication date
CN111737091A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
US10917417B2 (en) Method, apparatus, server, and storage medium for network security joint defense
US8069210B2 (en) Graph based bot-user detection
CN111737091B (en) Log processing method and device and readable medium
EP3048772B1 (en) Representing identity data relationships using graphs
Ye et al. NetPlier: Probabilistic Network Protocol Reverse Engineering from Message Traces.
CN113542253B (en) Network flow detection method, device, equipment and medium
WO2020252897A1 (en) Distributed link data authentication method, device and apparatus, and storage medium
CN114157502B (en) Terminal identification method and device, electronic equipment and storage medium
CN110868409A (en) Passive operating system identification method and system based on TCP/IP protocol stack fingerprint
Yu et al. Large-scale IoT devices firmware identification based on weak password
CN115017519A (en) Data sealing regularity detecting method and device
Cankaya et al. A survey of digital forensics tools for database extraction
CN115766258A (en) Multi-stage attack trend prediction method and device based on causal graph and storage medium
Gomez et al. Unsupervised detection and clustering of malicious tls flows
CN105207829B (en) Intrusion detection data processing method, device and system
CN116668157A (en) API interface identification processing method, device and medium based on zero trust gateway log
Garn et al. Browser fingerprinting using combinatorial sequence testing
CN111814423B (en) Log formatting method and device and storage medium
CN111741029B (en) Log data processing method, processing device and storage medium
US11997130B2 (en) Inline detection of encrypted malicious network sessions
Anderson et al. The Generation and Use of TLS Fingerprints
KR101886526B1 (en) Method and system for specifying payload signature for elaborate application traffic classification
CN116032545B (en) Multi-stage filtering method and system for ssl or tls flow
RU2808385C1 (en) Method for classifying objects to prevent spread of malicious activity
US11693851B2 (en) Permutation-based clustering of computer-generated data entries

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant