CN117827784A - Noise log filtering method and system - Google Patents

Publication number
CN117827784A
Authority
CN
China
Prior art keywords
log
noise
template
filtering
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410011273.XA
Other languages
Chinese (zh)
Inventor
涂彦伦
徐成
徐辰
陈忻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced Nova Technology Singapore Holdings Ltd
Original Assignee
Advanced Nova Technology Singapore Holdings Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced Nova Technology Singapore Holdings Ltd filed Critical Advanced Nova Technology Singapore Holdings Ltd
Priority to CN202410011273.XA priority Critical patent/CN117827784A/en
Publication of CN117827784A publication Critical patent/CN117827784A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/13File access structures, e.g. distributed indices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present disclosure provides a method and a system for filtering a noise log. A log to be detected is obtained, a pre-deployed online service is invoked, and noise filtering processing is performed on the log to be detected based on the online service. The online service is deployed based on a pre-constructed current log template, and the current log template is a first log template or a second log template. The first log template is a first fixed depth analysis tree for filtering a noise log, obtained by offline training according to first noise data obtained through online monitoring. The second log template is a second fixed depth analysis tree for filtering a noise log, obtained by updating a previous log template in an offline manner according to second noise data obtained through online monitoring, where the previous log template corresponding to the first update in the update stage is the first log template. In this way, a closed loop of monitoring, constructing, invoking, and updating the log template can be implemented, and filtering can be automated.

Description

Noise log filtering method and system
Technical Field
The present specification relates to the technical field of artificial intelligence, and in particular to a noise log filtering method and system.
Background
In a fault root cause positioning scenario, logs are an important means of root cause positioning, and log filtering is the basis of root cause positioning. For example, a fault root cause positioning system usually filters the logs first and then performs root cause positioning based on the root cause exception logs that are retained.
In the related art, a system may filter logs based on word frequency, for example by counting how often words or consecutive words occur in the logs, constructing from those frequencies a log template that characterizes noise logs, and filtering logs based on that log template.
However, when a system filters logs in this way, technical problems arise such as low filtering accuracy, poor stability, and poor continuity.
It should be noted that the content of the related art is only information known to the inventor, and does not represent that the information has entered the public domain before the filing date of the present disclosure, or that it may be the prior art of the present disclosure.
Disclosure of Invention
The present disclosure provides a method and a system for filtering a noise log, so as to avoid at least one of the above technical problems.
In a first aspect, the present disclosure provides a method for filtering a noise log, including: obtaining a log to be detected that is obtained through online monitoring;
invoking a pre-deployed online service, and performing noise filtering processing on the log to be detected based on the online service, wherein the online service is deployed based on a pre-constructed current log template, and the current log template is a first log template or a second log template; and
the first log template is a first fixed depth analysis tree for filtering a noise log, obtained in a training stage by offline training according to first noise data obtained through online monitoring;
the second log template is a second fixed depth analysis tree for filtering a noise log, obtained in an updating stage by updating a previous log template in an offline manner according to second noise data obtained through online monitoring, wherein the previous log template corresponding to the first update in the updating stage is the first log template.
In a second aspect, the present disclosure provides a fault root location system for filtering noise logs, the system comprising: a real-time online detection module, a data reading module, a time series database, a training module, a deployment module, and a log template updating module; wherein,
the real-time online detection module is configured to obtain a log to be detected that is obtained through online monitoring, invoke a pre-deployed online service, and perform noise filtering processing on the log to be detected based on the online service, wherein the online service is deployed by the deployment module based on a pre-constructed current log template, and the current log template is a first log template or a second log template; and
the training module is configured to, in a training stage, train in an offline manner on first noise data obtained from the time series database to obtain the first log template, wherein the first log template is a first fixed depth analysis tree for filtering a noise log, and the first noise data is obtained from the real-time online detection module by the data reading module in an offline manner;
the log template updating module is configured to, in an updating stage, update a previous log template in an offline manner according to second noise data obtained from the time series database to obtain the second log template, wherein the second log template is a second fixed depth analysis tree for filtering a noise log, the second noise data is obtained from the real-time online detection module by the data reading module in an offline manner, and the previous log template corresponding to the first update in the updating stage is the first log template.
In a third aspect, the present disclosure provides a noise log filtering system comprising:
at least one memory including at least one set of instructions to push information;
at least one processor communicatively coupled to the at least one memory,
wherein, when the system is running, the at least one processor executes the at least one set of instructions to implement the method of the first aspect.
In a fourth aspect, the present disclosure provides a processor-readable storage medium storing a computer program for causing the processor to perform the method of the first aspect.
In a fifth aspect, the present disclosure provides an electronic device comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory to implement the method as described in the first aspect.
In a sixth aspect, the present disclosure provides a computer program product comprising a computer program which, when executed by a processor, implements the method of the first aspect.
The present disclosure provides a method and a system for filtering a noise log. A log to be detected that is obtained through online monitoring is obtained, a pre-deployed online service is invoked, and noise filtering processing is performed on the log to be detected based on the online service. The online service is deployed based on a pre-constructed current log template, and the current log template is a first log template or a second log template. The first log template is a first fixed depth analysis tree for filtering a noise log, obtained by offline training according to first noise data obtained through online monitoring. The second log template is a second fixed depth analysis tree for filtering a noise log, obtained by updating a previous log template in an offline manner according to second noise data obtained through online monitoring, where the previous log template corresponding to the first update in the update stage is the first log template.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present description, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present description, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an application scenario of a method for filtering a noise log according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of a method of filtering a noise log according to an embodiment of the disclosure;
FIG. 3 is a schematic diagram of noise filtering processing of logs to be detected based on online service according to an embodiment of the disclosure;
FIG. 4 is a schematic diagram of training to obtain a first log template according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a structure of a fixed depth parse tree according to an embodiment of the disclosure;
FIG. 6 is a schematic diagram of a fault root location system of one embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a fault root location system according to another embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
It should be understood that the terms "comprises" and "comprising," and any variations thereof, in the embodiments of the disclosure are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to those elements expressly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
The term "and/or" in the embodiments of the present disclosure describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates that the objects before and after it are in an "or" relationship.
The term "plurality" in the embodiments of the present disclosure means two or more, and other adjectives are similar thereto.
The terms "first," "second," "third," and the like in this disclosure are used to distinguish between similar objects or entities and do not necessarily imply a particular order or sequence, unless otherwise indicated. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the disclosure are, for example, capable of operation in sequences other than those illustrated or otherwise described herein.
The term "unit/module" as used in this disclosure refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware or/and software code that is capable of performing the function associated with that element.
To facilitate the reader's understanding of this disclosure, at least some of the terms of this disclosure are now explained as follows:
A log refers to the information of various kinds that a target object (such as a program) generates while running; the system uses it to record the running state of the target object so that relevant personnel (such as the operation and maintenance personnel of the target object) can monitor and troubleshoot the target object. A log typically contains rich content such as timestamps, Internet Protocol (IP) addresses, success codes, error codes, and names of called methods.
A noise log is a log that occurs frequently in day-to-day operation but does not reflect the root cause of a fault of the target object. The meaning of the root cause may differ between application scenarios. For example, the root cause is the cause of a failure of the target object (such as a target application); specifically, it may be a preset cause of operation failure of the target object that the system determines based on the application scenario, the fundamental cause of the failure of the target object, or a specific cause of the failure of the target object.
An exception log, which may also be referred to as a root cause exception log, is a log that the system obtains to help the operation and maintenance personnel of the target object perform root cause positioning, and it generally contains effective information related to a fault of the target object.
A log template refers to a group of log entries composed of wildcards and constants that follow specific patterns, and can be used for log template matching.
A wildcard is a special symbol, mainly the asterisk (*) and the question mark (?), that can stand in for one or more real characters, for example when searching for a file or folder. Wildcards are often used in place of real characters when the real characters are not known or when the user does not want to type the complete name.
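As a minimal illustration of wildcard matching, the sketch below uses the standard Python fnmatch module. The template and the log lines are invented for illustration and are not taken from this disclosure.

```python
from fnmatch import fnmatchcase

# Hypothetical template made of constants and wildcards:
# "*" matches any run of characters, "?" matches exactly one character.
template = "pay order * failed: code=E???"

logs = [
    "pay order 20240101 failed: code=E404",  # matches
    "pay order 20240102 failed: code=E5",    # too few characters after "E"
    "refund order 20240103 ok",              # different constants
]

# fnmatchcase does a case-sensitive match of the whole string.
matches = [fnmatchcase(log, template) for log in logs]
print(matches)
```

Only the first line matches the template, because the constants must agree exactly while the wildcards absorb the variable parts.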
Fault root cause positioning refers to a method by which the system mines root causes when a fault of the target object has occurred or is likely to occur, so as to obtain the real cause of the fault.
Fixed depth analysis tree clustering refers to a method in which the system performs clustering by using a fixed depth analysis tree. The depth of the tree is a preset fixed value. The fixed depth analysis tree comprises a root node, internal nodes, and leaf nodes; the tree is configured with a series of rules for routing from the root node to the internal nodes and then to the leaf nodes, and each leaf node holds a set of clusters.
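The routing idea can be sketched as follows. This is only an illustrative sketch: the disclosure does not spell out the routing rules at this point, so the rules used here (route first by token count, then by leading tokens, treating tokens that contain digits as variables) are assumptions, as are all class and function names.

```python
from dataclasses import dataclass, field

WILDCARD = "<*>"  # hypothetical marker for a variable token


@dataclass
class Node:
    children: dict = field(default_factory=dict)  # routing key -> child Node
    clusters: list = field(default_factory=list)  # filled only on leaf nodes


class FixedDepthTree:
    def __init__(self, depth=4):
        # depth = number of routing steps from root to leaf (assumed >= 1)
        self.depth = depth
        self.root = Node()

    def route_keys(self, tokens):
        # Assumed rules: step 1 routes by token count, later steps by leading tokens.
        leading = list(tokens[: self.depth - 1])
        leading += [""] * (self.depth - 1 - len(leading))  # pad short lines
        masked = [WILDCARD if any(c.isdigit() for c in t) else t for t in leading]
        return [len(tokens)] + masked

    def leaf_for(self, line):
        node = self.root
        for key in self.route_keys(line.split()):
            node = node.children.setdefault(key, Node())
        return node  # always exactly `depth` steps below the root


tree = FixedDepthTree(depth=4)
a = tree.leaf_for("pay order 1001 failed")
b = tree.leaf_for("pay order 2002 failed")
c = tree.leaf_for("refund order 1001 failed")
print(a is b, a is c)
```

Under these assumed rules the first two lines reach the same leaf, since they differ only in a numeric token, while the third line is routed to a different leaf. Because every path has the same length, lookup cost does not grow with the number of stored templates along a path.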
In this embodiment, cluster center updating and cluster creation may be performed according to a similarity index, and the similarity may specifically be the Jaccard similarity.
Jaccard similarity refers to a method for evaluating the similarity of two sets, and the Jaccard similarity index can be obtained by dividing the number of intersections of the two sets by the number of union of the two sets.
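A minimal sketch of the Jaccard similarity, and of using it to decide between updating a cluster center and creating a new cluster, might look like this. The threshold of 0.5, the "<*>" marker, and the function names are assumptions made for illustration; the sketch also assumes that all cluster centers in one leaf have the same token count.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(set_a & set_b) / len(union)


def assign(tokens, clusters, threshold=0.5):
    """Join the most similar cluster center or create a new cluster.

    `clusters` is a list of token lists acting as cluster centers.
    """
    best = max(clusters, key=lambda c: jaccard(tokens, c), default=None)
    if (best is not None and len(best) == len(tokens)
            and jaccard(tokens, best) >= threshold):
        # Update the center: positions that differ become wildcards.
        best[:] = [b if b == t else "<*>" for b, t in zip(best, tokens)]
        return best
    clusters.append(list(tokens))
    return clusters[-1]


sim = jaccard("pay order 1001 failed".split(), "pay order 2002 failed".split())

clusters = []
assign("pay order 1001 failed".split(), clusters)
assign("pay order 2002 failed".split(), clusters)  # similar: center is updated
assign("disk full".split(), clusters)              # dissimilar: new cluster
print(sim, clusters)
```

Here the two payment lines share 3 of 5 distinct tokens, so their similarity is 3/5 = 0.6 and they end up in one cluster whose center is "pay order <*> failed".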
A time series database is mainly used for processing time-stamped data (data that changes in chronological order, that is, time-serialized data); time-stamped data is also called time series data.
In a fault root cause positioning scenario, logs are an important means of root cause positioning. For example, a running program generates various logs that record its running state so that relevant personnel can monitor and troubleshoot it; these logs generally contain rich information such as timestamps, IP addresses, success codes, error codes, and names of called methods. When the monitoring system raises an alarm indicating that a fault has occurred or may occur, a root cause positioning scheme based on service logs searches the logs related to the alarming application, screens out the logs in which exceptions occur according to service rules, and extracts effective information from them to determine the cause of the alarm or fault. In a practical business scenario, however, failure log lines often contain noise. For example, when a user chooses to pay for goods that have just sold out, the purchase fails; this outcome is reasonable, so the log of the failed payment should not be regarded as a true exception log. The problem is that operation and maintenance personnel who do not know the semantics of the specific business cannot accurately exclude such logs from the exception range, and such failure logs may occur every minute. Moreover, noise failure logs do not help operation and maintenance personnel find the root cause when a fault actually occurs. An effective noise failure log filtering scheme is therefore needed to remove common noise logs and help operation and maintenance personnel locate the root cause more efficiently and accurately.
In one technical scheme of the related art, the system may filter the obtained log based on word frequency to confirm whether the log is a noise log or a root cause abnormal log.
For example, the system may construct a noise log template based on a frequent item mining strategy. In this process the system counts how often words or consecutive words occur in the logs. For a given log, if the occurrence frequency of a word is greater than a given threshold, the system determines that the word is a frequent word; the system extracts the frequent words in the log together with their positions as clustering candidates, and generates corresponding clusters according to their frequency of occurrence across all logs. After clusters have been generated for all noise logs, the system can determine whether a given log to be filtered is a noise log by checking whether it matches an existing cluster. However, this method suffers from at least one of the following disadvantages:
The threshold used to identify frequent words is difficult to determine. For example, different pieces of program code may print logs at very unequal rates; for code that prints rarely, the constants in the logs it generates also occur rarely and may be mistaken for parameters.
In a streaming scenario, a word-frequency-based scheme needs an additional fusion module to keep the log template stable and continuous, which makes it relatively more complex and more costly.
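The threshold problem described above can be reproduced with a small sketch of a word-frequency approach. This is a simplified illustration and not the exact related-art algorithm; the function name, the "<*>" marker, and the sample logs are assumptions.

```python
from collections import Counter


def frequent_word_templates(logs, min_count):
    """Keep (position, word) pairs seen at least `min_count` times; mask the rest."""
    counts = Counter(
        (i, tok) for line in logs for i, tok in enumerate(line.split())
    )
    return [
        " ".join(
            tok if counts[(i, tok)] >= min_count else "<*>"
            for i, tok in enumerate(line.split())
        )
        for line in logs
    ]


logs = [
    "pay order 1001 failed",
    "pay order 2002 failed",
    "pay order 3003 failed",
    "cache miss key42",  # printed rarely
]
templates = frequent_word_templates(logs, min_count=2)
print(templates)
```

The three payment lines collapse to "pay order <*> failed", but the rarely printed line becomes "<*> <*> <*>": its constants "cache" and "miss" fall below the threshold and are mistaken for parameters.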
In another technical solution of the related art, the system may filter the obtained log based on a heuristic noise log filtering method.
For example, based on heuristic log pattern parsing, the system may use heuristic strategies to group logs and generate a log template; for a given log to be filtered, the system may determine whether it is a noise log by checking whether it matches the log template. The heuristic strategies include grouping by log length, grouping by word position, further constructing a log pattern library, and the like. However, this method has the following disadvantage:
Its generality is poor. Logs come in many types, and noise log filtering needs to be applied in different scenarios, so a single set of heuristic strategies is not necessarily applicable to all types of logs or to all application scenarios.
It should be noted that the content of the related art is only information known to the inventor, and does not represent that the information has entered the public domain before the filing date of the present disclosure, or that it may be the prior art of the present disclosure.
In order to avoid at least one of the above problems, the inventors, through inventive effort, arrived at the technical idea of the present disclosure: a noise log filtering scheme capable of automatic training, deployment, and updating. The system trains and updates the log template, and performs noise filtering on the log to be detected based on an online service deployed from the trained or updated log template, thereby forming a technical scheme of automatic training, automatic deployment, automatic invocation, and automatic updating of the log template.
Before explaining the implementation principle of the filtering method of the noise log of the present disclosure, an application scenario of the filtering method of the noise log of the present disclosure is described exemplarily to deepen the understanding of the reader on the filtering method of the noise log of the present disclosure.
Fig. 1 is an application scenario diagram of a method for filtering a noise log according to an embodiment of the present disclosure, where the method for filtering a noise log of the present disclosure may be applied to the system 100 for filtering a noise log shown in fig. 1. As shown in fig. 1, a noise log filtering system 100 may include a target user 101, a client 102, a server 103, and a network 104.
The target user 101 may be a user that triggers filtering of the log to be detected, and the target user 101 may perform filtering of the noise log at the client 102.
The client 102 may be a device that performs noise log filtering on the log to be detected in response to the filtering operation of the target user 101. That is, the method of filtering the noise log may be performed on the client 102. In this case, the client 102 may store data or instructions for performing the filtering method of the noise log described in the present specification, and may execute or be used to execute the data or instructions. In some embodiments, the client 102 may include a hardware device having a data information processing function and a program necessary to drive the hardware device to operate.
As shown in fig. 1, a client 102 may be communicatively connected to a server 103. The server 103 may be in communication with one client 102 or may be in communication with a plurality of clients 102. In some embodiments, client 102 may interact with server 103 over network 104 to receive or send messages, etc.
In some embodiments, the client 102 may include a mobile device, a tablet, a laptop, a built-in device of a motor vehicle, or the like, or any combination thereof. In some embodiments, the mobile device may include a smart home device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home device may include a smart television, desktop computer, or the like, or any combination. In some embodiments, the smart mobile device may include a smart phone, personal digital assistant, gaming device, navigation device, etc., or any combination thereof. In some embodiments, built-in devices in a motor vehicle may include an on-board computer, an on-board television, and the like.
In some embodiments, the client 102 may be installed with one or more Applications (APP). The APP can provide the target user 101 with the ability to interact with the outside world through the network 104. APP includes, but is not limited to: web browser-like APP programs, search-like APP programs, chat-like APP programs, shopping-like APP programs, video-like APP programs, financial-like APP programs, instant messaging tools, mailbox clients, social platform software, and the like. In some embodiments, the client 102 may have a target APP installed thereon. The target APP can collect logs to be detected for the client 102.
The server 103 may be a server that provides various services, for example, a background server that provides support for filtering noise logs for a plurality of accounts and user data sets and account login information corresponding to the plurality of accounts collected on the client 102.
In some embodiments, the method of filtering the noise log may be performed on the server 103. At this time, the server 103 may store data or instructions to perform the filtering method of the noise log described in the present specification, and may execute or be used to execute the data or instructions.
In some embodiments, the server 103 may include a hardware device having a data information processing function and a program necessary to drive the hardware device to operate. Similarly, the server 103 may be communicatively connected to one client 102 and receive data transmitted from the client 102, or may be communicatively connected to a plurality of clients 102 and receive data transmitted from each client 102.
Network 104 is a medium used to provide communication connections between clients 102 and servers 103. The network 104 may facilitate the exchange of information or data. As shown in fig. 1, a client 102 and a server 103 may be connected to a network 104, respectively, and mutually transmit information or data through the network 104.
In some embodiments, the network 104 may be any type of wired or wireless network, or a combination thereof. For example, the network 104 may include a cable network, a wired network, a fiber optic network, a telecommunications network, an intranet, the Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public switched telephone network (PSTN), a Bluetooth(TM) network, a short-range wireless network (ZigBee(TM)), a near field communication (NFC) network, or the like.
In some embodiments, network 104 may include one or more network access points. For example, the network 104 may include a wired or wireless network access point, such as a base station or an internet switching point, through which one or more components of the client 102 and server 103 may connect to the network 104 to exchange data or information.
It should be understood that the number of clients 102, servers 103, and networks 104 in FIG. 1 is merely illustrative. There may be any number of clients 102, servers 103, and networks 104, as desired for implementation. The filtering method of the noise log provided by the disclosure may be executed entirely on the client 102, entirely on the server 103, or partially on the client 102 and partially on the server 103.
That is, fig. 1 and the above description with respect to fig. 1 are merely exemplary to illustrate application scenarios to which the filtering method of the noise log of the present disclosure may be applied, and are not to be construed as limiting the application scenarios.
Referring to fig. 2, fig. 2 is a schematic diagram of a method for filtering a noise log according to an embodiment of the disclosure. As shown in fig. 2, the method includes:
S201: Obtain a log to be detected that is obtained through online monitoring.
The execution body of the present embodiment may be a filtering device of a noise log (hereinafter simply referred to as a filtering device), and the filtering device may be a server, a terminal device, a processor, a chip, or the like, which are not listed here.
If the filtering device is a server, it may be an independent server or a server cluster, and it may be a cloud server or a local server, which is not limited in this embodiment.
For example, in connection with the application scenario shown in FIG. 1, the filtering device may be a client, a server, or a filtering system including the client and the server.
The manner of acquiring the log to be detected is not limited in this embodiment, for example:
in one example, the filtering device may connect with other devices and receive logs to be detected sent by the other devices.
For example, taking the application scenario shown in fig. 1 as an example, the filtering device may be a server shown in fig. 1, and the other devices may be clients shown in fig. 1, where a user may input a log to be detected on the client by using an APP or a non-APP manner, so as to trigger the client to send the log to be detected to the server.
In another example, the filtering device may provide a tool to load the log to be detected, through which the user may transmit the log to be detected to the filtering device.
The tool for loading the log to be detected may be an interface for connecting to an external device, such as an interface for connecting to another storage device, through which the log to be detected transmitted by the external device is obtained. The tool may also be a display device: for example, the filtering device may present on the display device an interface for loading the log to be detected, the user imports the log to be detected into the filtering device through that interface, and the filtering device obtains the imported log.
It should be noted that, in combination with the above ways in which the filtering device obtains the log to be detected, if the log to be detected is collected by the filtering device itself, the filtering device may obtain it through online monitoring. If the log to be detected is transmitted to the filtering device by another device, the other device may obtain it through online monitoring, and the filtering device monitors online whether the other device has transmitted a log to be detected, so that the log to be detected is likewise obtained through online monitoring.
S202: and calling a pre-deployed online service, and performing noise filtering processing on the log to be detected based on the online service, wherein the online service is deployed based on a pre-constructed current log template, and the current log template is a first log template or a second log template. And
The first log template is a first fixed depth parse tree for filtering noise logs, obtained in a training stage by training offline on first noise data obtained through online monitoring.
The second log template is a second fixed depth parse tree for filtering noise logs, obtained in an updating stage by updating the previous log template offline according to second noise data obtained through online monitoring, wherein the previous log template corresponding to the first update in the updating stage is the first log template.
For example, the online service may be an online service that is deployed by the filtering device based on the current log template, or may be an online service that is deployed by other devices based on the current log template. Similarly, the current log template may be pre-built for the filtering device or may be pre-built for other devices. For the convenience of readers to understand, the present embodiment will be described in detail by taking the construction of the current log template and the deployment of the current log template as an example of the execution of the filtering device.
The filtering process can be divided into online operations and offline operations, based on whether the operation performed by the filtering device is carried out online or offline, and can also be divided into a training phase and an updating phase, based on whether the filtering device is training the log template or updating the log template.
Accordingly, this embodiment can be understood as: in the training stage, the filtering device monitors and obtains first noise data in an online operation mode, and trains and obtains a first log template by utilizing the first noise data in an offline operation mode. And then, the filtering device deploys the online service corresponding to the first log template in an online operation mode. The filtering device monitors the log in an online operation mode, and calls online service in an online operation mode under the condition that the log to be detected is monitored, so that noise filtering processing is carried out on the log to be detected based on the online service.
It should be noted that, the scenario where the filtering device is applied may be a streaming scenario, that is, the filtering device is a device that continuously operates, so after the training phase, the filtering device still continuously operates, and enters the updating phase, where the filtering device monitors the noise data updated for the first time by the online operation mode in the case of the first update in the updating phase, and updates the first log template by using the noise data updated for the first time by the offline operation mode to obtain the log template updated for the first time. Similarly, the filtering device deploys online services corresponding to the log templates updated for the first time in an online operation mode. The filtering device monitors the log in an online operation mode, and calls online service deployed based on the log template updated for the first time in an online operation mode under the condition that the log to be detected is monitored, so as to perform noise filtering processing on the log to be detected based on the online service.
Similarly, after the first log template has been updated for the first time, the filtering device continues to operate and performs the second update. When the filtering device obtains the noise data for the second update through online operation, it may use that noise data to update, through offline operation, the log template obtained by the first update, thereby obtaining the log template updated for the second time. The filtering device then deploys, through online operation, the online service corresponding to the log template updated for the second time. The filtering device monitors logs through online operation and, when a log to be detected is monitored, calls the online service deployed based on the log template updated for the second time, so as to perform noise filtering processing on the log to be detected based on that online service.
It should be understood that the operation of performing noise filtering processing on the log to be detected and the operation of updating the log template may run in parallel, that is, the two operations do not interfere with each other. For example, the filtering device may update the log template through offline operation while it performs noise filtering processing on an obtained log to be detected through online operation. For another example, the filtering device may perform noise filtering processing through online operation whenever a log to be detected needs filtering, and may update the log template through offline operation whenever the log template needs updating; the two needs may arise at the same time or at different times.
And so on to form a filtering scheme for noise logs including the technical features of automatic training-automatic deployment-automatic invocation-automatic updating of log templates. Therefore, the filtering device can realize monitoring, constructing, calling and updating circulating logic by executing the filtering method of the noise log, realize automation of filtering and improve the effectiveness and reliability of filtering.
As can be seen in conjunction with fig. 3, in some embodiments, S202 may include the steps of:
S2021: and performing word segmentation on the log to be detected to obtain a first word segmentation set.
This embodiment does not limit the word segmentation method adopted by the filtering device; for example, the word segmentation method includes but is not limited to: dictionary-based word segmentation, understanding-based word segmentation, statistics-based word segmentation, and semantics-based word segmentation.
Correspondingly, when the filtering device performs word segmentation on the log to be detected based on any of the above word segmentation methods, a plurality of words can be obtained from the log to be detected; for ease of distinction, the set comprising the plurality of words is called the first word segmentation set.
As can be seen in conjunction with fig. 3, in some embodiments, S2021 may include the steps of:
S20211: performing primary word segmentation on a log to be detected based on preset prefabricated words to obtain an initial set, wherein the prefabricated words are determined based on an application scene of the method, and the initial set comprises words which are the same as the prefabricated words in the log to be detected and non-prefabricated words which are different from the prefabricated words.
Similarly, the prefabricated words may be predetermined by the filtering device based on the application scenario in which the method of this embodiment is executed, or may be determined by another device and transmitted to the filtering device; this embodiment is described by taking the case in which the filtering device determines the prefabricated words as an example. The filtering device can determine, by regular-expression matching, the words in the log to be detected that are the same as the prefabricated words. The prefabricated words may include IP addresses, times, and the like.
For example, for a certain application scenario, the filtering device may obtain a common word or a specific word in the application scenario, and determine the common word or the specific word as a prefabricated word.
Correspondingly, the filtering device can firstly determine whether the word hit by the prefabricated word exists in the log to be detected so as to distinguish the word which is the same as the prefabricated word in the log to be detected and the word which is different from the prefabricated word, so that an initial set is obtained.
S20212: and performing word segmentation processing on the non-prefabricated words again to obtain a first word segmentation set.
That is, for the same word as the prefabricated word in the log to be detected, the filtering device may directly add the same word to the first word segmentation set, and for the word different from the prefabricated word in the log to be detected, the filtering device may segment the word to obtain the segmented word, and add the segmented word to the first word segmentation set.
In this embodiment, the filtering device obtains the first word segmentation set through two passes of word segmentation, and the first pass is based on the prefabricated words of the specific application scenario, so that the word segmentation is more targeted and reliable and the first word segmentation set is more effective.
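For illustration only, the two passes of S20211 and S20212 can be sketched as follows. The sketch assumes that the prefabricated words are expressed as regular expressions for IPv4 addresses and timestamps and that the second pass simply splits on whitespace; the names PREFAB_PATTERNS and segment are hypothetical and do not come from the embodiment.

```python
import re

# Hypothetical prefabricated-word patterns. The embodiment only names
# IP addresses and times as examples; the exact patterns are assumptions.
PREFAB_PATTERNS = [
    r"\d{1,3}(?:\.\d{1,3}){3}",              # IPv4 address
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}",  # timestamp, e.g. 2024-01-03 12:00:01
]
PREFAB_RE = re.compile("|".join(f"(?:{p})" for p in PREFAB_PATTERNS))


def segment(log_line: str) -> list[str]:
    """Two-pass segmentation: keep prefabricated words whole, split the rest."""
    tokens = []
    pos = 0
    for match in PREFAB_RE.finditer(log_line):
        # Second pass: split the non-prefabricated text before this match.
        tokens.extend(log_line[pos:match.start()].split())
        # First pass hit: the prefabricated word is added unchanged.
        tokens.append(match.group())
        pos = match.end()
    # Second pass on whatever follows the last prefabricated word.
    tokens.extend(log_line[pos:].split())
    return tokens
```

With these assumed patterns, a timestamp that contains a space stays a single word, which a plain whitespace split would not achieve.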
S2022: and carrying out similarity calculation on the first word segmentation set and the current log template to obtain a similarity value.
The calculation method adopted by the similarity calculation of the filtering device in this embodiment is not limited, and includes, but is not limited to: jaccard similarity, cosine similarity, similarity calculated by distance, pearson correlation coefficient.
Correspondingly, when the filtering device performs similarity calculation between the first word segmentation set and the current log template based on any of the above calculation methods, similarity information between the first word segmentation set and the current log template can be obtained, and the similarity information may specifically be a similarity value.
As can be seen in conjunction with fig. 3, in some embodiments, S2022 may include the steps of:
S20221: and matching the first word segmentation set with preset keywords to obtain a matching result.
Similarly, the keywords may be preconfigured by the filtering device for the application scenario, and the keywords may be characterized as words biased toward the noise log in the application scenario, that is, the keywords have a high probability of being words included in the noise log.
In this step, the first word segmentation set may include a plurality of words, and for each word in the first word segmentation set, the filtering device may match the word against the keywords to obtain a corresponding matching result.
S20222: and distributing weights to the words in the first word-segmentation set according to the matching result to obtain weight information, wherein the weight information comprises a first weight and/or a second weight, the first weight is the weight distributed to the words identical to the keywords in the first word-segmentation set when the matching result represents that the words identical to the keywords in the first word-segmentation set are matched, and the second weight is the weight distributed to the words different from the keywords in the first word-segmentation set when the matching result represents that the words identical to the keywords are not matched in the first word-segmentation set, and the first weight is greater than the second weight.
For example, for any word in the first word segmentation set, if the matching result indicates that the word matches the keyword, that is, the word is the same as (or similar to) the keyword, the filtering device assigns the first weight to the word, and this weight is relatively large; conversely, if the matching result indicates that the word does not match the keyword, that is, the word is different from (or dissimilar to) the keyword, the filtering device assigns the second weight to the word, and this weight is relatively small.
S20223: and calculating the similarity between the first word segmentation set and the current log template according to the weight information to obtain a similarity value.
According to the analysis, the weight information can represent the same or similar situation between the first word segmentation set and the keywords, and further can represent the possibility degree of the first word segmentation set as the noise log, and the current log template represents the noise log, so that the similarity is calculated on the basis of the weight information, and the effectiveness and reliability of the similarity calculation can be improved.
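For illustration only, one possible way of combining the weights of S20222 with a set similarity in S20223 is a weighted Jaccard similarity. The embodiment does not fix the formula, so the formula, the default weight values 2.0 and 1.0, and the application of the same weights to words of the template are all assumptions of this sketch.

```python
def weighted_jaccard(tokens, template_tokens, keywords,
                     first_weight=2.0, second_weight=1.0):
    """Weighted Jaccard similarity (an assumed formula, not the claimed one).

    Words that match a keyword carry first_weight, all other words carry
    second_weight, with first_weight > second_weight.
    """
    a, b = set(tokens), set(template_tokens)
    union = a | b
    if not union:
        return 0.0

    def weight(word):
        return first_weight if word in keywords else second_weight

    shared = sum(weight(w) for w in a & b)
    total = sum(weight(w) for w in union)
    return shared / total
```

Under this assumed formula, a shared keyword raises the similarity value more than a shared ordinary word does, which reflects the intent that keywords are biased toward noise logs.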
In other embodiments, the current log template includes a plurality of current nodes, the plurality of current nodes including a current root node and a current leaf node, one current node corresponding to at least one cluster, one cluster corresponding to one cluster center; s2022 may include the steps of:
a first step of: and searching a plurality of current nodes to obtain a current target node corresponding to the first word segmentation set.
For example, the current log template has a fixed depth parse tree structure, and the filtering device may start searching from the root node until the current target node corresponding to the first word segmentation set is found, where the current target node may be a root node or a leaf node.
And a second step of: and carrying out respective corresponding Jaccard similarity calculation on the first word segmentation set and each target cluster center in the current target node to obtain respective corresponding similarity values of each target cluster center.
In this embodiment, the filtering device performs noise filtering processing on the log to be detected through the similarity (and specifically, the Jaccard similarity) instead of the word frequency, so that the filtering device is applicable to scenes with balanced or unbalanced log content distribution, and has stronger universality.
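For illustration only, the Jaccard similarity between the first word segmentation set and each target cluster center can be sketched as follows. The sketch assumes that a cluster center is stored as a list of words; the function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two word collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarities_to_centers(tokens, cluster_centers):
    """One similarity value per target cluster center, in the same order."""
    return [jaccard(tokens, center) for center in cluster_centers]
```

Because the value depends only on which words are shared and not on how often a word occurs across logs, it does not rely on word frequency statistics.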
It should be understood that the above two examples of calculating the similarity value may be combined to obtain a new example, for example, the filtering device may combine the keyword and the Jaccard similarity to determine the similarity value, and the specific implementation principle may refer to the above example, which is not repeated herein.
S2023: and carrying out noise filtering treatment on the log to be detected according to the similarity value.
The filtering device can filter the noise log to be detected based on the similarity value under the condition that the similarity value is obtained, so as to determine whether the log to be detected is the noise log or the root cause abnormal log.
In this embodiment, the filtering device divides words and then calculates the similarity, so that the similarity calculation can be facilitated, and the calculation efficiency is improved, thereby improving the effectiveness and reliability of the noise filtering process when the noise filtering process is performed based on the similarity value.
As can be seen in conjunction with fig. 3, in some embodiments, S2023 may include the steps of:
S20231: and determining the maximum similarity value from the similarity values corresponding to the target cluster centers.
For example, the filtering device may sort the similarity values in ascending or descending order and take the largest one as the maximum similarity value.
S20232: and under the condition that the maximum similarity value reaches a preset first threshold value, determining the log to be detected as a noise log, and filtering the log to be detected.
The filtering device may compare the maximum similarity value with the preset first threshold to obtain a determination result; if the determination result indicates that the maximum similarity value is greater than or equal to the preset first threshold, the filtering device determines the log to be detected as a noise log and filters it out.
The preset first threshold is not limited in this embodiment, for example, the preset first threshold may be determined by the filtering device based on a requirement, a history, a test, and the like. For example, taking the filtering device as an example based on requirements, and the requirements are specifically reliability requirements, the filtering device may determine the preset first threshold value as a relatively large value for an application scenario with relatively high reliability requirements; conversely, for application scenarios where the reliability requirement is relatively low, the filtering device may determine the preset first threshold to be a relatively small value.
S20233: and under the condition that the maximum similarity value does not reach a preset first threshold value, determining the log to be detected as the root cause abnormal log, and reserving the root cause abnormal log.
In combination with the above example, the determination result may also indicate that the maximum similarity is smaller than the preset first threshold, and in this case, the filtering apparatus determines the log to be detected as a root cause abnormal log (i.e. a non-noise log), and retains the root cause abnormal log, so as to locate the fault root cause of the target object based on the root cause abnormal log in the later period.
In this embodiment, the validity and reliability of the determination result may be improved by determining whether the log to be detected is a noise log or a root cause abnormal log according to the magnitude relation between the maximum similarity and the preset first threshold.
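For illustration only, the decision of S20231 to S20233 can be sketched as follows. The labels "noise" and "root_cause" and the treatment of an empty list of similarity values as a maximum of 0.0 are assumptions of this sketch.

```python
def classify(similarity_values, first_threshold):
    """Label a log from its per-cluster-center similarity values."""
    # S20231: take the maximum similarity value (0.0 if there is no cluster).
    max_similarity = max(similarity_values, default=0.0)
    # S20232 / S20233: compare against the preset first threshold.
    if max_similarity >= first_threshold:
        return "noise"        # the log is filtered out
    return "root_cause"       # the log is retained for root cause localization
```

A larger first threshold makes the filter more conservative, so that fewer logs are discarded as noise.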
It should be noted that, in this embodiment, the number of logs to be detected is not limited, for example, the number of logs to be detected may be one or may be plural, and in the case that the number of logs to be detected is plural, the filtering device performs noise filtering processing on the logs to be detected based on the online service, and includes the following steps:
a first step of: and carrying out noise filtering processing on the log to be detected based on the online service to obtain a noise log in the log to be detected.
For the implementation principle of the first step, reference may be made to the above example, which is not repeated here.
And a second step of: and filtering the log to be detected under the condition that the ratio of the noise log in the log to be detected to the log to be detected reaches a preset second threshold value.
For example, if the number of logs to be detected is M (M is an integer greater than 1), and the number of noise logs in the M logs to be detected is N (N is an integer greater than or equal to 0), the filtering device may compare the ratio N/M with a preset second threshold to obtain a comparison result, and if the comparison result characterization ratio N/M is greater than or equal to the preset second threshold, the filtering device filters all the M logs to be detected.
Similarly, the preset second threshold is not limited in this embodiment, and may be determined by the filtering device based on a requirement, a history, a test, and the like.
And a third step of: and under the condition that the ratio of the noise logs in the log to be detected to the log to be detected does not reach a preset second threshold value, filtering the noise logs in the log to be detected to obtain the root cause abnormal logs except the noise logs in the log to be detected.
Correspondingly, under the condition that the comparison result characterization ratio N/M is smaller than a preset second threshold value by combining the analysis, the filtering device can filter the N noise logs, and M-N root cause exception logs except the N noise logs in the M logs to be detected are reserved.
In this embodiment, the filtering device determines whether to filter all the logs to be detected according to the magnitude relation between the ratio and the preset second threshold, which serves as a fallback strategy provided by the filtering device, so as to improve the efficiency and accuracy of the noise filtering processing.
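For illustration only, the three steps above for a plurality of logs to be detected can be sketched as follows. The predicate is_noise stands for the per-log decision made by the online service and is a hypothetical parameter of this sketch.

```python
def filter_batch(logs, is_noise, second_threshold):
    """Return the logs that are retained after noise filtering."""
    if not logs:
        return []
    # First step: decide for each log whether it is a noise log.
    flags = [is_noise(log) for log in logs]
    ratio = sum(flags) / len(logs)          # N / M
    # Second step: if N / M reaches the second threshold, filter all M logs.
    if ratio >= second_threshold:
        return []
    # Third step: otherwise filter only the N noise logs, keep the other M - N.
    return [log for log, noisy in zip(logs, flags) if not noisy]
```

For example, with M = 4 and N = 3 the ratio is 0.75, so a second threshold of 0.7 discards the whole batch while a second threshold of 0.8 keeps the single root cause abnormal log.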
S203: and obtaining a detection evaluation index for performing noise filtering processing on the log to be detected based on the online service, wherein the detection evaluation index is at least used for representing the accuracy of the noise filtering processing.
For example, the filtering device may determine the detection evaluation index through online operation, so as to evaluate, at least from the dimension of accuracy, the effect of the noise filtering processing performed by the online service; in addition to accuracy, the detection evaluation index may also represent dimensions such as reliability, confidence, and validity.
S204: and under the condition that the detection evaluation index is smaller than a preset third threshold value, updating the current log template according to third noise data obtained through online monitoring to obtain an updated log template, wherein the updated log template is used for carrying out noise filtering processing on the log obtained by next online monitoring.
The filtering device may compare the detection evaluation index with a preset third threshold to obtain a comparison result, and update the current log template in an offline operation mode when the comparison result represents that the detection evaluation index is smaller than the preset third threshold.
Accordingly, in combination with the above example, the filtering device may deploy an online service based on the updated log template, and perform noise filtering processing on the next obtained log to be detected based on the newly deployed online service.
Similarly, the preset third threshold is not limited in this embodiment, and may be determined by the filtering device based on a requirement, a history, a test, and the like.
That is, the filtering device may set a parameter for automatically updating the log template (i.e. the preset third threshold) so as to determine, periodically or aperiodically, how effective the noise filtering processing of the current online service is, and to update the log template when the effect is not ideal, thereby updating the online service and further improving the accuracy, effectiveness and reliability of the noise filtering processing.
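For illustration only, the update trigger of S203 and S204 can be sketched as follows. The embodiment does not specify how the detection evaluation index is computed; this sketch assumes a simple accuracy against reference labels (for example, labels confirmed afterwards by an operator), and the names accuracy, maybe_update and update_fn are hypothetical.

```python
def accuracy(predicted, reference):
    """Fraction of logs whose predicted label equals the reference label."""
    if not reference:
        return 0.0
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


def maybe_update(template, evaluation_index, third_threshold,
                 noise_data, update_fn):
    """Update the template only when the index is below the third threshold.

    Returns the template to deploy and a flag telling whether it was updated.
    """
    if evaluation_index < third_threshold:
        return update_fn(template, noise_data), True
    return template, False
```

In this sketch the caller is expected to redeploy the online service whenever the returned flag is true.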
In some embodiments, the target noise data obtained through online monitoring is stored in an offline preset time sequence database; the target noise data includes first noise data, second noise data, and third noise data.
As can be seen from the above analysis, the filtering device can implement the filtering of the noise log in a combination of an on-line operation and an off-line operation, and the noise data is obtained by the filtering device in an on-line operation, and the filtering device can construct a time sequence database in an off-line operation, and store the noise data (such as the target noise data in the embodiment) obtained based on the on-line operation in the time sequence database in the off-line operation.
In this embodiment, the filtering device stores the target noise data in the time sequence database, so that when the filtering device performs construction or update of the log template based on the data in the time sequence database, the latest data can be utilized, thereby improving the effectiveness and reliability of the log template.
While the foregoing describes the filtering method of the noise log mainly from the dimension of filtering, in connection with the analysis described above, the filtering of the noise log is based on a pre-trained or updated log template, and for the convenience of the reader in understanding the disclosure, the principle of determining the log template (the first log template or the second log template) by the filtering device will now be described in detail.
As can be seen in conjunction with fig. 4, in some embodiments, during the training phase, the step of training to obtain the first log template includes:
S401: and performing word segmentation processing on the first noise data to obtain a second word segmentation set.
Similarly, the word segmentation process of the filtering device in this embodiment is not limited, for example, the filtering device may refer to the word segmentation process of the log to be detected in the above example, and will not be described herein.
S402: the first log template is constructed based on the second set of terms, a preset initial fixed depth parse tree, and a preset tree depth.
In this embodiment, the filtering device first performs word segmentation and then constructs the template, which can improve the efficiency and reliability of constructing the first log template.
As can be seen in conjunction with fig. 4, in some embodiments, S402 may include the steps of:
S4021: and determining the preamble word and the log length of the second word set, wherein the preamble word is the first word in the second word set, and the log length is used for representing the number of words in the second word set.
Illustratively, the second set of words includes a plurality of words, the filtering means may determine a number of all words in the second set of words and determine the number as a log length, and the filtering means may also determine a first word in the second set of words and may determine the word as a pre-word.
S4022: each word in the second word set is converted into a corresponding wildcard.
The conversion mode of the filtering device is not limited in this embodiment, for example, the filtering device may preset a conversion table, and convert each word in the second word segment according to the conversion table, so as to obtain a corresponding wildcard symbol.
In some embodiments, the filtering device may convert a word in the second word set that is the same as a prefabricated word into the wildcard corresponding to that prefabricated word, so that the filtering device can subsequently perform similarity calculation on wildcards, thereby improving the efficiency of similarity calculation.
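For illustration only, the conversion of S4022 with a preset conversion table can be sketched as follows. The table entries, the wildcard symbols "<IP>" and "<NUM>", and the choice to leave words that match no entry unchanged are assumptions of this sketch.

```python
import re

# Hypothetical conversion table: (pattern, wildcard). The first matching
# entry wins, so more specific patterns are listed first.
CONVERSION_TABLE = [
    (re.compile(r"\d{1,3}(?:\.\d{1,3}){3}"), "<IP>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def to_wildcards(tokens):
    """Replace each word that fully matches a table entry with its wildcard."""
    converted = []
    for token in tokens:
        for pattern, wildcard in CONVERSION_TABLE:
            if pattern.fullmatch(token):
                converted.append(wildcard)
                break
        else:
            # Assumption: words without a table entry are kept as they are.
            converted.append(token)
    return converted
```

After conversion, two noise logs that differ only in variable fields such as an address or a port yield the same word sequence, which makes the later similarity calculation cheaper.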
S4023: constructing the first log template based on the log length, the preamble word, the initial fixed depth parse tree, the tree depth, and each wildcard.
In this embodiment, constructing the first log template from features such as wildcards can improve the efficiency of constructing the first log template, and constructing it on the basis of the initial fixed depth parse tree in combination with the log length, the preamble word and the tree depth allows the first log template to be constructed effectively.
In some embodiments, S4023 may include the steps of:
a first step of: and determining that the first noise data corresponds to a first root node of the initial fixed-depth parse tree according to the log length, the preamble word and the tree depth.
As shown in fig. 5, the fixed depth parse tree includes a top layer, other layers, and a bottom layer, where the top layer is a Root Node (Root Node) of the fixed depth parse tree, the other layers are Internal nodes (Internal nodes) of the fixed depth parse tree, and the bottom layer is a Leaf Node (Leaf Node) of the fixed depth parse tree.
The log lengths exemplarily shown in fig. 5 include "log length: 4" (Length: 4), "log length: 5" (Length: 5), and "log length: 10" (Length: 10). The preamble words exemplarily shown in fig. 5 include "send" (Send), "receive" (Receive), and "start" (start). The internal nodes include a "log group list" (A list of Log Groups) as exemplarily shown in fig. 5. The leaf nodes include a "log group" (Log Group) as exemplarily shown in fig. 5.
Accordingly, in this step, if the initial fixed depth parse tree is as shown in fig. 5, the filtering device may construct the first internal node when the log length is 4 and the preamble word is "receive", and the first internal node is a log group list as shown in fig. 5.
And a second step of: and under the condition that the first root node is a father node of the first leaf node, carrying out Jaccard similarity calculation on each wild card and the first leaf node to obtain a cluster set of the first leaf node.
And a third step of: and under the condition that the first root node has no child node, constructing a cluster set of the first leaf node according to each wildcard, wherein the first root node is the parent node of the first leaf node.
It should be noted that the log template construction process is a process of continuously constructing the internal nodes and the leaf nodes. Taking a leaf node as an example, in general, if a leaf node has been constructed and generated before, when it is determined that certain noise data corresponds to an internal node of the leaf node later, the noise data may be further matched with the leaf node to construct a cluster set of the leaf node.
Accordingly, in this embodiment, once the filtering device has determined the first internal node of the first noise data, it may search further within the first internal node: if the first internal node is the parent node of the first leaf node, the filtering device further determines the cluster set of the first leaf node; if the first internal node does not have a leaf node, the filtering device constructs a leaf node of the first internal node (such as the first leaf node) and constructs the cluster set of the first leaf node.
That is, when the filtering device obtains a noise log, it may search the fixed depth parse tree downward according to features such as the log length and the preamble word until a leaf node is reached. If the leaf node already has a cluster set, the Jaccard similarity between the noise log and each cluster in the leaf node is calculated, and, according to the calculation result, either the original cluster set is updated (for example, the cluster center of a cluster in the original cluster set) or a new cluster is created in the original cluster set.
In this embodiment, constructing the first log template with a fixed depth parse tree structure in combination with features such as the log length and the preamble word enables the first log template to better represent the first noise data, thereby improving the effectiveness and reliability of the first log template.
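For illustration only, the construction described above can be sketched as a small tree keyed first by log length and then by preamble word, with a list of cluster centers at each leaf. The sketch hard-codes this depth instead of taking a preset tree depth, and the merge threshold of 0.5, the generic wildcard "<*>" and the position-wise update of the cluster center are assumptions rather than features stated in the embodiment.

```python
from collections import defaultdict

WILDCARD = "<*>"  # hypothetical generic wildcard symbol


def jaccard(a, b):
    """Jaccard similarity of two word collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class FixedDepthParseTree:
    def __init__(self, merge_threshold=0.5):
        # root -> log length -> preamble word -> leaf (list of cluster centers)
        self.root = defaultdict(lambda: defaultdict(list))
        self.merge_threshold = merge_threshold

    def add(self, tokens):
        """Insert one segmented noise log and return the leaf it landed in."""
        if not tokens:
            raise ValueError("a noise log needs at least one word")
        # Search downward by log length, then by preamble word.
        leaf = self.root[len(tokens)][tokens[0]]
        best_index, best_similarity = -1, 0.0
        for index, center in enumerate(leaf):
            similarity = jaccard(tokens, center)
            if similarity > best_similarity:
                best_index, best_similarity = index, similarity
        if best_index >= 0 and best_similarity >= self.merge_threshold:
            # Update the cluster center: positions that differ become wildcards.
            # All logs in one leaf have the same length, so zip loses nothing.
            center = leaf[best_index]
            leaf[best_index] = [c if c == t else WILDCARD
                                for c, t in zip(center, tokens)]
        else:
            # No sufficiently similar cluster: create a new one.
            leaf.append(list(tokens))
        return leaf
```

Since every log is routed by its length and first word before any similarity is computed, only the few cluster centers in a single leaf are compared against each incoming noise log.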
Based on the technical conception, the disclosure further provides a fault root positioning system for filtering the noise logs, and in particular, components (such as modules) for executing the filtering method of the noise logs are deployed in the fault root positioning system, so as to realize noise filtering processing of the logs to be detected based on interaction among the components.
Referring to fig. 6, fig. 6 is a schematic diagram of a fault root positioning system according to an embodiment of the disclosure, as shown in fig. 6, a fault root positioning system 600 includes: real-time online detection module 601, data reading module 602, time sequence database 603, training module 604, deployment module 605, log template updating module 606; wherein,
The real-time online detection module 601 is configured to obtain a log to be detected obtained through online monitoring, invoke an online service deployed in advance, and perform noise filtering processing on the log to be detected based on the online service, where the online service is obtained through deployment by the deployment module 605 based on a current log template constructed in advance, and the current log template is a first log template or a second log template.
The training module 604 is configured to obtain, in a training phase, the first log template by training offline on first noise data obtained from the time sequence database 603, where the first log template is a first fixed depth parse tree for filtering noise logs, and the first noise data is obtained by the data reading module 602 from the real-time online detection module 601 in an offline manner.
The log template updating module 606 is configured to obtain, in an updating stage, the second log template by updating the previous log template offline with second noise data obtained from the time sequence database 603, where the second log template is a second fixed depth parse tree for filtering noise logs, the second noise data is obtained by the data reading module 602 from the real-time online detection module 601 in an offline manner, and the previous log template corresponding to the first update in the updating stage is the first log template.
It will be appreciated that, in order to avoid cumbersome descriptions, the technical features of the present embodiment that are the same as those of the above embodiment are not repeated here.
For example, as shown in fig. 6, the real-time online detection module 601 runs an online monitoring system. After the online monitoring system detects an abnormal alarm of a target object, the call chain (Trace) of the abnormal alarm is sampled to obtain the log to be detected. The real-time online detection module 601 invokes the online service through a query utility (query rlabantil) in the monitor as a service (Monitor As Service, MaaS), so as to determine through the online service whether the log to be detected is a noise log or a root cause anomaly log.
The real-time online detection module 601 may also monitor the target noise data through an online monitoring system.
The data reading module 602 runs on the monitor as a service. Through the call chain log storage (Trace Log Save) function of the monitor as a service, it may extract the target noise data obtained by the real-time online detection module 601, specifically the corresponding log lines, and write those log lines into the time sequence database 603.
The training module 604 runs on the monitor as a service. It may obtain the first noise data from the time sequence database 603 through the call chain log query (Trace Log Query) function of the monitor as a service, and train the first log template using the first noise data. The training process includes the "Log template training and verification" shown in fig. 6, so that the first log template is constructed in a "training + verification" manner. That is, in some embodiments, the first log template is a verified log template and therefore has higher reliability and validity.
The deployment module 605 may deploy the first log template as a noise log template filtering online service (may be simply referred to as an online service) in the real-time online detection module 601.
When updating the first log template, the log template updating module 606 runs on the monitor as a service and, through the call chain log algorithm analysis (Trace Log Algo Analyze) function of that service, may extract the corresponding noise data from the time sequence database 603 and the first log template from the log templates, so as to update the first log template based on the extracted noise data.
Alternatively, the log template updating module 606 updates the first log template when the filtering effect of the log template degrades, for example when the detection evaluation index of the first log template is smaller than the second threshold. For example, the log template updating module 606 evaluates the filtering effect of the log template periodically or aperiodically and updates the log template based on the evaluation result, which can be displayed in the form of a report.
For updates other than the first update, the principle by which the log template updating module 606 updates the log template is the same as that of updating the first log template, and is not described again here.
Accordingly, the deployment module 605 may deploy the updated log template as a noise log template filtering online service (may be simply referred to as an online service) in the real-time online detection module 601.
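The evaluation-triggered update and redeployment described in the paragraphs above can be sketched as follows. This is a minimal sketch under assumptions: the function names `filtering_accuracy` and `maybe_update_template` are hypothetical, accuracy is used as the detection evaluation index, and the `update` and `deploy` callables stand in for the log template updating module 606 and the deployment module 605.

```python
from typing import Callable, Iterable, Sequence, TypeVar

T = TypeVar("T")  # type of the log template (left abstract in this sketch)


def filtering_accuracy(predictions: Iterable[bool], labels: Iterable[bool]) -> float:
    """Fraction of logs whose noise/non-noise prediction matches the label."""
    pairs = list(zip(predictions, labels))
    if not pairs:
        return 1.0  # assumption: with nothing to evaluate, no degradation is reported
    return sum(p == y for p, y in pairs) / len(pairs)


def maybe_update_template(
    template: T,
    evaluation_index: float,
    threshold: float,
    noise_data: Sequence[str],
    update: Callable[[T, Sequence[str]], T],
    deploy: Callable[[T], None],
) -> T:
    """Update and redeploy the template only when the evaluation index degrades."""
    if evaluation_index >= threshold:
        return template  # filtering effect is still acceptable, keep the current template
    updated = update(template, noise_data)  # offline update with newly collected noise data
    deploy(updated)                         # redeploy as the online service
    return updated
```

Passing the update and deployment steps as callables keeps the trigger logic independent of how the template is actually stored or served, which is left open here.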
In some embodiments, as can be seen in connection with fig. 7, the fault root location system may include three levels. The service layer is configured with a call chain sampling module (specifically, this may be the real-time online detection module 601 described in the above embodiment), which collects data (such as logs to be detected and target noise data) by calling the online service and, in combination with the "spam policy" described in the above embodiment, performs prediction processing of the noise log (the noise log filtering processing described in the above embodiment) to obtain a prediction result (that is, whether the log to be detected is a noise log or a root cause anomaly log, as described in the above embodiment).
The service layer is deployed with a log template, an algorithm laboratory online reasoning service (the online service determined based on the log template as described in the above embodiment), and a Deep Insight real-time measurement service. As shown in fig. 7, the Deep Insight real-time measurement service includes a filtering effect service, a multidimensional comparison service, and a data query service, which may specifically be used to check the filtering capability of the log template as described in the above embodiment, and provides a visual interface to intuitively display each index of the filtering capability.
The model layer is provided with a data analysis function, an algorithm model selection function, a model training function and a model deployment function. As shown in fig. 7, the data analysis function is used to analyze the root cause anomaly log data and the noise log data. The algorithm model selection function is used to construct a log template based on similarity clustering and a log parse tree. The model training function includes model training and effect verification, so that the log template is built by combining model training with effect verification.
The model deployment function is used to deploy the log templates as an algorithmic laboratory online reasoning service (online service as described in the above embodiments).
It should be noted that fig. 6 and 7 are exemplary illustrations of the framework of the fault root location system from two different dimensions and are not to be construed as limiting the fault root location system. For example, taking fig. 6 as an example, in some embodiments, two or more of the modules in fig. 6 may be integrated into a module having the functionality of each module, or at least one of the modules in fig. 6 may be split into functional modules of finer granularity.
According to the technical concept, the present disclosure further provides a filtering system for a noise log, which is characterized by comprising:
at least one memory including at least one set of instructions to push information;
at least one processor communicatively coupled to the at least one memory,
wherein, when the system is running, the at least one processor executes the at least one set of instructions to implement the method of filtering a noise log as described in any of the embodiments above.
According to the technical concept described above, the present disclosure also provides a processor-readable storage medium storing a computer program for causing the processor to perform the method of filtering a noise log according to any one of the embodiments described above.
According to the technical concept described above, the present disclosure further provides a computer program product, including a computer program, which when executed by a processor implements the method for filtering a noise log according to any of the above embodiments.
According to the technical concept, the present disclosure further provides an electronic device, including: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions stored in the memory to implement the method of filtering a noise log as described in any of the embodiments above.
Fig. 8 is a hardware configuration diagram of an electronic device 800 according to an embodiment of the disclosure. The electronic device 800 may perform the method of filtering a noise log as described in any of the embodiments above.
Taking an example that the method for filtering a noise log according to the embodiment of the present disclosure is applied to an application scenario as shown in fig. 1, the following description is given to the electronic device 800:
in the case where the method of filtering a noise log as described in any of the above embodiments is performed on a client 102, the electronic device 800 may be the client 102. In the case where the method of filtering a noise log as described in any of the above embodiments is performed on the server 103, the electronic device 800 may be the server 103. In the case where the filtering method of the noise log according to any of the above embodiments is partially performed on the client 102 and partially performed on the server 103, the electronic device 800 may be the client 102 and the server 103.
As shown in fig. 8, an electronic device 800 may include at least one storage medium 801 and at least one processor 802. In some embodiments, the electronic device 800 may also include a communication port 803 and an internal communication bus 804. Meanwhile, the electronic device 800 may further include an Input/Output (I/O) component 805.
Internal communication bus 804 may connect the different system components including storage medium 801, processor 802, and communication ports 803. The I/O component 805 supports input/output between the electronic device 800 and other components. The communication port 803 is used for data communication between the electronic device 800 and the outside world, for example, the communication port 803 may be used for data communication between the electronic device 800 and the network 104. The communication port 803 may be a wired communication port or a wireless communication port.
The storage medium 801 may include a data storage device. The data storage device may be a non-transitory storage medium or a transitory storage medium. For example, the data storage device may include one or more of a magnetic disk 8011, a Read-Only Memory (ROM) 8012, or a random access Memory (Random Access Memory, RAM) 8013. The storage medium 801 further includes at least one set of instructions stored in the data storage device. The instructions are computer program code that may include programs, routines, objects, components, data structures, procedures, modules, etc. that perform the methods of filtering noise logs provided herein.
The at least one processor 802 may be communicatively coupled with the at least one storage medium 801 and the communication port 803 via the internal communication bus 804. The at least one processor 802 is configured to execute the at least one instruction set described above. When the electronic device 800 is running, the at least one processor 802 reads the at least one instruction set and performs the method of filtering a noise log provided herein according to the instructions of the at least one instruction set. The processor 802 may perform all the steps involved in the method of filtering the noise log. The processor 802 may be in the form of one or more processors, and in some embodiments, the processor 802 may include one or more hardware processors, such as microcontrollers, microprocessors, reduced instruction set computers (Reduced Instruction Set Computer, RISC), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), application specific instruction set processors (Application Specific Instruction Processor, ASIP), central processing units (Central Processing Unit, CPU), graphics processing units (Graphics Processing Unit, GPU), physics processing units (Physics Processing Unit, PPU), microcontroller units, digital signal processors (Digital Signal Processor, DSP), field programmable gate arrays (Field Programmable Gate Array, FPGA), Advanced RISC Machines (ARM), programmable logic devices (Programmable Logic Device, PLD), any circuit or processor capable of performing one or more functions, or the like, or any combination thereof. For illustrative purposes only, only one processor 802 is depicted in the electronic device 800 in this specification.
However, it should be noted that the electronic device 800 in this specification may also include multiple processors, and thus the operations and/or method steps disclosed in this specification may be performed by one processor as described in this specification, or may be performed jointly by multiple processors. For example, if the processor 802 of the electronic device 800 performs steps A and B in this specification, it should be understood that steps A and B may also be performed by two different processors 802 jointly or separately (e.g., a first processor performs step A and a second processor performs step B, or the first and second processors together perform steps A and B).
It will be apparent to those skilled in the art that embodiments of the present disclosure may be provided as a method, apparatus, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-executable instructions. These computer-executable instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These processor-executable instructions may also be stored in a processor-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the processor-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These processor-executable instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present disclosure without departing from the spirit or scope of the disclosure. Thus, the present disclosure is intended to include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (14)

1. A method for filtering a noise log, comprising:
obtaining a log to be detected that is obtained through online monitoring;
invoking a pre-deployed online service, and performing noise filtering processing on the log to be detected based on the online service, wherein the online service is deployed based on a pre-constructed current log template, and the current log template is a first log template or a second log template;
the first log template is a first fixed depth parse tree for filtering noise logs, obtained in a training stage by training, in an offline manner, on first noise data obtained through online monitoring; and
the second log template is a second fixed depth parse tree for filtering noise logs, obtained in an updating stage by updating, in an offline manner, a previous log template according to second noise data obtained through online monitoring, wherein the previous log template corresponding to the first update in the updating stage is the first log template.
2. The method of claim 1, wherein the noise filtering the log to be detected based on the online service comprises:
Performing word segmentation processing on the log to be detected to obtain a first word segmentation set;
performing similarity calculation on the first word segmentation set and the current log template to obtain a similarity value; and
and carrying out noise filtering processing on the log to be detected according to the similarity value.
3. The method of claim 2, wherein the performing word segmentation on the log to be detected to obtain a first word segmentation set includes:
performing primary word segmentation on the log to be detected based on preset prefabricated words to obtain an initial set, wherein the prefabricated words are determined based on an application scene of the method, and the initial set comprises words which are the same as the prefabricated words in the log to be detected and non-prefabricated words which are different from the prefabricated words; and
and performing word segmentation processing on the non-prefabricated words again to obtain the first word segmentation set.
4. The method of claim 3, wherein performing similarity calculation on the first word segmentation set and the current log template to obtain a similarity value comprises:
matching the first word segmentation set with preset keywords to obtain a matching result;
According to the matching result, assigning weights to words in the first word-segmentation set to obtain weight information, wherein the weight information comprises a first weight and/or a second weight, the first weight is the weight assigned to the word identical to the keyword in the first word-segmentation set when the matching result represents that the word identical to the keyword is matched in the first word-segmentation set, and the second weight is the weight assigned to the word different from the keyword in the first word-segmentation set when the matching result represents that the word identical to the keyword is not matched in the first word-segmentation set; and
and calculating the similarity between the first word segmentation set and the current log template according to the weight information to obtain the similarity value.
5. The method of claim 2, wherein the current log template comprises a plurality of current nodes, the plurality of current nodes comprising a current internal node and a current leaf node, one current node corresponding to at least one cluster, one cluster corresponding to one cluster center; and
The step of performing similarity calculation on the first word segmentation set and the current log template to obtain a similarity value includes:
searching from the plurality of current nodes to obtain a current target node corresponding to the first word segmentation set; and
and carrying out respective corresponding Jaccard similarity calculation on the first word segmentation set and each target cluster center in the current target node to obtain respective corresponding similarity values of each target cluster center.
6. The method according to claim 5, wherein the noise filtering the log to be detected according to the similarity value includes:
determining the maximum similarity value from the similarity values corresponding to the target cluster centers;
under the condition that the maximum similarity value reaches a preset first threshold value, determining the log to be detected as a noise log, and filtering the log to be detected; and
and under the condition that the maximum similarity value does not reach a preset first threshold value, determining the log to be detected as a root cause abnormal log, and reserving the root cause abnormal log.
7. The method according to claim 1, wherein the number of logs to be detected is a plurality; the noise filtering processing for the log to be detected based on the online service comprises the following steps:
Performing noise filtering processing on the log to be detected based on the online service to obtain a noise log in the log to be detected;
filtering the log to be detected under the condition that the ratio of the noise log in the log to be detected to the log to be detected reaches a preset second threshold value; and
and under the condition that the ratio of the noise logs in the log to be detected to the log to be detected does not reach a preset second threshold value, filtering the noise logs in the log to be detected to obtain root cause abnormal logs except the noise logs in the log to be detected.
8. The method according to claim 1, wherein the method further comprises:
obtaining a detection evaluation index for performing noise filtering processing on the log to be detected based on the online service, wherein the detection evaluation index is at least used for representing the accuracy of the noise filtering processing; and
under the condition that the detection evaluation index is smaller than a preset third threshold value, updating the current log template according to third noise data obtained through online monitoring to obtain an updated log template, wherein the updated log template is used for performing noise filtering processing on logs obtained through subsequent online monitoring.
9. The method of claim 1, wherein training the first log template during the training phase comprises:
word segmentation processing is carried out on the first noise data to obtain a second word segmentation set; and
constructing the first log template based on the second word segmentation set, a preset initial fixed depth parse tree and a preset tree depth.
10. The method of claim 9, wherein the constructing the first log template based on the second set of terms, a preset initial fixed depth parse tree, and a preset tree depth comprises:
determining the preamble word and the log length of the second word segmentation set, wherein the preamble word is the first word in the second word segmentation set, and the log length is used for representing the number of words in the second word segmentation set;
converting each word in the second word segmentation set into a corresponding wildcard character; and
and constructing the first log template according to the log length, the preamble word, the initial fixed depth parse tree, the tree depth and each wildcard.
11. The method of claim 10, wherein constructing the first log template from the log length, the preamble word, the initial fixed depth parse tree, the tree depth, and each wildcard comprises:
Determining that the first noise data corresponds to a first internal node of the initial fixed-depth parse tree according to the log length, the preamble word, and the tree depth;
under the condition that the first internal node is a parent node of a first leaf node, carrying out Jaccard similarity calculation on each wildcard and the first leaf node to obtain a cluster set of the first leaf node; and
under the condition that the first internal node has no child node, constructing a cluster set of the first leaf node according to each wildcard, wherein the first internal node is a parent node of the first leaf node.
12. The method of claim 8, wherein target noise data obtained through online monitoring is stored, in an offline manner, in a preset time sequence database; the target noise data includes the first noise data, the second noise data, and the third noise data.
13. A fault root cause location system for filtering noise logs, the system comprising: the system comprises a real-time online detection module, a data reading module, a time sequence database, a training module, a deployment module and a log template updating module; wherein,
the real-time online detection module is used for obtaining a log to be detected obtained through online monitoring, calling an online service deployed in advance, and performing noise filtering processing on the log to be detected based on the online service, wherein the online service is obtained through deployment by the deployment module based on a current log template constructed in advance, and the current log template is a first log template or a second log template;
the training module is used for training, in a training stage and in an offline manner, the first log template from first noise data obtained from the time sequence database, wherein the first log template is a first fixed depth parse tree for filtering noise logs, and the first noise data is obtained from the real-time online detection module in an offline manner by the data reading module; and
the log template updating module is used for obtaining, in an updating stage, the second log template by updating a previous log template in an offline manner with second noise data obtained from the time sequence database, wherein the second log template is a second fixed depth parse tree for filtering noise logs, the second noise data is obtained from the real-time online detection module in an offline manner by the data reading module, and the previous log template corresponding to the first update in the updating stage is the first log template.
14. A system for filtering a noise log, comprising:
at least one memory including at least one set of instructions to push information;
at least one processor communicatively coupled to the at least one memory,
wherein, when the system is running, the at least one processor executes the at least one set of instructions to implement the method of any one of claims 1 to 12.
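As a non-limiting illustration of the decision steps recited in claims 5 to 7, the following sketch classifies each log by its maximum Jaccard similarity to the cluster centers and then applies the batch-level ratio check. The function names `is_noise_log` and `filter_batch`, the threshold values chosen by a caller, and the representation of cluster centers as word sets are assumptions made for this example only.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_noise_log(words: list[str], centers: list[set[str]], first_threshold: float) -> bool:
    """A log is noise when its maximum similarity to any cluster center reaches the first threshold."""
    word_set = set(words)
    best = max((jaccard(word_set, c) for c in centers), default=0.0)
    return best >= first_threshold


def filter_batch(
    logs: list[list[str]],
    centers: list[set[str]],
    first_threshold: float,
    second_threshold: float,
) -> list[list[str]]:
    """Return the retained (root cause abnormal) logs of a batch of logs to be detected."""
    if not logs:
        return []
    flags = [is_noise_log(log, centers, first_threshold) for log in logs]
    noise_ratio = sum(flags) / len(logs)
    if noise_ratio >= second_threshold:
        return []  # the ratio reaches the second threshold: the whole batch is filtered
    # Otherwise only the noise logs are filtered and the remaining logs are kept.
    return [log for log, noisy in zip(logs, flags) if not noisy]
```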
CN202410011273.XA 2024-01-04 2024-01-04 Noise log filtering method and system Pending CN117827784A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410011273.XA CN117827784A (en) 2024-01-04 2024-01-04 Noise log filtering method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410011273.XA CN117827784A (en) 2024-01-04 2024-01-04 Noise log filtering method and system

Publications (1)

Publication Number Publication Date
CN117827784A true CN117827784A (en) 2024-04-05

Family

ID=90520794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410011273.XA Pending CN117827784A (en) 2024-01-04 2024-01-04 Noise log filtering method and system

Country Status (1)

Country Link
CN (1) CN117827784A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118093325A (en) * 2024-04-28 2024-05-28 中国民航大学 Log template acquisition method, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US20200160230A1 (en) Tool-specific alerting rules based on abnormal and normal patterns obtained from history logs
US20170109676A1 (en) Generation of Candidate Sequences Using Links Between Nonconsecutively Performed Steps of a Business Process
US20170109668A1 (en) Model for Linking Between Nonconsecutively Performed Steps in a Business Process
US20170109636A1 (en) Crowd-Based Model for Identifying Executions of a Business Process
CN111539493B (en) Alarm prediction method and device, electronic equipment and storage medium
CN107133240A (en) Page monitoring method, apparatus and system
CN117827784A (en) Noise log filtering method and system
CN110661660B (en) Alarm information root analysis method and device
CN106202126B (en) A kind of data analysing method and device for logistics monitoring
CN112583640A (en) Service fault detection method and device based on knowledge graph
CN109743286A (en) A kind of IP type mark method and apparatus based on figure convolutional neural networks
CN113610156A (en) Artificial intelligence model machine learning method and server for big data analysis
CN112328802A (en) Data processing method and device and server
US20170109640A1 (en) Generation of Candidate Sequences Using Crowd-Based Seeds of Commonly-Performed Steps of a Business Process
CN114331698A (en) Risk portrait generation method and device, terminal and storage medium
CN113098989B (en) Dictionary generation method, domain name detection method, device, equipment and medium
US20170109637A1 (en) Crowd-Based Model for Identifying Nonconsecutive Executions of a Business Process
CN112416800A (en) Intelligent contract testing method, device, equipment and storage medium
CN116739408A (en) Power grid dispatching safety monitoring method and system based on data tag and electronic equipment
CN113821418B (en) Fault root cause analysis method and device, storage medium and electronic equipment
CN110413500A (en) Failure analysis methods and device based on big data fusion
CN116149877A (en) Fault detection method and device
CN115767601A (en) 5GC network element automatic nanotube method and device based on multidimensional data
CN115098362A (en) Page testing method and device, electronic equipment and storage medium
CN111935279B (en) Internet of things network maintenance method based on block chain and big data and computing node

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination