CN116450393A - Log anomaly detection method and system integrating BERT feature codes and variant transformers - Google Patents

Log anomaly detection method and system integrating BERT feature codes and variant transformers

Info

Publication number
CN116450393A
CN116450393A (application CN202310417120.0A)
Authority
CN
China
Prior art keywords
log
bert
sequence
variant
log sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310417120.0A
Other languages
Chinese (zh)
Inventor
方巍
贾雪磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN202310417120.0A priority Critical patent/CN116450393A/en
Publication of CN116450393A publication Critical patent/CN116450393A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079Root cause analysis, i.e. error or fault diagnosis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766Error or fault reporting or storing
    • G06F11/0769Readable error formats, e.g. cross-platform generic formats, human understandable formats
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a method and a system for detecting log anomalies by fusing BERT feature codes with a variant Transformer, relating to the technical field of intelligent log operation and maintenance. The method comprises the following steps: receiving a log sequence, parsing the log sequence, and inputting the parsed log sequence into a pre-established BERT model to obtain log sequence feature codes with semantic information and position information; inputting the log sequence feature codes with semantic information and position information into a pre-established anomaly detection model for training, obtaining the log sequence likely to appear in the future, and accurately predicting based on the obtained log sequence to obtain a detection result. The method can reduce the memory consumption and computation time of the attention layer while achieving equal or better prediction accuracy.

Description

Log anomaly detection method and system integrating BERT feature codes and variant transformers
Technical Field
The invention relates to the technical field of log intelligent operation and maintenance processing, in particular to a log anomaly detection method and system integrating BERT feature codes and variant transformers.
Background
A log is time-series text data consisting of a timestamp and a text message, recording the running state of a service in real time. By collecting and analyzing logs, faults that have occurred or potential faults in the network can be discovered or predicted. Moreover, modern network systems are large: they print on the order of 50 GB of logs (roughly 120-200 million lines) per hour, so relying on manual analysis of log data to identify whether a failure has occurred in the network is inefficient. This motivates modeling the log data with intelligent methods to find the potential relationships between log sequences.
In recent years, many research teams have carried out work on log anomaly detection and achieved significant results. As early as 2004, Mike Chen et al. proposed error detection on HDFS logs using a decision tree method; this team was also among the earliest to use supervised machine-learning models for anomaly detection on log data, which is of great significance. In order to develop an effective fault-tolerance strategy, the Yinglong Liang team predicted system failure events and used an SVM (Support Vector Machine) to perform anomaly detection on the log data of IBM BlueGene/L. However, existing feature extraction and encoding methods do not fully consider the semantic information or position information among the words in a log sequence. The self-attention structure in the Transformer model effectively captures the correlations of features within a text sequence and reduces the dependence on external information, enabling it to better discover the intrinsic links between data or features [69]. Many scholars have therefore proposed replacing RNNs with Transformer models for log anomaly detection, such as HitAnomaly and NeuralLog, which achieve good experimental results; however, the internal self-attention mechanism must compute the correlation between every point and all other points when computing correlations within the sequence, and the resulting large-scale matrix multiplications give the method high time and space complexity, so its computational efficiency is not high.
Disclosure of Invention
In order to solve the above-mentioned shortcomings in the background art, the present invention is directed to a method and a system for detecting log anomalies by fusing BERT feature codes with variant transformers.
The aim of the invention can be achieved by the following technical scheme: a method for detecting log abnormality by fusing BERT feature codes and variant transformers comprises the following steps:
collecting a log sequence, analyzing the log sequence, and inputting the analyzed log sequence into a pre-established BERT model to obtain a log sequence feature code with semantic information and position information;
inputting the characteristic codes of the log sequences with semantic information and position information into a pre-established log abnormality detection model based on a variant Transformer for training to obtain a log sequence which possibly appears in the future, and accurately predicting the obtained log sequence to obtain a detection result.
Preferably, the BERT model adopts the BERT_BASE configuration of formula (1): the number of Transformer Encoder blocks is 12, the hidden layer size is 768, the number of self-attention heads is 12, and the total parameter size is 110M.
Preferably, the BERT model is as follows:
BERT BASE (L=12, H=768, A=12, TotalParam=110M).
Preferably, the log sequence is tokenized to obtain a segmented text sequence; a special mark [CLS] is added at the beginning of the text sequence, where [CLS] represents the result mark of the text sequence and can be placed at the beginning or at the end, and different sentences are separated by the mark [SEP].
Preferably, the output Embedding of each word of the log sequence is composed of three parts: Token Embedding, Segment Embedding and Position Embedding.
Preferably, the sequence vector containing the three types of Embedding is input into the BERT network for feature extraction, a sequence vector containing rich semantic features is finally output, and the degree of association between different words is used to determine a weight matrix that characterizes the words:
Attention(Q, K, V) = softmax(QK^T / √d_k)·V
where Q, K and V are word vector matrices and d_k is the dimension of the word embedding.
Preferably, Q, K and V are projected through a plurality of different linear transformations, and the attention outputs of the different heads are finally spliced by Concat, with the following formula:
MultiHead(Q, K, V) = Concat(head_1, ..., head_n)·W^O
head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V)
where W^O, W_i^Q, W_i^K and W_i^V are weight matrices.
preferably, the anomaly detection model uses a log sequence feature code of 10 time steps as input to predict log sequence output for the next 5 time steps.
Preferably, the log code sequence of 10 time steps is input into the Encoder of the model and summed with the position codes before input, the positions of the elements are reflected using additional position codes, and the position information is encoded using Sin and Cos functions:
PE(pos, 2j) = sin(pos / 10000^(2j/d_model))
PE(pos, 2j+1) = cos(pos / 10000^(2j/d_model))    (4)
where pos represents the position, j is the encoding dimension index, and d_model represents the length of the vector, the first expression of formula (4) being used for even dimensions and the second for odd dimensions;
then, in the encoding process, the attention layer is entered for the vector attention operation: the input dimension is first split into two parts, one part passes through a 1×1 convolution and the other part passes through a normal self-attention layer, and the two outputs are then spliced, completing the encoding operation; a mask operation is then added in the decoding operation, the attention layer of the decoder being identical to the attention layer of the encoder, and the decoded output gives the detection result.
A log anomaly detection system fusing BERT feature encoding with a variant Transformer, comprising:
a log encoding module, configured to receive a log sequence, parse the log sequence, and input the parsed log sequence into a pre-established BERT model to obtain log sequence feature codes with semantic information and position information; and
a prediction module, configured to input the log sequence feature codes with semantic information and position information into a pre-established log anomaly detection model based on a variant Transformer for training, obtain the log sequence likely to appear in the future, and accurately predict based on the obtained log sequence to obtain a detection result.
The invention has the beneficial effects that:
firstly, BERT is used for feature coding in a log sequence data coding stage, semantic information and position information of each word token in a sequence can be fully considered due to unique advantage of BERT, three different Embeddings are accumulated in an input stage, so that the position information is contained in the final input, a CSPAttion module combining CSPNet and Self-attribute is designed to replace Self-attribute in an original converter, and the structure can reduce memory consumption and time consumption of Attention layer calculation, achieve equivalent or exceeding prediction precision, and prove that the process is shown in an appendix.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to those skilled in the art that other drawings can be obtained according to these drawings without inventive effort;
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the CSPAttention structure of the present invention;
FIG. 3 is a schematic illustration of the self-attention structure of the present invention;
FIG. 4 is a schematic diagram of the attention structure after dimension division of the present invention;
FIG. 5 is a schematic diagram of the input composition of the present invention;
FIG. 6 is a logical schematic of the coding portion of the present invention;
FIG. 7 is an Encoder diagram in BERT of the present invention;
FIG. 8 is a schematic diagram of the variant Transformer of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in FIG. 1, a method for detecting log anomalies by fusing BERT feature codes with variant transformers comprises the following steps:
step 1: log sequence feature encoding stage
In this step, the invention uses BERT to output the log sequence feature codes. In the normal log anomaly detection flow, the log needs to be parsed before encoding; that step is not the focus of the invention and is not described further here. The BERT_BASE configuration shown in formula (1) is adopted: the number of Transformer Encoder blocks is 12, the hidden layer size is 768, the number of self-attention heads is 12, and the total parameter size is 110M. The invention uses BERT only to generate feature codes with semantic information and position information; no classification or prediction task is attached.
BERT BASE (L=12,H=768,A=12,TotalParam=110M) (1)
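As an illustration of this feature-encoding stage, the following is a minimal sketch assuming the HuggingFace transformers library and the public bert-base-uncased checkpoint (which matches L=12, H=768, A=12); the patent does not prescribe a specific implementation or checkpoint, and the example log line is hypothetical.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")   # 12 layers, hidden size 768, 12 heads

# A parsed log line (template text after log parsing) used as an example.
log_line = "Receiving block blk_* src: /* dest: /*"
inputs = tokenizer(log_line, return_tensors="pt")        # adds [CLS] and [SEP] automatically

with torch.no_grad():
    outputs = model(**inputs)

# One 768-dimensional feature vector per token, carrying semantic and position information.
feature_code = outputs.last_hidden_state
print(feature_code.shape)   # e.g. torch.Size([1, num_tokens, 768])
```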
In the input stage, three types of Embeddings are summed before being input. As can be seen from FIG. 5, the log sequence is first tokenized to obtain a segmented text sequence; a special mark [CLS] is added at the beginning of the sequence, representing the result mark of the sequence, and it can be placed at the beginning or at the end. Different sentences are separated by the mark [SEP]. At this point the representation of each word of the log sequence consists of three parts: Token Embedding, Segment Embedding and Position Embedding. The sequence vector containing the three types of Embedding is input into the BERT network for feature extraction, and a sequence vector containing rich semantic features is finally output; the whole flow is shown in FIG. 6. BERT itself is simply a stack of Transformer encoders, and the encoder structure is shown in FIG. 7. The most critical component in the encoder is multi-head Self-Attention, which characterizes words by determining a weight matrix according to the degree of association between different words in the same sequence:
Attention(Q, K, V) = softmax(QK^T / √d_k)·V    (2)
In the above formula, Q, K and V are word vector matrices and d_k is the dimension of the word embedding. The so-called multi-head attention mechanism projects Q, K and V through a plurality of different linear transformations and finally splices the attention outputs of the different heads with Concat. The formula is shown in (3).
MultiHead(Q, K, V) = Concat(head_1, ..., head_n)·W^O
head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V)    (3)
Through the multi-head attention operation, information under different representation subspaces can be obtained, where W^O and W_i^Q, W_i^K, W_i^V are weight matrices.
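The following is a minimal sketch of formulas (2) and (3), i.e. scaled dot-product attention and multi-head concatenation; the dimensions and the use of random (rather than learned) projection matrices are illustrative assumptions, not the patent's reference implementation.

```python
import math
import torch

def attention(Q, K, V):
    # Formula (2): softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)
    return torch.softmax(scores, dim=-1) @ V

def multi_head(Q, K, V, n_heads=12):
    # Formula (3): project Q, K, V per head, run attention, splice the heads with Concat.
    d_model = Q.size(-1)
    d_head = d_model // n_heads
    heads = []
    for i in range(n_heads):
        # W_i^Q, W_i^K, W_i^V would normally be learned; random here for illustration.
        Wq, Wk, Wv = (torch.randn(d_model, d_head) for _ in range(3))
        heads.append(attention(Q @ Wq, K @ Wk, V @ Wv))
    W_o = torch.randn(d_model, d_model)                 # output projection W^O
    return torch.cat(heads, dim=-1) @ W_o

x = torch.randn(10, 768)          # a log sequence of 10 tokens, 768-dimensional embeddings
print(multi_head(x, x, x).shape)  # torch.Size([10, 768])
```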
In addition, it can be seen in FIG. 6 that a position code (Positional Embedding) is further added to the input part. The data processed by an ordinary recurrent neural network is unidirectional; to address this, a position code computed with a sine-cosine scheme is added at the encoder input and summed with the original input embedding, so that the relative position information of each word in the sequence is obtained. A residual network is also added to the encoder to alleviate the gradient problem when the network becomes too deep.
Step 2: anomaly detection model construction and prediction stage
After the encoding in the first step, the feature codes are input into the anomaly detection model for training. The method trains only on normal logs: because normal logs account for the vast majority in a real production environment, the model only needs to learn the features of normal logs, and when the model is applied in a real environment, an anomaly can be reported whenever the predicted log differs greatly from the real log. A sliding window is used to control the size of the input and is typically set to 10: a log sequence of 10 time steps is taken as input to predict the output of the next 5 time steps, the loss is computed against the corresponding 5 time steps of the normal log, and the loss is then minimized iteratively. In this step the invention designs a log anomaly detection method based on a variant Transformer, in which the encoder takes the history of the log sequence as input and the decoder predicts, in an autoregressive way, what is likely to appear in the future log. The Self-Attention module in the original Transformer is replaced by a new attention mechanism: a CSPAttention module combining CSPNet and Self-Attention replaces the Self-Attention in the original Transformer, and this structure can greatly reduce the memory consumption and computation time of the attention layer while achieving equal or better prediction accuracy. The structure of the model is shown in FIG. 8.
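To make the sliding-window step concrete, the following is a minimal sketch assuming the parsed log sequence has already been reduced to a list of template (log-key) ids; the function name, window sizes and toy data are illustrative only.

```python
from typing import List, Tuple

def build_windows(log_keys: List[int],
                  history: int = 10,
                  horizon: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Slide over a normal log sequence and pair each 10-step history
    with the 5 steps that follow it as the prediction target."""
    samples = []
    for start in range(len(log_keys) - history - horizon + 1):
        x = log_keys[start:start + history]                       # encoder input
        y = log_keys[start + history:start + history + horizon]   # decoder target
        samples.append((x, y))
    return samples

# Example: a toy parsed log sequence of template ids.
windows = build_windows(list(range(20)))
print(len(windows), windows[0])   # 6 windows; the first pairs steps 0-9 with steps 10-14
```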
The log code sequence of 10 time steps is input into the Encoder of the model and summed with the position codes before input. The additional position codes reflect the positions of the elements; as in most common sequence models, the invention encodes the position information with Sin and Cos functions:
PE(pos, 2j) = sin(pos / 10000^(2j/d_model))
PE(pos, 2j+1) = cos(pos / 10000^(2j/d_model))    (4)
where pos represents the position, j is the encoding dimension index, and d_model represents the length of the vector.
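A compact sketch of this sinusoidal position encoding is given below, assuming PyTorch; the tensor layout and the summation with 768-dimensional BERT features are illustrative assumptions.

```python
import torch

def sinusoidal_position_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Return a (seq_len, d_model) table of sin/cos position codes as in formula (4)."""
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)   # (seq_len, 1)
    j = torch.arange(0, d_model, 2, dtype=torch.float32)            # even dimension indices
    div = torch.pow(10000.0, j / d_model)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(pos / div)   # even dimensions use sin
    pe[:, 1::2] = torch.cos(pos / div)   # odd dimensions use cos
    return pe

# The position codes are summed with the input features before the encoder.
x = torch.randn(10, 768)                 # 10 time steps of 768-dimensional features
x = x + sinusoidal_position_encoding(10, 768)
```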
The attention layer is then entered for the vector attention operation. This is also where the present invention makes its improvement: the input dimension is split into two parts, one part passes through a 1×1 convolution and the other part passes through a normal Self-Attention layer, and the two outputs are finally spliced together. The result then passes through Layer Normalization and Feed Forward operations, and residual connections are added around both the attention and Feed Forward operations to prevent gradient vanishing or explosion. This completes the encoder part, and the decoder operates as follows. Because the task of this model is to predict the logs of future time steps, a mask operation is included in the decoding operation; the intermediate attention layer of the decoder has the same structure and function as the attention layer in the encoder.
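The following is a minimal sketch of the CSPAttention idea described above (split the feature dimension, pass one half through a 1×1 convolution and the other half through ordinary self-attention, then splice), assuming PyTorch; module names, head count and the use of nn.MultiheadAttention are assumptions for illustration, not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class CSPAttention(nn.Module):
    """Sketch of a CSPNet-style attention block: half of the channels pass through
    a 1x1 convolution, the other half through ordinary multi-head self-attention,
    and the two halves are concatenated again."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        assert d_model % 2 == 0
        self.half = d_model // 2
        self.conv = nn.Conv1d(self.half, self.half, kernel_size=1)          # 1x1 convolution branch
        self.attn = nn.MultiheadAttention(self.half, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        a, b = x[..., :self.half], x[..., self.half:]
        a = self.conv(a.transpose(1, 2)).transpose(1, 2)                     # convolution branch
        b, _ = self.attn(b, b, b)                                            # self-attention branch
        return torch.cat([a, b], dim=-1)                                     # splice the two halves

# Example: 10 time steps of 768-dimensional features, batch of 2.
out = CSPAttention(768)(torch.randn(2, 10, 768))
print(out.shape)   # torch.Size([2, 10, 768])
```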
A log anomaly detection system fusing BERT feature encoding with a variant Transformer, comprising:
a log encoding module, configured to receive a log sequence, parse the log sequence, and input the parsed log sequence into a pre-established BERT model to obtain log sequence feature codes with semantic information and position information; and
a prediction module, configured to input the log sequence feature codes with semantic information and position information into a pre-established log anomaly detection model based on a variant Transformer for training, obtain the log sequence likely to appear in the future, and accurately predict based on the obtained log sequence to obtain a detection result.
Proof: the time complexity of CSPAttention is at most 50% of that of traditional self-attention
Proof:
(1): First compute the time complexity of the original self-attention mechanism.
(2): Assume the input sequence is X, with sequence length L and embedding dimension E_1.
(3): Let the projection matrices of head i be W_i^Q, W_i^K (of width d_2) and W_i^V (of width d_3), i = 1, ..., n, and write d_2×n = E_2 and d_3×n = E_3.
(4): Computing Q_i = X·W_i^Q over all heads has time complexity L·E_1·d_2·n = L·E_1·E_2.
(5): Computing K_i = X·W_i^K over all heads has time complexity L·E_1·d_2·n = L·E_1·E_2.
(6): Computing V_i = X·W_i^V over all heads has time complexity L·E_1·d_3·n = L·E_1·E_3.
(7): Computing Q_i·K_i^T over all heads has time complexity L^2·d_2·n = E_2·L^2.
(8): Multiplying the attention weights by V_i over all heads has time complexity L^2·d_3·n = E_3·L^2.
(9): The output projection has time complexity L·E_3·E_4.
(10): In self-attention one typically has E_1 = E_2 = E_3 = E_4 = E.
(11): The time complexity of the original self-attention mechanism is therefore 4E^2·L + 2E·L^2.
(12): Next compute the time complexity of CSPAttention.
(13): The time complexity of the convolution branch (a 1×1 convolution over half of the dimensions) is L·(E/2)^2.
(14): The time complexity of the self-attention branch over the other half is 4(E/2)^2·L + 2(E/2)·L^2 = L·E^2 + E·L^2.
(15): Adding steps (13) and (14), the time complexity of CSPAttention is 1.25L·E^2 + E·L^2.
(16): Comparing the complexity of step (15) with that of step (11), 1.25L·E^2 + E·L^2 ≤ 0.5·(4E^2·L + 2E·L^2), so the time complexity of CSPAttention is reduced by at least 50%.
End of proof
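As an illustrative arithmetic check of this comparison (not part of the patent), substituting a few sample values of E and L into the two derived expressions confirms that the ratio stays at or below 50%:

```python
# Compare the derived complexities: self-attention 4*E^2*L + 2*E*L^2
# versus CSPAttention 1.25*E^2*L + E*L^2, for a few sample sizes.
for E, L in [(768, 10), (768, 512), (512, 1024)]:
    self_attn = 4 * E**2 * L + 2 * E * L**2
    csp_attn = 1.25 * E**2 * L + E * L**2
    print(f"E={E:4d} L={L:4d}  ratio={csp_attn / self_attn:.3f}")
# The ratio is at most 0.5 for all E, L > 0, since 1.25 <= 0.5*4 and 1 <= 0.5*2.
```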
Based on the same inventive concept, the present invention also provides a computer apparatus comprising: one or more processors, and a memory for storing one or more computer programs; the program includes program instructions, and the processor is configured to execute the program instructions stored in the memory. The processor may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.; it is the computational and control core of the terminal for implementing one or more instructions, in particular for loading and executing one or more instructions within a computer storage medium to implement the method described above.
It should be further noted that, based on the same inventive concept, the present invention also provides a computer storage medium having a computer program stored thereon which, when executed by a processor, performs the above method. The storage medium may take the form of any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing has shown and described the basic principles, principal features, and advantages of the present disclosure. It will be understood by those skilled in the art that the present disclosure is not limited to the embodiments described above; the foregoing embodiments and description merely illustrate the principles of the disclosure, and various changes and modifications may be made without departing from the spirit and scope of the disclosure, which is defined by the appended claims.

Claims (10)

1. A method for detecting log abnormality by fusing BERT feature codes and variant transformers is characterized by comprising the following steps:
collecting a log sequence, analyzing the log sequence, and inputting the analyzed log sequence into a pre-established BERT model to obtain a log sequence feature code with semantic information and position information;
inputting the characteristic codes of the log sequences with semantic information and position information into a pre-established log abnormality detection model based on a variant Transformer for training to obtain a log sequence which possibly appears in the future, and accurately predicting the obtained log sequence to obtain a detection result.
2. The method for detecting log anomalies by fusing BERT feature codes and variant transformers according to claim 1, wherein the BERT model adopts the BERT_BASE configuration of formula (1), the number of Transformer Encoder blocks is 12, the hidden layer size is 768, the number of self-attention heads is 12, and the total parameter size is 110M.
3. The method for detecting log anomalies by fusing BERT feature codes and variant transformers according to claim 2, wherein the BERT model is as follows:
BERT BASE (L=12, H=768, A=12, TotalParam=110M).
4. the method for detecting log anomalies by fusing BERT feature codes and variant transformers according to claim 1, wherein the log sequences are segmented by token segmentation technology to obtain segmented text sequences; and a special mark [ CLS ] is added to the beginning of the text sequence, wherein [ CLS ] represents the result mark of the text sequence, and can be placed at the beginning or at the tail, and different sentences are separated by a mark [ SEP ].
5. The method for detecting log anomalies by merging BERT feature codes and variant transformers according to claim 4, wherein the output Embedding of each word of the log sequence consists of three parts, including: token Embedding, segment Embedding and Position Embedding.
6. The method for detecting log anomalies by fusing BERT feature codes and variant transformers according to claim 5, wherein the sequence vector containing the three types of Embedding is input into the BERT network for feature extraction, a sequence vector containing rich semantic features is finally output, and a weight matrix is determined according to the degree of association between different words to characterize the words:
Attention(Q, K, V) = softmax(QK^T / √d_k)·V
where Q, K and V are word vector matrices and d_k is the dimension of the word embedding.
7. The method for detecting log anomalies by fusing BERT feature codes and variant transformers according to claim 6, wherein Q, K and V are projected through a plurality of different linear transformations, and the attention outputs of the different heads are finally spliced by Concat, with the following formula:
MultiHead(Q, K, V) = Concat(head_1, ..., head_n)·W^O
head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V)
where W^O, W_i^Q, W_i^K and W_i^V are weight matrices.
8. The method of claim 1, wherein the anomaly detection model uses 10 time-step log sequence feature codes as inputs to predict a 5 time-step log sequence output in the future.
9. The method for detecting log anomalies by fusing BERT feature codes and variant transformers according to claim 8, wherein the log code sequence of 10 time steps is input into the Encoder of the model and summed with the position codes before input, the positions of the elements are reflected using additional position codes, and the position information is encoded using Sin and Cos functions:
PE(pos, 2j) = sin(pos / 10000^(2j/d_model))
PE(pos, 2j+1) = cos(pos / 10000^(2j/d_model))    (4)
where pos represents the position, j is the encoding dimension index, and d_model represents the length of the vector, the first expression of formula (4) being used for even dimensions and the second for odd dimensions;
then, in the encoding process, the attention layer is entered for the vector attention operation: the input dimension is first split into two parts, one part passes through a 1×1 convolution and the other part passes through a normal self-attention layer, and the two outputs are then spliced, completing the encoding operation; a mask operation is then added in the decoding operation, the attention layer of the decoder being identical to the attention layer of the encoder, and the decoded output gives the detection result.
10. A log anomaly detection system fusing BERT feature encoding with a variant Transformer, comprising:
a log encoding module, configured to receive a log sequence, parse the log sequence, and input the parsed log sequence into a pre-established BERT model to obtain log sequence feature codes with semantic information and position information; and
a prediction module, configured to input the log sequence feature codes with semantic information and position information into a pre-established log anomaly detection model based on a variant Transformer for training, obtain the log sequence likely to appear in the future, and accurately predict based on the obtained log sequence to obtain a detection result.
CN202310417120.0A 2023-04-19 2023-04-19 Log anomaly detection method and system integrating BERT feature codes and variant transformers Pending CN116450393A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310417120.0A CN116450393A (en) 2023-04-19 2023-04-19 Log anomaly detection method and system integrating BERT feature codes and variant transformers

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310417120.0A CN116450393A (en) 2023-04-19 2023-04-19 Log anomaly detection method and system integrating BERT feature codes and variant transformers

Publications (1)

Publication Number Publication Date
CN116450393A (en) 2023-07-18

Family

ID=87133373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310417120.0A Pending CN116450393A (en) 2023-04-19 2023-04-19 Log anomaly detection method and system integrating BERT feature codes and variant transformers

Country Status (1)

Country Link
CN (1) CN116450393A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117891900A (en) * 2024-03-18 2024-04-16 腾讯科技(深圳)有限公司 Text processing method and text processing model training method based on artificial intelligence
CN117972596A (en) * 2023-11-30 2024-05-03 北京谷器数据科技有限公司 Risk prediction method based on operation log


Similar Documents

Publication Publication Date Title
CN116450393A (en) Log anomaly detection method and system integrating BERT feature codes and variant transformers
Tay et al. Compare, compress and propagate: Enhancing neural architectures with alignment factorization for natural language inference
CN116627708B (en) Storage fault analysis system and method thereof
CN108427720A (en) System log sorting technique
WO2021151292A1 (en) Corpus monitoring method based on mask language model, corpus monitoring apparatus, device, and medium
Chen et al. Joint entity and relation extraction for legal documents with legal feature enhancement
CN109445844B (en) Code clone detection method based on hash value, electronic equipment and storage medium
CN115618269B (en) Big data analysis method and system based on industrial sensor production
CN113343677B (en) Intention identification method and device, electronic equipment and storage medium
CN115344414A (en) Log anomaly detection method and system based on LSTM-Transformer
CN116776270A (en) Method and system for detecting micro-service performance abnormality based on transducer
CN113553245B (en) Log anomaly detection method combining bidirectional slice GRU and gate control attention mechanism
WO2024148880A1 (en) System detection method and apparatus based on multi-source heterogeneous data
CN117591913A (en) Statement level software defect prediction method based on improved R-transducer
Chen et al. MTQA: Text‐Based Multitype Question and Answer Reading Comprehension Model
Huang et al. Software defect prediction model based on attention mechanism
CN114969334B (en) Abnormal log detection method and device, electronic equipment and readable storage medium
CN115221045A (en) Multi-target software defect prediction method based on multi-task and multi-view learning
Wang et al. FastTransLog: A Log-based Anomaly Detection Method based on Fastformer
CN114969335B (en) Abnormality log detection method, abnormality log detection device, electronic device and readable storage medium
Ezukwoke et al. Leveraging pre-trained models for failure analysis triplets generation
Mandakath Gopinath Root Cause Prediction from Log Data using Large Language Models
CN117521656B (en) Chinese text-oriented end-to-end Chinese entity relationship joint extraction method
CN118260765A (en) Code clone detection method and device for power data safety monitoring system
CN118132304A (en) Log anomaly detection method and system based on pre-training model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination